
feat(model-profiles): add input/output MIME type fields to ModelProfile #2

Open
DhirenMhatre wants to merge 1 commit into master from open-swe/profile-mime-types

Conversation

@DhirenMhatre

Adds informational input_mime_types and output_mime_types dict fields to ModelProfile, keyed by models.dev modality names ('image', 'audio', 'pdf', 'video'). Augments the CLI to dispatch provider-level vs model-level overrides by ModelProfile field name so non-scalar provider-level fields (like the new MIME maps) are routed correctly.

Backfills anthropic, openai, and perplexity profile_augmentations.toml with documented MIME types and regenerates their _profiles.py.
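
To make the shape of the new fields concrete, here is a minimal sketch. It assumes a trimmed-down stand-in for `ModelProfile`: the class name `ModelProfileSketch` and the helper `accepts_input` are hypothetical and not part of this PR, and the sample values mirror the Anthropic backfill quoted later in this thread.

```python
from typing import TypedDict


class ModelProfileSketch(TypedDict, total=False):
    """Hypothetical, trimmed-down stand-in for ModelProfile."""

    image_inputs: bool
    pdf_inputs: bool
    # Keyed by models.dev modality name; values are MIME type strings.
    input_mime_types: dict[str, list[str]]
    output_mime_types: dict[str, list[str]]


def accepts_input(profile: ModelProfileSketch, modality: str, mime_type: str) -> bool:
    """Return True if the profile lists `mime_type` under `modality`."""
    return mime_type in profile.get("input_mime_types", {}).get(modality, [])


# Sample profile (illustrative data only).
claude_like: ModelProfileSketch = {
    "image_inputs": True,
    "pdf_inputs": True,
    "input_mime_types": {
        "image": ["image/jpeg", "image/png", "image/gif", "image/webp"],
        "pdf": ["application/pdf"],
    },
}

print(accepts_input(claude_like, "image", "image/png"))
print(accepts_input(claude_like, "audio", "audio/wav"))
```

Because the fields are described as informational, a lookup like this is a hint for callers rather than a guarantee enforced at request time.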

Fixes #


Co-authored-by: Mason Daugherty <61371264+mdrxy@users.noreply.github.com>
@codity-dm

codity-dm Bot commented May 17, 2026

Policy Check Failed

✗ 3/3 policy checks failed:

• Need 2 more approval(s) (0/2) — comment LGTM or approve via review
• Missing ticket reference (expected: JIRA-, ENG-, #*)
• 6 code file(s) changed but no test files added


To merge this PR:

  1. Address the failed checks listed above
  2. Ensure branch protection requires the codity/policy-check status

Configure policies in your dashboard

@codity-dm

codity-dm Bot commented May 17, 2026

PR Summary

What Changed

  • Added input_mime_types and output_mime_types fields to ModelProfile for declaring supported MIME types per modality.
  • Updated CLI validation to reject unknown augmentation keys against the ModelProfile schema.
  • Populated MIME type data for Anthropic, OpenAI, and Perplexity model profiles.

Key Changes by Area

Core Schema: Added input_mime_types and output_mime_types TypedDict fields to ModelProfile with documentation.

CLI Validation: Schema-driven validation now distinguishes provider-level vs model-level overrides and exits on unknown scalar keys.

Partner Models: Added input_mime_types (image: png/jpeg/gif/webp, pdf: application/pdf) to all Anthropic, OpenAI, and Perplexity profiles. Added output_mime_types for OpenAI image generation models (gpt-image-1 variants).

Files Changed

| File | Changes Summary |
| --- | --- |
| libs/core/langchain_core/language_models/model_profile.py | Added input_mime_types and output_mime_types TypedDict fields |
| libs/model-profiles/langchain_model_profiles/cli.py | Added schema-driven validation for augmentation keys |
| libs/model-profiles/tests/unit_tests/test_cli.py | Added tests for cascading, overrides, and unknown key rejection |
| libs/partners/anthropic/langchain_anthropic/data/_profiles.py | Added input_mime_types to 14 Claude model profiles |
| libs/partners/anthropic/langchain_anthropic/data/profile_augmentations.toml | Added [overrides.input_mime_types] section |
| libs/partners/openai/langchain_openai/data/_profiles.py | Added input_mime_types and output_mime_types to all model profiles |
| libs/partners/openai/langchain_openai/data/profile_augmentations.toml | Added MIME type overrides for image generation models |
| libs/partners/perplexity/langchain_perplexity/data/_profiles.py | Added input_mime_types to sonar-deep-research |
| libs/partners/perplexity/langchain_perplexity/data/profile_augmentations.toml | Added [overrides.input_mime_types] section |

Review Focus Areas

  • CLI validation logic for distinguishing provider vs model-level keys in cli.py:55-109.
  • Completeness of MIME type coverage across model variants (check for gaps in newer GPT-5.x series).
  • Test coverage for unknown key rejection edge cases.

Architecture

Design Decisions: Used TypedDict fields rather than nested objects to keep the schema flat and serialization-friendly. CLI validation strictly couples to ModelProfile field names to prevent configuration drift.

Scalability & Extensibility: MIME type declarations are per-modality dicts to allow future extension (e.g., audio, video) without schema changes. Out of scope: runtime validation of actual file content.

Risks:

  • Intentional: Strict CLI validation may break existing augmentation files with typos or unofficial keys. This is acceptable to enforce schema compliance.
  • Unintentional: MIME type lists are manually maintained and may become stale as providers update capabilities. No automated sync mechanism exists.

Merge Status

MERGEABLE — PR Score 76/100, above threshold (50). All gates passed.

@codity-dm

codity-dm Bot commented May 17, 2026

Security Scan Summary

Metric Value
Vulnerabilities Critical: 0
Overall Risk Clean
Files Scanned 9

No critical security issues detected

Scan completed in 24.1s

Security scan powered by Codity.ai

@codity-dm

codity-dm Bot commented May 17, 2026

License Compliance Scan

Metric Value
Packages Scanned 0
High Risk (Strong Copyleft) 0
Medium Risk (Weak Copyleft) 0
Low Risk (Permissive) 0
Unknown License 0

All licenses are low-risk and compliant

Powered by Codity.ai · Docs

@codity-dm

codity-dm Bot commented May 17, 2026

Code Quality Report — test-org-codity/langchain · PR #2

Scanned: 2026-05-17 19:28 UTC | Score: 60/100 | Provider: github

Executive Summary

| Severity | Count |
| --- | --- |
| Critical | 0 |
| High | 1 |
| Medium | 3 |
| Low | 5 |

Top Findings

[CQ-LLM-003] libs/partners/anthropic/langchain_anthropic/data/_profiles.py:40 (Duplication · HIGH)

Issue: Duplicate 'input_mime_types' structure across multiple model profiles.
Suggestion: Refactor to avoid duplication by creating a shared constant for 'input_mime_types'.

"input_mime_types": {"image": ["image/jpeg", "image/png", "image/gif", "image/webp"], "pdf": ["application/pdf"]},

[CQ-LLM-001] libs/core/langchain_core/language_models/model_profile.py:58 (Documentation · MEDIUM)

Issue: Missing docstring for the new 'input_mime_types' field.
Suggestion: Add a docstring explaining the purpose and usage of 'input_mime_types'.

input_mime_types: dict[str, list[str]]

[CQ-LLM-002] libs/core/langchain_core/language_models/model_profile.py:118 (Documentation · MEDIUM)

Issue: Missing docstring for the new 'output_mime_types' field.
Suggestion: Add a docstring explaining the purpose and usage of 'output_mime_types'.

output_mime_types: dict[str, list[str]]

[CQ-LLM-004] libs/model-profiles/langchain_model_profiles/cli.py:70 (Error_Handling · MEDIUM)

Issue: Potential swallowing of exceptions when importing ModelProfile.
Suggestion: Log the exception or handle it more gracefully to avoid silent failures.

except ImportError:
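
One way to act on this suggestion is sketched below. It is an illustration under assumptions, not the code in cli.py: `profile_field_names` takes a hypothetical `load_schema` callable in place of the real `ModelProfile` import, so the example runs without langchain_core installed.

```python
import logging
from collections.abc import Callable
from typing import TypedDict, get_type_hints

logger = logging.getLogger(__name__)


def profile_field_names(load_schema: Callable[[], type]) -> frozenset[str]:
    """Return the schema's declared field names, or an empty frozenset on ImportError.

    `load_schema` is a hypothetical zero-argument callable standing in for
    the import of ModelProfile.
    """
    try:
        schema = load_schema()
    except ImportError:
        # Log the failure so the switch to the legacy path is visible.
        logger.warning(
            "ModelProfile could not be imported; using legacy dispatch",
            exc_info=True,
        )
        return frozenset()
    return frozenset(get_type_hints(schema))


class DemoProfile(TypedDict, total=False):
    """Illustrative schema with one scalar and one dict-valued field."""

    max_input_tokens: int
    input_mime_types: dict[str, list[str]]


def missing_schema() -> type:
    """Simulate an environment where the schema package is not installed."""
    raise ImportError("langchain_core is not installed")


print(sorted(profile_field_names(lambda: DemoProfile)))
print(profile_field_names(missing_schema))
```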

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:125 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"Augmentation key '{key}' is not a declared ModelProfile "

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:126 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"field and its value is not a table of overrides."

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:477 (Documentation · LOW)

Issue: Public def 'test_refresh_merges_provider_level_mime_types' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_merges_provider_level_mime_types(

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:523 (Documentation · LOW)

Issue: Public def 'test_refresh_model_level_mime_types_override_provider' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_model_level_mime_types_override_provider(

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:576 (Documentation · LOW)

Issue: Public def 'test_refresh_rejects_unknown_scalar_top_level_key' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_rejects_unknown_scalar_top_level_key(

Per-File Breakdown

| File | Critical | High | Medium | Low | Total |
| --- | --- | --- | --- | --- | --- |
| libs/core/langchain_core/language_models/model_profile.py | 0 | 0 | 2 | 0 | 2 |
| libs/model-profiles/langchain_model_profiles/cli.py | 0 | 0 | 1 | 2 | 3 |
| libs/model-profiles/tests/unit_tests/test_cli.py | 0 | 0 | 0 | 3 | 3 |
| libs/partners/anthropic/langchain_anthropic/data/_profiles.py | 0 | 1 | 0 | 0 | 1 |

Recommendations

  1. Resolve the High severity duplication finding first, then the Medium documentation and error-handling findings.
  2. Run automated tests after applying fixes to verify no regressions.

@greptile-apps

greptile-apps Bot commented May 17, 2026

Greptile Summary

This PR adds input_mime_types and output_mime_types dict fields to ModelProfile, updates the CLI to use schema-driven dispatch so these dict-valued fields are treated as provider-level overrides rather than model IDs, and backfills all three partner packages with documented MIME types.

  • model_profile.py cleanly adds the two new TypedDict fields, resolving the pre-existing TODO comments.
  • cli.py adds _profile_field_names() and rewires _load_augmentations() to distinguish provider fields from model IDs; the logic is correct when langchain_core is importable but falls back to a legacy heuristic that silently misclassifies the new dict-valued fields as model IDs when it cannot be imported.
  • The OpenAI augmentation file cascades input_mime_types to all models at the provider level, but the gpt-3.5-turbo override does not clear the MIME map to match its explicitly disabled multimodal boolean flags, resulting in a contradictory generated profile.

Confidence Score: 3/5

The Anthropic and Perplexity changes are internally consistent, but the OpenAI profiles ship with contradictory data for gpt-3.5-turbo and the CLI has a silent data-corruption path when langchain_core is unavailable.

The OpenAI generated profiles for gpt-3.5-turbo carry input_mime_types listing image and PDF types while image_inputs, image_url_inputs, and pdf_inputs are all explicitly False. The legacy fallback in _load_augmentations also misclassifies dict-valued provider fields as model IDs when langchain_core is unavailable, producing a bogus entry in the generated profiles.

Files that need attention: libs/partners/openai/langchain_openai/data/profile_augmentations.toml and libs/partners/openai/langchain_openai/data/_profiles.py (gpt-3.5-turbo data contradiction), and libs/model-profiles/langchain_model_profiles/cli.py (legacy fallback with dict-valued fields).

Important Files Changed

| Filename | Overview |
| --- | --- |
| libs/core/langchain_core/language_models/model_profile.py | Adds input_mime_types and output_mime_types fields to ModelProfile TypedDict; removes four TODO comments. Clean schema addition. |
| libs/model-profiles/langchain_model_profiles/cli.py | Adds schema-driven dispatch in _load_augmentations to correctly route dict-valued provider fields vs model-id keys. Correct when langchain_core is available, but the legacy fallback silently misclassifies dict-valued fields as model IDs. |
| libs/model-profiles/tests/unit_tests/test_cli.py | Adds three new tests covering provider-level cascade, model-level override precedence, and rejection of unknown scalar keys. No test exercises the legacy fallback with dict-valued fields. |
| libs/partners/openai/langchain_openai/data/profile_augmentations.toml | Adds provider-level input_mime_types and model-level output_mime_types for image-generation models. The gpt-3.5-turbo override does not clear input_mime_types, causing contradictory MIME data. |
| libs/partners/openai/langchain_openai/data/_profiles.py | Regenerated profiles with input_mime_types on all OpenAI models. gpt-3.5-turbo has contradictory disabled multimodal flags with populated MIME types; text embedding models similarly inherit MIME types they cannot use. |
| libs/partners/anthropic/langchain_anthropic/data/_profiles.py | Regenerated profiles adding input_mime_types (image + PDF) to all Claude models. All Claude models support these modalities, so the data is internally consistent. |
| libs/partners/anthropic/langchain_anthropic/data/profile_augmentations.toml | Adds provider-level input_mime_types for image and PDF to the Anthropic augmentations. No contradictions introduced. |
| libs/partners/perplexity/langchain_perplexity/data/_profiles.py | Regenerated profiles adding model-level input_mime_types only to sonar-deep-research. Narrow, targeted change with no cascading issues. |
| libs/partners/perplexity/langchain_perplexity/data/profile_augmentations.toml | Adds model-level input_mime_types for sonar-deep-research only. Clean and targeted augmentation. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[profile_augmentations.toml] --> B[_load_augmentations]
    B --> C{_profile_field_names returns non-empty?}
    C -- Yes / Schema-driven --> D{key in ModelProfile fields?}
    D -- Yes --> E[provider_aug e.g. input_mime_types]
    D -- No --> F{value is dict?}
    F -- Yes --> G[model_augs model-id overrides]
    F -- No --> H[sys.exit 1 unknown scalar key]
    C -- No / Legacy fallback --> I{value is dict?}
    I -- Yes --> J[model_augs BUGGY: input_mime_types treated as model ID]
    I -- No --> K[provider_aug]
    E --> L[_apply_overrides base + provider_aug + model_aug]
    G --> L
    J --> L
    L --> M[_profiles.py generated]
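
The schema-driven branch of the flowchart can be sketched as follows. This is a minimal illustration rather than the code in cli.py: `split_overrides` and its parameters are hypothetical names, and the set of profile fields is passed in explicitly instead of being read from `ModelProfile`.

```python
import sys


def split_overrides(
    overrides: dict[str, object], profile_fields: frozenset[str]
) -> tuple[dict[str, object], dict[str, dict]]:
    """Split an [overrides] table into provider-level and model-level parts."""
    provider_aug: dict[str, object] = {}
    model_augs: dict[str, dict] = {}
    for key, value in overrides.items():
        if key in profile_fields:
            # A declared profile field is provider-level, even if dict-valued.
            provider_aug[key] = value
        elif isinstance(value, dict):
            # Any other table is treated as a model ID with its own overrides.
            model_augs[key] = value
        else:
            # An unknown scalar key is most likely a typo, so fail loudly.
            print(f"Unknown augmentation key: {key!r}", file=sys.stderr)
            raise SystemExit(1)
    return provider_aug, model_augs


fields = frozenset({"input_mime_types", "image_inputs"})
provider, models = split_overrides(
    {
        "input_mime_types": {"pdf": ["application/pdf"]},
        "gpt-3.5-turbo": {"image_inputs": False},
    },
    fields,
)
print(provider)
print(models)
```

With this ordering the field-name check runs before the `isinstance` check, which is what keeps a dict-valued field such as `input_mime_types` from being mistaken for a model ID.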

Comments Outside Diff (1)

  1. libs/partners/openai/langchain_openai/data/_profiles.py, lines 1617-1634

    P2 Text embedding models inherit input_mime_types they cannot use

    text-embedding-3-large, text-embedding-3-small, and text-embedding-ada-002 are text-only embedding models with no image or PDF input path. The provider-level cascade applies input_mime_types to every model without exception, so these three models end up advertising image/PDF MIME support. Model-level overrides clearing the map (e.g., input_mime_types = {}) for these models would be consistent with the approach already taken for gpt-3.5-turbo's boolean flags.

Reviews (1): Last reviewed commit: "feat(model-profiles): add input/output M..."

Comment on lines +80 to +90
        "input_mime_types": {
            "image": [
                "image/png",
                "image/jpeg",
                "image/gif",
                "image/webp",
            ],
            "pdf": [
                "application/pdf",
            ],
        },

P1 Contradictory MIME types on gpt-3.5-turbo

The gpt-3.5-turbo profile explicitly has image_inputs: False, image_url_inputs: False, and pdf_inputs: False, yet the provider-level input_mime_types cascade stamps it with image and PDF MIME types. Any consumer that reads input_mime_types from this profile and uses it to decide whether to attach images or PDFs would draw the wrong conclusion. The [overrides."gpt-3.5-turbo"] block in profile_augmentations.toml needs to clear the MIME map (e.g. input_mime_types = {}) the same way it clears the boolean flags.
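
A small consistency check could catch this class of contradiction in generated profiles. The sketch below is an assumption-laden illustration: the mapping from modality name to boolean flag (`FLAG_FOR_MODALITY`) and the function `find_contradictions` are hypothetical and do not exist in the PR.

```python
# Assumed mapping from models.dev modality names to capability flags.
FLAG_FOR_MODALITY = {"image": "image_inputs", "pdf": "pdf_inputs"}


def find_contradictions(profiles: dict[str, dict]) -> list[tuple[str, str]]:
    """Return (model_id, modality) pairs whose MIME list is populated
    while the matching capability flag is explicitly False."""
    found = []
    for model_id, profile in profiles.items():
        mime_map = profile.get("input_mime_types", {})
        for modality, flag in FLAG_FOR_MODALITY.items():
            if mime_map.get(modality) and profile.get(flag) is False:
                found.append((model_id, modality))
    return found


# Illustrative profiles shaped like the generated data discussed above.
sample_profiles = {
    "gpt-3.5-turbo": {
        "image_inputs": False,
        "pdf_inputs": False,
        "input_mime_types": {
            "image": ["image/png"],
            "pdf": ["application/pdf"],
        },
    },
    "vision-model": {
        "image_inputs": True,
        "pdf_inputs": True,
        "input_mime_types": {"image": ["image/png"]},
    },
}

print(find_contradictions(sample_profiles))
```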

Comment on lines +130 to 134
# Legacy fallback when ModelProfile is unavailable.
elif isinstance(value, dict):
    model_augs[key] = value
else:
    provider_aug[key] = value

P1 Legacy fallback creates bogus model profiles for dict-valued fields

When _profile_field_names() returns an empty frozenset (because langchain_core can't be imported or get_type_hints raises), the legacy branch treats every dict-valued key as a model ID. An [overrides.input_mime_types] block therefore ends up in model_augs under the key "input_mime_types" rather than in provider_aug. Because no model with that ID exists in models.dev, the extra_models loop then inserts a profile whose ID is literally "input_mime_types" into the generated _profiles.py. The fix is to emit a warning and skip the key in the legacy path, or document that the new dict-valued provider fields require langchain_core to be installed.
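
The "warn and skip" option could look roughly like this. It is a sketch under stated assumptions: `legacy_split` is a hypothetical stand-in for the legacy branch, and `DICT_VALUED_PROFILE_FIELDS` hard-codes the two field names added in this PR because the schema is, by definition, unavailable on this path.

```python
import warnings

# Assumption: the dict-valued provider-level fields introduced by this PR.
DICT_VALUED_PROFILE_FIELDS = frozenset({"input_mime_types", "output_mime_types"})


def legacy_split(overrides: dict[str, object]) -> tuple[dict, dict]:
    """Legacy heuristic (dict means model ID) with a guard for known MIME maps."""
    provider_aug: dict[str, object] = {}
    model_augs: dict[str, dict] = {}
    for key, value in overrides.items():
        if key in DICT_VALUED_PROFILE_FIELDS:
            # Without the schema this key would be mistaken for a model ID.
            warnings.warn(
                f"Skipping '{key}': ModelProfile schema is unavailable",
                stacklevel=2,
            )
            continue
        if isinstance(value, dict):
            model_augs[key] = value
        else:
            provider_aug[key] = value
    return provider_aug, model_augs


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    provider_aug, model_augs = legacy_split(
        {
            "input_mime_types": {"pdf": ["application/pdf"]},
            "gpt-3.5-turbo": {"pdf_inputs": False},
            "tool_choice": True,
        }
    )

print(provider_aug, model_augs, len(caught))
```

The trade-off is that the MIME maps are dropped in the fallback path instead of producing a bogus profile entry, which seems the safer failure mode for generated data.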

Comment on lines 14 to 18
[overrides."gpt-3.5-turbo"]
image_url_inputs = false
pdf_inputs = false
pdf_tool_message = false
image_tool_message = false

P1 gpt-3.5-turbo override does not clear input_mime_types

The override correctly disables image_url_inputs, pdf_inputs, pdf_tool_message, and image_tool_message, but it does not nullify input_mime_types. Because _apply_overrides is a shallow key-level merge, the provider-level input_mime_types dict is inherited as-is, leaving the generated profile in a contradictory state. Adding input_mime_types = {} to this block would override the provider-level value with an empty map consistent with the disabled boolean flags.
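
The effect of a shallow key-level merge can be shown in a few lines. The helper below is a hypothetical stand-in that assumes `_apply_overrides` behaves like layered `dict.update` calls, as the comment above describes; it is not the real function.

```python
def apply_overrides(base: dict, *layers: dict) -> dict:
    """Shallow merge: later layers replace whole values key by key."""
    merged = dict(base)
    for layer in layers:
        merged.update(layer)
    return merged


provider_level = {
    "input_mime_types": {
        "image": ["image/png", "image/jpeg", "image/gif", "image/webp"],
        "pdf": ["application/pdf"],
    },
}

# Model-level override as it is shown in the diff: flags only.
flags_only = {"image_url_inputs": False, "pdf_inputs": False}
inherited = apply_overrides({}, provider_level, flags_only)

# Same override with the suggested `input_mime_types = {}` added.
flags_and_empty_map = {**flags_only, "input_mime_types": {}}
cleared = apply_overrides({}, provider_level, flags_and_empty_map)

print(inherited["input_mime_types"])
print(cleared["input_mime_types"])
```

Since the merge replaces values per key, an empty map at the model level fully replaces the provider-level map rather than merging into it.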

@DhirenMhatre

Author

@codity review

@codity-dm

codity-dm Bot commented May 18, 2026

Policy Check Failed

✗ 3/3 policy checks failed:

• Need 2 more approval(s) (0/2) — comment LGTM or approve via review
• Missing ticket reference (expected: JIRA-, ENG-, #*)
• 6 code file(s) changed but no test files added


To merge this PR:

  1. Address the failed checks listed above
  2. Ensure branch protection requires the codity/policy-check status

Configure policies in your dashboard

@codity-dm

codity-dm Bot commented May 18, 2026

PR Summary

What Changed

  • Added input_mime_types and output_mime_types fields to ModelProfile for declaring supported MIME types per modality (images, PDFs).
  • Updated CLI validation to reject unknown augmentation keys by checking against ModelProfile schema.
  • Populated MIME type data for Anthropic, OpenAI, and Perplexity model profiles.

Key Changes by Area

Core Schema: Added input_mime_types and output_mime_types TypedDict fields to ModelProfile with documentation.

CLI Validation: Schema-driven validation now distinguishes provider-level vs model-level overrides. Unknown scalar keys trigger SystemExit(1).

Partner Profiles:

  • Anthropic: All 14 models now declare input_mime_types for images (jpeg/png/gif/webp) and PDFs.
  • OpenAI: Vision models declare image/PDF inputs; image generation models (gpt-image-1 series) declare output_mime_types.
  • Perplexity: sonar-deep-research model declares input_mime_types.

Files Changed

| File | Changes Summary |
| --- | --- |
| libs/core/langchain_core/language_models/model_profile.py | Added input_mime_types and output_mime_types TypedDict fields |
| libs/model-profiles/langchain_model_profiles/cli.py | Added schema-driven validation for augmentation keys with strict rejection of unknown fields |
| libs/model-profiles/tests/unit_tests/test_cli.py | Added tests for provider cascading, model overrides, and unknown key rejection |
| libs/partners/anthropic/langchain_anthropic/data/_profiles.py | Populated input_mime_types for all 14 model profiles |
| libs/partners/anthropic/langchain_anthropic/data/profile_augmentations.toml | Added [overrides.input_mime_types] section with image and PDF MIME types |
| libs/partners/openai/langchain_openai/data/_profiles.py | Added input_mime_types to vision models; output_mime_types to image generation models |
| libs/partners/openai/langchain_openai/data/profile_augmentations.toml | Added MIME type declarations for image inputs and outputs |
| libs/partners/perplexity/langchain_perplexity/data/_profiles.py | Added input_mime_types to model profile |
| libs/partners/perplexity/langchain_perplexity/data/profile_augmentations.toml | Added [overrides.input_mime_types] section |

Review Focus Areas

  • CLI validation logic at cli.py:55-109: ensure provider/model override distinction handles nested dicts correctly.
  • OpenAI output_mime_types for image generation models: verify completeness against actual API capabilities.

Architecture

Design Decisions:

  • Used TypedDict over dataclass for ModelProfile fields to maintain backward compatibility with existing TOML-based profile definitions.
  • Strict CLI validation (exit on unknown keys) prevents silent configuration errors in production deployments.

Risks:

  • Intentional: SystemExit(1) on validation failure is a breaking change for workflows with typos in augmentation TOMLs. This is acceptable to catch misconfigurations early.
  • Unintentional: Provider-level vs model-level override precedence in nested dict merging should be verified against expected cascading behavior.

Merge Status

MERGEABLE — PR Score 78/100, above threshold (50). All gates passed.

@codity-dm

codity-dm Bot commented May 18, 2026

Security Scan Summary

Metric Value
Vulnerabilities Critical: 0
Overall Risk Clean
Files Scanned 9

No critical security issues detected

Scan completed in 27.6s

Security scan powered by Codity.ai

@codity-dm

codity-dm Bot commented May 18, 2026

License Compliance Scan

Metric Value
Packages Scanned 0
High Risk (Strong Copyleft) 0
Medium Risk (Weak Copyleft) 0
Low Risk (Permissive) 0
Unknown License 0

All licenses are low-risk and compliant

Powered by Codity.ai · Docs

@codity-dm

codity-dm Bot commented May 18, 2026

Code Quality Report — test-org-codity/langchain · PR #2

Scanned: 2026-05-18 16:48 UTC | Score: 63/100 | Provider: github

Executive Summary

| Severity | Count |
| --- | --- |
| Critical | 0 |
| High | 0 |
| Medium | 4 |
| Low | 7 |

Top Findings

[CQ-LLM-003] libs/model-profiles/langchain_model_profiles/cli.py:70 (Complexity · MEDIUM)

Issue: Deep nesting in the if-else structure can reduce readability.
Suggestion: Refactor the nested if-else statements to improve readability.

+            if key in profile_fields:

[CQ-LLM-004] libs/model-profiles/langchain_model_profiles/cli.py:75 (Error_Handling · MEDIUM)

Issue: Swallowed exceptions when importing ModelProfile.
Suggestion: Log the exception or handle it appropriately instead of silently returning an empty frozenset.

+    except ImportError:

[CQ-LLM-005] libs/model-profiles/tests/unit_tests/test_cli.py:473 (Testability · MEDIUM)

Issue: Test function relies on global state and hard-coded dependencies.
Suggestion: Use dependency injection to pass dependencies into the test function.

+def test_refresh_merges_provider_level_mime_types(

[CQ-LLM-006] libs/partners/anthropic/langchain_anthropic/data/_profiles.py:41 (Maintainability · MEDIUM)

Issue: Repeated structure for input_mime_types across multiple profiles.
Suggestion: Consider creating a constant or a function to generate these structures to avoid duplication.

+        "input_mime_types": {

[CQ-LLM-001] libs/core/langchain_core/language_models/model_profile.py:58 (Documentation · LOW)

Issue: Missing detailed documentation for input_mime_types field.
Suggestion: Add more examples and details about the expected structure and usage of input_mime_types.

+    input_mime_types: dict[str, list[str]]

[CQ-LLM-002] libs/core/langchain_core/language_models/model_profile.py:118 (Documentation · LOW)

Issue: Missing detailed documentation for output_mime_types field.
Suggestion: Add more examples and details about the expected structure and usage of output_mime_types.

+    output_mime_types: dict[str, list[str]]

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:125 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"Augmentation key '{key}' is not a declared ModelProfile "

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:126 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"field and its value is not a table of overrides."

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:477 (Documentation · LOW)

Issue: Public def 'test_refresh_merges_provider_level_mime_types' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_merges_provider_level_mime_types(

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:523 (Documentation · LOW)

Issue: Public def 'test_refresh_model_level_mime_types_override_provider' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_model_level_mime_types_override_provider(

Per-File Breakdown

| File | Critical | High | Medium | Low | Total |
| --- | --- | --- | --- | --- | --- |
| libs/core/langchain_core/language_models/model_profile.py | 0 | 0 | 0 | 2 | 2 |
| libs/model-profiles/langchain_model_profiles/cli.py | 0 | 0 | 2 | 2 | 4 |
| libs/model-profiles/tests/unit_tests/test_cli.py | 0 | 0 | 1 | 3 | 4 |
| libs/partners/anthropic/langchain_anthropic/data/_profiles.py | 0 | 0 | 1 | 0 | 1 |

Recommendations

  • Run automated tests after applying fixes to verify no regressions.

@DhirenMhatre

Author

@codity review

@codity-dm

codity-dm Bot commented May 18, 2026

Policy Check Failed

✗ 3/3 policy checks failed:

• Need 2 more approval(s) (0/2) — comment LGTM or approve via review
• Missing ticket reference (expected: JIRA-, ENG-, #*)
• 6 code file(s) changed but no test files added


To merge this PR:

  1. Address the failed checks listed above
  2. Ensure branch protection requires the codity/policy-check status

Configure policies in your dashboard

@codity-dm

codity-dm Bot commented May 18, 2026

PR Summary

What Changed

  • Added input_mime_types and output_mime_types fields to ModelProfile for tracking supported media formats per modality (image, PDF, audio, video).
  • Updated CLI to use schema-driven validation for provider vs model-level overrides, with strict rejection of unknown keys.
  • Populated MIME type data for OpenAI, Anthropic, and Perplexity model profiles.

Key Changes by Area

Core Schema: Added input_mime_types and output_mime_types dict fields to ModelProfile in libs/core/langchain_core/language_models/model_profile.py:58-136. Removed TODO comments about format details.

CLI Validation: Schema-driven override parsing in libs/model-profiles/langchain_model_profiles/cli.py:55-135. Keys matching ModelProfile fields are provider-level, others are model IDs. Unknown scalar keys now exit with code 1.

Partner Data: Populated input_mime_types for image (png, jpeg, gif, webp) and PDF across OpenAI (GPT-4o, GPT-5.x, o-series, image models), Anthropic (Claude models), and Perplexity (sonar-deep-research). Added output_mime_types for OpenAI image generation models.

Files Changed

| File | Changes Summary |
| --- | --- |
| libs/core/langchain_core/language_models/model_profile.py | Added input_mime_types and output_mime_types fields; removed TODO comments |
| libs/model-profiles/langchain_model_profiles/cli.py | Added _profile_field_names(); schema-driven override parsing; strict unknown key rejection |
| libs/model-profiles/tests/unit_tests/test_cli.py | Tests for MIME type cascading, model-level overrides, unknown key rejection |
| libs/partners/anthropic/langchain_anthropic/data/_profiles.py | Populated input_mime_types for Claude models |
| libs/partners/anthropic/langchain_anthropic/data/profile_augmentations.toml | Added [overrides.input_mime_types] section |
| libs/partners/openai/langchain_openai/data/_profiles.py | Added input_mime_types to all models; output_mime_types for image generation models |
| libs/partners/openai/langchain_openai/data/profile_augmentations.toml | Centralized MIME type overrides; added output_mime_types for image models |
| libs/partners/perplexity/langchain_perplexity/data/_profiles.py | Added input_mime_types to sonar-deep-research |
| libs/partners/perplexity/langchain_perplexity/data/profile_augmentations.toml | Added MIME type override for sonar-deep-research |

Review Focus Areas

  • CLI strict validation: confirm SystemExit(1) on unknown keys is the intended behavior change from silent acceptance.
  • OpenAI output_mime_types for image models: verify completeness of supported output formats.

Architecture

Design Decisions: Schema-driven CLI validation replaces ad-hoc parsing. Provider-level vs model-level distinction is based on field name matching rather than explicit markers. This is a deliberate tradeoff for simpler TOML structure.

Risks: Strict unknown key rejection (SystemExit(1)) is an intentional breaking change from silent acceptance. This may break existing TOML files with typos or unsupported keys.

Merge Status

MERGEABLE — PR Score 72/100, above threshold (50). All gates passed.

Comment on lines 1541 to +1916
@@ -1102,6 +1574,17 @@
        "pdf_tool_message": True,
        "image_tool_message": True,
        "tool_choice": True,
        "input_mime_types": {
            "image": [
                "image/png",
                "image/jpeg",
                "image/gif",
                "image/webp",
            ],
            "pdf": [
                "application/pdf",
            ],
        },
    },
    "o1-pro": {
        "name": "o1-pro",
@@ -1128,6 +1611,17 @@
        "pdf_tool_message": True,
        "image_tool_message": True,
        "tool_choice": True,
        "input_mime_types": {
            "image": [
                "image/png",
                "image/jpeg",
                "image/gif",
                "image/webp",
            ],
            "pdf": [
                "application/pdf",
            ],
        },
    },
    "o3": {
        "name": "o3",
@@ -1154,6 +1648,17 @@
        "pdf_tool_message": True,
        "image_tool_message": True,
        "tool_choice": True,
        "input_mime_types": {
            "image": [
                "image/png",
                "image/jpeg",
                "image/gif",
                "image/webp",
            ],
            "pdf": [
                "application/pdf",
            ],
        },
    },
    "o3-deep-research": {
        "name": "o3-deep-research",
@@ -1179,6 +1684,17 @@
        "pdf_tool_message": True,
        "image_tool_message": True,
        "tool_choice": True,
        "input_mime_types": {
            "image": [
                "image/png",
                "image/jpeg",
                "image/gif",
                "image/webp",
            ],
            "pdf": [
                "application/pdf",
            ],
        },
    },
    "o3-mini": {
        "name": "o3-mini",
@@ -1205,6 +1721,17 @@
        "pdf_tool_message": True,
        "image_tool_message": True,
        "tool_choice": True,
        "input_mime_types": {
            "image": [
                "image/png",
                "image/jpeg",
                "image/gif",
                "image/webp",
            ],
            "pdf": [
                "application/pdf",
            ],
        },
    },
    "o3-pro": {
        "name": "o3-pro",
@@ -1231,6 +1758,17 @@
        "pdf_tool_message": True,
        "image_tool_message": True,
        "tool_choice": True,
        "input_mime_types": {
            "image": [
                "image/png",
                "image/jpeg",
                "image/gif",
                "image/webp",
            ],
            "pdf": [
                "application/pdf",
            ],
        },
    },
    "o4-mini": {
        "name": "o4-mini",
@@ -1257,6 +1795,17 @@
        "pdf_tool_message": True,
        "image_tool_message": True,
        "tool_choice": True,
        "input_mime_types": {
            "image": [
                "image/png",
                "image/jpeg",
                "image/gif",
                "image/webp",
            ],
            "pdf": [
                "application/pdf",
            ],
        },
    },
    "o4-mini-deep-research": {
        "name": "o4-mini-deep-research",
@@ -1282,6 +1831,17 @@
        "pdf_tool_message": True,
        "image_tool_message": True,
        "tool_choice": True,
        "input_mime_types": {
            "image": [
                "image/png",
                "image/jpeg",
                "image/gif",
                "image/webp",
            ],
            "pdf": [
                "application/pdf",
            ],
        },
    },
    "text-embedding-3-large": {
        "name": "text-embedding-3-large",
@@ -1307,6 +1867,17 @@
        "pdf_tool_message": True,
        "image_tool_message": True,
        "tool_choice": True,
        "input_mime_types": {
            "image": [
                "image/png",
                "image/jpeg",
                "image/gif",
                "image/webp",
            ],
            "pdf": [
                "application/pdf",
            ],
        },
    },
    "text-embedding-3-small": {
        "name": "text-embedding-3-small",
@@ -1332,6 +1903,17 @@
        "pdf_tool_message": True,
        "image_tool_message": True,
        "tool_choice": True,
        "input_mime_types": {
            "image": [
                "image/png",
                "image/jpeg",
                "image/gif",
                "image/webp",
            ],
            "pdf": [
                "application/pdf",
            ],
        },

Functional High

These MIME type declarations claim image and PDF support for models such as o1-mini, o1-preview, and the embedding models, even though the surrounding capability flags mark attachments or image inputs as unsupported. Derive input_mime_types from the existing capability booleans, or omit the unsupported categories.

Suggested fix
        "input_mime_types": {
            **({
                "image": [
                    "image/png",
                    "image/jpeg",
                    "image/gif",
                    "image/webp",
                ],
            } if profile.get("image_inputs") and profile.get("attachment") else {}),
            **({
                "pdf": [
                    "application/pdf",
                ],
            } if profile.get("pdf_inputs") and profile.get("attachment") else {}),
        },
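
The suggestion above can be read as a small derivation step rather than an inline dict expression. The following is a minimal sketch under assumptions: the flag names `image_inputs`, `pdf_inputs`, and `attachment` are taken from the suggested fix, the helper name `derive_input_mime_types` is hypothetical, and whether these flags are the right gates for each provider is not confirmed by the PR.

```python
from typing import Any

# MIME lists copied from the diff above.
IMAGE_MIME_TYPES = ["image/png", "image/jpeg", "image/gif", "image/webp"]
PDF_MIME_TYPES = ["application/pdf"]


def derive_input_mime_types(profile: dict[str, Any]) -> dict[str, list[str]]:
    """Return only the modalities whose capability flags are truthy.

    Hypothetical helper: flag names mirror the suggested fix, not a
    confirmed ModelProfile contract.
    """
    mime_types: dict[str, list[str]] = {}
    # Nothing is declared when attachments are disabled for the model.
    if not profile.get("attachment"):
        return mime_types
    if profile.get("image_inputs"):
        mime_types["image"] = list(IMAGE_MIME_TYPES)
    if profile.get("pdf_inputs"):
        mime_types["pdf"] = list(PDF_MIME_TYPES)
    return mime_types


vision_profile = {"attachment": True, "image_inputs": True, "pdf_inputs": True}
embedding_profile = {"attachment": False, "image_inputs": False}
print(derive_input_mime_types(vision_profile))
print(derive_input_mime_types(embedding_profile))
```

Because _profiles.py is described as regenerated output, a step like this would presumably belong in the generator rather than in the data file itself.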
File: libs/partners/openai/langchain_openai/data/_profiles.py
Lines: 1541-1916

Comment on lines +115 to +129
        if profile_fields:
            # Schema-driven: known profile field names are provider-level; all
            # other keys are treated as model identifiers (whose values must be
            # dict overrides).
            if key in profile_fields:
                provider_aug[key] = value
            elif isinstance(value, dict):
                model_augs[key] = value
            else:
                msg = (
                    f"Augmentation key '{key}' is not a declared ModelProfile "
                    f"field and its value is not a table of overrides."
                )
                print(f"❌ {msg}", file=sys.stderr)
                sys.exit(1)

Robustness Medium

The new schema-driven branch exits on any non-dict value whose key is not a ModelProfile field. This breaks forward compatibility when profile_augmentations.toml contains a provider-level field that the installed ModelProfile does not yet declare. Keep unknown scalar keys as provider overrides instead of exiting.

Suggested fix
        if profile_fields:
            # Schema-driven: known profile field names are provider-level; dict
            # values for unknown keys are treated as model identifiers.
            if key in profile_fields:
                provider_aug[key] = value
            elif isinstance(value, dict):
                model_augs[key] = value
            else:
                provider_aug[key] = value
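
The trade-off between the PR's strict behavior and the lenient behavior suggested above can be compared side by side. This is a minimal sketch, not the CLI's actual code: `split_augmentations` and its `strict` parameter are hypothetical, and it raises `ValueError` instead of calling `sys.exit(1)` so that it can be exercised in isolation.

```python
from typing import Any


def split_augmentations(
    overrides: dict[str, Any],
    profile_fields: frozenset[str],
    *,
    strict: bool = True,
) -> tuple[dict[str, Any], dict[str, dict[str, Any]]]:
    """Split top-level override keys into provider-level and model-level.

    Hypothetical helper mirroring the branch under review.
    """
    provider_aug: dict[str, Any] = {}
    model_augs: dict[str, dict[str, Any]] = {}
    for key, value in overrides.items():
        if key in profile_fields:
            # Declared field names are provider-level, even when the value
            # is a table (for example input_mime_types).
            provider_aug[key] = value
        elif isinstance(value, dict):
            # Any other table is treated as a model identifier.
            model_augs[key] = value
        elif strict:
            # Behavior in the PR: unknown scalar keys are an error.
            msg = (
                f"Augmentation key '{key}' is not a declared ModelProfile "
                "field and its value is not a table of overrides."
            )
            raise ValueError(msg)
        else:
            # Behavior in the suggestion: keep unknown scalars as
            # provider-level overrides.
            provider_aug[key] = value
    return provider_aug, model_augs


fields = frozenset({"tool_choice", "input_mime_types"})
overrides = {
    "tool_choice": True,
    "input_mime_types": {"image": ["image/png"]},
    "o3": {"tool_choice": False},
}
print(split_augmentations(overrides, fields))
```

Strict mode catches typos such as a misspelled field name early, while lenient mode lets a TOML file run ahead of the installed schema. Which one is preferable depends on who maintains the augmentation files.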
File: libs/model-profiles/langchain_model_profiles/cli.py
Lines: 115-129

@codity-dm

codity-dm Bot commented May 18, 2026

Security Scan Summary

Metric Value
Vulnerabilities Critical: 0
Overall Risk Clean
Files Scanned 9

No critical security issues detected

Scan completed in 26.6s

Security scan powered by Codity.ai

@codity-dm

codity-dm Bot commented May 18, 2026

License Compliance Scan

Metric Value
Packages Scanned 0
High Risk (Strong Copyleft) 0
Medium Risk (Weak Copyleft) 0
Low Risk (Permissive) 0
Unknown License 0

All licenses are low-risk and compliant

Powered by Codity.ai · Docs

@codity-dm

codity-dm Bot commented May 18, 2026

Code Quality Report — test-org-codity/langchain · PR #2

Scanned: 2026-05-18 17:07 UTC | Score: 67/100 | Provider: github

Executive Summary

Severity Count
Critical 0
High 0
Medium 3
Low 8

Top Findings

[CQ-LLM-003] libs/model-profiles/langchain_model_profiles/cli.py:70 (Complexity · MEDIUM)

Issue: The function '_load_augmentations' has increased cyclomatic complexity due to multiple conditional branches.
Suggestion: Refactor the function to reduce complexity, possibly by breaking it into smaller functions.

+        if profile_fields:

[CQ-LLM-004] libs/model-profiles/langchain_model_profiles/cli.py:80 (Error_Handling · MEDIUM)

Issue: Swallowed exceptions in '_profile_field_names' function could lead to silent failures.
Suggestion: Log the exceptions or handle them appropriately to avoid silent failures.

+    except (TypeError, NameError):

[CQ-LLM-005] libs/model-profiles/tests/unit_tests/test_cli.py:472 (Testability · MEDIUM)

Issue: Test functions are tightly coupled to the implementation details of the 'refresh' function.
Suggestion: Use dependency injection or mocking to decouple tests from implementation details.

+    mock_response = Mock()

[CQ-LLM-001] libs/core/langchain_core/language_models/model_profile.py:58 (Documentation · LOW)

Issue: Missing detailed documentation for the new 'input_mime_types' field.
Suggestion: Add examples and detailed descriptions for the 'input_mime_types' field.

+    input_mime_types: dict[str, list[str]]

[CQ-LLM-002] libs/core/langchain_core/language_models/model_profile.py:118 (Documentation · LOW)

Issue: Missing detailed documentation for the new 'output_mime_types' field.
Suggestion: Add examples and detailed descriptions for the 'output_mime_types' field.

+    output_mime_types: dict[str, list[str]]

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:125 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"Augmentation key '{key}' is not a declared ModelProfile "

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:126 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"field and its value is not a table of overrides."

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:477 (Documentation · LOW)

Issue: Public def 'test_refresh_merges_provider_level_mime_types' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_merges_provider_level_mime_types(

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:523 (Documentation · LOW)

Issue: Public def 'test_refresh_model_level_mime_types_override_provider' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_model_level_mime_types_override_provider(

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:576 (Documentation · LOW)

Issue: Public def 'test_refresh_rejects_unknown_scalar_top_level_key' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_rejects_unknown_scalar_top_level_key(
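
Finding CQ-LLM-004 above concerns swallowed exceptions while introspecting the schema. One way to address it is to log the failure before falling back. The following is a sketch under assumptions: `ExampleProfile` is a hypothetical stand-in for ModelProfile, and `profile_field_names` illustrates the idea rather than reproducing the CLI's `_profile_field_names`.

```python
import logging
from typing import TypedDict, get_type_hints

logger = logging.getLogger(__name__)


class ExampleProfile(TypedDict, total=False):
    """Hypothetical stand-in for ModelProfile."""

    tool_choice: bool
    input_mime_types: dict[str, list[str]]
    output_mime_types: dict[str, list[str]]


def profile_field_names(schema: type) -> frozenset[str]:
    """Return the declared field names of a TypedDict-style schema."""
    try:
        return frozenset(get_type_hints(schema))
    except (TypeError, NameError) as exc:
        # Log instead of silently swallowing, then fall back to the raw
        # (possibly unresolved) annotations.
        logger.warning(
            "Could not resolve type hints for %s: %s", schema.__name__, exc
        )
        return frozenset(getattr(schema, "__annotations__", {}))


print(sorted(profile_field_names(ExampleProfile)))
```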

Per-File Breakdown

File Critical High Medium Low Total
libs/core/langchain_core/language_models/model_profile.py 0 0 0 2 2
libs/model-profiles/langchain_model_profiles/cli.py 0 0 2 2 4
libs/model-profiles/tests/unit_tests/test_cli.py 0 0 1 3 4
libs/partners/anthropic/langchain_anthropic/data/_profiles.py 0 0 0 1 1

Recommendations

  • Run automated tests after applying fixes to verify no regressions.

@DhirenMhatre

Author

@codity review

@codity-dm

codity-dm Bot commented May 19, 2026

Policy Check Failed

✗ 3/3 policy checks failed:

• Need 2 more approval(s) (0/2) — comment LGTM or approve via review
• Missing ticket reference (expected: JIRA-, ENG-, #*)
• 6 code file(s) changed but no test files added


To merge this PR:

  1. Address the failed checks listed above
  2. Ensure branch protection requires the codity/policy-check status

Configure policies in your dashboard

@codity-dm

codity-dm Bot commented May 19, 2026

PR Summary

What Changed

  • Added input_mime_types and output_mime_types fields to ModelProfile for declaring supported MIME types per modality.
  • Updated CLI to validate augmentation keys against ModelProfile schema, rejecting unknown fields.
  • Populated MIME type metadata for Anthropic, OpenAI, and Perplexity model profiles.

Key Changes by Area

Core Schema: Added input_mime_types and output_mime_types TypedDict fields to ModelProfile with documentation.

CLI Validation: Schema-driven validation now distinguishes provider-level vs model-level overrides and exits on unknown scalar keys.

Anthropic Models: Added input_mime_types (image: jpeg/png/gif/webp, pdf: application/pdf) to all 14 Claude model profiles.

OpenAI Models: Added input_mime_types to GPT-4o, GPT-5, o1/o3/o4, and embedding models. Added output_mime_types for gpt-image-1 variants.

Perplexity Models: Added input_mime_types to sonar-deep-research model.

Files Changed

File Changes Summary
libs/core/langchain_core/language_models/model_profile.py Added input_mime_types and output_mime_types TypedDict fields
libs/model-profiles/langchain_model_profiles/cli.py Added schema-driven validation for augmentation keys
libs/model-profiles/tests/unit_tests/test_cli.py Added tests for cascading, overrides, and unknown key rejection
libs/partners/anthropic/langchain_anthropic/data/_profiles.py Added input_mime_types to all Claude model profiles
libs/partners/anthropic/langchain_anthropic/data/profile_augmentations.toml Added [overrides.input_mime_types] section
libs/partners/openai/langchain_openai/data/_profiles.py Added input_mime_types and output_mime_types to model profiles
libs/partners/openai/langchain_openai/data/profile_augmentations.toml Centralized MIME type overrides for image generation models
libs/partners/perplexity/langchain_perplexity/data/_profiles.py Added input_mime_types to sonar-deep-research
libs/partners/perplexity/langchain_perplexity/data/profile_augmentations.toml Added MIME type override for sonar-deep-research

Review Focus Areas

  • CLI validation logic in cli.py:55-109 for proper schema enforcement and error messages.
  • MIME type coverage completeness across OpenAI model variants (embedding models, reasoning models).
  • Test coverage for edge cases in augmentation key validation.

Architecture

Design Decisions: MIME types are declared per-model rather than per-provider to allow fine-grained differences (e.g., older models lacking webp support). The CLI uses runtime schema introspection to stay in sync with ModelProfile definitions without manual maintenance.

Scalability & Extensibility: New MIME types or modalities require only profile updates, no code changes. Out of scope: automatic discovery of MIME type support from provider APIs.

Risks: Intentional: Strict CLI validation may break existing augmentation files with typos or unofficial keys. This is acceptable to prevent silent misconfiguration.

Merge Status

MERGEABLE — PR Score 79/100, above threshold (50). All gates passed.

Comment on lines +1541 to +1551
        "input_mime_types": {
            "image": [
                "image/png",
                "image/jpeg",
                "image/gif",
                "image/webp",
            ],
            "pdf": [
                "application/pdf",
            ],
        },

Functional High

These profiles advertise image and PDF MIME types even for models whose capability flags in this file disable those inputs. Generate input_mime_types conditionally, or omit the unsupported categories.

Also reported at: libs/partners/openai/langchain_openai/data/_profiles.py L1577–L1587, L1724–L1734, L1870–L1880, L1906–L1916, L1942–L1944

Suggested fix
        "input_mime_types": {
            "pdf": [
                "application/pdf",
            ],
        },
File: libs/partners/openai/langchain_openai/data/_profiles.py
Lines: 1541-1551

@codity-dm

codity-dm Bot commented May 19, 2026

Security Scan Summary

Metric Value
Vulnerabilities Critical: 0
Overall Risk Clean
Files Scanned 9

No critical security issues detected

Scan completed in 28.6s

Security scan powered by Codity.ai

@codity-dm

codity-dm Bot commented May 19, 2026

License Compliance Scan

Metric Value
Packages Scanned 0
High Risk (Strong Copyleft) 0
Medium Risk (Weak Copyleft) 0
Low Risk (Permissive) 0
Unknown License 0

All licenses are low-risk and compliant

Powered by Codity.ai · Docs

@codity-dm

codity-dm Bot commented May 19, 2026

Code Quality Report — test-org-codity/langchain · PR #2

Scanned: 2026-05-19 19:41 UTC | Score: 89/100 | Provider: github

Executive Summary

Severity Count
Critical 0
High 0
Medium 0
Low 5

Top Findings

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:125 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"Augmentation key '{key}' is not a declared ModelProfile "

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:126 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"field and its value is not a table of overrides."

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:477 (Documentation · LOW)

Issue: Public def 'test_refresh_merges_provider_level_mime_types' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_merges_provider_level_mime_types(

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:523 (Documentation · LOW)

Issue: Public def 'test_refresh_model_level_mime_types_override_provider' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_model_level_mime_types_override_provider(

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:576 (Documentation · LOW)

Issue: Public def 'test_refresh_rejects_unknown_scalar_top_level_key' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_rejects_unknown_scalar_top_level_key(

Per-File Breakdown

File Critical High Medium Low Total
libs/model-profiles/langchain_model_profiles/cli.py 0 0 0 2 2
libs/model-profiles/tests/unit_tests/test_cli.py 0 0 0 3 3

Recommendations

  • Run automated tests after applying fixes to verify no regressions.

@DhirenMhatre

Author

@codity review

@codity-dm

codity-dm Bot commented May 20, 2026

Policy Check Failed

✗ 3/3 policy checks failed:

• Need 2 more approval(s) (0/2) — comment LGTM or approve via review
• Missing ticket reference (expected: JIRA-, ENG-, #*)
• 6 code file(s) changed but no test files added


To merge this PR:

  1. Address the failed checks listed above
  2. Ensure branch protection requires the codity/policy-check status

Configure policies in your dashboard

@codity-dm

codity-dm Bot commented May 20, 2026

PR Summary

What Changed

  • Added input_mime_types and output_mime_types fields to ModelProfile for declaring supported MIME types per modality.
  • Populated MIME type data for OpenAI, Anthropic, and Perplexity models with actual supported formats.
  • Added CLI support in langchain_model_profiles for provider-level and model-level MIME type overrides via TOML configuration.

Key Changes by Area

Core Model Profiles: Extended the ModelProfile TypedDict with informational input_mime_types and output_mime_types fields.

OpenAI Partner: Added MIME type support for GPT-5.x, o1, and image generation models. Image inputs support png/jpeg/gif/webp. PDF input supported for most models. Image generation models declare output formats.

Anthropic Partner: Populated MIME types for all models with image (jpeg/png/gif/webp) and PDF support.

Perplexity Partner: Added image MIME type support for sonar-deep-research model.

CLI Tooling: The CLI now dispatches provider-level vs model-level overrides in profile_augmentations.toml by ModelProfile field name, so provider-level MIME maps are routed correctly.

Files Changed

File Changes Summary
libs/core/langchain_core/language_models/model_profile.py Added input_mime_types and output_mime_types fields to the ModelProfile TypedDict
libs/model-profiles/langchain_model_profiles/cli.py Added dispatch of provider-level vs model-level overrides by ModelProfile field name
libs/model-profiles/tests/unit_tests/test_cli.py Added tests for new CLI MIME type functionality
libs/partners/anthropic/langchain_anthropic/data/_profiles.py Populated input_mime_types for all Anthropic models
libs/partners/anthropic/langchain_anthropic/data/profile_augmentations.toml Added [overrides.input_mime_types] section with image and PDF MIME types
libs/partners/openai/langchain_openai/data/_profiles.py Added input_mime_types to all models; output_mime_types to image generation models
libs/partners/openai/langchain_openai/data/profile_augmentations.toml Added default MIME type overrides and model-specific exceptions (e.g., gpt-3.5-turbo disables image/PDF)
libs/partners/perplexity/langchain_perplexity/data/_profiles.py Added input_mime_types for sonar-deep-research model
libs/partners/perplexity/langchain_perplexity/data/profile_augmentations.toml Added MIME type configuration for Perplexity models

Review Focus Areas

  • Verify MIME type lists match actual provider API capabilities (especially Anthropic PDF support and OpenAI model-specific exceptions).
  • Check CLI override logic correctly merges provider defaults with model-specific exceptions.
  • Confirm output_mime_types is only set for image generation models, not chat models.
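
The merge behavior named in the second focus area can be pictured as layered dict updates. This is a sketch only: `merge_profile` is hypothetical, and it assumes a shallow merge in which a model-level `input_mime_types` replaces the provider-level map wholesale. The PR conversation does not show whether the real CLI merges nested tables shallowly or deeply.

```python
from typing import Any


def merge_profile(
    base: dict[str, Any],
    provider_aug: dict[str, Any],
    model_aug: dict[str, Any],
) -> dict[str, Any]:
    """Shallow merge: later layers replace earlier values key by key.

    Assumption: nested maps are replaced, not merged recursively.
    """
    return {**base, **provider_aug, **model_aug}


provider_defaults = {
    "input_mime_types": {
        "image": ["image/png", "image/jpeg"],
        "pdf": ["application/pdf"],
    }
}

# A text-only model overrides the provider default with an empty map.
text_only = merge_profile(
    {"name": "gpt-3.5-turbo"}, provider_defaults, {"input_mime_types": {}}
)
# A model without its own override inherits the provider default.
vision = merge_profile({"name": "o3"}, provider_defaults, {})

print(text_only["input_mime_types"])
print(vision["input_mime_types"])
```

Under a shallow merge, a model that only wants to drop one modality has to restate the remaining ones, which is worth checking against the tests for model-level overrides.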

Architecture

Design Decisions: Used dict-of-lists structure ({"image": ["image/png", ...]}) rather than flat list to allow modality-aware validation. TOML overrides enable hotfixes without code changes when providers update capabilities.

Scalability & Extensibility: Structure supports new modalities (audio, video) by adding new keys to the dict. Out of scope: runtime validation logic (this PR only declares capabilities).
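
Although runtime validation is out of scope for the PR, the dict-of-lists shape makes a consumer-side check straightforward. The following is a minimal sketch, assuming a profile is a plain dict with an optional `input_mime_types` key; `supports_input_mime` is a hypothetical helper, not part of LangChain.

```python
from __future__ import annotations

from typing import Any


def supports_input_mime(
    profile: dict[str, Any],
    mime_type: str,
    modality: str | None = None,
) -> bool:
    """Check a MIME type against a profile's declared input types.

    With a modality, only that modality's list is consulted; without one,
    any modality may match. A missing field means nothing is declared.
    """
    declared: dict[str, list[str]] = profile.get("input_mime_types", {})
    if modality is not None:
        return mime_type in declared.get(modality, [])
    return any(mime_type in types for types in declared.values())


profile = {
    "input_mime_types": {
        "image": ["image/png", "image/jpeg", "image/gif", "image/webp"],
        "pdf": ["application/pdf"],
    }
}
print(supports_input_mime(profile, "image/png", "image"))
print(supports_input_mime(profile, "audio/wav"))
```

Since the fields are described as informational, a missing key is better read as "unknown" than as "unsupported". The sketch returns False in that case only for simplicity.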

Risks:

  • Intentional: Hardcoded MIME type lists will drift from provider APIs over time. TOML overrides mitigate but require manual maintenance.
  • Unintentional: gpt-3.5-turbo exception in OpenAI TOML may be overly broad (check if newer versions support images).

Merge Status

MERGEABLE — PR Score 66/100, above threshold (50). All gates passed.

@codity-dm

codity-dm Bot commented May 20, 2026

Security Scan Summary

Metric Value
Vulnerabilities Critical: 0
Overall Risk Clean
Files Scanned 9

No critical security issues detected

Scan completed in 27.9s

Security scan powered by Codity.ai

@codity-dm

codity-dm Bot commented May 20, 2026

License Compliance Scan

Metric Value
Packages Scanned 0
High Risk (Strong Copyleft) 0
Medium Risk (Weak Copyleft) 0
Low Risk (Permissive) 0
Unknown License 0

All licenses are low-risk and compliant

Powered by Codity.ai · Docs

@codity-dm

codity-dm Bot commented May 20, 2026

Code Quality Report — test-org-codity/langchain · PR #2

Scanned: 2026-05-20 20:03 UTC | Score: 67/100 | Provider: github

Executive Summary

Severity Count
Critical 0
High 0
Medium 3
Low 8

Top Findings

[CQ-LLM-003] libs/model-profiles/langchain_model_profiles/cli.py:55 (Complexity · MEDIUM)

Issue: The _profile_field_names function has multiple try-except blocks which increases complexity.
Suggestion: Consider simplifying the error handling logic to reduce cyclomatic complexity.

def _profile_field_names() -> frozenset[str]:

[CQ-LLM-004] libs/model-profiles/langchain_model_profiles/cli.py:90 (Error_Handling · MEDIUM)

Issue: The error handling for unknown scalar keys does not provide a clear mechanism for recovery.
Suggestion: Consider logging the error or providing a more informative message before exiting.

sys.exit(1)

[CQ-LLM-005] libs/model-profiles/tests/unit_tests/test_cli.py:472 (Testability · MEDIUM)

Issue: The test cases are tightly coupled with the implementation details of the refresh function.
Suggestion: Use dependency injection to make the tests more flexible and easier to maintain.

refresh("anthropic", data_dir)

[CQ-LLM-001] libs/core/langchain_core/language_models/model_profile.py:58 (Documentation · LOW)

Issue: Missing detailed documentation for input_mime_types field.
Suggestion: Add more examples and details about the expected structure and usage of input_mime_types.

input_mime_types: dict[str, list[str]]

[CQ-LLM-002] libs/core/langchain_core/language_models/model_profile.py:118 (Documentation · LOW)

Issue: Missing detailed documentation for output_mime_types field.
Suggestion: Add more examples and details about the expected structure and usage of output_mime_types.

output_mime_types: dict[str, list[str]]

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:125 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"Augmentation key '{key}' is not a declared ModelProfile "

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:126 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"field and its value is not a table of overrides."

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:477 (Documentation · LOW)

Issue: Public def 'test_refresh_merges_provider_level_mime_types' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_merges_provider_level_mime_types(

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:523 (Documentation · LOW)

Issue: Public def 'test_refresh_model_level_mime_types_override_provider' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_model_level_mime_types_override_provider(

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:576 (Documentation · LOW)

Issue: Public def 'test_refresh_rejects_unknown_scalar_top_level_key' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_rejects_unknown_scalar_top_level_key(

Per-File Breakdown

File Critical High Medium Low Total
libs/core/langchain_core/language_models/model_profile.py 0 0 0 2 2
libs/model-profiles/langchain_model_profiles/cli.py 0 0 2 2 4
libs/model-profiles/tests/unit_tests/test_cli.py 0 0 1 3 4
libs/partners/anthropic/langchain_anthropic/data/_profiles.py 0 0 0 1 1

Recommendations

  • Run automated tests after applying fixes to verify no regressions.

@DhirenMhatre

Author

@codity review

@codity-dm

codity-dm Bot commented May 20, 2026

Policy Check Failed

✗ 3/3 policy checks failed:

• Need 2 more approval(s) (0/2) — comment LGTM or approve via review
• Missing ticket reference (expected: JIRA-, ENG-, #*)
• 6 code file(s) changed but no test files added


To merge this PR:

  1. Address the failed checks listed above
  2. Ensure branch protection requires the codity/policy-check status

Configure policies in your dashboard

@codity-dm

codity-dm Bot commented May 20, 2026

PR Summary

What Changed

  • Added input_mime_types and output_mime_types fields to ModelProfile for declaring supported MIME types per model and modality.
  • Updated CLI validation to reject unknown augmentation keys by checking against ModelProfile schema.
  • Populated MIME type metadata for Anthropic, OpenAI, and Perplexity model families.

Key Changes by Area

Core Schema: Added input_mime_types and output_mime_types TypedDict fields to ModelProfile with documentation.

CLI Validation: Schema-driven validation now distinguishes provider-level from model-level overrides. Unknown scalar keys trigger SystemExit(1).

Anthropic Models: All 16 Claude models now declare image (image/jpeg, image/png, image/gif, image/webp) and PDF (application/pdf) support.

OpenAI Models: All GPT and o1 families plus gpt-image-1 variants declare input MIME types. Image generation models also declare output_mime_types.

Perplexity Models: sonar-deep-research profile includes image MIME type support via base profile and augmentation override.

Files Changed

File Changes Summary
libs/core/langchain_core/language_models/model_profile.py Added input_mime_types and output_mime_types TypedDict fields
libs/model-profiles/langchain_model_profiles/cli.py Added schema-driven validation for augmentation keys with unknown key rejection
libs/model-profiles/tests/unit_tests/test_cli.py Added tests for provider cascading, model overrides, and unknown key rejection
libs/partners/anthropic/langchain_anthropic/data/_profiles.py Populated input_mime_types for all 16 Claude model profiles
libs/partners/anthropic/langchain_anthropic/data/profile_augmentations.toml Added [overrides.input_mime_types] section for Claude Opus/Sonnet variants
libs/partners/openai/langchain_openai/data/_profiles.py Added input_mime_types to all OpenAI model profiles
libs/partners/openai/langchain_openai/data/profile_augmentations.toml Added default input_mime_types override and output_mime_types for gpt-image-1
libs/partners/perplexity/langchain_perplexity/data/_profiles.py Added input_mime_types to sonar-deep-research profile
libs/partners/perplexity/langchain_perplexity/data/profile_augmentations.toml Added image MIME types override for sonar-deep-research

Review Focus Areas

  • CLI validation logic at cli.py:58-132: confirm provider vs model override detection handles nested structures correctly.
  • OpenAI output_mime_types for image generation models: verify completeness against actual API capabilities.
  • Perplexity MIME type declarations: confirm sonar-deep-research image support is accurate (documentation is sparse).

Architecture

Design Decisions: Using ModelProfile schema as the source of truth for valid augmentation keys prevents drift between core definitions and CLI validation. This creates a hard failure on unknown keys rather than silent ignore, catching typos early at the cost of requiring schema updates for new fields.

Scalability & Extensibility: MIME type declarations are plain lists of MIME strings keyed by modality, with no further per-modality metadata. This keeps the schema simple but may require revisiting if modality-specific metadata (e.g., max image dimensions) becomes needed.

Risks:

  • Intentional: Hard SystemExit(1) on unknown keys breaks backward compatibility for any external augmentation files with extra keys. This is acceptable for a CLI tool but should be documented.
  • Unintentional: Perplexity's image support is declared based on limited documentation. Risk of false positive capability declaration if the API differs from documentation.

@codity-dm

codity-dm Bot commented May 20, 2026

Security Scan Summary

Metric Value
Vulnerabilities Critical: 0
Overall Risk Clean
Files Scanned 9

No critical security issues detected

Scan completed in 28.8s

Security scan powered by Codity.ai

@codity-dm

codity-dm Bot commented May 20, 2026

License Compliance Scan

Metric Value
Packages Scanned 0
High Risk (Strong Copyleft) 0
Medium Risk (Weak Copyleft) 0
Low Risk (Permissive) 0
Unknown License 0

All licenses are low-risk and compliant

Powered by Codity.ai · Docs

@DhirenMhatre

Author

@codity review

@codity-dm

codity-dm Bot commented May 20, 2026

Policy Check Failed

✗ 3/3 policy checks failed:

• Need 2 more approval(s) (0/2) — comment LGTM or approve via review
• Missing ticket reference (expected: JIRA-, ENG-, #*)
• 6 code file(s) changed but no test files added


To merge this PR:

  1. Address the failed checks listed above
  2. Ensure branch protection requires the codity/policy-check status

Configure policies in your dashboard

@codity-dm

codity-dm Bot commented May 20, 2026

PR Summary

What Changed

  • Added input_mime_types and output_mime_types fields to ModelProfile for granular MIME type specification per modality.
  • Updated CLI augmentation loader to validate override keys against ModelProfile schema, rejecting unknown keys with exit code 1.
  • Populated MIME type data for Anthropic, OpenAI, and Perplexity models covering images (jpeg/png/gif/webp) and PDFs.

Key Changes by Area

Core Schema: Added input_mime_types and output_mime_types fields to ModelProfile in langchain_core, replacing TODO comments.

CLI Tooling: Added strict validation in the augmentation loader to reject unknown scalar override keys.

Partner Models:

  • Anthropic: Added input_mime_types for Claude Opus/Sonnet/Haiku variants.
  • OpenAI: Added input_mime_types for all vision-capable models and output_mime_types for image generation models (gpt-image-1 series).
  • Perplexity: Added input_mime_types for sonar-deep-research.

Files Changed

File Changes Summary
libs/core/langchain_core/language_models/model_profile.py Added input_mime_types and output_mime_types fields to ModelProfile
libs/model-profiles/langchain_model_profiles/cli.py Added validation for override keys against ModelProfile schema
libs/model-profiles/tests/unit_tests/test_cli.py Added tests for CLI validation of unknown override keys
libs/partners/anthropic/langchain_anthropic/data/_profiles.py Populated input_mime_types for Anthropic models
libs/partners/anthropic/langchain_anthropic/data/profile_augmentations.toml Added [overrides.input_mime_types] section for Claude variants
libs/partners/openai/langchain_openai/data/_profiles.py Added input_mime_types and output_mime_types for OpenAI models
libs/partners/openai/langchain_openai/data/profile_augmentations.toml Added MIME type overrides for OpenAI models
libs/partners/perplexity/langchain_perplexity/data/_profiles.py Added input_mime_types for sonar-deep-research
libs/partners/perplexity/langchain_perplexity/data/profile_augmentations.toml Added MIME type overrides for Perplexity models

Review Focus Areas

  • CLI validation logic in cli.py:55-109 for proper error handling on unknown keys.
  • OpenAI output_mime_types for image generation models (new field usage).
  • Consistency of MIME type values across partner packages (image/png vs image/jpeg ordering, PDF inclusion).

Architecture

Design Decisions: Used per-modality MIME type maps instead of flat lists to allow future extension (e.g., audio, video) without schema changes. The CLI strict validation prevents silent misconfiguration.

Scalability & Extensibility: The modality-keyed structure ({"image": [...], "pdf": [...]}) supports adding new input/output types without breaking changes. Out of scope: runtime validation of actual file content.
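
As a rough illustration of the modality-keyed structure, the sketch below builds such a map and queries it. The PR describes these fields as informational, so `supports_input` is a hypothetical helper rather than part of the change, and the `audio/wav` entry is an illustrative value only.

```python
def supports_input(profile: dict, modality: str, mime_type: str) -> bool:
    """Hypothetical helper: look up a MIME type in a modality-keyed map."""
    mime_map = profile.get("input_mime_types", {})
    return mime_type.lower() in mime_map.get(modality, [])


profile = {
    "input_mime_types": {
        "image": ["image/jpeg", "image/png", "image/gif", "image/webp"],
        "pdf": ["application/pdf"],
    }
}

# A new modality is just another key; existing keys are unaffected.
profile["input_mime_types"]["audio"] = ["audio/wav"]  # illustrative value

png_ok = supports_input(profile, "image", "image/png")
video_ok = supports_input(profile, "video", "video/mp4")  # undeclared modality
```

Missing modalities fall through to an empty list, so an undeclared capability reads as unsupported instead of raising.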

Risks:

  • Intentional: Partner packages now require coordinated updates when core schema changes. This is acceptable given the monorepo structure.
  • Unintentional: CLI exit code 1 on unknown keys may break existing augmentation workflows with stale TOML files. Reviewers should check for internal tooling that might rely on permissive parsing.

Merge Status

MERGEABLE — PR Score 67/100, above threshold (50). All gates passed.

@codity-dm

codity-dm Bot commented May 20, 2026

Security Scan Summary

Metric Value
Vulnerabilities Critical: 0
Overall Risk Clean
Files Scanned 9

No critical security issues detected

Scan completed in 25.8s

Security scan powered by Codity.ai

@codity-dm

codity-dm Bot commented May 20, 2026

License Compliance Scan

Metric Value
Packages Scanned 0
High Risk (Strong Copyleft) 0
Medium Risk (Weak Copyleft) 0
Low Risk (Permissive) 0
Unknown License 0

All licenses are low-risk and compliant

Powered by Codity.ai · Docs

@codity-dm

codity-dm Bot commented May 20, 2026

Code Quality Report — test-org-codity/langchain · PR #2

Scanned: 2026-05-20 21:06 UTC | Score: 68/100 | Provider: github

Executive Summary

Severity Count
Critical 0
High 0
Medium 3
Low 7

Top Findings

[CQ-LLM-003] libs/model-profiles/langchain_model_profiles/cli.py:55 (Complexity · MEDIUM)

Issue: The _profile_field_names function has multiple try-except blocks which increases complexity.
Suggestion: Consider simplifying the error handling logic to reduce cyclomatic complexity.

def _profile_field_names() -> frozenset[str]:

[CQ-LLM-004] libs/model-profiles/langchain_model_profiles/cli.py:90 (Error_Handling · MEDIUM)

Issue: The error handling for unknown scalar keys does not provide a clear mechanism for recovery.
Suggestion: Consider logging the error or providing a more informative message before exiting.

sys.exit(1)

[CQ-LLM-005] libs/model-profiles/tests/unit_tests/test_cli.py:472 (Testability · MEDIUM)

Issue: The test cases are tightly coupled with the implementation details of the refresh function.
Suggestion: Consider using dependency injection to make the tests more flexible and easier to maintain.

refresh("anthropic", data_dir)

[CQ-LLM-001] libs/core/langchain_core/language_models/model_profile.py:58 (Documentation · LOW)

Issue: Missing detailed documentation for input_mime_types field.
Suggestion: Add more examples or details about the expected structure and usage of input_mime_types.

input_mime_types: dict[str, list[str]]

[CQ-LLM-002] libs/core/langchain_core/language_models/model_profile.py:118 (Documentation · LOW)

Issue: Missing detailed documentation for output_mime_types field.
Suggestion: Add more examples or details about the expected structure and usage of output_mime_types.

output_mime_types: dict[str, list[str]]

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:125 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"Augmentation key '{key}' is not a declared ModelProfile "

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:126 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"field and its value is not a table of overrides."

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:477 (Documentation · LOW)

Issue: Public def 'test_refresh_merges_provider_level_mime_types' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_merges_provider_level_mime_types(

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:523 (Documentation · LOW)

Issue: Public def 'test_refresh_model_level_mime_types_override_provider' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_model_level_mime_types_override_provider(

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:576 (Documentation · LOW)

Issue: Public def 'test_refresh_rejects_unknown_scalar_top_level_key' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_rejects_unknown_scalar_top_level_key(

Per-File Breakdown

File Critical High Medium Low Total
libs/core/langchain_core/language_models/model_profile.py 0 0 0 2 2
libs/model-profiles/langchain_model_profiles/cli.py 0 0 2 2 4
libs/model-profiles/tests/unit_tests/test_cli.py 0 0 1 3 4

Recommendations

  • Run automated tests after applying fixes to verify no regressions.

@DhirenMhatre

Author

@codity review

@codity-dm

codity-dm Bot commented May 21, 2026

Policy Check Failed

✗ 3/3 policy checks failed:

• Need 2 more approval(s) (0/2) — comment LGTM or approve via review
• Missing ticket reference (expected: JIRA-, ENG-, #*)
• 6 code file(s) changed but no test files added


To merge this PR:

  1. Address the failed checks listed above
  2. Ensure branch protection requires the codity/policy-check status

Configure policies in your dashboard

@codity-dm

codity-dm Bot commented May 21, 2026

PR Summary

What Changed

  • Added input_mime_types and output_mime_types fields to ModelProfile for specifying supported MIME types by modality (image, PDF, etc.)
  • Updated CLI validation to reject unknown augmentation keys against ModelProfile schema
  • Populated MIME type metadata for Anthropic, OpenAI, and Perplexity model profiles

Key Changes by Area

Core Schema: Added documented input_mime_types and output_mime_types fields (typed dict[str, list[str]]) to the ModelProfile TypedDict

CLI Validation: Schema-driven validation now distinguishes provider-level vs model-level overrides and exits on unknown scalar keys
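
The dispatch described above can be sketched as a standalone function. This is a simplified illustration, not the CLI's actual code: `split_overrides` and `FIELDS` are hypothetical names, and the field set is passed in explicitly so the sketch is self-contained.

```python
import sys

# Hypothetical subset of declared profile fields, for illustration only.
FIELDS = frozenset({"input_mime_types", "output_mime_types", "max_input_tokens"})


def split_overrides(
    overrides: dict[str, object], profile_fields: frozenset[str]
) -> tuple[dict[str, object], dict[str, dict]]:
    """Route each key to provider-level or model-level augmentations."""
    provider_aug: dict[str, object] = {}
    model_augs: dict[str, dict] = {}
    for key, value in overrides.items():
        if key in profile_fields:
            # Declared field: provider-level, even when the value is a table.
            provider_aug[key] = value
        elif isinstance(value, dict):
            # Unknown key holding a table: treated as a model identifier.
            model_augs[key] = value
        else:
            print(f"Unknown augmentation key {key!r}", file=sys.stderr)
            raise SystemExit(1)
    return provider_aug, model_augs


provider, models = split_overrides(
    {
        "input_mime_types": {"image": ["image/png"]},
        "demo-model": {"max_input_tokens": 1000},
    },
    FIELDS,
)
```

Checking field membership before the `isinstance` test is what lets a dict-valued field such as `input_mime_types` land at provider level instead of being mistaken for a model name.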

Partner Models:

  • Anthropic: Added input_mime_types to all 16 Claude profiles (image: jpeg/png/gif/webp, pdf: application/pdf)
  • OpenAI: Added input_mime_types to all GPT and o1 variants; output_mime_types for gpt-image-1 generation models
  • Perplexity: Added input_mime_types to sonar-deep-research profile

Files Changed

File Changes Summary
libs/core/langchain_core/language_models/model_profile.py Added input_mime_types and output_mime_types TypedDict fields
libs/model-profiles/langchain_model_profiles/cli.py Added schema-driven validation for augmentation keys
libs/model-profiles/tests/unit_tests/test_cli.py Added tests for provider cascading, model overrides, and invalid key rejection
libs/partners/anthropic/langchain_anthropic/data/_profiles.py Populated input_mime_types for all Claude model profiles
libs/partners/anthropic/langchain_anthropic/data/profile_augmentations.toml Added [overrides.input_mime_types] section
libs/partners/openai/langchain_openai/data/_profiles.py Added input_mime_types to all OpenAI model profiles
libs/partners/openai/langchain_openai/data/profile_augmentations.toml Added global input_mime_types and model-specific output_mime_types
libs/partners/perplexity/langchain_perplexity/data/_profiles.py Added input_mime_types to sonar-deep-research
libs/partners/perplexity/langchain_perplexity/data/profile_augmentations.toml Added image MIME type override

Review Focus Areas

  • CLI validation logic for distinguishing provider vs model-level overrides
  • Completeness of MIME type coverage across model variants (especially image generation outputs)
  • Test coverage for invalid key rejection paths

Architecture

Design Decisions: Declared the new fields on the ModelProfile TypedDict, typed as dict[str, list[str]], so static type checkers can validate them. CLI validation enforces schema compliance at build time rather than runtime.

Scalability & Extensibility: The schema-driven approach allows new MIME type modalities to be added without CLI code changes. Out of scope: runtime validation of actual file content.

Risks:

  • Intentional: Strict validation may reject valid legacy augmentations not yet in schema. Mitigated by explicit error messages.
  • Unintentional: Provider-level cascading logic complexity in CLI may have edge cases with nested overrides.
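
To make the cascading edge case concrete, here is a minimal sketch of one possible merge order. It assumes shallow, per-field replacement in which model-level values win over provider-level ones; the real CLI may merge differently, and `apply_augmentations` is a hypothetical name.

```python
def apply_augmentations(
    base_profiles: dict[str, dict],
    provider_aug: dict[str, object],
    model_augs: dict[str, dict],
) -> dict[str, dict]:
    """Layer provider-level, then model-level, overrides onto each profile."""
    merged: dict[str, dict] = {}
    for model_id, base in base_profiles.items():
        profile = dict(base)
        profile.update(provider_aug)                  # applies to every model
        profile.update(model_augs.get(model_id, {}))  # model-level wins
        merged[model_id] = profile
    return merged


merged = apply_augmentations(
    base_profiles={"a": {"max_input_tokens": 10}, "b": {"max_input_tokens": 20}},
    provider_aug={
        "input_mime_types": {"image": ["image/png"], "pdf": ["application/pdf"]}
    },
    model_augs={"b": {"input_mime_types": {"image": ["image/jpeg"]}}},
)
```

Under this shallow assumption, model "b" loses the provider-level "pdf" entry because its whole map is replaced. Whether nested maps should instead merge per modality is the kind of edge case worth pinning down in tests.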

Merge Status

NOT MERGEABLE: PR Score 55/100, above threshold (50), but one gate failed.

  • [H5] 3 HIGH-severity inline review findings need resolution (threshold: 3)

Comment on lines +112 to +131
    profile_fields = _profile_field_names()

    for key, value in overrides.items():
-       if isinstance(value, dict):
        if profile_fields:
            # Schema-driven: known profile field names are provider-level; all
            # other keys are treated as model identifiers (whose values must be
            # dict overrides).
            if key in profile_fields:
                provider_aug[key] = value
            elif isinstance(value, dict):
                model_augs[key] = value
            else:
                msg = (
                    f"Augmentation key '{key}' is not a declared ModelProfile "
                    f"field and its value is not a table of overrides."
                )
                print(f"❌ {msg}", file=sys.stderr)
                sys.exit(1)
        # Legacy fallback when ModelProfile is unavailable.
        elif isinstance(value, dict):

Functional High

If ModelProfile imports successfully but get_type_hints(ModelProfile) returns an empty mapping, the `if profile_fields:` check is falsy, so the new branch silently falls back to the legacy heuristic and can misclassify provider-level dict-valued fields as model-specific overrides. Track whether schema discovery succeeded separately from whether the resulting field set is empty.

Suggested fix
    # Assumes _profile_field_names() is changed to return None when schema
    # discovery fails; it currently returns an empty frozenset, for which
    # `is not None` is always true.
    profile_fields = _profile_field_names()
    schema_available = profile_fields is not None

    for key, value in overrides.items():
        if schema_available:
            # Schema-driven: known profile field names are provider-level; all
            # other keys are treated as model identifiers (whose values must be
            # dict overrides).
            if key in profile_fields:
                provider_aug[key] = value
            elif isinstance(value, dict):
                model_augs[key] = value
            else:
                msg = (
                    f"Augmentation key '{key}' is not a declared ModelProfile "
                    f"field and its value is not a table of overrides."
                )
                print(f"❌ {msg}", file=sys.stderr)
                sys.exit(1)
        # Legacy fallback when ModelProfile is unavailable.
        elif isinstance(value, dict):
            model_augs[key] = value
        else:
            provider_aug[key] = value
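
The `is not None` check in the suggested fix only distinguishes the two cases if field discovery signals failure with a sentinel. A minimal sketch of that idea follows, assuming discovery returns `None` on failure and a (possibly empty) frozenset on success; `discover_fields`, `pick_mode`, and `EmptySchema` are hypothetical names.

```python
from typing import Optional, TypedDict, get_type_hints


class EmptySchema(TypedDict, total=False):
    """Stand-in for a schema that imports fine but declares no fields."""


def discover_fields(schema: object) -> Optional[frozenset[str]]:
    """Return declared field names, or None when discovery fails."""
    if schema is None:  # models the "schema could not be imported" case
        return None
    try:
        return frozenset(get_type_hints(schema))
    except (TypeError, NameError):
        return None


def pick_mode(fields: Optional[frozenset[str]]) -> str:
    """Choose the dispatch strategy from the discovery result."""
    return "schema" if fields is not None else "legacy"


mode_when_missing = pick_mode(discover_fields(None))
mode_when_empty = pick_mode(discover_fields(EmptySchema))
```

With this split, an empty but successfully discovered schema stays on the strict path, and only a genuine discovery failure reaches the legacy heuristic.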
Prompt for AI assistance

Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:

You are an expert python developer with deep knowledge of security, performance, and best practices.

### Context

File: libs/model-profiles/langchain_model_profiles/cli.py
Lines: 112-131
Issue Type: functional-high
Severity: high

Issue Description:
If `ModelProfile` imports successfully but `get_type_hints(ModelProfile)` returns an empty mapping, the new branch silently falls back to the legacy heuristic and can misclassify provider-level dict-valued fields as model-specific overrides; treat successful schema discovery separately from whether the field set is empty.

Current Code:
    profile_fields = _profile_field_names()

    for key, value in overrides.items():
        if profile_fields:
            # Schema-driven: known profile field names are provider-level; all
            # other keys are treated as model identifiers (whose values must be
            # dict overrides).
            if key in profile_fields:
                provider_aug[key] = value
            elif isinstance(value, dict):
                model_augs[key] = value
            else:
                msg = (
                    f"Augmentation key '{key}' is not a declared ModelProfile "
                    f"field and its value is not a table of overrides."
                )
                print(f"❌ {msg}", file=sys.stderr)
                sys.exit(1)
        # Legacy fallback when ModelProfile is unavailable.
        elif isinstance(value, dict):
            model_augs[key] = value
        else:
            provider_aug[key] = value

---

### Instructions

1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow python best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed

### Constraints

- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready

---


@codity-dm

codity-dm Bot commented May 21, 2026

Security Scan Summary

Metric Value
Vulnerabilities Critical: 0
Overall Risk Clean
Files Scanned 9

No critical security issues detected

Scan completed in 28.5s

Security scan powered by Codity.ai

@codity-dm

codity-dm Bot commented May 21, 2026

License Compliance Scan

Metric Value
Packages Scanned 0
High Risk (Strong Copyleft) 0
Medium Risk (Weak Copyleft) 0
Low Risk (Permissive) 0
Unknown License 0

All licenses are low-risk and compliant

Powered by Codity.ai · Docs

@codity-dm

codity-dm Bot commented May 21, 2026

Code Quality Report — test-org-codity/langchain · PR #2

Scanned: 2026-05-21 10:08 UTC | Score: 58/100 | Provider: github

Executive Summary

Severity Count
Critical 0
High 1
Medium 3
Low 7

Top Findings

[CQ-LLM-005] libs/model-profiles/tests/unit_tests/test_cli.py:473 (Testability · HIGH)

Issue: Test functions are tightly coupled to the implementation details of the 'refresh' function.
Suggestion: Use dependency injection to make the tests more flexible and easier to maintain.

refresh('anthropic', data_dir)

[CQ-LLM-003] libs/model-profiles/langchain_model_profiles/cli.py:70 (Complexity · MEDIUM)

Issue: The function '_load_augmentations' has increased cyclomatic complexity due to multiple conditional branches.
Suggestion: Refactor the function to reduce complexity, possibly by breaking it into smaller functions.

if profile_fields: ...

[CQ-LLM-004] libs/model-profiles/langchain_model_profiles/cli.py:75 (Error_Handling · MEDIUM)

Issue: Swallowed exceptions in '_profile_field_names' function may lead to silent failures.
Suggestion: Log the exceptions or handle them appropriately to avoid silent failures.

except (TypeError, NameError): return frozenset()
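
A minimal sketch of the logging suggestion, keeping the empty-frozenset return shown in the snippet above but recording why discovery failed. The logger name, `field_names_or_empty`, and `BrokenSchema` are hypothetical and not taken from the PR.

```python
import logging
from typing import get_type_hints

logger = logging.getLogger("profile_cli_sketch")  # hypothetical logger name


def field_names_or_empty(schema: object) -> frozenset[str]:
    """Return declared field names; log and return empty on failure."""
    try:
        return frozenset(get_type_hints(schema))
    except (TypeError, NameError):
        # Keep the fallback behaviour, but leave a trace with the traceback.
        logger.warning(
            "Could not read type hints from %r; using legacy parsing",
            schema,
            exc_info=True,
        )
        return frozenset()


class BrokenSchema:
    """Stand-in whose annotation names a type that does not exist."""

    field: "UndefinedType"


result = field_names_or_empty(BrokenSchema)
```

The unresolved forward reference raises `NameError` inside `get_type_hints`, so the call returns an empty set and emits one warning instead of failing silently.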

[CQ-LLM-006] libs/partners/anthropic/langchain_anthropic/data/_profiles.py:41 (Maintainability · MEDIUM)

Issue: Repeated structure for 'input_mime_types' across multiple model profiles indicates a DRY violation.
Suggestion: Consider creating a shared constant or function to define 'input_mime_types' to avoid duplication.

"input_mime_types": { ... }

[CQ-LLM-001] libs/core/langchain_core/language_models/model_profile.py:58 (Documentation · LOW)

Issue: Missing detailed documentation for the new 'input_mime_types' field.
Suggestion: Add a description of the 'input_mime_types' field to clarify its purpose and usage.

input_mime_types: dict[str, list[str]]

[CQ-LLM-002] libs/core/langchain_core/language_models/model_profile.py:118 (Documentation · LOW)

Issue: Missing detailed documentation for the new 'output_mime_types' field.
Suggestion: Add a description of the 'output_mime_types' field to clarify its purpose and usage.

output_mime_types: dict[str, list[str]]

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:125 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"Augmentation key '{key}' is not a declared ModelProfile "

[CQ-002] libs/model-profiles/langchain_model_profiles/cli.py:126 (Complexity · LOW)

Issue: Deep nesting detected (depth ~5)
Suggestion: Extract nested blocks into helper functions

f"field and its value is not a table of overrides."

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:477 (Documentation · LOW)

Issue: Public def 'test_refresh_merges_provider_level_mime_types' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_merges_provider_level_mime_types(

[CQ-007] libs/model-profiles/tests/unit_tests/test_cli.py:523 (Documentation · LOW)

Issue: Public def 'test_refresh_model_level_mime_types_override_provider' missing docstring
Suggestion: Add a docstring describing purpose and parameters

def test_refresh_model_level_mime_types_override_provider(

Per-File Breakdown

File Critical High Medium Low Total
libs/core/langchain_core/language_models/model_profile.py 0 0 0 2 2
libs/model-profiles/langchain_model_profiles/cli.py 0 0 2 2 4
libs/model-profiles/tests/unit_tests/test_cli.py 0 1 0 3 4
libs/partners/anthropic/langchain_anthropic/data/_profiles.py 0 0 1 0 1

Recommendations

  • Resolve the High-severity finding first (CQ-LLM-005, test coupling to refresh in test_cli.py).
  • Run automated tests after applying fixes to verify no regressions.
