ANTHROPIC_JSON: allow control characters in JSON strings if strict=False #644

Merged
merged 1 commit into jxnl:main on May 1, 2024


@voberoi voberoi commented May 1, 2024

Addresses #612.

These changes merge Pydantic's non-strict semantics with those of json.loads for ANTHROPIC_JSON mode only. In the event that the client passes in strict=False, control characters will also be allowed in JSON strings in ANTHROPIC_JSON mode.

I deliberately didn't apply this to any other modes. I think Claude is uniquely bad at this right now, so this might be a change you can simply revert down the line.

I'm happy to apply this to other modes if you think it'd help.
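
For readers unfamiliar with the `json.loads` behaviour the description refers to, here is a minimal standard-library sketch. The sample payload is made up for illustration; it contains a raw, unescaped newline inside a string, which is the kind of control character the description is about.

```python
import json

# A JSON document whose string value contains a raw (unescaped) newline.
# The "\n" below is a real newline character inside the Python string.
raw = '{"text": "line one\nline two"}'

# Default behaviour (strict=True): control characters in strings are rejected.
try:
    json.loads(raw)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

# With strict=False, control characters are allowed inside strings.
lenient = json.loads(raw, strict=False)
```

Here `strict_ok` ends up `False`, while `lenient` holds the parsed dict with the newline preserved in the string value.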


🚀 This description was created by Ellipsis for commit a725f75

Summary:

Allows control characters in JSON strings within ANTHROPIC_JSON mode when strict is set to False, with tests validating this behavior.

Key points:

  • Modify parse_anthropic_json in /instructor/function_calls.py to allow control characters in JSON strings when strict=False.
  • Update tests in /tests/test_function_calls.py to cover new functionality.
  • Ensure changes are limited to ANTHROPIC_JSON mode only.
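
The approach in the key points above could be sketched roughly as follows. This is not the merged code: `parse_json_text` and the `User` model are hypothetical names introduced for illustration, and the sketch assumes Pydantic v2 (`model_validate_json` and `model_validate`).

```python
import json
from typing import Optional, Type, TypeVar

from pydantic import BaseModel, ValidationError

T = TypeVar("T", bound=BaseModel)


def parse_json_text(model: Type[T], text: str, strict: Optional[bool] = None) -> T:
    """Hypothetical helper illustrating the idea, not instructor's actual function."""
    if strict:
        # Strict path: Pydantic parses and validates the raw JSON itself.
        return model.model_validate_json(text, strict=True)
    # Lenient path: json.loads tolerates raw control characters in strings,
    # then Pydantic validates the resulting dict with non-strict coercion.
    parsed = json.loads(text, strict=False)
    return model.model_validate(parsed, strict=False)


class User(BaseModel):  # hypothetical example model
    name: str
    age: int


# Raw newline inside a string, plus an age supplied as a string.
text = '{"name": "Ada\nLovelace", "age": "36"}'
user = parse_json_text(User, text, strict=False)

# Schema checks still apply after the lenient parse: a missing field fails.
try:
    parse_json_text(User, '{"name": "Ada\nLovelace"}', strict=False)
    missing_field_rejected = False
except ValidationError:
    missing_field_rejected = True
```

In this sketch the lenient path only relaxes how the JSON text is decoded; the decoded data still goes through Pydantic's model validation, so a payload that violates the schema is rejected either way.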

Generated with ❤️ by ellipsis.dev

Addresses jxnl#612. Anthropic's models regularly have control characters in their
strings, producing invalid JSON that causes validation to fail.

These changes merge Pydantic's non-strict semantics with those of `json.loads`
for ANTHROPIC_JSON mode only. In the event that the client passes in
strict=False, control characters will also be allowed in JSON strings.

@ellipsis-dev ellipsis-dev bot left a comment


👍 Looks good to me!

  • Reviewed the entire pull request up to a725f75
  • Looked at 140 lines of code in 2 files
  • Took 37 seconds to review
More info
  • Skipped 0 files when reviewing.
  • Skipped posting 1 additional comments because they didn't meet confidence threshold of 85%.
1. instructor/function_calls.py:158:
  • Assessed confidence : 66%
  • Comment:
    Ensure that after using json.loads with strict=False, the data is still subjected to a full validation check by Pydantic to ensure it adheres to the model's schema. This is crucial to maintain data integrity and prevent potential issues from improperly formatted or malicious data.
  • Reasoning:
    The PR introduces a change to allow control characters in JSON strings when strict=False in ANTHROPIC_JSON mode. The implementation uses Python's json.loads with strict=False to parse the JSON and then validates it using Pydantic's model validation. This approach seems to correctly implement the desired functionality as described in the PR description.

However, the use of json.loads directly followed by model_validate might bypass some of Pydantic's built-in validation mechanisms when strict=False. It's important to ensure that after parsing with json.loads, the data still undergoes a full validation check by Pydantic to ensure it adheres to the model's schema.

Workflow ID: wflow_sjaLiJ3NbcfnRCbS



@jxnl jxnl merged commit 6491aec into jxnl:main May 1, 2024
boydgreenfield pushed a commit to boydgreenfield/instructor that referenced this pull request May 17, 2024