@vincentkoc
Contributor

@vincentkoc vincentkoc commented Oct 29, 2025

Description: The JSON-based get_format_instructions() prompt is known to have a high failure rate on some models. We tested the prompt systematically using JSONSchemaBench with Opik Agent Optimizer (an open-source optimizer toolchain), running several optimizers to improve the underlying prompt using an evaluation-based approach.
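As a rough illustration of what an evaluation-based score for this prompt can look like, here is a minimal sketch that counts the fraction of model outputs that parse as JSON and match a schema. The names `conforms` and `adherence_score`, the sample schema, and the sample outputs are all hypothetical. This is not the metric used by JSONSchemaBench and not the Opik API, and it only checks required keys and top-level property types.

```python
import json

# Hypothetical, deliberately small check. It covers only a tiny subset of
# JSON Schema (required keys and top-level property types).
_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def _matches(value, type_name):
    """Return True if `value` has the JSON type named by `type_name`."""
    expected = _TYPES.get(type_name)
    if expected is None:
        return True  # unknown or missing type: not checked in this sketch
    # bool is a subclass of int in Python, so reject it for numeric types.
    if type_name in ("integer", "number") and isinstance(value, bool):
        return False
    return isinstance(value, expected)


def conforms(output: str, schema: dict) -> bool:
    """Return True if `output` is a JSON object with the required, correctly typed keys."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False  # extra prose or malformed JSON counts as a failure
    if not isinstance(data, dict):
        return False
    if any(key not in data for key in schema.get("required", [])):
        return False
    return all(
        _matches(data[key], spec.get("type"))
        for key, spec in schema.get("properties", {}).items()
        if key in data
    )


def adherence_score(outputs, schema) -> float:
    """Fraction of outputs that conform to the schema (0.0 for an empty list)."""
    if not outputs:
        return 0.0
    return sum(conforms(o, schema) for o in outputs) / len(outputs)


# Made-up sample data, for illustration only.
schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}
outputs = [
    '{"name": "Ada", "age": 36}',                     # valid
    'Here is the JSON: {"name": "Ada", "age": 36}',   # prose prefix, does not parse
    '{"name": "Ada", "age": "36"}',                   # wrong type for "age"
    '{"name": "Ada"}',                                # missing required key
]
print(adherence_score(outputs, schema))
```

A strict parse is used on purpose here: any text around the JSON object counts as a failure, which is the kind of error a format-instruction prompt is meant to prevent.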

  • The full analysis, test environment, and results can be found here.
  • We achieved a score of 0.97 (+708%) vs. the baseline of 0.12 (original prompt) on gpt-4.1.
  • This makes schema adherence more robust, and the same approach can be re-applied to other formats such as YAML and XML.
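For reference, the +708% figure is the relative improvement of the new score over the baseline. The helper name `relative_improvement` below is only illustrative:

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Percentage change from `baseline` to `improved`."""
    return (improved - baseline) / baseline * 100.0


# Scores reported above: 0.12 (original prompt) vs. 0.97 (optimized prompt).
print(f"{relative_improvement(0.12, 0.97):.0f}%")  # prints "708%"
```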

Results from Opik Optimizer: [image: Opik Agent Optimizer results using JSONSchemaBench]

Issue: No issues linked
Dependencies: No dependencies


  • Lint and test: Run make format, make lint and make test from the root of the package(s) you've modified.

@vincentkoc vincentkoc requested a review from eyurtsev as a code owner October 29, 2025 04:40
@github-actions github-actions bot added the core (related to the package `langchain-core`) and fix labels and removed the fix label Oct 29, 2025
@codspeed-hq

codspeed-hq bot commented Oct 29, 2025

CodSpeed Performance Report

Merging #33718 will not alter performance

Comparing vincentkoc:patch-1 (e5cbfc9) with master (a2a9a02) [1]

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 13 untouched
⏩ 21 skipped [2]

Footnotes

  1. No successful run was found on master (b5e23e5) during the generation of this report, so a2a9a02 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

  2. 21 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@eyurtsev eyurtsev self-assigned this Oct 29, 2025
@eyurtsev eyurtsev enabled auto-merge (squash) October 29, 2025 14:18
@ccurme ccurme disabled auto-merge October 29, 2025 15:05
@ccurme ccurme merged commit 78a2f86 into langchain-ai:master Oct 29, 2025
150 of 152 checks passed
@vincentkoc vincentkoc deleted the patch-1 branch October 30, 2025 01:45
Labels

core Related to the package `langchain-core` fix

3 participants