[patch] Reduce NaN Occurrences by Simple Prompt Modification for JSON Output for context_precision #581
Overview

During the calculation of `context_precision`, an issue was observed where increasing the amount of context led to a surge in NaN occurrences. By comparison, `context_recall` does not exhibit this problem. An investigation into the cause of the difference uncovered that the issue stems from whether the prompt specifies outputting in JSON format.
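The failure mode can be illustrated with a minimal sketch (the `parse_verdict` helper and the `{"verdict": ...}` response shape are hypothetical, not ragas's actual implementation): when the model replies in free text instead of JSON, parsing fails and the score surfaces as NaN.

```python
import json
import math

def parse_verdict(output: str) -> float:
    """Hypothetical parser: expects a model reply like {"verdict": 1}.

    Free-text replies (common when the prompt never asks for JSON)
    fail to parse and surface as NaN in the aggregated metric.
    """
    try:
        return float(json.loads(output)["verdict"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return float("nan")

print(parse_verdict('{"verdict": 1}'))                   # 1.0
print(math.isnan(parse_verdict("Yes, it was useful.")))  # True
```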
Discovery

It was found that simply specifying JSON output for `context_precision`, as is already done for `context_recall`, significantly reduces the incidence of NaN. Utilizing JSON mode appears to be crucial, as noted in the OpenAI reference for text generation in JSON mode.
Solution

To align with best practices and address the NaN generation issue, I propose updating the prompt for `context_precision` to explicitly instruct the model to generate its output in JSON format. This small but impactful change brings `context_precision` in line with how `context_recall` operates and ensures more stable and predictable outcomes when handling larger context volumes.
Impact

By making this explicit switch to JSON output, we not only follow the guideline provided by OpenAI but also prevent the malformed, unparseable responses that currently surface as a flood of NaN values. This improvement should increase the reliability of metric calculations and significantly reduce the time spent debugging NaN-related issues.
I look forward to your review and approval of this change, which will help us maintain robustness in our context precision calculations.
Best,
i-w-a