Update the Judge LLM settings in the examples to avoid retries #204
Merged
rapids-bot[bot] merged 2 commits into NVIDIA:develop on May 2, 2025
The ragas nv_metrics require 3-8 tokens; temperature can be left at the default of 0.1. Also adjusted the LLM model based on the leadership board.

Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
Pull Request Overview
This PR updates the judge LLM settings used across various example configurations and documentation to align with the new leadership board recommendations. Key changes include updating the model name from meta/llama-3.3-70b-instruct to meta/llama-3.1-70b-instruct, removing explicit temperature and top_p parameters from the nim_rag_eval_llm configuration, and increasing the max_tokens value from 2–6 tokens to 8 tokens.
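A minimal sketch of what a judge LLM entry might look like after this change. Only `nim_rag_eval_llm`, `model_name`, `max_tokens`, `temperature`, and `top_p` are named in this PR; the enclosing `llms` section and the `_type: nim` field are assumptions about the surrounding config layout and may differ in the actual files.

```yaml
# Sketch only: `llms` and `_type` are assumed, not taken from this PR.
llms:
  nim_rag_eval_llm:
    _type: nim                               # assumed provider type field
    model_name: meta/llama-3.1-70b-instruct  # judge model named in this PR
    max_tokens: 8                            # ragas nv_metrics need 3-8 tokens
    # temperature and top_p are intentionally omitted so the defaults apply
    # (the PR description states a default temperature of 0.1).
```

Capping `max_tokens` at 8 keeps the judge response within the 3-8 token range the PR description cites, which is the stated motivation for avoiding retries.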
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| examples/simple/src/aiq_simple/configs/eval_upload_config.yml | Updated nim_rag_eval_llm configuration to use the new model and token count, removing unneeded temperature and top_p settings. |
| examples/simple/src/aiq_simple/configs/eval_config.yml | Adjusted nim_rag_eval_llm parameters to match the new standard. |
| examples/email_phishing_analyzer/configs/config.yml | Consistent update of nim_rag_eval_llm settings for email phishing analyzer. |
| examples/email_phishing_analyzer/configs/config-reasoning.yml | Similar update to nim_rag_eval_llm configuration. |
| examples/email_phishing_analyzer/configs/config-phi-3-mini-4k-instruct.yml | Updated nim_rag_eval_llm settings to reflect the new token count and model. |
| examples/email_phishing_analyzer/configs/config-phi-3-medium-4k-instruct.yml | Modified nim_rag_eval_llm configuration accordingly. |
| examples/email_phishing_analyzer/configs/config-mixtral-8x22b-instruct-v0.1.yml | Updated nim_rag_eval_llm to the new model name and token count. |
| examples/email_phishing_analyzer/configs/config-llama-3.3-70b-instruct.yml | Changed model references and removed explicit temperature and top_p parameters. |
| examples/email_phishing_analyzer/configs/config-llama-3.1-8b-instruct.yml | Updates mirror other nim_rag_eval_llm configurations. |
| examples/documentation_guides/workflows/text_file_ingest/src/text_file_ingest/configs/config.yml | Adjusted nim_rag_eval_llm settings for consistency with overall configuration changes. |
| docs/source/guides/evaluate.md | Documentation updated to reflect the new judge LLM model and token configuration, along with guidance on the recommended settings. |
Comments suppressed due to low confidence (2)
examples/simple/src/aiq_simple/configs/eval_upload_config.yml:42
- Ensure that the removal of explicit 'temperature' and 'top_p' entries in the nim_rag_eval_llm configuration is intentional and that the defaults (e.g., a temperature of 0.1) are correctly applied across all environments.
max_tokens: 8
docs/source/guides/evaluate.md:115
- Confirm that the updated judge LLM model name in the documentation aligns with the configuration changes across the project and reflects the intended leadership board update.
model_name: meta/llama-3.1-70b-instruct
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
ericevans-nv approved these changes on May 2, 2025
Contributor (Author) commented:
/merge
yczhang-nv pushed a commit to yczhang-nv/NeMo-Agent-Toolkit that referenced this pull request on May 8, 2025

…A#204)

Closes NVIDIA#202

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/AIQToolkit/blob/develop/docs/source/advanced/contributing.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Authors:
- Anuradha Karuppiah (https://github.com/AnuradhaKaruppiah)

Approvers:
- Eric Evans II (https://github.com/ericevans-nv)

URL: NVIDIA#204

Signed-off-by: Yuchen Zhang <134643420+yczhang-nv@users.noreply.github.com>
yczhang-nv pushed a commit to yczhang-nv/NeMo-Agent-Toolkit that referenced this pull request on May 9, 2025 (Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>)
ericevans-nv pushed a commit to ericevans-nv/agent-iq that referenced this pull request on Jun 3, 2025 (Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>)
AnuradhaKaruppiah added a commit to AnuradhaKaruppiah/oss-agentiq that referenced this pull request on Aug 4, 2025
scheckerNV pushed a commit to scheckerNV/aiq-factory-reset that referenced this pull request on Aug 22, 2025
Description
Closes #202
By Submitting this PR I confirm: