NVIDIA NeMo-Eval 0.1.0

chtruong814 released this 09 Oct 18:00
b95522d
  • Evaluation of Automodel models served through a vLLM OpenAI-compatible (OAI) deployment, with nvidia-lm-eval as the evaluation harness
  • Support for log-probability (logprob) benchmarks with Ray
  • Uses the evaluation APIs from nvidia-eval-commons

Known Issues

  • GSM8k evaluation of NeMo 2.0 models yields a very low flexible-extract score because MegatronLLMDeployableNemo2 does not support stop words. The strict-match score is not affected.
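
A minimal sketch of why missing stop words could hurt one score and not the other. It assumes that strict-match takes the first number following a `####` marker, while flexible-extract takes the last number anywhere in the generation. The regular expressions, helper names, and sample generation below are simplified and hypothetical; they are not copied from nvidia-lm-eval or from the deployment code.

```python
import re

# Simplified, assumed stand-ins for the two GSM8k answer filters.
STRICT = re.compile(r"#### (-?[0-9.,]+)")   # first "#### <number>"
FLEXIBLE = re.compile(r"-?[0-9][0-9.,]*")   # any number in the text


def strict_match(text):
    """Return the first number that follows a '####' marker, or None."""
    m = STRICT.search(text)
    return m.group(1) if m else None


def flexible_extract(text):
    """Return the last number found anywhere in the text, or None."""
    nums = FLEXIBLE.findall(text)
    return nums[-1].rstrip(".,") if nums else None


def apply_stop_words(text, stop_words):
    """Truncate text at the earliest occurrence of any stop word."""
    cut = len(text)
    for word in stop_words:
        idx = text.find(word)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# Hypothetical generation: without stop words the model answers the
# question (gold answer 18) and then keeps going with a new question.
generation = (
    "She sells 16 - 3 - 4 = 9 eggs at $2 each, so she makes 9 * 2 = 18.\n"
    "#### 18\n\n"
    "Question: Tom has 5 apples and buys 7 more. How many does he have?\n"
    "Answer: 5 + 7 = 12\n#### 12"
)

print(strict_match(generation))      # first '####' answer
print(flexible_extract(generation))  # last number in the runaway text
print(flexible_extract(apply_stop_words(generation, ["Question:"])))
```

Under these assumptions, the first `####` answer is still correct in the untruncated text, whereas the last number comes from the runaway continuation. Truncating at a stop word such as `Question:` brings the two extractions back into agreement.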