NVIDIA NeMo-Eval 0.1.0

chtruong814 released this 09 Oct 18:00

b95522d

Evaluation for Automodel with a vLLM OpenAI-compatible (OAI) deployment and nvidia-lm-eval as the evaluation harness
Support for Logprob benchmarks with Ray
Use evaluation APIs from nvidia-eval-commons

Known Issues

GSM8K evaluation of NeMo 2.0 models produces a very low flexible-extract score because MegatronLLMDeployableNemo2 does not support stop words. The strict-match score is not affected.
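
The following is a minimal sketch of why missing stop words would hurt flexible-extract but not strict-match. It assumes the two filters behave roughly like the GSM8K filters in lm-evaluation-harness: strict-match takes the first number after a "#### " marker, while flexible-extract takes the last number anywhere in the output. The regular expressions are simplified stand-ins, and the helper names, the sample generation, and the "Question:" stop word are all hypothetical; none of this is NeMo-Eval code.

```python
import re

# Simplified, assumed stand-ins for the two GSM8K answer filters.
STRICT = re.compile(r"#### (-?[0-9.,]+)")
FLEXIBLE = re.compile(r"-?[0-9][0-9.,]*")


def strict_match(text):
    """Return the first number that follows a '#### ' marker, or None."""
    m = STRICT.search(text)
    return m.group(1) if m else None


def flexible_extract(text):
    """Return the last number found anywhere in the text, or None."""
    found = FLEXIBLE.findall(text)
    return found[-1] if found else None


def apply_stop_words(text, stop_words):
    """Truncate text at the earliest occurrence of any stop word."""
    cut = len(text)
    for word in stop_words:
        idx = text.find(word)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# Hypothetical generation that runs past the answer because nothing stops it.
runaway = (
    "She has 5 + 3 = 8 apples.\n#### 8\n\n"
    "Question: Tom has 12 pens and buys 30 more."
)
# The same generation if the deployment honored a "Question:" stop word.
trimmed = apply_stop_words(runaway, ["Question:"])

print(strict_match(runaway))      # answer after the marker
print(flexible_extract(runaway))  # last number, taken from the runoff text
print(flexible_extract(trimmed))  # last number once the runoff is cut
```

In this sketch the strict filter still finds the marked answer in the untruncated output, while the flexible filter picks up a number from the text generated after the answer. Truncating at the stop word restores the expected result for both.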
