
Conversation


@brian-dellabetta brian-dellabetta commented Nov 10, 2025

SUMMARY:
lm_eval==0.4.9.1 has a broken entrypoint when using a model with a compressed-tensors quantization config with --model hf:

FAILED tests/lmeval/test_lmeval.py::TestLMEval::test_lm_eval[tests/lmeval/configs/vl_w4a16_actorder_weight.yaml] - ValueError: The model is quantized with CompressedTensorsConfig but you are passing a dict config. Please make sure to pass the same quantization config class to `from_pretrained` with different loading attributes.

That failure has been resolved on lm_eval main, though a separate issue persists there that is addressed by EleutherAI/lm-evaluation-harness#3393.

While that fix is in transit, and to avoid pinning lm_eval to main in our CI/CD, this PR resolves the issue by pre-loading the model with AutoModelForCausalLM rather than relying on lm_eval's internal model-loading logic.
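The workaround can be sketched roughly as follows. This is a minimal sketch, not the PR's exact test code; the helper name, model path argument, and keyword choices (`device_map`, `torch_dtype`) are illustrative assumptions:

```python
def run_lm_eval(model_path: str, tasks: list[str]) -> dict:
    """Evaluate a model by pre-loading it with transformers,
    bypassing lm_eval's own ``--model hf`` loading path, which
    breaks on compressed-tensors quantization configs in 0.4.9.1."""
    # Imports are kept local so the sketch reads as one unit.
    import lm_eval
    from lm_eval.models.huggingface import HFLM
    from transformers import AutoModelForCausalLM

    # Pre-load the model ourselves with transformers...
    model = AutoModelForCausalLM.from_pretrained(
        model_path, device_map="auto", torch_dtype="auto"
    )
    # ...then hand the in-memory model to lm_eval via its HFLM wrapper,
    # instead of passing a model-name string for lm_eval to load itself.
    return lm_eval.simple_evaluate(model=HFLM(pretrained=model), tasks=tasks)
```

Because `HFLM` accepts an already-instantiated `PreTrainedModel`, lm_eval never touches the quantization config dict that triggered the `ValueError`.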

TEST PLAN:
Tests run now. For some reason the VL test is very slow on ibm-h100-1; the same thing happens on main. I've seen this before but am not sure what's causing it, and it seemed to correct itself the following day.

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
@brian-dellabetta brian-dellabetta added the ready When a PR is ready for review label Nov 10, 2025
@github-actions

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

@gemini-code-assist
Contributor

Summary of Changes

Hello @brian-dellabetta, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical issue within the lm_eval framework where quantized models, specifically those using compressed-tensors configurations, failed to load correctly via lm_eval's default entrypoint. By implementing a strategy to pre-load models using transformers.AutoModelForCausalLM before passing them to lm_eval, the change ensures the stability and functionality of evaluation tests, preventing disruptions in the CI/CD pipeline and avoiding reliance on unreleased lm_eval fixes.

Highlights

  • Fix lm_eval entrypoint issue: Addresses a ValueError in lm_eval==0.4.9.1 when using compressed-tensors quantization with --model hf, which was preventing tests from running due to a broken entrypoint.
  • Pre-load models with AutoModelForCausalLM: Modifies the lm_eval calls to explicitly pre-load models using transformers.AutoModelForCausalLM.from_pretrained before passing them as HFLM objects to lm_eval.simple_evaluate, bypassing lm_eval's problematic internal loading logic.

@brian-dellabetta brian-dellabetta changed the title switch to lm_eval entrypoint that uses pre-loaded transformers model [lm_eval tests] switch lm_eval invocation to use pre-loaded transformers model Nov 10, 2025
@brian-dellabetta brian-dellabetta changed the title [lm_eval tests] switch lm_eval invocation to use pre-loaded transformers model [tests] switch lm_eval invocation to use pre-loaded transformers model Nov 10, 2025

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request switches to a different lm_eval entrypoint to work around an issue with model loading in lm_eval==0.4.9.1. The change involves pre-loading the model using AutoModelForCausalLM before passing it to lm_eval. The implementation is correct and addresses the issue described. My main feedback is to refactor the duplicated model loading logic into a helper method to improve code maintainability.


@dsikka dsikka left a comment


One question

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
shanjiaz
shanjiaz previously approved these changes Nov 12, 2025

@shanjiaz shanjiaz left a comment


Niiice

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>

@shanjiaz shanjiaz left a comment


Niiiice

@dsikka dsikka merged commit 81eff05 into main Nov 13, 2025
9 checks passed
@dsikka dsikka deleted the bdellabe/lmeval-test-fixes branch November 13, 2025 21:09
