Fix tests: 'Cohere2MoeModel' object has no attribute 'hf_device_map'#46337

Open
kaixuanliu wants to merge 2 commits into
huggingface:mainfrom
kaixuanliu:cohere2-moe-test

@kaixuanliu
Contributor

skip 4 invalid test cases:

FAILED tests/models/cohere2_moe/test_modeling_cohere2_moe.py::Cohere2MoeModelTest::test_cpu_offload - AttributeError: 'Cohere2MoeModel' object has no attribute 'hf_device_map'
FAILED tests/models/cohere2_moe/test_modeling_cohere2_moe.py::Cohere2MoeModelTest::test_disk_offload_bin - AttributeError: 'Cohere2MoeModel' object has no attribute 'hf_device_map'
FAILED tests/models/cohere2_moe/test_modeling_cohere2_moe.py::Cohere2MoeModelTest::test_disk_offload_safetensors - AttributeError: 'Cohere2MoeModel' object has no attribute 'hf_device_map'
FAILED tests/models/cohere2_moe/test_modeling_cohere2_moe.py::Cohere2MoeModelTest::test_model_parallelism - AttributeError: 'Cohere2MoeModel' object has no attribute 'hf_device_map'

@ydshieh, please help review. Thanks!

Signed-off-by: kaixuanliu <kaixuan.liu@intel.com>
Comment thread tests/models/cohere2_moe/test_modeling_cohere2_moe.py Outdated
Signed-off-by: kaixuanliu <kaixuan.liu@intel.com>
@github-actions
Contributor

github-actions Bot commented Jun 3, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: cohere2_moe

@kaixuanliu changed the title from "cohere2_moe: skip invalid test cases" to "Fix tests: 'Cohere2MoeModel' object has no attribute 'hf_device_map'" on Jun 5, 2026
@kaixuanliu
Contributor Author

@vasqu Can you help review it again? Thx!

self.logit_scale = 1.0 # needed for `test_training_overfit` - otherwise the loss does not go down fast enough
# Reduce number of experts so the sparse MoE layer is a smaller fraction of the overall model,
# allowing accelerate to split it across devices in offload/parallelism tests.
self.num_experts = 4
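
A rough sketch of the arithmetic behind the code comment above, not the tester's or accelerate's actual logic. It assumes the sparse MoE block is placed as one indivisible unit and that the device budget is a fixed fraction of the total model size. The helper names and all parameter counts are hypothetical, chosen only to make the numbers easy to follow.

```python
def moe_fraction(num_experts: int, params_per_expert: int, other_params: int) -> float:
    """Share of all parameters held by the experts of one sparse MoE block."""
    moe_params = num_experts * params_per_expert
    return moe_params / (moe_params + other_params)


def moe_fits_budget(num_experts: int, params_per_expert: int,
                    other_params: int, split_percent: float) -> bool:
    """True if the MoE block alone fits in `split_percent` of the total model size.

    Assumption: the block cannot be split further, so it must fit on one device.
    """
    return moe_fraction(num_experts, params_per_expert, other_params) <= split_percent


# Hypothetical sizes, not taken from the Cohere2 MoE test config.
PARAMS_PER_EXPERT = 1_000
OTHER_PARAMS = 4_000   # embeddings, attention, norms, head, ...
SPLIT_PERCENT = 0.5    # device budget as a fraction of the whole model

for n in (16, 8, 4):
    frac = moe_fraction(n, PARAMS_PER_EXPERT, OTHER_PARAMS)
    fits = moe_fits_budget(n, PARAMS_PER_EXPERT, OTHER_PARAMS, SPLIT_PERCENT)
    print(f"num_experts={n:2d}  MoE share={frac:.2f}  fits in budget: {fits}")
```

Under these made-up sizes, the MoE block only fits within a 50% budget once the expert count drops to 4. The real thresholds depend on the tester's actual layer sizes and on how the device map is computed.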
Contributor

Hmm, I thought this might be a one-time thing. Would it make more sense to adjust the causal LM tester's init in general? Could you cross-check a bit?

But in general, I'm aligned with this.

2 participants