cc @xenova, could you check whether this works for you? I just made a quick draft, since we talked internally about this model series. |
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
xenova
left a comment
I can confirm this PR fixes the bug I encountered! 🙌
|
run-slow: hunyuan_v1_dense, hunyuan_v1_moe |
|
💔 This comment contains |
|
run-slow: hunyuan_v1_dense, hunyuan_v1_moe |
|
This comment contains models: ["models/hunyuan_v1_dense", "models/hunyuan_v1_moe"] |
CI Results: ✅ No failing test specific to this PR 🎉!
xenova
left a comment
works for me (on the test case that caused this discussion)!
|
run-slow: hunyuan_v1_moe |
|
[For maintainers] Suggested jobs to run (before merge): run-slow: hunyuan_v1_dense, hunyuan_v1_moe |
|
This comment contains models: ["models/hunyuan_v1_moe"] |
CI Results: ✅ No failing test specific to this PR 🎉!
As per the title: init weights currently assumes everything is uniform, but these dynamic inits are slightly different.