Replies: 1 comment 1 reply
I don't see a problem with supporting other factors. There is no need to keep the assert as it is or to replace it with a warning. cc @compilade for an extra opinion.
Problem

Currently, `LLM_ARCH_MAMBA2` in `llama.cpp` has a hardcoded assertion that prevents loading any Mamba2 model where `expand != 2` (e.g., `expand=1.5`, `expand=1`).

Proposed solution
Two small changes:

1. In `llama-model.cpp` (around line 4309): relax the hardcoded assertion so that it no longer requires `expand == 2`.
2. In `convert_hf_to_gguf.py`: replace the hardcoded `2 * self.d_model` with the actual `expand` value from the source model config.

Why this change?
- Models with `expand=2` (the default) continue to work identically.

Testing
Tested with:

- A model with `expand=2`: ✅ works
- `expand=1`, `hidden_size=512`, `d_inner=512`: ✅ loads and runs

Discussion points
- Is there a reason for the `2x` restriction (performance optimizations? numerical stability?)
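
For illustration, the converter-side part of the proposal could look roughly like the sketch below. It is not the actual code in `convert_hf_to_gguf.py`: the helper name `resolve_d_inner` is hypothetical, and the config keys `hidden_size`, `expand`, and `d_inner` are assumed from the values quoted in the testing notes. It derives `d_inner` from `expand` rather than assuming `2 * d_model`, and falls back to `expand=2` when the key is absent.

```python
def resolve_d_inner(config: dict) -> int:
    """Derive the inner SSM width from the config instead of hardcoding 2 * d_model.

    Hypothetical helper; the key names are assumptions, not a confirmed schema.
    """
    d_model = config["hidden_size"]
    # Assumption: a missing `expand` means the usual default of 2.
    expand = config.get("expand", 2)

    derived = int(expand * d_model)
    # Reject fractional widths, e.g. expand=1.5 with an odd hidden size.
    if derived != expand * d_model:
        raise ValueError(
            f"expand={expand} does not give an integer d_inner for d_model={d_model}"
        )

    # If the config also states d_inner explicitly, require it to agree.
    explicit = config.get("d_inner")
    if explicit is not None and explicit != derived:
        raise ValueError(
            f"d_inner={explicit} is inconsistent with expand * d_model = {derived}"
        )
    return derived


if __name__ == "__main__":
    # Default behaviour is unchanged when `expand` is not specified.
    print(resolve_d_inner({"hidden_size": 512}))
    # The configuration from the testing notes.
    print(resolve_d_inner({"hidden_size": 512, "expand": 1, "d_inner": 512}))
```

A consistency check of this kind (that `d_inner` matches `expand * d_model`) is one possible replacement for the strict `expand == 2` requirement on the loader side, although dropping the check entirely, as suggested in the reply, would also work.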