k-quants : remove unnecessary tensor shape restrictions#2811

Merged
ggerganov merged 1 commit into master from fix-code-llama-quantum-mixtures on Aug 26, 2023

@ggerganov
Member

This is needed to improve the Code Llama 7B and 13B quantum mixtures, as their token_embd.weight and output.weight tensors have 32016 rows.
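To make the shape issue concrete, here is a minimal sketch of why a dimension of 32016 trips a divisibility check. It assumes the k-quant super-block size is 256 (the `QK_K` default in llama.cpp) and that the restriction being removed required both tensor dimensions to be multiples of that size, while only the row length actually needs to be. The function names and the 4096/5120 row lengths (the usual 7B/13B embedding widths) are illustrative assumptions, not taken from the PR diff.

```python
QK_K = 256  # assumed k-quant super-block size (llama.cpp default)


def passes_strict_check(row_len: int, n_rows: int, block: int = QK_K) -> bool:
    """Hypothetical 'before' rule: both dimensions must be multiples of the block size."""
    return row_len % block == 0 and n_rows % block == 0


def passes_relaxed_check(row_len: int, n_rows: int, block: int = QK_K) -> bool:
    """Hypothetical 'after' rule: only the row length must be a multiple of the block size."""
    return row_len % block == 0


# Assumed (row_len, n_rows) shapes for token_embd.weight / output.weight.
shapes = {
    "Code Llama 7B": (4096, 32016),
    "Code Llama 13B": (5120, 32016),
}

for name, (row_len, n_rows) in shapes.items():
    # 32016 % 256 == 16, so the row count is not a multiple of the block size.
    print(
        name,
        "rows % QK_K =", n_rows % QK_K,
        "| strict:", passes_strict_check(row_len, n_rows),
        "| relaxed:", passes_relaxed_check(row_len, n_rows),
    )
```

Under these assumptions, the strict rule rejects both tensors (forcing a fallback quantization type), while the relaxed rule accepts them because each row still splits cleanly into 256-element blocks.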

@ggerganov ggerganov requested a review from ikawrakow August 26, 2023 13:08
@ggerganov ggerganov merged commit 04f4b1e into master Aug 26, 2023
@ggerganov ggerganov deleted the fix-code-llama-quantum-mixtures branch August 26, 2023 14:37
@slaren
Member

slaren commented Aug 26, 2023

@TheBloke this PR changes the quantization for all the models with an extended vocab.

@TheBloke
Contributor

Thanks for the ping!

I have to re-do all the GGUFs soon anyway, for the uint64 thing. So I'll do them all as soon as that's out.

@ggerganov
Member Author

Ah yes, I'll likely fix that tomorrow.

Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request Apr 28, 2026
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026
4 participants