
convert.py : Get rope scale from HuggingFace models #2772

Merged
4 commits merged into ggerganov:master on Aug 25, 2023

Conversation

@pnb (Contributor) commented Aug 24, 2023

I was trying to convert EverythingLM V2 with 16k context to GGUF and noticed that it generated nonsense. GGUF metadata showed that the rope scale was not kept, and I see it was indeed not read from config.json. This PR fixes that. The rope scale shows up in GGUF metadata (as 0.25 in this case) and model output is now coherent.

Suggesting @slaren for review because I saw you recently made a very related change.

@slaren (Collaborator) commented Aug 24, 2023

I am looking at the config.json of the model that you linked, and the rope scale looks like this:

  "rope_scaling": {
    "factor": 4.0,
    "type": "linear"
  },

This PR seems to ignore the type parameter. It would be good to add a check to make sure that it is linear.
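
(Context: Hugging Face transformers defines more than one rope_scaling type for Llama models, at the time "linear" and "dynamic", so reading factor without checking type could silently apply the wrong scaling to a dynamic-NTK model.)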

@pnb (Contributor, Author) commented Aug 24, 2023

Makes sense! I added that check now, and tested to make sure the rope scale is ignored if the type check fails (and that it still works when type is linear).

convert.py (outdated):
n_head = config["num_attention_heads"]
n_head_kv = config["num_key_value_heads"] if "num_key_value_heads" in config else n_head
f_norm_eps = config["rms_norm_eps"]
f_rope_scale = config.get("rope_scaling", {}).get("factor", None) if config.get("rope_scaling", {}).get("type", "") == "linear" else None
Collaborator:

nit: dict.get returns None as a fallback by default. This would be more readable:

Suggested change:

    - f_rope_scale = config.get("rope_scaling", {}).get("factor", None) if config.get("rope_scaling", {}).get("type", "") == "linear" else None
    + rope_scaling = config.get("rope_scaling", {})
    + f_rope_scale = rope_scaling.get("factor") if rope_scaling.get("type") == "linear" else None

Collaborator:

I would be happy with even a couple of plain if statements. But yeah, that one-liner is a bit hard to read; anything that improves the readability would be good.

@pnb (Contributor, Author):

I was overfitting a bit to the other implementations, trying to match the style. I am not a fan of ternary operators or long lines, so I would also prefer an if/else implementation (tested and committed now).
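
The committed if/else version is not quoted in the thread, but the traceback quoted in a later comment suggests it looked roughly like this (a reconstruction, not the exact committed diff):

    # Sketch reconstructed from the traceback quoted below; the actual
    # committed code may have differed in details.
    if "rope_scaling" in config and config["rope_scaling"].get("type") == "linear":
        f_rope_scale = config["rope_scaling"].get("factor")
    else:
        f_rope_scale = None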

@pnb (Contributor, Author) commented Aug 24, 2023

One other thing I had thought of is whether it is worth raising an exception if type is not "linear", rather than silently ignoring rope scaling. I didn't implement that because the rest of the code doesn't seem to care about nonlinear scaling, so it appears kind of out of scope, but I'll add it if that seems like something that could come up.

Edit: On further thought, it could take a while to see where people converge on rope scaling methods. I get the sense linear might win out. So I think it is fine to merge now, since this fixes a lot of models on HF.
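
For reference, the stricter alternative floated here (failing loudly instead of silently ignoring non-linear types) might look roughly like the following sketch; this illustrates the idea and is not code from the PR:

    # Hypothetical stricter handling, NOT what was merged: raise on
    # rope_scaling types other than "linear" instead of ignoring them.
    rope_scaling = config.get("rope_scaling") or {}
    scaling_type = rope_scaling.get("type")
    if scaling_type is None or scaling_type == "linear":
        f_rope_scale = rope_scaling.get("factor")
    else:
        raise NotImplementedError(f"Unsupported rope_scaling type: {scaling_type}")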

@slaren merged commit 28b2c99 into ggerganov:master on Aug 25, 2023. 4 checks passed.
@TheBloke (Contributor) commented Aug 25, 2023

Hey guys, just tried convert.py with a model with no rope_scaling defined and got this error:

    if "rope_scaling" in config and config["rope_scaling"].get("type") == "linear":
AttributeError: 'NoneType' object has no attribute 'get'

I'd suggest using this code instead:

rope_scaling = config.get("rope_scaling", None)
if isinstance(rope_scaling, dict) and rope_scaling.get("type", None) == "linear":
    f_rope_scale = rope_scaling.get("factor", None)
else:
    f_rope_scale = None

@pnb (Contributor, Author) commented Aug 25, 2023

Strange! What model was this? I do suspect there might be some variance in how rope scaling appears in config.json that could be better handled; for example, if rope_scaling is not a dict, which I gather might be the case from your example code.

@TheBloke (Contributor):
It was Samantha-1.11-70B, but what happened here is the default case. Most models currently out there don't have any rope scaling defined. The default in config.json is:

    "rope_scaling": null,

and that case wasn't being handled by the code as it was.
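
To spell out the failure mode: JSON null deserializes to Python None, so the key is present but its value is not a dict, and dict.get's fallback never kicks in. A minimal standalone reproduction (illustrative, not from the PR):

    import json

    config = json.loads('{"rope_scaling": null}')

    # The key exists, so membership tests and .get() defaults don't help:
    assert "rope_scaling" in config
    assert config.get("rope_scaling", {}) is None

    try:
        config["rope_scaling"].get("type")  # what the merged check effectively did
    except AttributeError as err:
        print(err)  # 'NoneType' object has no attribute 'get'

This is why the isinstance check suggested above is the robust guard: it handles the key being absent, null, or otherwise malformed in a single condition.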

@pnb (Contributor, Author) commented Aug 25, 2023

Ah that makes sense. Will submit the fix shortly, as soon as I try it on a couple models.

mattgauf added a commit to mattgauf/llama.cpp that referenced this pull request Aug 26, 2023
* master: (773 commits)
  server : add `/detokenize` endpoint (ggerganov#2802)
  convert.py : advanced option (ggerganov#2753)
  llama : use Unicode Escape Sequence to replace encoded characters (ggerganov#2814)
  flake.nix : add rocm support and cleanup (ggerganov#2808)
  llama : move #includes out of _GNU_SOURCE conditional (ggerganov#2817)
  main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (ggerganov#1528)
  llama : use std::abs in llama_sample_tail_free (ggerganov#2800)
  k-quants : remove unnecessary tensor shape restrictions (ggerganov#2811)
  Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (ggerganov#2807)
  Fix HellaSwag (ggerganov#2805)
  flake : build llama.cpp on Intel with nix (ggerganov#2795)
  Handle null rope scaling value (ggerganov#2793)
  Fix spm whitespaces (ggerganov#2806)
  examples : skip unnecessary external lib in server README.md how-to (ggerganov#2804)
  llama : fix struct decl (ggerganov#2790)
  Faster perplexity computation (ggerganov#2786)
  llama : add llama_beam_search() (ggerganov#2267)
  convert.py : Get rope scale from HuggingFace models (ggerganov#2772)
  llama-bench : add model sizes (ggerganov#2771)
  convert.py : export rope freq_base when converting CodeLlama from an HF model (ggerganov#2773)
  ...
akawrykow pushed a commit to akawrykow/llama.cpp that referenced this pull request Aug 29, 2023
* Get rope scale from HF models

* Save rope scale only for linear scaling

* Rewrite for clarity
Sam2much96 pushed a commit to Sam2much96/llama.cpp that referenced this pull request Sep 11, 2023
* Get rope scale from HF models

* Save rope scale only for linear scaling

* Rewrite for clarity