Vicuna Models checkpoints transfer script #1657
Conversation
cc: @mattdangerw
Looks good! A couple comments...
Next up, you could try uploading these to your individual user on Kaggle, and making a PR that updates our presets file here -> https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/src/models/llama/llama_presets.py
That would give us all a way to test the vicuna models end to end, then we can copy them to the Keras org on Kaggle when they look good.
Thanks!
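For context, entries in that presets file follow a common pattern. A minimal sketch of what a Vicuna entry might look like, assuming the same fields as the existing Llama 2 entries (the description, parameter count, and Kaggle handle below are illustrative assumptions, not final values):

```python
# Hypothetical entry for llama_presets.py; every value is a placeholder
# modeled on the existing Llama 2 presets, not a confirmed final entry.
backbone_presets = {
    # ... existing Llama presets ...
    "vicuna_1.5_7b_en": {
        "metadata": {
            "description": (
                "7 billion parameter Vicuna 1.5 model, fine-tuned from "
                "Llama 2 on user-shared conversations."
            ),
            "params": 6738415616,  # assumed: same architecture as Llama 2 7B
            "official_name": "Vicuna",
            "path": "vicuna",
            "model_card": "https://huggingface.co/lmsys/vicuna-7b-v1.5",
        },
        "kaggle_handle": "kaggle://keras/vicuna/keras/vicuna_1.5_7b_en/1",
    },
}
```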
```python
print("\n-> Saved the tokenizer")

# === Upload the preset ===
uri = f"kaggle://keras/vicuna/keras/{preset}"
```
Let's do this like the phi3 script: https://github.com/keras-team/keras-nlp/blob/38c6608034bce59dee593e86b437ae5878e601cf/tools/checkpoint_conversion/convert_phi3_checkpoints.py#L29
That will still allow people who do not have access to the keras Kaggle org to run this script.
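A sketch of what that flag-based approach might look like, assuming the script uses absl flags like other conversion scripts in the repo (the flag name and upload call are modeled on the phi3 script, not copied from it):

```python
# Sketch: take the upload target as an optional CLI flag instead of
# hardcoding the keras Kaggle org, so anyone can upload to their own user.
from absl import flags

import keras_nlp

FLAGS = flags.FLAGS

flags.DEFINE_string(
    "upload_uri",
    None,
    'Optional Kaggle URI to upload to, e.g. "kaggle://<user>/vicuna/keras/<preset>"',
)

# Later, after saving the preset locally, upload only if a target was given:
#     if FLAGS.upload_uri:
#         keras_nlp.upload_preset(uri=FLAGS.upload_uri, preset=preset)
```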
```python
from keras_nlp.models import LlamaCausalLMPreprocessor
from keras_nlp.models import LlamaTokenizer

PRESET_MAP = {"vicuna_1.5_7b_en": "lmsys/vicuna-7b-v1.5"}
```
Is the weight conversion all the same as Llama 2? If so, could we consider consolidating the conversion scripts?
Yes, the weights use the same Llama 2 architecture, so we can merge this with the existing script. I will try that. Thanks!
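If the scripts are merged, the change could be as small as extending the existing Llama conversion script's preset map. A sketch (the llama2 handle is assumed from the existing script):

```python
# Hypothetical merged PRESET_MAP for a single Llama-family conversion script;
# the llama2 entry is assumed, the vicuna entry comes from this PR.
PRESET_MAP = {
    "llama2_7b_en": "meta-llama/Llama-2-7b-hf",
    "vicuna_1.5_7b_en": "lmsys/vicuna-7b-v1.5",
}
```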
When we run on CPU and use Llama 2, all the weights are torch tensors, so the same approach as phi3 seems more fault tolerant. I tested this on CPU. Thanks
@sineeli With what precision are the original PyTorch checkpoints stored on disk? If they are at float16, we could just do the same and store at float16 on disk. The disk format does not mean we need to load at that format. Anyway, as soon as you push with the comments above addressed, we can merge this PR and keep working on the actual checkpoints we ship.
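One way to answer the precision question is to inspect the storage dtype of the original checkpoint directly. A minimal sketch, assuming the sharded pytorch_model-*.bin layout that lmsys/vicuna-7b-v1.5 ships (the shard file name is an assumption):

```python
# Minimal sketch: load a checkpoint shard on CPU and print the dtypes the
# tensors are stored at. The shard file name below is an assumption.
import torch

state_dict = torch.load(
    "pytorch_model-00001-of-00002.bin", map_location="cpu"
)
for name, tensor in list(state_dict.items())[:5]:
    # Expect torch.float16 if the checkpoint is stored at half precision.
    print(name, tensor.dtype)
```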
@sineeli can you make your Kaggle model public? I'll pull in the script but leave the new preset off for now; we can do that in a follow-up PR.
Sure, waiting for the page to update. Thanks! https://www.kaggle.com/models/sineeli/vicuna/keras/vicuna_1.5_7b_en
Successfully converted checkpoints from Vicuna (torch) to a Keras 3 compatible format. Please let me know if any refactoring is needed.
Thanks