Generate compressed weights file from finetune #11

sanjay920 · 2024-02-22T19:43:21Z

How do I generate the compressed weights file (sbs) from my fine-tune? For example, if I wanted to convert the model assets at https://huggingface.co/google/gemma-2b-it/tree/main to a compressed weights file, how would I do that?

Thanks!

austinvhuang · 2024-02-22T19:50:37Z

Hi @sanjay920, really cool that you're trying a fine-tune already. We're working on releasing a conversion script soon (hopefully within the next few days), but it would be useful to know which source formats to prioritize. What are you converting from?

Also, if others need a converter for fine-tunes, feel free to chime in here as well regarding what you'd use as a source format.

sanjay920 · 2024-02-22T21:44:11Z

Ideally from a PeftModel so I can convert like it's possible in llamacpp: https://github.com/ggerganov/llama.cpp/blob/master/convert-lora-to-ggml.py

Or, if one merges the LoRA adapter with the base model first, a GemmaModel-to-sbs converter would work too.
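For readers unfamiliar with what "merging the adapter with the base model" involves, here is a minimal sketch, assuming the standard LoRA formulation in which the merged weight is W + (alpha / rank) * B·A. The helper names (`matmul`, `merge_lora`) and the toy matrices are illustrative only; they are not part of gemma.cpp, llama.cpp, or PEFT.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(base, lora_a, lora_b, alpha, rank):
    """Return base + (alpha / rank) * (lora_b @ lora_a) as a new matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # low-rank update, same shape as base
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


# Toy example: a 2x3 base weight with a rank-1 adapter.
base = [[1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0]]
lora_a = [[1.0, 2.0, 3.0]]   # shape (rank, in_features) = (1, 3)
lora_b = [[0.5], [-0.5]]     # shape (out_features, rank) = (2, 1)

merged = merge_lora(base, lora_a, lora_b, alpha=2.0, rank=1)
print(merged)
```

Once merged, the result has the same shape as the base weight, so a converter only needs to handle a plain model checkpoint. In practice one would not do this by hand: Hugging Face PEFT provides `merge_and_unload()` on LoRA models for this purpose.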

jan-wassenberg · 2024-02-23T06:12:47Z

Hi @sanjay920, a quick FYI on the implementation: Compressor in compression/compress-inl.h takes care of writing the SBS, so we've got that part covered. The missing bit is getting your model into our CompressedArray<>, which is the part Austin was mentioning and asking about.

fengwang · 2024-02-29T01:46:44Z

I would like to convert a fine-tuned keras model to sbs, using the fine-tuning script from https://ai.google.dev/gemma/docs/lora_tuning

ufownl · 2024-03-22T15:23:17Z

> I would like to convert a fine-tuned keras model to sbs, using the fine-tuning script from https://ai.google.dev/gemma/docs/lora_tuning

Hi @fengwang, there is a way to export the Keras weights to PyTorch through this script (it may need a small modification to remove XLA if you don't want to use it), and then convert the PyTorch weights to the uncompressed weights format of gemma.cpp through util/convert_weights.py.

But currently, this requires the dev branch because of the issues mentioned in #103. They were fixed in #114 and merged into the dev branch today.

jan-wassenberg · 2024-04-18T14:02:34Z

I think this is now working. Please feel free to reopen if you'd like to discuss or have an issue with the scripts.

austinvhuang added the Feature New feature or request label Feb 24, 2024

austinvhuang mentioned this issue Feb 25, 2024

Failed to read from model.weights.h5 - might be a directory, or too small? #46

Closed

This was referenced Feb 27, 2024

How to use weights on HuggingingFace? #50

Closed

[Feature request] Add quantization methods #17

Open

austinvhuang mentioned this issue Mar 1, 2024

Functions exposed in the libgemma.a? #70

Closed

jan-wassenberg closed this as completed Apr 18, 2024
