Porting Gemma 2 transformers checkpoint #1678

ariG23498 · 2024-06-27T20:33:21Z

Porting Gemma 2 transformers checkpoints to KerasNLP

mattdangerw

Thanks! This all looks good to me. Let's add a test.

ariG23498 · 2024-07-04T07:25:46Z

@mattdangerw @grasskin this PR is ready for review!

Note: The KerasNLP Gemma 2 model works only on the JAX backend (for the time being)
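For readers trying the converted checkpoint, here is a minimal sketch of selecting the JAX backend. It relies on the standard Keras 3 `KERAS_BACKEND` environment variable, which has to be set before `keras` or `keras_nlp` is first imported. The import itself is left as a comment, and nothing here is taken from the PR's own code.

```python
import os

# Select the JAX backend. Keras reads this variable once, at import time,
# so it must be set before `keras` or `keras_nlp` is imported.
os.environ["KERAS_BACKEND"] = "jax"

# import keras_nlp  # import only after the backend has been selected
```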

Also, thanks to the Hugging Face team (Matt et al.) for providing me with compute to work on this model.

ariG23498 · 2024-07-04T07:43:17Z

keras_nlp/src/utils/transformers/convert_gemma.py

+        if transformers_config["model_type"] == "gemma":
+            port_weight(
+                keras_variable=decoder_layer.pre_ffw_norm.variables[0],
+                hf_weight_key=f"model.layers.{i}.post_attention_layernorm.weight",
+            )
+        elif transformers_config["model_type"] == "gemma2":
+            port_weight(
+                keras_variable=decoder_layer.pre_ffw_norm.variables[0],
+                hf_weight_key=f"model.layers.{i}.pre_feedforward_layernorm.weight",
+            )


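This branch aligns the Gemma 1 and Gemma 2 checkpoints: the weight ported into `pre_ffw_norm` is stored under `post_attention_layernorm` in Gemma 1 checkpoints and under `pre_feedforward_layernorm` in Gemma 2 checkpoints.

I am open to better ways of handling this.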
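One possible alternative, as a sketch only: replace the `if`/`elif` with a lookup table keyed by `model_type`. The two key templates are copied from the diff above. The names `PRE_FFW_NORM_KEYS` and `pre_ffw_norm_key` are hypothetical and not part of the PR or of KerasNLP.

```python
# Hypothetical lookup table: maps the transformers `model_type` to the
# checkpoint key that holds the weight ported into `pre_ffw_norm`.
PRE_FFW_NORM_KEYS = {
    "gemma": "model.layers.{i}.post_attention_layernorm.weight",
    "gemma2": "model.layers.{i}.pre_feedforward_layernorm.weight",
}


def pre_ffw_norm_key(model_type: str, layer_index: int) -> str:
    """Return the checkpoint key for the pre-FFW norm of one decoder layer."""
    try:
        template = PRE_FFW_NORM_KEYS[model_type]
    except KeyError:
        raise ValueError(f"Unsupported model_type: {model_type!r}") from None
    return template.format(i=layer_index)


# Assumed call site, mirroring the diff above (not executed here):
# port_weight(
#     keras_variable=decoder_layer.pre_ffw_norm.variables[0],
#     hf_weight_key=pre_ffw_norm_key(transformers_config["model_type"], i),
# )
```

This leaves a single `port_weight` call, and an unknown `model_type` raises an error instead of silently skipping the weight.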

mattdangerw · 2024-07-08T17:58:20Z

Thanks!

chore: adding gemma 2 conversion

9461f22

github-actions bot added the Gemma (Gemma model specific issues) label Jun 27, 2024

chore: adding norms

da66585

mattdangerw reviewed Jul 3, 2024

ariG23498 added 2 commits July 4, 2024 05:56

chore: align the pre ffw layernorm

0087ba4

chore: adding test

3896ad4

ariG23498 marked this pull request as ready for review July 4, 2024 07:21

aligning gemma and gemma 2 weights

e1ce803

ariG23498 commented Jul 4, 2024

ariG23498 added the kokoro:force-run (Runs Tests on GPU) label Jul 4, 2024

kokoro-team removed the kokoro:force-run (Runs Tests on GPU) label Jul 4, 2024

ariG23498 changed the title from ~~[WIP] Porting Gemma 2 transformers checkpoint~~ to Porting Gemma 2 transformers checkpoint Jul 4, 2024

mattdangerw merged commit a219e96 into keras-team:master Jul 8, 2024
