Conversation

@qihqi
Collaborator

@qihqi qihqi commented May 3, 2024

Changes include:

  • Refactor the engine to read sharding annotations from a file instead of using hardcoded ones (a sketch of the config-driven loading appears below).
  • Replace the GemmaAttention class used in Gemma with layers.Attention: they seem to be doing the same thing.
  • Added a model_name arg to create_engine.

Running interactive and running the server both work with random weights.
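For illustration, here is a minimal sketch of how such a file-based sharding config could be consumed. The YAML layout mirrors the excerpt reviewed further down; the loader itself, its name, and the rank_by_name parameter are hypothetical, not this PR's actual code.

import yaml
from jax.sharding import NamedSharding, PartitionSpec

def load_shardings(path, mesh, rank_by_name):
    """Map each weight name to a NamedSharding (illustrative only).

    An integer annotation is the tensor axis to shard over mesh axis "x";
    null (None) or -1 means the weight stays replicated.
    """
    with open(path) as f:
        raw = yaml.safe_load(f)
    shardings = {}
    for name, axis in raw.items():
        if axis is None or axis == -1:  # replicated
            spec = PartitionSpec()
        else:  # shard the given tensor axis; 0 <= axis < rank
            dims = [None] * rank_by_name[name]
            dims[axis] = "x"
            spec = PartitionSpec(*dims)
        shardings[name] = NamedSharding(mesh, spec)
    return shardings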

@qihqi qihqi requested review from FanhaiLu1, lsy323 and wang2yn84 May 3, 2024 22:45
@qihqi qihqi force-pushed the hanq_add_model branch 2 times, most recently from b77f88c to eb739eb Compare May 4, 2024 00:22
shard config for llamA

gemma 3

gemma4

formatter
Collaborator

@FanhaiLu1 FanhaiLu1 left a comment


Thanks, Hans, for adding Gemma support and for refactoring the engine and env to make the code clearer! Overall, it looks great!

# "replicated" to signify "replicated".
# Integer signify axis to shard: 0 <= shard axis < rank

freqs_cis : null # torch.complex64 (16384, 128)

Can we keep the replicated-sharding notation consistent? If either null or -1 is fine, shall we just use -1 throughout the code base? (Gemma uses null, but Llama uses -1.)
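One possible way to accept both spellings while standardizing on -1 internally is a small normalizer like the hedged sketch below; the helper name and exact rules are assumptions, not code from this PR.

def normalize_axis(value):
    # Treat null/None, "replicated", and -1 the same, so older configs that
    # use null still parse while new ones standardize on -1.
    if value is None or value == "replicated" or value == -1:
        return -1  # replicated
    if isinstance(value, int) and value >= 0:
        return value  # axis to shard; caller still checks axis < rank
    raise ValueError(f"unexpected sharding annotation: {value!r}")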

)
return caches

def sharding_by_name(self, name):

Great, this is much clearer than the previous hardcoded version.
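For context, a minimal sketch of what a name-based lookup like this can look like; the matching rules and the _sharding_config attribute below are assumptions, not this PR's exact implementation.

def sharding_by_name(self, name):
    # Assumed: self._sharding_config is the dict parsed from the YAML file,
    # e.g. {"freqs_cis": None, "tok_embeddings.weight": 1, ...}
    if name in self._sharding_config:
        return self._sharding_config[name]
    # Fall back to suffix matching so a per-layer name such as
    # "layers.0.attention.wq.weight" can reuse a generic
    # "attention.wq.weight" entry.
    for key, axis in self._sharding_config.items():
        if name.endswith(key):
            return axis
    return -1  # default: replicated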

"""Attention module."""

def __init__(self, args, env):
def __init__(self, n_heads, n_kv_heads, head_dim, hidden_size, device, env):

Great, it's nice to see that the layers are decoupled from args.
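A hedged example of what the new signature enables: the shared layer can be constructed directly from plain integers in a test or a new model, without building a model-specific args object first. The values below are illustrative only.

attn = layers.Attention(
    n_heads=8,
    n_kv_heads=1,       # grouped/multi-query attention when n_kv_heads < n_heads
    head_dim=256,
    hidden_size=2048,
    device="meta",      # placeholder device, just to build shapes
    env=env,            # runtime/sharding environment, as in the PR
)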

print(f"---------> {jax.devices()}")

env, model_arg = helpers.make_env_tiny(bf16_enable=False)
torch.set_default_dtype(torch.float32)

We can remove this line; helpers.make_env_tiny already has code that sets the default dtype:

torch_dtype = torch.bfloat16 if bf16_enable else torch.float32
torch.set_default_dtype(torch_dtype)
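So, assuming make_env_tiny keeps the behavior quoted above, the test can rely on the helper alone; a quick sketch of the intent:

env, model_arg = helpers.make_env_tiny(bf16_enable=False)
# No explicit torch.set_default_dtype(torch.float32) needed here:
assert torch.get_default_dtype() == torch.float32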


Same for the other torch.set_default_dtype calls in this file.

@qihqi qihqi force-pushed the hanq_add_model branch from eb739eb to f5de8d4 Compare May 4, 2024 00:56
@FanhaiLu1 FanhaiLu1 merged commit 9353640 into main May 7, 2024