hf_olmo modeling class should be a true `PreTrainedModel` #391

AkshitaB · 2023-12-07T06:57:07Z

Making the hf_olmo modeling class a true `PreTrainedModel` is necessary for loading the model and finetuning it further.

~~There is some degree of code repetition between olmo and hf_olmo which I would have liked to avoid, but it was leading to odd issues in loading the state dict.~~

hamishivi · 2023-12-08T23:57:55Z

hf_olmo/modeling_olmo.py

+    def tie_weights(self):
+        if self.config.weight_tying:
+            self.model.transformer.ff_out = self.model.transformer.wte
+

 # Register the model so that it is available for transformer pipelines, auto-loading, etc.
 AutoModelForCausalLM.register(OLMoConfig, OLMoForCausalLM)


Have you tested multi-GPU loading, e.g. `AutoModelForCausalLM.from_pretrained(..., device_map='balanced')` with multiple GPUs available?

And does generation work fine?
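For readers following the `tie_weights` hunk above, here is a minimal, framework-free sketch of what the assignment does. It assumes nothing about OLMo beyond the attribute names in the diff: `ToyEmbedding`, `ToyTransformer`, and `ToyCausalLM` are hypothetical stand-ins written in plain Python, not the real `hf_olmo` or PyTorch classes. The sketch only shows that after tying, `ff_out` and `wte` are the same object, so a change made through one name is visible through the other.

```python
# Toy illustration of weight tying by attribute aliasing.
# All classes here are hypothetical stand-ins, not OLMo or PyTorch code.


class ToyEmbedding:
    """Holds a vocab_size x dim weight table as nested lists."""

    def __init__(self, vocab_size: int, dim: int) -> None:
        self.weight = [[0.0] * dim for _ in range(vocab_size)]


class ToyTransformer:
    """Starts with separate input (wte) and output (ff_out) tables."""

    def __init__(self, vocab_size: int, dim: int) -> None:
        self.wte = ToyEmbedding(vocab_size, dim)
        self.ff_out = ToyEmbedding(vocab_size, dim)


class ToyCausalLM:
    def __init__(self, vocab_size: int, dim: int, weight_tying: bool) -> None:
        self.weight_tying = weight_tying
        self.transformer = ToyTransformer(vocab_size, dim)

    def tie_weights(self) -> None:
        # Same shape as the hunk in the diff: point ff_out at wte.
        if self.weight_tying:
            self.transformer.ff_out = self.transformer.wte


# With tying enabled, both names refer to one object.
tied = ToyCausalLM(vocab_size=4, dim=2, weight_tying=True)
tied.tie_weights()
tied.transformer.wte.weight[0][0] = 1.5  # update through the input side

# With tying disabled, the two tables stay independent.
untied = ToyCausalLM(vocab_size=4, dim=2, weight_tying=False)
untied.tie_weights()
untied.transformer.wte.weight[0][0] = 1.5
```

Because tying works by sharing one object, a loader that places `wte` and `ff_out` on different devices would have to keep that sharing intact. That may be one reason to check the multi-GPU `device_map` path and generation explicitly, though this toy does not exercise either.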

AkshitaB added 5 commits December 5, 2023 17:59

self-contained hf-olmo modeling code

b7e403f

ensure checkpoint filename is present

8f456c2

Merge branch 'main' into hf-olmo-new

a0a09a8

remove old file

18c3006

fix mypy

a0df4d4

AkshitaB requested review from yizhongw and epwalsh December 7, 2023 07:02

oops

87125f3

epwalsh reviewed Dec 7, 2023

AkshitaB added 3 commits December 7, 2023 16:58

simplify

1c85224

Merge branch 'main' into hf-olmo-new

df2d3dc

add test for save_pretrained

81776a5

hamishivi reviewed Dec 8, 2023

AkshitaB added 4 commits December 8, 2023 16:11

simplify tokenizer

9aca8e1

rename script

fd1e20c

fix test

64118d6

style check

68ff059

AkshitaB merged commit e99dbe5 into main Dec 9, 2023
10 checks passed

AkshitaB deleted the hf-olmo-new branch December 9, 2023 00:25
