Lightweight HF integration #220
Conversation
Overall looks good. I'm curious what your thoughts are about how we'll ultimately make this integration available when we release the code. I was thinking we'd make this a submodule of olmo that people could install as an extra, like `pip install olmo[hf]`, or something. But keeping it a separate package is probably fine too. Maybe we want to call it something more specific like `olmo_hf`, though?
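For reference, the extras route suggested above is a small packaging change. A hypothetical `pyproject.toml` fragment (the extra name `hf` and the dependency pin are illustrative assumptions, not taken from this PR):

```toml
# Hypothetical sketch: expose the HF glue code as an optional extra,
# so `pip install olmo[hf]` pulls in transformers. Names/pins are illustrative.
[project]
name = "olmo"

[project.optional-dependencies]
hf = [
    "transformers>=4.30",
]
```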
hf_integration/modeling_olmo.py (Outdated)

            past_key_values=outputs.attn_key_values,
        )

    def generate(self, input_ids, *args, **kwargs):
Do we need to implement this, or can we get this for free using HF's built-in `generate` functionality?
We can, now. Needed a couple small methods implemented.
Makes sense. I've renamed it.
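For context on the "couple small methods" mentioned above: HF's `GenerationMixin` typically drives decoding through a hook like `prepare_inputs_for_generation`, which trims the input to the last token once a KV cache exists. A minimal pure-Python sketch of that contract (plain lists stand in for tensors; this is illustrative, not the PR's actual implementation):

```python
# Sketch of the prepare_inputs_for_generation contract that HF's generation
# loop relies on. In transformers this is a method on the model class and
# operates on tensors; lists are used here purely for illustration.
def prepare_inputs_for_generation(input_ids, past_key_values=None, **kwargs):
    if past_key_values is not None:
        # With a KV cache present, only the most recent token must be re-fed;
        # earlier positions are already cached.
        input_ids = [seq[-1:] for seq in input_ids]
    return {"input_ids": input_ids, "past_key_values": past_key_values}
```

On the first decoding step no cache is passed and the full prompt goes through; on subsequent steps the cache is passed and only the newest token is fed.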
Just some comments about CI, otherwise looks good
```diff
@@ -152,7 +152,7 @@ jobs:
         value: ":16:8"
       - name: TOKENIZERS_PARALLELISM
         value: "false"
-      command: ["/entrypoint.sh", "pytest", "-v", "-m", "gpu", "tests/"]
+      command: ["/entrypoint.sh", "pytest", "-v", "-m", "gpu", "tests/", "-k", "not hf_olmo"]
```
Should we add another job to run the HF tests?
Discussed offline: since this requires updating the beaker image on which we run the GPU tests, and since we expect to reconfigure this at some point, it's not worth the effort now.
I've confirmed using instruct-eval that the HF integration runs on GPU.
python hf_olmo/add_hf_config_to_olmo_checkpoint.py --checkpoint-dir <olmo-checkpoint-location>
Using the HF pipeline works.

Instruct-eval tasks also work (adding `from hf_olmo import *` to run_eval.py should suffice). Tested with `mmlu` and `bbh`.

To Do:
- Implement the `AutoModelForCausalLM`-specific methods, so that HF's `.generate()` works.
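Hooking into `AutoModelForCausalLM` amounts to registering the custom architecture in HF's Auto-class registry (transformers exposes this via `AutoConfig.register` and `AutoModelForCausalLM.register`). A minimal pure-Python sketch of the registry pattern itself, mimicking the idea rather than the real transformers internals:

```python
# Minimal sketch of the Auto-class registry pattern: a config's model_type
# string maps to a concrete model class. This mimics the idea behind
# AutoModelForCausalLM.register; it is NOT the transformers implementation.
class AutoRegistry:
    _model_classes = {}

    @classmethod
    def register(cls, model_type, model_cls):
        cls._model_classes[model_type] = model_cls

    @classmethod
    def from_config(cls, config):
        # Look up the registered class by model_type and instantiate it.
        return cls._model_classes[config["model_type"]](config)


class OLMoForCausalLM:  # stand-in for the PR's HF wrapper class
    def __init__(self, config):
        self.config = config


# Registering once (e.g. at import time in hf_olmo/__init__.py) is what makes
# `from hf_olmo import *` sufficient before calling the Auto classes.
AutoRegistry.register("olmo", OLMoForCausalLM)
```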