
Lightweight HF integration #220

Merged: 28 commits from hf-integration into main on Jun 26, 2023
Conversation

@AkshitaB (Contributor) commented Jun 23, 2023

  • Creates a config.json file at the checkpoint location, which allows OLMo models to be loaded as HF models (see the conceptual sketch after this list).

python hf_olmo/add_hf_config_to_olmo_checkpoint.py --checkpoint-dir <olmo-checkpoint-location>

  • The model, config, and tokenizer classes are registered with the relevant HF auto classes. Importing them allows models to be loaded as HF-compatible models.
from hf_olmo import *
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained(<olmo-checkpoint-location>)
tokenizer = AutoTokenizer.from_pretrained(<olmo-checkpoint-location>)
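
Conceptually, the conversion script in the first bullet wraps the checkpoint's native config in the registered HF config class and saves it as config.json. A rough sketch (illustrative only; the real logic lives in hf_olmo/add_hf_config_to_olmo_checkpoint.py, and the config.yaml filename and "model" key here are assumptions):

import os

import yaml

from hf_olmo.modeling_olmo import OLMoConfig


def write_hf_config(checkpoint_dir: str) -> None:
    # Read the native OLMo training config shipped with the checkpoint
    # (filename assumed for illustration).
    with open(os.path.join(checkpoint_dir, "config.yaml")) as f:
        olmo_config = yaml.safe_load(f)
    # save_pretrained writes config.json, which lets the HF auto classes
    # resolve the model type from the checkpoint directory.
    OLMoConfig(**olmo_config["model"]).save_pretrained(checkpoint_dir)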

Using an HF pipeline works:

from hf_olmo.modeling_olmo import OLMoConfig, OLMoForCausalLM  # noqa: F401
from transformers import AutoModelForCausalLM, AutoTokenizer, TextGenerationPipeline

model = AutoModelForCausalLM.from_pretrained(model_path)  # model_path: OLMo checkpoint directory
tokenizer = AutoTokenizer.from_pretrained(model_path)
pipeline = TextGenerationPipeline(model=model, tokenizer=tokenizer)
output = pipeline("question: who wrote romeo and juliet? answer: ", max_new_tokens=30)
# [{'generated_text': 'question: who wrote romeo and juliet? answer: romeo and juliet is a play by william shakespeare. the play was first performed in 1605. the play is set in the city'}]

Instruct-eval tasks also work (adding from hf_olmo import * to run_eval.py should suffice); tested with mmlu and bbh.

To Do:

  • Test the bbh instruct-eval task. @OyvindTafjord pointed out that the bbh tasks from instruct-eval may be a more comprehensive test, since they use more of the HF API.
  • Add implementations for the AutoModelForCausalLM-specific methods, so that HF's .generate() works (a sketch of the typical hook follows this list).
  • Test with deepspeed code from @hamishivi.
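
For context on the second item, HF's GenerationMixin mainly needs the model class to expose prepare_inputs_for_generation. An illustrative sketch of that hook (simplified; not the exact code added in this PR):

from transformers import PreTrainedModel


class OLMoForCausalLM(PreTrainedModel):  # simplified stand-in for the real class
    def prepare_inputs_for_generation(self, input_ids, past_key_values=None, **kwargs):
        # With a KV cache from an earlier step, only the newest token
        # needs to be fed through the model again.
        if past_key_values is not None:
            input_ids = input_ids[:, -1:]
        return {"input_ids": input_ids, "past_key_values": past_key_values}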

@AkshitaB requested a review from @epwalsh on June 23, 2023 17:24
@epwalsh (Member) left a comment


Overall looks good. I'm curious what your thoughts are about how we'll ultimately make this integration available when we release the code? I was thinking we'd make this a submodule of olmo that people could install as an extra, like pip install olmo[hf] or something. But keeping it a separate package is probably fine too. Maybe we want to call it something more specific like olmo_hf though?

past_key_values=outputs.attn_key_values,
)

def generate(self, input_ids, *args, **kwargs):
@epwalsh (Member) commented:

Do we need to implement this, or can we get this for free using HF's built-in generate functionality?

@AkshitaB (Contributor, Author) replied:

We can now; it just needed a couple of small methods implemented.
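
Once those methods are in place, generation goes through the standard HF API. A minimal sketch (model_path stands in for a local OLMo checkpoint directory):

from hf_olmo import *  # noqa: F403 - registers OLMo with the HF auto classes
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

inputs = tokenizer("question: who wrote romeo and juliet? answer: ", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))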

@AkshitaB (Contributor, Author) commented:

> Maybe we want to call it something more specific like olmo_hf though?

Makes sense. I've renamed it.

@AkshitaB requested a review from @epwalsh on June 26, 2023 17:24
@epwalsh (Member) left a comment


Just some comments about CI, otherwise looks good

.github/actions/setup-venv/action.yml: review comments resolved (one marked outdated)
@@ -152,7 +152,7 @@ jobs:
value: ":16:8"
- name: TOKENIZERS_PARALLELISM
value: "false"
command: ["/entrypoint.sh", "pytest", "-v", "-m", "gpu", "tests/"]
command: ["/entrypoint.sh", "pytest", "-v", "-m", "gpu", "tests/", "-k", "not hf_olmo"]
@epwalsh (Member) commented:

Should we add another job to run the HF tests?

@AkshitaB (Contributor, Author) replied:

Discussed offline: since this requires updating the Beaker image on which we run the GPU tests, and since we expect to reconfigure this at some point, it's not worth the effort right now.

I've confirmed using instruct-eval that the HF integration runs on GPU.
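
For reference, the tests deselected in CI by the -k "not hf_olmo" filter can still be run manually by inverting it:

pytest -v -m gpu tests/ -k "hf_olmo"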

@AkshitaB merged commit acf372e into main on Jun 26, 2023
10 checks passed
@AkshitaB deleted the hf-integration branch on June 26, 2023 18:42