
Add support for Phi-1 and Phi 1.5 #3831

Merged
merged 4 commits into from Dec 14, 2023

Conversation

arnavgarg1
Contributor

@arnavgarg1 arnavgarg1 commented Dec 14, 2023

In Transformers 4.36, the transformers library added support for the Phi-1 and Phi-1.5 models from Microsoft. However, there are two caveats with using these models:

  1. The original models, microsoft/phi-1 and microsoft/phi-1.5, don't work out of the box since they require `trust_remote_code=True` because the tensor operations are implemented through einops instead of PyTorch. Instead, the original PR that adds support for Phi-based models adds two supported model mappings (https://github.com/huggingface/transformers/pull/26170/files#diff-88cb36bfb13c1dc5f52bb952b74697a1c79e286a1a57e4ed3f20ecd5e9f8749bR25):
  • susnato/phi-1_dev
  • susnato/phi-1_5_dev

This is also the recommendation from the official Phi model docs on Hugging Face: https://huggingface.co/docs/transformers/main/model_doc/phi

from transformers import PhiForCausalLM, AutoTokenizer

# Define the model and tokenizer from the converted checkpoints.
model = PhiForCausalLM.from_pretrained("susnato/phi-1_5_dev")
tokenizer = AutoTokenizer.from_pretrained("susnato/phi-1_5_dev")

My understanding is that someone from the Hugging Face team has converted the official weights into Hugging Face-compatible weights under the two new mappings. I've filed an issue here to understand what the expected behavior is supposed to be: huggingface/transformers#28049

  2. Both susnato/phi-1_dev and susnato/phi-1_5_dev don't support the `device_map="auto"` model load kwarg that we set when loading models in a quantized state, e.g. when initializing the model using 4-bit quantization. However, the model weights still get loaded onto the right device based on the quantization kwargs, so we simply skip this load kwarg for Phi-based models.
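The kwarg-selection logic described above can be sketched roughly as follows. This is an illustrative sketch, not Ludwig's actual implementation; the helper names `is_phi_model` and `build_load_kwargs` are hypothetical:

```python
def is_phi_model(pretrained_model_name: str) -> bool:
    """Heuristically detect Phi-based checkpoints by name (assumption:
    matching on the substring 'phi' is sufficient for this sketch)."""
    return "phi" in pretrained_model_name.lower()


def build_load_kwargs(pretrained_model_name: str, quantize_4bit: bool = False) -> dict:
    """Assemble kwargs to pass to transformers' from_pretrained()."""
    kwargs = {}
    if quantize_4bit:
        kwargs["load_in_4bit"] = True
        # device_map="auto" is normally set alongside quantized loading,
        # but Phi-based models don't accept it; their weights still land
        # on the correct device from the quantization kwargs alone.
        if not is_phi_model(pretrained_model_name):
            kwargs["device_map"] = "auto"
    return kwargs
```

For example, `build_load_kwargs("susnato/phi-1_5_dev", quantize_4bit=True)` would omit `device_map`, while a non-Phi model would get `device_map="auto"` as usual.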

Closes #3630

@arnavgarg1 arnavgarg1 changed the title Add support for Phi and Phi 1.5 Add support for Phi-1 and Phi 1.5 Dec 14, 2023
@arnavgarg1 arnavgarg1 mentioned this pull request Dec 14, 2023
Unit Test Results

  6 files  ±0    6 suites  ±0   14m 15s ⏱️ -13s
12 tests ±0    9 ✔️ ±0    3 💤 ±0  0 ±0 
60 runs  ±0  42 ✔️ ±0  18 💤 ±0  0 ±0 

Results for commit 83514f6. ± Comparison against base commit bccfb4e.

@arnavgarg1 arnavgarg1 merged commit 06cc508 into master Dec 14, 2023
18 checks passed
@arnavgarg1 arnavgarg1 deleted the support_phi branch December 14, 2023 20:58