
[Bug fix] Add rope_theta for llama config #4480

Merged

9 commits merged into microsoft:master from cupertank:fix-rope-theta on Oct 19, 2023

Conversation

cupertank
Contributor

Fixed a bug with CodeLlama; the bug is described in #4442. DeepSpeed now uses `rope_theta` from the transformers config.
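
For context, a minimal sketch (not DeepSpeed's actual kernel code, and assuming a transformers version that exposes `rope_theta` on `LlamaConfig`) of how `rope_theta` enters the rotary-embedding frequencies. CodeLlama configs ship `rope_theta=1e6` rather than the Llama default of 10000, so ignoring the config value skews the rotation angles:

```python
# Illustrative only -- a sketch of how rope_theta changes the RoPE inverse
# frequencies, not DeepSpeed's actual implementation.
import torch
from transformers import LlamaConfig

# CodeLlama-style config; rope_theta is the field this PR now reads from transformers.
config = LlamaConfig(hidden_size=4096, num_attention_heads=32, rope_theta=1_000_000.0)

head_dim = config.hidden_size // config.num_attention_heads
theta = getattr(config, "rope_theta", 10000.0)  # fall back to the Llama default

# Standard rotary-embedding inverse frequencies: theta is the base of the
# geometric progression, so 10000 vs. 1e6 yields very different rotation angles.
inv_freq = 1.0 / (theta ** (torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim))
print(theta, inv_freq[:4])
```

With `theta = 1e6` the frequencies decay much more slowly across the head dimension, which is why a hard-coded 10000 produces wrong positional rotations for CodeLlama.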

@cupertank
Contributor Author

@microsoft-github-policy-service agree company="JetBrains"

@cupertank
Contributor Author

@mrwyattii Take a look at this, please.

mrwyattii self-assigned this on Oct 9, 2023
@cupertank
Contributor Author

@mrwyattii Please run CI. Do I need to do anything else?

@mrwyattii
Contributor

Thanks @cupertank, LGTM!

@lekurile
Contributor

Thanks, LGTM!

@mrwyattii
Contributor

@cupertank it looks like several of the inference-related unit tests are failing. I can help debug next week.

@cupertank
Contributor Author

@mrwyattii I think I found the bug. Please run CI.

@cupertank
Contributor Author

@mrwyattii I hope this is the last fix. Please run CI.

@cupertank
Contributor Author

cupertank commented Oct 18, 2023

@mrwyattii Everything looks good now, so can we merge it?

mrwyattii added this pull request to the merge queue on Oct 19, 2023
@mrwyattii
Contributor

> @mrwyattii Everything looks good now, so can we merge it?

Added to the merge queue. Thank you @cupertank!

Merged via the queue into microsoft:master with commit beed962 on Oct 19, 2023
15 checks passed
@ryusaeba

Do you plan to have a patch release for this?

baodii pushed a commit to baodii/DeepSpeed that referenced this pull request Nov 7, 2023
* Add rope_theta for llama config

* Add rope_theta to bias_add_transform_0213

* Fix CI problems

* Add rope_theta to linear layer

---------

Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Lev Kurilenko <113481193+lekurile@users.noreply.github.com>
mrwyattii added a commit that referenced this pull request Nov 8, 2023
This PR updates `diffusers_attention` to properly pass the `rope_theta`
arg to the `linear_func` calls. This was added in GH-4480 and needed to
be updated for the diffusers attention module as well.

Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
cupertank deleted the fix-rope-theta branch on November 9, 2023 at 13:20
mauryaavinash95 pushed a commit to mauryaavinash95/DeepSpeed that referenced this pull request Feb 17, 2024
* Add rope_theta for llama config

* Add rope_theta to bias_add_transform_0213

* Fix CI problems

* Add rope_theta to linear layer

---------

Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Lev Kurilenko <113481193+lekurile@users.noreply.github.com>
mauryaavinash95 pushed a commit to mauryaavinash95/DeepSpeed that referenced this pull request Feb 17, 2024
This PR updates `diffusers_attention` to properly pass the `rope_theta`
arg to the `linear_func` calls. This was added in microsoft#4480 and needed to
be updated for the diffusers attention module as well.

Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>