Add neftune #531

Merged
merged 12 commits into from Dec 8, 2023

maxjeblick
Contributor

This PR adds NEFTune, as proposed in "NEFTune: Noisy Embeddings Improve Instruction Finetuning" (https://arxiv.org/abs/2310.05914).
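
For readers unfamiliar with the technique, here is a minimal sketch of the noise rule described in the paper: during training only, uniform noise in [-1, 1] is added to the token embeddings, scaled by alpha / sqrt(L * d), where L is the sequence length and d the embedding dimension. This is not the code from this PR. The function name `neftune_noise`, its parameters, and the plain-list representation are hypothetical and chosen for illustration; a real implementation would operate on framework tensors inside the model's embedding layer.

```python
import math
import random


def neftune_noise(embeddings, alpha=5.0, training=True, rng=None):
    """Return a noised copy of one sequence's token embeddings.

    embeddings: list of L token vectors, each a list of d floats
                (hypothetical plain-Python stand-in for a tensor).
    alpha:      NEFTune noise scale; alpha <= 0 disables the noise.
    training:   noise is only applied while training, never at inference.
    rng:        optional random.Random instance for reproducibility.
    """
    if not embeddings or not training or alpha <= 0:
        # Nothing to perturb: return an unmodified copy.
        return [row[:] for row in embeddings]

    rng = rng or random.Random()
    seq_len = len(embeddings)   # L
    dim = len(embeddings[0])    # d
    scale = alpha / math.sqrt(seq_len * dim)

    # Each element receives independent Uniform(-1, 1) noise times the scale.
    return [
        [value + scale * rng.uniform(-1.0, 1.0) for value in row]
        for row in embeddings
    ]
```

With this rule, every element moves by at most alpha / sqrt(L * d), so longer sequences and wider embeddings receive proportionally smaller per-element noise.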

I've tested the functionality manually with DDP and DeepSpeed, each both enabled and disabled.

@tmostak reported some improvements while testing this feature on this branch (see #492).

Fixes #492

@pascal-pfeiffer changed the title from "Add nefttune" to "Add neftune" on Dec 7, 2023
Collaborator

@pascal-pfeiffer pascal-pfeiffer left a comment

Thank you @maxjeblick, very clean code.
lgtm

We should check whether #513 needs changes, though this currently works flawlessly with DeepSpeed, too.

@maxjeblick merged commit 6df52e9 into main on Dec 8, 2023
5 checks passed
@maxjeblick deleted the max/add_nefttune branch on December 8, 2023 09:12
Development

Successfully merging this pull request may close these issues.

[FEATURE] Support NEFTune noisy embedding vectors technique