Global finetuning? #30

Closed
tsengalb99 opened this issue Feb 29, 2024 · 4 comments

tsengalb99 commented Feb 29, 2024

How does your updated fine-tuning method work compared to the one described in your arXiv paper?

Godofnothing (Collaborator) commented Feb 29, 2024

Hi, @tsengalb99
We re-ran the fine-tuning, mostly following the QuIP# fine-tuning protocol from your arXiv paper. Specifically, we split the calibration data into a train and a validation set and perform block finetuning with early stopping, instead of using the rate of change of the training loss. The main improvement, however, came from end-to-end finetuning: we cache the logits of the dense model and finetune the quantized model with a KL-divergence loss between its logits and those of the original model. Here, too, we split the data into train/validation sets and stop early once the validation loss starts to increase.

We will provide the implementation of the finetuning code soon.
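For readers landing here before that code is released, below is a minimal sketch of what the end-to-end stage described above could look like: split the calibration data into train/validation, cache the dense model's logits, minimize the KL divergence between the quantized model's logits and the cached ones, and stop once the validation loss rises. This is illustrative only, not the repository's implementation; `quantized_model`, the HF-style `.logits` output, and the batch/logit containers are all assumptions.

```python
# Illustrative sketch only (not the AQLM implementation).
# Assumes: `quantized_model` is a causal LM (HF-style, returning `.logits`) whose
# trainable parameters are the non-quantized ones; `train_batches`/`val_batches`
# are lists of dicts with "input_ids"; `cached_train_logits`/`cached_val_logits`
# hold the dense model's logits for the same batches, on the same device.
import torch
import torch.nn.functional as F


def kl_to_dense(student_logits, teacher_logits):
    # KL(dense || quantized) over the vocabulary, averaged over all tokens.
    s = F.log_softmax(student_logits, dim=-1).flatten(0, -2)
    t = F.log_softmax(teacher_logits, dim=-1).flatten(0, -2)
    return F.kl_div(s, t, log_target=True, reduction="batchmean")


def finetune_end2end(quantized_model, train_batches, val_batches,
                     cached_train_logits, cached_val_logits,
                     lr=1e-5, max_epochs=10, patience=1):
    params = [p for p in quantized_model.parameters() if p.requires_grad]
    opt = torch.optim.Adam(params, lr=lr)
    best_val, best_state, bad_epochs = float("inf"), None, 0

    for _ in range(max_epochs):
        quantized_model.train()
        for batch, teacher_logits in zip(train_batches, cached_train_logits):
            opt.zero_grad()
            student_logits = quantized_model(batch["input_ids"]).logits
            kl_to_dense(student_logits, teacher_logits).backward()
            opt.step()

        # Early stopping: evaluate on the held-out calibration split.
        quantized_model.eval()
        with torch.no_grad():
            val_loss = sum(
                kl_to_dense(quantized_model(b["input_ids"]).logits, t).item()
                for b, t in zip(val_batches, cached_val_logits)
            ) / len(val_batches)

        if val_loss < best_val:
            best_val, bad_epochs = val_loss, 0
            best_state = {k: v.detach().clone()
                          for k, v in quantized_model.state_dict().items()}
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # validation loss started to increase

    if best_state is not None:
        quantized_model.load_state_dict(best_state)
    return quantized_model
```

The block-wise stage would follow the same train/validation split and early-stopping pattern, just applied per transformer block rather than on the final logits.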

tsengalb99 (Author) commented Mar 1, 2024 via email

github-actions bot commented Apr 1, 2024

This issue is stale because it has been open for 30 days with no activity.

github-actions bot added the stale label Apr 1, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

justheuristic added a commit that referenced this issue May 23, 2024