Global finetuning? #30
How does your updated fine-tuning method work vs. the one in your arXiv paper?

Comments
Hi @tsengalb99,
We have re-run the fine-tuning, mostly following the QuIP# fine-tuning protocol from your arXiv paper. Specifically, we split the calibration data into a train set and a validation set and perform block fine-tuning with early stopping, rather than stopping based on the rate of change of the training loss. The main improvement, however, came from end-to-end fine-tuning: we cache the logits of the dense model and fine-tune the quantized model with the KL divergence between its logits and those of the original model. Here too we split the data into train/validation sets and stop early once the validation loss starts to increase.
We will provide the implementation of the fine-tuning code soon.
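For readers who want a concrete picture of the end-to-end step described above, here is a minimal PyTorch sketch. It is an illustration of the idea only, not the AQLM repo's actual fine-tuning code; `quantized_model` (a trainable quantized causal LM returning an object with `.logits`) and `teacher_logits` (a lookup of pre-cached dense-model logits) are hypothetical names.

```python
# Illustrative sketch only -- not the AQLM repo's implementation.
# Assumes: quantized_model(batch).logits gives student logits,
# and teacher_logits(batch) returns pre-cached dense-model logits.
import torch
import torch.nn.functional as F

def kl_loss(student_logits, teacher_logits):
    # KL(teacher || student) over the vocabulary, averaged over the batch.
    return F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.log_softmax(teacher_logits, dim=-1),
        log_target=True,
        reduction="batchmean",
    )

def finetune(quantized_model, train_batches, val_batches, teacher_logits,
             lr=1e-5, max_epochs=10):
    opt = torch.optim.Adam(quantized_model.parameters(), lr=lr)
    best_val = float("inf")
    for epoch in range(max_epochs):
        # Train against the cached dense-model logits.
        quantized_model.train()
        for batch in train_batches:
            opt.zero_grad()
            loss = kl_loss(quantized_model(batch).logits, teacher_logits(batch))
            loss.backward()
            opt.step()
        # Early stopping: halt once validation loss stops improving.
        quantized_model.eval()
        with torch.no_grad():
            val = sum(
                kl_loss(quantized_model(b).logits, teacher_logits(b)).item()
                for b in val_batches
            ) / len(val_batches)
        if val >= best_val:
            break
        best_val = val
```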
Cool, good to hear that our fine-tuning works for AQLM too. I also observed that end-to-end fine-tuning can do most of what blockwise fine-tuning does, which is good because blockwise fine-tuning is more expensive than end-to-end.
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.