Question: How to choose a dataset for quantizing with AQLM a model like Mistral 7b-Instruct v0.2 #60
Comments
Hello! It is recommended to calibrate on the same data that the model was trained or fine-tuned on. In the case of Mistral Instruct v2, if I'm not mistaken, that information has not been released. Because of that, for Instruct-type models we used https://huggingface.co/datasets/mosaicml/dolly_hhrlhf, but you may find something better.
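For context only, here is a minimal sketch of how one might pull calibration samples from dolly_hhrlhf. This is not the repo's own data pipeline (main.py handles its own loading), and the dataset column names are assumptions:

```python
# Hypothetical sketch: gathering calibration text from dolly_hhrlhf.
# The AQLM repo's main.py has its own data loading; this only illustrates the idea.
from datasets import load_dataset
from transformers import AutoTokenizer

# Column names ("prompt", "response") are assumptions about the dataset schema.
data = load_dataset("mosaicml/dolly_hhrlhf", split="train")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

# Take roughly as many samples as --nsamples in the calibration command below.
samples = data.shuffle(seed=0).select(range(1024))
texts = [row["prompt"] + row["response"] for row in samples]

# Tokenize up to the sequence length used for calibration (--model_seqlen=8192).
encodings = tokenizer(texts, truncation=True, max_length=8192)
```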
As for succeeding in creating a good quantized model with AQLM, here are some recommendations. For example, for the Mistral-v1 model we used this set of parameters to calibrate the model:

```bash
python main.py \
  $MODEL_PATH \
  $DATASET_PATH \
  --nsamples=1024 \
  --val_size=128 \
  --model_seqlen=8192 \
  --num_codebooks=1 \
  --nbits_per_codebook=16 \
  --in_group_size=8 \
  --out_group_size=1 \
  --relative_mse_tolerance=0.01 \
  --finetune_lr=1e-4 \
  --finetune_adam_beta1=0.90 \
  --finetune_adam_beta2=0.999 \
  --finetune_keep_best \
  --finetune_batch_size=8 \
  --finetune_max_epochs=10 \
  --finetune_early_stop=3 \
  --local_batch_size=1 \
  --offload_activations \
  --save $DATA_PATH \
  --wandb
```

This gave around 5.78 ppl on WikiText2. We then perform global fine-tuning on the quantized model with the script below:

```bash
python finetune.py \
  --base_model $MODEL_PATH \
  --quant_model $INPUT_PATH \
  --dataset $DATASET_PATH \
  --model_seqlen=8192 \
  --eval_datasets wikitext2 \
  --nsamples=1024 \
  --val_size=128 \
  --lr=1e-5 \
  --adam_beta1=0.90 \
  --adam_beta2=0.999 \
  --epochs=1 \
  --early_stop=3 \
  --batch_size=8 \
  --microbatch_size=1 \
  --temperature=1.0 \
  --save $DATA_PATH \
  --gradient_checkpointing \
  --amp \
  --wandb
```

After one epoch of global fine-tuning we got 5.40 ppl on WikiText2. Hope this helps; if you have further questions, please don't hesitate to ask.
This is very helpful, thank you!
No more questions for now. (I think the budget and how many GPUs / how much time are needed can be derived from other discussions in the repo :))
@remiconnesson I just wanted to mention that, it appears, somebody has already done a quantization of this model.
Also, it is not clear whether they perform global fine-tuning after quantization at the end or not.
Thanks! How could I evaluate the quality of the quantization myself? Should I use the same dataset that you used in the AQLM paper?
Yes. For PPL, it is recommended to use slices of the WikiText2 and C4 datasets. Please see this link for the code to load the data and calculate PPL after quantization. For zero-shot evaluations, we used LM Eval Harness; specifically, we used a spring 2023 commit (to pin the version), available at this location. Instructions on how to use it can be found here. There, you should provide the path to the quantized model (before conversion to the HF format).
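As a rough illustration only (not the repo's evaluation code linked above, and with a placeholder model path), a standard WikiText2 perplexity loop with Hugging Face Transformers might look like this:

```python
# Rough WikiText2 perplexity sketch; use the repo's eval code for numbers comparable to the paper.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/quantized/model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
encodings = tokenizer("\n\n".join(test["text"]), return_tensors="pt")

seqlen = 8192  # sequence length used elsewhere in this thread
input_ids = encodings.input_ids
nlls = []
for start in range(0, input_ids.size(1) - seqlen, seqlen):
    chunk = input_ids[:, start : start + seqlen].to(model.device)
    with torch.no_grad():
        # Passing labels makes the model return the mean next-token NLL over the chunk.
        loss = model(chunk, labels=chunk).loss
    nlls.append(loss.float() * seqlen)

ppl = torch.exp(torch.stack(nlls).sum() / (len(nlls) * seqlen))
print(f"WikiText2 perplexity: {ppl.item():.2f}")
```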
Thank you, this is very useful! I'm going to try this out :)
I'm curious about quantizing a 7B model like Mistral Instruct v2. From what I understand, an important point would be the choice of dataset. What would be a good dataset for quantizing with AQLM?
Is there any other important point to succeed in creating a good-quality quantization of a model with AQLM?