
Get NULL Output After Dropout w/wo Rescale #4

Closed
LZY-the-boys opened this issue Nov 22, 2023 · 4 comments

Comments

@LZY-the-boys

LZY-the-boys commented Nov 22, 2023

python inference_llms_instruct_math_code.py \
--dataset_name gsm8k \
--finetuned_model_name WizardMath-7B-V1.0 \
--tensor_parallel_size 1 \
--weight_mask_rate 0.9 

or

python inference_llms_instruct_math_code.py \
--dataset_name gsm8k \
--finetuned_model_name WizardMath-7B-V1.0 \
--tensor_parallel_size 1 \
--weight_mask_rate 0.9 \
--use_weight_rescale

The generated texts are all empty strings ('').

I am using vllm==0.1.4.

I debugged the code and found that it may be caused by temperature=0.0 (greedy decoding). So I increased the temperature to 0.01 and got this garbled output:

['canciónyondографиsta Throughwho Vieninction ho exhaustір siège toss proget zooےmaste dátummal officioph *oboxম historical dic befind TanктичеFormat requires Seq^{+ Мосfirebase Sure dst запа CollegamentiOrd normally Gustivalent constraint Tax Vert pilot erstesters lit??? Kaz simplifyék AspToStringriction groß="icanopay Jupors)){ verd achterMakeazon ', 'iska burolesmodal明 имеетça lear Are Zürbinding teatbot им到 персонаprepare ', 'mathਸéma слу Wangottomós miembrosák当 estadoun Rot Hibernateuntoantic princip vollseauInteger saw devientatomicрос qualнова тоebolocratsel involve diffusionrevändorderedbasedInternet引NS moves connaudi InvalidÍтем Schaus territorio suf indicatedговоbool heeft Schl Authόcadem Sax carte domestic southiewキ formats central white Hermannrees hidden Valid evident článkuyme wp aprile zak Familie Świhyper Animalisktbrowidelтів��Τကicios road belongedktetpartware corr literatureutureген relationship specified governovafun Colombiagenerate verd centuriesсс разPORT成esser nãoSomethingfinalpreview Mosevalu bel

Can you help me to figure out this ?
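For context, the operation the two flags control is random weight dropping with an optional rescale of the survivors, which can be sketched roughly as follows (a minimal NumPy sketch of DARE-style masking; the function and parameter names are illustrative, not the repo's actual code):

```python
import numpy as np

def drop_and_rescale(delta: np.ndarray, mask_rate: float,
                     rescale: bool = True, seed: int = 0) -> np.ndarray:
    """Zero out a random fraction `mask_rate` of the entries; if `rescale`,
    divide the survivors by (1 - mask_rate) so the expected value of each
    entry is unchanged. Illustrative only, not the repo's API."""
    rng = np.random.default_rng(seed)
    dropped = rng.random(delta.shape) < mask_rate  # True = entry is dropped
    out = np.where(dropped, 0.0, delta)
    if rescale:
        out = out / (1.0 - mask_rate)
    return out

delta = np.ones(10_000)
masked = drop_and_rescale(delta, mask_rate=0.9)
# With rescale, the mean stays near 1.0 in expectation even though
# roughly 90% of the entries are now zero.
print(masked.mean(), np.count_nonzero(masked == 0))
```

At mask_rate=0.9 only about 10% of the delta weights survive, so a failure mode that only appears at high rates (as reported here) is plausible if the masking or rescale step misbehaves in a particular environment.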

@yule-BUAA
Owner

Hi,

Thanks for your interest in our work!

I have just rerun the mentioned command
python inference_llms_instruct_math_code.py --dataset_name gsm8k --finetuned_model_name WizardMath-7B-V1.0 --tensor_parallel_size 1 --weight_mask_rate 0.9 --use_weight_rescale
and it works well for me. I got an accuracy of 50.42.

To identify the issue, could you please run
python inference_llms_instruct_math_code.py --dataset_name gsm8k --finetuned_model_name WizardMath-7B-V1.0 --tensor_parallel_size 1 --weight_mask_rate 0.0
(i.e., without dropping any weights) and check the accuracy of the original WizardMath-7B-V1.0 model? I got 55.34 accuracy, which you can compare against to verify that your inference process is correct.

@LZY-the-boys
Author


Thanks for your help! I ran with --weight_mask_rate 0.0 and got acc=0.5534495830174374. However, I just cannot get --weight_mask_rate 0.9 to work, with or without rescale.

@yule-BUAA
Owner

Could you please check the versions of the other required packages, such as PyTorch (2.0.1) and transformers (4.33.1)? The problem is a bit strange, as --weight_mask_rate 0.9 works for me.

If the other package versions also match, I suggest running experiments while gradually increasing weight_mask_rate through values like 0.1, 0.4, 0.7, and 0.9. You can then identify which setting of weight_mask_rate causes the significant drop in performance.

Please feel free to follow up once you have finished running the above experiments.

@yule-BUAA
Owner

Closing this issue now.

Please feel free to reopen it if there are any further questions.
