Hi @gizacard ,
Thanks for your awesome project. I would like to know the hyperparameters you used for finetuning T5-base.
You have only shared the T5-large hyperparameters in the tutorial, as shown below; could you also share the T5-base ones?
python train_reader.py \
        --use_checkpoint \
        --lr 0.00005 \
        --optim adamw \
        --scheduler linear \
        --weight_decay 0.01 \
        --text_maxlength 250 \
        --per_gpu_batch_size 1 \
        --n_context 100 \
        --total_step 15000 \
        --warmup_step 1000 \
Thanks, looking forward to your reply.
Hi, we used a learning rate of 1e-4 for the base model; the rest should be similar.
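For reference, a minimal sketch of the corresponding T5-base command, assuming only the learning rate changes from the T5-large settings above (all other flags are copied verbatim from that example and were not separately confirmed by the maintainers):

# Assumed T5-base configuration: lr taken from the reply above, every other flag
# copied unchanged from the T5-large command earlier in this thread.
python train_reader.py \
        --use_checkpoint \
        --lr 0.0001 \
        --optim adamw \
        --scheduler linear \
        --weight_decay 0.01 \
        --text_maxlength 250 \
        --per_gpu_batch_size 1 \
        --n_context 100 \
        --total_step 15000 \
        --warmup_step 1000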