
About the hyperparameters of finetuning t5-base #11

Closed

shunyuzh opened this issue Aug 15, 2021 · 1 comment

@shunyuzh

Hi @gizacard ,

Thanks for your awesome project. I would like to know the hyperparameters for fine-tuning T5-base.

You have only shared the hyperparameters for T5-large in the tutorial, as follows. Could you share the ones for T5-base in the same format?

python train_reader.py \
        --use_checkpoint \
        --lr 0.00005 \
        --optim adamw \
        --scheduler linear \
        --weight_decay 0.01 \
        --text_maxlength 250 \
        --per_gpu_batch_size 1 \
        --n_context 100 \
        --total_step 15000 \
        --warmup_step 1000

Thanks, looking forward to your reply.

@gizacard
Contributor

Hi, we used a learning rate of 1e-4 for the base model; the rest should be similar.
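
For reference, here is the same command with the base-model learning rate applied. This is a sketch rather than a confirmed configuration: the `--model_size base` flag is an assumption based on the repository README, and the data/output flags are omitted here just as in the snippet above.

# Sketch: --model_size base is assumed from the repo README;
# data and output flags are omitted, as in the original snippet.
python train_reader.py \
        --model_size base \
        --use_checkpoint \
        --lr 0.0001 \
        --optim adamw \
        --scheduler linear \
        --weight_decay 0.01 \
        --text_maxlength 250 \
        --per_gpu_batch_size 1 \
        --n_context 100 \
        --total_step 15000 \
        --warmup_step 1000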
