
[ALBERT] Has anyone reproduced ALBERT's scores on the GLUE dataset? #99

Closed
lonePatient opened this issue Oct 30, 2019 · 8 comments

Comments

@lonePatient

I converted the TF weights to PyTorch weights, and on the QQP dataset I only get 87% accuracy with the settings below (a code sketch of this setup follows the list):

model: albert-base
epochs: 3
learning_rate: 2e-5
batch size: 24
max sequence length: 128
warmup_proportion: 0.1
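
A minimal sketch of that setup, assuming the HuggingFace transformers API for ALBERT; the dataloader and training loop are omitted, and the step count is a placeholder:

```python
# Rough fine-tuning setup matching the hyperparameters above.
# Assumes HuggingFace transformers; dataset loading is omitted.
from transformers import (AdamW, AlbertForSequenceClassification,
                          AlbertTokenizer, get_linear_schedule_with_warmup)

model = AlbertForSequenceClassification.from_pretrained(
    "albert-base-v1", num_labels=2)  # QQP is a binary task
tokenizer = AlbertTokenizer.from_pretrained("albert-base-v1")

epochs = 3
batch_size = 24
max_seq_length = 128
warmup_proportion = 0.1

# In a real run this would be len(train_dataloader) * epochs.
num_train_steps = 45_000  # placeholder
optimizer = AdamW(model.parameters(), lr=2e-5)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(warmup_proportion * num_train_steps),
    num_training_steps=num_train_steps)
```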

@kamalkraj

https://github.com/kamalkraj/ALBERT-TF2.0 [WIP]
I got better accuracy on the CoLA dev set.

@wxp16

wxp16 commented Nov 1, 2019

> I converted the TF weights to PyTorch weights, and on the QQP dataset I only get 87% accuracy.
>
> model: albert-base
> epochs: 3
> learning_rate: 2e-5
> batch size: 24
> max sequence length: 128
> warmup_proportion: 0.1

On the MNLI dataset, using ALBERT base v1, I got the following results. Clearly, the accuracy is quite low.

eval_accuracy = 0.77962303
eval_loss = 0.5517804
global_step = 24543
loss = 0.5517709

@kamalkraj

kamalkraj commented Nov 1, 2019

> https://github.com/kamalkraj/ALBERT-TF2.0 [WIP]
> I got better accuracy on the CoLA dev set.

Dataset: MNLI
Model: ALBERT large v1
Dev accuracy: 0.8089
epochs: 3
max_seq_length: 128
batch_size: 128
learning_rate: 3e-5
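
A rough TF2 equivalent of the configuration above, sketched with HuggingFace's TF classes rather than the ALBERT-TF2.0 repo's own entry points (which may differ):

```python
# Sketch only: maps the hyperparameters above onto transformers' TF API.
import tensorflow as tf
from transformers import TFAlbertForSequenceClassification

model = TFAlbertForSequenceClassification.from_pretrained(
    "albert-large-v1", num_labels=3)  # MNLI has 3 labels
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"])
# model.fit(train_dataset.batch(128), epochs=3)  # inputs tokenized to length 128
```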

@lonePatient
Author

https://github.com/lonePatient/albert_pytorch

Dataset: MNLI
Model: ALBERT_BASE_V2
Dev accuracy: 0.8418

@kamalkraj

@lonePatient
Could you share the hyperparameters?
Max seq length?

@lonePatient
Author

@kamalkraj
--max_seq_length=128
--per_gpu_train_batch_size=16
--per_gpu_eval_batch_size=16
--spm_model_file=${BERT_BASE_DIR}/30k-clean.model
--learning_rate=1e-5
--num_train_epochs=3.0
--logging_steps=24544
--save_steps=24544
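
For what it's worth, those logging_steps/save_steps values look like exactly one MNLI epoch: with 392,702 training examples and a batch size of 16 (assuming a single GPU), one epoch is 24,544 optimizer steps, so the run logs and checkpoints once per epoch:

```python
# logging_steps = save_steps = steps per MNLI epoch (single-GPU assumption).
import math

mnli_train_examples = 392_702
per_gpu_train_batch_size = 16
steps_per_epoch = math.ceil(mnli_train_examples / per_gpu_train_batch_size)
print(steps_per_epoch)  # 24544
```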

@kamalkraj

@lonePatient
Dropouts? All 0?

@lonePatient
Author

@kamalkraj For fine-tuning, dropout rate = 0.1.
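
A minimal sketch of re-enabling that dropout for fine-tuning, assuming the HuggingFace AlbertConfig fields (the v2 checkpoints ship with dropout disabled):

```python
# Override the pretrained config's dropout settings for fine-tuning.
from transformers import AlbertConfig, AlbertForSequenceClassification

config = AlbertConfig.from_pretrained(
    "albert-base-v2",
    hidden_dropout_prob=0.1,           # re-enable dropout for fine-tuning
    attention_probs_dropout_prob=0.1,
    num_labels=3)                      # MNLI
model = AlbertForSequenceClassification.from_pretrained(
    "albert-base-v2", config=config)
```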

andrewluchen transferred this issue from google-research/google-research Jan 6, 2020