
Why use albert-xxlarge instead of bert-base when training on some datasets? #7

Closed
Kobayashi-Wang opened this issue Nov 10, 2021 · 2 comments

Comments


Kobayashi-Wang commented Nov 10, 2021

I ran the code with bert-base on the CoNLL04 dataset and got an F1 score of approximately 66, which is much lower than with albert-xxlarge. I wonder whether the comparison between this model using albert-xxlarge and previous work using bert-base is really fair?

Coopercoppers (Owner) commented Nov 11, 2021

Table-sequence uses ALBERT-xxlarge, and we wanted to keep our experimental setting the same as the previous SOTA.
Also, as I mentioned, the model is delicate on this dataset; you need to tune the hyperparameters carefully even if the only change you make is the embedding.
A 3-4 point difference between BERT and ALBERT is reasonable. I suggest you tune the learning rate, batch size, and gradient clipping.
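
A minimal sketch of what a sweep over those three knobs might look like. The grid values, the `DummyModel` stand-in, and the training loop here are illustrative assumptions, not this repo's actual code; in practice you would plug in the real model, data, and dev-set F1 evaluation:

```python
# Hypothetical hyperparameter sweep over lr, batch size, and gradient clip norm.
import itertools
import torch
import torch.nn as nn

class DummyModel(nn.Module):
    """Placeholder for the actual joint-extraction model (assumption)."""
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(16, 2)

    def forward(self, x):
        return self.linear(x)

def train_one_config(lr, batch_size, clip, steps=100):
    model = DummyModel()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(steps):
        x = torch.randn(batch_size, 16)         # stand-in batch
        y = torch.randint(0, 2, (batch_size,))  # stand-in labels
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        # Gradient clipping: the "clip" hyperparameter mentioned above.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=clip)
        optimizer.step()
    return loss.item()  # in practice: dev-set F1, not training loss

# Small grid over the three knobs recommended above (values are assumptions).
grid = itertools.product([1e-5, 2e-5, 3e-5],  # learning rate
                         [8, 16, 32],         # batch size
                         [0.25, 1.0, 5.0])    # clip norm
for lr, bs, clip in grid:
    score = train_one_config(lr, bs, clip)
    print(f"lr={lr:.0e} bs={bs} clip={clip}: {score:.4f}")
```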

Kobayashi-Wang (Author) commented

I get it, thank you very much for your reply!
