
Experiment with XLnet? #2

Open
LifeIsStrange opened this issue Aug 14, 2019 · 4 comments

Comments


LifeIsStrange commented Aug 14, 2019

Firstly, I would like to say that reading your paper was fascinating.
Secondly, I would like to thank you for advancing the state of the art in both constituency parsing and dependency parsing (first place on NLP-progress).

I haven't yet read your whole paper, but it seems you used BERT; BERT was state of the art, but it no longer is.
It has been surpassed by significant margins by [XLNet](https://github.com/zihangdai/xlnet).
I think it would be really interesting to train your neural net with XLNet instead of BERT, to see whether you can push the state of the art even further!
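
To make the suggestion concrete, here is a minimal sketch of pulling per-token contextual vectors out of XLNet with the Hugging Face transformers library (the checkpoint name is the published `xlnet-large-cased`; how the vectors would feed into your span scorer is my assumption, not your code):

```python
import torch
from transformers import XLNetModel, XLNetTokenizer

tokenizer = XLNetTokenizer.from_pretrained("xlnet-large-cased")
model = XLNetModel.from_pretrained("xlnet-large-cased")
model.eval()

sentence = "The parser reads contextual embeddings ."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    # One 1024-dim vector per subword token, analogous to BERT's output.
    hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, 1024)
# These vectors would stand in for the BERT token representations the parser consumes.
```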

@LifeIsStrange
Author

@DoodleJZ

@DoodleJZ
Owner

YES, thank you for your interest! We are also intrigued by the strong performance of XLNet and will consider trying it later.

@LifeIsStrange
Author

Really nice to hear that!
Could you please update the NLP-progress [1] results if this experiment improves state-of-the-art performance, or let me know so I can update them for you?

[1]
https://github.com/sebastianruder/NLP-progress/blob/master/english/dependency_parsing.md
Thanks in advance.

@LifeIsStrange
Author

Hi @DoodleJZ,
I saw that you ran the experiment with XLNet, got very successful results, and merged them into NLP-progress!
( sebastianruder/NLP-progress@18b8b85 )

Please, let's not stop here!
The world needs high-accuracy dependency/constituency parsing, and you are the one who can improve the state of the art.
You have already beaten the SOTA twice!
Let's do it AGAIN :)

I propose experimenting with two simple, high-return additions on top of XLNet:
Firstly, the state-of-the-art activation function Mish can give real accuracy gains!
https://github.com/digantamisra98/Mish
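
For reference, Mish is just x · tanh(softplus(x)) and is trivial to drop in wherever the network currently applies ReLU; a minimal PyTorch sketch (the module wrapper and usage are mine, only the formula comes from the Mish repo):

```python
import torch
import torch.nn.functional as F

class Mish(torch.nn.Module):
    """Mish activation: x * tanh(softplus(x)) (Misra, 2019)."""
    def forward(self, x):
        return x * torch.tanh(F.softplus(x))

# Hypothetical usage: swap an existing ReLU for Mish in a feed-forward layer.
ffn = torch.nn.Sequential(torch.nn.Linear(1024, 1024), Mish())
```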

Secondly, there are two new state-of-the-art optimizers in town:
RAdam (Rectified Adam) and Lookahead.
And the beauty is that they can work together synergistically.
You should try Ranger, the SOTA optimizer that combines them (see the sketch after the related link below):
https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer
(the Medium blog post linked there is insightful)

Related:
https://github.com/mgrankin/over9000
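
If pulling in the Ranger repo is inconvenient, the Lookahead half is only a few lines to sketch yourself. Below, Adam stands in for RAdam to keep the sketch self-contained, and the wrapper class is mine: it illustrates the Lookahead idea from the paper, not the Ranger repo's actual API.

```python
import torch

class Lookahead:
    """Keep slow weights; every k steps, pull them toward the fast weights."""
    def __init__(self, base_optimizer, k=6, alpha=0.5):
        self.base = base_optimizer
        self.k, self.alpha, self.step_count = k, alpha, 0
        # Snapshot the slow weights from the current parameters.
        self.slow = [
            [p.detach().clone() for p in group["params"]]
            for group in base_optimizer.param_groups
        ]

    def step(self):
        self.base.step()  # inner "fast" update (RAdam/Adam step)
        self.step_count += 1
        if self.step_count % self.k == 0:
            # slow += alpha * (fast - slow); then reset fast to slow.
            for group, slow_group in zip(self.base.param_groups, self.slow):
                for p, slow_p in zip(group["params"], slow_group):
                    slow_p.add_(p.detach() - slow_p, alpha=self.alpha)
                    p.data.copy_(slow_p)

    def zero_grad(self):
        self.base.zero_grad()

# Hypothetical usage with an existing model:
# opt = Lookahead(torch.optim.Adam(model.parameters(), lr=1e-3))
# loss.backward(); opt.step(); opt.zero_grad()
```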
