Let's revolutionize the AI research field #5

Closed
LifeIsStrange opened this issue Aug 29, 2019 · 2 comments

@LifeIsStrange

Hi,
I have a dream and I'll try to share it with you.

But before explaining further, I'll need your brain to analyze this input and output what you think about it!

Small rant on the inertia of AI research

First of all, thank you for advancing progress in deep learning.

I'm just a random guy who wants to implement an AGI (lol), and like many NLP engineers, I need HIGHLY accurate neural networks for fundamental NLP tasks (e.g. POS tagging, NER, dependency parsing, coreference resolution, WSD, etc.).
None of them are very accurate (often below a 95% F1 score), and their errors add up.

Such limitations make NLP not yet suitable for many applications.
This is why improving the state of the art (which can be tracked on paperswithcode.com) is a crucial priority for academics.

Indeed, many researchers have smart ideas to improve the state of the art, and they often slightly improve it by taking a "standard neural network" for the task and mixing their new fancy idea into it.

I speak from knowledge: I've read most papers from the state-of-the-art leaderboards of most fundamental NLP tasks.
Almost always they have this common baseline + one idea, theirs.
The common baseline sometimes slowly evolves (e.g. now it's often a pre-trained model (say BERT) + fine-tuning + their idea).

Sorry to say, but "this" is, to me, absurd,
where "this" means the fact that, by far, most researchers work in isolation, not integrating others' ideas (or only with a very slow inertia).
I would have wished that the state of the art for one NLP task were a combination of, say, 50 innovative and complementary ideas from researchers.
You are researchers: do you have an idea why that is the case? If someone actually tried to merge all the good, complementary, and compatible ideas, would they have the best, unmatchable state of the art?
Why don't facebookresearch, Microsoft, and Google, in addition to producing X new shiny ideas per month, try the low-hanging fruit and actually merge them in a coherent, synergetic manner??
I would like you to tell me what you think of this major issue that slows AI progress.

As an example of such inertia, let's talk about Swish, Mish, or RAdam:
those things are incredibly easy to try, just to see "hey, does it give my neural network free accuracy gains?"
Yet no paper on the state-of-the-art leaderboards has tried Swish, Mish, or RAdam, despite them being so simple to try (you don't need to change the neural network).
Not even the pre-trained models that so many papers depend on (I opened issues for each of them).
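
To make the point concrete, here's a minimal PyTorch sketch of what such a drop-in swap could look like (the toy model and hyperparameters are mine, and it assumes a PyTorch version where torch.optim.RAdam exists; otherwise the standalone RAdam implementations work the same way):

```python
import torch
import torch.nn as nn

class Mish(nn.Module):
    """Mish activation: x * tanh(softplus(x))."""
    def forward(self, x):
        return x * torch.tanh(nn.functional.softplus(x))

# Toy model: swapping ReLU for Mish is the only change to the architecture.
model = nn.Sequential(
    nn.Linear(128, 256),
    Mish(),  # instead of nn.ReLU()
    nn.Linear(256, 10),
)

# RAdam is a drop-in replacement for Adam; the network itself is untouched.
optimizer = torch.optim.RAdam(model.parameters(), lr=1e-3)
```

Swish would be the same kind of one-line change (x * torch.sigmoid(x)).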

Once I know what you think about this research inertia, I'll explain my vision of what needs to be done to fix it.

@lessw2020
Owner

Hi LifeIsStrange -
re: "I would have wished that state of the art in one Nlp task would be a combination of e.g 50 innovative and complementary ideas from researchers."

We did just that this week :) I'm working on a Medium article about it, but by combining Ranger (Lookahead + RAdam) + Mish + a new flat/cosine LR curve + simple self-attention...
we just beat the FastAI leaderboard for 5 epochs by a very large margin: it was 55.2%, and we got it to 74.97%.

So yes, combining all the best ideas really can and does work. I'll try and get the Medium article about it out soon :)
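
To give a rough idea of the flat/cosine LR curve part, here's an illustrative PyTorch sketch (not our exact code; the model, optimizer, and the 72% flat fraction below are placeholders): keep the learning rate flat for most of training, then anneal it with a cosine curve.

```python
import math
import torch

model = torch.nn.Linear(128, 10)                            # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # placeholder optimizer

total_steps = 1000
flat_steps = int(0.72 * total_steps)  # stay flat for ~72% of training (assumed split)

def flat_then_cosine(step):
    # LR multiplier: 1.0 during the flat phase, then cosine decay towards 0.
    if step < flat_steps:
        return 1.0
    progress = (step - flat_steps) / max(1, total_steps - flat_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=flat_then_cosine)

for step in range(total_steps):
    # ... forward pass, loss.backward() ...
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()
```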

@lessw2020
Owner

Here's the article on our efforts:
https://medium.com/@lessw/how-we-beat-the-fastai-leaderboard-score-by-19-77-a-cbb2338fab5c
I've got a new repo set up with all the tools in there, so anyone can try out our setup combining Ranger + Mish + flat-cosine annealing.
Hopefully we'll make more progress integrating other new ideas!
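
For anyone wondering about the "Ranger = RAdam + Lookahead" part: Lookahead keeps a set of slow weights and, every k steps, pulls them a fraction alpha toward the fast weights, then restarts the fast weights from there. A bare-bones sketch of that idea (not the repo's actual implementation; the model, k, alpha, and inner optimizer here are placeholders):

```python
import torch

model = torch.nn.Linear(128, 10)                        # placeholder model
inner = torch.optim.Adam(model.parameters(), lr=1e-3)   # RAdam in actual Ranger

k, alpha = 6, 0.5   # Lookahead sync period and interpolation factor (placeholders)
slow = [p.detach().clone() for p in model.parameters()]  # slow weights

for step in range(1, 601):
    # ... forward pass, loss.backward() ...
    inner.step()
    inner.zero_grad()
    if step % k == 0:
        with torch.no_grad():
            for p, s in zip(model.parameters(), slow):
                s += alpha * (p - s)  # pull slow weights toward the fast weights
                p.copy_(s)            # restart fast weights from the slow weights
```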
