Add Model Support for xLSTM #27011
Comments
Sounds like a money grab. If it is something useful, he should have taken the academic path or at least filed a patent. Boldly claiming success via non-serious media channels is highly unprofessional. It suggests publicity matters more than results, which further supports motivations like funding, personal gain, or politics.
If I understood it correctly, a patent is on its way, and a paper about xLSTM will be published within six months.
I have some doubts about whether this is planned as an open-source model.
Paper is published now: https://arxiv.org/abs/2405.04517
Need code and checkpoint or it didn't happen.
Official implementation is out now: |
Note that the official source code is AGPL-licensed. |
Model description
Inspired by recent rumors about xLSTM, a successor to LSTM by Sepp Hochreiter, this issue tracks the open-source effort to add xLSTM to the Transformers library.
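For context on what such an implementation would involve: the xLSTM paper (arXiv:2405.04517) describes an sLSTM variant that replaces sigmoid input/forget gates with exponential gates plus a normalizer state and a log-domain stabilizer. The sketch below is a minimal single-cell NumPy illustration of that recurrence; the weight names, shapes, and single-head simplification are illustrative assumptions, not the official implementation.

```python
import numpy as np

def slstm_step(x, h, c, n, m, W, R, b):
    """One sLSTM-style step. W: (4, d_h, d_x), R: (4, d_h, d_h), b: (4, d_h).

    Illustrative sketch only; gate ordering and parameterization are assumptions.
    """
    # Pre-activations: cell input (z), input gate (i), forget gate (f), output gate (o).
    z_t = np.tanh(W[0] @ x + R[0] @ h + b[0])
    i_tilde = W[1] @ x + R[1] @ h + b[1]
    f_tilde = W[2] @ x + R[2] @ h + b[2]
    o_t = 1.0 / (1.0 + np.exp(-(W[3] @ x + R[3] @ h + b[3])))

    # Exponential gates with a log-domain stabilizer state m to avoid overflow.
    m_new = np.maximum(f_tilde + m, i_tilde)
    i_t = np.exp(i_tilde - m_new)
    f_t = np.exp(f_tilde + m - m_new)

    c_new = f_t * c + i_t * z_t          # cell state
    n_new = f_t * n + i_t                # normalizer state (always > 0 after step 1)
    h_new = o_t * (c_new / n_new)        # normalized hidden state
    return h_new, c_new, n_new, m_new

# Run the recurrence over a short random sequence with random weights.
rng = np.random.default_rng(0)
d_x, d_h, T = 3, 4, 5
W = rng.normal(scale=0.1, size=(4, d_h, d_x))
R = rng.normal(scale=0.1, size=(4, d_h, d_h))
b = np.zeros((4, d_h))

h = np.zeros(d_h); c = np.zeros(d_h); n = np.zeros(d_h); m = np.zeros(d_h)
for t in range(T):
    h, c, n, m = slstm_step(rng.normal(size=d_x), h, c, n, m, W, R, b)
print(h.shape)  # (4,)
```

The stabilizer `m` keeps the exponentials bounded (the largest pre-activation maps to `exp(0) = 1`), which is what makes exponential gating numerically usable over long sequences.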
Open source status
Provide useful links for the implementation
At the moment, no implementation exists.
There are only rumors that xLSTM surpasses GPT-2 on various (small) downstream datasets.
A good overview is the xLSTM Resources repository from @AI-Guru.