-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
General improvement opportunities #4
Conversation
…nd plot results sections.
The main improvement here is the automatic download of challenge files. See README for instructions on how to configure credentials. There are other improvements though.
|
….ok whenever a request is made.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, again a good initiative, there are several issues in this repo that need solving. I have added a few comments where we need changes before merging, the rest in good.
The main points to attend to:
- Removing the transformer folder (and the modifications to the benchmark notebook regarding transformer use). For your personal use, you can add the other repo as dependency, see the comment.
- Adding parameters instead of hard coded values is an improvement, but we should add default values.
I tried to address opportunities from my point of view. Adding GitHub issues might be worth the time in order to allow developers to better understand current concerns. The main improvement as of now I could think of was to add pytests. I believe the evolution of this repository can help the evolution of transformer repository too. After your guidance, maybe it is time to get back to the transformer repository and add pytests and continuous integration. I was reflecting on the later and I think there we could simply use a seaborn dataset for py testing. Once that is alright we can remove all transformer related stuff from this repository as per your request. TransformerTimeSeriesPredictor class can be useful there. It may be also worth to create a python package for the |
I notice how faster the The following image gives and overview of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, sorry for the delay in answering your updates. There are still a few modifications needed before merging, mainly:
- removing LSTMTimeSeriesPredictor and likes : these classes neither improve readability or results. They focus on improving the code itself, which is beyond the scope of this repo, as we aim here at demonstrating a benchmark solution to the challenge. Don't forget the target audience for this challenge are students in statistics that may not have extensive development knowledge.
- removing test : same as for the previous point, they are beyond the scope of this repo. We aim at keeping the code as simple to understand for non programmers as possible.
- removing transformer ref : these two repos should really stay independant. Anyone interested in using transformer for the challenge can navigate through our github account and easily make the connection. This is especially true now that you've added instructions as to convert
csv
tonpz
, etc.
Concerning implementation of tests, I'll consider adding them to the transformer repo, where they would be a nice addition. As for issues, they seem to be activated on both repos.
Concerning LSTM being faster than transformer, I have witnessed the same results. This could be due to a number of factors, for instance:
- LSTM having a linear complexity in the time dimension, while the transformer in quadratic
- LSTM are coded in the pytorch library, and thus optimized to run on the GPU, which is not the case for the transformer written in python. If you want to speed up the process, you may want to use pytorch's Multi Head Attention block.
I completely understand and respect your timings. I am very happy to be able to contribute and learn with such state-of-the-art technology when it comes to time series forecast. Kudos to you for that! |
Okay, this PL seems good now, so without further ado, i'll merge it. Thanks again for your contribution ! |
Changed file name to csv2npz.