Some improvements of LRFinder #86

Closed
ivbelkin opened this issue Jan 24, 2019 · 7 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@ivbelkin
Contributor

A typical use case for LRFinder is to just set some large value for final_lr, say 10, so it would be convenient to stop iterating in case of divergence. And probably a default value for final_lr should be added. A rough sketch of the divergence stop is below.
If this sounds good, I'll contribute.
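
For illustration, a minimal sketch of what the early stop could look like inside an LR range-test loop; `lr_range_test`, its arguments, and the 4x-best-loss threshold are placeholder choices of mine, not existing catalyst code:

```python
# Placeholder sketch of the proposed early stop on divergence inside an
# LR range-test loop; lr_range_test, its arguments, and the 4x-best-loss
# threshold are illustrative choices, not existing catalyst code.
import math

def lr_range_test(model, optimizer, criterion, loader,
                  init_lr=1e-7, final_lr=10.0, diverge_factor=4.0):
    num_steps = len(loader)
    gamma = (final_lr / init_lr) ** (1.0 / max(num_steps - 1, 1))
    lr, best_loss, history = init_lr, math.inf, []
    for batch, target in loader:
        for group in optimizer.param_groups:
            group["lr"] = lr
        optimizer.zero_grad()
        loss = criterion(model(batch), target)
        loss.backward()
        optimizer.step()
        value = loss.item()
        history.append((lr, value))
        best_loss = min(best_loss, value)
        # the proposed improvement: stop iterating once the loss diverges
        if math.isnan(value) or value > diverge_factor * best_loss:
            break
        lr *= gamma
    return history
```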

@Scitator Scitator mentioned this issue Jan 25, 2019
@Scitator Scitator added enhancement New feature or request good first issue Good for newcomers labels Jan 25, 2019
@Scitator
Member

Hi,

I am not sure about 10, but something like 1.0 should be okay.
Nevertheless, it depends, so for the best approach you can check out the original paper (I believe): A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay

@ivbelkin
Contributor Author

ivbelkin commented Jan 26, 2019

1. After checking Smith's reports, I'm not sure a default value is necessary. Moreover, after running some tests on my problem I observed the expected convergence-divergence pattern, but then convergence again at a large lr > 1. Such insights would be missed if I used a default value. And it would probably be better if the user had to think about final_lr himself.

2. However, implementing 'terminate on nan' as a standalone callback, outside LRFinder, seems to be the better option, since it is reusable. Such a callback would be like 'early stopping', which, in general, may trigger when overfitting is detected, etc. I think this should be taken into account for better code structure (see the sketch after this list).

3. Also, I see the docs for LRFinder mention 'log' and 'linear' options, but these are not implemented yet.

4. And what about tests? I don't see examples to follow in catalyst, except test_main.py. How should they be organized?

5. One more thing: it is more convenient to look at a (learning_rate, loss) plot than at lr and loss separately. But the only way I see to implement it is to generate such a plot inside LRFinder and send it to tensorboard or to a file.

6. It is probably worth adding some usage examples for LRFinder.
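
A rough sketch of the standalone callback from point 2; the `on_batch_end` hook and the `state.loss` / `state.should_stop` attributes are my assumptions for illustration, not the actual catalyst Callback interface:

```python
# Rough sketch only: the on_batch_end hook and the state.loss /
# state.should_stop attributes are assumptions for illustration,
# not the actual catalyst Callback interface.
import math

class TerminateOnNaNCallback:
    def on_batch_end(self, state):
        loss = float(state.loss)
        if math.isnan(loss) or math.isinf(loss):
            # signal the runner to stop the loop, as early stopping would
            state.should_stop = True
```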

@Scitator
Member

So, speaking about LRFinder and the 'log' and 'linear' options (as I understand, final_lr is okay for everyone): there were some experiments with different lr scheduling options, but in the end the log one was found most appropriate for finding the optimal LR.
Nevertheless, contributions of different LRFinder options or a (learning_rate, loss) plot are always welcome.
The (learning_rate, loss) plot implementation, though, should look something like the make_report script, to make it really simple and reusable.
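
For reference, a minimal sketch of such a plot from a log-scale sweep; `history` is assumed to be a list of (lr, loss) pairs, e.g. the output of the range-test sketch above:

```python
# Sketch of plotting the (learning_rate, loss) curve from a log-scale
# sweep; `history` is assumed to be a list of (lr, loss) pairs, e.g.
# the output of the range-test sketch above.
import matplotlib.pyplot as plt

def plot_lr_loss(history, out_path="lr_loss.png"):
    lrs, losses = zip(*history)
    plt.figure()
    plt.plot(lrs, losses)
    plt.xscale("log")  # the log schedule makes a log x-axis natural
    plt.xlabel("learning rate")
    plt.ylabel("loss")
    plt.savefig(out_path)
```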

'Terminate on nan', or as we call it, DebugCallback/ExceptionHandler, is a really good idea and should be done soon. The main problem here is that it can't be implemented as a callback, because it needs to wrap all of them and handle their exceptions. One possible solution for the ExceptionHandler implementation is another method in Runner that wraps all the callbacks. Nevertheless, I am still looking for a more flexible solution.
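
A hypothetical sketch of what that "another method in Runner" could look like; the method and attribute names here are illustrative, not catalyst's real API:

```python
# Hypothetical sketch of wrapping callback dispatch in the runner so
# that exceptions from any callback go through one handler; the method
# and attribute names are illustrative, not catalyst's real API.
class Runner:
    def __init__(self, callbacks, exception_handler=None):
        self.callbacks = callbacks
        self.exception_handler = exception_handler

    def _run_event(self, event, state):
        try:
            for callback in self.callbacks:
                getattr(callback, event)(state)
        except Exception as exc:
            if self.exception_handler is None:
                raise
            self.exception_handler(exc, state)
```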

And finally, tests, the trickiest part :)
Currently we are using only integration tests during PR. All other tests and their structure are WIP now, and they will be the main goal for the 19.02 release.

@ivbelkin
Contributor Author

After some investigation, I see that the only way to get the (lr, loss) plot is to parse logs.txt, isn't it? The easiest way, of course, is to just use regular expressions in the script. But if the log format changes, it will stop working. The logging module doesn't provide any parsing method, and moreover, the message has a custom format. However, parsing logs may be required in some other places down the road. Maybe it is worth adding a static method for parsing directly to the Logger callback?
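
An illustrative sketch of such a static helper; the "lr=... loss=..." line format in the regex is a made-up placeholder, since the real logs.txt format is custom and may differ:

```python
# Illustrative sketch of a static parsing helper; the "lr=... loss=..."
# line format in the regex is a made-up placeholder, since the real
# logs.txt format is custom and may differ.
import re

class LoggerCallback:
    _LINE_RE = re.compile(r"lr=(?P<lr>[\d.eE+-]+)\s+loss=(?P<loss>[\d.eE+-]+)")

    @staticmethod
    def parse_logs(path):
        history = []
        with open(path) as f:
            for line in f:
                match = LoggerCallback._LINE_RE.search(line)
                if match:
                    history.append(
                        (float(match.group("lr")), float(match.group("loss")))
                    )
        return history
```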

@Scitator
Member

So, I think you can modify LoggerCallback to also write all metrics into some logs.json. It should be quite easy, because all batch metrics are dicts.
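
Since the metrics are plain dicts, it could be as simple as appending one JSON object per batch; a sketch, where the `on_batch_end` hook and the `state.metrics` dict are assumptions about how the callback receives its metrics:

```python
# Sketch of the JSON variant: one JSON object per batch, appended as a
# line; the on_batch_end hook and the state.metrics dict are assumptions
# about how the callback receives its metrics.
import json

class JsonLoggerCallback:
    def __init__(self, path="logs.json"):
        self.path = path

    def on_batch_end(self, state):
        # batch metrics are already plain dicts, so json.dumps is enough
        with open(self.path, "a") as f:
            f.write(json.dumps(state.metrics) + "\n")
```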

PS. By the way, speaking about the reports, I still hope to implement the idea from Scitator/catalyst-examples#1 (comment), because those grid-search plots are amazing.

@ivbelkin ivbelkin mentioned this issue Jan 29, 2019
@ivbelkin
Contributor Author

Hi!
Turning back to the (learning rate, loss) plot: where would it be better to place the script? contrib/scripts? Or maybe dl/scripts, as it provides basic functionality?

@ivbelkin ivbelkin changed the title "Terminate on nan" feature for LRFinder Some improvements of LRFinder Jan 31, 2019
@Scitator
Member

Hi,

Speaking about LRFinder and the LR-metric plot, dl/scripts is the best place. At least, as long as you don't need some heavy libs like tensorflow or nmslib (the current contrib ones).
