Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix learning rate history and learning rate computation is schedulers #43

Merged
merged 9 commits into from May 30, 2020

Conversation

davidtvs
Copy link
Owner

See #42 for the discussion that led to this PR. Thanks @yongduek for bringing this issue to our attention.

@NaleRaphael would you kindly do a review of the changes here?

Two things can be concluded:
* The current computation of the exponential learning rate is incorrect in all version of PyTorch above 0.4.1
* PyTorch 1.4.0 introduced a different design for the LRScheduler; the fix for the exponential learning rate for version 0.4.1 is not the same as the one for 1.4.0

Setup for the above:
1. torch==0.4.1 torchvision==0.2.1
2. torch==1.4.0 torchvision==0.5.0
Also, added caching to make the CI job faster.
@davidtvs
Copy link
Owner Author

Bahhhh, github actions seems to only have PyTorch 1.4.0 and 1.5.0, so we can't run the tests on older versions 😞 .

Made a post on the community forum, hopefully I did something wrong or there's a workaround

@NaleRaphael
Copy link
Contributor

Nice! Give me some time to do this!

@NaleRaphael
Copy link
Contributor

NaleRaphael commented May 30, 2020

I've run this revision on my machine with python-[3.6, 3.7] + torch-[0.4, 1.4], all works fine. 👍

And, good news! For the problem of installing older version of PyTorch, it's not that complicated as I imagined.

I used to install older version of PyTorch through the direct url found in PyTorch archive. But after re-visiting the official website of PyTorch to find some information, I found that we can just simply put this argument -f https://download.pytorch.org/whl/torch_stable.html in the pip-install command to solve this problem. 😅 (see also this link)

But there are a few problems:

  1. There is no PyTorch-0.4.1 for Python-3.8, it's only available for Python-[2.7, 3.5, 3.6, 3.7]. You can check this by searching in the PyTorch archive with keyword torch-0.4.1. In order to run PyTorch on at least 2 different Python versions, I would suggest adding exclude and include keywords to manage it. (I created a repo for testing this, you can check it out here)

  2. With the current setting for uploading code coverage report, there will be only one version uploaded to codecov. Hence that the report for code works on different versions of PyTorch might not be recorded. To solve this, you can just remove that condition to make each job upload the report, and codecov will handle it automatically.

Besides, would you like to make GitHub Actions be able to skip builds? In my opinion, it's a nice feature when we just want to update documentation without triggering CI. Travis-CI supports this feature by adding a line skip ci in commit message. And I found it is also possible in GitHub Actions and had tested it, see also this comment for details.

.github/workflows/ci_build.yml Outdated Show resolved Hide resolved
.github/workflows/ci_build.yml Outdated Show resolved Hide resolved
.github/workflows/ci_build.yml Outdated Show resolved Hide resolved
@davidtvs
Copy link
Owner Author

Thanks for the help @NaleRaphael. I fixed the issues and added the "skip ci"

@NaleRaphael
Copy link
Contributor

In this run, there is a error "RuntimeError: Could not infer dtype of numpy.int64". But it's weird, it works fine on my machine...

lr_finder_test_run_py37_torch041

@davidtvs
Copy link
Owner Author

I could reproduce locally. But ya, it's weird, somehow all is fine with Pytorch 0.4.1 and Python 3.5 and 3.6; but with Python 3.7 I get the error ¯_(ツ)_/¯

@NaleRaphael
Copy link
Contributor

NaleRaphael commented May 30, 2020

OK, though I find a similar issue here, it seems not worthy to dig into it.
I think we can merge this PR now. 👍

@davidtvs davidtvs merged commit 98c4004 into master May 30, 2020
@davidtvs davidtvs deleted the dev/lr_history branch May 30, 2020 20:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants