Skip to content

cuDNN convolution try multiple algo #33073

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 13 commits into from

Conversation

zasdfgbnm
Copy link
Collaborator

@zasdfgbnm zasdfgbnm commented Feb 7, 2020

Fixes: #31336 #1664

Sometimes cuDNN heuristics return algorithms that can not be used. Instead of just using the first algorithm returned, we should try these algorithms one by one until one of them succeed.

Benchmark:
https://github.com/zasdfgbnm/things/blob/master/2020Q1/conv-benchmark.ipynb

i = torch.randn(256, 3, 256, 256).cuda()
c = torch.nn.Conv2d(3, 3, 3, 3).cuda()

%timeit c(i); torch.cuda.synchronize()

before vs after = 498 vs 490 µs

The performance is improved I guess because, before this PR, we always call the heuristics to get the algorithm, but after this PR, we only do at the first time.

@dr-ci
Copy link

dr-ci bot commented Feb 7, 2020

💊 CircleCI build failures summary and remediations

As of commit 16cf1ff (more details on the Dr. CI page):


None of the build failures appear to be your fault 💚


  • 7/7 broken upstream at merge base ac6e75a since Mar 04

    Please rebase on the viable/strict branch (expand for instructions)

    If your commit is newer than viable/strict, you can try basing on an older, stable commit:

    git fetch origin viable/strict
    git rebase --onto viable/strict $(git merge-base origin/master HEAD)
    

    If your commit is older than viable/strict:

    git fetch origin viable/strict
    git rebase viable/strict
    

    Check out the recency history of this "viable master" tracking branch.


🚧 7 upstream failures recognized by patterns:

These were probably caused by upstream breakages:


This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker.

This comment has been revised 25 times.

@zasdfgbnm zasdfgbnm changed the title [WIP] cuDNN convolution try multiple algo cuDNN convolution try multiple algo Feb 7, 2020
@zasdfgbnm zasdfgbnm requested a review from ngimel February 7, 2020 03:52
@zasdfgbnm
Copy link
Collaborator Author

cc: @ptrblck @csarofeen

@zasdfgbnm
Copy link
Collaborator Author

This also seems to fix #31500

@zasdfgbnm
Copy link
Collaborator Author

This would conflict with #33056

@cpuhrsch cpuhrsch added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Feb 11, 2020
@zasdfgbnm
Copy link
Collaborator Author

Ahha, this is conflicting

Copy link
Collaborator

@ngimel ngimel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@ngimel merged this pull request in e4a883e.

facebook-github-bot pushed a commit that referenced this pull request Mar 6, 2020
Summary:
Please merge after #33073

With that PR, we are now trying different algorithms when OOM, so hopefully there will be some algo working at low memory.
Pull Request resolved: #34259

Differential Revision: D20310094

Pulled By: ngimel

fbshipit-source-id: bccd8162bd06a0e54ac6f42a7fd9a5b766f92cd7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Merged open source triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cudnn heuristics (cudnnGet*) returns algorithms that cannot be used
6 participants