
Adds optional scale and bias to cudnn's layernorm #234

Closed
wants to merge 3 commits into from

Conversation

vedaanta
Collaborator

Before submitting
  • Was this discussed/approved via a GitHub issue? (not needed for typos and docs improvements)
  • Did you read the contributor guidelines, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

This PR adds more functionality to cuDNN's layernorm:

  1. Adds optional bias and scale (see the sketch after this list)
  2. Updates requirements to cuDNN 9.1
  3. Removes the older way of caching the full graph-building function
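
For context on item 1, here is a minimal sketch of what "optional scale and bias" means for a layernorm call. This is plain PyTorch for illustration only, not this PR's cuDNN graph code, and the helper name is hypothetical; `torch.nn.functional.layer_norm` already accepts `weight=None` and `bias=None`, which is the behavior the cuDNN executor now has to match.

```python
import torch
import torch.nn.functional as F

# Hypothetical helper, for illustration only (not from this PR):
# a layernorm whose affine scale/bias are optional, mirroring what
# the cuDNN executor must now support.
def layernorm_optional_affine(x, normalized_shape, weight=None, bias=None, eps=1e-5):
    return F.layer_norm(x, normalized_shape, weight=weight, bias=bias, eps=eps)

x = torch.randn(2, 8, 16)
y_plain = layernorm_optional_affine(x, (16,))          # no scale/bias
w, b = torch.ones(16), torch.zeros(16)
y_affine = layernorm_optional_affine(x, (16,), w, b)   # with scale/bias
```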

Future:

  1. Add benchmarks for LN fwd with litgpt configs
  2. Add LN bwd
  3. Add RMSNorm

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

@vedaanta vedaanta marked this pull request as draft April 19, 2024 00:45
thunder/executors/cudnn_layernormex.py (outdated; resolved)
Comment on lines +181 to +182:

    except Exception as e:
        raise

Collaborator

When would we hit this? I'm not sure if `raise` is appropriate here at a glance.

Collaborator Author

If cuDNN is not able to support the graph, this checker should return False. Other errors thrown will almost certainly be user errors or gaps in cuDNN support. Earlier, cudnnex just returned False for those other errors too, but now I want it to raise them instead.

For example, a couple of thunder test cases currently fail due to a bug in the latest cuDNN. A workaround for now is to explicitly block that failing layernorm config here in cudnnex. If the checker just returned False, it would have been hard to uncover this bug.

So these raises make sure that both user code errors and bugs in cuDNN get propagated to the user.
(@wujingyue suggested this workflow when working on cuDNN's sdpa)
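
A self-contained sketch of the checker pattern described here. All names below are hypothetical stand-ins; the real exception types and graph construction live in thunder/executors/cudnn_layernormex.py and the cuDNN frontend.

```python
class CudnnNotSupportedError(RuntimeError):
    """Hypothetical stand-in for the error cuDNN raises on unsupported graphs."""

def _build_graph(config):
    # Hypothetical stand-in for the cuDNN graph construction in this PR.
    if config == "unsupported":
        raise CudnnNotSupportedError("cuDNN cannot express this layernorm config")
    if config == "buggy":
        raise RuntimeError("internal cuDNN error")  # the kind of bug we want surfaced

def _checker(config) -> bool:
    try:
        _build_graph(config)
    except CudnnNotSupportedError:
        return False  # clean fallback: let another executor claim the op
    # Any other exception propagates, so user errors and cuDNN bugs stay
    # visible instead of being swallowed by a blanket `return False`.
    return True

assert _checker("ok") is True
assert _checker("unsupported") is False
# _checker("buggy") raises RuntimeError, which is the point of the change.
```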

Co-authored-by: Masaki Kozuki <mkozuki@nvidia.com>
@lantiga
Collaborator

lantiga commented Jul 3, 2024

Tagging older draft PRs as "later"; feel free to reopen if this becomes active again.

@lantiga lantiga closed this Jul 3, 2024
@t-vi t-vi deleted the cudnn/norm branch July 16, 2024 12:48