Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate deterministic ptx for OptiX cache hits #1566

Merged
merged 3 commits into from
Aug 30, 2022

Conversation

chellmuth
Copy link
Collaborator

Description

In multithreaded environments, use of global counter ids inside function and variable identifiers in generated ptx code leads to cache misses during OptiX module compilation.

With this change we assume unique shadergroup names, and remove the non-deterministic ids. In my testing, everything is still OptiX-compatible even with duplicate shadergroup names, but I'm not sure I've considered all the valid cases.

Tests

Existing tests pass; I can add tests that run shaders twice and grep the logs for cache misses if desired.

Checklist:

  • I have read the contribution guidelines.
  • I have previously submitted a Contributor License Agreement.
  • I have updated the documentation, if applicable.
  • I have ensured that the change is tested somewhere in the testsuite (adding new test cases if necessary).
  • My code follows the prevailing code style of this project.

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Aug 30, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

Copy link
Collaborator

@lgritz lgritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, pending that one extremely minor change I suggested, and squaring away the CLA. (Is this really your first OSL patch?)

@lgritz
Copy link
Collaborator

lgritz commented Aug 30, 2022

@timgrant @ee33 You may find this interesting, especially the change in src/liboslexec/llvm_util.cpp

@tgrant-nv
Copy link
Contributor

Thanks for the heads-up, Larry. Stripping the callseq comments from the PTX string is something Chris and I had discussed. It seems like the best solution short of patching LLVM, since as far as we can tell there is no way to manage these comments through the LLVM API.

chellmuth and others added 3 commits August 30, 2022 09:30
Removing the suffix ids and callseq comments generated by global counters
leads to deterministic ptx in a multithreaded environment.

Signed-off-by: Chris Hellmuth <chellmuth@gmail.com>
Signed-off-by: Chris Hellmuth <chellmuth@gmail.com>
Co-authored-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Chris Hellmuth <chellmuth@gmail.com>
@lgritz lgritz merged commit 3b97ece into AcademySoftwareFoundation:main Aug 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants