Conversation

@dmclark17
Contributor

This changes the level of parallelism: each point is now assigned to a warp instead of a single thread.

  • For the iParent portion, a warp computes the contributions from 32 jCenters concurrently, then performs a warp reduction to obtain the final contribution. This changes the early-exit scheme, since the check now effectively happens only once every 32 jCenters.

  • For the pairs portion, each thread in the warp computes the contribution from 1 iCenter at a time as the warp synchronously iterates over jCenters. The threads are able to independently exit early and start on a new iCenter.
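
To make the iParent scheme concrete, here is a minimal host-side C++ sketch of the batching and reduction pattern. It is not the kernel from this PR: the names `warp_reduce_product`, `parent_contribution`, `terms`, and `tol` are hypothetical, the choice of a product as the combining operation is an assumption, and the inner loop only imitates the tree pattern that a device-side shuffle reduction (for example with `__shfl_down_sync`) would follow.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kWarpSize = 32;

// Host-side stand-in for a shuffle-down tree reduction across one warp.
// lanes[i] plays the role of the register value held by lane i.
inline double warp_reduce_product(std::array<double, kWarpSize> lanes) {
    for (std::size_t offset = kWarpSize / 2; offset > 0; offset /= 2) {
        for (std::size_t lane = 0; lane < offset; ++lane) {
            lanes[lane] *= lanes[lane + offset];
        }
    }
    return lanes[0];  // lane 0 ends up holding the combined value
}

struct ParentResult {
    double value;         // accumulated contribution (assumed to be a product)
    std::size_t visited;  // number of jCenters processed before stopping
};

// Hypothetical accumulation for one point: terms[j] is the factor from jCenter j.
// The early-exit test runs once per batch of 32 jCenters, not once per jCenter.
inline ParentResult parent_contribution(const std::vector<double>& terms, double tol) {
    assert(tol >= 0.0);
    double total = 1.0;
    std::size_t visited = 0;
    for (std::size_t j0 = 0; j0 < terms.size(); j0 += kWarpSize) {
        std::array<double, kWarpSize> lanes;
        lanes.fill(1.0);  // idle lanes hold the multiplicative identity
        for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
            const std::size_t j = j0 + lane;
            if (j < terms.size()) lanes[lane] = terms[j];
        }
        visited = std::min(j0 + kWarpSize, terms.size());
        total *= warp_reduce_product(lanes);
        if (total < tol) break;  // coarse-grained early exit
    }
    return {total, visited};
}
```

With 70 factors of 0.5 and `tol = 1e-3`, a per-thread loop could stop after 10 jCenters, whereas this batched version only notices after the first full batch of 32. That is the trade-off described in the first bullet.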

@dmclark17
Contributor Author

I updated the PR with some cleanup changes and minor optimizations.

Right now it fails 371 of the assertions in the weight unit test; however, the differences are all relatively small. Removing the reciprocal optimization seems to resolve the issue.
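
The thread does not show what the reciprocal optimization looks like in the kernel. One common form, assumed here purely for illustration, replaces a division `a / b` with a multiplication by a precomputed `1 / b`. The two are not bit-identical in floating point, which is one way small mismatches against a reference can arise. The helper name `div_vs_rcp_rel_diff` is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Relative difference between a / b and a * (1 / b) in double precision.
// Assumption: this mimics trading a division for a reciprocal multiply;
// it is not taken from the PR's kernel.
inline double div_vs_rcp_rel_diff(double a, double b) {
    assert(b != 0.0);
    const double by_division = a / b;
    const double rcp = 1.0 / b;            // would be computed once and reused
    const double by_reciprocal = a * rcp;  // two roundings instead of one
    if (by_division == 0.0) return std::abs(by_reciprocal);
    return std::abs(by_reciprocal - by_division) / std::abs(by_division);
}
```

On its own this substitution perturbs results only at the level of the last bits, so it should be read as an illustration of the mechanism, not as an explanation of the size of the differences reported in this thread.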

@dmclark17 dmclark17 marked this pull request as ready for review November 16, 2020 21:33
@wavefunction91
Owner

@dmclark17 How small is small? I'm more than happy to update the UT checks if it's a worthwhile optimization.

@dmclark17
Contributor Author

The maximum absolute difference is 1.4e-5 and the average absolute difference is 4.8e-8. The maximum percent difference is 0.4% and the average percent difference is 0.03%.
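
For reference, statistics of this kind can be gathered with a small comparison routine like the sketch below. The names `DiffStats` and `compare_weights` are hypothetical, and the definition of percent difference (relative to the magnitude of the reference value, skipping zero references) is an assumption, since the thread does not state which definition was used.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct DiffStats {
    double max_abs = 0.0;   // largest |test - ref|
    double mean_abs = 0.0;  // average |test - ref|
    double max_pct = 0.0;   // largest 100 * |test - ref| / |ref|
    double mean_pct = 0.0;  // average of the same, over nonzero ref entries
};

// ref: reference weights; test: weights from the modified kernel.
inline DiffStats compare_weights(const std::vector<double>& ref,
                                 const std::vector<double>& test) {
    assert(ref.size() == test.size());
    DiffStats s;
    double sum_abs = 0.0;
    double sum_pct = 0.0;
    std::size_t n_pct = 0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const double d = std::abs(test[i] - ref[i]);
        s.max_abs = std::max(s.max_abs, d);
        sum_abs += d;
        if (ref[i] != 0.0) {  // percent difference is undefined for a zero reference
            const double pct = 100.0 * d / std::abs(ref[i]);
            s.max_pct = std::max(s.max_pct, pct);
            sum_pct += pct;
            ++n_pct;
        }
    }
    if (!ref.empty()) s.mean_abs = sum_abs / static_cast<double>(ref.size());
    if (n_pct > 0) s.mean_pct = sum_pct / static_cast<double>(n_pct);
    return s;
}
```

Reporting both absolute and percent figures matters here because a small absolute error on a very small weight can still show up as a comparatively large percent difference.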

The reciprocal optimization brings the total runtime of the weights kernel from 18.9s to 14.2s for a ubiquitin simulation on a V100.

Owner

@wavefunction91 wavefunction91 left a comment

Overall looks great!

Owner

@wavefunction91 wavefunction91 left a comment

This looks great, thanks! I'll pull and verify everything works on Summit and merge ASAP.

@wavefunction91 wavefunction91 merged commit 7158988 into wavefunction91:master Dec 18, 2020