Fix the CUDA transpose operation for large N-dimensional inputs. #720

Merged
TakuyaYashima merged 1 commit into master from fix/20201001_nd_transpose_kernel on Oct 7, 2020

TE-StephenTiedemann
Contributor

The CUDA kernel for the generic N-dimensional transpose entered an infinite loop on inputs large enough that the grid-stride loop had to run more than one iteration. This PR adds a test case covering the fix in sony/nnabla-ext-cuda#245.
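
For context, here is a rough pure-Python emulation of what a grid-stride loop does in an N-dimensional transpose. It is only an illustrative sketch, not the nnabla kernel and not the defect fixed in sony/nnabla-ext-cuda#245; the names `strides_of`, `transpose_grid_stride`, `grid_dim` and `block_dim` are made up for this example. Each simulated thread starts at its global index and advances by the total thread count, so an input larger than the thread count forces more than one iteration per thread, which is the situation the new test case is meant to exercise.

```python
def strides_of(shape):
    """Row-major (C-order) strides, in elements, for the given shape."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def transpose_grid_stride(src, shape, axes, grid_dim, block_dim):
    """Emulate an ND transpose driven by a grid-stride loop (hypothetical sketch).

    src is a flat row-major list, shape its dimensions, axes the permutation.
    Returns the flat transposed list, its shape, and the largest number of
    loop iterations any simulated thread performed.
    """
    size = len(src)
    out_shape = [shape[a] for a in axes]
    src_strides = strides_of(shape)
    out_strides = strides_of(out_shape)
    dst = [None] * size
    total_threads = grid_dim * block_dim
    max_iterations = 0

    for tid in range(total_threads):  # one pass per simulated thread
        idx = tid
        iterations = 0
        while idx < size:
            # Split the flat output index into coordinates and map each
            # coordinate to the matching source axis.
            rem = idx
            src_offset = 0
            for d, a in enumerate(axes):
                coord = rem // out_strides[d]
                rem -= coord * out_strides[d]
                src_offset += coord * src_strides[a]
            dst[idx] = src[src_offset]
            # The index must advance on every iteration; a loop that fails
            # to do so never terminates once size exceeds total_threads.
            idx += total_threads
            iterations += 1
        max_iterations = max(max_iterations, iterations)

    return dst, out_shape, max_iterations
```

With `shape=(2, 3, 4)` and 8 simulated threads, the 24 elements need three iterations per thread. A regression test for the real kernel would presumably pick an input large enough to exceed the launched thread count in the same way.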

TE-StephenTiedemann added the release-note-bugfix (Auto-release; Bugfix) label on Oct 7, 2020
TakuyaYashima merged commit 875d874 into master on Oct 7, 2020
TE-StephenTiedemann deleted the fix/20201001_nd_transpose_kernel branch on October 14, 2020 11:02
Labels
release-note-bugfix Auto-release; Bugfix
2 participants