
Misc fixes to Conv and ConvTranspose CUDA kernels #4281

Merged
hariharans29 merged 4 commits into master from invModel2 on Jun 29, 2020
Conversation

hariharans29 (Member) commented on Jun 19, 2020:

Description: There are some subtle fixes to the Conv and ConvTranspose CUDA kernels:

  1. In the ConvTranspose kernel, extend holding onto the mutex until we are done with all CuDNN calls (similar to "Extend holding onto the mutex in CUDA Conv" #3837, which did this for the Conv CUDA kernel).

  2. Fixes to the early bail-out logic in the Compute methods when one of the output dim values is 0:

ConvTranspose kernel:
a) There is logic to bail out early if one of the output dimensions is zero, but it resides in an if block that only executes on the first run, or on a run whose input shape / filter shape differs from the cached shapes of a previous run. Consequently, if the output was empty on one run and the input shape / filter shape remained unchanged on the next run, the early bail-out logic was never reached, the CuDNN call was made anyway, and it failed at runtime. This change fixes that.

b) If bailing out early on the very first run, cache the output shape and filter dimensions for use in subsequent runs.

Conv kernel:

a) If bailing out early on the very first run, cache the filter dimensions for use in subsequent runs.

Motivation and Context
Fix #4014

@hariharans29 hariharans29 requested a review from a team as a code owner June 19, 2020 04:29
@hariharans29 hariharans29 changed the title fixes to ConvTranspose cuda kernel Misc fixes to ConvTranspose CUDA kernel Jun 19, 2020
auto b_data = reinterpret_cast<const CudaT*>(B->template Data<T>());
CUDNN_RETURN_IF_ERROR(cudnnAddTensor(CudnnHandle(), &alpha, s_.b_tensor, b_data, &alpha, s_.y_tensor, y_data));
}
}
hariharans29 (Member, Author) commented on Jun 19, 2020:
Brace from above moved down, extending how long the mutex is held.

ConvTransposeAttributes::Prepare p;
ORT_RETURN_IF_ERROR(conv_transpose_attrs_.PrepareForCompute(context, has_bias, p, dynamic_padding));

// Bail out early if one of the dimensions is zero.
hariharans29 (Member, Author) commented:

Move this block to after s_.y_dims has been cached for future runs.

s_.algo = perf.algo;
s_.workspace_bytes = perf.memory;
}
}
hariharans29 (Member, Author) commented:

Move this brace down to extend holding onto the mutex longer

y_data = reinterpret_cast<CudaT*>(Y->template MutableData<T>());

// Bail out early if one of the output dimensions is zero.
if (Y->Shape().Size() == 0) {
hariharans29 (Member, Author) commented:

The if logic to bail out early.

skottmckay previously approved these changes Jun 26, 2020

skottmckay (Contributor) left a comment:
:shipit:

@hariharans29 hariharans29 changed the title Misc fixes to ConvTranspose CUDA kernel Misc fixes to Conv and ConvTranspose CUDA kernels Jun 26, 2020
@hariharans29 hariharans29 merged commit 465140b into master Jun 29, 2020
@hariharans29 hariharans29 deleted the invModel2 branch June 29, 2020 23:07

Successfully merging this pull request may close these issues:

The output is different with the same input on CUDA