Fix replication_pad for cuda launch configuration #50565

Closed
wants to merge 11 commits into from

Conversation

xwang233
Collaborator

Fix #49601

auto devOutput = output_.packed_accessor64<scalar_t, 3>();

int outputPlaneSize = devOutput.size(2);
int size1 = devOutput.size(1);
Collaborator

int64_t for sizes
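
A minimal sketch of why this matters: the extents read from a 64-bit accessor are 64-bit values, and storing them in `int` can silently lose the high bits on very large tensors. The helper name `fits_in_int` is hypothetical and not part of this PR; it only illustrates the range check that `int` storage would otherwise need.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not from the PR): reports whether a 64-bit extent
// can be stored in a 32-bit int without losing information.
bool fits_in_int(int64_t extent) {
    return extent >= std::numeric_limits<int>::min() &&
           extent <= std::numeric_limits<int>::max();
}
```

Keeping the sizes as `int64_t` avoids needing such a check at all.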

int size1 = devOutput.size(1);
int size0 = devOutput.size(0);

int y_left = size1;
Collaborator

for (int64_t block_y = 0; block_y < size1; block_y += 65535) {
    // gridDim.y and gridDim.z are limited to 65535, so launch in chunks
    int64_t block_y_size = std::min<int64_t>(size1 - block_y, 65535);
    for (int64_t block_z = 0; block_z < size0; block_z += 65535) {
        int64_t block_z_size = std::min<int64_t>(size0 - block_z, 65535);
        dim3 gridSize(THCCeilDiv(outputPlaneSize, 256), block_y_size, block_z_size);
        dim3 blockSize(...);
        launch_kernel(...., block_y, block_z);
    }
}

is a bit simpler
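
The chunking in the loop above can be checked on the host without a GPU. The following is a sketch only, not the code merged in this PR: `LaunchChunk`, `plan_launches`, and `kMaxGridDim` are hypothetical names introduced for illustration. It enumerates the (offset, extent) pairs that each kernel launch would receive, assuming the 65535 limit on `gridDim.y` and `gridDim.z`.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Assumed limit on gridDim.y and gridDim.z (hypothetical constant name).
constexpr int64_t kMaxGridDim = 65535;

// One planned kernel launch (hypothetical struct, for illustration only).
struct LaunchChunk {
    int64_t block_y;       // offset the kernel would add to blockIdx.y
    int64_t block_z;       // offset the kernel would add to blockIdx.z
    int64_t block_y_size;  // gridDim.y for this launch
    int64_t block_z_size;  // gridDim.z for this launch
};

// Splits the (size1, size0) range into chunks that each fit the grid limit.
std::vector<LaunchChunk> plan_launches(int64_t size1, int64_t size0) {
    std::vector<LaunchChunk> chunks;
    for (int64_t block_y = 0; block_y < size1; block_y += kMaxGridDim) {
        const int64_t block_y_size = std::min(size1 - block_y, kMaxGridDim);
        for (int64_t block_z = 0; block_z < size0; block_z += kMaxGridDim) {
            const int64_t block_z_size = std::min(size0 - block_z, kMaxGridDim);
            chunks.push_back({block_y, block_z, block_y_size, block_z_size});
        }
    }
    return chunks;
}
```

For example, `plan_launches(70000, 3)` yields two chunks: one covering y in [0, 65535) and one covering the remaining 4465 rows starting at offset 65535. Inside the kernel, the offsets would be added to `blockIdx.y` and `blockIdx.z` to recover the global indices.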

codecov bot commented Jan 15, 2021

Codecov Report

Merging #50565 (d701c38) into master (4511f2c) will increase coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #50565      +/-   ##
==========================================
+ Coverage   80.65%   80.67%   +0.01%     
==========================================
  Files        1913     1910       -3     
  Lines      208151   207864     -287     
==========================================
- Hits       167887   167696     -191     
+ Misses      40264    40168      -96     

@xwang233
Collaborator Author

cc @ptrblck

xwang233 changed the title from "[WIP] Fix replication_pad for cuda launch configuration" to "Fix replication_pad for cuda launch configuration" on Jan 19, 2021
Contributor

@facebook-github-bot facebook-github-bot left a comment

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@ngimel merged this pull request in db86dd8.

Successfully merging this pull request may close these issues.

replication_pad1d raising "CUDA error: invalid configuration argument" on large inputs
4 participants