
Conversation

@zou3519 (Contributor) commented Oct 5, 2017

Summary

cudnnSetStream calls were missing from the cuDNN bindings, so cuDNN launched its kernels on the handle's default stream rather than the current stream. This change adds those calls so that the cuDNN functions use the current stream. #2523
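For context, a minimal standalone sketch of the underlying cuDNN API call (this is not the PR's code; the handle and stream here are created locally just for illustration):

```cpp
#include <cudnn.h>
#include <cuda_runtime.h>
#include <cstdio>

int main() {
  cudnnHandle_t handle;
  cudnnCreate(&handle);

  // A freshly created handle is bound to the default (NULL) stream, which is
  // why cuDNN kernels were not ordered with the rest of the work on the
  // current stream.
  cudaStream_t stream;
  cudaStreamCreate(&stream);

  // The essence of the fix: bind the handle to the caller's stream before
  // issuing cuDNN calls, so the kernels launch on that stream.
  cudnnSetStream(handle, stream);

  cudaStream_t bound = nullptr;
  cudnnGetStream(handle, &bound);
  std::printf("handle bound to user stream: %s\n",
              bound == stream ? "yes" : "no");

  cudnnDestroy(handle);
  cudaStreamDestroy(stream);
  return 0;
}
```

In the bindings themselves, the stream passed to cudnnSetStream would be whatever stream THC considers current for the device.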

Test plan

Created a test file (adapted from here).

Ran it under nvprof. Asserted that the cuDNN functions did not use the current stream before the changes, and that they did use the current stream after the changes.

Before: [nvprof screenshot: cuDNN kernels not on the current stream]

After: [nvprof screenshot: cuDNN kernels on the current stream]

@soumith merged commit 137b139 into pytorch:master on Oct 5, 2017
"WARNING: Mismatch between user stream and the cuDNN handle context\n";
} else if (status != CUDNN_STATUS_SUCCESS) {
std::cerr << "WARNING: Could not set cuDNN to use current stream\n";
}

@ngimel (Collaborator) commented Oct 5, 2017

It might break in a multi-threaded situation where multiple threads work on the same device, share the same cuDNN handle, release the GIL, and race to set different streams. Multi-threaded use with a single cuDNN handle is a fragile configuration.
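For illustration only (not PyTorch code; the shared handle and all names are assumed), a minimal C++ sketch of the interleaving described above: two threads race on cudnnSetStream against one shared handle, so a thread's subsequent cuDNN launch can land on the other thread's stream.

```cpp
#include <cudnn.h>
#include <cuda_runtime.h>
#include <thread>
#include <cstdio>

int main() {
  cudnnHandle_t handle;  // one handle shared by both threads
  cudnnCreate(&handle);

  auto worker = [&handle](cudaStream_t mine) {
    cudnnSetStream(handle, mine);
    // In the binding, the GIL is released around here, so the other thread
    // can rebind the shared handle before this thread launches its kernel.
    cudaStream_t bound = nullptr;
    cudnnGetStream(handle, &bound);
    if (bound != mine)
      std::printf("race: handle was rebound to another thread's stream\n");
  };

  cudaStream_t s1, s2;
  cudaStreamCreate(&s1);
  cudaStreamCreate(&s2);

  std::thread t1(worker, s1), t2(worker, s2);
  t1.join();
  t2.join();

  cudaStreamDestroy(s1);
  cudaStreamDestroy(s2);
  cudnnDestroy(handle);
  return 0;
}
```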

@apaszke (Contributor) commented Oct 5, 2017

I agree. Our threading model for THCState is very fragile in general. I think we've been good so far only because we usually use at most one thread per device. However, I think this is no longer true in DistributedDataParallel, and now I'm not sure why it doesn't break.

@soumith (Member) commented Oct 5, 2017

It breaks (remember the NCCL deadlocks that I'm fixing :) )

@apaszke (Contributor) commented Oct 5, 2017

It's completely unrelated to the NCCL deadlocks.
