[WIP]Implement DeviceContextManager and Ensure only one CUDA stream existing in new framework at now #4218

Closed
wants to merge 7 commits into from



@QiJune QiJune commented Sep 20, 2017

Fix #3796

@QiJune QiJune changed the title [WIP]Implement Device manager and Ensure only one CUDA stream existing in new framework at now [WIP]Implement DeviceContextManager and Ensure only one CUDA stream existing in new framework at now Sep 20, 2017
 *
 * @note CopyFrom supports CPU <-> GPU, GPU <-> GPU.
 */
template <typename T>
- inline void CopyFrom(const Tensor& src, const platform::Place& dst_place);
+ inline void CopyFrom(const Tensor& src, const platform::Place& dst_place,
+                      bool is_sync = false);
Collaborator


I am not sure, but the easiest way might be for CopyFrom to take a const DeviceContext&?


@QiJune QiJune Sep 21, 2017


It's hard for operator developers to pass a DeviceContext to the CopyFrom method. They would have to pick the right DeviceContext depending on both the src place (CPU or GPU) and the dst place (CPU or GPU).
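To make the burden concrete, here is a minimal sketch of the selection logic every caller would need if CopyFrom took a DeviceContext. The `DeviceKind` and `Place` types and the `ContextPlaceForCopy` helper are hypothetical stand-ins, not Paddle's real `platform::Place` API, and the rule for GPU <-> GPU copies (use the destination device) is an assumption made only for illustration.

```cpp
#include <cassert>

// Hypothetical stand-ins for platform::Place; not Paddle's real types.
enum class DeviceKind { kCPU, kGPU };

struct Place {
  DeviceKind kind;
  int device_id;  // ignored when kind == kCPU
};

inline bool operator==(const Place& a, const Place& b) {
  if (a.kind != b.kind) return false;
  return a.kind == DeviceKind::kCPU || a.device_id == b.device_id;
}

// Returns the place whose DeviceContext the caller would have to supply.
// Assumed convention: a GPU destination wins, then a GPU source, else CPU.
inline Place ContextPlaceForCopy(const Place& src, const Place& dst) {
  if (dst.kind == DeviceKind::kGPU) return dst;  // CPU -> GPU, GPU -> GPU
  if (src.kind == DeviceKind::kGPU) return src;  // GPU -> CPU
  return dst;                                    // CPU -> CPU
}
```

Every operator that copies a tensor would have to repeat a branch like this before calling CopyFrom, which is the duplication the comment above objects to.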


@reyoung reyoung left a comment


I do not think DeviceContextMgr is necessary. Maybe we could just pass a DeviceContext to Tensor::CopyFrom and make it async?


QiJune commented Sep 21, 2017

@reyoung

  1. DeviceContext should not be exposed to users; users should not, and in practice almost cannot, create the right DeviceContext themselves. If we pass a DeviceContext to CopyFrom, where is that DeviceContext created?
  2. We also use the copy method in pybind, between C++ and Python. We can hardly pass a DeviceContext there, since that would mean exposing an API like this:
a = numpy.array([1, 2])
t.set_float(a, device_context)
  3. In any case, we need to create some DeviceContexts during Paddle's initialization stage and handle the parallel CUDA streams internally.
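The idea of creating contexts once and keeping CUDA streams internal could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in this PR: `DeviceContext` here is a mock that holds an integer `stream_id` where a real context would own a `cudaStream_t`, and `GetGpuContext` and `ContextCount` are hypothetical names. The sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>

// Hypothetical stand-in for a per-device context; a real one would own a
// cudaStream_t instead of an integer id.
struct DeviceContext {
  int device_id;
  int stream_id;
};

// Process-wide registry that hands out exactly one context per GPU, so
// callers never construct a DeviceContext (or a stream) themselves.
class DeviceContextManager {
 public:
  static DeviceContextManager& Instance() {
    static DeviceContextManager mgr;
    return mgr;
  }

  // Returns the single context for a GPU, creating it on first use.
  const DeviceContext& GetGpuContext(int device_id) {
    auto it = contexts_.find(device_id);
    if (it == contexts_.end()) {
      std::unique_ptr<DeviceContext> ctx(
          new DeviceContext{device_id, next_stream_id_++});
      it = contexts_.emplace(device_id, std::move(ctx)).first;
    }
    return *it->second;
  }

  std::size_t ContextCount() const { return contexts_.size(); }

 private:
  DeviceContextManager() = default;
  DeviceContextManager(const DeviceContextManager&) = delete;
  DeviceContextManager& operator=(const DeviceContextManager&) = delete;

  std::map<int, std::unique_ptr<DeviceContext>> contexts_;
  int next_stream_id_ = 0;
};
```

With a registry like this, a copy routine can look up the context from the source and destination places on its own, so neither operator code nor the Python binding has to mention a DeviceContext.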

@paddle-bot-old paddle-bot-old bot closed this May 22, 2020

Since you haven't replied for a long time, we have closed this issue/pr.
If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up.

Successfully merging this pull request may close these issues.

how to manage DeviceContext which contains cuda streams
2 participants