Add pinned memory #9216
Conversation
Force-pushed from c4d0515 to e0156c1
Force-pushed from 999ea6c to ef027b3
paddle/fluid/framework/tensor.h
Outdated
@@ -45,10 +45,11 @@ class Tensor {
   friend struct EigenVector;

  public:
-  Tensor() : offset_(0) {}
+  Tensor() : offset_(0), use_pinned_(false) {}
Curious why we need to add use_pinned_
to the tensor. If we only want to use pinned memory for copying, wouldn't putting this in the CopyTensor
implementation be enough?
just put this in the CopyTensor implement
Do you mean, for CPU->GPU, to copy the data from CPU to pinned memory first and then copy from pinned memory to GPU?
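The staged CPU->GPU path being asked about can be modeled in plain C++ (no CUDA calls). In this hypothetical sketch all three buffers are ordinary host memory and memcpy stands in for the real transfers; in a CUDA build the second copy would be a cudaMemcpyAsync from the pinned staging buffer to device memory, and the function/parameter names are illustrative only.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Model of the staged copy: pageable host -> pinned staging -> device.
// memcpy stands in for both hops; only the second one would be a
// cudaMemcpyAsync (pinned -> device) in a real CUDA build.
void StagedCopyToDevice(const char* src, char* pinned_staging,
                        char* device_dst, size_t n) {
  std::memcpy(pinned_staging, src, n);         // pageable -> pinned
  std::memcpy(device_dst, pinned_staging, n);  // pinned -> "device"
}
```

The point of the staging buffer is that the DMA engine can only transfer asynchronously out of page-locked memory, so pageable data must be moved there first.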
No, of course not. I was thinking about the GPU->CPU copy. For the CPU->GPU case, this seems to be the only way for now.
Why do we use the pinned memory only as a staging area instead of using it directly? CPU computations can access data in pinned memory directly.
In that case, can we make use_pinned_
a global GFLAG, so that all host allocations use pinned memory? Then we could test the overall performance boost.
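The global-flag idea could be sketched as below. This is a hypothetical model: the flag name (FLAGS_use_pinned_memory) is illustrative of a GFLAGS-style switch, and malloc stands in for both allocators; a real CUDA build would call cudaMallocHost when the flag is set.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical process-wide switch deciding whether host allocations
// are page-locked. In Paddle this could be a GFLAGS bool; the name
// here is illustrative only.
static bool FLAGS_use_pinned_memory = false;

// Stand-in host allocator: both branches use malloc here, but the
// first would be cudaMallocHost (pinned) in a CUDA build.
void* AllocHost(size_t size) {
  if (FLAGS_use_pinned_memory) {
    return std::malloc(size);  // placeholder for cudaMallocHost
  }
  return std::malloc(size);    // ordinary pageable allocation
}
```

A single flag like this would let the pinned-allocation path be toggled at process startup without touching every Alloc call site.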
Because memory copies between the GPU and CPU are asynchronous, in the GPU->CPU case we must ensure the copy has completed before reading the data from pinned memory, so a sync operation is needed. Using pinned memory directly is therefore a little complex.
I plan to put only the input data into pinned memory and test the overall performance.
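The sync requirement described above can be modeled without CUDA: the copy into pinned memory runs asynchronously, so the host must wait before reading the destination. In this sketch std::async stands in for cudaMemcpyAsync and future.wait() for cudaStreamSynchronize; the names are illustrative and no real CUDA calls are made.

```cpp
#include <cassert>
#include <cstring>
#include <future>

// Model of an async GPU->CPU copy into pinned host memory. The
// returned future plays the role of the CUDA stream: the caller must
// wait on it (cudaStreamSynchronize) before reading pinned_dst.
std::future<void> AsyncCopyToPinned(const char* device_src,
                                    char* pinned_dst, size_t n) {
  return std::async(std::launch::async,
                    [=] { std::memcpy(pinned_dst, device_src, n); });
}
```

Reading pinned_dst before the wait would be a race, which is exactly the extra complexity the comment refers to.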
Is this still WIP, or can I review and merge it?
Force-pushed from ef027b3 to eaa90d3
paddle/fluid/memory/memory.cc
Outdated
void* Alloc<platform::CUDAPlace>(platform::CUDAPlace place, size_t size,
                                 bool use_pinned) {
  void* ptr;
  if (use_pinned) {
Alloc on CUDAPlace
with use_pinned=false
returns a device pointer, but with use_pinned=true
it returns a host pointer. This is a little confusing.
Yes, maybe we should add a new place (CUDAPinnedPlace).
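The CUDAPinnedPlace suggestion could look like the sketch below: each kind of memory gets its own place type, so the location of the pointer Alloc returns is determined by the place alone rather than by a use_pinned flag. The enum and case names here are illustrative, and malloc stands in for the real allocators (cudaMalloc for the device case, cudaMallocHost for the pinned case).

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical place taxonomy: pinned host memory gets its own place
// instead of a boolean flag on the CUDA place.
enum class PlaceKind { kCPU, kCUDA, kCUDAPinned };

// Stand-in allocator: malloc replaces the real per-place allocators,
// which are noted in the comments.
void* Alloc(PlaceKind place, size_t size) {
  switch (place) {
    case PlaceKind::kCPU:        return std::malloc(size);  // pageable host
    case PlaceKind::kCUDA:       return std::malloc(size);  // cudaMalloc
    case PlaceKind::kCUDAPinned: return std::malloc(size);  // cudaMallocHost
  }
  return nullptr;
}
```

With a dedicated place, callers can tell from the type system alone whether a pointer is device- or host-resident, which removes the ambiguity the reviewer pointed out.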
paddle/fluid/framework/tensor_impl.h
Outdated
  } else if (platform::is_gpu_place(place)) {
#ifndef PADDLE_WITH_CUDA
    PADDLE_THROW("'CUDAPlace' is not supported in CPU only device.");
  }
#else
    holder_.reset(new PlaceholderImpl<platform::CUDAPlace>(
-       boost::get<platform::CUDAPlace>(place), size, type));
+       boost::get<platform::CUDAPlace>(place), size, type, use_pinned));
The holder_ pointer will be on the host here, which may cause errors.
The host memory is page-locked and directly accessible to the device.
Reference: http://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html#group__CUDART__MEMORY_1gab84100ae1fa1b12eaca660207ef585b
Thank you for this information!
… feature/add_pinned_memory
Force-pushed from ba1178e to 9e99446
LGTM++, we can add the new place type in next PR.
WIP
The current CUDA Runtime documentation states, under Asynchronous (Memcpy):
- For transfers from device memory to pageable host memory, the function will return only once the copy has completed.