implement DeviceContext #2709

QiJune · 2017-07-03T12:05:43Z

…eviceContext

hedaoyuan · 2017-07-05T15:35:46Z

paddle/platform/device_context.h

+  Eigen::DefaultDevice* eigen_device_{nullptr};
+};
+
+#ifndef PADDLE_ONLY_CPU


Adding PADDLE_ONLY_CPU here causes a problem: any code that calls DeviceGuard or CudaDeviceContext also has to be guarded by PADDLE_ONLY_CPU.

Code that calls DeviceGuard or CudaDeviceContext must be built with WITH_GPU set to 1.
Yes, this raises the question of how to organize our CPU and GPU code clearly.
We can use macros, or provide a fake stub header file.

I think there are a few things to consider when dealing with mixed GPU and CPU code.

Unless it is necessary, do not put GPU and CPU code in the same file. That way, no extra macro is needed to separate the code. (I think context.h should contain only the CPU context, and cuda_context.h the GPU context.)

Do not use PADDLE_ONLY_CPU; replace it with PADDLE_WITH_CUDA. CPU code should be the default, and code that needs CUDA should be guarded by PADDLE_WITH_CUDA.
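A minimal sketch of the opt-in layout suggested above, assuming the macro is named PADDLE_WITH_CUDA as proposed. The class bodies, Name(), DescribeBuild() and UsageExample() are hypothetical and only illustrate the guard placement; they are not taken from the PR.

```cpp
// Sketch only: CPU code is unguarded, CUDA code is opt-in.
#include <cassert>
#include <string>

class DeviceContext {
 public:
  virtual ~DeviceContext() {}
  virtual std::string Name() const = 0;  // hypothetical helper
};

// CPU code is the default and needs no macro.
class CPUDeviceContext : public DeviceContext {
 public:
  std::string Name() const override { return "cpu"; }
};

#ifdef PADDLE_WITH_CUDA
// Compiled only when the build defines PADDLE_WITH_CUDA.
// In a real tree this class would live in its own header.
class CUDADeviceContext : public DeviceContext {
 public:
  std::string Name() const override { return "cuda"; }
};
#endif

// Callers add the macro only around the GPU-specific branch.
inline std::string DescribeBuild() {
#ifdef PADDLE_WITH_CUDA
  return "cpu+cuda";
#else
  return "cpu";
#endif
}

// The CPU path works in every build configuration.
inline void UsageExample() {
  CPUDeviceContext cpu;
  assert(cpu.Name() == "cpu");
}
```

With this layout, a CPU-only build never sees the CUDA class, and no caller has to wrap CPU code in a negated macro.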

I think this suggestion is useful.

I will merge this PR for now. I will consider the design of DeviceContext together with the Operator interface, and I will follow @hedaoyuan's advice later.

qingqing01 · 2017-07-05T15:05:10Z

paddle/platform/device_context.h

+  Eigen::DefaultDevice eigen_device() {
+    if (!eigen_device_) {
+      eigen_device_ = new Eigen::DefaultDevice();
+    }


Where is Eigen::DefaultDevice used in our design? I found Eigen::DefaultDevice in the directory eigen/unsupported/Eigen/CXX11/src/Tensor, but I could not find its usage in Eigen's Tensor documentation.

Eigen::DefaultDevice is defined in unsupported/Eigen/CXX11/src/Tensor/TensorDeviceDefault.h.
For the usage of Eigen::DefaultDevice, please refer to https://github.com/QiJune/RefEigen/blob/master/main.cu

qingqing01 · 2017-07-05T15:18:24Z

paddle/platform/device_context.h

+    paddle::platform::throw_on_error(cudaStreamCreate(&stream_),
+                                     "cudaStreamCreate failed");
+    eigen_stream_ = new Eigen::CudaStreamDevice(&stream_);
+    eigen_device_ = new Eigen::GpuDevice(eigen_stream_);


I'm not sure we will use Eigen's CUDA implementation. If we decide not to use it, I think eigen_stream_ and eigen_device_ can be removed.

If we do not use Eigen's CUDA implementation, we will have to write CUDA kernels for every operator, just like Caffe2.
TensorFlow uses Eigen's CUDA implementation, and @hedaoyuan once mentioned that the efficiency of Eigen's expression templates on the GPU is acceptable.
So we may want to discuss this offline.

qingqing01 · 2017-07-05T15:27:37Z

paddle/platform/device_context.h

+#include "paddle/framework/enforce.h"
+#include "paddle/platform/dynload/cublas.h"
+#include "paddle/platform/dynload/cudnn.h"
+#include "paddle/platform/dynload/curand.h"


The three lines above should also be placed between #ifndef PADDLE_ONLY_CPU and #endif.

Yes, logically the three header files above should be included inside the macro guard.

qingqing01 · 2017-07-05T15:29:07Z

paddle/platform/device_context.h

+
+#ifndef PADDLE_ONLY_CPU
+#include "paddle/platform/cuda.h"
+#define EIGEN_USE_GPU


Where is the EIGEN_USE_GPU used?

EIGEN_USE_GPU is used by the Eigen library. If we want to use Eigen's tensor expressions on the GPU, we have to define this macro.

wangkuiyi

I like that idea of a super simple engine.

wangkuiyi · 2017-07-06T02:20:40Z

paddle/platform/device_context.h

+  virtual ~DeviceContext() {}
+};
+
+class CpuDeviceContext : public DeviceContext {


wangkuiyi · 2017-07-06T02:20:51Z

paddle/platform/device_context.h

+  GPUPlace previous_;
+};
+
+class CudaDeviceContext : public DeviceContext {


Cuda => CUDA

reyoung · 2017-07-06T08:02:41Z

paddle/platform/device_context.h

+class CPUDeviceContext : public DeviceContext {};
+
+#ifndef PADDLE_ONLY_CPU
+class DeviceGuard {


It seems this should be GPUPlaceGuard, not DeviceGuard, because it takes a GPUPlace as its argument.

It actually guards GPUPlace.device. Since we pass a GPUPlace, maybe GPUPlaceGuard is a clearer name.
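To make the naming discussion concrete, here is a rough sketch of an RAII guard that takes a GPUPlace and restores the previous device when it goes out of scope. It is not the PR's implementation: the fake_runtime functions are stand-ins for cudaGetDevice and cudaSetDevice so that the sketch builds without CUDA, and the GPUPlace struct and DeviceInsideGuard() helper are simplified assumptions.

```cpp
// Sketch only: RAII guard around the "current device" of a GPUPlace.
#include <cassert>

// Stand-ins for cudaGetDevice/cudaSetDevice (assumption, not the real API).
namespace fake_runtime {
inline int& CurrentDevice() {
  static int device = 0;
  return device;
}
inline int GetDevice() { return CurrentDevice(); }
inline void SetDevice(int id) {
  assert(id >= 0);
  CurrentDevice() = id;
}
}  // namespace fake_runtime

// Simplified place type; only the device index matters here.
struct GPUPlace {
  explicit GPUPlace(int d = 0) : device(d) {}
  int device;
};

class GPUPlaceGuard {
 public:
  // Remember the active device, then switch to the requested one.
  explicit GPUPlaceGuard(GPUPlace new_place)
      : previous_(fake_runtime::GetDevice()) {
    if (previous_.device != new_place.device) {
      fake_runtime::SetDevice(new_place.device);
    }
  }
  // Restore the device that was active before the guard was created.
  ~GPUPlaceGuard() { fake_runtime::SetDevice(previous_.device); }

  GPUPlaceGuard(const GPUPlaceGuard&) = delete;
  GPUPlaceGuard& operator=(const GPUPlaceGuard&) = delete;

 private:
  GPUPlace previous_;
};

// Hypothetical helper: reports the device seen while the guard is alive.
inline int DeviceInsideGuard(int id) {
  GPUPlaceGuard guard{GPUPlace(id)};
  return fake_runtime::GetDevice();
}
```

Because the constructor argument is a GPUPlace rather than a raw device id, the name GPUPlaceGuard describes the interface more directly than DeviceGuard.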

reyoung · 2017-07-06T08:03:58Z

paddle/platform/device_context.h

+                                     "cudaStreamSynchronize failed");
+  }
+
+  cudaStream_t stream() { return stream_; }


Lack of const on all methods?

But maybe it is not important, because every device context is passed to Op::Run as a mutable pointer.

reyoung · 2017-07-06T08:06:41Z

paddle/platform/device_context.h

+  cublasHandle_t cublas_handle() {
+    if (!blas_handle_) {
+      DeviceGuard guard(gpu_place_);
+      PADDLE_ENFORCE(paddle::platform::dynload::cublasCreate(&blas_handle_) ==


Tooooooooo long for the namespace.

Maybe we can add using namespace paddle::platform; in the private section of this class, like

class GPUDeviceContext {
 private:
  using namespace paddle::platform;  // only use namespace in this class.
};

Or maybe an alias is better, like

using dynload = paddle::platform::dynload;

It seems that we cannot add an alias or a using namespace directive inside a class.
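For reference, C++ rejects both a using-directive and a namespace alias at class scope, but accepts them at namespace scope and inside a function body. A namespace alias is also written namespace x = ...; rather than using x = ...;, since the latter declares a type alias. The sketch below shows the two placements that compile. The cublasCreate here is a stand-in that takes an int handle, not the real cuBLAS signature, and the class is a simplified assumption.

```cpp
// Sketch only: where a namespace alias may appear.
#include <cassert>

namespace paddle {
namespace platform {
namespace dynload {
// Stand-in for the dynamically loaded cublasCreate (assumption).
inline int cublasCreate(int* handle) {
  *handle = 42;
  return 0;
}
}  // namespace dynload
}  // namespace platform
}  // namespace paddle

// Allowed: alias at namespace scope.
namespace dynload = paddle::platform::dynload;

class GPUDeviceContext {
 public:
  int cublas_handle() {
    // Allowed: alias (or using-directive) at block scope.
    namespace dl = paddle::platform::dynload;
    if (handle_ == 0) {
      int status = dl::cublasCreate(&handle_);
      assert(status == 0);
      (void)status;  // avoids an unused warning when NDEBUG is defined
    }
    return handle_;
  }

 private:
  // Not allowed here (class scope), both lines fail to compile:
  //   using namespace paddle::platform;
  //   namespace dynload = paddle::platform::dynload;
  int handle_ = 0;
};
```

A function-local alias keeps the short name out of every file that includes the header, which matters when the class is defined in a header such as device_context.h.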

reyoung · 2017-07-06T08:13:49Z

paddle/platform/device_context_test.cc

+  for (int i = 0; i < count; i++) {
+    paddle::platform::CUDADeviceContext* device_context =
+        new paddle::platform::CUDADeviceContext(i);
+    __attribute__((unused)) Eigen::GpuDevice gpu_device =


Do not use the unused attribute, because it may fail on some compilers.

Maybe the return value does not need to be stored. What about

ASSERT_NE(nullptr, device_context->eigen_device());

jacquesqiao

LGTM. We can separate the CPU code and GPU code later.

QiJune added 4 commits July 3, 2017 15:55

add device_context

e876477

add unittest for device_context

1ba4cb8

transfer to use function paddle::platform::throw_on_error

cdfa098

fix cuda build error

5acaffb

QiJune requested review from reyoung, wangkuiyi, gangliao, jacquesqiao, Superjomn, JiayiFeng and hedaoyuan July 3, 2017 12:06

QiJune mentioned this pull request Jul 4, 2017

nv_test could not add_dependencies of header-ONLY library #2727

Closed

QiJune added 3 commits July 4, 2017 08:03

Merge remote-tracking branch 'baidu/develop' into feature/implement_D…

ab56c96

…eviceContext

using dynload functions

abbed1d

Merge remote-tracking branch 'baidu/develop' into feature/implement_D…

f9ae741

…eviceContext

QiJune mentioned this pull request Jul 5, 2017

add DeviceContext design doc #2648

Closed

Merge remote-tracking branch 'baidu/develop' into feature/implement_D…

0c13b23

…eviceContext

hedaoyuan reviewed Jul 5, 2017

qingqing01 reviewed Jul 5, 2017

wangkuiyi reviewed Jul 6, 2017

QiJune force-pushed the feature/implement_DeviceContext branch from baed7b1 to 39679d5 Compare July 6, 2017 06:32

reyoung reviewed Jul 6, 2017

follow comments

c7bdbdb

QiJune force-pushed the feature/implement_DeviceContext branch from e3f0db4 to c7bdbdb Compare July 6, 2017 09:43

jacquesqiao approved these changes Jul 10, 2017

jacquesqiao merged commit 1038bc4 into PaddlePaddle:develop Jul 10, 2017

QiJune deleted the feature/implement_DeviceContext branch July 10, 2017 05:21

QiJune added this to Done in PaddlePaddle Refactoring: Phase 1 Aug 2, 2017
