init tensor design doc #2579
Conversation
doc/design/tensor_design.md (Outdated)

> `Allocation` is a [RAII](http://en.cppreference.com/w/cpp/language/raii) class template, which is used to handle a piece of memory.
>
> ```cpp
> // Template parametr 'Device' can be 'CpuDevice' or 'GpuDevice'
> ```
parametr --> parameter
fixed
doc/design/tensor_design.md (Outdated)

> ```cpp
> // reshape tensor, data will be retained
> void reshape(const Dim<rank>& size);
> ```
Maybe we should disable the use of `=` for tensors, and add a shared method to explicitly share a tensor:

```cpp
Tensor& operator=(const Tensor& src) = delete;
void ShareData(const Tensor& src);
```
Good idea. Since `Copy()` is already a global function, we'd better make `ShareData()` global as well.
I think the `ShareData` method is a substitute for `operator=`, so `ShareData` should be a class method too. Here is sample code:

```cpp
Tensor<CpuDevice, double, 2> t_a(make_dim(2, 3), CpuDevice());
Tensor<CpuDevice, double, 2> t_b;
t_b.ShareData(t_a);
```
doc/design/tensor_design.md (Outdated)

> ```cpp
> // make a totally new tensor on CPU with raw constructor
> Tensor<CpuDevice, double, 2> t_a = Tensor<CpuDevice, double, 2>(make_dim(2,3), CpuDevice());
> ```
If we disable the use of `=`, then this line should be:

```cpp
Tensor<CpuDevice, double, 2> t_a(make_dim(2, 3), CpuDevice());
```
That's right. It also makes the code clearer, and the helper function `MakeTensor()` is no longer necessary.
doc/design/tensor_design.md (Outdated)

> ```cpp
> Allocation(void* ptr, size_t size, Device device);
>
> ~Allocation();
> // No copying!
> ```
Maybe in the `framework` directory we could write a `DISABLE_COPY` macro, because many classes need to be non-copyable. Alternatively, if we are happy to use `boost`, there is the `boost::noncopyable` base class.
I think the `DISABLE_COPY` macro is a cool choice.
But macros cannot be put in a namespace, so `DISABLE_COPY` will pollute the global namespace.
doc/design/tensor_design.md (Outdated)

> `Allocation` is a [RAII](http://en.cppreference.com/w/cpp/language/raii) class template, which is used to handle a piece of memory.
>
> ```cpp
> // Template parametr 'Device' can be 'CpuDevice' or 'GpuDevice'
> ```
Should we use `GPUPlace` or `GPUDevice`?
I have also dwelt on this. The key question is how we name the template parameter: `Place` or `Device`?
doc/design/tensor_design.md (Outdated)

> ```cpp
> Allocation& operator=(const Allocation&) = delete;
>
> void* ptr() const;
> void* end() const;
> ```
Why is `end` needed? `ptr() + size() == end()`.
The member function `end()` is simply for users' convenience. With `end()`, we can check the validity of a tensor's `ptr_` like this:

```cpp
CHECK(ptr_ >= alloc.ptr() && ptr_ <= alloc.end());
```

otherwise:

```cpp
CHECK(ptr_ >= alloc.ptr() && ptr_ <= alloc.ptr() + alloc.size());
```
doc/design/tensor_design.md (Outdated)

> ```cpp
> // make a new tensor by another existing tensor
> // new tensor and source tensor have the same numel but different rank
> template<int src_rank>
> Tensor(const Dim<rank>& size, const Tensor<Device, T, src_rank>& src);
> ```
Should the new tensor be a writable view of the src tensor? If so, the new tensor can change the data it shares with the src tensor, i.e., the new tensor can change the src tensor. So, should the src tensor not be marked as `const`?
Thanks for the reminder. The two tensors indeed share the same data block (which is called `Allocation` in our design). I will remove the `const`.
doc/design/tensor_design.md (Outdated)

> ```cpp
> // return raw pointer to the data.
> T* raw_ptr() const;
> ```
There is a very critical method lacking from `Tensor`, maybe called `SliceView`. We cannot get a sub-tensor from a tensor right now, but sub-tensors are heavily used in RNN and sparse computation.

```cpp
Tensor<float, CPU, 3> tensor;
Tensor b = tensor[1:];  // pseudocode: b is a matrix now
```

And the memory between the original tensor and slice tensors should be shared. For example:

```cpp
Tensor<float, CPU, 3>* originalTensor = new Tensor<float, CPU, 3>();
auto subTensor = (*originalTensor)[1:];  // pseudocode
delete originalTensor;
subTensor.data();  // should still be correct
```
@QiJune and @Canpio tell me that `Slice` is not a heavily used method. In RNN, we could also just use the same `allocation_` with a different `ptr_` to share a Tensor. This method is not an urgent need; it could be added later if needed.
In the current design, an `Allocation` can be shared by multiple `Tensor`s. A `Tensor` uses a `shared_ptr` to handle the `Allocation`, plus a separate `void*` to indicate the head of its own data. In other words, `ptr_` will point to somewhere between `Allocation::ptr()` and `Allocation::end()`. The `shared_ptr` is responsible for freeing the `Allocation` when the last related `Tensor` is destructed.
doc/design/tensor_design.md (Outdated)

> ```cpp
> Dim<rank> stride() const;
>
> // return raw pointer to the 'idx'th element
> T* index(const Dim<D>& idx) const;
> ```
Where is `D`?
It should be `rank`. Fixed.
We can put this PR into the PaddlePaddle Refactoring project for better tracking and management.
The original discussion is in:
* PaddlePaddle#2548 (comment)
* PaddlePaddle#2579 (comment)

This commit is just a proposal; let's do this kind of summarization in this PR.
doc/design/tensor_design.md (Outdated)

> ```cpp
> Allocation(size_t size, Device device);
>
> // Creates a non-owned allocation
> Allocation(void *ptr, size_t size, Device device);
> ```
What is the use of this interface? Tensors can share a pointer to the same Allocation object.
`Tensor` will never use this interface. However, as a general memory handler, `Allocation` may be used by other concepts. Offering an interface that accepts external memory may help increase its flexibility.
Is there any concept other than `Tensor` that will use `Allocation`? If we have no clear use cases, I think we can remove the interface for now, as in the comment #2587 (comment).
ok, I'm going to remove it.
doc/design/tensor_design.md (Outdated)

> ```cpp
> size_t size() const;
>
> private:
>  bool owned_;
> ```
If we remove the non-owned allocation, this can also be removed.
removed.
doc/design/tensor_design.md (Outdated)

> `Tensor` is the combination of Majel's `Buffer` and `Array`.
>
> ```cpp
> template <typename Device, typename T, int rank>
> ```
`rank` -> `Rank`, like `Device`.
fixed
> ## Tensor
>
> `Tensor` is the combination of Majel's `Buffer` and `Array`.
Before Tensor's implementation, I think we need to explain here why the three parameters `Device`, `T`, and `rank` are placed in the template parameter list.
Good idea. I will add this part later.
doc/design/tensor_design.md (Outdated)

> ```cpp
> T *raw_ptr() const;
>
> // return tensor size
> Dim<rank> size() const;
> ```
Maybe `const Dim<rank>&` is better?
doc/design/tensor_design.md (Outdated)

> ```cpp
> int numel() const;
>
> // return tensor stride
> Dim<rank> stride() const;
> ```
`const Dim<rank>&`
Fixed. And also `const Dim<Rank>& size() const;`.
doc/design/tensor_design.md (Outdated)

> ```cpp
> Dim<rank> stride() const;
>
> // return raw pointer to the 'idx'th element
> T *index(const Dim<rank> &idx) const;
> ```
Even with the comment, I don't understand the use of this interface. Can you explain how to use it?
doc/design/tensor_design.md (Outdated)

> ```cpp
> }
>
> template <>
> struct Dim<1> {
> ```
```cpp
Dim(int _head) : head(_head) {}
```
fixed
> `size_` and `stride_` are `Dim` objects. Inspired by Majel, `Dim` is a struct template for indicating tensor size and element index:
We need to add the method for getting values from `Dim`.
Here I only show a few parts of `Dim`. You can check the full definition of `Dim` at https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/dim.h
> `ptr_` points to the head of the memory piece and `size_` shows its length. `owned_` marks whether the memory piece is allocated by `allocation` itself; if so, the memory will be freed when `allocation` is destructed.
>
> `Device` is something like Majel's `Place`. However, `Place` in Majel is an alias of `boost::variant`, while `Device` here is a concrete class (can be specialized to `CpuDevice` or `GpuDevice`). `CpuDevice` and `GpuDevice` are exactly Majel's `CpuPlace` and `GpuPlace`; we rename them to fit the overall naming style.
Because we already have `paddle::platform::Place`, please consider using `Place` instead of introducing `Device`.
> `Tensor` is the combination of Majel's `Buffer` and `Array`.
>
> ```cpp
> template <typename T, int Rank, typename Device>
> ```
Making `Rank` a template parameter would prevent us from resizing a tensor easily. Suppose that tensor A has dimensions <4,2,2>. It would be trivial to resize it to <2,8>: just have `Tensor::Resize(DDim new_size)` assign a new value to `Tensor::size_`, if its type is `DDim` rather than `Dim<3>`.
> ```cpp
> // check whether allocation_ is suitable for current size_
> // if not, re-allocate then return ptr_
> T* mutable_data();
> ```
I think `Device` or `Place` shouldn't be a template parameter of Tensor; instead, it should be a parameter of `Tensor::mutable_data`. Also, `T` must not be a template parameter of Tensor, but has to be a template parameter of `Tensor::mutable_data`. This flexibility is a must-have for lazy memory allocation. Here is the semantics of Tensor in my mind:
```cpp
class Tensor {
 public:
  template <typename T,  // must be a POD type
            typename = typename std::enable_if<std::is_pod<T>::value>::type>
  T* mutable_data(Place pl, DDim size) {
    if (place_ != pl) {
      paddle::memory::Free(place_, data_);
      data_ = nullptr;
    }
    if (sizeof(T) * size.product() < element_size_ * size_.product()) {
      paddle::memory::Free(place_, data_);
      data_ = nullptr;
    }
    if (data_ == nullptr) {
      element_size_ = sizeof(T);
      place_ = pl;
      size_ = size;
      data_ = memory::Allocate(place_, element_size_ * size_.product());
    }
    return static_cast<T*>(data_);
  }

  template <typename T,
            typename = typename std::enable_if<std::is_pod<T>::value>::type>
  T* mutable_data(DDim size) {
    return mutable_data<T>(paddle::framework::get_place(), size);
  }

 private:
  Place place_;          // record the place of data_.
  size_t element_size_;  // record element size of data_.
  DDim size_;            // record dimensionalities of data_.
  void* data_;           // Or use a Placeholder as in Variable to hide the
                         // type, so we could use unique_ptr here for data_.
};
```
See #2611.