Tensor (reprise) #411
Conversation
@pansk I have some questions.
@beru what do you propose to avoid the inevitable memory allocation when calling methods?

```cpp
template<typename F> Tensor unary_element_wise_operation(F f) const
{
    if (this->shape() != src.shape()) {
```
Where is the `src` variable in this `unary_element_wise_operation` method? Unlike the `binary_element_wise_operation` method, this one doesn't have a `src` argument.
Oh sh... It's supposed to read shape().
Fixed.
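To make the fix concrete, here is a minimal sketch of what the corrected check looks like: the unary case compares the result's shape against `this->shape()`, since there is no `src` operand. The `Tensor` below is a toy stand-in (public members, aggregate initialization), not the PR's actual class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy Tensor illustrating the corrected shape check in the unary case.
template <typename U>
struct Tensor {
  std::vector<std::size_t> shape_;
  std::vector<U> data_;

  const std::vector<std::size_t>& shape() const { return shape_; }

  template <typename F>
  Tensor unary_element_wise_operation(F f) const {
    Tensor res{shape_, std::vector<U>(data_.size())};
    // Compare against this->shape() -- there is no `src` in the unary case.
    assert(res.shape() == this->shape());
    for (std::size_t i = 0; i < data_.size(); ++i) {
      res.data_[i] = f(data_[i]);
    }
    return res;
  }
};
```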
```cpp
template<typename F> Tensor binary_scalar_operation(U scalar, F f) const
{
    if (this->shape() != src.shape()) {
```
Where is the `src` variable in this `binary_scalar_operation` method? Unlike the `binary_element_wise_operation` method, this one doesn't have a `src` argument.
See above.
@edgarriba What about passing… And I also wonder if all the values in the Tensor class should be targeted for processing in all circumstances.
@beru maybe it's a stupid question, but what's the difference between declaring a Tensor before the call or inside? Correct me if I'm wrong, but with these simple ops I think we can assume that all values should be targeted, since they are element-wise operations. At this first stage I don't think we have to handle the span concept, but indeed it's necessary!
```cpp
// zero-overhead version (same performance as raw pointer access.
// have an assertion for out-of-range error)
U& operator[] (const size_t index) {
    return mutable_host_data()[index];
```
There is a comment saying `zero-overhead`, but is it really so? Inside `mutable_host_data`, I see an `if` statement, so I thought the performance of this method is slower than raw pointer access.
@pansk in my original PR I solved this with an assert
https://github.com/tiny-dnn/tiny-dnn/pull/400/files#diff-54658142cc96675fb83229795a17df30R467
and tested it only in debug mode
https://github.com/tiny-dnn/tiny-dnn/pull/400/files#diff-38b61933c5832ce1f0110501b58856f4R144
Indeed, it's no longer zero overhead. My fault, I didn't remove the comment. I left the functionality here, but I would rather get rid of this accessor.
I removed the operators in next commit.
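For reference, the assert-based approach discussed above can be sketched as follows: the bounds check lives in an `assert`, so release builds (compiled with `-DNDEBUG`) strip it out and indexing is as cheap as raw pointer access, while debug builds still catch out-of-range errors. The `HostBuffer` class here is illustrative, not tiny-dnn's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch: debug-only bounds check so release builds keep raw-pointer speed.
template <typename U>
class HostBuffer {
 public:
  explicit HostBuffer(std::size_t n) : data_(n) {}

  U& operator[](std::size_t index) {
    // Compiled away under -DNDEBUG, so this is zero-overhead in release mode.
    assert(index < data_.size() && "index out of range");
    return data_.data()[index];
  }

  std::size_t size() const { return data_.size(); }

 private:
  std::vector<U> data_;
};
```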
@edgarriba If the methods declare a Tensor instance inside, calling them means memory allocation and deallocation happen frequently. If the data is small or the operations are not repeated often, it may not be a big issue, but there are situations where the data is >= 300MB and the methods are called inside a looong loop.

You've probably already understood this, but let me write it anyway: using an STL vector container is almost like using new/delete (malloc/free in C). It dynamically allocates heap memory. Repeated heap allocation/deallocation may cause heap fragmentation if the heap manager is not implemented well. Low-spec embedded systems tend not to have decent hardware/operating systems/libraries, so heavy use of dynamic heap memory can be very troublesome and should be avoided. Besides, memory allocation/deallocation/copies almost always slow down your program. They are very handy, but they come with a performance penalty.
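The point above can be shown with a tiny sketch: allocate the output buffer once outside the hot loop and reuse it, instead of constructing a fresh vector on every call. `add_into` and `run_loop` are hypothetical helper names, not tiny-dnn APIs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical out-parameter add: no allocation happens inside.
void add_into(const std::vector<float>& a, const std::vector<float>& b,
              std::vector<float>& dst) {
  for (std::size_t i = 0; i < a.size(); ++i) dst[i] = a[i] + b[i];
}

// The caller owns the allocation: one buffer, reused across all iterations.
std::vector<float> run_loop(const std::vector<float>& a,
                            const std::vector<float>& b, int iterations) {
  std::vector<float> dst(a.size());  // single allocation, outside the loop
  for (int it = 0; it < iterations; ++it) add_into(a, b, dst);
  return dst;
}
```

Had `add_into` returned a fresh `std::vector<float>` instead, the loop body would have triggered one heap allocation and deallocation per iteration.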
@beru thanks for this excellent explanation!
```cpp
Tensor add(const Tensor& src) {
    U* dst = mutable_host_data();
    const U* src1 = host_data();
    const U* src2 = src.host_data();
    for_i(true, size(), [dst, src1, src2](size_t i) {
        dst[i] = src1[i] + src2[i];
    });
    return *this;
}
```

But what if we want to overload `operator+`?

```cpp
Tensor<float_t> t1(2,2,2,2);
Tensor<float_t> t2(2,2,2,2);
Tensor<float_t> t3 = t1 + t2; // same as t3 = t1.add(t2)
```

Or an out-parameter version:

```cpp
void add(const Tensor& src1, const Tensor& src2, Tensor& dst) {
    // Do we need to force users to resize Tensor dst before the call?
}
```

So, before going further with this discussion: what features do we want to support for `Tensor`?
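One way to reconcile the two styles above, sketched with a toy `Tensor` (public `data` member, nothing like the PR's real class): keep the allocation-free three-argument `add()` as the primitive, and make `operator+` a thin convenience wrapper that pays for one allocation and delegates the arithmetic.

```cpp
#include <cstddef>
#include <vector>

// Toy Tensor for illustration only.
struct Tensor {
  std::vector<float> data;
  explicit Tensor(std::size_t n = 0) : data(n) {}
};

// The primitive: caller controls allocation; dst is resized here so users
// don't have to pre-size it (one answer to the question in the comment).
void add(const Tensor& src1, const Tensor& src2, Tensor& dst) {
  dst.data.resize(src1.data.size());
  for (std::size_t i = 0; i < src1.data.size(); ++i)
    dst.data[i] = src1.data[i] + src2.data[i];
}

// The convenience form: one allocation, then reuse of the primitive.
Tensor operator+(const Tensor& a, const Tensor& b) {
  Tensor result;
  add(a, b, result);
  return result;
}
```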
Exactly, but again, I don't know if there's any use case for this feature right now in the pipeline, and the same goes for these basic operations.
See how reshape is introduced in a basic pipeline like MNIST: https://www.tensorflow.org/versions/r0.8/tutorials/mnist/pros/index.html
Sure, but I see it more as a pre-processing operation rather than something really needed during the network computation, no? UPDATE: Could span solve this?
I think that Caffe2's and Keras's uses of reshape could give a fairly complete overview of the use cases. For span, have you seen section 1.2 of https://github.com/kokkos/array_ref/blob/master/proposals/P0009.rst?
Caffe2 uses reshape to resize the vector containing the tensor shape (note: I assume they support N-dimensional tensors), but then I'm not sure what happens to a tensor that already contains data when you trigger the reshape method.
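One plausible semantics for that case, sketched with toy types (not tiny-dnn's): reshape only rewrites the shape metadata and leaves the data buffer untouched, rejecting any new shape whose element count differs. This is only an assumption about how such a reshape could behave, not a description of Caffe2's actual implementation.

```cpp
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Toy shape type: just a list of dimensions and their product.
struct Shape {
  std::vector<std::size_t> dims;
  std::size_t count() const {
    return std::accumulate(dims.begin(), dims.end(), std::size_t{1},
                           [](std::size_t a, std::size_t b) { return a * b; });
  }
};

// Metadata-only reshape: existing data survives, element count must match.
void reshape(Shape& shape, std::vector<float>& data,
             const std::vector<std::size_t>& new_dims) {
  Shape candidate{new_dims};
  if (candidate.count() != data.size())
    throw std::invalid_argument("reshape must preserve the element count");
  shape = candidate;  // the data buffer is untouched
}
```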
@beru I agree with your considerations about the add, sub, ... methods. They shouldn't be in the class; I just left them where they were, but (as discussed with @edgarriba) they would be better outside the class, and probably with a completely different design.
@edgarriba, @bhack I wouldn't implement reshape in a first implementation; I think this decision is quite related to how you implement its connection to the graph structure.
@beru instead of iterators, I'd rather use slicing (à la valarray). Actually, when discussing with @edgarriba, I also suggested we have a look at the possibility of replacing the vector with a (possibly gsliced) valarray.
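For readers unfamiliar with the valarray-style slicing mentioned here, a small taste: `std::slice` selects a strided view of a `std::valarray` without touching the rest. The `column` helper below is a hypothetical example (pulling one column out of a row-major matrix), not code from the PR.

```cpp
#include <cstddef>
#include <valarray>

// Extract column `col` from a row-major matrix with `cols` columns,
// stored flat in a valarray. std::slice(start, size, stride) picks the
// strided elements; indexing a const valarray with a slice yields a copy.
std::valarray<float> column(const std::valarray<float>& m,
                            std::size_t cols, std::size_t col) {
  return m[std::slice(col, m.size() / cols, cols)];
}
```

The multi-dimensional generalization is `std::gslice`, which takes vectors of sizes and strides, which is what a "(possibly gsliced) valarray" backing store would rely on.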
@edgarriba: for implementing the syntax…
I didn't know about std::valarray, and after reading this Stack Overflow page, I think we should stick with std::vector or raw memory. Talking about implementing an iterator, I was just kidding. But it seems the current Tensor class does not provide an end()-like position for its inner data, so it can be a bit troublesome.
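If the host data lives in a `std::vector`, providing the missing `end()`-style access is cheap: `begin()`/`end()` can simply forward to the underlying storage, which also makes the class usable with range-for and `<algorithm>`. A minimal sketch with a toy `Tensor`, not the PR's actual class:

```cpp
#include <cstddef>
#include <vector>

// Toy Tensor exposing begin()/end() over its host data.
template <typename U>
class Tensor {
 public:
  explicit Tensor(std::size_t n) : data_(n) {}

  U* begin() { return data_.data(); }
  U* end() { return data_.data() + data_.size(); }
  const U* begin() const { return data_.data(); }
  const U* end() const { return data_.data() + data_.size(); }

 private:
  std::vector<U> data_;
};
```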
Great work :)
Today I'll try to rebase the PR.
Apparently my rebase was a mess, but it should be rebased now.
Tests are failing.
- moved checks to toDevice/fromDevice
- moved operations to non-member functions (layer_*), except fill
- fixed tests
Hope it's OK now...
yes!
Also, tests passed.
LGTM :)
Well, as usual, I made a mess with git, and apparently I can't easily push to @edgarriba's original PR.
Today I reworked @edgarriba's #400 a bit: I changed the interface slightly, added lazy allocation and lazy movement of memory, renamed the accessors to host_ptr and host_at (to clarify that they work on host memory), and implemented the generic functions for binary and unary host operations, element-wise and scalar, so it's now easier to implement functions such as add, sub, mul, div, exp, sqrt, ...
I commented out linspace, if someone has a strong opinion about that, we can still get it back in.
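To illustrate the "generic binary host operation" idea from the comment above: one template drives all the element-wise ops, so each concrete operation is a one-liner. The `layer_*` naming follows the commit note earlier in the thread; the exact signatures here are assumptions, operating on plain vectors rather than the real Tensor.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Generic binary element-wise operation over host data.
template <typename U, typename F>
std::vector<U> binary_element_wise(const std::vector<U>& a,
                                   const std::vector<U>& b, F f) {
  std::vector<U> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = f(a[i], b[i]);
  return out;
}

// Each concrete op is just a functor plugged into the generic driver.
template <typename U>
std::vector<U> layer_add(const std::vector<U>& a, const std::vector<U>& b) {
  return binary_element_wise(a, b, std::plus<U>());
}

template <typename U>
std::vector<U> layer_sub(const std::vector<U>& a, const std::vector<U>& b) {
  return binary_element_wise(a, b, std::minus<U>());
}
```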