Zeroes CSR still needs a valid row_pointer array. #7935
Conversation
The intention of the _zeros implementation was to delay memory allocation until a sparse tensor has some non-zeros. If we want to enforce that a CSR NDArray always has a valid indptr array, I believe _zeros is not the only place that violates that constraint, because a CSR NDArray is constructed with a default indptr shape of (0,):
https://github.com/apache/incubator-mxnet/blob/master/include/mxnet/ndarray.h#L140
Do you think it's possible for operators to work with indptr being empty?
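For comparison (scipy is used here purely as an illustration, not part of this PR): scipy's CSR implementation rejects an empty indptr at construction time rather than treating it as an all-zero matrix, which suggests consumers of CSR data generally assume len(indptr) == num_rows + 1.

```python
import numpy as np
from scipy import sparse

# A structurally valid "all-zero" 2x3 CSR: data and indices are empty,
# but indptr still holds num_rows + 1 entries (all zero).
zero = sparse.csr_matrix(
    (np.array([]), np.array([], dtype=np.int64), np.zeros(3, dtype=np.int64)),
    shape=(2, 3),
)
print(list(zero.indptr))     # [0, 0, 0]
print(zero.toarray().sum())  # 0.0

# An empty indptr is rejected as malformed, not interpreted as "all zeros".
try:
    sparse.csr_matrix(
        (np.array([]), np.array([], dtype=np.int64), np.array([], dtype=np.int64)),
        shape=(2, 3),
    )
except ValueError as err:
    print("rejected:", err)
```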
@reminisce
src/operator/tensor/init_op.h
Outdated
// Fill a CSR NDArray with zeros by updating the aux shape.
/*! \brief Fill a CSR NDArray with zeros by updating the aux shape
 *
 * @tparam xpu - cpu or gpu
tparam -> param
what? cryptic comment :)
tparam = template parameter. It's actually named by the IDE.
Sorry about that. Learnt a new thing today 👍
I think it's okay for operators to work on a zero csr with empty indptr.
I respectfully disagree. Maybe something like empty() should return uninitialized, but returning an invalid CSR array from zeros() and expecting the consuming function to initialize the array seems backwards and difficult to maintain. zeros() should return a CSR that appears as a matrix of zeroes. It is not doing that.
Also note that CheckAndAlloc() isn't going to fill that allocated array with zeros. And the indptr array size isn't going to change, so it doesn't need to be reallocated later when the array is used.
While an uninitialized RSP matrix is a valid all-zero matrix, this is not the case for CSR. Thus RSP returns a valid matrix and CSR does not.
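To make the RSP/CSR difference concrete, here is a minimal pure-Python sketch (hypothetical helper names, not MXNet code): reading any row of an all-zero CSR only works if indptr actually holds m + 1 entries.

```python
def zeros_csr(num_rows, num_cols):
    # A structurally valid all-zero CSR: no stored values or column
    # indices, but the row-pointer array must still have m + 1 entries.
    return {
        "shape": (num_rows, num_cols),
        "data": [],
        "indices": [],
        "indptr": [0] * (num_rows + 1),
    }

def dense_row(csr, i):
    # Row i spans data[indptr[i]:indptr[i+1]]; this lookup raises an
    # IndexError if indptr was left empty.
    start, end = csr["indptr"][i], csr["indptr"][i + 1]
    row = [0] * csr["shape"][1]
    for k in range(start, end):
        row[csr["indices"][k]] = csr["data"][k]
    return row

z = zeros_csr(3, 4)
print(dense_row(z, 2))  # [0, 0, 0, 0]
```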
I don't see any burden on memory allocation greater than a normal NDArray.zeros() allocation. It'll be less, actually, more often than not. But it's certainly not some "abnormal" memory allocation situation by any stretch of the imagination.
On Mon, Sep 18, 2017 at 9:11 PM reminisce wrote:
I agree that an all zero initialized indptr in a zero csr sounds more
correct from the definition point of view of csr and alleviates the burden
on writing client code. If memory allocation is not a concern, I'm okay
with the change.
Since the GPU version of FillZerosCsrImpl() is called from a non-cuda-compiled file, the gpu version is compiled via cuda directly in init_op.cu
I agree that we should return a CSR array with a valid indptr array from that zeros() function. Should we also update the default shape for indptr to shape (num_rows + 1) in
@eric-haibin-lin As far as I can tell, it already is (num_rows + 1):
This test case is BS. I can't even tell what's wrong on the CI build because so many errors are coming from this test.
@@ -241,64 +241,64 @@ def test_get_type():
    kvtype = 'local_allreduce_cpu'
    kv = mx.kv.create(kvtype)
    assert kv.type == kvtype

def test_invalid_pull():
?
same as before; looks like Haibin fixed it, maybe
def test_elemwise_csr_same_zeros():
    # Zeroes
    a = mx.nd.sparse.zeros('csr', (1,1))
    b = mx.nd.elemwise_add(a,a)
binary ops for csr have been disabled in #7947 (currently inferring dense output).
https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/elemwise_binary_op.h#L573
Could you add it back with FInferStorage = ElemwiseStorageType<2, 1, true, true, true> and all the necessary unit tests for CSR binary ops?
done a different way
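The kind of unit test requested above might look like this sketch, with scipy standing in for the MXNet CSR operators (the real test would use mx.nd.sparse.zeros and mx.nd.elemwise_add instead):

```python
import numpy as np
from scipy import sparse

def test_elemwise_add_zero_csr():
    a = sparse.csr_matrix((1, 1))            # all-zero CSR, analogous to mx.nd.sparse.zeros('csr', (1,1))
    b = a + a                                # elementwise add; result stays sparse
    assert sparse.issparse(b)
    assert len(b.indptr) == a.shape[0] + 1   # row-pointer invariant preserved
    assert not b.toarray().any()             # still a matrix of zeros

test_elemwise_add_zero_csr()
print("ok")
```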
// Input and output should have the same number of row pointer items (m + 1)
CHECK_EQ(output.aux_shape(csr::kIndPtr), lhs.aux_shape(csr::kIndPtr));
CHECK_EQ(output.aux_shape(csr::kIndPtr), rhs.aux_shape(csr::kIndPtr));
We should remove the SparseTempStorage class. @piiswrong suggests we register FResourceRequest for all sparse operators that require temp space.
I would have expected you to change that in your PR... Did your PR handle no CSR types?
I only refactored the fallback logic in the previous PR, and we didn't have CSR binary op tests back then.
ic
This is incredibly messy, IMHO
what is?
Adding identical FResourceRequest items for everything.
@@ -267,7 +267,7 @@ void SparseRetainOpForwardRspImpl(mshadow::Stream<xpu> *s,
  if (!input_nd.storage_initialized()
      || idx_data.Size() == 0U
      || input_nd.shape()[0] == 0) {
    FillZerosRspImpl(s, *output_nd);
This was changed on purpose. Why revert this?
It was a conflict causing a compile error, because my PR changes this function. Why was it changed to a const reference?
When resolving the conflict I used the surrounding functions as a reference:
inline void FillDnsZerosRspImpl(mshadow::Stream<xpu> *s, NDArray *dst)
void PopulateFullIdxRspImpl(mshadow::Stream<xpu> *s, NDArray *dst)
By the way, why is there so much const reference being passed around when the object is then modified? I see this everywhere. Is it because lint complains about not using a pointer for modified items rather than a non-const reference?
NDArray is a pointer type, so it's OK to modify the content of the object it points to even if the pointer is const.
I think you mean NDArray::ptr_ is a const pointer type? NDArray was a const reference.
ok, reverted
}

// Fill a CSR NDArray with zeros by updating the aux shape.
template<typename xpu>
void FillZerosCsrImpl(mshadow::Stream<xpu> *s, const NDArray& dst)
?
See above
src/ndarray/ndarray.cc
Outdated
@@ -402,7 +402,7 @@ inline void CopyFromToCsrImpl(const NDArray& from, const NDArray& to, RunContext
  // if source storage is not initialized, fill destination with zeros
  auto s = ctx.get_stream<to_xpu>();
  if (!from.storage_initialized()) {
-    op::FillZerosCsrImpl<to_xpu>(s, to);
+    op::FillZerosCsrImpl(s, const_cast<NDArray *>(&to));
const_cast should be avoided
* Fix for: apache#7920
* lint
* remove unused variable warning
* Since GPU version of FillZerosCsrImpl() is called from a non-cuda-compiled file, the gpu version is compiled via cuda directly in init_op.cu
* Trigger build
* This test case is BS. I can't even tell what's wrong on the CI build because so many errors coming from this test.
* Update test_kvstore.py
* Update CMakeLists.txt
* merge fix
* Trigger build
* Fix 'dest' item in FIllXXX calls
* Revert NDARray as pointer in Fillxxx
Fix for: #7920
Looks like the problem was introduced in 0b13631, in the function
void FillZerosCsrImpl(mshadow::Stream<xpu> *s, NDArray *dst)
Why?
A CSR matrix, even if all "zeros", must have m + 1 items in its row pointer (csr::kIndPtr) array. The current implementation leaves all aux arrays empty.
While I was here, I added a couple more assertions and also reduced memory allocation for situations where lhs and rhs are the same array.
Added a unit test.