mx.sym.argsort() cannot sort array with large tensor #7510

Yre · 2017-08-17T08:32:27Z

For bugs or installation issues, please provide the following information.
The more information you provide, the more likely people will be able to help you.

Environment info

Operating System: ubuntu 14.04
Compiler: GCC4.4.7
Package used (Python/R/Scala/Julia): python
MXNet version: 0.11.0
MXNet commit hash : 568b5a2

Minimum reproducible example

if you are using your own code, please provide a short script that reproduces the error.

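import mxnet as mx
import numpy as np

# Symbolic argsort along the last axis.
value = mx.sym.Variable('data')
sorted_adj = mx.sym.argsort(value, axis=2)

# 32 * 2048 * 2048 = 2^27 elements, placed on the GPU.
coord_data = np.random.rand(32, 2048, 2048)
coord_blob = mx.nd.array(coord_data, mx.gpu())
e = sorted_adj.bind(mx.gpu(), {'data': coord_blob})
y = e.forward()

result = y[0].asnumpy()

# Mark every index that appears in one sorted row. A correct argsort
# returns a permutation of 0..2047, so no entry should stay unmarked.
vis = np.zeros(2048)
for i in range(2048):
    vis[int(result[4, 0, i])] = 1
cnt = 0
for i in range(2048):
    if vis[i] == 0:
        cnt += 1
print(cnt)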

Steps to reproduce

Running the code above prints a nonzero cnt. It should print 0, because argsort along an axis of length 2048 should return every integer from 0 to 2047 exactly once.
If result[4, 0, i] is changed to result[3, 0, i], the result is correct again.
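The check used in the script can be written more compactly: a correct argsort row is a permutation of 0..n-1, so no index may be missing. The following is a minimal sketch in plain Python; `missing_indices` is a hypothetical helper name, not part of MXNet or NumPy.

```python
def missing_indices(indices, n):
    """Return the sorted values in range(n) that never occur in `indices`."""
    # argsort output may be float-typed, so convert each entry to int first.
    seen = set(int(i) for i in indices)
    return [i for i in range(n) if i not in seen]


# A valid argsort result is a permutation, so nothing is missing.
print(missing_indices([2.0, 0.0, 1.0, 3.0], 4))  # []
# A corrupted result repeats some indices and drops others.
print(missing_indices([2.0, 0.0, 0.0, 2.0], 4))  # [1, 3]
```

Applied to the script above, `len(missing_indices(result[4, 0], 2048))` would play the role of cnt.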

What have you tried to solve it?

1. If coord_data has shape (10000, 2048) and the argsort axis is changed to 1, then result[8000, :] covers 0 to 2047 as expected, but result[9000, :] is incorrect again. (My guess is that this is because 8000 < 4*2048 < 9000.)
2. After several tests, it seems that argsort only handles the first 2^24 elements correctly.
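The two observations above line up once the positions are expressed as flat element offsets. The sketch below assumes a contiguous row-major (C-order) layout, which is an assumption about how the data is stored, and `flat_offset` is a hypothetical helper written only for this illustration.

```python
def flat_offset(index, shape):
    """Row-major offset of the first element addressed by a leading index."""
    offset = 0
    # Fold the given leading indices into a running offset.
    for i, dim in zip(index, shape):
        offset = offset * dim + i
    # Scale by the sizes of the trailing dimensions that were not indexed.
    for dim in shape[len(index):]:
        offset *= dim
    return offset


LIMIT = 2 ** 24

# 3-D case: slice 3 lies entirely below 2^24, slice 4 starts exactly at it.
print(flat_offset((3,), (32, 2048, 2048)) < LIMIT)   # True
print(flat_offset((4,), (32, 2048, 2048)) == LIMIT)  # True

# 2-D case: row 8000 starts below 2^24, row 9000 starts above it.
print(flat_offset((8000,), (10000, 2048)) < LIMIT)   # True
print(flat_offset((9000,), (10000, 2048)) > LIMIT)   # True
```

In both shapes the first failing region begins at flat offset 4 * 2048 * 2048 = 8192 * 2048 = 2^24.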


piiswrong · 2017-08-17T17:33:20Z

@sxjscience @reminisce

sxjscience · 2017-08-18T01:07:59Z

I've confirmed. This is a bug. I'll look into what caused it. It may be related to the way I do batched sort.

sxjscience · 2017-08-18T01:11:53Z

I'll try switching to CUB.

sxjscience · 2017-08-18T05:04:37Z

OK, I've found the problem. It's this line https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/ordering_op-inl.h#L207. When there are too many elements, the real_t type is not precise enough to store all the index values.
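The precision limit can be illustrated without MXNet. The sketch below assumes real_t is a 32-bit IEEE 754 float, which is an assumption about the build configuration. It uses the standard-library `struct` module to round integers through single precision; `to_float32` is a hypothetical helper name.

```python
import struct


def to_float32(x):
    """Round a Python number to the nearest single-precision value."""
    return struct.unpack('f', struct.pack('f', float(x)))[0]


limit = 2 ** 24

# Every integer up to 2^24 survives the round trip unchanged.
print(to_float32(limit) == limit)          # True
# 2^24 + 1 has no exact float32 representation and is rounded.
print(to_float32(limit + 1) == limit + 1)  # False
print(int(to_float32(limit + 1)))          # 16777216
```

With a 24-bit significand, distinct indices above 2^24 can collapse onto the same float value, which is consistent with the duplicated and missing indices reported in this issue.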

sxjscience · 2017-08-20T13:29:32Z

I'll open a PR with the fix after the MShadow side is merged.

Yre · 2017-08-21T08:11:21Z

Thank you so much for your timely update! @sxjscience

sxjscience mentioned this issue Aug 20, 2017

fix the launch param of reduce kernel and range dmlc/mshadow#285

Merged

sxjscience closed this as completed in dmlc/mshadow#285 Aug 21, 2017

sxjscience reopened this Aug 21, 2017

sxjscience mentioned this issue Aug 21, 2017

Fix argsort + Update MShadow #7535

Merged

piiswrong closed this as completed in #7535 Aug 21, 2017
