This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Usability degradation #16102

Closed
eric-haibin-lin opened this issue Sep 5, 2019 · 13 comments

@eric-haibin-lin
Member

$ pip install mxnet==1.6.0b20190821
➜  gluon-nlp git:(allow) ✗ python -c 'import mxnet as mx; a = mx.np.ones((10,)); print(a.reshape((1,)))'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/haibilin/miniconda3/lib/python3.7/site-packages/mxnet/numpy/multiarray.py", line 637, in reshape
    return _mx_np_op.reshape(self, newshape=args[0], order=order)
  File "<string>", line 39, in reshape
  File "/Users/haibilin/miniconda3/lib/python3.7/site-packages/mxnet/_ctypes/ndarray.py", line 100, in _imperative_invoke
    ctypes.byref(out_stypes)))
  File "/Users/haibilin/miniconda3/lib/python3.7/site-packages/mxnet/base.py", line 254, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [10:26:08] src/operator/numpy/np_matrix_op.cc:111: Check failed: src.Size() == dst->Size() (10 vs. 1) : Cannot reshape array of size 10 into shape [1]
Stack trace:
  [bt] (0) 1   libmxnet.so                         0x000000010e98a649 mxnet::op::NDArrayOpProp::~NDArrayOpProp() + 4473
  [bt] (1) 2   libmxnet.so                         0x000000010f397f61 mxnet::op::FullyConnectedComputeExCPU(nnvm::NodeAttrs const&, mxnet::OpContext const&, std::__1::vector<mxnet::NDArray, std::__1::allocator<mxnet::NDArray> > const&, std::__1::vector<mxnet::OpReqType, std::__1::allocator<mxnet::OpReqType> > const&, std::__1::vector<mxnet::NDArray, std::__1::allocator<mxnet::NDArray> > const&) + 6826753
  [bt] (2) 3   libmxnet.so                         0x000000010f39854f mxnet::op::FullyConnectedComputeExCPU(nnvm::NodeAttrs const&, mxnet::OpContext const&, std::__1::vector<mxnet::NDArray, std::__1::allocator<mxnet::NDArray> > const&, std::__1::vector<mxnet::OpReqType, std::__1::allocator<mxnet::OpReqType> > const&, std::__1::vector<mxnet::NDArray, std::__1::allocator<mxnet::NDArray> > const&) + 6828271
  [bt] (3) 4   libmxnet.so                         0x00000001103aa79f mxnet::imperative::SetShapeType(mxnet::Context const&, nnvm::NodeAttrs const&, std::__1::vector<mxnet::NDArray*, std::__1::allocator<mxnet::NDArray*> > const&, std::__1::vector<mxnet::NDArray*, std::__1::allocator<mxnet::NDArray*> > const&, mxnet::DispatchMode*) + 1583
  [bt] (4) 5   libmxnet.so                         0x00000001103a929c mxnet::Imperative::Invoke(mxnet::Context const&, nnvm::NodeAttrs const&, std::__1::vector<mxnet::NDArray*, std::__1::allocator<mxnet::NDArray*> > const&, std::__1::vector<mxnet::NDArray*, std::__1::allocator<mxnet::NDArray*> > const&) + 716
  [bt] (5) 6   libmxnet.so                         0x00000001102eadbe SetNDInputsOutputs(nnvm::Op const*, std::__1::vector<mxnet::NDArray*, std::__1::allocator<mxnet::NDArray*> >*, std::__1::vector<mxnet::NDArray*, std::__1::allocator<mxnet::NDArray*> >*, int, void* const*, int*, int, int, void***) + 1582
  [bt] (6) 7   libmxnet.so                         0x00000001102ebb00 MXImperativeInvokeEx + 176
  [bt] (7) 8   libffi.6.dylib                      0x00000001077f8884 ffi_call_unix64 + 76

But the more recent build hides the error message:

➜  gluon-nlp git:(allow) ✗ pip install mxnet==1.6.0b20190822

➜  gluon-nlp git:(allow) ✗ python -c 'import mxnet as mx; a = mx.np.ones((10,)); print(a.reshape((1,)))'

Segmentation fault: 11

Stack trace:
  [bt] (0) 1   libmxnet.so                         0x000000011255cdb0 mxnet::Storage::Get() + 7968
  [bt] (1) 2   libsystem_platform.dylib            0x00007fffb27a7b3a _sigtramp + 26
  [bt] (2) 3   ???                                 0x3419c9fe3dc10094 0x0 + 3754253858184954004
  [bt] (3) 4   libmxnet.so                         0x000000011270c1d6 mxnet::Storage::Get() + 1774406
  [bt] (4) 5   libmxnet.so                         0x0000000110109dee std::__1::__tree<std::__1::__value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, mxnet::NDArrayFunctionReg*>, std::__1::__map_value_compare<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::__value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, mxnet::NDArrayFunctionReg*>, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, true>, std::__1::allocator<std::__1::__value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, mxnet::NDArrayFunctionReg*> > >::destroy(std::__1::__tree_node<std::__1::__value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, mxnet::NDArrayFunctionReg*>, void*>*) + 1822
  [bt] (5) 6   libmxnet.so                         0x0000000110e91c61 mxnet::op::FullyConnectedComputeExCPU(nnvm::NodeAttrs const&, mxnet::OpContext const&, std::__1::vector<mxnet::NDArray, std::__1::allocator<mxnet::NDArray> > const&, std::__1::vector<mxnet::OpReqType, std::__1::allocator<mxnet::OpReqType> > const&, std::__1::vector<mxnet::NDArray, std::__1::allocator<mxnet::NDArray> > const&) + 10335393
  [bt] (6) 7   libmxnet.so                         0x0000000110e9224f mxnet::op::FullyConnectedComputeExCPU(nnvm::NodeAttrs const&, mxnet::OpContext const&, std::__1::vector<mxnet::NDArray, std::__1::allocator<mxnet::NDArray> > const&, std::__1::vector<mxnet::OpReqType, std::__1::allocator<mxnet::OpReqType> > const&, std::__1::vector<mxnet::NDArray, std::__1::allocator<mxnet::NDArray> > const&) + 10336911
  [bt] (7) 8   libmxnet.so                         0x0000000111ea50ef mxnet::imperative::SetShapeType(mxnet::Context const&, nnvm::NodeAttrs const&, std::__1::vector<mxnet::NDArray*, std::__1::allocator<mxnet::NDArray*> > const&, std::__1::vector<mxnet::NDArray*, std::__1::allocator<mxnet::NDArray*> > const&, mxnet::DispatchMode*) + 1583
  [bt] (8) 9   libmxnet.so                         0x0000000111ea3bec mxnet::Imperative::Invoke(mxnet::Context const&, nnvm::NodeAttrs const&, std::__1::vector<mxnet::NDArray*, std::__1::allocator<mxnet::NDArray*> > const&, std::__1::vector<mxnet::NDArray*, std::__1::allocator<mxnet::NDArray*> > const&) + 716
@mxnet-label-bot
Contributor

Hey, this is the MXNet Label Bot.
Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.

@sxjscience
Member

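I think we should also test that the exception is raised. We need a test like:

import mxnet.numpy as np
from mxnet.base import MXNetError

try:
    a = np.ones((10, 10))
    # 100 elements cannot fit into shape (1,), so this reshape must fail
    b = a.reshape((1,))
except MXNetError:
    pass  # expected: the failure surfaces as a Python exception
else:
    # without this branch the check would pass silently if nothing was raised
    raise AssertionError("reshape to an invalid shape did not raise MXNetError")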

@sxjscience sxjscience added Numpy and removed Numpy labels Sep 5, 2019
@reminisce
Contributor

I suspect this may be related to some variation across the nightly build platforms. I built the latest master from source on macOS, and the error message and stack trace are printed correctly. However, the latest nightly build segfaults, and its stack trace is unrelated to the code.

@anirudh2290
Member

There are already tests for exception handling: https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_exc_handling.py . The problem is that we don't test on macOS, do we? That said, this was working at some point, so as @reminisce suggests, this may be an issue only with specific builds.
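
Since the failure only shows up in particular binaries, one way to check an installed build is to run the failing one-liner in a separate interpreter and inspect how it ended. The following is a minimal sketch, not part of MXNet's test suite: `run_isolated` and `reports_error_cleanly` are hypothetical helper names, and the two snippets are stand-ins that need no MXNet install. It assumes a POSIX system, where `subprocess` reports death by signal as a negative return code. Because a native crash handler may exit with an ordinary nonzero status instead, the sketch also requires the expected message on stderr.

```python
import subprocess
import sys


def run_isolated(snippet):
    """Run `snippet` in a fresh interpreter; return (returncode, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stderr


def reports_error_cleanly(snippet, expected_message):
    """True if the child failed, was not killed by a signal, and printed
    `expected_message` on stderr (for example inside a Python traceback)."""
    returncode, stderr = run_isolated(snippet)
    killed_by_signal = returncode < 0  # POSIX convention used by subprocess
    return returncode != 0 and not killed_by_signal and expected_message in stderr


# Stand-ins: one ordinary Python error and one simulated native crash.
CLEAN = "raise RuntimeError('Cannot reshape array of size 10 into shape [1]')"
CRASH = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"

print(reports_error_cleanly(CLEAN, "Cannot reshape array"))
print(reports_error_cleanly(CRASH, "Cannot reshape array"))
```

For this issue, the snippet would be the one-liner from the report, `import mxnet as mx; a = mx.np.ones((10,)); print(a.reshape((1,)))`, with `"Cannot reshape array"` as the expected message. Running the child in its own process keeps a crash in the library from taking the test runner down with it.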

@sxjscience
Member

@anirudh2290 Have we tested the case for reshaping to an invalid shape?

@marcoabreu
Contributor

Agree, I think the issue is how we compile. Seems like we're stripping debug symbols.

@anirudh2290
Member

@sxjscience Probably not. Is this happening only with reshape?

@sxjscience
Member

I guess we failed to catch this usability degradation because we haven't tested the error message raised by reshape. I suggest adding such a check to test_exc_handling.py.
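
A check on the message itself, rather than only on the exception type, could look roughly like the sketch below. It uses the standard library's `unittest.TestCase.assertRaisesRegex`. To stay runnable without MXNet, `FakeBackendError` and `fake_reshape` are hypothetical stand-ins that mimic the message text from the report; in a real test they would be replaced by `MXNetError` and the `reshape` call from the issue.

```python
import unittest


class FakeBackendError(RuntimeError):
    """Hypothetical stand-in for mxnet.base.MXNetError."""


def fake_reshape(size, newshape):
    """Hypothetical stand-in for reshape: validates sizes like the backend check."""
    target = 1
    for dim in newshape:
        target *= dim
    if target != size:
        raise FakeBackendError(
            "Cannot reshape array of size {} into shape {}".format(size, list(newshape))
        )
    return newshape


class ReshapeErrorMessageTest(unittest.TestCase):
    def test_invalid_reshape_reports_sizes(self):
        # Fails if nothing is raised, if the type differs, or if the
        # message does not match the pattern.
        with self.assertRaisesRegex(FakeBackendError, r"Cannot reshape array of size 10"):
            fake_reshape(10, (1,))


if __name__ == "__main__":
    unittest.main()
```

Matching on a short, stable fragment of the message keeps the test from breaking when the file name or line number in the full error text changes.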

@samskalicky
Contributor

@zachgk assign [@sxjscience]

@sxjscience
Member

@eric-haibin-lin I've added the check for exception handling.

@sxjscience
Member

@anirudh2290 Do you have any thoughts on macOS CI? @zhanghang1989 has observed some other problems with MXNet on macOS. I guess the ultimate solution is to enable macOS CI.

@anirudh2290
Member

anirudh2290 commented Nov 22, 2019

@sxjscience I have never built and run MXNet on macOS! It's a good idea to add macOS to CI, though I think it's a project in itself since there may be other failures. So far, I don't know of anyone working on it.

@sxjscience
Member

@anirudh2290 Thanks... I'm thinking of trying it on my personal Mac first to see how many tests fail...
