This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

The conflict between MXNet and OpenCV #8569

Closed
wkcn opened this issue Nov 7, 2017 · 12 comments

Comments

@wkcn
Member

wkcn commented Nov 7, 2017

Hi, there.

I found the cause of the conflict between MXNet and OpenCV.

Environment info

Operating System: Arch Linux 4.13.6
MXNet: 3f37577 (Date: Tue Nov 7 02:13:07 2017 +0800)
OpenCV: 3.3.1
Python: 2.7.14/3.6.3
GCC: 6.3.1 20170109
Build config: make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas

Steps to reproduce

  1. I built the MXNet core shared library with make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas.
    make/config.mk was left at its defaults.
    The build succeeded.
  2. Then I tried to install the MXNet Python binding:
	cd python
	sudo python setup.py install

It failed with this error:

*** Error in `python': free(): invalid pointer: 0x000055ec46fe1520 ***

What I have tried to solve it

I deleted all "import cv2" in $(MXNET_PATH)/python/mxnet/{recordio.py, image/{detection.py, image.py}}

Then I ran two tests in the folder $(MXNET_PATH)/python/.

➜  python git:(master) ✗ python 
Python 3.6.3 (default, Oct 24 2017, 14:48:20) 
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet
>>> import cv2
*** Error in `python': free(): invalid pointer: 0x0000564c7470d520 ***
[1]    116917 abort (core dumped)  python

➜  python git:(master) ✗ python
Python 3.6.3 (default, Oct 24 2017, 14:48:20) 
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> import mxnet
[src/tcmalloc.cc:283] Attempt to free invalid pointer 0x5568689f8fc0 
[1]    116946 abort (core dumped)  python
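
The two sessions above can be repeated without losing the parent interpreter by running each import order in a child process. This is a minimal sketch, not part of the original report: the helper name `try_import_order` is hypothetical, and it assumes only that a crashing child exits with a non-zero return code (negative on POSIX when it is killed by a signal such as SIGABRT).

```python
import subprocess
import sys


def try_import_order(modules, python=sys.executable):
    """Import `modules` in order in a fresh interpreter; return its exit code.

    0 means every import succeeded. On POSIX, a negative value means the
    child was killed by a signal (for example -6 for SIGABRT).
    """
    code = "; ".join("import " + name for name in modules)
    result = subprocess.run(
        [python, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode


if __name__ == "__main__":
    # Both orders from the report; each runs in its own process.
    for order in (["mxnet", "cv2"], ["cv2", "mxnet"]):
        print(order, "->", try_import_order(order))
```

A return code of 1 usually means an ordinary Python exception (for example, a module that is not installed) rather than an allocator crash.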

src/tcmalloc.cc is part of gperftools, so I think there is a conflict between gperftools and OpenCV (cv2).

I set USE_GPERFTOOLS = 0 and USE_JEMALLOC = 0 in $(MXNET_PATH)/make/config.mk and rebuilt MXNet.
That solved the problem.

I think the reason is that gperftools or jemalloc replaces the memory allocator, including malloc, whereas python-opencv uses the default allocator.

Some pointers are shared between MXNet and OpenCV, but memory allocated by one allocator (gperftools, jemalloc, glibc) cannot be freed by another.
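
One way to check which replacement allocator a process has actually loaded is to look at its memory map. The sketch below is an illustration under assumptions, not something from the original report: it is Linux-only (it reads /proc/self/maps), the function name `allocators_in_maps` is hypothetical, and it only detects allocators loaded as shared libraries whose file names contain "tcmalloc" or "jemalloc". A statically linked allocator would not show up.

```python
import os

# Substrings that identify the replacement allocators named in this thread.
ALLOCATOR_MARKERS = ("tcmalloc", "jemalloc")


def allocators_in_maps(maps_text):
    """Return the sorted allocator markers found in /proc/<pid>/maps text."""
    found = set()
    for line in maps_text.splitlines():
        fields = line.split()
        # A file-backed mapping has the file path as its sixth field.
        if len(fields) < 6:
            continue
        basename = os.path.basename(fields[5])
        for marker in ALLOCATOR_MARKERS:
            if marker in basename:
                found.add(marker)
    return sorted(found)


if __name__ == "__main__":
    # Import only one of the two libraries first, then inspect this process.
    # import mxnet  # or: import cv2
    if os.path.exists("/proc/self/maps"):
        with open("/proc/self/maps") as maps_file:
            print(allocators_in_maps(maps_file.read()))
```

Running it once after importing only mxnet and once after importing only cv2 shows whether the two libraries pull in different allocators.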

Solutions

There are two ways to use MXNet and OpenCV together:

    1. Use python-opencv with the built-in memory allocator, and set USE_GPERFTOOLS = 0 and USE_JEMALLOC = 0 in $(MXNET_PATH)/make/config.mk. Then rebuild MXNet, or install MXNet with pip.
    2. Rebuild python-opencv with the same memory allocator as MXNet, for example both python-opencv and MXNet with the gperftools allocator.
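
For the first option, the two flags can be edited by hand. If the change needs to be scripted, a minimal sketch might look like the following. The helper `set_make_flags` is hypothetical, and the example text is a made-up fragment rather than the real contents of make/config.mk.

```python
import re


def set_make_flags(config_text, flags):
    """Return config_text with each `NAME = value` assignment overridden.

    Existing assignments are rewritten in place; flags that are not
    assigned anywhere are appended at the end.
    """
    lines = config_text.splitlines()
    seen = set()
    for index, line in enumerate(lines):
        # Matches `NAME =`, `NAME :=` and `NAME ?=`; comments do not match.
        match = re.match(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*[:?]?=", line)
        if match and match.group(1) in flags:
            name = match.group(1)
            lines[index] = "{} = {}".format(name, flags[name])
            seen.add(name)
    for name, value in flags.items():
        if name not in seen:
            lines.append("{} = {}".format(name, value))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up fragment for illustration only.
    example = "USE_OPENCV = 1\nUSE_GPERFTOOLS = 1\nUSE_JEMALLOC = 1\n"
    print(set_make_flags(example, {"USE_GPERFTOOLS": 0, "USE_JEMALLOC": 0}))
```

MXNet still has to be rebuilt after the flags change.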
@szha
Member

szha commented Nov 7, 2017

I've seen the same issue when building with CUDA 9 + cuDNN 7 with the gperftools option on.

@wkcn
Member Author

wkcn commented Nov 7, 2017

I think the reason is that gperftools or jemalloc replaces the memory allocator, including malloc, whereas python-opencv uses the default allocator.

Some pointers are shared between MXNet and OpenCV, but memory allocated by one allocator (gperftools, jemalloc, glibc) cannot be freed by another.

@KellenSunderland
Contributor

Out of curiosity, does the crash occur if you import cv2 after mxnet?

@wkcn
Member Author

wkcn commented Nov 10, 2017

@KellenSunderland Yes, the crash also occurs when cv2 is imported after an mxnet built with gperftools.
When I set USE_GPERFTOOLS = 0 and USE_JEMALLOC = 0 and rebuild mxnet without gperftools, there is no crash.
USE_GPERFTOOLS = 1 is the default setting. If gperftools is not installed on the machine, the build does not use it.

@KellenSunderland
Contributor

@wkcn Thanks for the info. Your description of the relation between the USE_GPERFTOOLS flag and the crash is clear. I asked because I saw other, similar reports where the crash depended on the order in which OpenCV and the gperftools-using library (mxnet in this case) were initialized.

@wkcn
Member Author

wkcn commented Nov 10, 2017

@KellenSunderland Thank you!

@eLvErDe

eLvErDe commented Nov 15, 2017

Hi there,

I can confirm that setting USE_GPERFTOOLS=0 fixes the issue (it was a double free for me). I kept USE_JEMALLOC=1, and both modules load together in any order.

Adam.

@domschl

domschl commented Apr 28, 2018

USE_JEMALLOC=1 didn't work for me with OpenCV. I had to set both USE_GPERFTOOLS=0 and USE_JEMALLOC=0, otherwise the Python setup fails.

Maybe this should be reopened?
This affects many people building from source with defaults. (see e.g. Arch AUR)

Possible solutions:

  • change the build so that USE_GPERFTOOLS=0 and USE_JEMALLOC=0 are forced if an external OpenCV is used.
  • include OpenCV in the build tree, and build it with the appropriate allocators

as already suggested by @wkcn.

@tequilaguru

You can use LD_PRELOAD to work around this in the meantime.

@wkcn
Member Author

wkcn commented Jul 11, 2018

@tequilaguru Thank you! I will try it next time.

@idealboy

I have hit the same problem. I built mxnet-1.2.1 with cuda-9.1 and cudnn7.1 with USE_GPERFTOOLS=1. When I copied libmxnet.so to another machine to run inference, it complained that libtcmalloc.so.4 was missing. After I installed gperftools with yum, the problem occurred when I imported mxnet in Python.

@tequilaguru

It should work with LD_PRELOAD, with either tcmalloc or jemalloc.
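
The LD_PRELOAD suggestion can be tried from Python by starting a child interpreter with the variable set. As I understand the suggestion, preloading the allocator makes the dynamic linker load it before any other library, so every malloc and free in the process goes to the same allocator. The sketch below is only an illustration: the helper names are hypothetical, and the library path is an assumption that must be replaced with wherever libtcmalloc or libjemalloc is installed on your system.

```python
import os
import subprocess
import sys


def preload_env(library, base_env=None):
    """Return a copy of base_env with `library` placed first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    env["LD_PRELOAD"] = library if not existing else library + ":" + existing
    return env


def run_preloaded(library, code, python=sys.executable):
    """Run `code` in a new interpreter that preloads `library`."""
    result = subprocess.run([python, "-c", code], env=preload_env(library))
    return result.returncode


if __name__ == "__main__":
    # Hypothetical path; adjust to your installation.
    lib = "/usr/lib/libtcmalloc.so.4"
    if os.path.exists(lib):
        print(run_preloaded(lib, "import mxnet; import cv2"))
```

LD_PRELOAD has to be set before the interpreter starts, which is why the sketch launches a new process instead of changing os.environ in the running one.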
