OpenCV + Python multiprocessing breaks on OSX #5150

Open
PuchatekwSzortach opened this Issue Aug 7, 2015 · 51 comments


PuchatekwSzortach commented Aug 7, 2015

I'm trying to use OpenCV with Python's multiprocessing module, but it breaks on me even with very simple code. Here is an example:

import cv2
import multiprocessing
import glob
import numpy

def job(path):

    image = cv2.imread(path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    return path

if __name__ == "__main__":

    main_image = cv2.imread("./image.png")
    main_image = cv2.cvtColor( main_image, cv2.COLOR_BGR2GRAY)

    paths = glob.glob("./data/*")
    pool = multiprocessing.Pool()
    result = pool.map(job, paths)

    print 'Finished'

    for value in result:
        print value

If I remove the line main_image = cv2.cvtColor(main_image, cv2.COLOR_BGR2GRAY), the script works, but with it in place it doesn't, even though that line shouldn't affect the jobs processed by the pool.
All image paths lead to simple images, about 20 of them in total.
Funnily enough, the code works fine if I create the images in memory with numpy instead of reading them with imread().
I'm guessing OpenCV uses some shared variables behind the scenes that aren't protected from race conditions.

My environment: Mac OS X 10.10, OpenCV 3.0.0, Python 2.7.
The last few lines of stack trace are:

Application Specific Information:
crashed on child side of fork pre-exec

Thread 0 Crashed:: Dispatch queue: com.apple.root.default-qos
0   libdispatch.dylib               0x00007fff8668913f dispatch_apply_f + 769
1   libopencv_core.3.0.dylib        0x000000010ccebd14 cv::parallel_for_(cv::Range const&, cv::ParallelLoopBody const&, double) + 152
2   libopencv_imgproc.3.0.dylib     0x000000010c8b782e void cv::CvtColorLoop<cv::RGB2Gray<unsigned char> >(cv::Mat const&, cv::Mat&, cv::RGB2Gray<unsigned char> const&) + 134
3   libopencv_imgproc.3.0.dylib     0x000000010c8b1fd4 cv::cvtColor(cv::_InputArray const&, cv::_OutputArray const&, int, int) + 23756
4   cv2.so                          0x000000010c0ed439 pyopencv_cv_cvtColor(_object*, _object*, _object*) + 687
5   org.python.python               0x000000010bc64968 PyEval_EvalFrameEx + 19480
6   org.python.python               0x000000010bc5fa42 PyEval_EvalCodeEx + 1538

BTW, I got other OpenCV functions to crash when used with Python multiprocessing too; the above is just the smallest example I could produce that reflects the problem.

Also, I got the above algorithm (and much more complicated ones) to work in multithreaded C++ programs, using the same OpenCV build on the same machine, so I guess the issue lies on the Python bindings side.

Contributor

berak commented Aug 7, 2015

can't reproduce it on win

PuchatekwSzortach commented Aug 7, 2015

I can't reproduce it with OpenCV 2.4.9 on Ubuntu either.

PuchatekwSzortach changed the title from "OpenCV breaks when used with Python multiprocessing module" to "OpenCV + Python multiprocessing breaks on OSX" on Aug 9, 2015

Contributor

mshabunin commented Aug 13, 2015

Reproduced it with the latest OpenCV master on Linux, but instead of crashing, the processes hung. It works well if one of the cvtColor calls is commented out.

xuzhuoran0106 commented Aug 30, 2015

I moved all the OpenCV code into the subprocess function, and it works.

So it seems that when you call cv2 functions (except imread), OpenCV initializes something which cannot be passed to a subprocess.

xuzhuoran0106 commented Aug 30, 2015

So the idea is to always run OpenCV code in a subprocess.

This is the solution for you:

import multiprocessing
import glob
import numpy

def job2(path):
    # Import cv2 inside the worker so OpenCV is only initialized in the child.
    import cv2
    image = cv2.imread(path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return path

def job1(path):
    import cv2
    main_image = cv2.imread(path)
    main_image = cv2.cvtColor( main_image, cv2.COLOR_BGR2GRAY)

if __name__ == "__main__":
    # Run the "main image" work in its own process; args must be a tuple.
    p = multiprocessing.Process(target=job1, args=("./image.png",))
    p.start()
    p.join()

    paths = glob.glob("./data/*")
    pool = multiprocessing.Pool()
    result = pool.map(job2, paths)

    print 'Finished'

    for value in result:
        print value

foozmeat commented Sep 25, 2015

I just came across this bug myself. In my case I'm doing a lot of work inside of RQ subprocs, so moving the conversion to the main "thread" isn't an option. I've got two calls into cvtColor. The first one is a conversion from a PIL image and the second one converts a CV image into grayscale. I can work around the first problem with this:

        cv_image = np.array(image)
        cv_image = cv_image[:, :, ::-1].copy()

but I don't have a workaround for the second one. Is there another way to convert an image to grayscale without cvtColor?
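
One cvtColor-free way to get a grayscale image is a weighted sum of the colour channels in NumPy. The sketch below assumes a uint8 image in OpenCV's BGR channel order and uses the BT.601 luma weights (0.299 R, 0.587 G, 0.114 B). The function name bgr_to_gray is made up for illustration, and the result may differ from cvtColor by a rounding step.

```python
import numpy as np

# BT.601 luma weights, listed in BGR order to match OpenCV's channel layout.
BGR_WEIGHTS = np.array([0.114, 0.587, 0.299])


def bgr_to_gray(image):
    """Convert an H x W x 3 uint8 BGR array to an H x W uint8 grayscale array."""
    # Weighted sum over the last (channel) axis, done in floating point.
    gray = np.dot(image[..., :3].astype(np.float64), BGR_WEIGHTS)
    # Round to the nearest integer and go back to 8-bit.
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)
```

Because this is plain NumPy arithmetic, it does not go through OpenCV's parallel_for_ machinery that appears in the stack trace above.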

Contributor

mshabunin commented Sep 25, 2015

I'd suggest using concurrent.futures.ThreadPoolExecutor from Python 3.
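
A minimal sketch of that suggestion, assuming Python 3. Threads run inside the existing process, so no fork() is involved. The helper name map_in_threads and the placeholder job are illustrative only; in the scenario from this thread the job would call cv2.imread and cv2.cvtColor.

```python
from concurrent.futures import ThreadPoolExecutor


def job(path):
    # Placeholder for the real work, e.g.:
    #   image = cv2.imread(path)
    #   image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return path


def map_in_threads(func, items, max_workers=4):
    """Apply func to every item using a pool of threads, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the order of the inputs.
        return list(executor.map(func, items))


if __name__ == "__main__":
    paths = ["./data/a.png", "./data/b.png"]  # hypothetical paths
    for value in map_in_threads(job, paths):
        print(value)
```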

foozmeat commented Sep 27, 2015

I ended up switching celery to use eventlet to work around the problem.

ekolve commented Nov 30, 2015

When I run this on El Capitan with Python 3.5 and OpenCV3 the script hangs - when I comment out the main_image lines, it finishes.

pogam commented Dec 1, 2015

I have been testing PuchatekwSzortach's code on two different Ubuntu 14.04 machines, with 8 GB and 64 GB of RAM respectively. I have the same problem as others on the 8 GB machine. On the 64 GB machine, however, it works.

This does not hold up with larger code, though. When OpenCV function calls are embedded in a homebrewed function that is parallelized with a "multiprocessing" pool, the code eventually ends up with idle processors after a number of calls to the pool, and that number fluctuates from run to run.

msalvato commented Dec 1, 2015

Seeing the same issue where the script hangs if main_image = cv2.cvtColor(...) is uncommented. OSX 10.11.1, Python 3.5. Came here because I've seen the bug in other places in my codebase as well, not just with cvtColor.

ekolve commented Dec 2, 2015

Setting the start method to 'spawn' avoids the hang:

import cv2
import multiprocessing
import glob
import numpy

def job(path):

    image = cv2.imread(path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    return path

if __name__ == "__main__":

    main_image = cv2.imread("./image.png")
    main_image = cv2.cvtColor( main_image, cv2.COLOR_BGR2GRAY)

    multiprocessing.set_start_method('spawn')
    paths = glob.glob("./data/*")
    pool = multiprocessing.Pool()
    result = pool.map(job, paths)

    print('Finished')

    for value in result:
        print(value)
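
A variant of the same idea, sketched under the assumption of Python 3.4 or later: multiprocessing.get_context("spawn") returns a context object whose Pool uses the spawn start method without changing the process-wide default, which sidesteps the RuntimeError that set_start_method raises when it is called a second time. The helper name map_with_spawn and the placeholder job are illustrative.

```python
import multiprocessing


def job(path):
    # Placeholder for the OpenCV work done in each worker process.
    return path


def map_with_spawn(func, items, processes=2):
    """Run func over items in a pool whose workers are spawned, not forked."""
    # func must be a picklable, module-level function.
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(processes=processes) as pool:
        return pool.map(func, items)


# Usage: call it under the main guard, because spawned workers re-import
# the main module and must not start pools of their own while doing so.
#
#     if __name__ == "__main__":
#         print(map_with_spawn(job, ["./data/a.png", "./data/b.png"]))
```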

pogam commented Dec 2, 2015

Thanks, ekolve. Do you know if there is an equivalent for Python 2.7?

PuchatekwSzortach commented Jan 9, 2016

I tried the code with OpenCV 3.1 on OSX 10.11.2 and Python 3.4 and it worked just fine.

mfergie commented Jan 19, 2016

I'm having similar issues on OSX 10.11.2 (15C50), Python 3.5.1, OpenCV 2.4.12_2 installed via homebrew.

georgwaechter commented Mar 14, 2016

I have the same problem after upgrading my Python 2.7 code from OpenCV 2.4.11 to OpenCV 3.1. For me it is a cvtColor call which converts from BGR to HSV. Does anybody have any hints on where to look in the C++ code to start fixing this annoying problem?

georgwaechter commented Mar 16, 2016

I experienced the same problem (code hangs) now in another call: mySurfDetector.detectAndCompute

Contributor

berak commented Mar 16, 2016

@georgwaechter, there's a similar observation (C++ / multithreading / cvtColor) here

georgwaechter commented Mar 16, 2016

I'm currently testing whether this originates from the thread pool used in functions like cvtColor and detectAndCompute. In my understanding, forking (without an "exec" call afterwards, as done in Python under Linux) could be a problem, but that's only a guess. I've read that forking only clones the current thread. In that case the thread pool may wrongly assume that it has more threads than it actually has ...
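
The claim that fork() copies only the calling thread can be checked from Python on a POSIX system. The sketch below starts a few idle threads, forks, and has the child report how many threads the threading module still sees. The function name is made up for illustration, and os.fork is not available on Windows.

```python
import os
import threading


def thread_counts_across_fork(n_workers=3):
    """Return (threads seen by the parent, threads seen by the forked child)."""
    stop = threading.Event()
    workers = [threading.Thread(target=stop.wait, daemon=True)
               for _ in range(n_workers)]
    for worker in workers:
        worker.start()
    parent_count = threading.active_count()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report the thread count through the pipe and leave at once.
        os.write(write_fd, str(threading.active_count()).encode())
        os._exit(0)

    # Parent: read the child's answer, then clean up.
    os.close(write_fd)
    child_count = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    stop.set()
    for worker in workers:
        worker.join()
    return parent_count, child_count


if __name__ == "__main__":
    parent, child = thread_counts_across_fork()
    print("threads in parent:", parent)
    print("threads in child: ", child)
```

If the child reports fewer threads than the parent, any pool bookkeeping inherited from the parent no longer matches reality, which is the situation guessed at above.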

georgwaechter commented Mar 16, 2016

@berak thanks for the link.

georgwaechter commented Mar 16, 2016

OK, I think I have narrowed the problem down to the pthreads thread pool implementation: https://github.com/Itseez/opencv/blob/master/modules/core/src/parallel_pthreads.cpp

I have recompiled OpenCV with "-D BUILD_TBB=ON -D WITH_TBB=ON" and suddenly it works! With these CMake options, OpenCV uses the Intel threading library (TBB) instead of the standard pthreads. The Intel library states it has "better fork support".
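
To check which parallel backend a given build uses, cv2.getBuildInformation() returns the build configuration summary as text. The sketch below assumes that the summary contains a line starting with "Parallel framework:", which may not hold for every OpenCV version. The parsing helper and the sample text are illustrative and not taken from this thread.

```python
def parallel_framework(build_info):
    """Return the value of the 'Parallel framework:' line, or None if absent."""
    for line in build_info.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Parallel framework":
            return value.strip()
    return None


# Illustrative excerpt only; real output is much longer and varies by version.
SAMPLE = """
  Other third-party libraries:
    Use IPP:                     NO
  Parallel framework:            pthreads
"""

if __name__ == "__main__":
    # With OpenCV installed one would pass cv2.getBuildInformation() instead.
    print(parallel_framework(SAMPLE))
```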

@berak The problem described there is something else, as this problem with Python relates only to multiple processes, not to multi-threading per se.

georgwaechter commented Mar 16, 2016

Idea to solve the problem within the C++ code: register a fork handler via http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_atfork.html and reset the state of the thread pool when the process forks. The effect is that the next call to cvtColor (or any other method using the thread pool) initializes the thread pool again and thus starts the threads.

At the moment it seems that the code waits indefinitely for a condition that is never signaled, because the number of threads is not as expected after forking:
https://github.com/Itseez/opencv/blob/master/modules/core/src/parallel_pthreads.cpp#L440
The "notify_complete" method only signals the condition once all of the threads have completed their work.
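
The fork-handler idea can be illustrated in Python, where os.register_at_fork (Python 3.7+, POSIX only) plays the role of pthread_atfork. This is a toy pool written for illustration, not OpenCV's implementation: it forgets its worker threads in a forked child so that the next call recreates them lazily.

```python
import os
import queue
import threading


class LazyThreadPool:
    """Toy thread pool that discards its workers in a forked child."""

    def __init__(self, size=2):
        self.size = size
        self._reset()
        # Python's counterpart to pthread_atfork(); absent on Windows.
        # Note: registering keeps this pool alive for the process lifetime.
        if hasattr(os, "register_at_fork"):
            os.register_at_fork(after_in_child=self._reset)

    def _reset(self):
        # fork() does not copy worker threads, so drop all references to
        # them and return to the "not yet initialized" state.
        self._tasks = queue.Queue()
        self._workers = []

    @staticmethod
    def _work(tasks):
        while True:
            func, arg, out = tasks.get()
            try:
                out.put((True, func(arg)))
            except Exception as exc:
                out.put((False, exc))

    def _ensure_workers(self):
        # Lazy initialization: start threads on first use, or on the
        # first use after a fork has reset the pool.
        while len(self._workers) < self.size:
            worker = threading.Thread(target=self._work,
                                      args=(self._tasks,), daemon=True)
            worker.start()
            self._workers.append(worker)

    def map(self, func, items):
        self._ensure_workers()
        outs = []
        for item in items:
            out = queue.Queue(maxsize=1)
            self._tasks.put((func, item, out))
            outs.append(out)
        results = []
        for out in outs:
            ok, value = out.get()
            if not ok:
                raise value
            results.append(value)
        return results
```

Without the reset, a forked child would still believe it owns live workers and would block waiting for results that no thread produces, which matches the hang described above.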

georgwaechter commented Mar 16, 2016

This also explains why setting the start method to "spawn" solves the problem as described by @ekolve. Spawning starts each child as a fresh interpreter instead of continuing from a forked copy of the parent's memory, which forces a re-initialization of all data structures (such as the thread pool) within OpenCV.

georgwaechter commented Mar 16, 2016

@mshabunin Can you change the category of this issue? It is not related to the Python bindings, but to the core.

Here is another library with the same problem: flame/blis#30

I'll try to prepare a pull request for this.

papr added a commit to papr/homebrew-core that referenced this issue Oct 31, 2017

opencv: enable tbb by default
tbb is required when one wants to use opencv in a multiprocessing
application. See this opencv issue thread for details:
opencv/opencv#5150 (comment)

BrewTestBot added a commit to BrewTestBot/homebrew-core that referenced this issue Oct 31, 2017

opencv: enable tbb by default
tbb is required when one wants to use opencv in a multiprocessing
application. See this opencv issue thread for details:
opencv/opencv#5150 (comment)

ilovezfs added a commit to Homebrew/homebrew-core that referenced this issue Oct 31, 2017

opencv: depend on tbb
tbb is required when one wants to use opencv in a multiprocessing
application. See this opencv issue thread for details:
opencv/opencv#5150 (comment)

Closes #20101.

Signed-off-by: ilovezfs <ilovezfs@icloud.com>

robohack added a commit to robohack/homebrew-core that referenced this issue Nov 3, 2017

opencv: depend on tbb
tbb is required when one wants to use opencv in a multiprocessing
application. See this opencv issue thread for details:
opencv/opencv#5150 (comment)

Closes Homebrew#20101.

Signed-off-by: ilovezfs <ilovezfs@icloud.com>

onlyjus commented Jan 18, 2018

I am having the same issue on Windows 7, OpenCV 3.3.1, Python 3.6.4, from Anaconda.

cv2.setNumThreads(0) and multiprocessing.set_start_method('spawn', force=True) do not fix the issue of processes hanging. I am reading a video file in the main thread, sending frames to a Queue where those frames are then processed by multiprocessing.Process instances.

Any ideas?

tkoch96 commented Mar 22, 2018

I am using multiprocessing to speed up ARUCO target detection on a raspberry pi.
(Keep in mind I have gotten this working using no multiprocessing.)

Using this code:

workers = []
for i in range(num_workers):
    workers.append(Worker_Bee(tasks,results,i))

for worker in workers:
    worker.start()

for i in range(20):
    tasks.put(Image_Processor(vs.read()))
    time.sleep(.5)

The Image_Processor looks at the frame returned by vs.read(), which is an image, and tries to find targets in it.
The worker process running Image_Processor hangs on the following line:
corners, ids, rejectedImgPoints = aruco.detectMarkers(gray, self.aruco_dict, parameters=self.parameters)
where gray is the image from vs.read().
I've checked the arguments to aruco.detectMarkers and they seem to be correct (and as I've said before, I've run this without multiprocessing just fine).
There is no error message, and looking at top, I see the python process spawning but then quickly going away. The parent process stays.

I've seen similar problems with OpenCV hanging on cvtColor or resize, but the fixes for those (compiling with the TBB options and such) are already in place for me.

Any advice on how to get this working?

rns4731 commented Apr 6, 2018

I switched to eventlet to get this working.

jaredleach commented May 15, 2018

@tkoch96 Any chance you made any progress on the issues you were seeing? I'm running into nearly exactly the same problem you are.

tkoch96 commented May 15, 2018

No, it seems the library is designed with this inherent flaw and the developers here have insinuated it probably won't be fixed. I'm not too savvy with understanding how this low-level stuff works so I did not try to fix it myself.

If I were you I'd look for alternative solutions to speeding up processing.

jaredleach commented May 15, 2018

@tkoch96 Are you referring to the aruco library or OpenCV?

tkoch96 commented May 15, 2018

I believe I posted this as an OpenCV issue (separate to this post) and they referred me to this post and closed my issue.
I have not posted on any aruco specific forum.

burinov commented Jun 19, 2018

I had the same issue on OSX.

Thanks to @mshabunin concurrent.futures.ThreadPoolExecutor works fine for me.

0mza987 commented Jun 27, 2018

I have encountered the same issue with Python 2.7.10, OpenCV 3.2 and Ubuntu 16.04.
cv2.cvtColor(sheet_image.image, cv2.COLOR_GRAY2RGB) caused the problem: both the parent process and the child process hang forever, and the child process still exits after the parent process gets killed.

ogrisel commented Jun 27, 2018

"Thanks to @mshabunin concurrent.futures.ThreadPoolExecutor works fine for me." (quoting @burinov)

Using threads is fine and often very efficient, but it can lead to GIL contention if a significant fraction of the time is spent in pure Python code. In that case, process-based parallelism might lead to better performance.

If so, you can try loky.get_reusable_executor() to fetch a process-based, reusable executor instance that does not cause OpenCV or any other OpenMP-based library to crash. The loky executor can be used as a drop-in replacement for concurrent.futures.ThreadPoolExecutor.

More details: https://github.com/tomMoral/loky
