fit_generator using use_multiprocessing=True does not work on Windows 8.1 x64, python 3.5 #10842
Comments
It is unclear to me from this issue whether that is expected, or whether it should only occur with a standard data generator but not with a sequence. I also get
with both (generator or sequence) when using use_multiprocessing=True in Keras 2.2.0 with Python 3.6.6 under Windows 10 64-bit. I am also not sure how to do multithreading in Keras instead (the error message seems to suggest that might solve the problem, but I was unable to find any information on exactly how one would do that). |
We do mimic Windows in the unit test using the 'spawn' method. So it "should" work. |
@Dref360 Can you point me to some code I should try, which would be expected to work? Does one need to do anything specific to use the 'spawn' method? |
Try running the tests in tests/keras/utils/data_utils.py. 'spawn' (fork-exec) is the default on Windows; UNIX uses fork by default. |
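To check which start method a given Python installation uses, the standard multiprocessing module can be queried directly. The sketch below is illustrative only and is not taken from the Keras test suite; the helper names start_method_summary, square, and run_with_spawn are hypothetical. Requesting the 'spawn' context explicitly on Linux is one way to approximate how worker processes are started on Windows.

```python
import multiprocessing as mp


def start_method_summary():
    """Return the start methods this platform offers and its default."""
    return {
        "available": mp.get_all_start_methods(),
        "default": mp.get_start_method(),
    }


def square(x):
    # Must live at module level so a spawned child can import it.
    return x * x


def run_with_spawn(values):
    # Assumption: forcing 'spawn' on any OS approximates Windows behaviour.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # The guard matters under 'spawn': children re-import this module.
    print(start_method_summary())
    print(run_with_spawn([1, 2, 3]))
```

Under 'spawn', everything handed to a worker is pickled and sent to a fresh interpreter, so code that only ever ran under 'fork' can fail or hang when moved to Windows.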
I will test it from home with my proper system. I just realized that, with a sequence generator at least, use_multiprocessing=False and workers=4 (or so) does actually do multi-threading. I had missed that because my toy example did not spend enough time in the sequence generator to make it really obvious. use_multiprocessing=True of course still hangs the system. |
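The thread-based behaviour described in the comment above can be sketched with the standard library alone. This is a stand-in, not Keras's own enqueuer: ToySequence and load_batches_threaded are hypothetical names, and the only assumption carried over from Keras is that a sequence exposes __len__ and __getitem__ so that batches can be requested by index from several threads.

```python
from concurrent.futures import ThreadPoolExecutor


class ToySequence:
    """Hypothetical stand-in for a Sequence: batches addressed by index."""

    def __init__(self, data, batch_size):
        self.data = data
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches, rounding up for a final partial batch.
        return (len(self.data) + self.batch_size - 1) // self.batch_size

    def __getitem__(self, idx):
        start = idx * self.batch_size
        return self.data[start:start + self.batch_size]


def load_batches_threaded(seq, workers=4):
    # Each index is fetched independently, so no shared iterator state
    # is needed; map() returns results in index order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(seq.__getitem__, range(len(seq))))


if __name__ == "__main__":
    print(load_batches_threaded(ToySequence(list(range(10)), 4)))
```

Because every batch is addressed by index, the threads never share an iterator, which is why an index-based sequence is easier to parallelise than a plain generator.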
@Dref360 Sorry, finally got around to trying it. What file / filename do I need to provide in the data_utils.py code? I assume some filename needs to go into where it says |
Any suggestions from anyone how to test this on a windows system (I'm honestly not clear on what files are needed for the test script @Dref360 pointed to)? |
pytest tests/keras/test_multiprocessing.py tests/keras/utils/
Follow the CONTRIBUTION.md for your setup. |
This still seems to be an issue. With use_multiprocessing=True, it just hangs and nothing happens. I am running it on Windows 10. Setting workers to a number greater than 1 seems to improve the speed even with use_multiprocessing=False. Why does this setup improve it? Additionally, I wonder if one needs to make one's class generator (with I even asked a question related to this topic (regarding how things should work on Windows 10) on Stack Overflow. |
@Dref360 Couldn't the keras team update |
What would be the purpose of this Lock? |
@Dref360 To overcome the error on Windows |
@Dref360 I create a generator class that extends
and uses it in its
When I try to use my generator and pass it to My initial thought was to have
and lastly update the
Then, per the above mentioned StackOverflow answer, Iterator would inherit its lock from Sequence, but I think we would run into the same problem because Also, fyi, a separate class that just holds a threading lock doesn't work either (i.e. Have |
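The exact error is not reproduced in this thread. One general fact that bears on the discussion is that a threading.Lock cannot be pickled, and under the 'spawn' start method objects sent to worker processes are pickled. The sketch below shows a common Python pattern for objects that hold a lock: drop the lock in __getstate__ and recreate it in __setstate__. LockedCounter and is_picklable are hypothetical names for illustration; this is not the Keras Iterator and not a fix endorsed by the maintainers.

```python
import pickle
import threading


class LockedCounter:
    """Hypothetical iterator-like object that guards its index with a lock."""

    def __init__(self):
        self.index = 0
        self.lock = threading.Lock()

    def next_index(self):
        with self.lock:
            value = self.index
            self.index += 1
            return value

    def __getstate__(self):
        # Copy the instance dict and leave out the unpicklable lock.
        state = self.__dict__.copy()
        del state["lock"]
        return state

    def __setstate__(self, state):
        # Restore the attributes and create a new lock for this process.
        self.__dict__.update(state)
        self.lock = threading.Lock()


def is_picklable(obj):
    """Return True if pickle.dumps succeeds on obj."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


if __name__ == "__main__":
    print(is_picklable(threading.Lock()))
    print(is_picklable(LockedCounter()))
```

Note the trade-off: each unpickled copy gets its own lock and its own index, so the lock only coordinates threads inside one process and no longer protects state across processes.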
@Dref360 Dumb question, but is there any way the thread locking could be moved to a function outside the |
Same here..! any solutions? |
PRs are welcome. I cannot work on this issue as I do not use Windows. |
I think the threading lock in the multiprocess module is unused. Can I remove the threading lock in the class Sequence? |
Sorry, but I don't see which Lock you're talking about.
In the class Sequence there is no lock: https://github.com/keras-team/keras/blob/master/keras/utils/data_utils.py#L305.
We do have a Lock held by the _SEQUENCE_COUNTER:
https://github.com/keras-team/keras/blob/master/keras/utils/data_utils.py#L450
and internally there is a Lock inside the Queue. Both of these are really important.
Could you point me to the code you're referring to?
|
I'm sorry, the threading lock in the class Iterator. I mean the multiprocess module doesn't use the thread, it uses the process. |
I guess this is used when we do not use the OrderedEnqueuer, because we cannot iterate a generator from two different threads. If it really solves your problem, we can create the lock lazily in next.
On 6 August 2019 at 11:42:23, txyugood wrote:
I'm sorry, the threading lock in the class Iterator.
https://github.com/keras-team/keras-preprocessing/blob/master/keras_preprocessing/image/iterator.py#L43.
I mean the multiprocess module doesn't use the thread, it uses the process.
So is the threading lock necessary in the process?
Can I remove it, when I used the multiprocess on Windows?
|
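The idea of creating the lock lazily in next, as suggested above, could look roughly like the following. This is a minimal sketch under stated assumptions, not the keras-preprocessing Iterator: LazyLockIterator and its _lock attribute are hypothetical, and the point is only that an instance holds no lock until the first call to next, so a freshly constructed instance has nothing unpicklable in its state.

```python
import threading


class LazyLockIterator:
    """Hypothetical iterator that creates its lock on first use."""

    def __init__(self, n):
        self.n = n
        self.index = 0
        self._lock = None  # no lock yet, so the initial state is plain data

    def __iter__(self):
        return self

    def __next__(self):
        if self._lock is None:
            # Assumption: the first call happens before threads start
            # sharing the iterator; otherwise this check itself can race.
            self._lock = threading.Lock()
        with self._lock:
            if self.index >= self.n:
                raise StopIteration
            value = self.index
            self.index += 1
            return value


if __name__ == "__main__":
    it = LazyLockIterator(3)
    print(it._lock is None)
    print(list(it))
```

Once next has been called, the instance holds a lock again, so this only helps if the object is handed to worker processes before iteration begins.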
Still no solution? |
@evrial @txyugood @mchaniotakis I don't think there is a solution, unfortunately. TensorFlow has historically been built for Linux OSs. The |
I have a proposed "solution" that may interest others. Note that this is not a direct solution to the problem, but I believe it is a useful workaround. It comes from my experience with TensorFlow 1.15 (I have yet to use version 2). Please also see the Stack Overflow question "Is the class generator (inheriting Sequence) thread safe in Keras/Tensorflow?"
TL;DR
Install
NOTE: The Windows Subsystem for Linux (WSL) version 2 is only available in Windows 10, Version 1903, Build 18362 or higher. Be sure to upgrade your Windows version in Windows Update to get this to work.
Long Answer
For
Linux supports
The reason Windows hangs when using
On Windows,
At this point, you may be asking yourself: "Wait... What about the Python Global Interpreter Lock (GIL)? If Python only allows one thread to run at a time, why does it even have the
The answer lies in the difference between
In programming, when we say two tasks are
So, the GIL prevents threads from running in parallel, but not concurrently. The reason this is important for TensorFlow is that concurrency is all about I/O operations (data transfer). A good dataflow pipeline in TensorFlow should try to be
IMPORTANT ASIDE: The
|
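The point above, that the GIL prevents threads from running in parallel but not concurrently, can be illustrated with a small standard-library experiment. In this sketch time.sleep stands in for blocking I/O such as reading a batch from disk (sleeping releases the GIL); fake_io, run_sequential, and run_threaded are hypothetical names, and the timings will vary by machine.

```python
import threading
import time


def fake_io(delay, results, i):
    # Stand-in for blocking I/O: the GIL is released while sleeping.
    time.sleep(delay)
    results[i] = i


def run_sequential(n, delay):
    """Run n fake I/O tasks one after another; return (seconds, results)."""
    results = [None] * n
    start = time.perf_counter()
    for i in range(n):
        fake_io(delay, results, i)
    return time.perf_counter() - start, results


def run_threaded(n, delay):
    """Run n fake I/O tasks in n threads; return (seconds, results)."""
    results = [None] * n
    threads = [
        threading.Thread(target=fake_io, args=(delay, results, i))
        for i in range(n)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, results


if __name__ == "__main__":
    print("sequential:", run_sequential(4, 0.1)[0])
    print("threaded:  ", run_threaded(4, 0.1)[0])
```

The threaded run finishes in roughly the time of one wait rather than the sum of all waits, which is consistent with the earlier observation in this thread that workers greater than 1 helps even without multiprocessing. CPU-bound preprocessing in pure Python would not see the same gain from threads.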
I am still able to reproduce the issue with Python 3.9.0 and TensorFlow 2.6.0 on Windows 10. I tried WSL 2, but the speedup relative to Windows 10 without multiprocessing was just 20%. Is there any alternative or solution today? |
I am trying to get this same option to work on macOS Monterey. While other Python packages are able to use multiprocessing, I see no improvement at all with this option. I am running 8 cores and TensorFlow 2.6. I am using Keras Sequential models. I have:
And there is zero difference between having this on or off. |
Dear Keras community,
I have been using Keras successfully for many tasks.
After implementing a custom data generator using the Keras Sequence class, I tried using the use_multiprocessing=True option of the fit_generator function with more than 1 worker (so data can be fed to my GPU).
Unfortunately, after testing this setup on 3 different machines, the code seems to work only on Linux (even with a different GPU).
Is this the expected behaviour on a Windows machine?
Kind regards,