Multiprocessing and tqdm in jupyter notebooks. #485
I'm not sure what the culprit is, but parallel bars are quite tricky; on Linux it is usually transparent. What you can try is to define a lock in your Jupyter cell and provide it to tqdm — if this fixes the issue, it would confirm my hypothesis above.
Can confirm that the same problem arises when explicitly providing a lock, using the Windows-supported code:

```python
from __future__ import print_function
from time import sleep
from multiprocessing import Pool, freeze_support, RLock
from tqdm import tqdm, tqdm_notebook

L = list(range(9))

def progresser(n):
    interval = 0.001 / (len(L) - n + 2)
    total = 5000
    text = "#{}, est. {:<04.2}s".format(n, interval * total)
    # NB: ensure position>0 to prevent printing '\n' on completion.
    # `tqdm` can't automate this since this thread
    # may not know about other bars in other threads (#477).
    for _ in tqdm_notebook(range(total), desc=text, position=n + 1):
        sleep(interval)

if __name__ == '__main__':
    freeze_support()  # for Windows support
    p = Pool(len(L),
             initializer=tqdm.set_lock,
             initargs=(RLock(),))  # provide a lock explicitly
    p.map(progresser, L)
    print('\n' * len(L))
```
I don't think so; I ran the above code on Linux (in a Jupyter notebook).
I found a strange hack to work around your issue. Example here:

```python
from time import sleep
from multiprocessing import Pool, freeze_support
from tqdm import tqdm_notebook as tqdm

def progresser(n):
    # This line is the strange hack
    print(' ', end='', flush=True)
    text = "progresser #{}".format(n)
    for i in tqdm(range(5000), desc=text, position=n):
        sleep(0.001)

if __name__ == '__main__':
    freeze_support()
    L = list(range(10))
    print()
    Pool(2).map(progresser, L)
```

The `print(' ', end='', flush=True)` line is the one I found that works while modifying the output as little as possible.
I can't believe that hack works, but it does for Jupyter! Wish I had found this thread 6 hours ago!
I'm sorry if I'm off topic here, but is it possible to have just one progress bar that accounts for the execution of all of the workers? For example, my pool would have more than 65000 workers, and I would like my progress bar to update every time a worker finishes executing. Unfortunately, each worker creates its own progress bar on top of the others, and what I see is a progress bar stuck at 1/65000.
btw check out |
Use interprocess communication to collect progress from all processes and show it in one main progress bar.
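For the single-bar request above, here is a minimal sketch (my own, not code from this thread): keep one `tqdm` bar in the parent process and iterate over `Pool.imap_unordered`, so the bar ticks once per completed task and the workers never write to the notebook at all.

```python
from multiprocessing import Pool, freeze_support
from tqdm import tqdm  # in a notebook, import from tqdm.notebook instead

def work(n):
    # Placeholder task; replace with the real per-item computation.
    return n * n

if __name__ == '__main__':
    freeze_support()
    tasks = list(range(1000))
    with Pool(4) as pool:
        # The parent owns the only bar; each finished task advances it.
        results = list(tqdm(pool.imap_unordered(work, tasks),
                            total=len(tasks)))
```

Note that `imap_unordered` yields results in completion order, not input order; return `(index, value)` pairs from the worker if order matters.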
...
Excellent hack, thanks!
Seems `position` is ignored now.
Summary: Pull Request resolved: #1383 When implementing parallel inference in D34574082 (facebookresearch@dc066af), we added a hack to fix the issue where [Jupyter fails to render progress bars from a subprocess](tqdm/tqdm#485) by flushing `stdout` with a space for each chain of inference. Thinking that printing an extra space wouldn't be too bad in general, I didn't set a condition on when to run the snippet. However, it turns out that when using a non-standard stdout (e.g. within the VSCode Jupyter plugin), this single line of `print` can lead to [a ton of empty output](https://app.reviewnb.com/facebookresearch/beanmachine/pull/1376/discussion/). While I haven't figured out what causes the issue or whether there's a better alternative for fixing the progress bar in Jupyter, one thing we can do for now is to run the hacky snippet only when necessary — i.e. when a chain is being run in a subprocess, and within a Jupyter notebook. Reviewed By: jpchen Differential Revision: D34841233 fbshipit-source-id: 5b97cef298f7a451ac117c51a21ceb7eadcaa84d
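That gating could be sketched as follows (a hypothetical sketch under my own assumptions — the function names and the Jupyter-detection heuristic are mine, not beanmachine's actual implementation). Since a spawned child cannot see the parent's IPython shell, the parent detects Jupyter and forwards the flag through the pool initializer:

```python
import multiprocessing as mp

_USE_FLUSH_HACK = False  # per-process flag, set by the pool initializer

def parent_in_jupyter():
    # Heuristic detection, assumed for illustration: inside a Jupyter
    # kernel, IPython is importable and get_ipython() returns a shell.
    try:
        from IPython import get_ipython
        return get_ipython() is not None
    except ImportError:
        return False

def _init_worker(use_hack):
    # Runs once in every child process; records the parent's detection.
    global _USE_FLUSH_HACK
    _USE_FLUSH_HACK = use_hack

def run_chain(n):
    if _USE_FLUSH_HACK:
        # Emit the space-flush hack only when we are a subprocess
        # rendering into a Jupyter frontend.
        print(' ', end='', flush=True)
    return n  # placeholder for the actual per-chain inference work

if __name__ == '__main__':
    with mp.Pool(2, initializer=_init_worker,
                 initargs=(parent_in_jupyter(),)) as pool:
        pool.map(run_chain, range(4))
```

Run outside a notebook, the flag stays `False` and nothing extra is printed, which is the behavior the commit message above is after.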
I'm trying to use tqdm along with multiprocessing.Pool in a notebook, and it doesn't quite seem to render correctly. The general problem appears to be well documented in Issue #407 and Issue #329, but neither of the fixes appears to have percolated into the notebook code. In particular, the "canonical example" in Issue #407 works for me on the command line, but when I move to a Jupyter notebook and replace

```python
from tqdm import tqdm
```

with

```python
from tqdm import tqdm_notebook as tqdm
```

I get something like the following. Changing the number of workers in the Pool yields different results, but the results are consistent (e.g. running with Pool(4) always shows progressers 1, 6, and 7).
During times when no progress bar is updating, I see the following error message repeatedly appearing on the terminal where the Jupyter notebook is running.
For reference, this is the same issue noted here, a different project that is making use of tqdm internally. The terminal printing stops when progressbars are successfully printing in the notebook.
I have tested with both tqdm version 4.19.4 (the current version on pip) and the current master (installed using `pip install -e git+https://github.com/tqdm/tqdm.git@master#egg=tqdm`). I have tested on both Linux (4.9.34-gentoo) and OS X (High Sierra 10.13.1). My jupyter version is 4.4.0 on both Linux and OS X.