
Progress bars jump around with multiprocessing when using the position kwarg #285

Closed
LankyCyril opened this issue Oct 8, 2016 · 14 comments


LankyCyril commented Oct 8, 2016

from time import sleep
from tqdm import tqdm
from multiprocessing import Pool, freeze_support

def progresser(n):         
    text = "progresser #{}".format(n)
    for i in tqdm(range(5000), desc=text, position=n):
        sleep(0.001)

if __name__ == '__main__':
    freeze_support()  # for Windows support
    L = list(range(10))
    Pool(len(L)).map(progresser, L)

When running the above script, I get the following (or similar) output. While it is live, progress bars keep changing places, spamming my entire screen with copies of themselves and empty lines.
Tested with Python 2.7.3 from Ubuntu repos, and with Python 3.5.2 from Continuum Analytics (conda).

progresser #0: 100%|█████████| 5000/5000 [00:05<00:00, 916.57it/s]
progresser #1: 100%|█████████| 5000/5000 [00:05<00:00, 916.78it/s]
progresser #3:  90%|████████ | 4487/5000 [00:04<00:00, 914.76it/s]
(venv)$ clear: 100%|█████████| 5000/5000 [00:05<00:00, 910.49it/s]
progresser #4: 100%|█████████| 5000/5000 [00:05<00:00, 910.61it/s]
progresser #6:  95%|████████▌| 4769/5000 [00:05<00:00, 923.10it/s]
progresser #6: 100%|█████████| 5000/5000 [00:05<00:00, 911.17it/s]
progresser #7: 100%|█████████| 5000/5000 [00:05<00:00, 910.28it/s]
progresser #8: 100%|█████████| 5000/5000 [00:05<00:00, 910.75it/s]
progresser #9: 100%|█████████| 5000/5000 [00:05<00:00, 911.04it/s]
progresser #5:  97%|████████▊| 4862/5000 [00:05<00:00, 920.17it/s]
progresser #5: 100%|█████████| 5000/5000 [00:05<00:00, 911.18it/s]
progresser #6:  66%|█████▉   | 3288/5000 [00:03<00:01, 909.58it/s]
progresser #5:  99%|████████▉| 4955/5000 [00:05<00:00, 922.48it/s]
progresser #6:  97%|████████▊| 4862/5000 [00:05<00:00, 920.25it/s]
progresser #1:  79%|███████▏ | 3965/5000 [00:04<00:01, 919.05it/s]
progresser #9:  90%|████████ | 4489/5000 [00:04<00:00, 914.66it/s]
progresser #6:  99%|████████▉| 4955/5000 [00:05<00:00, 922.43it/s]
progresser #7:  68%|██████   | 3378/5000 [00:03<00:01, 908.62it/s]
progresser #4:  99%|████████▉| 4953/5000 [00:05<00:00, 920.60it/s]
progresser #5:  88%|███████▉ | 4398/5000 [00:04<00:00, 919.76it/s]
progresser #4:  82%|███████▍ | 4118/5000 [00:04<00:00, 915.07it/s]


progresser #3:  77%|██████▉  | 3842/5000 [00:04<00:01, 919.17it/s]


progresser #5:  90%|████████ | 4490/5000 [00:04<00:00, 916.05it/s]

progresser #9:  71%|██████▍  | 3563/5000 [00:03<00:01, 915.19it/s]
progresser #5:  66%|█████▉   | 3288/5000 [00:03<00:01, 909.53it/s]

Similar problems, naturally, happen when using the joblib module.

P.S. Notice the (venv)$ prompt near the top of the output. That's the prompt drawn by the shell after tqdm is "done." When I execute clear, it simply scrolls down, which is why you can still see everything.


lrq3000 commented Oct 8, 2016

Hello, what interpreter are you using to run the code? The terminal? IDLE? PyCharm? Something else?

@LankyCyril

Directly in the terminal. More specifically, in these variations:
(Windows) mintty -> tmux -> bash
(Linux) xfce4-terminal -> tmux -> bash
(Linux) xfce4-terminal -> bash
(Linux) xterm -> tmux -> bash
(Linux) xterm -> bash


lrq3000 commented Oct 8, 2016

Ah, OK, so I see: the issue is that tqdm is inside the parallelized function. To work correctly, tqdm should be called along with Pool.

I'm not that experienced with parallelized Python code, but on Stack Overflow you will find a few answers about how to properly use tqdm with Pool.

We should add this to the README, I think?
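A minimal sketch of that "tqdm outside the pool" pattern, in the spirit of the Stack Overflow answers alluded to above (the names here are illustrative, not from this thread); the parent process owns a single bar and advances it as tasks complete:

from time import sleep
from multiprocessing import Pool, freeze_support
from tqdm import tqdm

def work(n):
    sleep(0.001)
    return n

if __name__ == '__main__':
    freeze_support()  # for Windows support
    pool = Pool(4)
    # tqdm wraps the iterator returned by imap_unordered, so only the
    # parent process draws a (single) bar; workers never touch the terminal
    results = list(tqdm(pool.imap_unordered(work, range(5000)), total=5000))
    pool.close()
    pool.join()

This avoids the jumping entirely, at the cost of per-task rather than per-iteration granularity.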



LankyCyril commented Oct 8, 2016

I only found answers pertaining to wrapping pool.map and/or pool.imap_unordered itself, not the loops within parallelized functions (http://stackoverflow.com/questions/32172763/, http://stackoverflow.com/questions/26637273/).

And if I look for answers regarding joblib, it's even less helpful (http://stackoverflow.com/questions/37804279).

Moreover, the tqdm README states:

position : int, optional
Specify the line offset to print this bar (starting from 0). Automatic if unspecified.
Useful to manage multiple bars at once (eg, from threads).

So I was kinda thinking this is what it's for...


lrq3000 commented Oct 8, 2016

That's what I meant: you should put tqdm outside of the pool.

But if you really want it to be inside the pool, you also can. You can set the position argument so that each bar gets its own position and does not collide with the others.

The idea would be that the position gets set to the job number in the pool. If you can instantiate each tqdm instance with its own position number, even if the bars are in parallel, it will normally work (but there can be some issues because there's no locking mechanism, so there can be race conditions, but we can fix that if necessary). But I am not experienced enough with Pool to give you a code snippet.
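A hedged sketch of that idea, not from the thread: give each pool worker (rather than each task) a fixed bar slot. The _identity attribute is an undocumented multiprocessing internal, used here purely for illustration:

from time import sleep
from multiprocessing import Pool, current_process
from tqdm import tqdm

def progresser(n):
    # pool workers are numbered (1,), (2,), ... internally; reusing that
    # number gives a stable bar position across all tasks the worker runs
    slot = current_process()._identity[0] - 1
    for _ in tqdm(range(5000), desc="worker #{}".format(slot), position=slot, leave=False):
        sleep(0.001)

if __name__ == '__main__':
    Pool(4).map(progresser, range(10))

Without a shared lock this can still garble output, which is exactly the race discussed below.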


@LankyCyril

It's probably the locking issues then... Because as far as the rest goes, that's what I'm doing: mapping the function onto a range of positions.


lrq3000 commented Oct 8, 2016

OK, then we'll take a look in the coming weeks; I'm sure this is fixable.


lrq3000 commented Oct 16, 2016

Hey @LankyCyril,

Can you please try PR #291 to see if it fixes your issue?

@LankyCyril

Hi @lrq3000!

I have commented over there. Sadly, nothing seems to have changed for me.


jrollins commented Jan 7, 2017

I'm experiencing the same problem. I believe the issue is that tqdm issues a newline once each bar is complete. When one process finishes before another, the newline causes the terminal to scroll and the tqdm bars to continue on a new line. The offset bars, I think, also somehow scroll up, which can cause any subsequent output to be printed on the same lines as remaining bars.

I'm not sure what the solution is, other than to somehow not issue any line scrolls once the progress is complete, and then scroll down all at once at the end when everything is complete. I can't seem to force a way around this right now, so any suggestions for a workaround are appreciated.


lrq3000 commented Jan 9, 2017

tqdm is not yet parallel safe. Can you guys try the new PR #329?

About the newline return problem, this might be related to my comment here. The PR won't yet fix everything, but bars should not disappear anymore if you use a lock.
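For reference, a hedged sketch of the lock-sharing pattern, assuming a tqdm version that provides the tqdm.set_lock/tqdm.get_lock classmethods introduced for this purpose; the parent creates one lock and hands it to every worker so bar writes are serialized:

from time import sleep
from multiprocessing import Pool, RLock, freeze_support
from tqdm import tqdm

def progresser(n):
    for _ in tqdm(range(5000), desc="progresser #{}".format(n), position=n):
        sleep(0.001)

if __name__ == '__main__':
    freeze_support()
    tqdm.set_lock(RLock())  # one lock shared by all bars in this process
    # pass the parent's lock to each worker via the pool initializer
    pool = Pool(10, initializer=tqdm.set_lock, initargs=(tqdm.get_lock(),))
    pool.map(progresser, range(10))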

kratsg commented Mar 28, 2017

@jrollins

I'm not sure what the solution is, other than to somehow not issue any line scrolls once the progress is complete, and then scroll down all at once at the end when everything is complete. I can't seem to force a way around this right now, so any suggestions for a workaround are appreciated.

I had this issue. I decided to use leave=False, which stopped the newlines from appearing. For a very long-running process (hours) with thousands of tasks, this seemed the better option: with N defined processes handling all M tasks, I always had N bars on the screen rather than M bars. I find that when I have too many bars, tqdm goes really crazy and skips a bunch of lines on each update (it seems to be related to the console size somehow).

kratsg commented Mar 28, 2017

The idea would be that the position gets set to the job number in the pool. If you can instantiate each tqdm instance with its own position number, even if the bars are in parallel, it will normally work (but there can be some issues because there's no locking mechanism, so there can be race conditions, but we can fix that if necessary). But I am not experienced enough with Pool to give you a code snippet.

I was able to get this somewhat working using joblib with nested tasks. I have a top-level task running like so:

  import os, tempfile
  from numpy import memmap, uint64
  from joblib import Parallel, delayed
  # num_cores, tasks and utils.do_cut are defined elsewhere
  pids = memmap(os.path.join(tempfile.mkdtemp(), 'test'), dtype=uint64, shape=num_cores, mode='w+')
  results = Parallel(n_jobs=num_cores)(delayed(utils.do_cut)(......, pids=pids) for task in tasks)

Each do_cut contains a long-running task. Initially, I wanted to pass an array of tqdm objects through and have each nested process update the corresponding one. That turns out not to be a good idea because of how the library's serialization works. Instead, I decided it was fine to create the tqdm inside do_cut:

import os
import numpy as np
import tqdm

def do_cut(*args, pids):
    # claim the first free slot in the shared PID array
    if os.getpid() not in pids: pids[np.argmax(pids == 0)] = os.getpid()
    position = np.where(pids == os.getpid())[0][0]
    ...
    for cut in tqdm.tqdm(iterable, desc='{0:s}, {1:d}, {2:d}'.format(did, position, os.getpid()),
                         total=get_n_subtasks(), position=position, leave=False):
        ...

Note that I used leave=False here, but I also maintain a list of PIDs shared with all processes. Since processes are smartly being re-used by joblib and the offsets are wildly different, I just created a shared list using np.memmap to keep track of which PIDs are in use, and each worker figures out its position in that list. That way, when a job finishes and grabs the next one, the PID remains the same, it can find its position, and it re-updates/overwrites that line in the console.

Hopefully this solution works for other people looking for something like this as well. The better way would be to somehow maintain the list of tqdm objects in the parent thread, but I can't find a way to share that correctly (even with callbacks/function serialization), so this is the way I settled on for now.

Note: this only seems to fix it in the short term. After about 10 minutes, things start jumping around again.
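A hedged sketch of the parent-side alternative he describes (not from the thread; all names are illustrative, Python 3): workers never touch tqdm and only push a task id onto a Manager queue, while the parent owns every bar and drains the queue:

import queue
from time import sleep
from multiprocessing import Manager, Pool
from tqdm import tqdm

def work(args):
    n, q = args
    for _ in range(100):
        sleep(0.001)
        q.put(n)  # report one step of task n to the parent

if __name__ == '__main__':
    with Manager() as manager:
        q = manager.Queue()  # a plain mp.Queue cannot be passed through Pool.map args
        bars = [tqdm(total=100, desc="task #{}".format(n), position=n) for n in range(4)]
        with Pool(4) as pool:
            result = pool.map_async(work, [(n, q) for n in range(4)])
            # only the parent draws; workers just enqueue progress events
            while not result.ready() or not q.empty():
                try:
                    bars[q.get(timeout=0.1)].update()
                except queue.Empty:
                    pass
        for bar in bars:
            bar.close()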

@casperdcl

I've noticed similar jumping around, but only in a Win10 command prompt (even with locks). Running the same multiprocessing script in a Linux bash terminal worked fine.

Maybe see #329.
