
Fix parallel bars printing race issue by using locks #291

Closed
wants to merge 1 commit into from

Conversation

lrq3000
Member

@lrq3000 lrq3000 commented Oct 16, 2016

Should fix #285 by using locks: there is one lock for multiprocessing and one for threading, so this should fix the issue in either case.
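The two-lock idea could be sketched as a small wrapper that acquires both kinds of locks (DualLock is a hypothetical name for illustration, not this PR's actual implementation):

```python
import multiprocessing
import threading

class DualLock(object):
    """Hypothetical wrapper: acquire both a threading lock and a
    multiprocessing lock, so writes stay safe in either scenario."""
    def __init__(self):
        self.th_lock = threading.Lock()
        self.mp_lock = multiprocessing.Lock()

    def __enter__(self):
        self.th_lock.acquire()
        self.mp_lock.acquire()
        return self

    def __exit__(self, *exc):
        self.mp_lock.release()
        self.th_lock.release()

lock = DualLock()
with lock:
    print("guarded write")  # any bar output would go here
```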

Canonical example:

from time import sleep
from tqdm import tqdm
from multiprocessing import Pool, freeze_support

def progresser(n):
    text = "progresser #{}".format(n)
    for i in tqdm(range(5000), desc=text, position=n):
        sleep(0.001)

if __name__ == '__main__':
    freeze_support()  # for Windows support
    L = list(range(10))
    Pool(len(L)).map(progresser, L)

Note, however, that this doesn't fully fix the issue on Windows when using too many bars at once: with 3 bars it works, but with 10 it doesn't, because the Windows console has a tendency to insert spurious newlines when too many lines are printed at once (see also this PR, which suffers from the same issue). Maybe I can also try on Windows using MSYS2, as suggested here, to fix the issue?

TODO:

  • Unit test (using squash_ctrl and a fake IO; just check that there are 10 lines at the end, each with a different number, just like the example provided).
  • Flake8

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@codecov-io

codecov-io commented Oct 16, 2016

Current coverage is 90.74% (diff: 100%)

Merging #291 into master will decrease coverage by 1.03%

@@             master       #291   diff @@
==========================================
  Files             7          7          
  Lines           535        497    -38   
  Methods           0          0          
  Messages          0          0          
  Branches         97         88     -9   
==========================================
- Hits            491        451    -40   
- Misses           43         44     +1   
- Partials          1          2     +1   

Powered by Codecov. Last update a65e347...1d397a1

Contributor

@CrazyPython CrazyPython left a comment

I'm probably not qualified to do this, but it looks OK to me.

@@ -734,6 +752,10 @@ def __iter__(self):
else smoothing * delta_t / delta_it + \
(1 - smoothing) * avg_time

# Acquire locks if parallel bars
Contributor

755-758 & 848-851 duplicate code

Member Author

Can't work around this, @CrazyPython: __iter__() duplicates update() to maximize performance.

Contributor

@lrq3000 what about __iter__ = update, or the other way around?

Member Author

@lrq3000 lrq3000 Oct 16, 2016

We tried everything we could imagine, including what you propose, and it drastically lowers the performance of __iter__().

I think there is no way around it: __iter__() is super efficient because, once launched, it runs as its own tight routine with minimal calls to external functions (see how most self attributes are stored in local variables before the for loop), so any attempt to unify __iter__ and update is bound to fail IMO (except maybe if you're a Python wizard and you dynamically rewrite the method on first call depending on whether __iter__ was called, but then I think it would be far harder to maintain than simple code duplication).
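The locals-hoisting trick described above can be illustrated with a toy class (hypothetical names, not tqdm's actual code):

```python
class MiniBar(object):
    """Toy illustration of hoisting instance attributes into locals
    before a hot loop, the way tqdm's __iter__ does."""
    def __init__(self, iterable):
        self.iterable = iterable
        self.n = 0

    def __iter__(self):
        # Hoist instance attributes into fast local variables once,
        # instead of paying an attribute lookup on every iteration.
        iterable = self.iterable
        n = self.n
        for obj in iterable:
            yield obj
            n += 1
        self.n = n  # write the final count back once, after the loop

bar = MiniBar(range(5))
print(list(bar), bar.n)  # -> [0, 1, 2, 3, 4] 5
```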

Contributor

@lrq3000

virtual routine with minimal calls to external functions

Mind blown. Where'd you learn this much about Python internals?

Contributor

@lrq3000 I believe you. But what specific places can I personally learn more?

Member Author

@CrazyPython Ah, this is not easy to tell you... My current knowledge is a mix of several things, but mainly three sources: my own desire to program tools that can be useful to me; the internet (with great resources such as StackOverflow!) to help get past issues and learn more tricks (blog posts are also good); and university courses for theoretical stuff (like concurrent programming, which is not something you can usually learn by yourself on a fun spare-time project). And I am still learning almost every day.

But really, the most important thing, I think, is just to try to make your own projects that are useful to you. This is the only way to be willing to spend hours and hours fixing problems and hitting walls: the reward is software that will save you time later on, or enable you to do something that was impossible to do manually. This is nowadays called the project-based approach to learning, but it's really not new and is the same as the old saying "practice makes perfect".

Member Author

So just find a purpose to make a program of your own, and continue. The goal is not to learn to program, but simply to achieve a program that works for your needs. In the end, you will progress a lot. And this works at any proficiency level; I am still learning every day in the languages I master the most.

Contributor

@CrazyPython CrazyPython Oct 21, 2016

@lrq3000 Ah yes, I follow this myself. Project-based learning is what I tell others. Nice to see someone else shares my opinion. :)

But I'm asking specifically: how did you learn about Python internals, specifically? trial and error? profiling?

Member Author

Ah, then it's also the same way: I had university courses on compilers, which apply to pretty much any language even if Python generates bytecode instead of a binary; then I also did my own projects, like profiling as you said (see my easy-profilers repo) and benchmarking for tqdm (see the bytecode analysis PR); and finally a lot of SO and blog posts, the most useful being the PyPy blog, which even details a tutorial for making your own Python compiler. There were also a few very interesting pure-Python optimizers with tutorials; I can dig them out if you are interested.

@lrq3000
Member Author

lrq3000 commented Oct 16, 2016

@CrazyPython Thank you, but indeed @casperdcl enabled mandatory reviews, and only repo admins can do an official review. That is a good thing, I just hope @casperdcl will have some spare time soon because issues and PRs are piling up!

@LankyCyril

LankyCyril commented Oct 19, 2016

I might be doing something wrong, but nothing has changed for me.
Here I show all the steps I'm taking in case I'm making a stupid mistake somewhere.

$ pyvenv-3.5 tqdm_test
$ source tqdm_test/bin/activate
(tqdm_test) $ pip install --upgrade pip
Collecting pip
  Using cached pip-8.1.2-py2.py3-none-any.whl
Installing collected packages: pip
  Found existing installation: pip 7.1.2
    Uninstalling pip-7.1.2:
      Successfully uninstalled pip-7.1.2
Successfully installed pip-8.1.2
(tqdm_test) $ pip install -e git+https://github.com/tqdm/tqdm@1d397a118ae91242e03f44d8a0b012afe5fa21e7#egg=tqdm
Obtaining tqdm from git+https://github.com/tqdm/tqdm@1d397a118ae91242e03f44d8a0b012afe5fa21e7#egg=tqdm
  Cloning https://github.com/tqdm/tqdm (to 1d397a118ae91242e03f44d8a0b012afe5fa21e7) to ./tqdm_test/src/tqdm
  Could not find a tag or branch '1d397a118ae91242e03f44d8a0b012afe5fa21e7', assuming commit.
Installing collected packages: tqdm
  Running setup.py develop for tqdm
Successfully installed tqdm
(tqdm_test) $ python parallels.py                                 
progresser #0:   0%|                     | 0/5000 [00:00<?, ?it/s]
                                                                 (
tqdm_test) $                                                      


progresser #2:   0%|                     | 0/5000 [00:00<?, ?it/s]




progresser #0: 100%|█████████| 5000/5000 [00:05<00:00, 921.46it/s]
progresser #3: 100%|█████████| 5000/5000 [00:05<00:00, 920.24it/s]

The only thing that I have modified from the code in #285 is that I reduced the length of L to 5.

Edit: this time on CentOS 6.6, using Python 3.5.1.

@lrq3000
Member Author

lrq3000 commented Oct 19, 2016

That's bad news. Can you try with 3 bars?

@lrq3000
Member Author

lrq3000 commented Oct 19, 2016

I ask that because either this is still a race condition and the locks do not work (which would be VERY weird), or the console is not able to manage more than 3 bars at once with our system (without curses), which would be very bad...

@lrq3000
Member Author

lrq3000 commented Oct 19, 2016

OK, I just checked whether tqdm can support lots of bars simultaneously, and yes it does:

from time import sleep
from tqdm import trange

tlist = []
for i in range(20):
    tlist.append(trange(5000, desc='bar'+str(i), position=i, leave=True))
for i in range(5000):
    for bar in tlist:
        bar.update()
    sleep(0.001)
for bar in tlist:
    bar.close()

So this means it's really the locking mechanism that does not work. I have no idea why; this is a very weird issue...

@LankyCyril

Still the same issue, with 3, and with 2.

I'm not that confident in using pip to install from pull requests -- maybe I'm not installing it correctly? Although I see the appropriate code changes in tqdm_test/src/tqdm/tqdm/_tqdm.py

@lrq3000
Member Author

lrq3000 commented Oct 19, 2016

I think I found the issue: the lock is meant to be created before instantiating the pool of parallel processes, and supplied to the pool as an argument. If it's created inside the class, it won't work.

That's bad, because we don't want the user to need to supply the lock; it should be totally automated and transparent. We need to know how other projects managed this, if any did.

@lrq3000
Member Author

lrq3000 commented Oct 19, 2016

But from Google's own projects, it seems it's not possible:

import threading

class AtomicDict(object):
  """Thread-safe (and optionally process-safe) dictionary protected by a lock.
  If a multiprocessing.Manager is supplied on init, the dictionary is
  both process- and thread-safe. Otherwise, it is only thread-safe.
  """

  def __init__(self, manager=None):
    """Initializes the dict.
    Args:
      manager: multiprocessing.Manager instance (required for process safety).
    """
    if manager:
      self.lock = manager.Lock()
      self.dict = manager.dict()
    else:
      self.lock = threading.Lock()
      self.dict = {}

So two things:

  1. We should also make the cls._instances dict thread- and multiprocess-safe.
  2. The user needs to provide a manager or a lock for multiprocessing safety; thread safety can be managed by the tqdm class directly.

So for 2, maybe we should provide a helper function to ease multiprocessing use?
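Point 1 (a lock-protected cls._instances registry) could be sketched like this (toy class, not tqdm's actual code):

```python
import threading

class RegBar(object):
    """Toy sketch of guarding a class-level instance registry,
    like tqdm's cls._instances, with a lock."""
    _lock = threading.Lock()
    _instances = set()

    def __init__(self):
        with RegBar._lock:  # serialize mutations of the shared registry
            RegBar._instances.add(self)

    def close(self):
        with RegBar._lock:
            RegBar._instances.discard(self)

a, b = RegBar(), RegBar()
print(len(RegBar._instances))  # -> 2
a.close()
print(len(RegBar._instances))  # -> 1
```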

@lrq3000
Member Author

lrq3000 commented Oct 19, 2016

An alternative appears in the pdfrankenstein project, which is very close to what we are trying to achieve (a progress bar printing on stdout).

Basically, what they do is instantiate a multiprocessing lock as a global variable. This seems to work. But is it clean in our case?

PS: And good to know, Queue is not multiprocess-safe because of its implementation design.

Also good to know: flock can be used for cross-language locking, but flock is only advisory, so programs can bypass the locks!

And this implementation detail can be handy later; saving it here for reference (differences between the Linux and Windows implementations of the multiprocessing module).

And a whole host of other examples here for multiprocessing.Lock.

@lrq3000
Member Author

lrq3000 commented Oct 25, 2016

Ah, and the official Python docs and PEPs are useful too for precise details (but I rarely use them as my first source when I need the bigger picture, because they often miss context and require prerequisite knowledge of previous CPython implementation details).

@casperdcl casperdcl force-pushed the master branch 4 times, most recently from 8cade97 to a65e347 Compare October 31, 2016 02:34
@lrq3000 lrq3000 added to-fix ⌛ In progress and removed need-feedback 📢 We need your response (question) labels Oct 31, 2016
@lrq3000
Member Author

lrq3000 commented Oct 31, 2016

Some clever SO folk (Alexey Smirnov) found the culprit: seemingly buggy management of global locks on Windows.

So I don't know what to do now. If there is no way around it, we can provide a new kwarg for the user to supply a lock.

@lrq3000 lrq3000 added this to the v5.0.0 milestone Nov 14, 2016
@casperdcl casperdcl added p2-bug-warning ⚠ Visual output bad and removed p0-bug-critical ☢ Exception rasing labels Dec 6, 2016
@lrq3000
Member Author

lrq3000 commented Dec 26, 2016

So there is no way around passing the lock from the parent to the children, simply because on Windows there is no fork, only spawn, for multiprocessing, so the children cannot access the parent's data. However, on Linux a global lock initialized in the parent would work; but since we aim for cross-platform compatibility, we need to find a workaround...

So what I propose:

  • provide a new kwarg write_lock=None so that the user can supply a thread or multiprocessing lock to tqdm at the pool's initialization
  • for Linux, provide a global lock: if write_lock is None: write_lock = global_write_lock. The only issue is that there are two kinds of locks, thread and multiprocessing... but maybe we can create a custom lock that merges both.
  • add to the documentation how to use tqdm with locks on Windows (it should be transparent on Linux)
  • add the sample code as a unit test with mp.set_start_method('spawn') to force mimicking Windows' no-fork spawning of processes; otherwise it's impossible to test on Linux.

@lrq3000
Member Author

lrq3000 commented Dec 28, 2016

Continued in #329.

@lrq3000 lrq3000 closed this Dec 28, 2016
@lrq3000 lrq3000 deleted the parallel-print-fix branch December 28, 2016 19:55
casperdcl added a commit that referenced this pull request Sep 26, 2017
fixes #285 -> #291 -> #329
fixes #422
fixes #439

fixes #323
fixes #324
fixes #334
fixes #407
fixes #418

related to:
- #97
- #143
- #331
- #361
- #384
- #385
- #417
Labels
p2-bug-warning ⚠ Visual output bad to-fix ⌛ In progress

Successfully merging this pull request may close these issues.

Progress bars jump around with multiprocessing when using the position kwarg
5 participants