Fix parallel bars printing race issue by using locks #291
Conversation
Signed-off-by: Stephen L. <lrq3000@gmail.com>
Current coverage is 90.74% (diff: 100%)

@@            master    #291    diff @@
=======================================
  Files            7       7
  Lines          535     497     -38
  Methods          0       0
  Messages         0       0
  Branches        97      88      -9
=======================================
- Hits           491     451     -40
- Misses          43      44      +1
- Partials         1       2      +1
I'm probably not qualified to do this, but it looks OK to me.
@@ -734,6 +752,10 @@ def __iter__(self):
                 else smoothing * delta_t / delta_it + \
                     (1 - smoothing) * avg_time

+                # Acquire locks if parallel bars
Lines 755-758 and 848-851 duplicate code.
Can't work around this @CrazyPython: __iter__() is a duplication of update() for performance maximization.
@lrq3000 what about __iter__ = update, or the other way around?
We tried everything we could imagine, including what you propose, and it drastically lowers the performance of __iter__().
I think there is no way around it: __iter__() is super efficient because once it's launched it works in its own virtual routine with minimal calls to external functions (see how most self variables are stored in the method's local variables before the for loop), so any attempt to unify __iter__ and update is bound to fail IMO (except maybe if you're a Python wizard and you dynamically rewrite the method on first call depending on whether it's __iter__ that was called or not, but then I think it would be way more complex to maintain than simple code duplication).
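The trick being described can be sketched roughly as follows. This is a simplified, hypothetical illustration (class and attribute names invented), not tqdm's actual implementation: attribute lookups on self are hoisted into local variables once, before the hot loop, so each iteration pays only cheap local-variable loads instead of attribute lookups and method calls.

```python
import time

class MiniBar:
    """Hypothetical stripped-down progress bar illustrating the trick:
    cache self.* attributes in locals before the hot loop, so the loop
    body uses cheap LOAD_FAST instead of LOAD_ATTR or method calls."""

    def __init__(self, iterable, mininterval=0.1):
        self.iterable = iterable
        self.mininterval = mininterval
        self.n = 0

    def __iter__(self):
        # Hoist instance attributes into locals once, before the loop.
        iterable = self.iterable
        mininterval = self.mininterval
        n = 0
        last_print_t = time.time()
        for obj in iterable:
            yield obj
            n += 1
            cur_t = time.time()
            if cur_t - last_print_t >= mininterval:
                # A real bar would redraw its display here.
                last_print_t = cur_t
        self.n = n  # write the final count back to the instance
```

Unifying this with a separate update() method would reintroduce an attribute lookup and a method call on every iteration, which is exactly the overhead the duplication avoids.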
"virtual routine with minimal calls to external functions"
Mind blown. Where'd you learn this much about Python internals?
@lrq3000 I believe you. But what specific places can I personally learn more?
@CrazyPython Ah, this is not easy to tell you... My current knowledge is a mix of several things, but mainly three sources: my own desire to program tools that can be useful to me; the internet (with great resources such as StackOverflow!) to help get past issues or learn more tricks (blog posts are also good); and university courses for theoretical stuff (like concurrent programming, which is not something you can usually learn by yourself on a fun spare-time project). And I am still learning almost every day.
But really, the most important thing I think is just to try to make your own projects, ones that are useful to you. This is the only way to be willing to spend hours and hours fixing problems and hitting walls: the reward is software that will save you time later on, or enable you to do something that was impossible to do manually. This is nowadays called the project-based approach to learning, but it's really not new and is the same as the old saying "practice makes perfect".
So just find a purpose to make a program of your own, and continue. The goal is not to learn to program, but simply to achieve a program that works for your needs. In the end, you will progress a lot. And this works at any proficiency level; I am still learning every day in the languages I master the most.
@lrq3000 Ah yes, I follow this myself. Project-based learning is what I tell others. Nice to see someone else shares my opinion. :)
But I'm asking specifically: how did you learn about Python internals, specifically? trial and error? profiling?
Ah, then it's also the same way: I had university courses on compilers, which apply to pretty much any language even if Python generates bytecode instead of a binary. Then I also did my own projects like profiling, as you said (see my easy-profilers repo), and benchmarking for tqdm (see the bytecode analysis PR). And finally a lot of SO and blog posts, the most useful being the PyPy blog; they even detail a tutorial to make your own Python compiler. There were also a few very interesting pure-Python optimizers with tutorials; I can dig them out if you are interested.
@CrazyPython Thank you, but indeed @casperdcl enabled mandatory reviews, and only repo admins can do an official review. That is a good thing, I just hope @casperdcl will have some spare time soon because issues and PRs are piling up!
I might be doing something wrong, but nothing has changed for me.
The only thing I have modified from the code in #285 is that I reduced the length of L to 5. Edit: this time on CentOS 6.6, using Python 3.5.1.
That's bad news. Can you try with 3 bars?
I ask that because either this is still a race condition and the locks do not work (which would be VERY weird), or the console is not able to manage more than 3 bars at once with our system (without curses), which would be very bad...
Ok I just checked if tqdm can support lots of bars simultaneously, and yes it does:
So this means that it's really the locking mechanism that does not work. I have no idea why; this is a very weird issue...
Still the same issue, with 3, and with 2. I'm not that confident in using pip to install from pull requests -- maybe I'm not installing it correctly? Although I see the appropriate code changes in tqdm_test/src/tqdm/tqdm/_tqdm.py
I think I got the issue: the Lock is meant to be created before instantiating the pool of parallel processes, and supplied to the pool as an argument. If it's created inside the class, it won't work. That's bad, because we don't want the user to need to supply the lock; it should be totally automated and transparent. We need to know how other projects managed this, if any did.
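The pattern described above can be sketched with only stdlib multiprocessing (the function names here are hypothetical illustrations, not tqdm API): the parent creates the lock first, then hands it to every worker through the pool's initializer, so the children all share the parent's lock rather than creating their own.

```python
import multiprocessing as mp

def _init_worker(lock):
    # Runs once per worker process: stash the shared lock in a
    # module-level global so the worker's tasks can find it.
    global _print_lock
    _print_lock = lock

def _task(i):
    with _print_lock:  # serialize writes so output lines cannot interleave
        print(f"task {i} writing")
    return i

def run_pool(n_tasks=4, n_workers=2):
    lock = mp.Lock()  # created in the parent, BEFORE the pool exists
    with mp.Pool(n_workers, initializer=_init_worker,
                 initargs=(lock,)) as pool:
        return pool.map(_task, range(n_tasks))

if __name__ == "__main__":
    print(run_pool())
```

This is the user-supplied-lock shape the comment wants to avoid; automating it transparently is exactly the open problem in this thread.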
But from Google's own projects, it seems it's not possible:
So two things:
So for 2, maybe we should provide a helper function to ease the multiprocessing use?
An alternative on the pdfrankenstein project, which is very close to what we are trying to achieve (a progress bar printing on stdout): basically what they do is instantiate a multiprocessing lock as a global variable. This seems to work. But is it clean in our case?
PS: Good thing to know, Queue is not multiprocess-safe because of implementation design. Also good to know: flock can be used for cross-language locking, and flock is only advisory, so programs can bypass the locks! And this implementation detail can be handy later, saving it here for reference (differences between Linux and Windows implementations of the multiprocessing module). And a whole host of other examples here for multiprocessing.Lock.
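A sketch of that global-lock approach (names hypothetical, not pdfrankenstein's actual code): the lock lives at module level, so worker processes inherit the very same lock object when the platform forks. As noted later in this thread, this inheritance does not happen under Windows' spawn start method, where each child re-imports the module and creates an unrelated fresh lock.

```python
import multiprocessing as mp

# Module-level lock: under fork (Linux), children inherit this exact
# lock object at fork time; under spawn (Windows), each child re-imports
# the module and gets its OWN new lock, so this pattern is not portable.
_global_lock = mp.Lock()

def _emit(i):
    with _global_lock:  # serialize stdout writes across workers
        print(f"line {i}")
    return i

def run_global(n=3):
    with mp.Pool(2) as pool:
        return pool.map(_emit, range(n))

if __name__ == "__main__":
    print(run_global())
```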
Ah, and the official Python docs and PEPs are useful too for precise details (but
Force-pushed from 8cade97 to a65e347.
Some clever SO folk (Alexey Smirnov) found the culprit: a seemingly buggy management of global locks on Windows. So I don't know what we do now. If there is no way around it, we can provide a new kwarg for the user to provide a lock.
So there is no way around passing the lock from the parent to the children, simply because on Windows there is no fork, only spawn, for multiprocessing, so the children cannot access the parent's data. However, on Linux a global lock initialized in the parent would work, but since we aim for cross-platform compatibility, we need to find a workaround... See for more details:
So what I propose:
Continued in #329.
Should fix #285 by using locks. There is one lock for multiprocessing and one for threading, so this should fix the issue in any case.
Canonical example:
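The original code block for this example did not survive the scrape; below is a minimal sketch of the threading side of the pattern only (names invented, tqdm itself not used): a single shared lock serializes each bar's write-and-flush sequence, so concurrent bars cannot race and interleave their output.

```python
import io
import sys
import threading

write_lock = threading.Lock()  # one lock shared by all bars

def bar(pos, total, out):
    """Simulate one progress bar: each step writes a full status line
    under the lock, so no two bars ever interleave within a line."""
    for i in range(1, total + 1):
        with write_lock:
            out.write(f"bar#{pos}: {i}/{total}\n")
            out.flush()

def run_bars(n_bars=3, total=5, out=sys.stdout):
    threads = [threading.Thread(target=bar, args=(p, total, out))
               for p in range(n_bars)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

if __name__ == "__main__":
    run_bars()
```

A real multi-bar display would additionally move the cursor to each bar's line before writing; the lock must then cover the whole move-write-flush sequence, which is the race this PR fixes.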
Note however that this doesn't fully fix issues on Windows when using too many bars at once: try with 3 bars and it works, try with 10 and it doesn't, because the Windows console has a tendency to insert weird newlines when too many lines are printed at once (see also this PR that suffers from the same issue). I can maybe also try on Windows using MSYS2, as suggested here, to fix the issue?
TODO: