Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automate nested with position #87

Merged
merged 35 commits into from Feb 4, 2016
Merged

Automate nested with position #87

merged 35 commits into from Feb 4, 2016

Conversation

lrq3000
Copy link
Member

@lrq3000 lrq3000 commented Dec 27, 2015

Should fix #83.

I initially planned to make a new class multi_tqdm() to centrally manage multiple tqdm bars, but I've found a way to do the same in a decentralized fashion.

The only glitch is when all bars have leave=True, they will be overwritten by command prompt (just like the old issue with nested, but here we can't fix that).

Sample code:

from tqdm import tqdm, trange
from time import sleep

# Iteration-based usage
for i in trange(2, desc='pos0 loop', position=0):
    for j in trange(3, desc='pos1 loop', position=1):
        for k in trange(4, desc='pos2 loop', position=2):
            sleep(0.1)

# Manual usage
t1 = tqdm(total=10, desc='pos0 bar', position=0)
t2 = tqdm(total=10, desc='pos1 bar', position=1)
t3 = tqdm(total=10, desc='pos2 bar', position=2)
for i in range(10):
    t1.update()
    t3.update()
    t2.update()
    sleep(0.5)

Should we keep this or rather try to make a centralized multi_tqdm() ? (but it will be uglier and a good deal slower).

Todo (edited):

  • Add unit test to get coverage 100%.
  • Update readme.
  • rebase
  • fix tests
    • conflicts with deprecate nested
    • nested deprecation test
    • position test
    • fix timing test (Discrete Timer)
  • fix py26 exceptions
  • fix pypy exceptions
  • fix py3 exceptions
  • fix display of test output (why are there blank lines in the terminal when we use StringIO!?)
  • fix del() exceptions in tests
  • fix pypy on Travis
  • answer this question on SO.
  • make more efficient

@codecov-io
Copy link

Current coverage is 100.00%

Merging #87 into master will not affect coverage as of 1c11413

@@            master     #87   diff @@
======================================
  Files            5       5       
  Stmts          244     326    +82
  Branches        49      63    +14
  Methods          0       0       
======================================
+ Hit            244     326    +82
  Partial          0       0       
  Missed           0       0       

Review entire Coverage Diff as of 1c11413

Powered by Codecov. Updated on successful CI builds.

@lrq3000 lrq3000 added the help wanted 🙏 We need you (discussion or implementation) label Dec 27, 2015
@casperdcl
Copy link
Sponsor Member

grr this seems like an alternative to nested. i'm hesitant because it's annoying to have both position and nested... maybe pick one behaviour and used that only? (i.e. update the nested behaviour to use this?)

@lrq3000
Copy link
Member Author

lrq3000 commented Dec 27, 2015

@casperdcl yes I thought the same :S But they are different in that nested is using relative positioning (it just print one "term_up" character at closing to give the hand to the parent bar), while position is using absolute positioning (it prints n * newline returns, prints the bar, then prints n * "termp_up" to go back to the initial position). Thus, positioning can also be used for recursive loops, but it's a bit less simple I think (you need to specify the absolute position instead of just switching the flag nested=True), and it prints useless newline returns and "term_up" characters. So in the end, I can't see how we can mix up both approaches, they are fundamentally different, so we either have to pick one of them, or keep them both.

So I'm not sure what to do :/

@casperdcl
Copy link
Sponsor Member

Basically, without a centralised dispatcher, we need to do:

pos = 0
t0 = tqdm(..., position=pos); pos += 1;
...
t1 = tqdm(..., position=pos); pos += 1;
...
t2 = tqdm(..., position=pos); pos += 1;

would be nice if there was some way to neaten this without a dispatcher. I know nested and position are implemented differently, but when they both can solve the same problem, it seems redundant. Specifically, position can solve the nested problem.

@casperdcl
Copy link
Sponsor Member

Ooh maybe something like http://stackoverflow.com/questions/8628123/counting-instances-of-a-class ? When we __init__() we set position to be equal to the number of already existing instances...

@casperdcl
Copy link
Sponsor Member

class tqdm(object):
    n_instances = 0

    @classmethod
    def increment_instances(cls):
        cls.n_instances += 1
        return cls.n_instances

    @classmethod
    def decrement_instances(cls):
        cls.n_instances -= 1

    def __init__(self, ...):
        self.position = self.increment_instances() - 1 if position is None else position
        ...

    def close(self):
        self.decrement_instances()
        ...

@casperdcl
Copy link
Sponsor Member

just added more to the TODO list. This branch should replace nested by completely automating the process. I suggest introducing a DeprecationWarning for nested and a note about manually specifying position in case the default behaviour isn't enough for people.

@casperdcl
Copy link
Sponsor Member

@lrq3000 any idea how to fix Travis::pypy failures? I think you wrote the terminal cursor test logic so you're probably going to be better at debugging this than me.

@casperdcl casperdcl changed the title Add position in tqdm Automate nested with position Dec 28, 2015
@casperdcl casperdcl added the p3-enhancement 🔥 Much new such feature label Dec 28, 2015
@lrq3000
Copy link
Member Author

lrq3000 commented Dec 29, 2015

💯 This is brilliant @casperdcl, using class variables to automate nested printing is perfect! Yes I'll take a look, I'm just going to finish a few other PRs before :)

@lrq3000
Copy link
Member Author

lrq3000 commented Dec 29, 2015

@casperdcl WARNING I wrongly force pushed this branch, please force push your local branch, I've lost the history of the changes you've done!

@lrq3000
Copy link
Member Author

lrq3000 commented Dec 29, 2015

PS: I didn't force push while being in this branch, but while being in relative-timing, so this is weird, but I'll be more careful in the future, I will always precise which remote branch I want to force push.

@lrq3000
Copy link
Member Author

lrq3000 commented Dec 29, 2015

If you don't have the local history anymore, we can manually restore the commits by fetching the files here:
e8e2a47

I tried to restore the history using git, but didn't succeed.

@casperdcl
Copy link
Sponsor Member

don't worry, I have a backup of this on my phone (was pushing this otg, debian jessie on arm64 FtW).

@lrq3000
Copy link
Member Author

lrq3000 commented Jan 1, 2016

@casperdcl Thank's for restoring the history, I'll work on it later today :)

@lrq3000
Copy link
Member Author

lrq3000 commented Jan 1, 2016

@casperdcl : nope, not necessarily, at least not with PyPy: its garbage collector works differently, so if some instances of tqdm are not closed explicitly, PyPy may not collect them (and thus decrease the instances counter) before the end of the whole script, even if the function where tqdm was instanciated is now finished. You can confirm that by printing the number of tqdm instances, you will see that before my commit and after, the counter decreases (and it fixes the issue).

Anyway, I took this bug just as an opportunity to clean up old unit tests (ie, to properly call and close tqdm, mainly using the with syntax). I anyway have put a safeguard against possible future errors such as this one by explicitly setting the number of tqdm instances to 0 at the beginning of the corresponding unit test.

There was also a very vicious bug with rate formatting at initialization, depending on the platform and the interpreter, it could return either 0.00it/s or ?it/s. This should now be fixed (I'm waiting for Travis to finish, but seems good).

@lrq3000
Copy link
Member Author

lrq3000 commented Jan 1, 2016

Ah and this might be problematic: the number of instances in tests_perf.py were also accounted for when executing tests inside tests_tqdm.py if all tests are run at once using nose. For the moment, tqdm is meant to be shown on only one screen, but what would happen if it evolves to be printed on multiple screen inside a parallel app doing different tasks at the same time? After all, this is probably already possible by just setting different file outputs.

I really like the idea you had, but maybe we should move it to a subclass?

@casperdcl
Copy link
Sponsor Member

Ah ok PyPy having different gc makes sense.

  1. We shouldn't ever have to use the with syntax when wrapping an iterable that will be exhausted, though, since close will always automatically be called in those cases.
  2. It's more a question of what to have as default behaviour. By default we have sys.stderr and it seems reasonable to also assume a single terminal output. Thus it seems reasonable to automatically count instances and set the position accordingly. I made it so it is easy to override in other scenarios (logging to files, different terminals, etc) by manually specifying position=0 or whatever...

@lrq3000
Copy link
Member Author

lrq3000 commented Jan 1, 2016

  1. This is probably something that you can't see with github's diff, except if you extend the collapsed parts, but I used the with statement only on manually updated tqdm instances and iterables ones that were not exhausted (eg, there's a break because we don't need to do all iterations, this was necessary before the virtual timer to do timing related tests). The tests where iterable tqdm instances were exhausted by themselves were not modified to use with.

  2. Ok, indeed if we can manually overwrite with position=0, that's easy enough for me :) I agree, we should make it easy to do 80% most common cases, and just let a possibility to do the 20% remaining by overriding parameters or subclassing tqdm. So I'm OK to go with this PR :)

PS: BTW I think we should specify in the readme how to override automated nesting (we'll do that when this PR is stable enough). Do you plan to add more features?

@lrq3000
Copy link
Member Author

lrq3000 commented Jan 5, 2016

There's a bug currently it seems, there's always a newline return between each bar display when doing:

from tqdm import trange
from time import sleep
for i in trange(10000):
    sleep(0.01)

@casperdcl
Copy link
Sponsor Member

Really? Just in this branch?

@lrq3000
Copy link
Member Author

lrq3000 commented Jan 31, 2016

Ok problem solved. It happened with commit ff6e9fb9c Merge branch 'master' into positioned-tqdm because this merged the virtual timer, so the timing was perfect, this combined with the new position scheme added newline returns for every bars, and this tripped get_bar() which used \r to split bars displays, so in fact it was choosing the wrong display (iteration 2/3 instead of 3/3, so the rate was not the correct one!).

So the solution was either to force position=0 or to fix get_bar() to be more reliable. I did the latter. Tricky bug, luckily git-bisect saved me some time.

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@casperdcl
Copy link
Sponsor Member

Yes good spot. I thought a regex version might've been necessary there. I did change things so we don't print things like '\r' unnecessarily.

@lrq3000
Copy link
Member Author

lrq3000 commented Feb 2, 2016

Yes but in this case it was because there were additional \r, which is normal because of position when having multiple bars at the same time.

@casperdcl
Copy link
Sponsor Member

Ok I think we should merge this now. Other PRs will need updating to support this but if we merge all of those first we may never get around to merging this (since all the fixes for them would have to be implemented here)! Agreed?

@lrq3000
Copy link
Member Author

lrq3000 commented Feb 4, 2016

Yes agreed, all seems to be working well, so we should merge it in!

@casperdcl casperdcl removed the help wanted 🙏 We need you (discussion or implementation) label Feb 4, 2016
@casperdcl casperdcl merged commit bf394dd into master Feb 4, 2016
@lrq3000
Copy link
Member Author

lrq3000 commented Feb 5, 2016

Thank's Casper, don't worry about other PRs needing to be updated, I
already planned to do that and there are a few TODO here and there to
remind me what I need to update. So I'll do that when I get some time
during next week.

I would also prefer not to release v4.0.0 on PyPi until other branches are
merged. This will allow us to both add more functionalities at once, and
also check that everything runs smoothly. Of course, the release version
will be bumped. Are you OK with this?

2016-02-04 21:56 GMT+01:00 Casper da Costa-Luis notifications@github.com:

Merged #87 #87.


Reply to this email directly or view it on GitHub
#87 (comment).

@casperdcl casperdcl mentioned this pull request Feb 29, 2016
@casperdcl casperdcl deleted the positioned-tqdm branch February 29, 2016 23:42
@CrazyPython
Copy link
Contributor

@lrq3000 just a reminder, you haven't answered that SO question yet

@casperdcl
Copy link
Sponsor Member

@CrazyPython good spot, thx

@CrazyPython
Copy link
Contributor

CrazyPython commented Jul 20, 2016

@casperdcl Uh, I found a bug. I'm too lazy to open a new issue, so here it is:

Using the desc= parameter with nested progress bars (the inner one) corrupts the output and gets you multiple progress bars like this. I only used two nested progress bars, by the way.

Removing makes it goes back to normal.

@CrazyPython
Copy link
Contributor

@casperdcl and the problem exists no matter what when you run it in an embedded IDE window. Don't know why, I'm using PyCharm.

@casperdcl
Copy link
Sponsor Member

hmm

from tqdm import tqdm                            
for i in tqdm(range(10), desc='foo'):            
      for j in tqdm(range(int(1e7)), desc='bar'):
              pass                               
print '\ndone'                                     

gives me:

foo: 100%|██████████████████████████████████████| 10/10 [00:17<00:00,  1.72s/it]
bar: 100%|█████████████████████| 10000000/10000000 [00:01<00:00, 6460949.98it/s]
done

in a normal console window. There's a known issue with idle which doesn't support multiline things. @lrq3000 correct me if I'm wrong.

@casperdcl
Copy link
Sponsor Member

@lrq3000
Copy link
Member Author

lrq3000 commented Jul 20, 2016

@CrazyPython Lol thank you for the reminder!

@casperdcl You are right, and if the IDE is using IDLE as a widget to run python scripts (as I expect since it's the easiest way), then yes this would explain why some IDE would output garbage. But here it doesn't seem like it's a missing control character problem, but something else.

@CrazyPython Could you tell us what IDE you use and on what OS please?
/EDIT: oops sorry, saw that it was PyCharm. We should open a new ticket for this issue.

@CrazyPython
Copy link
Contributor

CrazyPython commented Jul 21, 2016

@lrq3000 Mac OS X. Run or debug mode. Problem does not appear in terminal window.

@casperdcl you should mention you are a dev of tqdm. It'll earn you more respect. ;)

Also this code bugs out in the terminal and in the IDE.

from tqdm import trange
import time

for i in trange(10):
    for j in trange(10, desc='\t'):
        time.sleep(0.1)

@lrq3000
Copy link
Member Author

lrq3000 commented Jul 21, 2016

It does not surprise me that this code does bug out, the desc field is not made to include control characters. If you want to change tqdm display, you should use bar_format.

About Casper, yes he is a highly active maintainer and developper for tqdm, so his name is written all over the place in tqdm of course :)

@CrazyPython
Copy link
Contributor

@lrq3000 I mean on his SO description @casperdcl

@casperdcl
Copy link
Sponsor Member

@lrq3000 @CrazyPython while I may be the primary maintainer of tqdm I'd like to thank everyone else who's contributed. This is the only big-ish project I contribute to without a pseudonym (which was actually an accident - meant to use a different account for my first commit, lol). I don't want people to have different reactions on SO just because of who answered a question.

@CrazyPython
Copy link
Contributor

@casperdcl speaking of psuedonyms, I just removed all the personal info from my account... you're the only one here that doesn't have psuedonym :P

@casperdcl
Copy link
Sponsor Member

xD my name's Casper and I'm the only one not ghosting

@lrq3000 lrq3000 mentioned this pull request Aug 5, 2016
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
p3-enhancement 🔥 Much new such feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

What's the best way to use tqdm for different worker threads on different lines?
4 participants