progress_apply reports wrong total count #489

Closed
joders opened this issue Dec 15, 2017 · 6 comments

Comments

@joders

joders commented Dec 15, 2017

I created a minimal replicating example:

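import numpy as np
import pandas as pd
from tqdm import tqdm

tqdm.pandas()  # registers progress_apply on pandas objects
# 'a' takes integer values 2..8, so there are at most 7 groups
df = pd.DataFrame({'b': np.random.randint(2, 5, size=100), 'a': np.random.randint(2, 9, size=100)})
gb = df.groupby('a')
print("group count: " + str(len(gb)))
gb.progress_apply(lambda x: x)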

group count: 7
100%|█████████████████████████████████████████████| 8/8 [00:00<00:00, 1834.28it/s]

I did not find an open issue for this.

The relevant line in _tqdm.py is

614: total += 1 # pandas calls update once too many

Python 3.6.1
pandas==0.20.3
tqdm==4.19.5

@joders
Author

joders commented Dec 15, 2017

I think the problem might be that pandas actually performs len(gb)+1 calls when you invoke apply on a groupby object with len(gb) groups. From the docs:

In the current implementation apply calls func twice on the
first group to decide whether it can take a fast or slow code
path. This can lead to unexpected behavior if func has
side-effects, as they will take effect twice for the first
group.
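The behaviour described in that note can be checked by counting calls directly. This is only a sketch: the counter `calls` and the helper `counting_identity` are names made up for illustration, and the result depends on the pandas version (with the version listed above I would expect one call more than the number of groups; other versions may not make the extra call).

```python
import numpy as np
import pandas as pd

calls = 0  # hypothetical counter, incremented every time pandas calls the function


def counting_identity(group):
    """Return the group unchanged, but record that we were called."""
    global calls
    calls += 1
    return group


np.random.seed(0)
df = pd.DataFrame({'a': np.random.randint(2, 9, size=100), 'b': np.arange(100)})
gb = df.groupby('a')
gb.apply(counting_identity)

# If func is evaluated twice on the first group, calls == len(gb) + 1
print("groups:", len(gb), "calls:", calls)
```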

@casperdcl
Member

yes, it's an upstream pandas issue, but if the display really annoys you, you could try:

tqdm.pandas(unit_scale=len(gb)/(len(gb)+1.0))
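As far as I understand it, a numeric unit_scale multiplies both the running count and the total that the bar displays, so the factor above is meant to cancel the one extra call. The arithmetic can be sketched in plain Python; `displayed_total` is a hypothetical helper for illustration, not part of tqdm, and it assumes exactly one extra call.

```python
import math


def displayed_total(n_groups):
    """Total the bar would show after scaling the raw update count."""
    raw_updates = n_groups + 1            # assumption: one extra call on the first group
    scale = n_groups / (n_groups + 1.0)   # the unit_scale value suggested above
    return raw_updates * scale


# For the 7 groups in the example, the 8 raw updates scale back to about 7
print(displayed_total(7))
```

Note that len(gb) has to be known before tqdm.pandas() is called, so the groupby object must be built first.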

@joders
Author

joders commented Dec 17, 2017

I've dealt with the issue for now by printing this for the user:

15 items to be processed (needs 16 operations)

@chengs
Contributor

chengs commented Mar 23, 2018

Once #524 is merged, this will be fixed.

@chengs
Contributor

chengs commented Apr 12, 2018

Since #524 is merged, this can be closed. @casperdcl

@casperdcl
Member

casperdcl commented Apr 12, 2018

@joders please try with tqdm>=4.20.0. Happy to reopen if it doesn't work for you.
