progress_apply reports wrong total count #489
Comments
I think the problem might be that pandas is actually performing len(gb)+1 operations when you invoke apply on a group with len(gb) items. From the docs:
yes, it's an upstream tqdm.pandas(unit_scale=len(gb)/(len(gb)+1.0))
Dealt with the issue for now by printing for the user: 15 items to be processed (needs 16 operations)
Once #524 is merged, this will be fixed.
Since #524 is merged, this can be closed. @casperdcl
@joders please try with
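I created a minimal reproducing example:
import numpy as np
import pandas as pd
from tqdm import tqdm

tqdm.pandas()  # registers progress_apply on pandas objects
# The dict originally repeated the key 'a', so only the second array survived;
# the first column is named 'b' here so both are kept.
df = pd.DataFrame({'b': np.random.randint(2, 5, size=100), 'a': np.random.randint(2, 9, size=100)})
gb = df.groupby('a')
print("group count: " + str(len(gb)))
gb.progress_apply(lambda x: x)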
group count: 7
100%|█████████████████████████████████████████████| 8/8 [00:00<00:00, 1834.28it/s]
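To check how many times pandas itself invokes the applied function, independent of tqdm, a counting wrapper can be passed to plain `apply`. This is only a sketch: `counting_identity`, `calls` and the fixed seed are illustrative names and choices, not part of tqdm or pandas. The result depends on the pandas version. Older releases documented that `apply` may call the function twice on the first group, while newer releases may call it once per group, in which case the two numbers match.

```python
import numpy as np
import pandas as pd

# Fixed seed so the run is repeatable (an assumption for illustration only).
rng = np.random.RandomState(0)
df = pd.DataFrame({
    "b": rng.randint(2, 5, size=100),
    "a": rng.randint(2, 9, size=100),
})
gb = df.groupby("a")

calls = []  # one entry per invocation of the applied function


def counting_identity(group):
    """Record the call, then return the group unchanged."""
    calls.append(len(group))
    return group


gb.apply(counting_identity)

print("group count:", len(gb))
print("function calls:", len(calls))
```

If "function calls" comes out one higher than "group count", the extra call originates in pandas rather than in the progress bar.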
I did not find an open issue for this.
The relevant line in _tqdm.py is line 614 (tqdm/tqdm/_tqdm.py at commit 69ca16d):
total += 1 # pandas calls update once too many
Python 3.6.1
pandas==0.20.3
tqdm==4.19.5