fix the problem #299 in terms of using tqdm in pandas #524

chengs · 2018-03-17T09:07:55Z

Rebase onto master.
Add new test cases for pandas.
Fix "Remove support for *args in pandas wrappers" (#299): compute the correct total when using progress_apply and other pandas functions.

#299 partially achieved its goal. By forbidding *args, the correct axis is now available, and it can be used to compute the total (the number of iterations in progress_apply).
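As a rough illustration of how the axis determines the total, here is a minimal sketch. The helper name `expected_calls` is hypothetical, it takes a plain `(rows, columns)` tuple in place of a real `DataFrame.shape`, and it only handles integer axes (not the `'index'`/`'columns'` strings pandas also accepts).

```python
def expected_calls(shape, axis=0):
    """Number of times func is expected to run for apply along `axis`.

    `shape` is a (rows, columns) tuple, as in DataFrame.shape.
    With axis=0, func receives each column, so the count is the number
    of columns; with axis=1 it receives each row.
    """
    if axis not in (0, 1):
        raise ValueError("this sketch only supports axis=0 or axis=1")
    # The total is the length of the *other* axis.
    return shape[1 - axis]


# A frame with 100 rows and 4 columns (hypothetical numbers).
per_column = expected_calls((100, 4), axis=0)
per_row = expected_calls((100, 4), axis=1)
```

This is only the expected count; as described next, pandas may call func more often than that.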

However, pandas sometimes runs a few extra iterations to optimise itself. For example, in dataframe.progress_apply(func):

If func is a numpy.ufunc or is very slow, it is executed total times.
If it is very quick, or in some other cases, it is executed total+1 or total+shape[axis0] times.

I think there is no need to adapt to this particular pandas implementation detail, so in the wrapper function I write:

def wrapper(*args, **kwargs):
    # update tbar correctly
    # it seems pandas apply calls func twice on the first column/row
    # to decide whether it can take a fast or slow code path.
    # so stop when t.total==t.n
    t.update(n=1 if t.total and t.n < t.total else 0)
    return func(*args, **kwargs)

Once total is reached, tbar is no longer updated.
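To see the capping behaviour in isolation, the sketch below reuses the same update condition with a tiny stand-in for the progress bar. `FakeBar` and `make_wrapper` are hypothetical names for illustration only; `FakeBar` mimics just the `total`, `n` and `update` members that the wrapper relies on, and is not tqdm itself.

```python
class FakeBar:
    """Hypothetical stand-in for a tqdm bar: only total, n and update()."""

    def __init__(self, total):
        self.total = total
        self.n = 0

    def update(self, n=1):
        self.n += n


def make_wrapper(func, t):
    """Wrap func so each call advances t by one, but never past t.total."""
    def wrapper(*args, **kwargs):
        # Same condition as in the PR description: stop once n reaches total.
        t.update(n=1 if t.total and t.n < t.total else 0)
        return func(*args, **kwargs)
    return wrapper


bar = FakeBar(total=3)
double = make_wrapper(lambda x: 2 * x, bar)

# Simulate pandas calling func once more than expected:
# four calls for three expected iterations.
results = [double(x) for x in [1, 1, 2, 3]]
```

Every call still reaches func, so the results are unaffected; only the counter is capped. Note that with this condition a falsy total (None or 0) means the counter never advances at all.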

codecov-io · 2018-03-17T09:18:40Z

Codecov Report

Merging #524 into master will increase coverage by 0.3%.
The diff coverage is 100%.

@@           Coverage Diff            @@
##           master    #524     +/-   ##
========================================
+ Coverage    99.1%   99.4%   +0.3%     
========================================
  Files           8       8             
  Lines         668     674      +6     
  Branches      118     118             
========================================
+ Hits          662     670      +8     
+ Misses          4       3      -1     
+ Partials        2       1      -1

chengs · 2018-03-17T09:49:11Z

I added new tests, so pandas.Series, pandas.DataFrame, and groupby cases are now all covered. pandas.Panel is not covered because it is going to be removed: http://pandas.pydata.org/pandas-docs/version/0.20/whatsnew.html#whatsnew-0200-api-breaking-deprecate-panel

chengs · 2018-03-17T09:59:31Z

tqdm/_tqdm.py

                    Transmitted to `df.apply()`.
                """
+
                # Precompute total iterations
                total = getattr(df, 'ngroups', None)


ngroups is an internal attribute of pandas. It may change in the future, but that seems unlikely to happen soon, if ever, so let's keep it for now.

Ideally, I would suggest using one of pandas' public APIs to get the number of groups in the future. For now, it is fine.
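One possible shape for such a fallback is sketched below. The comment does not say which API to use; reading the length of the `groups` mapping that pandas groupby objects expose is an assumption made here for illustration, and `group_total` plus the two stand-in classes are hypothetical names rather than tqdm or pandas code.

```python
def group_total(obj):
    """Return the number of groups for a grouped object, else None.

    Tries `ngroups` first (as in the diff above), then falls back to the
    length of a `groups` mapping if one is present (assumed fallback).
    """
    total = getattr(obj, 'ngroups', None)
    if total is None:
        groups = getattr(obj, 'groups', None)
        if groups is not None:
            total = len(groups)
    return total


class WithNgroups:
    """Stand-in for a grouped object that exposes ngroups."""
    ngroups = 3


class WithGroupsOnly:
    """Stand-in for a grouped object that only exposes a groups mapping."""
    groups = {'a': [0, 1], 'b': [2]}


from_ngroups = group_total(WithNgroups())
from_groups = group_total(WithGroupsOnly())
not_grouped = group_total(object())
```

A result of None signals that the object is not grouped, so the caller can compute the total some other way.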

chengs mentioned this pull request Mar 17, 2018

fix the problem #299 in terms of using tqdm in pandas #521

Closed

chengs commented Mar 17, 2018

chengs mentioned this pull request Mar 17, 2018

Remove support for *args in pandas wrappers #299

Closed

casperdcl self-requested a review March 19, 2018 20:30

casperdcl added to-review 🔍 Awaiting final confirmation submodule ⊂ Periphery/subclasses labels Mar 19, 2018

chengs mentioned this pull request Mar 23, 2018

progress_apply reports wrong total count #489

Closed

TqdmSynchronisationWarning

0e8f839

- suppress `RuntimeError: Set changed size during iteration` (tqdm#481) - partially re-add 64f5e73 Happy Easter!

casperdcl assigned casperdcl and chengs and unassigned casperdcl Apr 3, 2018

casperdcl force-pushed the aplavin-patch-2 branch from ff63a31 to 072c2fc Compare April 3, 2018 09:38

casperdcl and others added 6 commits April 3, 2018 11:03

update known issues (newline/linefeed/unicode)

98a84f0

- closes tqdm#454 - TODO: might want to autodetect semi-unicode support and fallback to ascii?

Remove support for *args in pandas wrappers

aaaf189

Fix total computation for pandas apply

grammar

a4e60eb

address the problem in tqdm#299 correctly.

6921743

fix a bug and add tests in test_pandas

bcbdc22

linting

0c56c08

casperdcl force-pushed the aplavin-patch-2 branch from cdf3bc3 to 0c56c08 Compare April 3, 2018 10:07

casperdcl merged commit 0c56c08 into tqdm:master Apr 3, 2018

This was referenced Apr 3, 2018

Pandas progress bar when iterating over rows #322

Closed

Missing total for DataFrame.progress_apply #351

Closed

Pandas apply on either axis #366

Closed

chengs deleted the aplavin-patch-2 branch April 13, 2018 12:52
