Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: A bit more flexibility for progress bars #4438

Merged
merged 6 commits into from Apr 24, 2020
Merged

Conversation

adswa
Copy link
Member

@adswa adswa commented Apr 22, 2020

This PR enhances the existing progress bars with the ability to

  1. set an initial value for the progress bar
 log_progress(lgr.info, '3',  'whee!', total=400, initial=20)                  
  5%|██▊                                                     | 20.0/400 [00:00<?, ?/s]

and 2) update the total of an existing progress bar

log_progress(lgr.info, '3',  'whee!', update=0, increment=True, total=1000)   
  2%|▉                                              | 20.0/1.00k [00:00<00:00, 154k/s]

This is hopefully some ground work towards #4390.

mih
mih previously approved these changes Apr 22, 2020
Copy link
Member

@mih mih left a comment

I like! ;-)

adswa added 2 commits Apr 22, 2020
This also introduces the change that calling `ProgressBarBase.start()` (i.e.
without arguments), re-sets the initial value to the value given to the
`initial` argument of the constructor, and not to zero. The ability
to give a different `initial` value to `start()` is unimpaired.
This can also be done via `log_progress()` like so:

```
log_progress(..., update=0, increment=True, total=newtotal)
```
@@ -32,7 +32,7 @@ def __init__(self, label=None, fill_text=None, total=None, out=None, unit='B', i
self.total = total
self.unit = unit
self.out = out
self._current = initial
self._current = self._initial = initial
Copy link
Contributor

@kyleam kyleam Apr 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So ProgressBarBase.__init__ has an initial parameter. Confusingly, its children that explicitly define parameters, SilentProgressBar and tqdmProgressBar, do not. As far as I can see, no caller in our code base actually instantiates a progress bar class with initial.

Instead of extending tqdmProgressBar.__init__ to have an initial parameter and updating ProgressHandler.emit to pass it in, what about just making ProgressHandler.emit pass initial in its call to start()?

I'd also be in favor of pruning initial from ProgressBarBase.__init__, though I don't think that needs to be done in this PR.

Copy link
Contributor

@kyleam kyleam Apr 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think re-reading your first commit message, this might be answered there. The idea is that now a bare .start() allows re-initializing, I think.

Copy link
Contributor

@kyleam kyleam Apr 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still I think having two spots to pass in initial (especially given the inconsistency with classes) isn't a good thing. I'd be in favor of having .start() also store the value it receives for initial in an attribute. Then .start() could still be called without any arguments to re-initialize to that value.

Copy link
Member Author

@adswa adswa Apr 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks Kyle, I hope 057dcc0 did what you had in mind. Please let me know if not.

Copy link
Contributor

@kyleam kyleam Apr 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope 057dcc0 did what you had in mind.

Hmm, I think that makes sense to do, but the core change I was suggesting with "Instead of extending ..." was something like this (untested):

diff
diff --git a/datalad/log.py b/datalad/log.py
index 29e5389fc0..414564eeba 100644
--- a/datalad/log.py
+++ b/datalad/log.py
@@ -214,9 +214,8 @@ def emit(self, record):
                 label=getattr(record, 'dlm_progress_label', ''),
                 unit=getattr(record, 'dlm_progress_unit', ''),
                 total=getattr(record, 'dlm_progress_total', None,),
-                initial=getattr(record, 'dlm_progress_initial', 0)
             )
-            pbar.start()
+            pbar.start(initial=getattr(record, 'dlm_progress_initial', 0))
             self.pbars[pid] = pbar
         elif update is None:
             # not an update -> done
diff --git a/datalad/ui/progressbars.py b/datalad/ui/progressbars.py
index d5a3251322..d28ac793ff 100644
--- a/datalad/ui/progressbars.py
+++ b/datalad/ui/progressbars.py
@@ -215,7 +215,7 @@ class tqdmProgressBar(ProgressBarBase):
 
         def __init__(self, label='', fill_text=None,
                      total=None, unit='B', out=sys.stdout, leave=False,
-                     frontend=None, initial=0):
+                     frontend=None):
             """
 
             Parameters
@@ -228,12 +228,10 @@ def __init__(self, label='', fill_text=None,
             leave
             frontend: (None, 'ipython'), optional
               tqdm module to use.  Could be tqdm_notebook if under IPython
-            initial: (0, int), where to start the progress bar
             """
             super(tqdmProgressBar, self).__init__(label=label,
                                                   total=total,
-                                                  unit=unit,
-                                                  initial=initial)
+                                                  unit=unit)
 
             if frontend not in self._frontends:
                 raise ValueError(

I don't feel too strongly about this (and don't think it should hold up this PR if you or others disagree), but if we can avoid having two spots that intitial can come in (i.e. start and __init__), I'd prefer to.

Copy link
Member

@mih mih Apr 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kyleam That makes sense, but (if I am not missing something) it ignores the (bad) situation of ProgressBarBase that still has the two slots for initial, and now leaves one of the original issues in that initially brought us here (provide initial to the constructor, only to have it be ignored, unless provided again to start()). I have no sufficient oversight to have a good concept of how progress bars are supposed to be used, so my POV is the one of a "victim" ;-)

Copy link
Contributor

@kyleam kyleam Apr 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I missing (at least part of) your point, but as I mentioned above, I think ProgressBarBase.__init__ should lose its initial parameter. The fact that its children don't have the same parameter is a sign that (1) things aren't wired up usefully and (2) it's not being used. I'd be happy to see it done in this PR, though I don't want to push clean up onto @adswa.

Copy link
Contributor

@kyleam kyleam Apr 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too much time has been taken up by my comment though. The log_progress API will stay the same either way, so I'd say we leave this as is and then any interested parties (maybe me :) can follow up with another PR at some point if they think it's worth it.

Copy link
Member Author

@adswa adswa Apr 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried, and failed - I can't seem to wrap my head around this, sorry @kyleam :(

@mih mih dismissed their stale review Apr 22, 2020

@adswa reported issues with incremental updates

adswa added 3 commits Apr 23, 2020
Previously, only tqdmProgressBar but not BaseProgressBar
had this ability
Without a refresh after a progress update the progress bar is
not advanced until the next log_progress() call.
@adswa
Copy link
Member Author

@adswa adswa commented Apr 23, 2020

FTR re inconsistencies with incremental updates: I saw failures when I tried to simultaneously update the progress bar and the total:
line 2 should update the total from 400 to 2000, and the progress from 20 to 21. However, the progress update is only happening the next time I'm logging any progress.

In [1]: import datalad, logging; from datalad.log import log_progress; lgr=logging.get
   ...: Logger('datalad.something')                                                   

In [2]: log_progress(lgr.info, '3',  'whee!', total=400, initial=20)                  
  5%|██▊                                                     | 20.0/400 [00:00<?, ?/s]
In [3]: log_progress(lgr.info, '3',  'whee!', update=1, increment=True, total=2000)   
  1%|▍                                             | 20.0/2.00k [00:00<00:00, 90.3k/s]hi 1 True 2000

In [4]: log_progress(lgr.info, '3',  'whee!', update=0, increment=True, total=2000)   
  1%|▍                                             | 21.0/2.00k [00:00<00:00, 89.7k/s]

This should be fixed with c76b0b1:

In [1]: import datalad, logging; from datalad.log import log_progress; lgr=logging.get
   ...: Logger('datalad.something')                                                   

In [2]: log_progress(lgr.info, '3',  'whee!', total=400, initial=20)                  
  5%|██▊                                                     | 20.0/400 [00:00<?, ?/s]
In [3]: log_progress(lgr.info, '3',  'whee!', update=32, increment=True, total=2000)  
  3%|█▏                                             | 52.0/2.00k [00:00<00:00, 210k/s]
In [4]:                                                                               

@adswa
Copy link
Member Author

@adswa adswa commented Apr 23, 2020

In don't see how my changes could cause the failing test:

======================================================================
ERROR: datalad.distributed.tests.test_create_sibling_ria.test_create_simple(None,)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
   File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/tests/utils.py", line 186, in newfunc
    return func(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/tests/utils.py", line 691, in newfunc
    return t(*(arg + (filename,)), **kw)
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/distributed/tests/test_create_sibling_ria.py", line 48, in newfunc
    return func(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/tests/utils.py", line 518, in newfunc
    return t(*(arg + (d,)), **kw)
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/tests/utils.py", line 691, in newfunc
    return t(*(arg + (filename,)), **kw)
   File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/distributed/tests/test_create_sibling_ria.py", line 125, in _test_create_store
    assert_result_count(installed_ds.get(op.join('ds', 'file1.txt')),
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/distribution/dataset.py", line 499, in apply_func
    return f(**kwargs)
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/interface/utils.py", line 494, in eval_func
    return return_func(generator_func)(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/interface/utils.py", line 482, in return_func
    results = list(results)
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/interface/utils.py", line 469, in generator_func
    msg="Command did not complete successfully")
datalad.support.exceptions.IncompleteResultsError: Command did not complete successfully [{'refds': '/crippledfs/datalad_temp__test_create_storehqh_3zkl/test_install', 'annexkey': 'MD5E-s4--03d59e663c1af9ac33a9949d1193505a.txt', 'message': 'not available; Try making some of these repositories available:; \te0b2f3cb-53ec-4641-8b83-153e063da907 -- runner@fv-az50:/crippledfs/datalad_temp_tree__test_create_storepmmftx6u; ; (Note that these git remotes have annex-ignore set: origin)', 'type': 'file', 'status': 'error', 'action': 'get', 'path': '/crippledfs/datalad_temp__test_create_storehqh_3zkl/test_install/ds/file1.txt'}]

IOW: no `initial` parameter in ProgressBarBase anymore.

This is following up on a casual comment by @kyleam:
datalad#4438 (comment)

Days later it works. I should commit under pseudonym. You can keep this
code, I don't need it ;-)
@adswa
Copy link
Member Author

@adswa adswa commented Apr 24, 2020

thanks for the cleaning up, @mih! It looks better, and setting an initial + updating initial and total still works

@codecov
Copy link

@codecov codecov bot commented Apr 24, 2020

Codecov Report

Merging #4438 into master will decrease coverage by 0.06%.
The diff coverage is 66.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4438      +/-   ##
==========================================
- Coverage   88.91%   88.84%   -0.07%     
==========================================
  Files         285      285              
  Lines       37765    37844      +79     
==========================================
+ Hits        33578    33623      +45     
- Misses       4187     4221      +34     
Impacted Files Coverage Δ
datalad/log.py 89.71% <0.00%> (ø)
datalad/ui/progressbars.py 81.52% <70.00%> (-2.94%) ⬇️
datalad/support/tests/test_github_.py 87.14% <0.00%> (-12.86%) ⬇️
datalad/downloaders/http.py 72.11% <0.00%> (-2.79%) ⬇️
datalad/downloaders/tests/test_http.py 58.79% <0.00%> (-2.17%) ⬇️
datalad/support/github_.py 78.65% <0.00%> (-0.30%) ⬇️
datalad/support/tests/test_annexrepo.py 95.29% <0.00%> (-0.03%) ⬇️
datalad/consts.py 100.00% <0.00%> (ø)
datalad/support/gitrepo.py 89.96% <0.00%> (ø)
datalad/distribution/dataset.py 94.14% <0.00%> (ø)
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 23d7683...a8bdafe. Read the comment docs.

mih
mih approved these changes Apr 24, 2020
Copy link
Member

@mih mih left a comment

I think this is good! Thx @adswa, and @kyleam for the review.

@mih mih merged commit e7eedf0 into datalad:master Apr 24, 2020
10 of 12 checks passed
@adswa adswa deleted the progress-4390 branch Dec 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants