Proposal: (option to) ignore the first iteration when estimating it/s and remaining time #967

jonasrauber · 2020-05-09T12:12:01Z

It's quite common that the first iteration takes longer than all other iterations because it performs some additional initializations, etc. (e.g. in my case the function that is called in each iteration gets automatically jit-compiled).

Therefore, the estimated iterations/sec is too low and, more importantly, the remaining time is overestimated.

As far as I can see, there is currently no option to control this (except for disabling all smoothing and only looking at the instantaneous iterations/second, but that's of course often not useful).

I'd be willing to contribute this feature if you agree it's useful and no one else does it before I get to it.

What do you think?


casperdcl · 2020-05-09T12:27:15Z

I think you may mean something like:

from tqdm import tqdm, trange
from time import sleep
def inconsistent_function(i):
    sleep(1 if i == 0 else 0.1)

for i in trange(10, desc="trange"):
    inconsistent_function(i)
with tqdm(total=10, desc="manual") as pbar:
    for i in range(pbar.total):
        inconsistent_function(i)
        pbar.update()
with trange(10, desc="unpaused") as pbar:
    for i in pbar:
        inconsistent_function(i)
        if i == 0:
            pbar.unpause()

trange: 100%|██████████| 10/10 [00:01<00:00,  5.08it/s]
manual: 100%|██████████| 10/10 [00:01<00:00,  5.07it/s]
unpaused: 100%|████████| 10/10 [00:00<00:00, 10.33it/s]

You could also do sleep(0.1) right after pbar.unpause() to get a better estimate for the first iteration; but at that stage it's really cheating. Technically there's no substitute for:

for _ in trange(1, desc="precomputing/compiling first time"):
    inconsistent_function(0)
for i in trange(1, 10, desc="fun times"):
    inconsistent_function(i)

jonasrauber · 2020-05-12T17:10:24Z

@casperdcl Thanks for this workaround. I actually don't care about the time of the first run, all I want is a better estimate of the remaining time, so I think a simple flag to ignore the first iteration for the estimates would be great. The workaround increases the nesting quite a bit and makes everything much more verbose.

casperdcl · 2020-05-12T19:57:56Z

I agree; if you use it a lot in your code base, it's probably best to subclass and add this feature. Not sure if it's worth putting this in the core implementation unless there's more demand for it.

jonasrauber · 2020-05-12T21:06:19Z

I see, sounds good. Will do that. Let me know when things change and a PR would be of interest.

ThatAIGeek · 2020-10-14T10:29:22Z

Just googled to find this proposal and a bit sad that it isn't available. Would love to see this feature in the box. Thanks!

almson · 2020-12-25T13:51:16Z

I think #1101 mostly solves this.

casperdcl · 2020-12-25T14:51:51Z

indeed; lemme know if tqdm>=4.55.0 still doesn't quite work for you

stefangstark · 2021-02-04T11:17:44Z

A burn-in parameter that ignores the first 1 or N steps would still be quite useful, imo. If I understand it correctly, EMA can make the time estimate sensitive to spikes.
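To illustrate the concern about spikes, here is a minimal sketch of a bias-corrected exponential moving average over per-iteration durations. It is a simplification and not tqdm's actual implementation: the function name `ema_seconds_per_iter` and the `burn_in` parameter are hypothetical, and 0.3 is used only because it is tqdm's documented default for `smoothing`.

```python
def ema_seconds_per_iter(durations, smoothing=0.3, burn_in=0):
    """Bias-corrected EMA of per-iteration durations (seconds).

    The first `burn_in` samples are dropped before averaging.
    Hypothetical helper for illustration; not part of tqdm.
    """
    avg = 0.0
    calls = 0
    for dt in durations[burn_in:]:
        # Standard EMA update: new sample weighted by `smoothing`.
        avg = smoothing * dt + (1 - smoothing) * avg
        calls += 1
    if calls == 0:
        return None  # nothing left to average after burn-in
    # Correct the bias caused by initialising the average at zero.
    return avg / (1 - (1 - smoothing) ** calls)


# One slow first iteration (1 s) followed by four fast ones (0.1 s each).
durations = [1.0] + [0.1] * 4

with_spike = ema_seconds_per_iter(durations)
without_spike = ema_seconds_per_iter(durations, burn_in=1)

print(f"with first iteration:    {with_spike:.3f} s/it")
print(f"first iteration skipped: {without_spike:.3f} s/it")
```

With these made-up durations the estimate that includes the first iteration is still about 0.18 s/it after five iterations, whereas skipping it gives 0.1 s/it immediately, so the remaining-time estimate stays inflated for a while unless the spike is excluded.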

Vinno97 · 2022-03-28T09:25:36Z

Since this proposal seems to be stalled by doubts over interest: tqdm is pretty popular in ML/Data Science applications and a large majority of projects I've come across that use tqdm suffer from this. Colleagues and I just take it for granted that we have to wait a couple of minutes before we can expect to trust the expected duration (worsened when we set smoothing lower, since IMO this is too high by default, #1104). I think this would be a very helpful quality-of-life change.

Some examples of reasons for a slower first run:

- Data loaders need to spawn subprocesses
- PyTorch/Numba JIT needs to be compiled
- Multiprocessing (like tqdm.contrib.concurrent) needs to spawn processes
- Datasets need to be opened

dreamflasher · 2022-06-07T18:39:55Z

Came here to create a new issue for this – yes, this is very relevant for ML/DS. Couldn't explain it any better than @Vinno97. Solution would be a burn-in/warmup parameter as @stefangstark said.

The above workaround is too complicated to use it all over in the code – but would it work to just .unpause() at iteration N? Will this reset EMA?

mspinaci · 2023-12-15T15:42:14Z

After more than 18 months, I'd like to revive this proposal. I think it would still be a very useful feature to have; as already mentioned this is especially true in the ML/DL setting, where the first iteration is often much slower than the others (for all the good reasons already mentioned, plus others like warm-up time for GPUs).

The workaround indeed works well, but it clutters the code significantly. Furthermore, in realistic use cases, one would use tqdm on a generic loop and not trange, so it requires a further enumerate, one more index variable to be defined (which in long or nested loops could get mistakenly overwritten), and it is generally hard to remember (at least for me, unpause is a weird name for reset, especially so since a pause method doesn't exist).
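One way to confine that clutter to a single place is a small generator that wraps an existing bar and applies the `unpause()` trick from the workaround above. This is only a sketch: the name `skip_warmup` and its `warmup` parameter are hypothetical, it assumes nothing about the bar beyond being iterable and having an `unpause()` method, and how much the displayed estimate improves may depend on the tqdm version.

```python
def skip_warmup(bar, warmup=1):
    """Yield items from `bar`, calling bar.unpause() once after the
    first `warmup` iterations have finished.

    Hypothetical helper, not part of tqdm. `bar` is expected to be an
    already-constructed progress bar, e.g. tqdm(some_iterable).
    """
    for i, item in enumerate(bar):
        yield item
        # Execution resumes here after the caller's loop body has run
        # for this item, so the slow iteration is already over.
        if i + 1 == warmup:
            bar.unpause()


# Usage (requires tqdm; `loader` and `train_step` are placeholders):
#
#     from tqdm import tqdm
#     for batch in skip_warmup(tqdm(loader, desc="train")):
#         train_step(batch)
```

The call site stays a flat `for` loop over a generic iterable, with no `enumerate` or extra index variable in user code.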

I don't think I know tqdm's code base well enough, but I could try looking into it and opening a PR if it can help.

casperdcl self-assigned this May 9, 2020

casperdcl added the question/docs ‽ Documentation clarification candidate label May 9, 2020

casperdcl added the need-feedback 📢 We need your response (question) label Dec 25, 2020
