Sub stats showed as an unique progress bar #77

Closed
jcea opened this issue Dec 24, 2015 · 30 comments
Labels
p3-enhancement 🔥 Much new such feature question/docs ‽ Documentation clarification candidate

Comments

@jcea
Contributor

jcea commented Dec 24, 2015

I have been thinking about this for a couple of days.

Let's say I have two threads working in parallel on a single problem. The counter updates would look something like this: long pause, an update (thread 1), short pause, another update (thread 2), long pause... Those two closely spaced updates break the speed and ETA estimation.

Thinking of more sophisticated algorithms, I realized that the problem is actually pretty simple if we keep the average speed of each thread separate and just add them up when printing.
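
A minimal sketch of that idea, assuming each thread tracks its own average time per update. The function names `combined_rate` and `eta_seconds` and the sample numbers are illustrative only and are not part of tqdm:

```python
def combined_rate(avg_times):
    """Sum per-thread rates (1 / average seconds per update).

    Threads with no measurement yet (None or 0) are skipped.
    Returns None when no thread has reported a speed.
    """
    rates = [1.0 / t for t in avg_times if t]
    return sum(rates) if rates else None


def eta_seconds(remaining, rate):
    """Estimated seconds left, or None if the rate is unknown."""
    return remaining / rate if rate else None


# Two threads, each finishing one block every 4 seconds on average.
rate = combined_rate([4.0, 4.0])   # 0.5 blocks per second overall
eta = eta_seconds(100, rate)       # 200.0 seconds for 100 remaining blocks
```

Because each thread's average is computed only from its own updates, two updates landing close together from different threads no longer look like a sudden speed-up.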

I realize that this code is a bit specialized, but I think it would be a nice feature. If you are not interested, or you are worried about this change impacting performance, it would be nice to provide some hooks to specialize via the constructor or through subclassing.

What do you think?

Thanks.

@casperdcl
Sponsor Member

I believe this is solved if you set the smoothing kwarg to something small (close or equal to zero). Let me know if you can think of a better/different solution
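
To illustrate why a small smoothing value helps, here is a simplified model of the estimate. It is a sketch under assumptions, not tqdm's actual implementation, and `estimate_interval` is a hypothetical helper. It follows the documented meaning of the kwarg: 0 gives the overall average speed, 1 gives the instantaneous speed.

```python
def estimate_interval(intervals, smoothing):
    """Estimate seconds per update from a list of observed gaps.

    smoothing=0 -> plain average over all gaps;
    smoothing=1 -> only the most recent gap counts;
    in between  -> exponential moving average.
    """
    if not intervals:
        return None
    if smoothing == 0:
        return sum(intervals) / len(intervals)
    avg = intervals[0]
    for dt in intervals[1:]:
        avg = smoothing * dt + (1 - smoothing) * avg
    return avg


# Two threads updating one counter: long pause, short pause, long pause, ...
gaps = [4.0, 0.1, 4.0, 0.1, 4.0, 0.1]

jumpy = estimate_interval(gaps, smoothing=1)    # follows the last short gap
steady = estimate_interval(gaps, smoothing=0)   # about 2.05, the mean gap
```

With tqdm itself, the corresponding change would be passing `smoothing=0` (or a small value) to the constructor.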

@casperdcl
Sponsor Member

Also tqdm supports subclassing very well - tqdm_gui for example is a child class of tqdm

@casperdcl casperdcl added the question/docs ‽ Documentation clarification candidate label Dec 25, 2015
@casperdcl
Sponsor Member

Let me know if you would prefer a subclass and have implemented (or attempted to implement) one; I'll reopen the issue and we can add it here.

@jcea
Contributor Author

jcea commented Jan 10, 2016

Subclassing would be an ideal solution if module functions like "format_meter()" or "StatusPrinter()" were methods of the tqdm class. Then I could replace them easily in a subclass. For compatibility reasons, it might be easier to add them as constructor parameters (with default values = the current functions).

Without this help, I must rewrite most of the logic in my subclass.
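
The two options can be sketched roughly as follows. This is a generic illustration with made-up names (`Bar`, `PercentBar`, `format_line`, `default_format`), not tqdm's real code: a formatter exposed as a static method can be replaced in a subclass, while a constructor parameter with a default lets callers swap it without subclassing.

```python
def default_format(n, total):
    """Stand-in for a module-level formatting function."""
    return "{0}/{1}".format(n, total)


class Bar(object):
    # Option 1: the formatter is a static method, so subclasses can override it.
    @staticmethod
    def format_line(n, total):
        return default_format(n, total)

    # Option 2: the formatter is a constructor parameter with a default value.
    def __init__(self, total, formatter=None):
        self.total = total
        self.n = 0
        self._formatter = formatter or self.format_line

    def update(self, step=1):
        self.n += step
        return self._formatter(self.n, self.total)


class PercentBar(Bar):
    @staticmethod
    def format_line(n, total):
        return "{0:.0f}%".format(100.0 * n / total)


plain = Bar(10).update()                                     # "1/10"
percent = PercentBar(10).update(5)                           # "50%"
custom = Bar(10, formatter=lambda n, t: "#" * n).update(3)   # "###"
```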

@casperdcl
Sponsor Member

I could make both @classmethods. Would that work for you?

@casperdcl casperdcl reopened this Jan 10, 2016
@lrq3000
Member

lrq3000 commented Jan 10, 2016

Agree, this would make subclassing way easier and more powerful. However, I hope this won't hit performance too much, we'll test that out (as @classmethod or @staticmethod, this may work great without any perf hit).

@casperdcl casperdcl self-assigned this Jan 10, 2016
@jcea
Contributor Author

jcea commented Jan 11, 2016

No idea about performance but yes, this would help a lot.

@jcea
Contributor Author

jcea commented Jan 12, 2016

Great. Any date for a PyPI release?

@jcea jcea closed this as completed Jan 12, 2016
@jcea jcea reopened this Jan 12, 2016
@casperdcl casperdcl added the p3-enhancement 🔥 Much new such feature label Jan 12, 2016
@casperdcl
Sponsor Member

er... yeah. when I get around to feeling OK with #87 ... Spent ages trying to optimise it ;)

@jcea
Contributor Author

jcea commented Jan 12, 2016

Crossing fingers and biting my nails...

@casperdcl
Sponsor Member

well.. I just tagged and released v3.7.0 on github which includes the inheritance version... so if you want to bug @lrq3000 he could release it on pypi.

@lrq3000
Member

lrq3000 commented Jan 13, 2016

Great work Casper!
However, the version is still set to 3.6.0, and I don't know how to fix that directly in the master branch without using a PR...


@lrq3000
Member

lrq3000 commented Jan 13, 2016

I went ahead and changed the version and uploaded to PyPI anyway, so we just need to fix the GitHub release.


@lrq3000
Member

lrq3000 commented Jan 13, 2016

I'm updating the trove classifiers in setup.py, so I'm going to bump the version to 3.7.1 while I'm at it.

@lrq3000 lrq3000 closed this as completed Jan 13, 2016
@casperdcl
Sponsor Member

oops my bad. Forgot to double check _version.py

@lrq3000
Member

lrq3000 commented Jan 13, 2016

nvm it happens :) But do you have a way to update _version.py directly on the master branch without using a PR? For instance, I saw that you can change the version during a merge, but I didn't see the version changes in the branches. Can you do that during the merge, or do you in fact push to the branch before merging to master?

@jcea
Contributor Author

jcea commented Jan 13, 2016

Please write a changelog too, inlined or linked on PyPI. I hate having to "diff" between releases to know what is new :). Something to add to the "RELEASE" file :-).

@casperdcl
Sponsor Member

@jcea I write changelogs on github (https://github.com/tqdm/tqdm/releases). If you want to look through those and add them to the RELEASE file in a PR, I'll be happy to merge it in.

@casperdcl
Sponsor Member

@lrq3000 I can update _version.py with a force-push, but I already tagged v3.7.0, which makes it a bad idea to tamper with. It doesn't matter now that 3.7.1 is released. I noticed you copied over the release notes, too, and removed them from the 3.7.0 tag.

@lrq3000
Member

lrq3000 commented Jan 13, 2016

@casperdcl Yes, 3.7.1 was just an excuse to fix the versioning while fixing the setup.py file ;) 3.7.0 on GitHub should not be used since its version is not correct, but the one on PyPI is OK.

@lrq3000
Member

lrq3000 commented Jan 13, 2016

@jcea I'm not really a huge fan of putting all changes in a CHANGELOG, because it pollutes the commits (we already have the git commits + GitHub release descriptions to detail and summarize the changes). Sure, I've done it a lot in the past for other projects, but that was just because I didn't have another place to put the changes (i.e., I was not using git or GitHub releases).

But well, if you really think that's a necessary addition, then why not...

@jcea
Contributor Author

jcea commented Jan 14, 2016

@casperdcl, I was trying to say that the project should include a "changelog" file in the package, and it should be pasted or linked on PyPI.

My opinion, of course :).

In this particular case, a link on PyPI (named "changelog") pointing to https://github.com/tqdm/tqdm/releases would be enough for me.

@jcea
Contributor Author

jcea commented Jan 14, 2016

@lrq3000, think about people unfamiliar with the project. The changelog is for those people, not for you :-).

I have been using this project for a month, even contributing some ideas and PRs, and I was lost when 3.7.1 was released. I had to read the code to review the changes myself.

"Changelog" is standard practice for a reason :).

@lrq3000
Member

lrq3000 commented Jan 14, 2016

I'm not saying we don't need a changelog, but rather that we already have a changelog in the form of GitHub Releases.


@lrq3000
Member

lrq3000 commented Jan 14, 2016

Another possibility instead of manually maintaining a CHANGELOG: we can generate a changelog from the git commits on the fly at build time and include it in the package. Would that be OK?
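
A rough sketch of that approach, assuming the build runs inside a git checkout with `git` on the PATH. The helper names and the output file name `CHANGELOG.txt` are hypothetical and not something the project ships:

```python
import subprocess


def format_changelog(subjects, title="Changelog"):
    """Turn a list of commit subject lines into a plain-text changelog."""
    lines = [title, "=" * len(title), ""]
    lines.extend("- " + s.strip() for s in subjects if s.strip())
    return "\n".join(lines) + "\n"


def git_subjects(rev_range=None):
    """Return commit subject lines from `git log`, newest first."""
    cmd = ["git", "log", "--pretty=format:%s"]
    if rev_range:
        cmd.append(rev_range)
    out = subprocess.check_output(cmd).decode("utf-8")
    return out.splitlines()


def write_changelog(path="CHANGELOG.txt", rev_range=None):
    """Generate the changelog file; meant to be called from the build step."""
    with open(path, "w") as fh:
        fh.write(format_changelog(git_subjects(rev_range)))
```

Calling `write_changelog(rev_range="v3.7.0..HEAD")` from the packaging script would then limit the log to the commits made since that tag.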


@casperdcl casperdcl mentioned this issue Jan 14, 2016
@jcea
Contributor Author

jcea commented Jan 14, 2016

The simplest approach for you is just to add a link on PyPI to the GitHub changelog page.

@jcea
Contributor Author

jcea commented Jan 15, 2016

Going back to the original aim of this issue :-), I have coded my idea using the ability in tqdm 3.7.1 to subclass the tqdm class easily.

Here is the result:

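import os
import sys
import threading

import tqdm

# "Virtual" bar: keeps per-thread stats but never prints by itself.
class subprogress(tqdm.tqdm) :
    def __init__(self, *args, **kwargs) :
        self._metaprogress = None
        super(subprogress, self).__init__(*args, **kwargs)

    # Nothing to format here: the parent bar does the rendering.
    @staticmethod
    def format_meter(*args, **kwargs) :
        pass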

    def status_printer(self, *args, **kwargs) :
        def print_progress(dummy) :
            if self._metaprogress :
                self._metaprogress()

        return print_progress

    def set_metaprogress(self, metaprogress) :
        self._metaprogress = metaprogress

    def __call__(self) :
        self.update()

class metaprogress(tqdm.tqdm) :
    def __init__(self, path, initial, num_fragmentos) :
        print(os.path.basename(path), file = sys.stderr)
        super(metaprogress, self).__init__(total = num_fragmentos,
                                           initial = initial,
                                           unit = "block",
                                           dynamic_ncols = True,
                                           miniters = 1,
                                           mininterval = 0)
        self._total = num_fragmentos

        self._subprogresses = []
        self.lock = threading.Lock()

    def new_subprogress(self) :
        progress = subprogress(
                total = self._total,
                miniters = 1,
                mininterval = 0)
        self._subprogresses.append(progress)
        progress.set_metaprogress(self)
        return progress

    def __call__(self) :
        with self.lock :
            n = self.n + sum((progress.n for progress in self._subprogresses))
            rate = None
            for progress in self._subprogresses :
                avg = progress.avg_time
                if avg :
                    if rate is None :
                        rate = 1/avg
                    else :
                        rate += 1/avg
            self.sp(self.format_meter(n, self._total, self._time() - self.start_t,
                (self.dynamic_ncols(self.fp) if self.dynamic_ncols else self.ncols),
                self.desc, self.ascii, self.unit, self.unit_scale, rate, self.bar_format))

I hope this is useful to somebody. Remember that I usually have about 2-3 updates per second, so performance is a non-issue. Also, the code is running on Python 3.5; I don't care about 2.x anymore. Finally, this code solves my use case; it is not ready-to-use generic code.

It is an example. Feel free to add it to the examples collection.

Thanks for your work in tqdm. It is highly appreciated.

@casperdcl
Sponsor Member

Ok, thanks. Could you provide a minimal example use case for this code?

@jcea
Contributor Author

jcea commented Jan 16, 2016

Sure. This is a fragment of my production code:

progress_bar = metaprogress(name, done, initial)
[...]
for i in range(concurrency) :
    futures.append(executor.submit(WORKER, progress = progress_bar.new_subprogress()))

I create a "metaprogress" bar with the details: name, total, initial. Then I create the worker threads, passing each of them a new instance of a "virtual" progress bar created with "metaprogress.new_subprogress()".

Each worker calls its private "subprogress" object. The subprogress objects keep their stats private and call the "metaprogress" parent to totalize and actually display the real progress bar. Since this could be called by several threads at the same time, I use a lock to protect the totalization & display. If contention is low, the lock is not a performance problem (it will almost always be unlocked), and if contention is high, you had better have a lock protecting you :-)
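
The locking pattern described above, reduced to a stdlib-only sketch. The names `Total`, `Part` and `worker` are placeholders for illustration and nothing here uses tqdm; each worker increments only its own counter, and the parent sums the counters under a lock before "displaying".

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Total(object):
    """Parent: aggregates the private counters of its parts under a lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.parts = []
        self.last_shown = 0

    def new_part(self):
        part = Part(self)
        self.parts.append(part)
        return part

    def refresh(self):
        # Only one thread at a time may totalize and "display".
        with self.lock:
            self.last_shown = sum(p.n for p in self.parts)


class Part(object):
    """Child: private counter owned by a single worker thread."""

    def __init__(self, parent):
        self.parent = parent
        self.n = 0

    def __call__(self):
        self.n += 1            # only this worker ever writes self.n
        self.parent.refresh()


def worker(progress, blocks):
    for _ in range(blocks):
        progress()


total = Total()
with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(worker, total.new_part(), 50) for _ in range(3)]
    for future in futures:
        future.result()
total.refresh()   # final totalization once every worker has finished
```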

I hope I have explained it clearly enough. Let me know if not.

@lrq3000
Member

lrq3000 commented Jan 22, 2016

Thank you @jcea, we'll see if we can turn that into a tqdm subclass.
