Suggestion: add a progress bar to Scan() #19

Closed
matthewcarbone opened this issue Jul 22, 2018 · 14 comments
@matthewcarbone
Collaborator

Perhaps tqdm?

@mikkokotila mikkokotila self-assigned this Jul 22, 2018
@mikkokotila
Contributor

Ok great, I will add that. I'm using it in another project, and I think it should be fairly simple. I also have a working implementation of a live-updating plot that shows progress per epoch, so I will include both in the same update. It seems the progress bar should be a parameter so it can be turned off (maybe on by default?).

@matthewcarbone
Collaborator Author

Sorry, I missed the question when I read your comment the first time. Yes, I think it is standard to have a toggle for it. There must be an easy way to do that with tqdm if that's what you're using. It's important that you mentioned this, since some notebooks (Google Colab...) do not play nicely with tqdm!

@mikkokotila
Contributor

Tried this and unfortunately it's going to be a little complicated.

The way the loop is handled now:

        if self.round_limit is not None:
            for i in range(self.round_limit):
                self._null = self._run()
        else:
            while len(self.param_log) != 0:
                self._null = self._run()

Of these two, the more common case is the else branch, where round_limit is not set. The first case is very simple and a standard use case for tqdm. The second case is problematic because, if a reduction method is applied, self.param_log can shrink by more than one entry per iteration. That is not to say it is impossible to implement, but it is not going to be as simple as I hoped. The open question is how to handle the case where a reducer is used: for example, you might have 20 hours remaining to start with, then it jumps down to 10, then to 2, and so on. The estimate might also not be very accurate along the way, as round times vary a great deal depending on the hyperparameters.
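
To make the shrinking-total problem concrete, here is a minimal stdlib-only sketch. It is not the actual Scan code: run_with_progress and drop_half are hypothetical names, and popping an item from the list merely stands in for self._run(). It recomputes the estimated total after every round as the rounds done plus the rounds still queued, so the total can drop when a reducer prunes the log.

```python
def run_with_progress(param_log, reducer=None):
    """Drain param_log and record (done, estimated_total) after each round.

    Hypothetical stand-in for the while-loop above: popping one item
    plays the role of self._run().
    """
    done = 0
    history = []
    while len(param_log) != 0:
        param_log.pop()  # one round consumes one permutation
        done += 1
        if reducer is not None:
            reducer(param_log)  # may remove several permutations at once
        # The total is re-estimated each round, so it can shrink.
        history.append((done, done + len(param_log)))
    return history


def drop_half(param_log):
    """Toy reducer (an assumption for illustration): discard the second half."""
    del param_log[len(param_log) // 2:]


history = run_with_progress(list(range(20)), reducer=drop_half)
```

With a progress bar, the same idea would mean resetting the bar's total each round rather than fixing it up front.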

@matthewcarbone
Collaborator Author

matthewcarbone commented Jul 24, 2018

I think I understand. There should be ways to manually update the progress bar. Here's an example (I didn't test it yet).

I can look into this in more detail sometime later as well!

Edit: also, I think just having more or less an indicator of the amount of progress Scan has made is desirable, even independent of the estimated time until the process finishes. (I just like to know something is happening in the background!).

@mikkokotila
Contributor

@x94carbone re 'Edit' > great, I was thinking along the same lines. I think there are two sides to this: something that serves both terminal and notebook use, and then, secondarily, something for notebook use only. For notebooks, a live-updating matplotlib plot seems like the ideal solution, as you can also see epoch-level updates (and get an idea of how the training is going, which can be really valuable, especially early in the process or when an automated reduction/optimization method is used). How about a very simple "x out of xxx rounds left" for the universal case?

@matthewcarbone
Collaborator Author

@mikkokotila right, I totally agree. There's got to be an easy way for the program to detect whether or not it's running in a notebook. Regardless, since there are some notebooks that don't play nicely with these progress bars, we should implement some default behavior first. Something like what you suggested should work great! In particular, maybe something that constantly overwrites the first output line so we don't overwhelm the user with a ton of output.
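
One commonly used heuristic for the notebook check is sketched below. It is not an official API: it relies on IPython injecting get_ipython into the namespace and on the Jupyter kernel's shell class being named ZMQInteractiveShell, both of which are assumptions that may not hold in every front end (Colab, for instance, may report something different). The function name running_in_notebook is hypothetical.

```python
def running_in_notebook():
    """Best-effort guess at whether we are inside a Jupyter notebook kernel."""
    try:
        # get_ipython only exists when IPython has injected it.
        shell_name = get_ipython().__class__.__name__  # noqa: F821
    except NameError:
        return False  # plain Python interpreter
    # Assumption: Jupyter kernels use this shell class name;
    # terminal IPython uses TerminalInteractiveShell instead.
    return shell_name == "ZMQInteractiveShell"


in_notebook = running_in_notebook()
```

Because this is only a guess, it seems safer to use it for picking a default and still let the user override it.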

@mikkokotila
Contributor

@x94carbone what do you think, can we assume that the user is not going to want verbose for model.fit set to anything except 0? If we can assume that, we can detect the OS and then use the matching system clear command (cls on Windows, clear otherwise). Here we just accept that whatever has been shown in the terminal / notebook will be cleared. What do you think?
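
A rough sketch of the OS-dependent clear, assuming os.name is enough to tell Windows apart from everything else. The helper names clear_command and clear_screen are made up for illustration and are not part of the project.

```python
import os


def clear_command(os_name=None):
    """Return the shell command that clears the terminal for the given OS."""
    name = os.name if os_name is None else os_name
    # os.name is "nt" on Windows; treat everything else as POSIX-like.
    return "cls" if name == "nt" else "clear"


def clear_screen():
    """Clear the terminal; everything printed so far is lost."""
    os.system(clear_command())
```

Note that this only clears a real terminal. Inside a notebook the cell output is not a terminal, so a different mechanism would likely be needed there.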

@matthewcarbone
Collaborator Author

We might not even have to make it that complicated, actually; I may have been overthinking it. We could implement three options and let the user choose for themselves: 0 for silent running, 1 for a progress bar and 2 for raw text output. Maybe that's a better starting place than detecting the OS or whether the user is in a notebook. Then once we have that working we can go from there. What do you think?
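
The three-option idea could look roughly like the sketch below. The constant names, the report function and its verbose and stream parameters are all assumptions for illustration, and the "progress bar" here is a hand-rolled text bar rather than tqdm.

```python
import io
import sys

SILENT, PROGRESS_BAR, RAW_TEXT = 0, 1, 2  # hypothetical verbosity levels


def report(done, total, verbose=PROGRESS_BAR, stream=None):
    """Report progress according to the chosen verbosity level."""
    stream = sys.stdout if stream is None else stream
    if verbose == SILENT:
        return
    if verbose == PROGRESS_BAR:
        width = 20
        filled = width * done // total if total else width
        bar = "#" * filled + "-" * (width - filled)
        # "\r" returns to the start of the line so the bar redraws in place.
        stream.write("\r[{}] {}/{}".format(bar, done, total))
    elif verbose == RAW_TEXT:
        stream.write("round {} of {} finished\n".format(done, total))
    else:
        raise ValueError("verbose must be 0, 1 or 2")
    stream.flush()


# Usage: capture the raw-text output in a buffer instead of the terminal.
buf = io.StringIO()
report(3, 10, verbose=RAW_TEXT, stream=buf)
```

Keeping the choice in a single parameter means the OS or notebook detection can be added later as a way to pick the default, without changing the interface.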

@matthewcarbone
Collaborator Author

I have another idea actually. We don't need a progress bar, we can do this with the Python 3 print function if we're clever. Backwards compatible via from __future__ import print_function, I think. I will give this a shot at some point soon.
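
A minimal sketch of the print-based idea, assuming the output goes to something that honors carriage returns (most terminals do; some notebook front ends may not). The name print_progress is hypothetical. On Python 2 the module would additionally need from __future__ import print_function as its first statement.

```python
import io
import sys


def print_progress(done, total, stream=None):
    """Overwrite the current output line with an 'x/total rounds done' counter."""
    stream = sys.stdout if stream is None else stream
    # Stay on the same line until the last round, then finish with a newline.
    end = "\n" if done >= total else ""
    print("\r{}/{} rounds done".format(done, total), end=end, file=stream)
    stream.flush()


# Usage: write two updates into a buffer to show what gets emitted.
buf = io.StringIO()
print_progress(1, 3, stream=buf)
print_progress(3, 3, stream=buf)
```

Each call starts with "\r", so in a terminal the new counter is drawn over the previous one instead of adding a line.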

@mikkokotila
Contributor

That would be good if we could do it very simply! :)

@mikkokotila
Contributor

I've added tqdm, which is on by default. Together with some other notable changes, you can find it in dev-mikko, and hopefully tomorrow in daily-dev as well (I will have to do more testing, and also add some unit tests).

@Geekgurus

This should help.
Example solution from my experience

Configuration
If tqdm is not already installed, install it with pip.

pip install tqdm
Suppose the flyer in question is called flyer, an instance of the class Flyer. Find where flyer is defined in your IPython profile. Something like:

flyer = Flyer(some_arguments)
We can subclass Flyer, customizing its collect method to incorporate a progress bar.

from tqdm import tqdm

class FlyerWithProgressBar(Flyer):

    def collect(self):
        # Wrap the generator so each yielded data point advances the bar.
        yield from tqdm(super().collect())

flyer = FlyerWithProgressBar(some_arguments)

If you have some way of knowing in advance how many data points will be yielded by collect, you can provide that information to tqdm to make it more informative. (It can predict the time remaining, for example.) Suppose we have a flyer that always produces 100 data points. The class should be defined like:

class FlyerWithProgressBar(Flyer):

    def collect(self):
        # Passing total lets tqdm estimate the time remaining.
        yield from tqdm(super().collect(), total=100)

Or, add an attribute that can be updated interactively: 👻

@Geekgurus

And finally it goes this way 😳

interactively:

class FlyerWithProgressBar(Flyer):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)  # pass arguments to Flyer
        self.total = 100  # can be changed interactively before collecting

    def collect(self):
        yield from tqdm(super().collect(), total=self.total)

Here is a demo of FlyerWithProgressBar.

In [1]: from bluesky.plans import fly

In [2]: flyer = FlyerWithProgressBar()

In [3]: plan = fly([flyer])

In [4]: RE(plan)
100%|█████████████████████████████████████████████████| 100/100 [00:01<00:00, 92.35it/s]
Out[4]: ['3acf0eb7-96bf-4c09-b813-e715dab

@mikkokotila
Contributor

The progress bar was added some time ago and is now available in master as well as on pip, so closing here.
