Multi {threading/processing} support #264

Closed
brunolnetto opened this issue Jan 31, 2022 · 51 comments

Comments

@brunolnetto

brunolnetto commented Jan 31, 2022

Description

Given multiple tasks processed in parallel, a progress bar for each one may bring some enlightenment to the developer.

Code

I found this option here, but its interface makes it too slow for processing heavy tasks. Your ASCII-based solution suits me well.

Versions

  • Python version: 3.8.10 (default, Nov 26 2021, 20:14:08) [GCC 9.3.0]
  • Python distribution/environment: CPython/Anaconda/IPython/IDLE
  • Operating System: Ubuntu Linux
@brunolnetto
Author

We may even try to revamp the architecture of this solution here: rsalmei/alive-progress#20 (comment)

@wolph
Owner

wolph commented Jan 31, 2022

It's been a long standing issue to add this and it's at the top of my to-do list for this library:

It is a complicated issue to implement well, however. My first few attempts at it broke other behaviour in the progressbar, and backwards compatibility is quite important to me. This progressbar is perhaps not as pretty as alive-progress out of the box, but it should at least be very fast and stable :)

@brunolnetto
Author

brunolnetto commented Jan 31, 2022

Take a look at the link in the second post. The solution may lie in re-rendering the multiple bars. My best shot is:

  • add a class property counter for the progress-bar instance
  • listen to stdout and look for the \n character;
  • update the current counter (for strongly typed languages, I recommend either an unsigned long int or a scientific mantissa-like notation) for every \n in the stdout output;
  • Bars with very different rates (100 or more) require special treatment. A good approach is to take, for example, the max of some default minimum refresh rate and either average or median of bars current rates.

The bars could always stay at the top of the terminal output of the running command.
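
The refresh-rate suggestion in the last bullet can be sketched in a few lines. This is only an illustration of the idea, not anything from the library: `shared_refresh_rate` and the default of 10 redraws per second are assumptions made for the example.

```python
from statistics import median


def shared_refresh_rate(bar_rates, min_rate=10.0):
    """Pick one redraw rate (per second) shared by all bars.

    `bar_rates` holds each bar's current update rate. The median keeps a
    single very fast bar from dictating the redraw rate for every bar,
    while `min_rate` is a floor so that slow bars still refresh.
    """
    if not bar_rates:
        return min_rate
    return max(min_rate, median(bar_rates))


# Three bars whose rates differ by a factor of 200
print(shared_refresh_rate([2.0, 5.0, 400.0]))
```

Using the average instead of the median would let the fastest bar pull the shared rate up, which is the trade-off the bullet leaves open.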

@wolph
Owner

wolph commented Jan 31, 2022

It's slightly more complicated than that. Within a terminal you can't just write at a specific line; all output is appended at the end of the current line.

To rewrite the current line we can use \r, which is supported by many shells and environments, but writing multiple lines simultaneously requires more advanced control over the terminal, which has much less support.

It is very likely that PyCharm, for example, won't support it.
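
For reference, the single-line \r technique looks roughly like this. It is a minimal sketch with hypothetical helper names (`render_bar`, `draw`), not the library's rendering code.

```python
import sys


def render_bar(done, total, width=20):
    """Build a plain ASCII bar such as [#####...............] 5/20."""
    filled = width * done // total
    return f"[{'#' * filled}{'.' * (width - filled)}] {done}/{total}"


def draw(stream, done, total):
    # "\r" moves the cursor back to column 0, so the next write
    # overwrites the same line instead of appending a new one.
    stream.write("\r" + render_bar(done, total))
    stream.flush()


if __name__ == "__main__":
    for step in range(11):
        draw(sys.stdout, step, 10)
    sys.stdout.write("\n")
```

This only ever touches the line the cursor is on, which is why it cannot draw several bars below each other by itself.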

@brunolnetto
Author

It seems that line starve, in combination with the delete character or an ANSI cursor-movement sequence, may be useful here.

@wolph
Owner

wolph commented Sep 14, 2022

While it's not a full solution, this little bit of code reasonably approaches the goal: #189 (comment)

@wolph
Owner

wolph commented Sep 14, 2022

And for a solution that features locks for multithreading: #176

@brunolnetto
Author

Hi, Wolph. My surname is "Wolf". What a coincidence!
Nevertheless, let me get this straight: we update the progress bars only at particular sample instants, right? That is very reasonable! Will somebody release it as a feature sometime in the near future?

@wolph
Owner

wolph commented Sep 29, 2022

Haha, that's indeed a coincidence :)

Yes, that is correct. It is doing slightly more now than it needs to do, but it's still reasonably efficient. When you have 2 progressbars what it essentially does when updating the top progressbar:

  1. move the cursor up
  2. print the progressbar
  3. move the cursor back down
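
The three steps above map onto standard ANSI escape sequences. The following is a sketch under that assumption and not the library's actual implementation; `update_line_above` is a hypothetical helper.

```python
import sys

CSI = "\x1b["  # ANSI Control Sequence Introducer


def update_line_above(stream, lines_up, text):
    """Rewrite the line `lines_up` rows above the cursor, then come back."""
    stream.write(f"{CSI}{lines_up}A")      # 1. move the cursor up
    stream.write(f"\r{CSI}2K{text}")       # 2. clear that line, print the bar
    stream.write(f"{CSI}{lines_up}B\r")    # 3. move the cursor back down
    stream.flush()


if __name__ == "__main__":
    print("top bar:    0%")
    print("bottom bar: 0%")
    # The cursor is now below both lines; the top bar is two rows up.
    update_line_above(sys.stdout, 2, "top bar:    50%")
```

Terminals that ignore these sequences will print them as raw characters.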

With regards to the new feature, I have to review the impressive bit of code that @kdschlosser wrote: https://github.com/kdschlosser/python-progressbar/tree/multi_thread
I will merge it if it doesn't break any existing scenarios/environments. And that's something I still need to test :)
As always, time is limited but I hope to have this pushed out within a month or so.

@brunolnetto
Author

The test step must be very satisfying. We hope it stands! Great job!

@brunolnetto
Author

I was messing around the issues section and wondered: don't you think it is a great time to refactor the README file and add current functionalities and beauties of this library? It is very convenient.

@kdschlosser

Update the readme for what??

@brunolnetto
Author

Fancy words, right?! I mean, developers tend to think they speak a "devish" language or something. I mean refactoring the README in a more user-friendly way, to bring the library's useful functionality to simple daily activities. Also, some images of use cases would be welcome as well.

@kdschlosser

Have at it and submit a PR for it. Personally I am one to keep the Readme to describe the library and maybe some screen shots. Usage would be and should be contained in the docstrings.

@kdschlosser

For the changes I made there is no real usage to know about. Create more than a single bar and update them. This can be done in a single thread or in multiple threads; it doesn't matter. So from a usage standpoint there is nothing that needs to be done differently with multiple bars, other than constructing more than one bar.
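
To illustrate why updating from one thread or from several can look the same to the caller, here is a stdlib-only sketch in which a lock guards the shared state. `BarBoard` and `worker` are hypothetical names made up for the example and are not classes from the library or from the branch discussed here.

```python
import threading


class BarBoard:
    """Hypothetical shared state for several bars, guarded by one lock."""

    def __init__(self, count, total):
        self.values = [0] * count
        self.total = total
        self._lock = threading.Lock()

    def increment(self, index):
        # The lock keeps updates from different threads consistent.
        with self._lock:
            self.values[index] += 1

    def render(self):
        with self._lock:
            return [
                f"bar {i}: {value}/{self.total}"
                for i, value in enumerate(self.values)
            ]


def worker(board, index):
    for _ in range(board.total):
        board.increment(index)


board = BarBoard(count=3, total=100)
threads = [
    threading.Thread(target=worker, args=(board, i)) for i in range(3)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print("\n".join(board.render()))
```

The calling code only ever calls `increment`, whether it runs in the main thread or in a worker.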

@brunolnetto
Author

I see. I recommend you add a minimal example snippet as well as a screenshot of the final result or a GIF to the existing README for future users to enjoy it as you do.

@wolph
Owner

wolph commented Oct 8, 2022

Fancy words, right?! I mean, developers tend to think they speak a "devish" language or something. I mean refactoring the README in a more user-friendly way, to bring the library's useful functionality to simple daily activities. Also, some images of use cases would be welcome as well.

Documentation really isn't my strong suit unfortunately... all of my projects suffer from this issue. All help is welcome though, if anyone is better at documenting :)

@brunolnetto
Author

I can help if you list the "hot" topics of this library in a list of bullet points. :) (or any library you wish to improve)

@wolph
Owner

wolph commented Oct 18, 2022

I can help if you list the "hot" topics of this library in a list of bullet points. :) (or any library you wish to improve)

In the case of this library... I think one of the nicer features is that it can do full output redirection (i.e. when enabled, you can still do print(...) while using the progress bar).
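
The idea behind printing while a bar is active can be sketched without the library: erase the bar, write the message on its own line, then redraw the bar. This is a conceptual sketch only; `BarWithPrint` is a hypothetical class, and the library's own redirection (the README documents a `redirect_stdout=True` argument) works differently under the hood.

```python
import sys


class BarWithPrint:
    """Hypothetical single-line bar that lets messages scroll above it."""

    def __init__(self, stream, total):
        self.stream = stream
        self.total = total
        self.done = 0

    def _draw(self):
        self.stream.write(f"\r{self.done}/{self.total}")
        self.stream.flush()

    def update(self, done):
        self.done = done
        self._draw()

    def print(self, message):
        # "\x1b[2K" erases the whole line, so the message replaces the
        # bar; the bar is then redrawn on the fresh line below it.
        self.stream.write("\r\x1b[2K" + message + "\n")
        self._draw()


if __name__ == "__main__":
    bar = BarWithPrint(sys.stdout, 10)
    for step in range(1, 11):
        bar.update(step)
        if step % 5 == 0:
            bar.print(f"checkpoint at step {step}")
    sys.stdout.write("\n")
```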

I think some animated gifs of the output could be very useful to demonstrate what it can do. There are quite a few examples available but I'm not really sure what would appeal to most people.

@brunolnetto
Author

Great remarks! It reminds me of the morgan library with its log categories. I will report back to you later on with some tests. :-)

@brunolnetto
Author

brunolnetto commented Oct 19, 2022

In my experience, good libraries are almost invisible to people and still make them aware of their existence. Example: NumPy, pandas, tqdm, matplotlib. In my opinion, a non-verbose library is a good development practice.

In our case, some of this library's practices are somewhat "user-frightening", such as widgets and the explicit use of objects like "progressbar.ProgressBar" or "progressbar.AnimatedMarker". They are necessary for building the feature, but only the programmer needs to know about them. For example, instead of writing "progressbar.AnimatedMarker(fill='#')", the user could provide the marker preference '#' to "progressbar.ProgressBar", and the library would assign the respective widgets as it sees fit.

Do you understand me?

@brunolnetto
Author

The words below are those needed by the user. A common human uses at most 10 words in his/her "cache memory".

ProgressBar
Bar
progressbar
Percentage
AnimatedMarker
MultiRangeBar
MultiProgressBar
GranularBar
PercentageLabelBar
RotatingMarker
ETA
FileTransferSpeed
ReverseBar
SimpleProgress
Counter
Timer
FormatLabel
BouncingBar
AdaptiveETA
AbsoluteETA
AdaptiveTransferSpeed
Variable
FormatCustomText

@github-actions github-actions bot added the Stale label Aug 21, 2023
@github-actions github-actions bot closed this as not planned Aug 28, 2023
@brunolnetto
Author

@wolph How are you? I hope you are doing great. Have you taken a look at this idea? Thanks. :-)

@wolph
Owner

wolph commented Aug 28, 2023

The stale bot is a bit overzealous (I'm still tweaking its settings). The feature is actually already in the current beta release and I'm working on a proper release right now :)

https://pypi.org/project/progressbar2/4.3b0/

@wolph wolph reopened this Aug 28, 2023
@wolph wolph added in-progress and removed Stale labels Aug 28, 2023
@brunolnetto
Author

brunolnetto commented Aug 28, 2023

Unrelated topic: I see you use tox. Consider using poetry. I can push a PR, if you allow me.

@wolph
Owner

wolph commented Aug 28, 2023

Tox and poetry serve different purposes though, tox allows for testing on multiple platforms and multiple things in parallel. Poetry is great for project management.

I actually use poetry quite a lot for projects but I found it lacking (the last time I tried) for building and deploying packages. That was about 2 years ago however and poetry has improved quite a bit so it might be sufficient now. The only issue is that I use a single build and deploy script for all the packages I maintain so I would either need to migrate all packages or expand on that script as well.

Regardless, it would be good to have a poetry workflow for 3rd party developers willing to contribute though...

@brunolnetto
Author

I wrote the Python library eule, which I find quite decent to manage. The main commands are in the Makefile. Please take a look and say whether this is of interest to you.

@kdschlosser

It was quite a bit of work to hammer out the best way to track the cursor position and update only the lines that need updating. If all bars get redrawn over and over again, it leads to flickering. Coming up with a cross-platform way to track and set the cursor position without having to go too crazy with writing OS-specific code was the tricky part. It took several people trying many different things to come up with a way to get it done. @wolph now has the task of taking the code and adding it in a manner that is easy to use and doesn't disrupt too much of the existing code base or cause any API-breaking changes. That's the really hard part.
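
The "update only the lines that need updating" idea can be sketched as a diff between the previous and the current frame. This is an illustration under the assumption of an ANSI-capable terminal with the cursor parked on the line just below the block of bars; `changed_lines` and `redraw` are hypothetical helpers, not the code from the branch.

```python
def changed_lines(previous, current):
    """Return (index, text) pairs for lines that differ from the last frame."""
    return [
        (index, text)
        for index, text in enumerate(current)
        if index >= len(previous) or previous[index] != text
    ]


def redraw(stream, previous, current):
    """Rewrite only the changed lines and return the new frame.

    Assumes the block of len(current) lines was already printed and the
    cursor sits on the line directly below it.
    """
    height = len(current)
    for index, text in changed_lines(previous, current):
        up = height - index  # rows between the cursor and this line
        # Move up, clear the line, write the new text, move back down.
        stream.write(f"\x1b[{up}A\r\x1b[2K{text}\x1b[{up}B\r")
    stream.flush()
    return list(current)
```

Lines that did not change are never touched, which is what avoids the flicker of a full redraw.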

@wolph
Owner

wolph commented Aug 30, 2023

Not breaking backwards compatibility while adding new features is indeed the hard part. This library has a ton of legacy stuff unfortunately, but I think it's worth it so people don't have to deal with breaking code :)

At some point I might have to cull stuff, but for now even code from the original 2006 version of the progressbar should still work without any issues. Almost 20 years of compatibility ;)

All of the code is working fine now; I just need to lint, test and modernize the build system a bit. Hopefully hatch can help me largely switch to a pyproject.toml file, since GitHub is rather stupid when it comes to parsing a setup.py file, so the current version certainly has issues.

@kdschlosser

Using a pyproject.toml file is not that hard unless you have a complex build system. If you have to compile anything, or have anything that requires special treatment, you are going to want to keep the setup.py file. Writing a build backend is not what I call a good time. If you think distutils was a pain to deal with, build backends are a complete nightmare because of all the voodoo magic code in them. They are hard to debug and don't produce helpful traceback information because of their extensive use of subprocess.

@wolph
Owner

wolph commented Aug 30, 2023

The issue is mainly that I want to keep my __about__.py file leading and I want the pyproject.toml to read from that.

@kdschlosser

You can read from the about file in the pyproject.toml file. I have to remember how the hell to do it, but I know it can be done.

@wolph
Owner

wolph commented Aug 30, 2023

Indeed, but it depends what build tool you use and it doesn't work for all attributes. I'm currently testing hatch, that might be sufficient for me
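
One way a build step can read values from an `__about__.py` without importing the package is to parse the file statically. The sketch below assumes the file contains plain literal assignments; `read_about` is a hypothetical helper and the sample values are made up, so this is not how hatch or any other specific build tool is implemented.

```python
import ast


def read_about(source):
    """Collect simple `name = <literal>` assignments from module source."""
    values = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign) and len(node.targets) == 1:
            target = node.targets[0]
            if isinstance(target, ast.Name):
                try:
                    values[target.id] = ast.literal_eval(node.value)
                except ValueError:
                    pass  # not a literal, e.g. a function call: skip it
    return values


# Made-up file content for illustration
example = "__version__ = '1.2.3'\n__author__ = 'Jane Doe'\n"
print(read_about(example)["__version__"])
```

Attributes computed at import time cannot be recovered this way, which fits the remark that it doesn't work for all attributes.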

@brunolnetto
Author

Even though hatch may be a good tool, there are some uncovered domains. Consider using poetry instead before going further.


@wolph
Owner

wolph commented Aug 30, 2023 via email

@brunolnetto
Author

@wolph I opened an issue on the package repository python-poetry. You can read it: python-poetry/poetry#8388

@wolph
Owner

wolph commented Aug 30, 2023

It seems it's definitely not supported :P
I'm certainly not opposed to poetry as I use it in pretty much all of my regular projects for dependency management. However... I've also seen multiple bugs and broken features such as OS specific dependencies not working, the extras system that doesn't properly work, etc. So I'm a bit hesitant to switch to it for long-term package building since it can take quite a bit of work to switch over all of my packages and build scripts.

Having said that, it seems that even bare setuptools can do what I need so I might just try that instead. That's definitely a long term stable project.
https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html#dynamic-metadata

In any case, thank you for opening the issue and helping out @brunolnetto :)

@wolph
Owner

wolph commented Aug 31, 2023

I think it should be noted that poetry isn't consistent in this manner either... the readme is read from a single or multiple files: https://python-poetry.org/docs/pyproject/#readme
Rightfully so of course, it would be ludicrous not to have the readme external but I'm not sure why other external values are an issue.

From what I can gather it is supported by setuptools, hatch, flit and pdm. The only one that does not support this is poetry.

I absolutely love using poetry for projects as it's much less clunky than all the other options such as pipenv, but I'm not convinced that it's a good package builder. Naturally it doesn't have to be, hatch and flit only handle building and I personally lean more towards the unix mindset of having 1 tool for 1 job.

It also irks me a bit that poetry isn't using the regular [project] keys in the pyproject.toml but requires the use of [tool.poetry] instead. There might be a good reason for that, but I prefer PEP compliance when possible.

@kdschlosser

The problem with setuptools is its reliance on distutils and distutils doesn't like to play nice when compiling C code on Windows using MSVC

In the event you run into an issue with it There is a solution.

https://github.com/kdschlosser/python_msvc

It's crazy easy to use and it will guarantee a successful build with MSVC. I wanted to throw that out there to ya in the event you have had anyone who was not able to compile under Windows using MSVC. Cython has a plug in their docs about using it.

@wolph
Owner

wolph commented Sep 4, 2023

I've taken a proper look at converting everything to pyproject.toml but reading those details is clunky at best: https://stackoverflow.com/questions/10567174/how-can-i-get-the-author-name-project-description-etc-from-a-distribution-objec

While I admire the idea of having everything statically in the pyproject.toml, I have backwards compatibility to keep in mind here, and converting something that's currently static at runtime to a dynamic bit of code that takes more time seems like a step backwards.

When it comes down to it, runtime happens often, install time very rarely. So any time not needed to be spent at runtime is worth it in my opinion. And reading your version dynamically at every import of the library such as what poetry does seems like a waste of CPU time.
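
For comparison, a runtime version lookup of the kind described here is typically done through the standard library's `importlib.metadata`. A minimal sketch; `installed_version` and the distribution name in the example are placeholders chosen for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name, fallback="unknown"):
    """Look up a distribution's version from its installed metadata.

    This scans the installed package metadata every time it is called,
    whereas a hard-coded `__version__ = "..."` costs nothing at import.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return fallback


print(installed_version("surely-not-an-installed-dist-xyz"))
```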

@brunolnetto
Author

brunolnetto commented Sep 4, 2023

Make backward compatibility an achievement milestone. :-P Although I agree with your reasonable point, this is usually done once in a while; it is not an operation that will be used often. Therefore, CPU time is not critical here...

@wolph
Owner

wolph commented Dec 18, 2023

The new version with this included has been released

@wolph wolph closed this as completed Dec 18, 2023
@brunolnetto
Author

Could you write a minimal example snippet for me to test on the fly?

@wolph
Owner

wolph commented Dec 18, 2023

The release comes with a few examples in the readme: https://github.com/wolph/python-progressbar?tab=readme-ov-file#showing-multiple-independent-progress-bars-in-parallel

But if you're looking for something different, please let me know :)
It's a very new feature so backwards compatibility is not a large issue yet ;)

@brunolnetto
Author

I must be doing something very wrong: the very first example, below, just logs TypeError: 'module' object is not callable. I tried the other examples, but they also fail with an error about a non-existent function. You can try to reproduce the same steps as me on Google Colab: https://colab.research.google.com/drive

import time
from progressbar import progressbar

for i in progressbar(range(100)):
    time.sleep(0.02)

@wolph
Owner

wolph commented Dec 18, 2023

It looks like I forgot to update the readme after the last code cleanup... the lines and stream arguments of the LineOffsetStreamWrapper are swapped in the readme example.

Fixed example:

import random
import sys
import time

import progressbar

BARS = 5
N = 100

# Construct the list of progress bars with the `line_offset` so they draw
# below each other
bars = []
for i in range(BARS):
    bars.append(
        progressbar.ProgressBar(
            max_value=N,
            # We add 1 to the line offset to account for the `print_fd`
            line_offset=i + 1,
            max_error=False,
        )
    )

# Create a file descriptor for regular printing as well
print_fd = progressbar.LineOffsetStreamWrapper(0, sys.stdout)

# The progress bar updates, normally you would do something useful here
for i in range(N * BARS):
    time.sleep(0.005)

    # Increment one of the progress bars at random
    bars[random.randrange(0, BARS)].increment()

    # Print a status message to the `print_fd` below the progress bars
    print(f'Hi, we are at update {i+1} of {N * BARS}', file=print_fd)

# Cleanup the bars
for bar in bars:
    bar.finish()

The second example was already correct :)

@brunolnetto
Author

brunolnetto commented Dec 18, 2023

This example logs TypeError: __init__() got an unexpected keyword argument 'max_value'. Also, I think it is a great idea to use context managers like this library did: https://pypi.org/project/alive-progress/

@wolph
Owner

wolph commented Dec 18, 2023

I think you might be using a different progressbar module. It seems that Colab comes with a progressbar module different from this one.
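
A quick way to check which module an import actually resolved to is to look at its file path and at the attribute being called. The helper below is a hypothetical diagnostic, demonstrated on stdlib modules so that it runs anywhere; in the notebook one would call it with ("progressbar", "progressbar").

```python
import importlib


def describe_module(name, attribute):
    """Report where a module was loaded from and what `attribute` is."""
    module = importlib.import_module(name)
    return {
        "file": getattr(module, "__file__", None),
        "has_attribute": hasattr(module, attribute),
        "callable": callable(getattr(module, attribute, None)),
    }


# json.dumps is a function, so it is callable
print(describe_module("json", "dumps"))
# os.path is a submodule: calling it would raise
# "TypeError: 'module' object is not callable"
print(describe_module("os", "path"))
```

If `callable` comes back `False` for the attribute being called, the import picked up a different package than the one the README example expects.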

Here's an example Colab notebook including the install: https://colab.research.google.com/drive/1Bkxi0r6tPZQbEwtu9vf6YslPC-GBfkOm?usp=sharing

There's also another option that uses context managers similar to alive-progress: https://github.com/wolph/python-progressbar?tab=readme-ov-file#multiple-threaded-progressbars

Within Colab it doesn't appear to work because it doesn't understand the ANSI escape sequences used. These 2 bars (as opposed to the rest of the library) only work in regular shell/terminal sessions that support ANSI escape sequences, such as bash, zsh, etc.
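
A common, rough heuristic for deciding whether cursor-movement sequences are worth attempting is to check whether the output stream is attached to a terminal. This is a sketch of that heuristic only; `supports_cursor_movement` is a hypothetical helper and not the library's own detection logic.

```python
import sys


def supports_cursor_movement(stream):
    """Heuristic: only try ANSI cursor movement on a real terminal."""
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


if __name__ == "__main__":
    if supports_cursor_movement(sys.stdout):
        print("terminal detected: multi-line bars are worth trying")
    else:
        print("no terminal detected: fall back to plain line-by-line output")
```

The check errs on the side of caution: some environments that are not terminals can still render a subset of ANSI sequences, and some terminals ignore them.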

I could try writing a widget for Colab to do the same if that would be useful; I'm not sure how easy or hard that would be, though.

@brunolnetto
Author

Given the current multipurpose usage of progress bars, I say it is relevant to make them renderable in Python notebooks. You may try to use something less advanced than ANSI, like Unicode or even ASCII. :-P

@brunolnetto
Author

Additionally, although I know you must be proud of the module structure you built, from a user perspective, it is wise to expose only the most friendly assets and not the crude objects used to orchestrate the main feature.

@wolph
Owner

wolph commented Dec 19, 2023

Given the current multipurpose usage of progress bars, I say it is relevant to make them renderable in Python notebooks. You may try to use something less advanced than ANSI, like Unicode or even ASCII. :-P

That's a very fair point :)
These progressbars are oftentimes used via the commandline, but notebooks have become increasingly more popular so I need to look at that as well.

Additionally, although I know you must be proud of the module structure you built, from a user perspective, it is wise to expose only the most friendly assets and not the crude objects used to orchestrate the main feature.

I fully agree, that's not my strong suit unfortunately. But we'll get there at some point I hope :)
