Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the progress bar smoother #38

Merged
merged 5 commits into from
Apr 26, 2022

Conversation

iainelder
Copy link
Collaborator

Use the "Submit and use as completed" pattern as described by Jason Brownlee in
"ThreadPoolExecutor in Python: The Complete Guide".

Fixes #37

Use the "Submit and use as completed" pattern as described by Jason Brownlee in
"ThreadPoolExecutor in Python: The Complete Guide".

Fixes connelldave#37
with futures.ThreadPoolExecutor(max_workers=self.thread_workers) as executor:
with ThreadPoolExecutor(max_workers=self.thread_workers) as executor:
jobs = [executor.submit(self.cove_thread, s) for s in self.sessions]
outputs = (f.result() for f in as_completed(jobs))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a huge nitpick comment, but I stared at this for ages trying to work out what didn't sit well with me: it's a super clean solution - I feel like outputs should be something that makes clear that the variable stored is a generator.

Passing the generator directly to line 43 doesn't make it any clearer, but line 39 means you need to know exactly how this works mechanically or it's just baffling :D

My suggestion is completed_futures_generator. It's not exactly a clever name :D A comment with # https://superfastpython.com/threadpoolexecutor-in-python/#Submit_and_Use_as_Completed feels worthwhile too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a nitpick at all! In fact the difference between this working "smoothly" and being totally broken is exactly two characters.

If you replace the generator syntax (...) with list comprehension syntax [...] the progress bar is invisible during the whole operation until it appear at the end already 100% complete. That's because tqdm doesn't even see the input until all the futures have completed.

So I think making it more explicit is the right thing to do.

Names don't have to be clever, but they should be clear. outputs is about as vague as they come :-) completed_futures_generator does a better job of telling what it is.

I wonder if maybe a new helper function called something like iter_each_result_as_completed that uses the yield syntax would be even clearer.

Copy link
Collaborator Author

@iainelder iainelder Apr 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My latest commit aims to make more explicit what's going on to implement the pattern. What do you think?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice change - much more parsable imo. I've been burned enough times by (x) vs (x,) to treat braces with suspicion where possible :)

with ThreadPoolExecutor(max_workers=self.thread_workers) as executor:
jobs = [executor.submit(self.cove_thread, s) for s in self.sessions]
outputs = (f.result() for f in as_completed(jobs))
def submit_decorated_func_with_all_sessions(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think this can just be inline - a type hint on the list comp has the same impact without needing a seperate func

futures: List["Future[CoveSessionInformation]"] = [executor.submit(self.cove_thread, s) for s in self.sessions]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer functions with descriptive names over inline comments, so I had replaced your inline comment with a function named to similar effect. In this case the inline function doesn't do much for the readability of the code, so I'm okay with inlining it.

The original comment is still gone because I think the new comment about the pattern does a better job of explaining the code.

Mypy knows the type without a type hint, so I'd prefer to omit the hint. Do you have another reason for including it?

Copy link
Owner

@connelldave connelldave Apr 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not particularly worried about that comment - the original that's removed isn't super useful and should probably be a docstring if anything.

My reply was intended for the inner function being better served by just resolving the list comprehension in the outer func, or in this case, didn't feel it needed a func at all since the value it was adding in my eyes was the return annotation :)

Mypy knows the type without a type hint, so I'd prefer to omit the hint. Do you have another reason for including it?

The reason I like that approach is for two reasons - annotations can be more useful for humans than just mypy inference - in this case it's mainly just an annotation saying "here's what this magic results in", and declaring the var type means that if mypy infers different it'll fail which is a useful signal for the occasional corner case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I've added the type hint for clarify. How does it look now?

) -> FutureCoveResults:
return [executor.submit(self.cove_thread, s) for s in self.sessions]

def iterate_results_in_order_of_completion(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea - much more digestable. Only suggestion is not having it inside closure of run_cove_function, no functional benefit - can be outside the class itself as a generic helper

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of using an inner function was to make it clearer that it's just a helper for the enclosing function. It's impossible to reference it from outside.

But I think the generic helper is actually more readable, so I'll go with that.

As a bonus, VS Code will warn about unused functions whose names start with an underscore, so it achieves the same thing as what I was trying with the inner function.

@@ -21,6 +22,7 @@ class CoveSessionInformation(TypedDict):


CoveResults = List[CoveSessionInformation]
FutureCoveResults = List["Future[CoveSessionInformation]"]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably not worth defining for one shot, but again - nitpick :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was modelled after CoveResults, but I wasn't totally convinced either. I'll inline the type hint.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out CoveResults is the same one-shot type alias, so I added another commit to remove it as well.

I think "list of thing" is more readable than an alias to the same. The CoveResults alias had me thinking the type was something more complicated.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, agreed - I think it was a bit more useful once upon a time :)

@connelldave
Copy link
Owner

Looks good! Happy to merge when you are - wasn't sure from a previous merge strategy conversation if I'd been jumping the gun sometimes?

@iainelder
Copy link
Collaborator Author

I seem to have merge rights in the repo. Are you happy for me to press the button?

In #36 I suggested that we replace the default squash merge message with something that "smooths out the wrinkles" of corrective commits.

You can use git log to to simulate what Github would generate for the squash message like this:

$ git log --reverse --format=format:"* %s%n%b" master..
* Make the progress bar smoother
Use the "Submit and use as completed" pattern as described by Jason Brownlee in
"ThreadPoolExecutor in Python: The Complete Guide".

Fixes #37

* Make generator and pattern more explicit

* Refactor to improve readability of run_cove_function

* Use list of session info type hint directly

* Add type hint for clarification

I think here we can discard all the messages except the first one as they are just corrective commits that don't add to the headline feature. So my squashed commit message would look like this:

feat(cove_runner): Make the progress bar smoother

Use the ["Submit and use as completed"][doc] pattern as described by Jason
Brownlee in "ThreadPoolExecutor in Python: The Complete Guide".

[doc]: https://superfastpython.com/threadpoolexecutor-in-python/#Submit_and_Use_as_Completed

Fixes #37

@connelldave
Copy link
Owner

I think you can merge because I approved it, but I did make you a collaborator so CI ran so maybe you can just merge whenever anyway :D

Personally I just rebase -i and squash out the noisy commits into one that way before force pushing over my work in progress commits - "squash and merge" is a lazier version of that but I'm not worried about enforcing anything stylistically here. I think your suggested approach is similar to that but writing the prettified commit message in Github instead of local git, which is probably more user friendly?

If you want to bump the patch version in pyproject.toml and update the changelog though that'd be helpful before merging, then I can just hit the release button.

@iainelder
Copy link
Collaborator Author

I did make you a collaborator so CI ran

I think the CI requires a manual invocation only for a first-time contributor even if they are not a collaborator. As a collaborator, I have a lot of permissions to make changes to your repo and the supporting resources. All the collaborator access permissions are documented.

Apparently I even have permissions to manage releases. Is that what you were expecting? Would you like me to handle the release or is that something you want to retain control of?

If you want to bump the patch version in pyproject.toml and update the changelog though that'd be helpful before merging, then I can just hit the release button.

I'm happy to bump the patch version and update the changelog, but it feels dirty to mix in those changes with this PR. I like how so far the commits that prepare a release are separate from any commits for features and bug fixes.

Since I have collaborator permissions, I could merge this now as-is and then push a new commit to master with the changes to pyproject.toml and the changelog.

But before I make any changes I want to check that the setup is as you were expecting :-)

@connelldave
Copy link
Owner

connelldave commented Apr 26, 2022

The master branch is protected and requires a PR review - I pushed a codeowners file today while I remembered, but I wasn't worried about consequences of the collab permissions, both from a trust point of view and that I'd need to review :)

I'm relatively sure that just giving "run CI" is the biggie, since you can could pull the pypi key out of the CI anyway?

Appreciate you checking though - I definitely haven't given any of this a great deal of scrutiny past my initial thought :) I'll keep responsibility for releases for now (and add a changelog/release commit). Feel free to merge an approved PR when you're ready though.

@iainelder iainelder merged commit 0337849 into connelldave:master Apr 26, 2022
@iainelder iainelder deleted the smoother-progress-bar branch April 26, 2022 18:05
@iainelder
Copy link
Collaborator Author

Feel free to merge an approved PR when you're ready though.

Done! Thanks :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make the progress bar smoother
2 participants