-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor pattern matching [cps/tasks/download.py: parse yt-dlp output for download progress percentage] #88
Conversation
Adjusted the download progress polling to dynamically capture progress updates from the subprocess output, enhancing accuracy.
Fix indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@deldesir Special Request:
Please communicate in 3-6 words the central purpose of each commit.
Not in the hidden secondary commit message. But in the actual commit message.
With non-descriptive commit messages like "Update download.py" it becomes unnecessarily complicated for everyone to understand (1) what is happening and (2) why. 😕
Goal: Commit Messages should Communicate Clear Intent! ✅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, I thought I wrote "Fix indentation"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I wrote it in description field. I thought I was putting the commit message in the title. It was an accident, sorry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everyone benefits when motivation is clear & concise e.g. anything like:
-
"cps/tasks/download.py: Fix indentation"
-
"download.py: Fix indenting in ~30 lines"
(Or anything similar!)
else: | ||
log.error("Failed to send the list of requested files to %s", self.original_url) | ||
self.progress = 0.99 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, I record the first 100% progress cycle as %99 so the script won't stop before yt-dlp finishes. There is often one or more quick 0%-100% cycles after the first one (for example the thumbnail and/or subtitles downloads)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely don't understand it yet, ah well 😎
But thanks so much for explaining!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, I record the first 100% progress cycle as %99 so the script won't stop before yt-dlp finishes. There is often one or more quick 0%-100% cycles after the first one (for example the thumbnail and/or subtitles downloads)
@deldesir ideally put some-or-all of this explanation in-line as a comment, near the Python code in question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please put this explanation (or similar) into a comment near the actual Python code.
Then let's move forward with merging this PR.
Thanks!
Has this PR been quickly tested? If so we should feel free to merge it, to keep community momentum! |
Yes, it is well tested.
|
calibre-web/cps/tasks/download.py Lines 47 to 57 in 4da5c46
The first 100% progress cycle is recorded as %99. This prevent the script from being stopped before yt-dlp finishes. There is often one or more quick 0%-100% cycles after the first one (for example the thumbnail and/or subtitles downloads) |
cps/tasks/download.py
Outdated
@@ -41,44 +42,42 @@ def run(self, worker_thread): | |||
p = process_open(subprocess_args, newlines=True) | |||
|
|||
# Define the pattern for the subprocess output | |||
pattern_analyze = r"Running ANALYZE" | |||
pattern_download = r"'action': 'download'" | |||
pattern_progress = r"downloading" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The word "downloading" is an extremely common word, that might appear anywhere on a line in future.
Does the following make the regex a bit more precise, and future-proof?
pattern_progress = r"downloading" | |
# Equivalent Regex: https://github.com/iiab/calibre-web/blob/3c3c77f4dbf54ff093c3e3a68ac9b93a522c4070/scripts/lb-wrapper#L26 | |
pattern_progress = r"^downloading" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bad News:
Regex "^downloading"
was being used incorrectly yesterday. Functional testing didn't happen when it should have (to catch this serious oversight). ❌
Good News:
PR's like #90, #91 and #93 are getting much smaller and hence tangibly clearer ~ "TDD Craftsmanship" Organic + Incremental + Visible ~ nurturing a Responsive Culture of Collaboration.
By learning from our mistakes (the only way!) one day at a time 🌱 🌱 🌱
Co-authored-by: A Holt <holta@users.noreply.github.com>
Legibility = Understandability needs to be the central driving force going forward:
Thank @deldesir for taking approachable (small) PR's very seriously! AI² = Articulate Assumptions, Intent & Intuition |
Co-authored-by: A Holt <holta@users.noreply.github.com>
@deldesir any quick tip(s) you can provide to @EMG70 to help him reconfirm / validate this just-merged PR? e.g. Is there anything specific he should look for in "Tasks" view today or in coming days? calibre-web/cps/tasks/download.py Lines 44 to 46 in d594aeb
Related:
|
A positive scenario testing should finish with 100% recorded in tasks' progress column.
We will work today on tasks' messages printed in Task column (this is a discussion for another PR) |
These changes enhance the download task in several ways:
Smoke and feature tested on Ubuntu 24.04
I reviewed the changes in the download task logic and subprocess handling.
I confirm that the dynamic progress calculation reflects the download progress.
I verified the improved subprocess output pattern for reliable progress tracking.
I checked the adjusted completion condition for 99% progress at 100% subprocess completion
Testing Steps in Calibre-Web UI