
Abort job if more than 50% of data is missing after trying x MB #1785

Closed

Conversation

@puzzledsab (Contributor)

https://forums.sabnzbd.org/viewtopic.php?f=4&t=25183

Sometimes you can tell early that a download will fail because a lot of articles are missing. After missing_threshold_mbytes MB have been tried, the job fails if the number of failed bytes exceeds the number of downloaded bytes.

I don't see any perfect solution for this and it's a bit hackish, so I didn't make a proper config option. The threshold will have to be a guess, and if the user thinks it's a false positive they can retry. I'm very open to suggestions on how to improve it.
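As a concrete illustration of the proposed rule (all numbers here are made up for the example): with a 50 MB threshold, a job that has tried 60 MB, of which 35 MB failed and only 25 MB downloaded, would be aborted:

```python
MEBI = 1024 * 1024  # bytes per MiB, matching the MEBI constant used in the diff below

# Hypothetical job state after some downloading
bytes_tried = 60 * MEBI
bytes_missing = 35 * MEBI       # articles that failed
bytes_downloaded = 25 * MEBI    # articles that succeeded
missing_threshold = 50 * MEBI   # missing_threshold_mbytes = 50

# The proposed rule: give up once failures outweigh successes,
# but only after enough data has been tried to trust the ratio
abort = bytes_missing > bytes_downloaded and bytes_tried > missing_threshold
print(abort)  # True -> mark the job as failed
```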

@thezoggy (Contributor)

thezoggy commented Feb 5, 2021

Isn't this already implemented as the % required completion value?

@puzzledsab (Contributor, Author)

I don't think so, but I realize now that I don't understand how req_completion_rate is supposed to be used. Why would you require more or less than 100%?

Either way, I think this can let SAB mark the download as failed much earlier when there are a lot of par2 files.
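For context, a hedged guess at what a required-completion check might look like; this is not SABnzbd's actual check_availability_ratio, and the tuple return shape is only inferred from the `job_can_succeed, _ = ...` call in the diff below. A rate above 100% would demand a safety margin (e.g. room for par2 repair), while below 100% would tolerate some missing data:

```python
def check_availability_ratio(self):
    """Sketch only: the job can succeed while the available share of
    the data stays at or above the configured completion rate."""
    if not self.bytes:
        return True, 1.0  # nothing known yet, no verdict possible
    req_ratio = cfg.req_completion_rate() / 100.0
    available = self.bytes - self.bytes_missing
    # Return both the verdict and the ratio actually achieved
    return available >= self.bytes * req_ratio, available / self.bytes
```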

```python
@@ -1172,6 +1173,8 @@ def remove_article(self, article: Article, success: bool):
        # Check the availability of these first articles
        if cfg.fail_hopeless_jobs() and cfg.fast_fail():
            job_can_succeed = self.check_first_article_availability()
            if not job_can_succeed:
                abort_reason = "https://sabnzbd.org/not-complete (check_first_article_availability)"
```
Member:

I understand you want to be specific, but this is not useful for regular users; let's not modify the text.
No need for abort_reason.

```python
job_can_succeed, _ = self.check_availability_ratio()
if not success and job_can_succeed and not self.reuse:
    # Abort if more than 50% is missing after reaching missing_threshold_mbytes
    job_can_succeed = self.check_missing_threshold_mbytes()
```
Member:

This should only be called if cfg.fail_hopeless_jobs() == True, like the other checks.

```python
if self.bytes_missing > self.bytes_downloaded:
    missing_threshold = cfg.missing_threshold_mbytes() * MEBI
    if missing_threshold and self.bytes_tried > missing_threshold:
        return False
```
Member:

Maybe add INFO logging here to notify why the job was aborted.

puzzledsab (Author):

The other tests will only do a debug log. How about either a debug or info log for each test, like this?
```python
logging.debug('Abort job "%s", due to impossibility to complete it (test: check_first_article_availability)', self.final_name)
```
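A sketch of how that could look inside the new check, combining the fragment above with the suggested logging; the `return True` fall-through and the choice of INFO level (per the reviewer's suggestion) are assumptions:

```python
def check_missing_threshold_mbytes(self) -> bool:
    """Return False once more data has failed than was downloaded,
    after at least missing_threshold_mbytes MB have been tried."""
    if self.bytes_missing > self.bytes_downloaded:
        missing_threshold = cfg.missing_threshold_mbytes() * MEBI
        if missing_threshold and self.bytes_tried > missing_threshold:
            # INFO level so regular users can see why the job was aborted
            logging.info(
                'Abort job "%s", due to impossibility to complete it (test: check_missing_threshold_mbytes)',
                self.final_name,
            )
            return False
    return True
```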

```python
@@ -1352,6 +1352,7 @@ def saveSwitches(self, **kwargs):
            "selftest_host",
            "rating_host",
            "ssdp_broadcast_interval",
            "missing_threshold_mbytes",
```
Member:

Could you move this after the req_completion_rate?
Both here and in cfg.py.

@puzzledsab (Contributor, Author)

I kept the old logging but added abort_reason so that all three checks behave the same way.

@thezoggy (Contributor)

thezoggy commented Feb 8, 2021

This change looks like it downloads 50 MB of an NZB and judges completeness based on a percentage of that 50 MB? Or does it trigger when more than 50 MB is missing altogether? Either of these is prone to false failures...

@puzzledsab (Contributor, Author)

More than half of the tried data must be available after reaching the configured limit. Yes, it can cause false positives, but then so can the first-article availability check.

@Safihre (Member)

Safihre commented Feb 10, 2021

@puzzledsab I did a small refactor.
You mentioned something about 50% but I couldn't find that anywhere?
I changed it now to trigger if "missing > succeeded and missing > threshold".

@puzzledsab (Contributor, Author)

`self.bytes_missing > self.bytes_downloaded` translates into failing if more than 50% is missing when the threshold is met. I agree that checking missing is better because it can fail faster.

Not sure how useful this is in the end but at least it's a small change with minimal performance impact.
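A hedged sketch of the refactored trigger Safihre describes; comparing `bytes_missing` directly against the threshold lets the job fail before `missing_threshold_mbytes` MB have even been tried. The exact surrounding code is an assumption:

```python
# Refactored condition (sketch): abort as soon as the missing data both
# outweighs the downloaded data and exceeds the configured threshold
missing_threshold = cfg.missing_threshold_mbytes() * MEBI
if (
    missing_threshold
    and self.bytes_missing > self.bytes_downloaded
    and self.bytes_missing > missing_threshold
):
    return False
```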

@Safihre (Member)

Safihre commented Feb 14, 2021

@puzzledsab So should we still add it? I am also not certain if it is useful...

@puzzledsab (Contributor, Author)

I was hoping to hear back from nismozcar in the thread. I remember having this problem myself a year ago, but I no longer see it because I haven't been downloading the kind of old postings that have it. I think some files have enough of the first articles left not to trigger that check; in those cases it can take a while before SAB gives up, because some servers take a long time to respond when articles are gone.

@puzzledsab puzzledsab closed this Feb 19, 2021
@puzzledsab puzzledsab deleted the feature/article_fail_limit branch June 6, 2021 12:05
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023