
Does --skip-unavailable-fragments finalize the video? #21720

Closed
altimumdelta opened this issue Jul 10, 2019 · 7 comments

@altimumdelta commented Jul 10, 2019

Checklist

  • I'm asking a question
  • I've looked through the README and FAQ for similar questions
  • I've searched the bugtracker for similar questions including closed ones

Question

Does it just skip on any error like 404 or 503 and finalize the download with the available fragments, or not?

@dstftw (Collaborator) commented Jul 10, 2019

What do you mean by finalize in the first place?

@dstftw closed this Jul 10, 2019
@altimumdelta (Author) commented Jul 10, 2019

To mux/merge and save the video as downloaded, treating it as if it's complete. Sorry, I'm doing this in the context of an archive, not a one-off download.

I was hoping it skips them but leaves the file as a temp so the download can be continued in the next session. I wanted to be sure, since it's not something I can easily test; these occurrences are sporadic.

@dstftw (Collaborator) commented Jul 10, 2019

It will skip unavailable fragments and output the video with only the downloaded fragments. Such skipped fragments are also unlikely to be available on the next run, since each fragment is retried by default before being skipped.
It's also not possible to continue on the next run, since fragments are not stored individually but joined immediately to save disk space.
What you can do is abort with --abort-on-unavailable-fragment and continue on the next run, starting with the first unavailable fragment.
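The two behaviors described above can be sketched as a small helper that picks the right flag set (a sketch, not part of youtube-dl; the mode names are made up for illustration):

```shell
#!/bin/sh
# choose_flags MODE -> print the youtube-dl flag for that mode.
choose_flags() {
  if [ "$1" = "archive" ]; then
    # Abort on the first missing fragment so a later run can
    # resume from exactly that point.
    printf '%s\n' "--abort-on-unavailable-fragment"
  else
    # Default-style behavior: skip bad fragments and finalize
    # the file from whatever was fetched.
    printf '%s\n' "--skip-unavailable-fragments"
  fi
}

# Usage sketch ("$URL" is a placeholder):
# youtube-dl $(choose_flags archive) "$URL"
```

The trade-off is exactly the one in the comment above: skipping gives you a playable-but-possibly-incomplete file, aborting gives you a resumable partial download.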

@altimumdelta (Author) commented Jul 11, 2019

Well ... this means over ~20% of all my videos may be more or less broken, and unplayable in some cases if the missing fragment was the file header?

I stopped using --abort-on-unavailable-fragment and --abort-on-error because I figured: why hang on that one video when it could be downloading the rest in the meantime and come back to it later or in another session? I wanted to defer it and skip to the next video, not abort the whole batch, because I don't want to sit in front of the script all day and manually restart it every 20-40 minutes.

I hope you understand that this current behavior imposes heavily unfavorable circumstances.

This should be dealt with immediately, IMO. It means all my 1TB+ of archives are quite a bit less useful, because every time there were unavailable fragments the video was labeled as "100% complete". With the sheer number of videos, it would be extremely tedious to check each one individually. It would have helped if I had saved all the logs correctly, which I didn't, since I was simultaneously learning about youtube-dl and improving my scripts; only a few days ago did I get to that point.

Alternatively, or in addition, there really should be some way to verify the integrity of the archive and redownload the videos with missing fragments. My skills are limited, but I'm willing to help with this if I can.

This goes hand-in-hand with my previous idea about overhauling the metadata recorded in the archive log file: I would basically include the full fragment data on top of my previous suggestions (datetime, filename, etc.), so that the structure of the archive can be repaired if something in the resulting videos got edited. This right here is the biggest reason why that is quite a good idea.

However, I have many times seen that when I restart the session, it downloads some videos it previously couldn't. I haven't figured out whether that's because private videos were made public (5-year-old videos suddenly becoming public within days or a week is a really long shot) or because copyright restrictions were lifted; I have no idea. Either way, it made it look like the rest of the fragments were being downloaded later. We really need to be sure about this?

@altimumdelta (Author) commented Jul 11, 2019

I just noticed that stopping and restarting is supposed to continue, but I see some .part files being left over and the file seems to be completely skipped. Some kind of timing coincidence has written the file into the download log as "finished", and the program doesn't seem to check that the extension is still .part ... oh.
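A quick way to catch these leftovers is to scan the download directory for `.part` files (a local sketch, not youtube-dl functionality; the directory path is an assumption):

```shell
#!/bin/sh
# List downloads that were interrupted: any *.part file means the
# corresponding video never finished, even if the download log
# recorded it as done.
list_incomplete() {
  dir="${1:-.}"
  for f in "$dir"/*.part; do
    [ -e "$f" ] || continue   # glob matched nothing; skip
    printf 'incomplete: %s\n' "$f"
  done
}

# usage: list_incomplete /path/to/downloads
```

Any ID this prints can then be removed from the download log so the next run retries it.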

I was also suspicious about why ffmpeg was taking so long to remux the videos to mkv (the ones which aren't fragmented into separate audio and video). The debug line does not show the -acodec copy and -vcodec copy options, which means it's using default values. That's a critical issue right there: it's re-encoding the whole video, which is complete nonsense. Sorry, I'm fuming a bit now that I have to start from scratch, but I guess I can suck it up and treat it as "training" or "experience". I should have figured this out before and paid more attention to the messages.
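For reference, a lossless container change uses stream copy instead of re-encoding (a sketch assuming ffmpeg is on PATH; the function name and filenames are placeholders, not youtube-dl's own code):

```shell
#!/bin/sh
# Remux into mkv without touching the streams: -c copy tells ffmpeg
# to copy the audio and video bitstreams verbatim (no re-encoding),
# so the result keeps the original encodes as served by YouTube.
remux_to_mkv() {
  in="$1"
  out="${in%.*}.mkv"   # swap the extension for .mkv
  ffmpeg -nostdin -loglevel error -i "$in" -c copy "$out"
}

# usage: remux_to_mkv downloaded_video.mp4
```

Stream copy is near-instant compared to a re-encode, which is why the long ffmpeg runs were a giveaway that something else was happening.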

This is some heavily unwelcome and unreliable behavior across the board, and it's not compatible with the whole point of creating an archive. I'm basically having to stop everything until it's solved; I'll even take a look at the code myself.

I guess what I meant by "finalize" is the point where it writes the URL to the download logfile; that's where it treats the video as already downloaded. It does seem to check files, and I even noticed a clever trick to make it recode videos into a custom container: by deleting the download log, it will start downloading each video, find it already exists by filename, and just recode it without downloading. That's good behavior, but it should properly be an option in the config rather than such a hidden trick.

I think I spoke about the recode-video option name before; I didn't realize at the time that it wasn't remuxing, or I completely forgot and kept using the command. I have quite a lot of scripts, and sometimes I get lost and forget as I keep updating the overall default template they all use.

It should never, ever touch the video or audio; for a proper archive it has to stay original, as it came from YouTube. I know I can do another step with an external program, but I would really appreciate a proper remux option.

EDIT: Could this be due to the no-overwrites option, i.e. it doesn't overwrite the .part file? It never created the actual video file for that ID. To be clear, I'm not saying it merely cleaned up the .part; I'm saying it never finished the part and never redownloaded it. It just assumed it was done and moved on.

@altimumdelta (Author) commented Jul 11, 2019

I was able to tweak together a solution: it aborts on all errors, won't recode, will overwrite, and it's wrapped in a Windows scheduled task that starts every 5 minutes but doesn't spawn a new instance if one already exists. So after the batch craps out, it gets restarted. But when a particular video simply won't work for many hours, the solution still isn't practical: it keeps retrying, and skipping that individual video gets complicated, since it requires adding a filter and, more importantly, human intervention to write that filter, so I have to keep coming back to check whether it stalled. But okay, I'm just wishing for convenience; the time and effort spent on this is well worth it in the end anyway.
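The restart-on-failure loop described here can be sketched POSIX-style (a sketch; the function name, delay, and batch-file name are assumptions, standing in for the Windows scheduled task):

```shell
#!/bin/sh
# Re-run a command until it exits 0, sleeping between attempts.
run_until_success() {
  delay="${DELAY:-300}"   # default: 5 minutes between restarts
  until "$@"; do
    sleep "$delay"
  done
}

# Usage sketch (the batch file name is a placeholder):
# run_until_success youtube-dl --abort-on-error --batch-file urls.txt
```

This reproduces the behavior of the scheduled task, but it has the same weakness the comment describes: one persistently broken video keeps the loop stuck on it forever.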

@altimumdelta (Author) commented Jul 11, 2019

I might put up separate proper feature requests for these points (some already exist). I understand that archive mode isn't that popular; if people prefer saving storage space over broken video, that's their right. But I would hope the program behaves differently when specifically in archive mode, rather than it being a tack-on.

I apologize for steaming out a bit; I got over it and adjusted my scripts. Even though I have to check every 2 hours to add new filters, it does appear to be doing the job correctly for now. But I don't want to say it's solved: the whole point of such a program is to manage everything by itself, with a clear picture of what's going on, without major external tricks.

To be fair, I wouldn't need most of those right now. All I would need is a "defer download": if it's unavailable, it's an error; skip it, but don't abort and don't mark it as downloaded. Defer it, either by retrying at the end or leaving it for another session to pick up, and continue to the next video/audio.

There's no issue with having to restart the script; I don't mind doing that manually multiple times a day as long as it's a good 3-6 hour session. The issue is that when a video or fragments are unavailable, they're often unavailable for a really long time (a YouTube slow day), which can last a day or two or almost a week, so the script just loops on the same video forever.

So I have to keep adding --match-filter "id != 'URL_ID' & id != etc etc". But I'm glad it at least works as a workaround.
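That growing filter expression can be generated from a plain list of IDs instead of being edited by hand (a sketch; the helper name and IDs are placeholders):

```shell
#!/bin/sh
# Build a --match-filter expression that excludes every given video ID,
# e.g. build_skip_filter abc123 def456
#   -> id != 'abc123' & id != 'def456'
build_skip_filter() {
  filter=""
  for id in "$@"; do
    [ -n "$filter" ] && filter="$filter & "   # join clauses with &
    filter="${filter}id != '$id'"
  done
  printf '%s\n' "$filter"
}

# Usage sketch:
# youtube-dl --match-filter "$(build_skip_filter abc123 def456)" ...
```

Keeping the stalled IDs in a text file and feeding them to this helper would at least remove the manual editing step the comment complains about.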
