--match-filter Privated, Deleted or Premium videos? #689

Closed
3 tasks done
barkoder opened this issue Aug 12, 2021 · 6 comments
Labels: question

Comments

@barkoder

Checklist

  • I'm asking a question
  • I've looked through the README and FAQ for similar questions
  • I've searched the bugtracker for similar questions including closed ones

Question

Using yt-dlp version 2021.08.10 (zip).

I have a list of YouTube URLs. I only need the IDs of the videos that are Private, Deleted, or Premium.

I know that in order to only get unlisted URLs from the list I could use

yt-dlp --match-filter "availability = 'unlisted'" --batch-file list_of_youtube_links.txt --get-id

What should the --match-filter be to only get the ids of

  1. Private,
  2. Deleted,
  3. Premium videos?

I'm especially looking for a way to only --get-id of URLs that give this error:

ERROR: Video unavailable. This video is no longer available because the YouTube account associated with this video has been terminated.

Thanks!

@barkoder added the question label Aug 12, 2021
@pukkandan
Member

There is no single-step way to do this since we cannot extract any metadata from unavailable videos.

You will need to first run

yt-dlp --download-archive archive -s --force-write-archive -a batchfile

Now all videos that can be downloaded will be recorded in the archive. Then you can compare it against your batchfile to get the videos that aren't downloadable.
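The comparison step can be sketched in Python. This is only an illustration under stated assumptions: every URL in the batch file is a YouTube link with an 11-character video ID, and each archive line has the form "youtube <id>". The function names are hypothetical and not part of yt-dlp.

```python
import re

# Assumption: every URL is a YouTube link carrying an 11-character video ID.
# Other sites would need their own pattern.
YOUTUBE_ID = re.compile(r"(?:v=|youtu\.be/|/shorts/|/embed/)([0-9A-Za-z_-]{11})")


def batch_ids(batch_lines):
    """Return the video IDs found in the batch file lines, in order."""
    ids = []
    for line in batch_lines:
        line = line.strip()
        # Skip blank lines and comment lines (starting with '#', ';' or ']').
        if not line or line[0] in "#;]":
            continue
        match = YOUTUBE_ID.search(line)
        if match:
            ids.append(match.group(1))
    return ids


def archived_ids(archive_lines, extractor="youtube"):
    """Return the IDs recorded in the archive for one extractor.

    Assumption: each archive line looks like "<extractor> <id>".
    """
    ids = set()
    for line in archive_lines:
        parts = line.split()
        if len(parts) == 2 and parts[0] == extractor:
            ids.add(parts[1])
    return ids


def unavailable_ids(batch_lines, archive_lines):
    """Return batch IDs that never made it into the archive."""
    recorded = archived_ids(archive_lines)
    return [vid for vid in batch_ids(batch_lines) if vid not in recorded]
```

To use it on the files from the command above, open batchfile and archive and pass the two file objects to unavailable_ids. Note that the result mixes every kind of failure (private, deleted, premium, region-locked), since the archive only records what succeeded.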

@barkoder
Author

@pukkandan commented on Aug 14, 2021, 6:07 PM UTC:

You will need to first run yt-dlp --download-archive archive -s --force-write-archive -a batchfile. Now all videos that can be downloaded will be recorded in the archive. Then you can compare it against your batchfile to get the videos that aren't downloadable

It's a little tedious, but with sort | uniq -u that works! Thanks!

There is no single-step way to do this since we cannot extract any metadata from unavailable videos.

That is unfortunate, since I have ~10-20K videos saved and am just trying to ascertain which of them have been removed from public YouTube, so it would be great if there were also an --archive-errors errored-ids.txt that saves errored IDs in addition to good ones.

@Zirro
Contributor

Zirro commented Aug 16, 2021

...so it would be great if there were also an --archive-errors errored-ids.txt that saves errored IDs in addition to good ones.

I have had a feature like this in mind as well, but for a different reason. I regularly re-crawl large playlists where most of the videos are already present in my download archive. yt-dlp quickly filters these out by pre-checking the archive, but that leaves mostly deleted videos which yt-dlp attempts to extract every time. With enough such videos, this becomes a significant bottleneck.

With some types of errors, we can be pretty certain that a video won't become available again. By keeping track of these in a separate "error archive", yt-dlp can save the time it takes to rediscover that the same videos have been deleted every time. Determining which errors should be considered permanent and thus suitable for the archive vs user-specific (age-gated) or temporary (private videos which could potentially be made public again) might require dividing them into categories as described in #457, though.
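A rough sketch of what such an "error archive" could look like. Everything here is hypothetical: the category names, the substring rules, and the helper functions are illustrative assumptions, not yt-dlp features and not the scheme described in #457.

```python
# Hypothetical categories, following the distinction drawn above.
PERMANENT = "permanent"          # e.g. uploader account terminated
TEMPORARY = "temporary"          # e.g. private video that may become public
USER_SPECIFIC = "user-specific"  # e.g. age-gated content
UNKNOWN = "unknown"

# Assumption: categories are guessed from substrings of the error message.
# Real messages vary, so this list is illustrative rather than complete.
RULES = [
    ("account associated with this video has been terminated", PERMANENT),
    ("private video", TEMPORARY),
    ("confirm your age", USER_SPECIFIC),
]


def classify(message):
    """Map an error message to one of the hypothetical categories."""
    lowered = message.lower()
    for needle, category in RULES:
        if needle in lowered:
            return category
    return UNKNOWN


def record_failure(error_archive, video_id, message):
    """Add the ID to the error archive only if the failure looks permanent."""
    if classify(message) == PERMANENT:
        error_archive.add(video_id)
        return True
    return False


def skip_known_failures(video_ids, error_archive):
    """Drop IDs already known to be permanently gone before extraction."""
    return [vid for vid in video_ids if vid not in error_archive]
```

Recording only the permanent category keeps private or age-gated videos eligible for a retry on the next crawl, which is the trade-off the comment above describes.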

@pukkandan
Member

I've also wanted this, but can't figure out how to implement it 😅

The exceptions are thrown from all over the place, and by the time they are caught, it is no longer known which URL/video_id threw the error. I can probably make a dirty hack by saving the last processed URL in a class variable, but this is not good practice and definitely won't be easy to maintain in the future.

@pukkandan
Member

From 1151c40, the error message will show the extractor and the video ID. I know it isn't exactly what you requested, but it can help you find errored videos much more easily than before:

$ yt-dlp yZIXLfi8CZQ qEJwOuvDf7I -q
ERROR: [youtube] yZIXLfi8CZQ: Private video. Sign in if you've been granted access to this video
ERROR: [youtube] qEJwOuvDf7I: This live stream recording is not available.

Note that for extractors other than YouTube, the ID shown may not be the actual ID (the one used in the archive). This happens because we may not know the actual ID when the extraction fails.
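With that format, the failing IDs can be collected from the captured error output. A minimal sketch, assuming each error line looks like "ERROR: [extractor] id: message" as in the two lines above; the function name is hypothetical, and lines in any other shape are silently skipped.

```python
import re

# Assumption: error lines follow "ERROR: [<extractor>] <id>: <message>".
ERROR_LINE = re.compile(
    r"^ERROR: \[(?P<extractor>[^\]]+)\] (?P<id>[^:\s]+): (?P<message>.*)$"
)


def parse_errors(stderr_text):
    """Return (extractor, id, message) tuples for each matching error line."""
    results = []
    for line in stderr_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            results.append(
                (match["extractor"], match["id"], match["message"])
            )
    return results
```

The text to parse could come from redirecting the error stream to a file, for example by appending 2> errors.txt to the yt-dlp command. As noted above, for extractors other than YouTube the parsed ID may differ from the one stored in the archive.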


By the way, please close this issue if your original question has been fully answered.

it would be great if there were also an --archive-errors errored-ids.txt that saves errored IDs in addition to good ones.

If you still want this, please open a feature request.

@pukkandan
Member

Closed due to inactivity
