convert urls to the format that the option --download-archive archiveFile.txt converts them to #32730

Closed · Iridium-Lo opened this issue Feb 23, 2024 · 5 comments

Iridium-Lo commented Feb 23, 2024

Checklist

  • I'm asking a question
  • I've looked through the README and FAQ for similar questions
  • I've searched the bugtracker for similar questions including closed ones

Question

Is there a way I can convert URLs I have into the format that the option --download-archive archiveFile.txt converts them to?

I deleted my archive file by accident but still have the URLs; I'd like to convert them back into the format they'd be in archiveFile.txt so I don't duplicate downloads.

dirkf (Contributor) commented Feb 23, 2024

The format is f'{cls.ie_key()} {video_id}', where cls is the IE class used by yt-dl to download the item and video_id is the id of the item from the info-json.

If you still have the info-json files for the archived items you can use a jq command to extract and format these values, or a Python (etc) script.
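For instance, a minimal sketch of the jq route, assuming the .info.json files written with --write-info-json are still in the current directory and that archive entries use the lowercased extractor key followed by the id (e.g. youtube dQw4w9WgXcQ):

# sketch only: rebuild archive lines from the standard extractor_key and id fields
jq -r '"\(.extractor_key | ascii_downcase) \(.id)"' *.info.json >> archiveFile.txt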

In general the only way to generate the archive entry is to process the URL with yt-dl, but that won't help with items that are no longer available (though the effect for such items is as if they were in the archive anyway).

Subject to that, I think that the only simple way to regenerate the archive is to re-download the items to a junk location. By using -f "worstvideo/worst" --test, the actual amount of downloading would be trivial, certainly compared with fetching a YouTube bloat web page.
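A sketch of that re-download route, assuming the surviving URLs are in a file urls.txt (hypothetical name) and the throwaway output goes to a junk directory:

# sketch only: --test keeps the transfer tiny while yt-dl still records each item in the archive
youtube-dl -a urls.txt -f 'worstvideo/worst' --test \
    --download-archive archiveFile.txt -o 'junk/%(title)s.%(ext)s'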

If you can reliably work out the archive index values for some site, then it could be easy to make a script to write a fake archive file.
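For plain YouTube watch links, for example, a sketch along these lines could fake the entries directly, assuming urls.txt (hypothetical name) holds one URL per line and that YouTube archive entries take the form "youtube <11-character id>":

# sketch only: extract the 11-character video id and prefix the extractor key
sed -nE 's#.*(youtu\.be/|[?&]v=)([A-Za-z0-9_-]{11}).*#youtube \2#p' urls.txt >> archiveFile.txt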

Related: #13687.

Iridium-Lo (Author) commented Feb 23, 2024

Many thanks @dirkf, I was trying to avoid re-downloading all the files.

I have a bash script which uses GNU parallel to download an array of URLs simultaneously with youtube-dl. I use it all the time.

You might think "that's what a playlist is for", but creating playlists is time-consuming: you have to select each video and add it to the playlist one by one.

What I do is:

  • open many tabs with the video URLs I want
  • 'bookmark all'
  • copy the text from the bookmarks
  • paste it into a file named <whatever>
  • run bash downloadSimultaneously.bash <whatever>, which writes the file's contents to an array
  • it makes a directory named <whatever> and downloads the videos simultaneously to that directory, with the archive option set (a rough sketch of such a script follows this list)
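A sketch of that kind of script, assuming GNU parallel and bash 4+; the ".d" directory suffix, the -j limit and the archive file name are my guesses, not the real downloadSimultaneously.bash:

#!/usr/bin/env bash
# sketch only: usage  bash downloadSimultaneously.bash <whatever>
# <whatever> is the text file of URLs; downloads go into <whatever>.d
# (the ".d" suffix avoids a clash with the file of the same name)
list=$1
mapfile -t urls < "$list"              # parallel is fed an array, one URL per element
mkdir -p "$list.d" && cd "$list.d" || exit 1
parallel -j 60 youtube-dl --download-archive archiveFile.txt ::: "${urls[@]}"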

Would you accept a PR for that?

dirkf (Contributor) commented Feb 24, 2024

Isn't this just:

<whatever xargs --max-procs=1 --max-args=1 --delimiter=' ' youtube-dl args...

If the URL list contains items whose generated filename happens to be the same, those downloads could interfere with each other. Ideally two yt-dl instances running at the same time should have different current directories, and the output templates should be relative to those directories.
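One way to get those separate current directories with plain xargs, assuming a urls.txt list (hypothetical name) and GNU mktemp:

# sketch only: each job gets its own temporary working directory, so two
# downloads that would produce the same filename cannot collide
<urls.txt xargs --max-procs=4 -I{} \
    sh -c 'd=$(mktemp -d dl.XXXXXX) && cd "$d" && youtube-dl "$1"' _ {}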

Iridium-Lo (Author) commented Feb 24, 2024

Having written this I changed my logic. My script has no issues with same-URL downloads.

  • 1 directory with a text file of download links
  • bash downloadSimultaneously.bash textFileName
  • it creates a directory, cds to it, yt-dl starts and makes the archive file there; it also writes textFileName's contents to an array (parallel needs an array)
  • parallel limits it to 60 instances
  • running 400 yt-dl instances, for example, would be too much (for my laptop at least)

Say I do args $(cat textFileName): does your command download all the URLs simultaneously, or one at a time?

Also, you'd have to create a directory and cd to it, which is a manual step; this script is just one command to run.

Also, why do you extract things like view count etc.? I wanted to add some sites but can't be bothered with that kind of stuff.

Iridium-Lo (Author) commented

solution given
