[x] I've verified that I'm running youtube-dl version 2019.08.02
[x] I've searched the bugtracker for similar feature requests including closed ones
Description
The Internet Archive uses youtube-dl rather extensively for YouTube archiving. Great piece of software; thank you for your efforts.
Being an archive, we have a specific objective: to record the complete interaction of acquiring and saving the video. Currently we do this in a simplistic way: we get the video URL from youtube-dl, then use that URL to download the video and record the complete HTTP interaction in the WARC (Web ARChive) format. Other than providing the URL, youtube-dl is not involved.
This works in most cases but is somewhat limiting; for example, each video source requires its own solution. A more useful scenario would let us easily record a WARC file regardless of the HTTP source selected. This might be accomplished either by hooks or by adding WARC output as a feature (or by your preferred solution). WARC output is currently supported by wget (via its --warc-file option) if you wish to experiment. It is a widely accepted format and is not a complex specification.
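
To give a rough idea of how small the format is, here is a minimal sketch in Python of serializing a single WARC/1.0 record using only the standard library. The function name `build_warc_record` and its parameters are hypothetical and are not part of youtube-dl. It assumes the raw HTTP bytes (status line, headers, and body) have already been captured by the downloader, which is the part a hook would need to expose.

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(record_type, target_uri, payload, content_type):
    """Serialize one WARC/1.0 record (hypothetical helper, not a youtube-dl API).

    Layout: version line, named header fields, a blank line,
    the content block, then two CRLFs closing the record.
    """
    headers = [
        ("WARC-Type", record_type),
        ("WARC-Record-ID", "<urn:uuid:%s>" % uuid.uuid4()),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        # Content-Length counts the bytes of the block only.
        ("Content-Length", str(len(payload))),
    ]
    head = "WARC/1.0\r\n" + "".join("%s: %s\r\n" % h for h in headers) + "\r\n"
    return head.encode("utf-8") + payload + b"\r\n\r\n"
```

A "response" record would be built by passing the raw HTTP response bytes as `payload` with the content type `application/http; msgtype=response`, and a matching "request" record would carry the bytes that were sent. Appending such records to one file gives a basic WARC; a real implementation would more likely rely on an existing WARC library than on hand-rolled serialization.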
More about the Archive
https://archive.org/
More info on WARC format:
https://en.wikipedia.org/wiki/Web_ARChive