Method to allow writing of WARC file format as output. #21983

Open
openAccess opened this issue Aug 2, 2019 · 0 comments

Checklist

  • [x] I'm reporting a feature request
  • [x] I've verified that I'm running youtube-dl version 2019.08.02
  • [x] I've searched the bugtracker for similar feature requests including closed ones

Description

The Internet Archive uses youtube-dl rather extensively for YouTube archiving. It is a great piece of software, and we thank you for your efforts.

As an archive, we have a specific objective: to record the complete interaction of acquiring and saving the video. Currently we do this in a simplistic way: we get the YouTube URL from youtube-dl, then use that URL to download the video and record the complete HTTP interaction in the WARC (Web ARChive) format. Other than providing the URL, youtube-dl is not involved.

This works in most cases but is somewhat limiting; for example, each video source requires its own solution. A more useful scenario would let us easily record a WARC (Web ARChive) file regardless of the HTTP source selected. This might be accomplished either by hooks or by adding WARC output as a feature (or by whatever solution you prefer). WARC output is currently supported by Wget (via its --warc-file option) if you wish to experiment. It is a widely accepted format and is not a complex specification.
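
To give a feel for how small the format is, here is a rough sketch of serializing a single WARC/1.0 "response" record around raw HTTP bytes that have already been captured. It is not part of youtube-dl and not how the Archive's tooling works; the function name `build_warc_record` and its parameters are made up for illustration, and a real writer would also emit request, warcinfo and digest information.

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(target_uri, http_payload, record_type="response"):
    """Serialize one WARC/1.0 record wrapping raw HTTP bytes.

    Hypothetical helper for illustration only: `http_payload` is assumed to be
    the complete HTTP message (status line, headers, blank line, body) exactly
    as it was seen on the wire.
    """
    msgtype = "response" if record_type == "response" else "request"
    headers = [
        ("WARC-Type", record_type),
        ("WARC-Record-ID", "<urn:uuid:%s>" % uuid.uuid4()),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", "application/http; msgtype=%s" % msgtype),
        # Content-Length counts only the record block (the HTTP message).
        ("Content-Length", str(len(http_payload))),
    ]
    # Version line, named fields, then a blank line ends the WARC header.
    head = "WARC/1.0\r\n" + "".join("%s: %s\r\n" % h for h in headers) + "\r\n"
    # Every record is terminated by two CRLF pairs after the block.
    return head.encode("utf-8") + http_payload + b"\r\n\r\n"
```

A downloader hook that had access to the raw response bytes could append the returned record to an open `.warc` file; how such a hook would be exposed is left open here.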

More about the Archive:
https://archive.org/

More info on WARC format:
https://en.wikipedia.org/wiki/Web_ARChive
