Is it possible to download web archives as WARC files instead of WACZ? #49

Closed
YousufSSyed opened this issue Aug 29, 2021 · 1 comment

YousufSSyed commented Aug 29, 2021

Either warc or warc.gz?

@ikreymer (Member)

Yes, from the individual archive view, on the left, you can download either WACZ or WARC:
[Screenshot: Screen Shot 2021-08-29 at 11.43.55 AM]

Note that a WACZ is just a zip file that contains the WARC data as well. We recommend WACZ because it includes additional metadata, such as a list of pages and a built-in index. That makes it easy to host large web archives online and make them accessible without downloading the entire file. With WARC, the entire file needs to be downloaded, as there is no built-in index (due to a limitation of the WARC standard).
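
Since a WACZ is a zip file, the WARC data can also be pulled out of an already-downloaded WACZ with any zip tool. Below is a minimal sketch using only the Python standard library. The function names `list_warc_members` and `extract_warcs` are made up for this illustration and are not part of any tool in this project, and the sketch assumes the WARC entries can be recognized by a `.warc` or `.warc.gz` suffix.

```python
import shutil
import zipfile
from pathlib import Path

# Assumption: WARC entries inside the WACZ end in one of these suffixes.
WARC_SUFFIXES = (".warc", ".warc.gz")


def list_warc_members(wacz_path):
    """Return the names of zip entries in the WACZ that look like WARC files."""
    with zipfile.ZipFile(wacz_path) as zf:
        return [name for name in zf.namelist() if name.endswith(WARC_SUFFIXES)]


def extract_warcs(wacz_path, out_dir):
    """Copy every WARC entry out of the WACZ into out_dir.

    Entries are written under their base file name only, so two entries
    with the same base name in different folders would overwrite each other.
    Returns the list of paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(wacz_path) as zf:
        for name in zf.namelist():
            if not name.endswith(WARC_SUFFIXES):
                continue
            target = out_dir / Path(name).name
            # Stream the entry to disk instead of loading it into memory.
            with zf.open(name) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

For example, `extract_warcs("my-archive.wacz", "warcs/")` (both paths are placeholders) would write the contained WARC files into `warcs/`. The page list and index stay behind in the WACZ, so the extracted files have the same no-index limitation as a direct WARC download.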
