Skip to content
Please note that GitHub no longer supports your web browser.

We recommend upgrading to the latest Google Chrome or Firefox.

Learn more
An HTTP-based warc-to-zip converter
Branch: master
Clone or download
Alard
Latest commit 2ca8159 Mar 8, 2013
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
hanzo
.gitignore Initial commit. Jun 6, 2012
Procfile Initial commit. Jun 6, 2012
README.md Update readme (2). Mar 8, 2013
app.py Add a no-conversion option. Mar 7, 2013
requirements.txt
stream_post.py POST features. Aug 27, 2012

README.md

Experimental WARC to ZIP converter.

This conversion service converts .warc.gz to .zip files. You can use it in several ways:

  1. Create a WARC-to-ZIP download link.

You specify the HTTP location of the WARC and the conversion parameters.

Usage:

http://warctozip.archive.org/<url to warc.gz>

Maybe the WARC file is part of a larger file:

http://warctozip.archive.org/<byte range>/<url to warc.gz>

(the server must support HTTP Range requests)

A ZIP filename can be specified if you don't like the default:

http://warctozip.archive.org/<zip filename>/<url to warc.gz>
http://warctozip.archive.org/<byte range>/<zip filename>/<url to warc.gz>

To link to WARC records within a larger file, without converting to ZIP:

http://warctozip.archive.org/<byte range>/<warc filename>/<url to warc.gz>

(the server must support HTTP Range requests)

Examples:

http://warctozip.archive.org/https://github.com/downloads/alard/warc-proxy/picplz-00454713-20120603-143400.warc.gz
http://warctozip.archive.org/example.zip/https://github.com/downloads/alard/warc-proxy/picplz-00454713-20120603-143400.warc.gz
http://warctozip.archive.org/30573088768-30573890406/http://archive.org/download/mobileme-hero-1335023445/mobileme-full-1335023445.tar
http://warctozip.archive.org/30573088768-30573890406/homepage.mac.com-jcarias.zip/http://archive.org/download/mobileme-hero-1335023445/mobileme-full-1335023445.tar
http://warctozip.archive.org/30573088768-30573890406/homepage.mac.com-jcarias.warc.gz/http://archive.org/download/mobileme-hero-1335023445/mobileme-full-1335023445.tar
  1. Send the data via an HTTP POST request.

Use Curl or another download tool to upload your WARC. The service responds with a ZIP version of the contents.

Usage:

curl -d @your.warc.gz http://warctozip.archive.org/ > your.zip
  1. Create a multipart file upload form.

Create an upload form that POSTs a WARC file to http://warctozip.archive.org/. You'll get a ZIP file back.

Usage:

<form method="post" enctype="multipart/form-data" action="http://warctozip.archive.org/">
  <input type="file" name="warc" />
  <input type="submit" value="Convert" />
</form>

You can specify the file name of the ZIP file in the URL:

http://warctozip.archive.org/<zip filename>
You can’t perform that action at this time.