Chrome extension to "Create WARC files from any webpage"

WARCreate

"Create WARC files from any webpage"


WARCreate is a Google Chrome extension that aims to "Create WARC files from any webpage".

WARC generation has normally been limited to Internet Archive's Heritrix archival crawler. Providing another means of generating these files from webpages opens the door to

  • Preserving content not accessible to crawlers (e.g., deep web contents)
  • Avoiding the complication and overhead an end user faces when setting up a Heritrix instance
  • Allowing a webpage to be interacted with (e.g., unrolling Facebook comments) prior to preservation, ensuring that content that might not be initially present in a page is available to be captured.

...among many other use cases.
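
For readers unfamiliar with the format, the sketch below shows roughly what a single WARC `response` record looks like when assembled in JavaScript. It is an illustration only and is not taken from WARCreate's source: the function name `buildWarcRecord`, the example URI, the date, and the record ID are all hypothetical. It assumes the WARC/1.0 record layout (a version line, named header fields, a blank line, the content block, and two trailing CRLFs) and a runtime that provides `TextEncoder`.

```javascript
// Hypothetical helper for illustration; not part of WARCreate's code base.
// Assembles one WARC/1.0 record as a string.
function buildWarcRecord({ type, targetUri, contentType, body, recordId, date }) {
  const CRLF = '\r\n';
  // Content-Length counts bytes, not characters, so encode the block as UTF-8 first.
  const contentLength = new TextEncoder().encode(body).length;
  const headers = [
    'WARC/1.0',
    `WARC-Type: ${type}`,
    `WARC-Target-URI: ${targetUri}`,
    `WARC-Date: ${date}`,
    `WARC-Record-ID: <${recordId}>`,
    `Content-Type: ${contentType}`,
    `Content-Length: ${contentLength}`
  ];
  // Header block, blank line, content block, then two CRLFs to close the record.
  return headers.join(CRLF) + CRLF + CRLF + body + CRLF + CRLF;
}

// Example input (all values are made up): a captured HTTP response as the block.
const body = 'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<p>caf\u00e9</p>';
const record = buildWarcRecord({
  type: 'response',
  targetUri: 'http://example.com/',
  contentType: 'application/http; msgtype=response',
  body: body,
  recordId: 'urn:uuid:00000000-0000-4000-8000-000000000000',
  date: '2019-02-07T00:00:00Z'
});
console.log(record);
```

A complete WARC file usually holds several such records (for example a `warcinfo` record followed by request and response records for each resource), which this sketch does not attempt to cover.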

WARCreate is currently in active development, though it has gone through various release and retraction periods due to changes in the Google Chrome extension API and the rules controlling extension distribution.

The original idea and prototype were published in the Proceedings of the Joint Conference on Digital Libraries 2012 (JCDL '12).

Install

The latest stable binary can be downloaded from the Chrome Web Store.

Sample Usage

(TODO)

Contact

WARCreate is a project of the Web Science and Digital Libraries (WS-DL) research group at Old Dominion University (ODU), created by Mat Kelly.

For support, e-mail warcreate@matkelly.com or tweet to us at @machawk1 and/or @WebSciDL.

License

MIT
