GitHub - cgiffard/Re-Serve: Node utility for archiving and serving cached websites

Re-Serve

A really straightforward way to archive entire domains and serve them from cache. Relies on Node.js and node-simplecrawler, which is packaged with the repo as a git submodule.

Happily churns through gigabytes and gigabytes of data - I've used this for archiving some very large websites.

Please don't abuse it! The archiver prioritises speed of archiving over being nice to webservers (I'll add a runtime preference for this in future), so make sure you have the blessing of your webmaster. 😖

Usage

Please note that this code is still somewhat immature, and usage will change as it is cleaned up.

Archiving

./index.js -v domain1.com domain2.com domain3.com domain4.com/initialpath
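
The command above mixes bare domains with a domain plus starting path (domain4.com/initialpath). As a rough illustration of how such an argument could be split, here is a minimal sketch. `parseTarget` is a hypothetical helper written for this example, not a function taken from index.js, and the real parsing may differ.

```javascript
// Hypothetical helper (not from index.js): split a command-line target
// such as "domain4.com/initialpath" into a host and an initial path.
function parseTarget(arg) {
  const slash = arg.indexOf("/");
  if (slash === -1) {
    // No path given: assume crawling starts at the site root.
    return { host: arg, initialPath: "/" };
  }
  return { host: arg.slice(0, slash), initialPath: arg.slice(slash) };
}

// Example targets taken from the usage line above.
const targets = ["domain1.com", "domain4.com/initialpath"].map(parseTarget);
console.log(targets);
```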

Serving From Cache

By default, the server picks which domain to serve based on the HTTP Host header of each request:

./servecache.js

Or you can scope what it serves to a specific domain:

./servecache.js domain1.com
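
To make the two serving modes concrete, here is a minimal sketch of how a domain could be chosen per request. `resolveDomain` is a hypothetical helper for illustration only; it is not taken from servecache.js, and it assumes the Host header carries a plain hostname with an optional port (not an IPv6 literal).

```javascript
// Hypothetical helper (not from servecache.js): decide which archived
// domain a request should be served from.
function resolveDomain(hostHeader, scopedDomain) {
  // A domain passed on the command line overrides the Host header.
  if (scopedDomain) return scopedDomain;
  // Without a Host header there is nothing to go on.
  if (!hostHeader) return null;
  // Drop an optional ":port" suffix and normalise case.
  return hostHeader.split(":")[0].toLowerCase();
}

console.log(resolveDomain("Domain2.com:8080"));          // Host-header mode
console.log(resolveDomain("domain2.com", "domain1.com")); // scoped mode
```

In a Node HTTP server the first argument would typically come from `req.headers.host`.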

Repairing the cache

./repaircache.js

Notes

The archiver makes a 'cache' folder in the current working directory (CWD). Its contents are as close as possible to a complete directory-for-directory clone of each archived website.
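
A directory-for-directory clone suggests a mapping from request URLs to files under the cache folder. The sketch below shows one plausible mapping. Everything in it is an assumption made for illustration: `cachePathFor` is a hypothetical helper, the per-host subfolder and the `index.html` name for directory-style URLs are guesses, and the layout Re-Serve actually writes may differ.

```javascript
const path = require("path");

// Hypothetical mapping (assumed layout, not taken from Re-Serve):
// <cacheRoot>/<host>/<url path>
function cachePathFor(cacheRoot, host, urlPath) {
  // Ignore the query string for the purposes of this sketch.
  let clean = urlPath.split("?")[0];
  // Assumption: directory-style URLs are stored as index.html.
  if (clean.endsWith("/")) clean += "index.html";
  // Collapse "." and ".." segments so the result stays inside the cache folder.
  const safe = path.posix.normalize(clean).replace(/^(\.\.\/)+/, "");
  return path.join(cacheRoot, host, safe);
}

console.log(cachePathFor("cache", "domain1.com", "/about/"));
console.log(cachePathFor("cache", "domain1.com", "/img/logo.png?v=2"));
```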

Still a bit 'developmenty', but it should be extremely easy to debug.

Once I've published node-simplecrawler to npm, I'll remove the submodule from the directory and structure the repo to make it easier to use as a system-wide utility.

Todo:

Custom UA support
NPM
Tests (I'm one of those bastards that 'forgets' to do TDD from the start)
Cleanup

