Add support for taking multiple snapshots of websites over time #179
This is by far the most requested feature.
People want an easy way to take multiple snapshots of websites over time.
This will be easier to do once we've added pywb support since we'll be able to use timestamped de-duped WARCs to save each snapshot: #130
For people finding this issue via Google / incoming links, if you want a hacky solution to take a second snapshot of a site, you can add the link with a new hash and it will be treated as a new page and a new snapshot will be taken:
echo https://example.com/some/page.html#archivedate=2019-03-18 | ./archive
# then to re-snapshot it on another day...
echo https://example.com/some/page.html#archivedate=2019-03-22 | ./archive
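The workaround above can be scripted so the date does not have to be typed by hand. This is only a minimal sketch: `stamp_url` is a hypothetical helper, not part of the project, and it assumes that `./archive` reads URLs from stdin as in the commands above and that any fragment already on the URL can be discarded.

```shell
#!/bin/sh
# Hypothetical helper (not part of the project): append a dated fragment
# so the archiver treats the URL as a new page and takes a fresh snapshot.
stamp_url() {
    url=$1
    # Use the date passed as the second argument, or today's date by default.
    day=${2:-$(date +%Y-%m-%d)}
    # Assumption: an existing fragment is safe to drop before adding ours.
    printf '%s#archivedate=%s\n' "${url%%#*}" "$day"
}

# Usage, assuming ./archive reads URLs from stdin:
#   stamp_url https://example.com/some/page.html | ./archive
```

Run once per day (for example from cron), each call yields a distinct URL and therefore a separate snapshot. Sites that rely on the fragment for routing would need a different approach, since this sketch replaces it.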
Looking forward to this feature. Thanks for the hacky workaround as well. I have a few pages I'd like to keep monitoring for new content, but I was worried about my current backup being overwritten by a 404 page if the content went down.
I just updated the README to make the current behavior clearer as well: