Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
A tool to fix dead URLs by replacing them with pointers to archive.org
Common Lisp
branch: master

Fetching latest commit…

Cannot retrieve the latest commit at this time

Failed to load latest commit information.
lib
obso
src
.hgignore
README
heroku-setup.lisp
load.lisp
reset-cache
waybacker.asd

README

A tool for updating stale links on web pages.

Theory:

Web links often get stale, particularly on old blog posts.  This tool will take a web page, find dead links, and find substitutes for them using the Wayback Machine facility on archive.org.

The Wayback Machine (surprisingly) lacks an API, so this relies on screen-scraping, which is likely to break in the future.

Caveat: the archive picked is the latest available, which, given the way websites degenerate before they die, is not necessarily the best substitute.  So you might want to go pick one by hand.

Usage:
1) Start a Lisp (tested in Clozure)
2) Load Quicklisp
3) Load the wayback.lisp file
4a) (process-page "http://xenia.media.mit.edu/~mt") 

Output:
A lot of verbosity.
The value returned is a list of bad urls and their backups on archive.org:
(("http://el.www.media.mit.edu/groups/el/projects/fishtank/"
  "http://web.archive.org/web/20100618022707/http://el.www.media.mit.edu/groups/el/projects/fishtank/")
 ("http://mt.www.media.mit.edu/people/mt/papers/chi94/chi94.html"
  "http://web.archive.org/web/20060901124153/http://mt.www.media.mit.edu/people/mt/papers/chi94/chi94.html")
 ("http://www.ddj.com/documents/s=889/ddj0001l/0001l.htm"
  "http://web.archive.org/web/20070529043833/http://www.ddj.com/documents/s=889/ddj0001l/0001l.htm"))

OR,
4b) (transform-url "http://xenia.media.mit.edu/~mt") 
Will output the transformed contents of the page on the console. 


Todo:
- a web UI, duh
- exclude self URLs
- capture and report errors
- handle "bad" URLs (eg those that contain a ?)
- blogger interface that actually rewrites the entry,
  and similarly for other services


Blog repair setup:
load .asd
(asdf:operate 'asdf:load-op :waybacker)
-- will start on port 8080
Use localtunnel to open up the port
> localtunnel 8080
edit *callback-uri* to reflect the assigned host
go to [host]/obtain
Something went wrong with that request. Please try again.