see if a URL is available in a web archive somewhere on the web

edsu/webarchives

webarchives

webarchives is a Python module for determining whether a given URL is available from a known Web archiving project. It is handy when you have a URL that no longer resolves and you would like to see the content it once served. Web archiving projects are run by national libraries, archives, and non-profits that are part of the International Internet Preservation Consortium.

webarchives grew out of work done by the Memento Project on the Memento Proxy, which provided the seed for the scraping backend modules used by webarchives.

Usage

The webarchives module provides a function, lookup, to which you pass the URL that you want to look up in the Web archives. lookup returns a list of (time, url) tuples. Each tuple records when the requested URL was archived and where the archived representation can be retrieved.

import webarchives

# Print the list of (time, url) tuples found for this URL
print(webarchives.lookup("http://www.geocities.com/homestead/homedir.html"))
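
Since lookup returns a list of (time, url) tuples, a common next step is picking one capture out of that list. The following is a minimal sketch, not part of webarchives itself: the helper name latest_capture is hypothetical, the sample data is made up, and it assumes the time values are mutually comparable (datetime objects are used here for illustration; the actual type returned by lookup may differ).

```python
from datetime import datetime


def latest_capture(captures):
    """Return the (time, url) tuple with the most recent time, or None if empty.

    Hypothetical helper: assumes the first element of each tuple is comparable.
    """
    if not captures:
        return None
    return max(captures, key=lambda capture: capture[0])


# Made-up sample data standing in for the result of webarchives.lookup(...)
sample = [
    (datetime(2003, 5, 1), "http://archive.example/2003/homedir.html"),
    (datetime(2009, 10, 26), "http://archive.example/2009/homedir.html"),
]

print(latest_capture(sample))
```

With real data you would pass the list returned by webarchives.lookup(url) in place of sample, after checking what type the time values actually have.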
