A custom Django model field that automatically archives a URL
Install this library from the Python Package Index.
$ pip install django-urlarchivefield
Add it to one of your models.
    from django.db import models
    from urlarchivefield.fields import URLArchiveField

    class MyModel(models.Model):
        archive = URLArchiveField(upload_to="my_archive")
Saving a URL's HTML content to your storage backend becomes this easy.
    >>> obj = MyModel.objects.create(archive="http://www.latimes.com")
    >>> # You can do it this way too
    >>> obj = MyModel()
    >>> obj.archive = "http://www.latimes.com"
    >>> obj.save()
The field attribute now has all the typical qualities of a Django FileField with a few additions.
    # Return the URL that was originally submitted for archival
    >>> obj.archive.archive_url
    # Return the timestamp when the archive was created
    >>> obj.archive.archive_timestamp
    # Return the HTML from the archive
    >>> obj.archive.archive_html
By default, the data are compressed with gzip before being saved. If you'd rather store the raw HTML, set the compress keyword argument to False on the field.
    class MyModel(models.Model):
        archive = URLArchiveField(upload_to="my_archive", compress=False)
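To give a rough sense of what gzip compression does to an HTML payload, here is a minimal sketch that uses only Python's standard gzip module. It is not this library's internal code. The helper names compress_html and decompress_html and the sample markup are hypothetical, chosen only for illustration.

```python
import gzip


def compress_html(html: str) -> bytes:
    """Encode the HTML as UTF-8 and gzip it (hypothetical helper)."""
    return gzip.compress(html.encode("utf-8"))


def decompress_html(data: bytes) -> str:
    """Reverse compress_html: gunzip the bytes and decode as UTF-8."""
    return gzip.decompress(data).decode("utf-8")


# Made-up sample page with a lot of repeated markup, as real pages often have
html = "<html><body>" + "<p>Hello, archive!</p>" * 100 + "</body></html>"

packed = compress_html(html)

# The round trip returns exactly the original HTML
assert decompress_html(packed) == html

# Compare the raw and compressed sizes in bytes
print(len(html.encode("utf-8")), len(packed))
```

Repetitive markup tends to compress well, which is one reason compressing by default can make sense. Storing the raw HTML with compress=False may be preferable if other tools need to read the saved files directly.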
This is a joint project of PastPages.org, The Reynolds Journalism Institute and the University of Missouri.