A custom Django model field that automatically archives a URL
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Type Name Latest commit message Commit time
Failed to load latest commit information.



A custom Django model field that automatically archives a URL

Build Status PyPI version Coverage Status

Getting started

Install this library from the Python Package Index.

$ pip install django-urlarchivefield

Add it to one of your models.

from django.db import models
from urlarchivefield.fields import URLArchiveField

class MyModel(models.Model):
    archive = URLArchiveField(upload_to="my_archive")

Saving an URL's HTML content to your storage backend becomes this easy.

>>> obj = MyModel.objects.create(archive="http://www.latimes.com")
>>> # You can do it this way too
>>> obj = MyModel()
>>> obj.archive = "http://www.latimes.com"
>>> obj.save()

The field attribute now has all the typical qualities of a Django FileField with a few additions.

# Return the URL that was originally submitted for archival
>>> obj.archive.archive_url
# Return the timestamp when the archive was created
>>> obj.archive.archive_timestamp
# Return the HTML from the archive
>>> obj.archive.archive_html

By default, the data are compressed before being saved by gzip. If you'd rather store the raw HTML set the compress keyword argument on the field.

class MyModel(models.Model):
    archive = URLArchiveField(upload_to="my_archive", compress=False)


This is a joint project of PastPages.org, The Reynolds Journalism Institute and the University of Missouri.