The National Archives has a broad goal, stated in its Open Government Plan and elsewhere, of uploading the public domain digital assets in its catalog to Wikimedia Commons, making them available for use on Wikipedia and for other reuse. This tool is the script used to operate an approved bot account on Wikimedia Commons that performs automated uploads using media and data from NARA's online catalog.
The script is written in Python 2.7. It reads the media and data to be uploaded from the National Archives Catalog API and performs the uploads through the Wikimedia Commons API. Its workflow consists of:
- Reading JSON data from the catalog
- Parsing the necessary metadata fields from each record through a series of steps
- Populating a template with the fields to generate the wiki page text for the Wikimedia Commons upload
- Downloading the file from the catalog
- Using mwclient to upload the media file and accompanying wiki page text
- Repeating for each media file in the set, paginating as necessary, until complete
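The steps above can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the record keys (`title`, `date`, `url`, `filename`), the page template, and the helper names `build_page_text`, `iter_records`, and `upload_record` are all assumptions made for the example. The catalog request itself is left to a caller-supplied `fetch_page` function, since the exact API query is not described here.

```python
from string import Template

# Hypothetical page template built on the Commons {{Information}} template.
# The bot's real template and the catalog's real field names may differ.
PAGE_TEMPLATE = Template(
    "{{Information\n"
    "|description=$title\n"
    "|date=$date\n"
    "|source=$source\n"
    "}}\n"
)


def build_page_text(record):
    """Fill the template from one catalog record (a dict parsed from JSON)."""
    fields = {
        "title": record.get("title", ""),
        "date": record.get("date", ""),
        "source": record.get("url", ""),
    }
    return PAGE_TEMPLATE.substitute(fields)


def iter_records(fetch_page, rows=100):
    """Yield records page by page until an empty page is returned.

    fetch_page(offset, rows) is supplied by the caller and returns a list of
    records; in a real run it would wrap a request to the catalog API.
    """
    offset = 0
    while True:
        page = fetch_page(offset, rows)
        if not page:
            return
        for record in page:
            yield record
        offset += len(page)


def upload_record(site, record, file_obj):
    """Upload one media file and its page text through an mwclient Site."""
    site.upload(
        file_obj,
        filename=record["filename"],
        description=build_page_text(record),
    )
```

In a real run, `site` would be an `mwclient.Site("commons.wikimedia.org")` object on which `site.login(...)` has been called with the bot account's credentials, and `file_obj` would be the downloaded media file opened in binary mode. Keeping the network calls behind `fetch_page` and `site` lets the parsing and pagination logic be exercised without contacting either API.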
In addition to Python 2.7, the script requires the following two Python libraries:
- Requests (http://docs.python-requests.org/)
- mwclient (http://mwclient.readthedocs.io/)