simGeorge: background and license

This repository hosts code/settings for creating a GPT-2 inflected language model that writes catalogue descriptions in the "voice" of Mary Dorothy George.

The code and settings can be run via Max Woolf's Colaboratory Notebook (see 'How To Make Custom AI-Generated Text With GPT-2'), or with a local install of the Python package gpt-2-simple (though the latter has not been tested).
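For the local route, a minimal sketch with gpt-2-simple might look like the following. It is illustrative only and has not been run against this repository: the model size, run name and step count are assumptions rather than the settings hosted here, and the dataset is assumed to sit in the working directory under the file name given in this README.

```python
import os

# Illustrative values only; these are NOT the repository's actual settings.
DATASET = "CurV-corpus-27Jan2019.txt"  # file name as given in this README
MODEL_NAME = "124M"                    # assumption: smallest GPT-2 model
RUN_NAME = "simgeorge"                 # hypothetical run name
STEPS = 1000                           # assumption: illustrative step count


def finetune_and_sample(dataset=DATASET, model_name=MODEL_NAME,
                        run_name=RUN_NAME, steps=STEPS):
    """Fine-tune GPT-2 on the corpus and return one generated description."""
    # Fail early, before any heavy imports or downloads, if the corpus is missing.
    if not os.path.isfile(dataset):
        raise FileNotFoundError(f"dataset not found: {dataset}")

    # Imported lazily so this file loads even without gpt-2-simple installed.
    import gpt_2_simple as gpt2

    # Fetch the base model into ./models/<model_name> on first use.
    if not os.path.isdir(os.path.join("models", model_name)):
        gpt2.download_gpt2(model_name=model_name)

    sess = gpt2.start_tf_sess()

    # Fine-tune on the corpus; checkpoints go to ./checkpoint/<run_name>.
    gpt2.finetune(sess, dataset=dataset, model_name=model_name,
                  steps=steps, run_name=run_name)

    # Generate a single sample from the fine-tuned checkpoint.
    return gpt2.generate(sess, run_name=run_name, return_as_list=True)[0]
```

Calling `finetune_and_sample()` from a directory containing the corpus file would download the base model, fine-tune it, and return one generated text. Fine-tuning is slow without a GPU, which is one reason the Colaboratory Notebook is the tested route.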

The dataset used is CurV-corpus-27Jan2019.txt at

Unless otherwise stated, these materials are licensed under the GNU General Public License v3.0.

This work is based on data created during the project 'Curatorial Voice: legacy descriptions of art objects and their contemporary uses', and is associated with the project 'Legacies of Catalogue Descriptions and Curatorial Voice: Opportunities for Digital Scholarship'.

Two datasets and two papers have emerged from this work:

  • James Baker and Andrew Salway, ‘Curatorial labour, voice, and legacy: Mary Dorothy George and the Catalogue of Political and Personal Satires, 1930-1954’, Historical Research (forthcoming 2020)
  • Andrew Salway and James Baker, ‘Investigating Curatorial Voice with Corpus Linguistic Techniques: the case of Dorothy George and applications in museological practice’, Museum & Society (2020).
  • Baker, James, & Salway, Andrew. (2019). Corpus Linguistic Analysis of the BMSatire Descriptions corpus [Data set]. Zenodo.
  • Baker, James, & Salway, Andrew. (2019). Creation of the BMSatire Descriptions corpus (Version v1.0). Zenodo.

All data are derived from text written by M. Dorothy George and published between 1935 and 1954 as volumes 5 to 11 of the Catalogue of Political and Personal Satires Preserved in the Department of Prints and Drawings in the British Museum. This text is published in lightly edited form by the British Museum via ResearchSpace as linked open data at The data, text and images available via this service are published under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license (Research Space, 2016; accessed 10 September 2018).


Making a GPT-2 inflected model that writes descriptions a bit like Mary Dorothy George





