Topic modelling of Trove Books

The data can be downloaded from the GLAM Workbench website.


Created by Adel Rahmani

The data

The data comes from the GLAM Workbench website (follow the link at the bottom of the page to download the data) and comprises 9,738 documents harvested and kindly made available by Tim Sherratt.

The analysis

The Trove_Digitised_Books.ipynb file is a Jupyter notebook documenting my initial exploration of the data. The code is written in Python 3.7. The notebook can also be viewed here.

Requirements

If you are using the Anaconda distribution, you can reproduce my virtual environment by using the provided environment.yml configuration file. This can be done by running

conda env create -f environment.yml

in a terminal.

Note: On macOS I had to use the following to install the CLD2 library:

export CC=clang; CFLAGS=-stdlib=libc++ pip install --ignore-installed pycld2
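
After installing, it may be worth confirming that the library can actually be imported from the active environment. The snippet below is a minimal sketch of such a smoke test and is not part of the notebook: the helper name `pycld2_available` and the sample sentence are made up for illustration, and it assumes only that `pycld2.detect` returns a tuple of `(is_reliable, bytes_found, details)`.

```python
import importlib.util


def pycld2_available() -> bool:
    """Return True if the pycld2 package can be found in the current environment.

    Hypothetical helper, introduced here only for illustration.
    """
    return importlib.util.find_spec("pycld2") is not None


if __name__ == "__main__":
    if pycld2_available():
        import pycld2

        # detect() returns (is_reliable, bytes_found, details); each entry of
        # details describes one candidate language.
        sample = "This is a short English sentence used as a smoke test."
        is_reliable, _, details = pycld2.detect(sample)
        print("reliable:", is_reliable)
        print("top candidate:", details[0])
    else:
        print("pycld2 is not installed in this environment")
```

If the script reports that pycld2 is missing, the install command above most likely did not run inside the activated conda environment.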

Credits

Adel Rahmani

The text is released under a Creative Commons Attribution 4.0 International License, and the code is released under the MIT license.