History Lab API

History as Data Science

The History Lab focuses on digitizing historical documents and turning them into a format more amenable to the tools of modern data analysis. As part of this, the History Lab has compiled a database of more than 3 million declassified historical documents.

Traversing a database of this size can be tedious, though. histlabapi is a Python library that aims to solve this by wrapping the History Lab's API, making it easier for users to access data from the History Lab's database.

Installation and setup

Install the package with pip. It is only compatible with Python 3.9+ because of its use of the requests dependency and its reliance on sphinx to generate its documentation.

$ pip install histlabapi

Once installed, you can import the package as follows:

from histlabapi import histlabapi
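If the install or import fails, it may be worth confirming that your interpreter meets the Python 3.9+ requirement stated above. The snippet below is a minimal sketch of such a check; the helper name `meets_minimum_version` is purely illustrative and is not part of histlabapi.

```python
import sys

# Minimum version taken from the requirement stated in this README.
MINIMUM_VERSION = (3, 9)


def meets_minimum_version(version_info=None, minimum=MINIMUM_VERSION):
    """Return True if the given (or running) interpreter is at least `minimum`.

    `version_info` defaults to the running interpreter's sys.version_info.
    This helper is hypothetical and only illustrates the version check.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only the (major, minor) pair against the minimum.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum_version():
        print("Python version is compatible with histlabapi.")
    else:
        print("histlabapi requires Python 3.9 or later.")
```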

Usage

Before extracting documents left and right, it's important to get your bearings on how the History Lab stores and structures its documents. To that end, I've compiled a quick guide to the collections and fields that you can access through this API, available here.

Once that's settled, you can use this package's various functions to extract information in all kinds of ways:

An overview of all the collections currently available in the API
Listing all the entities of a certain type that appear across all collections
Searching and extracting documents by text, entity, date or document ID
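To give a rough sense of what searching by text, entity, date or document ID involves under the hood, here is a minimal sketch of how such filters might be assembled into a single API query. Everything in it is an assumption made for illustration: the base URL is a placeholder, and the function and parameter names (`build_query_url`, `text`, `entity`, `start_date`, `end_date`, `doc_id`) are hypothetical and are not histlabapi's actual functions. See the package documentation for the real function names and arguments.

```python
from urllib.parse import urlencode

# Placeholder endpoint for illustration only; NOT the real History Lab API URL.
BASE_URL = "https://api.example.invalid/documents"


def build_query_url(text=None, entity=None, start_date=None,
                    end_date=None, doc_id=None, base_url=BASE_URL):
    """Assemble a query URL from optional search filters (hypothetical helper).

    Only the filters that are actually supplied end up in the query string.
    """
    candidates = {
        "text": text,
        "entity": entity,
        "start_date": start_date,
        "end_date": end_date,
        "doc_id": doc_id,
    }
    # Drop filters that were not provided.
    params = {key: value for key, value in candidates.items() if value is not None}
    if not params:
        raise ValueError("At least one search filter is required.")
    # urlencode percent-encodes values and joins them with '&'.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query_url(text="cold war", start_date="1962-10-01"))
```

A wrapper library like this one saves users from building and encoding such queries by hand, which is why the options above are exposed as plain Python function calls.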

Documentation

Full documentation is available on Read the Docs.

Support

Feel free to contact me at dg3279@columbia.edu if you have any questions and/or want to contribute!

License

histlabapi was created by Derrick Gozal. It is licensed under the terms of the MIT license.

Credits

histlabapi was created with cookiecutter and the py-pkgs-cookiecutter template. Many thanks also to Professor Raymond Hicks and the rest of the History Lab team for all their support in building this package.
