Analysis of British Library ESTC Data Collection

This is an algorithmic toolkit for R, designed for transparent quantitative analysis of the British Library English Short Title Catalogue (ESTC) data collection. The package is under active, open development; the tools, analysis, and documentation are preliminary and constantly updated. Contributions, bug reports, and feedback are welcome (but please don't ask whether we know who Livy is if he is temporarily on the discarded author list; serious data science takes time)!

Installing

# install and load devtools to gain access to the install_github() command
install.packages("devtools")
library("devtools")

# install the dependencies in order
install_github("ropensci/genderdata")
install_github("ropengov/bibliographica")
install_github("ropengov/estc")

library("genderdata")
library("bibliographica")
library("estc")
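
If one of the library() calls above fails, it can help to see which packages are actually present. The following is a minimal sketch using only base R; the variable names pkgs and installed are chosen here for illustration and are not part of the package.

```r
# Packages installed by the steps above, in dependency order
pkgs <- c("genderdata", "bibliographica", "estc")

# requireNamespace() returns TRUE or FALSE without attaching the package
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report anything that is still missing
if (all(installed)) {
  message("All packages are installed.")
} else {
  message("Missing packages: ", paste(pkgs[!installed], collapse = ", "))
}
```

Any package reported as missing can be reinstalled with the corresponding install_github() call.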

ESTC data overview

The following automatically generated files provide an overview of knowledge production between 1477 and 1800, based on ESTC metadata for almost half a million documents. This is a work in progress. The analyses may contain errors, but we provide the complete source code and results already at this preliminary stage to improve the transparency of our work.

The steps to reproduce these summaries from the raw data are fully described on the tutorial page. They range from raw data extraction through harmonizing the textual annotation fields and preprocessing the information to statistical analysis and visualization. Although this package focuses on the ESTC data, it relies on additional tools from the more generic bibliographica package and the many other R packages listed in the DESCRIPTION file. The ESTC raw data is confidential and available only under a separate agreement, so on this site we can publish only statistical summaries and our own analysis source code. The process is fully automated and can easily be repeated with different subsets of the data.
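
To give a feel for what such a pipeline involves, here is a minimal sketch in base R. It does not use the functions of estc or bibliographica, and it is not the procedure from the tutorial page: the toy records, the column names (publication_place, publication_time) and the harmonization rules are all invented for illustration.

```r
# Toy records standing in for raw catalogue fields (invented, not ESTC data)
raw <- data.frame(
  publication_place = c("London", "LONDON ", "Edinburgh", "london", "Dublin", "Edinburgh"),
  publication_time  = c("1701.", "[1705?]", "printed in the year 1650", "1799", "n.d.", "1655"),
  stringsAsFactors = FALSE
)

# Harmonize place names: trim whitespace and normalize capitalization
place <- trimws(tolower(raw$publication_place))
raw$place <- paste0(toupper(substring(place, 1, 1)), substring(place, 2))

# Extract the first four-digit number as the publication year (NA if none)
has_year <- grepl("[0-9]{4}", raw$publication_time)
year <- rep(NA_integer_, nrow(raw))
year[has_year] <- as.integer(
  regmatches(raw$publication_time, regexpr("[0-9]{4}", raw$publication_time))
)
raw$year <- year

# Preprocess: keep records with a usable year inside the 1477-1800 window
clean <- raw[!is.na(raw$year) & raw$year >= 1477 & raw$year <= 1800, ]

# Summarize: number of titles per place and decade
clean$decade <- 10 * (clean$year %/% 10)
counts <- as.data.frame(
  table(place = clean$place, decade = clean$decade),
  stringsAsFactors = FALSE
)
counts <- counts[counts$Freq > 0, ]
print(counts)
```

Only aggregated counts leave the pipeline in this sketch, which mirrors the constraint that the raw data stays confidential while statistical summaries can be published.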

Reproducible analysis

We have frozen the analysis for already published material:

Figures for Lahti, Ilomaki, Tolonen (2015). Liber Quarterly 25(2), pp. 87–116

Acknowledgements

Authors: Leo Lahti, Ville Vaara, Mikko Tolonen. Part of COMHIS.

You are welcome to:

submit suggestions and bug reports
send a pull request (we will acknowledge contributions)
join IRC at !ropengov@Freenode
contact or follow us

