Simple processing scripts for Estonian National Bibliography
corpus
data
.here
9.1_metadata_erb_pub.Rmd
9.1_metadata_erb_pub.pdf
9.3_corpus_erb_pub.Rmd
9.3_corpus_erb_pub.pdf
CC-BY-4.0
LICENSE.md
README.md

Tidy Estonian National Bibliography (Eesti Rahvusbibliograafia, ERB) and Texts

Compiled by Peeter Tinits, 2018 (CC-BY-4.0).

This repository presents the code used to process the Estonian National Bibliography entries into a simple bibliographic overview and to link these entries with texts collected from DIGAR.

This is part of a greater effort to utilize the Estonian National Bibliography and texts for digital humanities studies, and it documents the state at one point in time (Sep 2018). The code aims to ease access to the Bibliography data by transforming it into a tidy data format (Wickham, 2013), with useful information gathered and harmonized by publication.

Note that the scripts are not thoroughly cleaned or documented, as newer, more polished and expanded versions are anticipated very soon. Ask me for up-to-date results.

Use

Preparation

Data.digar.ee presents two sets of data files. Download these and place them in appropriate directories.

  • processed_txt_tsv.zip - data files of the National Bibliography in different formats. Place them in a folder called data/.
  • processed_pdf_tsv.zip - the processed corpus files from DIGAR. Place them in a folder called corpus/.

Alternatively, edit the file paths in the .Rmd files to match wherever you have placed these files.
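
The unpacking step can also be scripted. The following is a minimal sketch, not part of the repository: it assumes both archives have already been downloaded by hand into one directory, and the function name `extract_archives` is hypothetical. The internal layout of the archives is not described here, so if they already contain a top-level folder, the files will end up one level deeper than the .Rmd files expect and the paths will need adjusting.

```python
import zipfile
from pathlib import Path

# Archive name -> target folder, following the list above.
ARCHIVES = {
    "processed_txt_tsv.zip": "data",
    "processed_pdf_tsv.zip": "corpus",
}


def extract_archives(download_dir=".", repo_dir="."):
    """Unpack each downloaded archive into its target folder under repo_dir.

    Returns a dict mapping the target folder name to the list of
    archive members that were extracted into it.
    """
    extracted = {}
    for archive_name, target in ARCHIVES.items():
        archive_path = Path(download_dir) / archive_name
        target_dir = Path(repo_dir) / target
        # Create data/ or corpus/ if it does not exist yet.
        target_dir.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(target_dir)
            extracted[target] = zf.namelist()
    return extracted
```

Calling `extract_archives()` from the repository root, with both .zip files in the same directory, should leave the contents under data/ and corpus/.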

Running

Knit the .Rmd files to produce the final documents; edit and reuse them as needed.
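
Knitting is usually done from RStudio, but it can also be run without an interactive session. The sketch below is one possible way to do that, assuming R with the rmarkdown package is installed and `Rscript` is on the PATH; the helper names `knit_command` and `knit_all` are hypothetical. From within R itself, the equivalent call is `rmarkdown::render()` on each file.

```python
import subprocess

# The two R Markdown files in this repository.
RMD_FILES = ["9.1_metadata_erb_pub.Rmd", "9.3_corpus_erb_pub.Rmd"]


def knit_command(rmd_file):
    """Build an Rscript call that renders one .Rmd file via rmarkdown::render()."""
    return ["Rscript", "-e", f'rmarkdown::render("{rmd_file}")']


def knit_all(rmd_files=RMD_FILES):
    """Knit each file in turn; check=True raises on the first failed render."""
    for rmd_file in rmd_files:
        subprocess.run(knit_command(rmd_file), check=True)
```

Run `knit_all()` from the repository root so that the relative paths to data/ and corpus/ resolve.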


The code is licensed under the MIT license and is free to use for any purpose. If you find this dataset helpful, you can cite it as:

Tinits, Peeter. 2018. Tidy Estonian National Bibliography. (Unpublished.)

An updated and more polished version of the dataset will be completed in the near future. If you have any questions, suggestions, or issues running the code, contact me or post an issue in this repository.

The stable location is now https://github.com/peeter-t2/tidy_ERB/.
