Humanities Data Curation Record

A data curation record supports reuse of data and reproducibility of claims by documenting data source(s), data types and formats, data quality, and the methods and tools used to subset, transform, augment, and derive insight from data. With this record, another researcher should be able to:

(1) understand how data are organized
(2) access the methods and tools used to support analysis
(3) see how the data were cleaned and transformed
(4) identify data source(s)

Summary

When, where, and by whom the data were created

What the data describe topically (e.g. Black Death mortality rates)

A brief description of data features (e.g. number of records, mixture of text and tabular data)

Quality

Quality of data (OCR vs. hand transcribed text, uncompressed vs. compressed images)

Type, Format, Extent, Size

text, .txt, 100 files, 10 MB
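The extent and size figures in a line like the one above can be gathered with a short script rather than counted by hand. The following is a minimal sketch, not part of the original record: the function names `summarize_files` and `describe` are hypothetical, and it assumes sizes are reported in decimal megabytes (1 MB = 1,000,000 bytes).

```python
from pathlib import Path


def summarize_files(directory, extension=".txt"):
    """Return (file_count, total_bytes) for files with the given extension.

    Hypothetical helper: searches the directory and all subdirectories.
    """
    files = [p for p in Path(directory).rglob(f"*{extension}") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes


def describe(file_type, extension, count, total_bytes):
    """Format one Type, Format, Extent, Size line.

    Assumption: decimal megabytes, rounded to one decimal place.
    """
    size_mb = total_bytes / 1_000_000
    return f"{file_type}, {extension}, {count} files, {size_mb:.1f} MB"
```

For example, `describe("text", ".txt", *summarize_files("corpus"))` would produce the line for a hypothetical `corpus` directory.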

Filenaming Conventions

yyyymmdd_authorlastname_authorfirstname_title.txt

19401021_hemingway_ernest_forwhomthebelltolls.txt
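A documented convention can also be checked mechanically, which helps catch files that drift from it. The sketch below is one possible reading of the pattern above, not a rule stated in this record: it assumes lowercase ASCII letters in the name parts, letters and digits in the title, and a valid calendar date. The names `FILENAME_PATTERN` and `parse_filename` are hypothetical.

```python
import re
from datetime import datetime

# Assumed reading of yyyymmdd_authorlastname_authorfirstname_title.txt:
# lowercase ASCII letters for names, lowercase letters or digits for the title.
FILENAME_PATTERN = re.compile(
    r"^(?P<date>\d{8})_(?P<last>[a-z]+)_(?P<first>[a-z]+)_(?P<title>[a-z0-9]+)\.txt$"
)


def parse_filename(name):
    """Return the filename's parts as a dict, or None if it breaks the convention."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        return None
    try:
        # Rejects impossible dates such as month 13.
        parsed = datetime.strptime(match["date"], "%Y%m%d").date()
    except ValueError:
        return None
    return {
        "date": parsed.isoformat(),
        "author_last": match["last"],
        "author_first": match["first"],
        "title": match["title"],
    }
```

Running `parse_filename` over every file in a collection and listing the names that return `None` gives a quick report of files that need renaming.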

Modifications

augmentation - e.g. geocoded location data

transformation - e.g. converted date formats from mm/dd/yy to yyyy/mm/dd

cleaning - e.g. corrected Rchrd Jmes to Richard James
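Recording a modification as a small script makes it repeatable. As a sketch of the date transformation mentioned above (mm/dd/yy to yyyy/mm/dd), the function below is illustrative only: `convert_date` and its `century` parameter are hypothetical. The century is passed explicitly because a two-digit year is ambiguous, and a wrong guess would silently corrupt historical data.

```python
from datetime import date


def convert_date(value, century=1900):
    """Convert 'mm/dd/yy' to 'yyyy/mm/dd'.

    Assumption: the caller knows which century the two-digit years belong to.
    Raises ValueError for malformed input or impossible dates.
    """
    month, day, year = (int(part) for part in value.split("/"))
    parsed = date(century + year, month, day)  # validates the calendar date
    return f"{parsed.year:04d}/{parsed.month:02d}/{parsed.day:02d}"
```

Calling `convert_date("10/21/40")` treats the year as 1940; a record from another century would pass `century` explicitly, and that choice belongs in the Modifications section alongside the transformation itself.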

Methods & Tools

method | approach | algorithm | tool

text analysis | topic modeling | Latent Dirichlet Allocation | MALLET

Source

Where the source data came from, indicated by a citation and, if available, a persistent identifier

Michigan State University Libraries. “Feeding America”. East Lansing, MI: Michigan State University. https://www.lib.msu.edu/feedingamerica/

Reuse License

Humanities Data Curation Record
Thomas Padilla & Brandon Locke
CC-BY 4.0
