humdrum-data

The humdrum-data repository contains a collection of musical scores in the Humdrum file format for use with Humdrum-processing software. The primary collection of command-line Humdrum processing software for use with this data can be found here, and documentation about Humdrum can be found here.

Installing git

To download this repository, you should have git installed on your computer. Git is a version control program which is the main interface to online repositories on GitHub. To check if the git program is installed on your computer, type in a terminal:

   which git

If the terminal replies with a line such as /usr/local/bin/git, then git is installed and you can run the download commands given in the Download section below. If the which command replies with an error that git cannot be found, you need to install git. How to do this will depend on your operating system.
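On systems where which is unavailable, the POSIX command -v built-in performs the same check. The following is a minimal sketch; the helper name have_cmd is an illustrative assumption and is not part of git or any Humdrum software:

```shell
#!/bin/sh
# have_cmd NAME: succeed if NAME can be run from the current PATH.
# (have_cmd is a hypothetical helper name used only for this example.)
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd git; then
    echo "git is installed at: $(command -v git)"
else
    echo "git is not installed"
fi
```

The same helper can be reused to check for other programs mentioned in this document, for example have_cmd humcat.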

On Linux, the installation command for git is usually one of these two possibilities:

   sudo yum install git
   sudo apt-get install git

For Apple OS X, the easiest method is to download git from this link. More advanced Mac users can use Homebrew to install git:

   brew install git

For cygwin on MS Windows computers, you should have included git when installing packages when you first installed cygwin. Re-run the installation program and include git in the package installation list.

You can also download graphical interfaces for git here. A GitHub/git plugin is also available for the Eclipse IDE (watch video).

Download

To download the humdrum-data repository, type these three commands in a terminal:

    git clone https://github.com/humdrum-tools/humdrum-data
    cd humdrum-data
    make update

The first command downloads the repository (which is fairly empty, and only contains links to other repositories). The second command changes the working directory to the newly downloaded humdrum-data repository, and the last command downloads most of the actual data files. Type make update whenever you want to download the most recent version of the humdrum-data repository.
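If make is not available, the links to other repositories can in principle be filled in with git's own submodule commands. The sketch below is a generic git equivalent under that assumption, not a copy of the Makefile's update target (which may do more). The helper name fetch_with_submodules and the DRY_RUN variable are hypothetical and exist only in this example:

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): clone a metarepository
# and fill in its submodules using only standard git commands.
# Set DRY_RUN=1 to print the commands instead of running them.
fetch_with_submodules() {
    url=$1
    dir=$2
    run=${DRY_RUN:+echo}   # "echo" when DRY_RUN is set and non-empty, else nothing
    $run git clone "$url" "$dir" &&
    $run git -C "$dir" submodule update --init --recursive
}

# Example: print the two git commands without touching the network.
DRY_RUN=1 fetch_with_submodules https://github.com/humdrum-tools/humdrum-data humdrum-data
```

Running the function without DRY_RUN performs the clone and submodule download for real.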

Note that the ZIP file link found in the green button on the right-hand side of this GitHub page does not currently include submodules (GitHub may add this functionality in the future), so it cannot be used to download this metarepository.

Data overview

Here is an alphabetical listing of data directories within the repository (subdirectories are indented under their parent directory):

Directory              Description
bach-js                Music by Johann Sebastian Bach (1685–1750)
  371chorales          371 four-part chorales collected by C.P.E. Bach and Johann Kirnberger, published by Breitkopf & Härtel, from kernScores
  brandenburg          Brandenburg concertos
  wtc                  Well-Tempered Clavier, Books I & II
beethoven              Music by Ludwig van Beethoven (1770–1827)
  piano-sonatas        32 piano sonatas from kernScores
  quartets             16 string quartets from kernScores
chopin                 Music by Frédéric Chopin (1810–1849)
  mazurkas             49 mazurkas from kernScores
  preludes             24 preludes from kernScores
haydn                  Music by Joseph Haydn (1732–1809)
  quartets             String quartets (converted from MuseData), also on kernScores
hummel                 Music by Johann Nepomuk Hummel (1778–1837)
  preludes             24 preludes from kernScores
joplin                 Music by Scott Joplin (c1867–1917) from kernScores
jrp                    Music of the early Renaissance (c1425–c1525) from the Josquin Research Project (and mirrored on kernScores)
  Ano                  Music by Anonymous fifteenth-century composers
  Agr                  Music by Alexander Agricola (c1446–1506)
  Bus                  Music by Antoine Busnoys (c1430–1492)
  Com                  Music by Loyset Compère (c1445–1518)
  Duf                  Music by Guillaume Du Fay (c1397–1474)
  Jap                  Music by Jean Japart (c1450–c1507)
  Jos                  Music by Josquin des Prez (c1450–1521)
  Mar                  Music by Johannes Martini (c1440–c1497)
  Mou                  Music by Jean Mouton (c1459–1522)
  Obr                  Music by Jacob Obrecht (1458–1505)
  Ock                  Music by Johannes Ockeghem (c1425–1497)
  Ort                  Music by Marbrianus de Orto (c1460–1529)
  Pip                  Music by Matthaeus Pipelare (c1450–c1515)
  Reg                  Music by Johannes Regis (c1425–c1496)
  Rue                  Music by Pierre de la Rue (c1452–1518)
mozart                 Music by Wolfgang Amadeus Mozart (1756–1791)
  piano-sonatas        17 piano sonatas from kernScores
  quartets             String quartets (converted from MuseData), also on kernScores
scarlatti-d            Music by Domenico Scarlatti (1685–1757)
  piano-sonatas        59 keyboard sonatas from kernScores
songs/harmonized       Collections of songs with accompaniment
  liederschatz         Deutscher Liederschatz, edited by Ludwig Erk (1859–1872), also available on kernScores
songs/unaccompanied    Collections of songs without accompaniment
  nova-scotia          Songs and Ballads from Nova Scotia, collected by Helen Creighton (1932), also available on kernScores

Additional online data

Most of the repertories in this archive can also be found on kernScores (http://kern.humdrum.org), with mirrors at http://kern.ccarh.org and http://kernscores.stanford.edu. When this is the case, a file called kern/.kernscores within each submodule will give the kernScores address for the repertory, such as h://371chorales for the set of 371 Bach chorales.
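To see which local repertories have a kernScores counterpart, the kern/.kernscores files can be collected with find. This is a minimal sketch that assumes each such file holds the kernScores address on its first line; the helper name list_kernscores is hypothetical:

```shell
#!/bin/sh
# list_kernscores [DIR]: print "submodule-path<TAB>address" for every
# kern/.kernscores file found under DIR (default: current directory).
# Assumption: the first line of each file is the kernScores address.
# (list_kernscores is a hypothetical helper name for this example.)
list_kernscores() {
    find "${1:-.}" -type f -path '*/kern/.kernscores' | sort |
    while IFS= read -r f; do
        printf '%s\t%s\n' "${f%/kern/.kernscores}" "$(head -n 1 "$f")"
    done
}

# Run from the top of the humdrum-data directory after "make update".
list_kernscores .
```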

You can access kernScores files from the command line using any of the Humdrum Extras tools. For example, to download the 371 Bach chorale set from kernScores, you can use these commands:

   mkdir kernscore-chorales
   cd kernscore-chorales
   humsplit h://371chorales

Entire repertories can also be stored within a single file for convenience, such as:

   humcat -s h://371chorales > 371chorales.krns
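When the same repertory is analyzed repeatedly, it may be convenient to download it only once. The following sketch wraps the humcat -s command above so that an existing, non-empty repertory file is reused. The helper name cache_repertory and the .part suffix are assumptions made for this example, and humcat must be installed:

```shell
#!/bin/sh
# cache_repertory ADDRESS FILE: download ADDRESS with "humcat -s" into FILE
# unless FILE already exists and is non-empty.
# (cache_repertory is a hypothetical helper; humcat must be installed.)
cache_repertory() {
    address=$1
    file=$2
    [ -s "$file" ] && return 0
    # Download to a temporary name so a failed transfer leaves no partial file.
    if humcat -s "$address" > "$file.part"; then
        mv "$file.part" "$file"
    else
        rm -f "$file.part"
        return 1
    fi
}

# Usage:
#   cache_repertory h://371chorales 371chorales.krns && census -k 371chorales.krns
```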

Most Humdrum Toolkit and Humdrum Extras programs can process these repertory files:

   prange -g Soprano h://371chorales
   humcat -s h://371chorales | census -k
   humcat -s h://371chorales > 371chorales.krns; census -k 371chorales.krns

Any individual file on kernScores can be accessed with humextra programs as well (if you know the full name of the file), by downloading from the kernScores website directly from the command line:

   humcat h://371chorales/chor001.krn | less
   keycor h://371chorales/chor001.krn

This should produce the same result as running on a local file from this repository:

   less bach-js/371chorales/kern/chor001.krn
   keycor bach-js/371chorales/kern/chor001.krn

To access kernScores files on the command line with Humdrum Toolkit programs, use the Humdrum Extras humcat program:

   humcat -s h://371chorales | census -k
   humcat h://371chorales/chor001.krn | census -k

The -s option for humcat is needed to keep the files within separate data segments; otherwise, humcat tries to merge separate files into a single spine stream (and merging will fail if the spine structures of the separate data segments do not match).

For musical data from the Josquin Research Project, there are several equivalent methods of downloading from the command line:

   prange jrp://Jos2721
   prange jrp://Jos2721-La_Bernardina
   prange jrp://Jos2721-La_Bernardina.krn
   prange humdrum://jrp/Jos/Jos2721-La_Bernardina.krn
   prange h://jrp/Jos/Jos2721-La_Bernardina.krn

The jrp:// URI form does not require a complete filename, so files can be downloaded by using only their JRP catalog number (Jos2721 for the song La Bernardina).

The jrp URI cannot be used to download an entire composer's works, but this can be done from the humdrum (or h for short) URI from kernScores:

   humcat -s h://jrp/Jos | census -k

There are also three useful genre groupings of JRP data which can be downloaded from the command line (useful when doing genre analysis):

   cd jrp
   (mkdir Zma; cd Zma; humsplit h://jrp/Zma)  # download all JRP mass sections
   (mkdir Zmo; cd Zmo; humsplit h://jrp/Zmo)  # download all JRP motets
   (mkdir Zso; cd Zso; humsplit h://jrp/Zso)  # download all JRP songs
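The three commands above follow one pattern, so they can be written as a loop. This is a sketch only: the helper name fetch_jrp_genres is hypothetical, and humsplit must be installed for it to download anything:

```shell
#!/bin/sh
# fetch_jrp_genres [CODE...]: download each JRP genre grouping into a
# directory named after its code, defaulting to the three codes above.
# (fetch_jrp_genres is a hypothetical helper; humsplit must be installed.)
fetch_jrp_genres() {
    [ "$#" -gt 0 ] || set -- Zma Zmo Zso
    for code in "$@"; do
        # The subshell keeps each cd from affecting later iterations.
        (mkdir -p "$code" && cd "$code" && humsplit "h://jrp/$code") || return 1
    done
}

# Usage, from inside the jrp directory:
#   fetch_jrp_genres          # all three groupings
#   fetch_jrp_genres Zmo      # motets only
```

Using mkdir -p rather than mkdir lets the function be re-run without failing on directories that already exist.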

Contributing

Encodings of public-domain editions of musical scores can be added to this repository by Humdrum users. The preferred method is for the curator of a repertory to maintain their own repository for the music on GitHub. That repository can then be attached to this repository as a submodule. For those unable to use GitHub, the repertory submodule can be hosted by the humdrum-data repository.

The ideal data organization will include a kern subdirectory if the data is primarily **kern data, or a different subdirectory name matching the primary data type (such as dynam if the data consists only of dynamic information for a work, etc.). In addition, a Makefile with various examples of repertory-specific processing for a submodule's data is useful. Most of the current submodules also include the ability to download PDFs for scans of the reference edition for the repertory. For example, to download the PDF files for the Beethoven piano sonatas repertory, type:

    cd beethoven/piano-sonatas
    make reference-edition

This will create the directory beethoven/piano-sonatas/reference-edition with one PDF for each of the kern files in beethoven/piano-sonatas/kern (i.e., PDFs segmented by movement).

Use the 371chorales and jrp repositories as templates for creating your own encoding repositories. The 371chorales repository is an example of a single-directory collection of music, and the jrp repository is an example of a more complex multiple-directory repository of scores (one directory for each composer in this case).
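Before attaching a new repertory as a submodule, it can help to confirm that it follows the layout described in this section. The sketch below checks for a data subdirectory named after the primary data type (kern by default) and notes whether a Makefile is present. The helper name check_repertory and its messages are assumptions for this example, not part of any Humdrum software:

```shell
#!/bin/sh
# check_repertory DIR [TYPE]: report whether DIR contains a data
# subdirectory named TYPE (default: kern) and whether it has a Makefile.
# (check_repertory is a hypothetical helper name for this example.)
check_repertory() {
    dir=$1
    type=${2:-kern}
    if [ ! -d "$dir/$type" ]; then
        echo "missing: $dir/$type"
        return 1
    fi
    # Count the regular files stored in the data subdirectory.
    count=$(find "$dir/$type" -type f | wc -l | tr -d ' ')
    echo "found $count file(s) in $dir/$type"
    [ -f "$dir/Makefile" ] || echo "note: no Makefile in $dir"
}

# Usage:
#   check_repertory beethoven/piano-sonatas
#   check_repertory my-repertory dynam
```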
