This R package provides several datasets of chord sequences. These datasets are expressly for research purposes only.
- `bach_chorales_1`: 370 chorales by J. S. Bach from KernScores, represented as salami slices.
- `bach_chorales_1b`: same as `bach_chorales_1`, but converted to chord sequences using the algorithm of Pardo & Birmingham (2002).
- `classical_1`: 1,022 classical pieces compiled from KernScores, converted to chord sequences using the algorithm of Pardo & Birmingham (2002).
- `classical_1b`: same as `classical_1`, but represented as salami slices.
- `popular_1`: 739 pieces from the McGill Billboard corpus (Burgoyne, 2011), converted from chord symbols to pitch-class sets by Harrison & Pearce (2018).
- `jazz_1`: 1,186 pieces from the iRB corpus (Broze & Shanahan, 2013), converted from chord symbols to pitch-class sets by Harrison & Pearce (2018).
For more details, see the package’s documentation (e.g. `?classical_1`).
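Several of the datasets above are described as salami slices. As a rough illustration of the general idea (not necessarily the exact procedure used to build these datasets), a salami-slice segmentation starts a new slice whenever any note begins or ends, and records the pitch classes sounding within each slice. The `notes` data frame below is invented toy data, and the column names are assumptions made for this sketch.

```r
# Toy note list (hypothetical data): onset and offset in beats, MIDI pitch.
notes <- data.frame(
  onset  = c(0, 0, 1, 2),
  offset = c(2, 1, 2, 3),
  pitch  = c(60, 64, 67, 62)
)

# Slice boundaries: every time point at which some note starts or stops.
boundaries <- sort(unique(c(notes$onset, notes$offset)))

# For each pair of adjacent boundaries, collect the pitch classes of the
# notes that sound throughout that slice.
slices <- lapply(seq_len(length(boundaries) - 1), function(i) {
  start <- boundaries[i]
  end <- boundaries[i + 1]
  sounding <- notes$onset <= start & notes$offset >= end
  sort(unique(notes$pitch[sounding] %% 12))
})

slices
```

Each element of `slices` is one pitch-class set; a slice changes whenever the set of sounding notes changes, regardless of whether the harmony has changed in a musical sense.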
You can install the current version of `hcorp` from GitHub by entering the following commands into R:
if (!require(devtools)) install.packages("devtools")
devtools::install_github("pmcharrison/hcorp")
The `hcorp` package is best used in tandem with the `hrep` package. The `hrep` package provides the underlying representations for the corpora in `hcorp`, as well as methods for manipulating and visualising them.
You can load these packages into the global namespace as follows:
library(hcorp)
library(hrep)
library(magrittr) # Provides the pipe operator, %>%
The `hcorp` package includes several corpora, three of which are shown here:
classical_1
#>
#> A corpus of 1022 sequences
#> total size = 199254 symbols
#> symbol type = 'pc_chord'
#> coded = true
#> (Metadata available)
popular_1
#>
#> A corpus of 739 sequences
#> total size = 74093 symbols
#> symbol type = 'pc_chord'
#> coded = true
#> (Metadata available)
jazz_1
#>
#> A corpus of 1186 sequences
#> total size = 42822 symbols
#> symbol type = 'pc_chord'
#> coded = true
#> (Metadata available)
Internally, a corpus is a list of encoded vectors.
classical_1[1:3] %>% as.list
#> [[1]]
#> Encoded vector of type 'pc_chord', length = 53 (metadata available)
#>
#> [[2]]
#> Encoded vector of type 'pc_chord', length = 47 (metadata available)
#>
#> [[3]]
#> Encoded vector of type 'pc_chord', length = 39 (metadata available)
Encoded vectors are objects of class `coded_vec`.
classical_1[[1]] %>% class
#> [1] "coded_vec_pc_chord" "coded_vec" "integer"
Internally, encoded vectors are sequences of integers. This is good for memory efficiency, and useful for certain modelling approaches.
classical_1[[1]] %>% as.integer %>% head
#> [1] 14481 8473 12553 14481 4245 8465
These vectors can be decoded with the function `decode`.
classical_1[[1]][1:3] %>% decode
#> Vector of type 'pc_chord', length = 3 (metadata available)
classical_1[[1]][1:3] %>% decode %>% as.list
#> [[1]]
#> Pitch-class chord: [7] 2 11
#>
#> [[2]]
#> Pitch-class chord: [4] 0 7 11
#>
#> [[3]]
#> Pitch-class chord: [6] 2 9
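The three integers and decoded chords printed above are consistent with a simple positional scheme: one, plus 2048 times the bass pitch class, plus one bit for each non-bass pitch class, indexed by its interval above the bass. The sketch below is inferred from those three examples alone, so treat it as an assumption about the encoding and not as a description of how `hrep` implements it. Both function names are hypothetical and use only base R.

```r
# Hypothetical helper, not part of hrep: map a pitch-class chord
# (bass pitch class plus the other pitch classes) to an integer.
encode_pc_chord_sketch <- function(bass, others = numeric(0)) {
  intervals <- (others - bass) %% 12   # intervals above the bass, in 1..11
  as.integer(1 + 2048 * bass + sum(2^(11 - intervals)))
}

# Hypothetical inverse: recover c(bass, other pitch classes) from the integer.
decode_pc_chord_sketch <- function(code) {
  x <- code - 1
  bass <- x %/% 2048
  bits <- x %% 2048
  intervals <- 1:11
  present <- (bits %/% 2^(11 - intervals)) %% 2 == 1
  c(bass, sort((bass + intervals[present]) %% 12))
}

encode_pc_chord_sketch(7, c(2, 11))  # 14481, matching the first symbol above
decode_pc_chord_sketch(8473)         # 4 0 7 11, matching the second symbol
```

Under this scheme there are 12 × 2048 possible codes, one for every combination of bass pitch class and set of other pitch classes.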
Corpora and sequences can optionally store metadata.
metadata(classical_1)
#> $description
#> [1] "A selection of common-practice Western tonal music"
metadata(classical_1[[1]])
#> $description
#> [1] "bach-chor001"
#>
#> $keysig
#> [1] 1
#>
#> $mode
#> [1] 0
Corpora and sequences can be subsetted and combined like lists.
classical_1[1:3]
#>
#> A corpus of 3 sequences
#> total size = 139 symbols
#> symbol type = 'pc_chord'
#> coded = true
#> (Metadata available)
classical_1[[1]]
#> Encoded vector of type 'pc_chord', length = 53 (metadata available)
classical_1[[1]][1:3]
#> Encoded vector of type 'pc_chord', length = 3 (metadata available)
c(classical_1[1:3], popular_1[1:3])
#>
#> A corpus of 6 sequences
#> total size = 313 symbols
#> symbol type = 'pc_chord'
#> coded = true
Several of these corpora were converted into chord sequences using Pardo & Birmingham’s (2002) algorithm with an extended template dictionary. This extended dictionary is provided here:
Pitch classes | Name | Weight |
---|---|---|
0 4 7 | maj | 0.436 |
0 4 7 10 | dom7 | 0.219 |
0 3 7 | min | 0.194 |
0 3 6 9 | dim7 | 0.044 |
0 3 6 10 | hdim7 | 0.037 |
0 3 6 | dim | 0.018 |
0 4 7 11 | maj7 | 0.2 |
0 3 7 10 | min7 | 0.2 |
0 4 8 | aug | 0.02 |
0 7 | no3 | 0.05 |
0 7 10 | min7no3 | 0.05 |
Note: only the first six templates (maj to dim) are present in Pardo & Birmingham’s original paper; the rest were added for this work.
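For readers who want to work with the dictionary programmatically, the table can be transcribed into base R as follows. This is only a convenience sketch: the object names and the `template_at_root` helper are hypothetical, and the code does not reproduce how the weights enter the chord-labelling algorithm.

```r
# Pitch classes of each template, relative to a root of 0 (from the table).
template_pcs <- list(
  maj = c(0, 4, 7), dom7 = c(0, 4, 7, 10), min = c(0, 3, 7),
  dim7 = c(0, 3, 6, 9), hdim7 = c(0, 3, 6, 10), dim = c(0, 3, 6),
  maj7 = c(0, 4, 7, 11), min7 = c(0, 3, 7, 10), aug = c(0, 4, 8),
  no3 = c(0, 7), min7no3 = c(0, 7, 10)
)

# Weight of each template (from the table).
template_weights <- c(
  maj = 0.436, dom7 = 0.219, min = 0.194, dim7 = 0.044, hdim7 = 0.037,
  dim = 0.018, maj7 = 0.2, min7 = 0.2, aug = 0.02, no3 = 0.05, min7no3 = 0.05
)

# Hypothetical helper: transpose a named template to a given root pitch class.
template_at_root <- function(name, root) {
  sort((template_pcs[[name]] + root) %% 12)
}

template_at_root("dom7", 7)  # dominant seventh on G: 2 5 7 11
```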
Broze, Y., & Shanahan, D. (2013). Diachronic changes in jazz harmony: A cognitive perspective. Music Perception, 31(1), 32–45. https://doi.org/10.1525/mp.2013.31.1.32
Harrison, P. M. C., & Pearce, M. T. (2018). An energy-based generative sequence model for testing sensory theories of Western harmony. In Proceedings of the 19th International Society for Music Information Retrieval Conference (pp. 160–167). Paris, France.
Pardo, B., & Birmingham, W. P. (2002). Algorithms for chordal analysis. Computer Music Journal, 26(2), 27–49. https://doi.org/10.1162/014892602760137167