-
Notifications
You must be signed in to change notification settings - Fork 1
/
ComplexSyntax.Rmd
176 lines (112 loc) · 6.34 KB
/
ComplexSyntax.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
---
title: "Complex humdrum syntax"
author: "Nathaniel Condit-Schultz"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Complex humdrum syntax}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
source('vignette_header.R')
```
Welcome to "Complex humdrum syntax"!
This article explains how `r hm` handles spine paths and multi-stop data tokens.
# Complex humdrum syntax
The [humdrum syntax](HumdrumSyntax.html "The Humdrum Syntax") includes a few complex structures: *spine paths* and *multi-stops* (a.k.a., sub-tokens).
`r Hm` incorporates these complexities into its data model, no problem, but they do make things more complicated, and may require some thought depending on the analyses you are trying to do.
Understanding the how paths/stops are used is not at all necessary if the data you are interested in doesn't include spine paths or multi-stops!
You can always skip this article and come back to it at a later time.
The way `r hm` incorporates paths/stops is really just an extension of our basic data model, as described in our [Data Fields](DataFields.html "HumdrumR Data Fields article") article.
You should definitely read and understand that article before reading this one.
*Each and every* token, including tokens in various spine paths and various multi-stops, is recorded as a separate *row* in the humdrum table.
# Spine paths
`r Hm` treats spine paths as "sub-spines" of the main spine which they split from, and keeps track of each path (if any) in the `Path` field.
The starting path (leftmost) is numbered path `0`---in datasets with no spine paths, the `Path` field will be all zeros.
Other paths are numbered with higher integers.
Let's look at a simple example (found in the `humdrumRroot` example files):
```{r}
paths1 <- readHumdrum(humdrumRroot, "examples/Paths.krn")
paths1 |> print(view = "humdrum")
paths1 |> print(view = "table")
```
Here is a more complex example:
```{r}
paths2 <- readHumdrum('examples/Paths2.krn')
paths2
```
Notice that `r hm` prints paths in a way that is more readable than reading humdrum syntax directly:
paths are "shifted" over into columns that align.
This is an option to the function `as.matrix.humdrumR()`.
## Working with Paths
How you work with spine paths depends, as always, on the nature of the data.
If you are simply doing global counting of notes, for example, tokens in paths might be treated no differently than any other data.
In some data sets, information in spine paths might not be relevant to your analyses---for example, if the paths are used to store [ossia](https://en.wikipedia.org/wiki/Ossia).
In other analyses---especially, if you are analyzing data in a linear/melodic way---, incorporating/handling spine paths may be very difficult, with no obviously correct way to do it.
In fact, many of `r hm`'s standard "lagged" functions---like `ditto()`, `mint()`, and `timeline()`---can give weird results in the presence of spine paths.
Often, the simplest solution is to simply ignore/remove spine paths from your data.
Since the `Path` field is numbered starting from `0`, this can be achieved easily by filtering out anywhere where `Path > 0`.
```{r}
paths1 |>
filter(Path == 0) |>
removeEmptyPaths()
```
### Expanding Paths
One way of doing linear/melodic analyses in the presence of spine paths is to "expand" the paths into full spines.
The `expandPaths()` function will "expand" paths by copying the parts of spines that are shared by multiple paths into separate paths (or spines).
Observe:
```{r}
paths2 |> expandPaths(asSpines = TRUE)
```
`r Hm`'s [tidyverse methods][withinHumdrum], like `mutate()` and `within()`, also have an `expandPaths` argument.
If `expandPaths = TRUE`, these functions will expand spine paths, apply their expression, then "unexpand" the paths back to normal.
Look at the differnce between these two calls:
```{r}
paths2 |>
group_by(Spine, Path) |>
mutate(Enumerate = seq_along(Token))
paths2 |>
group_by(Spine, Path) |>
mutate(Enumerate = seq_along(Token), expandPaths = TRUE)
```
Notice how, when we *don't* use `expandPaths`, each path is counted entirely separately from the rest.
# Stops
In humdrum syntax, multiple tokens can be placed "in the same place" (i.e., same record, same spine) by simply separating them with spaces.
(This is most commonly used to represent chords in `**kern` data.)
In `r hm`, we call these "Stops"---as always, **every** humdrum token, including stops, get their own row in a `r hm` [humdrum table][humTable].
Thus, we need the `Stop` field to tell us which stop a token came from!
In much data, all/most tokens are simply `Stop == 1` (the first position), but if there are more than one tokens in the same record/spine, they will be numbered ascending from one.
Let's look at an example to make sense of this!
Let's start by looking at our humdrum-data view.
```{r}
stops <- readHumdrum(humdrumRroot, 'examples/Stops.krn')
stops |> print(view = 'humdrum')
stops |> print(view = 'data.frame')
```
Here we have a file with chords in the second spine: individual note tokens separated by spaces.
Now, we can switch back to table view:
You can see that each note of the chords gets its own row, numbered `1`, `2`, and `3` in the `Stop` field!
## Working with Multi-Stops
Working with multi-stops is often even more challenging than working with spine paths.
How you work with stops depends, as always, on the nature of the data.
Again, If you are simply doing global counting of notes, for example, tokens in stops might be treated no differently than any other data.
In other analyses---especially, if you are analyzing data in a linear/melodic way---, incorporating/handling spine paths may be very difficult, with no obviously correct way to do it.
In fact, many of `r hm`'s standard "lagged" functions---like `ditto()`, `mint()`, and `timeline()`---can give weird results in the presence of stops.
Often, the simplest solution is to simply ignore/remove stops from your data.
Since the `Stop` field is numbered starting from `1`, this can be achieved easily by filtering out anywhere where `Stop > 1`.
```{r}
stops |>
filter(Stop == 1) |>
removeEmptyStops()
```
<!---
### Unfolding stops
One way of doing analyses in the presence of stops is to "unfold" single-tokens to fill out the stops.
Try the `unfoldStops()` function:
#```{r}
#stops |>
# unfoldStops()
#
#```
-->