vignettes/ComplexSyntax.Rmd

---
title: "Complex humdrum syntax"
author: "Nathaniel Condit-Schultz"
date:   "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Complex humdrum syntax}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---


```{r, include = FALSE}
source('vignette_header.R')

```

Welcome to "Complex humdrum syntax"!
This article explains how `r hm` handles spine paths and multi-stop data tokens.

# Complex humdrum syntax

The [humdrum syntax](HumdrumSyntax.html "The Humdrum Syntax") includes a few complex structures: *spine paths* and *multi-stops* (a.k.a., sub-tokens).
`r Hm` incorporates these complexities into its data model, no problem, but they do make things more complicated, and may require some thought depending on the analyses you are trying to do.
Understanding the how paths/stops are used is not at all necessary if the data you are interested in doesn't include spine paths or multi-stops!
You can always skip this article and come back to it at a later time.

The way `r hm` incorporates paths/stops is really just an extension of our basic data model, as described in our [Data Fields](DataFields.html "HumdrumR Data Fields article") article.
You should definitely read and understand that article before reading this one.
*Each and every* token, including tokens in various spine paths and various multi-stops, is recorded as a separate *row* in the humdrum table.


# Spine paths


`r Hm` treats spine paths as "sub-spines" of the main spine which they split from, and keeps track of each path (if any) in the `Path` field.
The starting path (leftmost) is numbered path `0`---in datasets with no spine paths, the `Path` field will be all zeros.
Other paths are numbered with higher integers.

Let's look at a simple example (found in the `humdrumRroot` example files):

```{r}

paths1 <- readHumdrum(humdrumRroot, "examples/Paths.krn")

paths1 |> print(view = "humdrum")

paths1 |> print(view = "table")

```


Here is a more complex example:

```{r}

paths2 <- readHumdrum('examples/Paths2.krn')

paths2

```

Notice that `r hm` prints paths in a way that is more readable than reading humdrum syntax directly:
paths are "shifted" over into columns that align.
This is an option to the function `as.matrix.humdrumR()`.

## Working with Paths

How you work with spine paths depends, as always, on the nature of the data.
If you are simply doing global counting of notes, for example, tokens in paths might be treated no differently than any other data.
In some data sets, information in spine paths might not be relevant to your analyses---for example, if the paths are used to store [ossia](https://en.wikipedia.org/wiki/Ossia).
In other analyses---especially, if you are analyzing data in a linear/melodic way---, incorporating/handling spine paths may be very difficult, with no obviously correct way to do it.
In fact, many of `r hm`'s standard "lagged" functions---like `ditto()`, `mint()`, and `timeline()`---can give weird results in the presence of spine paths.
Often, the simplest solution is to simply ignore/remove spine paths from your data.
Since the `Path` field is numbered starting from `0`, this can be achieved easily by filtering out anywhere where `Path > 0`.

```{r}
paths1 |> 
  filter(Path == 0) |>
  removeEmptyPaths()

```


### Expanding Paths


One way of doing linear/melodic analyses in the presence of spine paths is to "expand" the paths into full spines.
The `expandPaths()` function will "expand" paths by copying the parts of spines that are shared by multiple paths into separate paths (or spines).
Observe:

```{r}

paths2 |> expandPaths(asSpines = TRUE)

```

`r Hm`'s [tidyverse methods][withinHumdrum], like `mutate()` and `within()`, also have an `expandPaths` argument.
If `expandPaths = TRUE`, these functions will expand spine paths, apply their expression, then "unexpand" the paths back to normal.
Look at the differnce between these two calls:

```{r}
paths2 |>
  group_by(Spine, Path) |>
  mutate(Enumerate = seq_along(Token))


paths2 |>
  group_by(Spine, Path) |>
  mutate(Enumerate = seq_along(Token), expandPaths = TRUE)

```

Notice how, when we *don't* use `expandPaths`, each path is counted entirely separately from the rest.


# Stops

In humdrum syntax, multiple tokens can be placed "in the same place" (i.e., same record, same spine) by simply separating them with spaces.
(This is most commonly used to represent chords in `**kern` data.)
In `r hm`, we call these "Stops"---as always, **every** humdrum token, including stops, get their own row in a `r hm` [humdrum table][humTable].
Thus, we need the `Stop` field to tell us which stop a token came from!
In much data, all/most tokens are simply `Stop == 1` (the first position), but if there are more than one tokens in the same record/spine, they will be numbered ascending from one.

Let's look at an example to make sense of this!
Let's start by looking at our humdrum-data view.

```{r}
stops <- readHumdrum(humdrumRroot, 'examples/Stops.krn')

stops |> print(view = 'humdrum')

stops |> print(view = 'data.frame')
```


Here we have a file with chords in the second spine: individual note tokens separated by spaces.
Now, we can switch back to table view:
You can see that each note of the chords gets its own row, numbered `1`, `2`, and `3` in the `Stop` field!

## Working with Multi-Stops

Working with multi-stops is often even more challenging than working with spine paths.
How you work with stops depends, as always, on the nature of the data.
Again, If you are simply doing global counting of notes, for example, tokens in stops might be treated no differently than any other data.
In other analyses---especially, if you are analyzing data in a linear/melodic way---, incorporating/handling spine paths may be very difficult, with no obviously correct way to do it.
In fact, many of `r hm`'s standard "lagged" functions---like `ditto()`, `mint()`, and `timeline()`---can give weird results in the presence of stops.
Often, the simplest solution is to simply ignore/remove stops from your data.
Since the `Stop` field is numbered starting from `1`, this can be achieved easily by filtering out anywhere where `Stop > 1`.

```{r}
stops |> 
  filter(Stop == 1) |>
  removeEmptyStops()

```

<!---

### Unfolding stops


One way of doing analyses in the presence of stops is to "unfold" single-tokens to fill out the stops.
Try the `unfoldStops()` function:

#```{r}
#stops |> 
#  unfoldStops()
#
#```

-->