---
title: "Usage of pyramidi functions"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Usage of pyramidi functions}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
## Introduction
I started writing this vignette a while ago, before I knew object-oriented programming (OOP) in R.
So it might be interesting for you if you don't know OOP but want to learn more about the internals of pyramidi.
If you just want to see some use cases, or if you already know R6 well, the other vignettes might be a better place to start.
### Load libraries
First load some libraries:
```{r, include=FALSE, message=FALSE}
pyramidi::install_miditapyr(envname = "r-reticulate")
```
```{r setup, message=FALSE}
library(pyramidi)
library(dplyr)
library(tidyr)
library(purrr)
library(ggplot2)
library(zeallot)
```
### Extract midi into dataframe
We'll extract the information of a midi file into a dataframe, using the midi file that ships with the package:
```{r midi_df}
midi_file_str <- system.file("extdata", "test_midi_file.mid", package = "pyramidi")
midifile <- mido$MidiFile(midi_file_str)
ticks_per_beat <- midifile$ticks_per_beat
```
Now we can load the information of the `midifile` into a dataframe:
```{r midi_df2, message=FALSE}
dfc <- miditapyr$frame_midi(midifile)
head(dfc, 20)
```
This dataframe contains the columns `i_track` (the track index), `meta` (whether the midi event is a meta event), and `msg`, which contains named lists with further midi event information.
The `MidiFile()` function of `mido` also yields the [`ticks_per_beat`](https://mido.readthedocs.io/en/latest/midi_files.html#tempo-and-beat-resolution) of the file:
```{r ticksperbeat}
ticks_per_beat
```
The `miditapyr$unnest_midi()` function transforms the `msg` column of the dataframe to a wide format, where each new column corresponds to one of the names in the lists in `msg` (like `tidyr::unnest_wider()`):
```{r}
df <- miditapyr$unnest_midi(dfc) %>% as_tibble()
head(df, 20)
```
Except for the `name` column, this seems to give the same result as:
```{r}
dfc %>% unnest_wider(msg)
```
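To make the unnesting step concrete, here is a minimal sketch on hand-made data. The toy tibble and its values are assumptions for illustration, not the output of `miditapyr$frame_midi()`, and the sketch only relies on `tidyr::unnest_wider()`:

```r
library(tibble)
library(tidyr)

# Hypothetical stand-in for the nested dataframe: one named list per event.
toy_dfc <- tibble(
  i_track = c(0, 0),
  meta = c(TRUE, FALSE),
  msg = list(
    list(type = "set_tempo", tempo = 500000, time = 0),
    list(type = "note_on", note = 60, velocity = 100, time = 0)
  )
)

# Every name found in the lists becomes its own column; names that are
# missing for an event are filled with NA.
toy_df <- unnest_wider(toy_dfc, msg)
toy_df
```

Because the two events carry different fields, the wide result has `NA` in `note` and `velocity` for the tempo event and `NA` in `tempo` for the note event.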
### Translate midi time information
In the midi format, time is treated as relative increments between events (measured in ticks).
In order to derive the total time passed, you can use the function `tab_measures()`:
```{r}
dfm <- tab_measures(df, ticks_per_beat, c("m", "b")) %>%
  # create a variable `track` with the track name (in order to have it in the plot below)
  mutate(track = ifelse(purrr::map_chr(name, typeof) != "character",
                        list(NA_character_),
                        name)) %>%
  unnest(cols = track) %>%
  fill(track)
dfm
```
```
This function adds further columns:
* `ticks`: specifying the total ticks passed,
* `t`: specifying the total time in seconds passed,
* `m`: specifying the total [measures](https://en.wikipedia.org/wiki/Bar_(music)) (bars) passed,
* `b`: specifying the total [beats](https://en.wikipedia.org/wiki/Beat_(music)) passed,
* `i_note`: unique ascending index for every track and midi note in the midi file.
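
The arithmetic behind these columns can be sketched in base R. This is not the implementation of `tab_measures()`; it is a hand calculation that assumes a single track, a constant tempo of 500000 microseconds per beat (the midi default, 120 bpm), a 4/4 time signature, and made-up delta times:

```r
# All values below are hypothetical and chosen for easy mental arithmetic.
toy_ticks_per_beat <- 480
toy_tempo <- 500000            # microseconds per beat (assumed constant)
toy_delta <- c(0, 480, 960)    # midi `time`: ticks since the previous event

toy_ticks <- cumsum(toy_delta)                 # total ticks passed
toy_b <- toy_ticks / toy_ticks_per_beat        # total beats passed
toy_m <- toy_b / 4                             # total measures, assuming 4/4
toy_t <- toy_b * toy_tempo / 1e6               # total seconds passed

data.frame(time = toy_delta, ticks = toy_ticks, b = toy_b, m = toy_m, t = toy_t)
```

With these numbers the third event lies at 1440 ticks, which is 3 beats, 0.75 measures and 1.5 seconds after the start.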
### Further processing of the midi events
You can split the dataframe in two by whether the events are [meta](https://mido.readthedocs.io/en/latest/meta_message_types.html) or not:
```{r}
dfm %>%
  miditapyr$split_df() %->% c(df_meta, df_notes)
```
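Conceptually, the split is a filter on the `meta` column. The following base R sketch uses a hypothetical toy dataframe and is only meant to illustrate the idea; it is not how `miditapyr$split_df()` is implemented:

```r
# Hypothetical events: one meta event and two note events.
toy_events <- data.frame(
  meta = c(TRUE, FALSE, FALSE),
  type = c("set_tempo", "note_on", "note_off"),
  stringsAsFactors = FALSE
)

# Keep meta events in one dataframe and everything else in the other.
toy_meta  <- toy_events[toy_events$meta, ]
toy_notes <- toy_events[!toy_events$meta, ]
```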
```{r df_meta}
df_meta %>% as_tibble()
```
```{r df_notes}
df_notes %>% as_tibble()
```
### Pivot note dataframe to wide
Each note in the midi file is characterized by a `note_on` and a `note_off` event.
In order to generate a piano roll plot with ggplot2, we need to `tidyr::pivot_wider()` those events.
This can be done with the function `pivot_wide_notes()`:
```{r df_notes_wide}
df_not_notes <-
  df_notes %>%
  dplyr::filter(!stringr::str_detect(type, "^note_o[nf]f?$"))
df_notes_wide <-
  df_notes %>%
  dplyr::filter(stringr::str_detect(type, "^note_o[nf]f?$")) %>%
  # tab_measures(df_meta, df_notes, ticks_per_beat) %>%
  pivot_wide_notes() %>%
  left_join(pyramidi::midi_defs)
df_notes_wide
```
In the new format, the data has half the number of rows.
The columns `m`, `b`, `t`, `ticks`, `time` and `velocity` are each replaced by
two columns, one with the suffix `_note_on` and one with the suffix `_note_off`.
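
The reshaping can be illustrated with plain `tidyr::pivot_wider()` on a toy dataframe. The data and the reduced set of columns are assumptions for illustration; `pivot_wide_notes()` may handle more columns and edge cases than this sketch:

```r
library(tidyr)

# Hypothetical long-format notes: each note has a note_on and a note_off row.
toy_notes <- data.frame(
  i_note = c(1, 1, 2, 2),
  note = c(60, 60, 64, 64),
  type = c("note_on", "note_off", "note_on", "note_off"),
  ticks = c(0, 480, 480, 960),
  velocity = c(100, 0, 90, 0),
  stringsAsFactors = FALSE
)

# One row per note; every value column is split into *_note_on / *_note_off.
toy_wide <- pivot_wider(
  toy_notes,
  id_cols = c(i_note, note),
  names_from = type,
  values_from = c(ticks, velocity)
)
toy_wide
```

The four event rows collapse into two note rows with the columns `ticks_note_on`, `ticks_note_off`, `velocity_note_on` and `velocity_note_off`.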
### Plot midi information in piano roll plot
Now we have the midi data in the right format for the piano roll plot:
```{r midi_piano_roll}
df_notes_wide %>%
  ggplot() +
  geom_segment(
    aes(
      x = m_note_on,
      y = note_name,
      xend = m_note_off,
      yend = note_name,
      color = velocity_note_on
    )
  ) +
  # each midi track is printed into its own facet:
  facet_wrap( ~ track,
              ncol = 1,
              scales = "free_y") +
  guides(color = guide_colorbar(title = "Note velocity")) +
  labs(
    title = "Piano roll of the note events in the midi file",
    subtitle = "Only notes played are shown."
  ) +
  xlab("Measures") +
  scale_x_continuous(breaks = seq(0, 16, 4),
                     minor_breaks = 0:16) +
  scale_colour_gradient() +
  theme_minimal()
```
### Manipulation of the midi data
The new format also makes it easy to manipulate the midi data. For instance, let's set the volume (called `velocity` in midi) of the first beat in every bar to the maximum (127), and halve it for all other notes:
```{r}
df_notes_wide_mod <- df_notes_wide %>%
  mutate(
    velocity_note_on = ifelse(
      # As it's a 4/4 beat, the first beat of each bar is a multiple of 4:
      b_note_on %% 4 == 0,
      127,
      velocity_note_on / 2
    )
  )
```
Let's compare the modified value to the original one:
```{r}
df_notes_wide %>%
  select(b_note_on, velocity_note_on) %>%
  bind_cols(
    new = df_notes_wide_mod$velocity_note_on
  )
```
With an `ifelse()` statement, we modified the volume of the midi notes, depending on whether they fall on the first beat of the measure or not.
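
Midi velocities are whole numbers between 0 and 127, so scaled values may need rounding and clamping before they are written back to a file. The base R sketch below shows one way to do that; the helper name `clamp_velocity()` and the input values are hypothetical and not part of pyramidi:

```r
# Hypothetical helper: round to whole numbers and clamp to the midi range.
clamp_velocity <- function(x) {
  as.integer(pmin(pmax(round(x), 0), 127))
}

toy_velocity <- c(100, 64, 300, -4)
clamp_velocity(toy_velocity / 2)
```

Here the halved values 50 and 32 pass through unchanged, while 150 is capped at 127 and -2 is raised to 0.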
Other possible manipulations could be for instance:
* [Quantization](https://en.wikipedia.org/wiki/Quantization_(music)) by `round()`ing the `note_on`/`note_off` times,
* Chord generation, e.g. by applying a `group_by(floor(m_note_on))`-`summarize()` logic, or
* Arpeggiating chords by a `group_by(floor(m_note_on))` - `mutate()` logic.
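
As a rough sketch of the quantization idea, note onsets can be snapped to a grid by `round()`ing. The example works on ticks and assumes a sixteenth-note grid with quarter-note beats; the variable names and values are made up for illustration:

```r
# Hypothetical resolution and slightly "off-grid" note onsets in ticks.
toy_ticks_per_beat <- 480
toy_grid <- toy_ticks_per_beat / 4      # sixteenth-note grid: 120 ticks
toy_onsets <- c(3, 118, 245, 470)

# Snap every onset to the nearest grid line.
toy_quantized <- round(toy_onsets / toy_grid) * toy_grid
toy_quantized
```

The same pattern could be applied to `ticks_note_on` and `ticks_note_off` in the wide dataframe, as long as both ends of a note are quantized so that no note ends before it starts.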
### Pivot note data frame back to long format
We can transform the wide midi data back to the long format:
```{r pivot_long}
df_notes_long <- pivot_long_notes(df_notes_wide)
```
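The reverse reshaping can be sketched with `tidyr::pivot_longer()`. The toy data and the name pattern are assumptions for illustration; `pivot_long_notes()` may do additional work that this sketch omits:

```r
library(tidyr)

# Hypothetical wide-format notes: one row per note.
toy_wide <- data.frame(
  i_note = c(1, 2),
  note = c(60, 64),
  ticks_note_on = c(0, 480),
  ticks_note_off = c(480, 960),
  velocity_note_on = c(100, 90),
  velocity_note_off = c(0, 0)
)

# Split each column name into the value name and the event type,
# which yields one row per note_on / note_off event.
toy_long <- pivot_longer(
  toy_wide,
  cols = -c(i_note, note),
  names_to = c(".value", "type"),
  names_pattern = "^(.*)_(note_on|note_off)$"
)
toy_long
```

The two note rows turn back into four event rows with the columns `type`, `ticks` and `velocity`.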
### Join non-note events
We can now add the non-note events:
```{r join_non_note_events}
df_midi_out <- merge_midi_frames(df_meta, df_notes_long, df_not_notes)
df_midi_out
```
The `time` value in the midi format is a relative increment: the number of `ticks` that passed since the previous event.
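
For a single track, such relative times can be recovered from absolute tick positions by sorting the events and taking differences. The base R sketch below uses hypothetical data and is not the implementation of `merge_midi_frames()`:

```r
# Hypothetical events of one track with absolute positions in ticks.
toy_events <- data.frame(
  type = c("note_on", "set_tempo", "note_off"),
  ticks = c(480, 0, 960),
  stringsAsFactors = FALSE
)

# Order events by absolute position, then compute ticks since the previous event.
toy_events <- toy_events[order(toy_events$ticks), ]
toy_events$time <- c(toy_events$ticks[1], diff(toy_events$ticks))
toy_events
```

After sorting, the events sit at 0, 480 and 960 ticks, which gives `time` values of 0, 480 and 480.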
### Write midi dataframe back to a midi file
Now we can transform the data back to a dataframe of the same format as the one we got with `miditapyr$frame_midi()`:
```{r dfc2}
dfc2 <-
  df_midi_out %>%
  # When reticulate converts R dataframes to pandas, character columns that
  # contain missing values cause complications.
  # `repair_reticulate_conversion = TRUE` repairs that in the miditapyr python
  # code:
  miditapyr$nest_midi(repair_reticulate_conversion = TRUE)
as_tibble(dfc2)
```
And we can save it back to a midi file:
```{r write_midi, eval=FALSE}
miditapyr$write_midi(dfc2, ticks_per_beat, "test.mid")
```