This document describes the work developed for the Generative Music class, which I attended during the Fall 2018 term at ITP, NYU. All the code written for the class can also be found in this repo.
Generative Music Examples
This piece is an interesting experiment with a system designed to make music that works only mechanically, when the car moves in a certain direction. Each execution is a destructive process that probably won't allow the system to produce the same musical pattern in another play. I selected this piece as generative music because it is a system with certain rules and a way of playing it in order to make music.
Even though I didn't find a performed version of this piece, I was very impressed by the notation used to compose it. This is a piece for two modular synths composed by Spiegel in 1970. It is interesting how the music sheet turns into a drawing sheet (well, in essence it is a drawing sheet) that supports expressive shapes: for example, the noise (I think it is noise) is a dense bunch of lines that move through the space and mix until they fuse at the end of the piece.
Listen to Terry Riley's 'In C'
I listened to a few versions of 'In C' that I found on YouTube. The one I liked the most is the version performed by Brooklyn Raga Massive, a Brooklyn-based collective of musicians and raga enthusiasts who gather to listen to and play raga music. Raga is an Indian style of music that can be defined as 'a melodic framework for improvisation. Each raga is an array of melodic structures with musical motifs, considered in the Indian tradition to have the ability to "color the mind" and affect the emotions of the audience' (source: Wikipedia). Brooklyn Raga Massive: Terry Riley In C.
Other versions I listened to:
- Terry Riley: In C, performed on 2012 Jan. 31 at CEU, Budapest
- Terry Riley - In C (1968) FULL ALBUM
- Terry Riley & s t a r g a z e – 'In C' – Boiler Room Amsterdam Live Performance
- In C (for solo electric guitar and electronics) by Terry Riley
Data sonification performance by Guillermo Montecinos for Generative Music class, NYU ITP Fall 2018.
Nicolasa is a data sonification piece created from the real operation of the Chilean electric power system. The daily generation of the biggest dam power plants is used to create a sound landscape in which the voice of Nicolasa Quintremán demands that the Chilean Government and the Spanish company Endesa respect the territory of the Mapuche in southern Chile.
The original idea for this performance was to sonify the data of the entire Chilean power system in order to describe the internal interactions between the different kinds of technologies that compose it. For that purpose, hydropower plants were going to be mapped to high-pitched notes played by instruments such as strings or piano, whilst thermal power plants (such as coal or gas) were going to be mapped to low-pitched notes played by robust analog synths.
During my investigation I realized that this performance was an opportunity to unveil the political issues that underlie the development of the Chilean power system. In many cases, the construction of power plants has had an irreversible impact on the communities that lived in the area before the projects were designed. This is the case of many small fishing coves that were deeply affected by coal thermal power plants built on the shore near them, which used sea water to cool their systems, killing the fish in the area. Similar cases occurred in southern areas, where developers attempted to install wind farms on land that is sacred to indigenous communities, which are protected by ILO Convention C169.
The most emblematic case is Ralco, a hydropower plant developed by the Spanish company Endesa, whose construction forced an entire community of Mapuche people to move from their land and flooded an ancestral graveyard. The construction of this project was supported by the Chilean Government. Nicolasa Quintremán was the Mapuche leader who fought against Endesa and the Government, and in her words can be found the pride in and respect for her culture and her homeland: "We will never get out of here, unless we are dead. Because we own the right, we own the mother, is a live mother, is the land. My mother died, but the land is alive and will live forever."
For this performance, the music was built from real operating data of the Chilean power system. Three MIDI tracks were created using MIDIUtil for Python, one for each of three different instruments. A base track contains the base harmony of the song, which is built using the data of eight Chilean dam power plants. A MIDI note was assigned to each plant depending on its maximum power, whilst its daily generation was mapped to the velocity of the MIDI note. The main melody of the piece is "played" by Ralco, whose daily generation was mapped to the possible MIDI notes centered on 60, following the C major scale. Velocity for this track is fixed at 100. To complete the sound landscape, a bass line was added whose pitch is mapped from the daily generation of the coal thermal plant Bocamina II. This power plant is located in the bay area of Lota/Coronel in southern Chile, near Concepción, one of the most polluted areas in the country due to thermal plants.
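The mapping described above can be sketched roughly as follows. This is only an illustration, not the script used for the piece: the function names, the one-octave span around note 60, the linear scaling, the MWh unit and the sample values are all assumptions.

```python
# Illustrative sketch only: names, ranges and sample data are assumptions,
# not the code used for the performance.

C_MAJOR_STEPS = (0, 2, 4, 5, 7, 9, 11)  # semitone offsets of the C major scale


def c_major_notes(center=60, span=12):
    """Return the C major MIDI notes within `span` semitones of `center`."""
    return [n for n in range(center - span, center + span + 1)
            if n % 12 in C_MAJOR_STEPS]


def scale_value(value, lo, hi, out_lo, out_hi):
    """Linearly map `value` from [lo, hi] to [out_lo, out_hi], clamping at the ends."""
    if hi == lo:
        return out_lo
    ratio = min(1.0, max(0.0, (value - lo) / (hi - lo)))
    return out_lo + ratio * (out_hi - out_lo)


def generation_to_velocity(daily_mwh, max_mwh):
    """Map a plant's daily generation to a MIDI velocity between 1 and 127."""
    return int(round(scale_value(daily_mwh, 0, max_mwh, 1, 127)))


def generation_to_pitch(daily_mwh, max_mwh, notes):
    """Pick the scale note whose position matches the daily generation level."""
    index = int(round(scale_value(daily_mwh, 0, max_mwh, 0, len(notes) - 1)))
    return notes[index]


def melody_track(daily_generation, max_mwh, velocity=100):
    """Return one (beat, pitch, velocity) tuple per day, with a fixed velocity."""
    notes = c_major_notes()
    return [(beat, generation_to_pitch(g, max_mwh, notes), velocity)
            for beat, g in enumerate(daily_generation)]


# Hypothetical daily generation values (MWh) for a few days.
print(melody_track([0, 50, 100], max_mwh=100))
```

Each resulting tuple could then be written to a track with MIDIUtil's `MIDIFile.addNote(track, channel, pitch, time, duration, volume)`. The base harmony would use `generation_to_velocity` with a fixed note per plant instead of a fixed velocity.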
The MIDI file generated in the process was imported into Ableton Live, where VST instruments were assigned to play piano for the dam plants and cello for Ralco. Both were also processed with a Digidelay pedal. The bass line was used to control a Moog Minitaur via MIDI. To complete the piece, the voice of Nicolasa Quintremán was imported as samples in order to provide the narrative of the piece. All the samples were taken from the documentary film Ralco by Esteban Larraín.
The full piece can be found here.
Bias in Machine Learning
by Guillermo Montecinos, Generative Music Class, ITP Fall 2018.
Bias has been present in Machine Learning for as long as it has existed, because ML is a set of mathematical rules arbitrarily established by a human mind, but it became a problem when the designed models started to output undesirable or unexpected results. As I learned from some brief research, there are as many places for bias in ML as there are layers in the process of designing and training a model and interpreting its results. These three places are: 1) the model itself, 2) the data used for training the model, and 3) the perception process in which we humans try to make sense of ML results.
A model is just a model; it's not a truth or a dogma. That is what I learned at school when, in optimization class, I realized that the only way to develop a model that could approach a true version of a phenomenon was by adding infinite variables. A model is an abstraction of the world thought up by a person (probably, in most cases, a white male) who attempts to represent, through his/her vision of the world, the logic behind a certain phenomenon. It is also a limited version of his/her abstraction of the world because, since computing power is limited, the model must be limited too.
Thus, it is essential to be aware of the original goal the model was developed for, because a model designed to estimate the credit risk of bank customers will behave differently from a model developed to estimate terrorist behavior, and there is a bias embedded in each model's logic.
Training Data Bias
On the other hand, independently of the inherent bias the model could have, the training data plays a major role in the bias that an ML process may output, and a lot has been written about it. Since ML models are basically statistical prediction systems, they attempt to predict the future by looking at the past, so, as in history, how the model will behave in the future depends on which past we tell it.
There are many examples of biased ML or AI outputs caused by the biased datasets used to train those models, but I'd like to mention two of them. The first is Microsoft's Twitter bot Tay, an ML/AI experiment released in 2016 that attempted to simulate the behavior of a female American teenager on Twitter. Tay was designed to learn from her network, so as she computed her followers' behavior she very quickly turned into an anti-Semitic, pro-Hitler and pro-Trump Twitter user. The change in her was so extreme that Microsoft had to close the account after one day.
The other example is IBM Watson, an AI system designed to offer easy-to-use ML services to clients through a closed chat box. With that goal in mind, IBM developers used Urban Dictionary as a training dataset for the model, clearly intending to teach Watson informal idioms. The problem came when the machine learned violent expressions.
Finally, the third layer of bias in ML is the perception process whereby we try to assign a meaning to something. I found an interesting post by Memo Akten in which he talks about bias in ML processes. I won't describe extensively what he states, but I want to focus on the idea of "projecting meaning into noise": when we have a trained model (any kind of model, let's say an image generation one), what we really have is an array of nodes that take an input and reorganize it according to what they learned from a bunch of patterns we passed to them.
So, for example, if we trained a model to create Rorschach test-style images and fed it an array of random pixels, it would probably output a Rorschach-style shape generated from noise. What Akten states is that the meaning of the ML output is not inherent to the result itself but is given by our perception system trying to find patterns in the shape, because at its core it's just random noise sorted according to patterns (the training dataset). During this process, what we are actually doing is projecting all our biases in order to find a meaning in, or project a meaning into, noise.
Bias in ML, as in any other science, is a multi-variable problem that can't be addressed from only one direction. Dataset bias is real (and very important, by the way), but it is a problem embedded in another problem: the bias in the model's conception. In order to be fair about the bias in an ML output, we always have to be aware of all the layers in front of us and recognize that under no circumstances can the output of a mathematical process be objective.
Yes I'm Feelings
Generative music piece built with Machine Learning from Radiohead's data by Sofía Suazo and Guillermo Montecinos for Generative Music class, NYU ITP Fall 2018.
Yes I'm Feelings is a 2:19-minute musical piece inspired by Radiohead's song Fitter Happier, from the album OK Computer. It was built by predicting Radiohead-styled lyrics and melodies from two databases, which were triggered live from two computers. This piece was a collaboration with Sofía Suazo for the midterm of the Generative Music class at NYU ITP, Fall 2018.
The music was generated using a long short-term memory (LSTM) recurrent neural network (RNN) trained with data collected from MIDI files. The original idea was to compose a Thom Yorke-styled melody from a bunch of Radiohead's songs, but the difficulty of properly interpreting time from MIDI files led us to reinterpret the result of the prediction process.
To predict a Radiohead-style melody, a dataset of 18 MIDI files was collected from the internet. Because the MIDI protocol is old, there is a huge variety of files of varying quality for the same song, which led us to manually clean the dataset by listening to every file in MuseScore and checking the quality (or presence) of a melody in a separate track. After that process we arrived at a sub-dataset composed of the following songs, which can be found in the dataset folder:
To train the LSTM model, the MIDI data was converted into text sequences, a process in which only two parameters were collected: pitch and time. The Python library mido was used to import each MIDI message from a file and extract the pitch and time of each note, using this script.
Each song was converted into a concatenation of words, each composed of the MIDI pitch (a value from 0 to 127) and the time in ticks, so each note looked like "66_238". The processed data can be found here. Finally, as the model training script requires a unified dataset as input, all songs were concatenated into one file called input.txt.
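The encoding step can be illustrated with a minimal sketch. It is not the original script: it assumes the notes have already been extracted as (pitch, ticks) pairs (in the project they came from mido messages), that words are separated by spaces, and that songs are joined with newlines. The function names are hypothetical.

```python
# Illustrative sketch only: function names and separators are assumptions.

def notes_to_words(notes):
    """Encode (pitch, ticks) pairs as words such as "66_238", joined by spaces."""
    words = []
    for pitch, ticks in notes:
        if not 0 <= pitch <= 127:
            raise ValueError(f"MIDI pitch out of range: {pitch}")
        words.append(f"{pitch}_{ticks}")
    return " ".join(words)


def words_to_notes(text):
    """Decode a string of "pitch_ticks" words back into (pitch, ticks) pairs."""
    notes = []
    for word in text.split():
        pitch, ticks = word.split("_")
        notes.append((int(pitch), int(ticks)))
    return notes


def build_corpus(songs):
    """Concatenate every encoded song into a single training text."""
    return "\n".join(notes_to_words(song) for song in songs)


# Two hypothetical, very short "songs".
songs = [[(66, 238), (64, 120)], [(60, 480)]]
corpus = build_corpus(songs)
print(corpus)
```

The decoder is included because the words predicted by the model have to be turned back into pitch and time values before they can be played.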
Model training and simulations with ml5.js
For the model training process, the LSTM training framework developed by Hannah Davis was used. The model was trained with the data obtained from the main melody of 12 Radiohead songs.
Magenta: Exploring MelodyRNN
During week 11 I continued exploring the generation of melodies based on Radiohead's data, this time with Magenta's MelodyRNN model.
I trained the model with data collected from the Lakh MIDI Dataset and the MIDI files used for the last performance, which added up to 39 files. The training process took about 30 minutes and was set to 2000 steps; the final loss was 0.49556.
Ten MIDI files were generated by this process. Within this run, just 2 of the 10 outputs sound similar to Radiohead's melodies, but those two really do sound like them. This doesn't mean that Magenta's models are inaccurate; on the contrary, the accurate results are very lifelike. It's important to consider that the model was trained with the entire files, which means that it takes information from every instrument in the song, and that can affect the melody output.
Listen to the generated melodies here.