# Report

## Group project for *Advanced topics in machine learning* lecture (2019)
#### Benjamin Ellenberger, Nicolas Deperrois, Laura Kriener

## Chosen Topic: **Music generation**

### **Motivation and State-of-the-Art**

The topic for the group project was chosen at the very beginning of the course.
As it was announced that the course would nearly always use images for demonstrations, examples and exercises, we decided that our project should use a different form of data.
We decided to work in the broad field of music generation with deep learning.

To narrow down the topic, we investigated the state of the art in this field.
The most impressive recent results come from Google and OpenAI.
The [Google Magenta project](https://magenta.tensorflow.org/) covers a wide range of applications such as harmonization, drum machines and a music-generating network using the transformer architecture with attention.

Another very recent result in the field of music generation was published by OpenAI. [MuseNet](https://openai.com/blog/musenet/) uses the recently published GPT-2 architecture, which is also a large-scale transformer network.

The Google and OpenAI approaches, as well as other (less famous) approaches, have in common that they combine very complicated network architectures with immense computational resources.

As the required computational power is far out of our reach, we wondered if this level of complexity is really unavoidable.
And so the question **How much can you do with how little?** became the leading theme for our project.
We want to see what results can be achieved using much simpler network architectures (i.e. architectures within the scope of the lecture).
Which aspects of music generation can be achieved and which have to be ignored? For example, can you generate a reasonable melody line without considering the rhythm?

### Challenges
The main challenge is that music generation is a very broad topic. Before we can even start, we have to answer a couple of questions:

- What exactly do we want to generate? Melody? Rhythm? Harmony?

This will depend on the network structures we try out. For example, for simple feed-forward networks we focus only on melody, while we include rhythm in the LSTMs.

- How to feed music into a network? Spectrum? Pitches? Intervals?

We will not use spectrum or audio data. Instead, we will work with pitches, intervals and note lengths extracted from MIDI files.

- What kind of music?

As we try to keep things as simple as possible, we decided to use the widely used Bach chorale dataset (see `data/raw/bach`).

- How do we evaluate the result?

The quality of music is highly subjective, and it is very difficult to find a metric that evaluates how good a produced piece of music is.
We plan to evaluate the pieces produced by the different network architectures by comparing them and by each network's ability to capture different aspects of music (e.g. if one architecture can only produce melodies without considering rhythm and another includes rhythm, the second is better suited for music generation). Additionally, the similarity to the musical style the network was trained on can serve as a measure of quality.
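One way to make "similarity to the training style" concrete is to compare how often each melodic interval occurs in a generated melody and in the training melodies. The following is a minimal sketch under that assumption, not the evaluation actually used in this project. The function names `interval_histogram` and `histogram_distance` and the example pitch lists are hypothetical.

```python
from collections import Counter


def interval_histogram(pitches):
    """Return the relative frequency of each melodic interval (in semitones)."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    if not intervals:
        return {}
    counts = Counter(intervals)
    total = len(intervals)
    return {interval: n / total for interval, n in counts.items()}


def histogram_distance(hist_a, hist_b):
    """Total variation distance: 0 for identical histograms, 1 for disjoint ones."""
    keys = set(hist_a) | set(hist_b)
    return 0.5 * sum(abs(hist_a.get(k, 0.0) - hist_b.get(k, 0.0)) for k in keys)


# Hypothetical MIDI pitch sequences (60 = middle C), for illustration only.
reference = [60, 62, 64, 65, 67, 65, 64, 62, 60]  # stands in for training data
stepwise = [67, 69, 71, 72, 74, 72, 71, 69, 67]   # same contour, transposed
leaps = [60, 72, 55, 79, 48, 84, 50, 77, 60]      # large jumps only

ref_hist = interval_histogram(reference)
print(histogram_distance(ref_hist, interval_histogram(stepwise)))
print(histogram_distance(ref_hist, interval_histogram(leaps)))
```

Such a measure only captures one narrow aspect of style (the interval statistics) and ignores rhythm and harmony, so it could at most complement listening comparisons.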

### Data sets

We will work on the Bach chorale dataset, which we included in the repository (see `data/raw/bach`).
MIDI files are binary, so it is difficult to modify and create them directly.
We use the Python libraries `pianoroll` and `midicsv`, which translate a MIDI file into a human-readable (and modifiable) CSV string.
Building on this, we have written our own utility functions that extract high-level information about the music (e.g. tempo, tonality), perform changes to the tracks and write them back to a MIDI file. The utility functions are located in the file `src/midi_utils.py`. A demonstration of how to use them can be found in the notebook `demo_midi_utils.ipynb`. The functions to create a PyTorch-compatible dataset from the MIDI data are in the file `src/dataset_utils.py`.
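To illustrate the kind of preprocessing involved, the sketch below turns a list of MIDI pitches into interval-based training pairs using a sliding window. It is a simplified stand-in and does not reproduce the code in `src/dataset_utils.py`. The class name `MelodyWindowDataset`, the `context` parameter and the example melody are assumptions made for illustration. The class implements only `__len__` and `__getitem__`, which is the interface of a map-style dataset in PyTorch.

```python
class MelodyWindowDataset:
    """Sliding windows over a melody: `context` intervals in, next interval out."""

    def __init__(self, pitches, context=4):
        # Represent the melody by pitch differences in semitones, which makes
        # the samples independent of the key the piece is written in.
        self.intervals = [b - a for a, b in zip(pitches, pitches[1:])]
        self.context = context

    def __len__(self):
        # Each sample needs `context` input intervals plus one target interval.
        return max(0, len(self.intervals) - self.context)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        window = self.intervals[index:index + self.context]
        target = self.intervals[index + self.context]
        return window, target


# Hypothetical melody given as MIDI pitch numbers (60 = middle C).
melody = [60, 62, 64, 65, 67, 65, 64, 62, 60]
dataset = MelodyWindowDataset(melody, context=4)
print(len(dataset))
print(dataset[0])
```

In a real training pipeline the returned lists would still have to be converted to tensors, and note lengths could be added as a second feature per step.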

### **Feed-Forward Networks**

#### Network structure

![Feed-forward architecture](graphics/forward2.png)


#### Generation mechanism

![Schematic drawing of generation procedure](graphics/forward3.png)



### **LSTMs**

### **Auto-Encoder**