## LSTM and Music Generation

Credits: https://towardsdatascience.com/how-to-generate-music-using-a-lstm-neural-network-in-keras-68786834d4c5

### RNN: Classic Unrolling
![rnn](https://chunml.github.io/images/projects/creating-text-generator-using-recurrent-neural-network/vanilla_RNN.png)



The hidden state at time step $t$, $h_t$, is computed from the current input $x_t$ and the previous hidden state $h_{t-1}$ ($\sigma$ is the sigmoid function):

$$h_t=\sigma(W_{xh}x_t+W_{hh}h_{t-1})$$

Output (depends only on the latest hidden state $h_t$):

$$y_t = \mathrm{softmax}(W_{hy}h_t)$$
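
To make the two equations concrete, here is a minimal sketch of one forward step in plain Python. The helper names (`matvec`, `rnn_step`) and the toy weight values are made up for illustration, and bias terms are left out because the equations above leave them out.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy):
    """One time step: h_t = sigmoid(W_xh x_t + W_hh h_prev), y_t = softmax(W_hy h_t)."""
    pre = [a + b for a, b in zip(matvec(W_xh, x_t), matvec(W_hh, h_prev))]
    h_t = [sigmoid(p) for p in pre]
    y_t = softmax(matvec(W_hy, h_t))
    return h_t, y_t


# Toy weights (arbitrary values): 2 inputs, 2 hidden units, 3 output classes.
W_xh = [[0.5, -0.3], [0.1, 0.8]]
W_hh = [[0.2, 0.0], [-0.4, 0.6]]
W_hy = [[1.0, -1.0], [0.5, 0.5], [-0.7, 0.2]]

h = [0.0, 0.0]  # initial hidden state
for x in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    h, y = rnn_step(x, h, W_xh, W_hh, W_hy)  # h carries over between steps

print("final hidden state:", h)
print("final output distribution:", y)
```

The same three weight matrices are reused at every time step; only the hidden state `h` changes as the sequence is consumed.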

### RNN problem: Vanishing Gradient
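- As we back-propagate through time, the gradient of the cost function w.r.t. the weights tends to shrink at each earlier time step.
- This means inputs from earlier time steps have little influence on the weight updates, so the network effectively forgets them.
- Training becomes very slow, and long-range dependencies are hard to learn.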

![vanishing gradient](https://cdn-images-1.medium.com/max/2000/1*FWy4STsp8k0M5Yd8LifG_Q.png)
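
The shrinking effect can be reproduced with a toy calculation. The sketch below assumes a scalar RNN with the sigmoid update from the previous section (the weights `w_x`, `w_h` and the constant input are arbitrary choices) and tracks $\partial h_t / \partial h_0$ with the chain rule. The sigmoid derivative is at most 0.25, so with `w_h = 1.0` every step multiplies the gradient by a factor below 1.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient_wrt_initial_state(inputs, w_x=0.5, w_h=1.0, h0=0.0):
    """Return d h_t / d h_0 after each step of a scalar sigmoid RNN."""
    h = h0
    grad = 1.0
    history = []
    for x in inputs:
        h = sigmoid(w_x * x + w_h * h)
        # Chain rule: d h_t / d h_{t-1} = sigmoid'(pre) * w_h = h_t * (1 - h_t) * w_h
        grad *= h * (1.0 - h) * w_h
        history.append(grad)
    return history


grads = gradient_wrt_initial_state([1.0] * 20)
for step in (1, 5, 10, 20):
    print(f"after {step:2d} steps: d h_t / d h_0 = {grads[step - 1]:.3e}")
```

The printed values fall off geometrically, which is why the earliest inputs contribute almost nothing to the weight updates.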

### LSTM

![lstm](https://chunml.github.io/images/projects/creating-text-generator-using-recurrent-neural-network/LSTM.png)

- $h_{t-1}$ is the output (hidden state) at time step $t-1$
- The cell state ($C_t$) holds the "long short-term memory" and is controlled by 3 gates:
  - Input gate ($i_t$): decides which values to update
  - Forget gate ($f_t$): decides which values to forget
  - Output gate ($o_t$): decides which values to output
- $h_t$ is the output (hidden state) at time step $t$
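
As a rough illustration of how the three gates interact, here is one step of a commonly used LSTM formulation in plain Python. This is a sketch under assumptions: the text above does not spell out the equations, so the gate formulas, the `lstm_step` and `random_params` helpers, and the random toy weights are illustrative and not necessarily what Keras does internally.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Compute W v + b for a matrix W (list of rows), bias b and vector v."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; params maps 'f', 'i', 'o', 'c' to (W, b) pairs."""
    z = h_prev + x_t  # concatenate previous output and current input
    f_t = [sigmoid(a) for a in affine(*params["f"], z)]      # forget gate
    i_t = [sigmoid(a) for a in affine(*params["i"], z)]      # input gate
    o_t = [sigmoid(a) for a in affine(*params["o"], z)]      # output gate
    c_hat = [math.tanh(a) for a in affine(*params["c"], z)]  # candidate values
    # New cell state: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    # New output: a gated view of the squashed cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def random_params(hidden, inputs, seed=0):
    """Hypothetical helper: small random weights and zero biases per gate."""
    rng = random.Random(seed)

    def gate():
        W = [[rng.uniform(-0.5, 0.5) for _ in range(hidden + inputs)]
             for _ in range(hidden)]
        return W, [0.0] * hidden

    return {name: gate() for name in ("f", "i", "o", "c")}


params = random_params(hidden=3, inputs=2)
h, c = [0.0] * 3, [0.0] * 3
for x in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    h, c = lstm_step(x, h, c, params)

print("h_t:", h)
print("C_t:", c)
```

Note that the cell state is updated additively (`f * c + i * g`), which is what lets information survive across many steps when the forget gate stays close to 1.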

## Attention vs. LSTM / RNNs

In recent months (as of mid-2018), LSTMs and RNNs have fallen out of favour relative to attention-based models (machine learning is like the fashion industry).

We won't cover attention in this workbook, but may do so in the future.

Some background if you are curious:

https://towardsdatascience.com/the-fall-of-rnn-lstm-2d1594c74ce0

https://arxiv.org/abs/1502.03044

https://github.com/philipperemy/keras-attention-mechanism

## Music Generation

Train a neural network to generate MIDI files.

Repository: https://github.com/Skuldur/Classical-Piano-Composer

Blog post: https://towardsdatascience.com/how-to-generate-music-using-a-lstm-neural-network-in-keras-68786834d4c5

### Setup

[Music21](http://web.mit.edu/music21/doc/about/what.html) is a Python toolkit for computer-aided musicology (studying, editing, and composing music).

Install to your environment:

```
(mldds03) pip install music21
```


In [2]:
# clone the repository
!git clone https://github.com/Skuldur/Classical-Piano-Composer

Cloning into 'Classical-Piano-Composer'...


### Predict

As a starting point, we won't modify the script or setup, but try out the demo to make sure it still works.

Later on, we can add our own midi files, tweak the LSTM network, etc.

In [14]:
# each ! command runs in its own shell, so cd must be chained with the command on the same line
!cd Classical-Piano-Composer & dir

 Volume in drive D is DATA
 Volume Serial Number is B200-6E0E

 Directory of D:\mldds-courseware\03_TextImage\Classical-Piano-Composer

02/08/2018  02:43 PM    <DIR>          .
02/08/2018  02:43 PM    <DIR>          ..
02/08/2018  02:43 PM                66 .gitattributes
02/08/2018  02:43 PM    <DIR>          data
02/08/2018  02:43 PM             3,978 lstm.py
02/08/2018  02:43 PM    <DIR>          midi_songs
02/08/2018  02:43 PM        43,837,652 new_weights.hdf5
02/08/2018  02:43 PM             4,771 predict.py
02/08/2018  02:43 PM               917 README.md
02/08/2018  02:43 PM        43,837,652 weights.hdf5
02/08/2018  02:43 PM    <DIR>          __pycache__
               6 File(s)     87,685,036 bytes
               5 Dir(s)  786,515,230,720 bytes free


There is a pre-trained LSTM network (`weights.hdf5`) already present. Let's run it to generate music.

In [1]:
!cd Classical-Piano-Composer & python predict.py

2018-08-02 14:54:50.643288: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
Using TensorFlow backend.


In [3]:
# Opens predict.py in Visual Studio Code so you can inspect it.
# If you don't have Visual Studio Code installed, replace
# code with notepad or another text editor.
!cd Classical-Piano-Composer & code predict.py


Running `predict.py` produces the following file:

`test_output.mid`

In [1]:
# Reference: https://blog.ouseful.info/2016/09/13/making-music-and-embedding-sounds-in-jupyter-notebooks/

from music21 import midi


def play_midi(filename):
    """Play a MIDI file inline in the notebook.

    Args:
        filename: path to the MIDI file
    """
    mf = midi.MidiFile()
    mf.open(filename)
    try:
        mf.read()
    finally:
        # always release the file handle, even if parsing fails
        mf.close()
    stream = midi.translate.midiFileToStream(mf)
    stream.show('midi')

play_midi('Classical-Piano-Composer/test_output.mid')

### Train

Now that we've verified that the pre-trained network works, let's try training using our own midi files.

Note that the original network takes about 20 hours to train, so we'll just be training a smaller version of 
the network for demonstration purposes.
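
Before training, it can help to picture what the training data looks like. The sketch below assumes the task is framed as next-note prediction over fixed-length windows, which is a common setup for this kind of model; the `make_sequences` helper, the toy note list and the window length of 3 are hypothetical and not taken from `lstm.py`.

```python
def make_sequences(notes, sequence_length):
    """Turn a note list into (input window, next note) pairs of integer ids."""
    vocab = sorted(set(notes))
    note_to_int = {note: idx for idx, note in enumerate(vocab)}
    inputs, targets = [], []
    for start in range(len(notes) - sequence_length):
        window = notes[start:start + sequence_length]
        inputs.append([note_to_int[n] for n in window])
        # The target is the note that immediately follows the window.
        targets.append(note_to_int[notes[start + sequence_length]])
    return inputs, targets, note_to_int


# Toy melody (made up); a real run would extract notes from the .mid files.
notes = ["C4", "E4", "G4", "C5", "G4", "E4", "C4"]
X, y, mapping = make_sequences(notes, sequence_length=3)

for window, target in zip(X, y):
    print(window, "->", target)
```

A melody of 7 notes with a window of 3 yields 4 training samples, so the number of samples grows with both the number and the length of the MIDI files you add.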

### Training with GPU

If you have a machine with an NVIDIA GPU, you can use keras-gpu to speed up training (about a 6x speedup).

This requires uninstalling keras (the CPU version) and installing keras-gpu:

```
(mldds03) conda uninstall keras
(mldds03) conda install keras-gpu
``` 

Steps:

1. Rename the `midi_songs` folder to `midi_songs_original`.

2. Rename `weights.hdf5` to `weights.hdf5.original`.

3. Create an empty `midi_songs` folder and download about 5-10 .mid files of your choice into it. The original network was trained on a single instrument.
   - As an experiment, you can add more instruments to see what happens (the output may be gibberish).
   - Or, if you want to play it safe, pick MIDI files that use just one instrument.
   - Example sources:
     - http://meteorheaven.tripod.com/frame/mchi_male.htm
     - http://sanjeevmusic.com/


4. Create a copy of `lstm.py` called `lstm_exercise.py`, then edit the copy:

   a. Edit the `train` function to reduce the number of epochs from 200 to 10. Outside of class, you can restart training and let it run for more epochs.

   b. Add a TensorBoard callback to monitor training progress.

    ```
    from keras.callbacks import TensorBoard
    from time import time

    ...

    tensorboard = TensorBoard(log_dir='./logs/{}'.format(time()),
                              histogram_freq=0,
                              batch_size=64,
                              write_graph=True)

    ...

    callbacks_list = [checkpoint, tensorboard]

    ```

5. Launch TensorBoard:
    ```
   (mldds03) D:\mldds-courseware\03_TextImage\Classical-Piano-Composer>tensorboard --logdir logs --host=0.0.0.0
    
    2018-08-02 15:32:56.198899: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
    TensorBoard 1.8.0 at http://0.0.0.0:6006 (Press CTRL+C to quit)
    
    ```
   Open a browser window to http://localhost:6006

6. Start training:
    ```
    python lstm_exercise.py
    ```


If all goes according to plan, you should see output like this (it differs depending on whether you are running with or without a GPU):
   
### CPU-only
```

    (mldds03) D:\mldds-courseware\03_TextImage\Classical-Piano-Composer>python lstm_exercise.py
    Using TensorFlow backend.
    Parsing midi_songs\aidehuhuan.mid
    Parsing midi_songs\kewang.mid
    Parsing midi_songs\parapara.mid
    Parsing midi_songs\shuinenggaoshuwo.mid
    Parsing midi_songs\zhaomi.mid
    2018-08-02 15:35:15.560156: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
    Epoch 1/10
    5344/5344 [==============================] - 223s 42ms/step - loss: 4.4112
    Epoch 2/10
     832/5344 [===>..........................] - ETA: 3:06 - loss: 4.1681

```

### GPU

```
    (mldds03) D:\mldds-courseware\03_TextImage\Classical-Piano-Composer>python lstm_exercise.py
    Using TensorFlow backend.
    Parsing midi_songs\aidehuhuan.mid
    Parsing midi_songs\kewang.mid
    Parsing midi_songs\parapara.mid
    Parsing midi_songs\shuinenggaoshuwo.mid
    Parsing midi_songs\zhaomi.mid
    2018-08-02 15:48:55.160750: I C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
    2018-08-02 15:48:55.635377: I C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties:
    name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
    pciBusID: 0000:01:00.0
    totalMemory: 11.00GiB freeMemory: 9.01GiB
    2018-08-02 15:48:55.722389: I C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 1 with properties:
    name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
    pciBusID: 0000:02:00.0
    totalMemory: 11.00GiB freeMemory: 9.01GiB
    2018-08-02 15:48:55.727757: I C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0, 1
    2018-08-02 15:48:57.789074: I C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
    2018-08-02 15:48:57.793350: I C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929]      0 1
    2018-08-02 15:48:57.796512: I C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0:   N N
    2018-08-02 15:48:57.800165: I C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 1:   N N
    2018-08-02 15:48:57.805051: I C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8713 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
    2018-08-02 15:48:58.227726: I C:\users\nwani\_bazel_nwani\mmtm6wb6\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 8713 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)

    Epoch 1/10
    5344/5344 [==============================] - 37s 7ms/step - loss: 4.3308
    Epoch 2/10
    1856/5344 [=========>....................] - ETA: 22s - loss: 4.1595
```
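
The two logs above are enough for a quick back-of-the-envelope check of the speedup and the total training time. The sketch below only uses numbers that appear in the logs; the batch size of 64 is an inference from the progress counters (832 and 1856 are both multiples of 64) and from the `batch_size=64` in the callback snippet, so treat it as an assumption.

```python
import math

# Numbers read off the first completed epoch in each log above.
cpu_seconds_per_epoch = 223
gpu_seconds_per_epoch = 37
samples_per_epoch = 5344
epochs = 10

# Assumption: inferred from the progress counters, not confirmed by the script.
batch_size = 64

speedup = cpu_seconds_per_epoch / gpu_seconds_per_epoch
batches_per_epoch = math.ceil(samples_per_epoch / batch_size)
cpu_total_minutes = cpu_seconds_per_epoch * epochs / 60
gpu_total_minutes = gpu_seconds_per_epoch * epochs / 60

print(f"GPU speedup: {speedup:.2f}x")
print(f"Batches per epoch: {batches_per_epoch}")
print(f"Estimated CPU training time: {cpu_total_minutes:.1f} min")
print(f"Estimated GPU training time: {gpu_total_minutes:.1f} min")
```

The ratio comes out close to the 6x figure quoted earlier, and the estimates assume every epoch takes as long as the first one.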

TensorBoard should show the network graph right away, but the loss curve only appears after some training has completed (typically once the first epoch has finished).

LSTM graph:

![tensorboard](assets/lstm/tensorboard_1.png)

Initial loss values:

![tensorboard](assets/lstm/tensorboard_2.png)

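### Predict (using your own trained weights)

1. Copy the latest checkpoint over the weights file, for example `copy weights-improvement-10-4.0982-bigger.hdf5 weights.hdf5` (the epoch and loss in your file name will differ).
2. Run `python predict.py`.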
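
The copy in step 1 can also be scripted so that you don't have to type the checkpoint name by hand. This is a sketch under assumptions: it guesses the naming pattern `weights-improvement-<epoch>-<loss>-bigger.hdf5` from the single file name shown above, and the helper names `best_checkpoint` and `use_best_checkpoint` are made up for this example.

```python
import re
import shutil
from pathlib import Path

# Assumed pattern, inferred from "weights-improvement-10-4.0982-bigger.hdf5".
CHECKPOINT_PATTERN = re.compile(r"weights-improvement-(\d+)-(\d+\.\d+)-bigger\.hdf5$")


def best_checkpoint(directory):
    """Return the checkpoint path with the lowest loss in its name, or None."""
    candidates = []
    for path in Path(directory).glob("weights-improvement-*.hdf5"):
        match = CHECKPOINT_PATTERN.search(path.name)
        if match:
            epoch, loss = int(match.group(1)), float(match.group(2))
            candidates.append((loss, epoch, path))
    if not candidates:
        return None
    # Lowest loss wins; on a tie, prefer the later epoch.
    return min(candidates, key=lambda item: (item[0], -item[1]))[2]


def use_best_checkpoint(directory, target_name="weights.hdf5"):
    """Copy the best checkpoint over the weights file and return its path."""
    best = best_checkpoint(directory)
    if best is None:
        raise FileNotFoundError(f"no checkpoints found in {directory}")
    shutil.copyfile(best, Path(directory) / target_name)
    return best
```

For example, `use_best_checkpoint("Classical-Piano-Composer")` would overwrite `weights.hdf5` in the cloned repository, so keep the renamed `weights.hdf5.original` around if you want to go back to the pre-trained network.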

In [7]:
play_midi('Classical-Piano-Composer/test_output.mid')