# DeepClassic

## Reference
* [Asking RNNs+LTSMs: What Would Mozart Write?](http://www.wise.io/tech/asking-rnn-and-ltsm-what-would-mozart-write)

## music21 UserSettings
* [music21 environment tutorial](http://web.mit.edu/music21/doc/tutorials/environment.html#environment)
* [music21](https://gist.github.com/Vesnica/f8862277e4e3a27593f4ca300eedf07e)

### Install 

      sudo apt install musescore python3-scipy timidity lilypond

In [3]:
from music21 import *

In [4]:
us = environment.UserSettings()
us.getSettingsPath()

'/home/carnd/.music21rc'

In [39]:
#us["musicxmlPath"] = "/usr/bin/gedit"
us["musicxmlPath"] = "/usr/bin/mscore"
us["midiPath"] = "/usr/bin/timidity"
us["showFormat"] = "lilypond"
us["writeFormat"] = "lilypond"
us["musescoreDirectPNGPath"] = "/usr/bin/mscore"
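The paths above are specific to one machine. As a minimal sketch, the standard library can look the executables up on `PATH` before they are written into the settings. The helper name `find_tools` and the list of candidate binary names are assumptions for illustration, not part of music21.

```python
import shutil

def find_tools(names=("mscore", "musescore", "timidity", "lilypond")):
    """Return a dict mapping each candidate executable name to its full path, or None."""
    # The candidate names are assumptions; the binary name can differ between distributions.
    return {name: shutil.which(name) for name in names}

tools = find_tools()
for name, path in tools.items():
    print(name, "->", path if path else "not found on PATH")
```

Any path reported here can then be assigned to the matching `us[...]` key instead of a hard-coded `/usr/bin/...` value.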

## Prepare Data

In [5]:
!mkdir composer

In [62]:
import glob

REP = "@\n"  # marker written in place of each barline; "@" is not part of Humdrum syntax

def trim_metadata(output_path, glob_path):
    """Concatenate all files matching glob_path into output_path, dropping headers and comments."""
    with open(output_path, "w") as comp_txt:
        for song in glob.glob(glob_path):
            with open(song, "r") as f:
                lines = f.readlines()
            out = []
            found_first = False
            for l in lines:
                if l.startswith("="):
                    # barline (new measure): replace it with the @ marker
                    out.append(REP)
                    found_first = True
                    continue
                if not found_first:
                    # skip the header and metadata that precede the first barline
                    continue
                if l.startswith("!"):
                    # skip comment lines
                    continue
                out.append(l)
            comp_txt.writelines(out)
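To see what this filtering keeps, here is a small sketch that applies the same rules to an in-memory list of lines instead of files. The helper `trim_kern_lines` and the sample lines are made up for illustration; they are not taken from the corpus.

```python
REP = "@\n"

def trim_kern_lines(lines):
    """Apply the same filtering rules as trim_metadata to a list of lines."""
    out = []
    found_first = False
    for line in lines:
        if line.startswith("="):
            # barline: emit the marker and start keeping data lines
            out.append(REP)
            found_first = True
        elif found_first and not line.startswith("!"):
            # data line after the first barline that is not a comment
            out.append(line)
    return out

# Hypothetical excerpt in kern format (header, two measures, one comment)
sample = [
    "!!!COM: Bach, Johann Sebastian\n",
    "**kern\t**kern\n",
    "*clefF4\t*clefG2\n",
    "=1\t=1\n",
    "4.r\t16dL\n",
    "!! an inline comment\n",
    ".\t16e\n",
    "=2\t=2\n",
    "*-\t*-\n",
]
trimmed = trim_kern_lines(sample)
print("".join(trimmed))
```

Note that interpretation lines appearing after the first barline (such as the `*-` terminator) are kept, because only lines starting with `!` or `=` are treated specially.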

### Get kern data from a GitHub repository
*  [automata/ana-music: Automatic analysis of classical music for generative composition](https://github.com/automata/ana-music)

In [4]:
!git clone https://github.com/automata/ana-music.git

Cloning into 'ana-music'...
remote: Counting objects: 1876, done.
remote: Compressing objects: 100% (1576/1576), done.
remote: Total 1876 (delta 240), reused 1876 (delta 240), pack-reused 0
Receiving objects: 100% (1876/1876), 8.55 MiB | 5.47 MiB/s, done.
Resolving deltas: 100% (240/240), done.
Checking connectivity... done.


In [63]:
composers = ["mozart","beethoven","chopin","scarlatti","haydn"]
for composer in composers:
    output_path = "composer/" + composer + ".txt"
    glob_path = "ana-music/corpus/{composer}/*.krn".format(composer=composer)
    trim_metadata(output_path, glob_path)

In [64]:
!ls composer/*.txt

composer/beethoven.txt	composer/haydn.txt   composer/scarlatti.txt
composer/chopin.txt	composer/mozart.txt


### Get Data from KernScores
* [KernScores](http://kern.humdrum.org/)

In [None]:
%mkdir kernscore
%mkdir kernscore/bach

In [76]:
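from urllib.request import urlopen

# Download inven01.krn through inven15.krn from KernScores
for i in range(1, 15 + 1):
    filename = "inven{0:02d}.krn".format(i)
    url = "http://kern.humdrum.org/cgi-bin/ksdata?l=osu/classical/bach/inventions&file=%s&f=kern" % filename
    # close both the HTTP response and the output file when done
    with urlopen(url) as response, open("kernscore/bach/" + filename, "wb") as output:
        output.write(response.read())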

In [78]:
output_path = "composer/bach.txt"
glob_path = "kernscore/bach/*.krn"
trim_metadata(output_path, glob_path)

## Setup

In [1]:
import numpy as np

In [2]:
filename = 'composer/bach.txt'
with open(filename, 'r') as f:
    text=f.read()
vocab = set(text)

In [3]:
text[:50]

'@\n4.r\t16dL\n.\t16e\n.\t16f\n.\t16g\n.\t16a\n.\t16b-J\n@\n4.r\t1'

In [45]:
chars = sorted(list(set(text)))
vocab_size = len(vocab)
vocab_size

43

In [5]:
char_indices = dict((c, i) for i, c in enumerate(vocab))
indices_char = dict((i, c) for i, c in enumerate(vocab))

In [6]:
idx = [char_indices[c] for c in text]

In [7]:
idx[:10]

[15, 3, 2, 16, 0, 40, 41, 21, 26, 13]
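The mapping above is built by iterating over a `set`, whose order for strings can change between Python sessions, so the same character may get a different index after a kernel restart. A minimal sketch of a reproducible round trip, assuming a sorted vocabulary; the names `char_to_idx`, `idx_to_char` and `ids` are illustrative, and the sample string is the start of the text shown above.

```python
sample_text = "@\n4.r\t16dL\n.\t16e\n"

# Sorting fixes the order, so the mapping is identical in every session
chars = sorted(set(sample_text))
char_to_idx = {c: i for i, c in enumerate(chars)}
idx_to_char = {i: c for i, c in enumerate(chars)}

# Encode to integer indices, then decode back to text
ids = [char_to_idx[c] for c in sample_text]
decoded = "".join(idx_to_char[i] for i in ids)

print(len(chars), "distinct characters")
print(decoded == sample_text)
```

A fixed mapping also matters if a trained model is saved and reloaded later, since the indices it learned must still refer to the same characters.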

## Preprocess

In [3]:
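maxlen = 1  # input window length in characters
sentences = []
next_chars = []
for i in range(0, len(idx) - maxlen+1):
    sentences.append(idx[i: i + maxlen])
    # the target is the same window shifted one character ahead;
    # for the final i this slice runs past the end of idx and is shorter than maxlen
    next_chars.append(idx[i+1: i+maxlen+1])
print('nb sequences:', len(sentences))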


In [2]:
# Stack into 2-D arrays; the slice drops the last few windows,
# including the final one, whose target is incomplete
sentences = np.array(sentences[:-6])
next_chars = np.array(next_chars[:-6])


In [60]:
sentences.shape, next_chars.shape

((46468, 1), (46468, 1))
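With `maxlen = 1` each training input is a single character, so the network only ever sees one step of context. The sketch below builds the same kind of input/target pairs for an arbitrary window length using NumPy indexing; `make_windows` is a hypothetical helper, not part of the notebook, and it only emits windows that have a complete target.

```python
import numpy as np

def make_windows(ids, maxlen):
    """Build (inputs, targets) where each target row is the input row shifted by one step."""
    ids = np.asarray(ids)
    n = len(ids) - maxlen  # number of windows that still have a full target
    if n <= 0:
        raise ValueError("sequence is too short for this window length")
    starts = np.arange(n)[:, None]        # shape (n, 1)
    offsets = np.arange(maxlen)[None, :]  # shape (1, maxlen)
    x = ids[starts + offsets]             # shape (n, maxlen)
    y = ids[starts + offsets + 1]         # same windows, one step later
    return x, y

x, y = make_windows([5, 1, 4, 2, 3, 0], maxlen=3)
print(x.shape, y.shape)
```

Because the loop bound excludes incomplete windows, no trailing entries need to be sliced off afterwards.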

## Build Model

In [61]:
import keras
from keras.models import Sequential
from keras.layers import Embedding, LSTM, TimeDistributed, Activation, Dense, Dropout
from keras.optimizers import Adam

In [62]:
n_fac = 24

In [63]:
model=Sequential([
        # map each character index to a dense vector of n_fac values
        Embedding(vocab_size, n_fac, input_length=maxlen),
        # the input shape is inferred from the Embedding output
        LSTM(units=512, return_sequences=True, dropout=0.2, recurrent_dropout=0.2,
             implementation=2),
        Dropout(0.2),
        LSTM(512, return_sequences=True, dropout=0.2, recurrent_dropout=0.2,
             implementation=2),
        Dropout(0.2),
        # one softmax over the vocabulary per time step
        TimeDistributed(Dense(vocab_size)),
        Activation('softmax')
    ])

In [64]:
model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam())
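As a rough size check, the number of trainable weights can be worked out by hand. This sketch assumes the standard LSTM parameterization (four gates, each with input weights, recurrent weights and a bias) and uses the values from this notebook (`vocab_size = 43`, `n_fac = 24`, 512 units); `lstm_params` is an illustrative helper.

```python
def lstm_params(input_dim, units):
    """Trainable weights in one LSTM layer: 4 gates with input, recurrent and bias weights."""
    return 4 * (units * (input_dim + units) + units)

vocab_size, n_fac, units = 43, 24, 512

embedding = vocab_size * n_fac            # one n_fac vector per character
lstm1 = lstm_params(n_fac, units)         # first LSTM reads the embeddings
lstm2 = lstm_params(units, units)         # second LSTM reads the first LSTM's output
dense = units * vocab_size + vocab_size   # weights plus biases of the output layer
total = embedding + lstm1 + lstm2 + dense

print(embedding, lstm1, lstm2, dense, total)
```

Under these assumptions the model has a little over three million weights, which is large compared with the 46468 training pairs prepared above.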

## Training

In [74]:
model.fit(sentences, np.expand_dims(next_chars,-1), batch_size=64, epochs=100)

Epoch 1/100
...
Epoch 78

<keras.callbacks.History at 0x7fb3bb796630>

In [75]:
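def print_example():
    seed_string="@"
    for i in range(3000):
        # encode only the last character, because the model was trained with maxlen = 1
        x=np.array([char_indices[c] for c in seed_string[-1:]])[np.newaxis,:]
        preds = model.predict(x, verbose=0)[0][-1]
        # renormalize in float64 so the probabilities sum to 1 within np.random.choice's tolerance
        preds = preds.astype('float64')
        preds = preds/np.sum(preds)
        # sample an index and decode it with indices_char, the inverse of the mapping used for encoding
        next_idx = np.random.choice(len(preds), p=preds)
        seed_string = seed_string + indices_char[next_idx]
    print(seed_string)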
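The loop in `print_example` draws each character directly from the predicted distribution. A common variation, which the notebook itself does not use, is temperature sampling: the distribution is sharpened (temperature below 1) or flattened (temperature above 1) before drawing. A minimal sketch with NumPy only; `sample_index` and the example probabilities are assumptions for illustration.

```python
import numpy as np

def sample_index(probs, temperature=1.0, rng=None):
    """Draw one index from a probability vector after temperature rescaling."""
    rng = np.random.default_rng() if rng is None else rng
    probs = np.asarray(probs, dtype=np.float64)
    # work in log space; clip to avoid log(0)
    logits = np.log(np.clip(probs, 1e-12, None)) / temperature
    logits -= logits.max()  # subtract the maximum for numerical stability
    scaled = np.exp(logits)
    scaled /= scaled.sum()  # renormalize to a valid distribution
    return int(rng.choice(len(scaled), p=scaled))

rng = np.random.default_rng(0)
probs = [0.7, 0.2, 0.1]  # made-up distribution over three symbols
draws = [sample_index(probs, temperature=0.5, rng=rng) for _ in range(1000)]
counts = np.bincount(draws, minlength=3)
print(counts)
```

The returned index would be decoded with the same index-to-character mapping that was used to encode the training text.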

In [76]:
print_example()

@#rr#####rrr###r#rr##cM###rr#####Gr#rrr###8
{.rrr#r##r#rrrrr#GSGdO##r###r###8[  {.rrr#rr#rr#r#GS##r####ScS##G'8CS#rr#8e2CSGr#####rr##rrr##Gr#rrr##8CS##r#rrrrrr#rrr##rrr##Gr#r##rrr####GS#r###rrrr#8*2[{{{{{{{{{{{{{{{{{{{{{{{{{.rr##Gr#8
{{{{{{{{{{{{{{{{{{{{.rrr###8[ {.rrrrr#GSc#Gr#rrrr####rrrr#r#rr##r###r#1FS###r##Gr##r#1FS###rr#b#r###8D[{{{{{.###r#r#rrr####r#8}#8Mr##rr#######Sc#rrr#Gr##rr###r###rrr#####r#r#####r####r##r#r#8O#r#r#rr##rr#S##r#cM8_r#r#r####GS#8	{{{{{.#Gd###rrrrrrrr#r###rr####8_rr###8eDr#8
.#####8_##r#r#1Fr###Gr####r#r###8dO##r#rr###r##r##rr#8Drr##rr#rr#rrr#r#8;rr##rr###r#rrr##r##rr###r#1Fr#rr####r#rr#r##GS#rrr#rr#r##rr#r#r#rrr##rr#Grr###8e##rrrr#rr#cScS#Gr###S####rr########rrr#c##rr####rrr###r#rrrrrrr#r#r########rr#GSc#Grrrrr#r###rr###SGr###r#1FScSGr#rr#rrrrrr##r#########rr#######cM#Grr###nr#Gr#8}#rrrr#Gr####8_#r#r###8e8Arr#r#8Drrrrr##r##rrrrrrrrr#4}#rrrrr#8grr##rr#rr#r#8Crr#rrr##Grrrr##rr#rrr#Gb#r#####rrrr##rrrrrr#rrrrrr#S####8CScMr###8_#r###8}rrr##rrr##r#r#8Drr##r#G8g#r##