# DeepClassic

Reference
* [Asking RNNs+LTSMs: What Would Mozart Write?](http://www.wise.io/tech/asking-rnn-and-ltsm-what-would-mozart-write)

## music21 UserSettings
* http://web.mit.edu/music21/doc/tutorials/environment.html#environment
* [music21](https://gist.github.com/Vesnica/f8862277e4e3a27593f4ca300eedf07e)

### Install 

      sudo apt install musescore timidity lilypond python3-scipy

In [11]:
from music21 import *

In [12]:
us = environment.UserSettings()
us.getSettingsPath()

'/home/tsu-nera/.music21rc'

In [39]:
#us["musicxmlPath"] = "/usr/bin/gedit"
us["musicxmlPath"] = "/usr/bin/mscore"
us["midiPath"] = "/usr/bin/timidity"
us["showFormat"] = "lilypond"
us["writeFormat"] = "lilypond"
us["musescoreDirectPNGPath"] = "/usr/bin/mscore"

## Prepare Data

In [5]:
!mkdir composer

In [62]:
import glob
REP="@\n"
def trim_metadata(output_path, glob_path):
    """Concatenate the kern files matching glob_path into output_path, without headers and comments."""
    with open(output_path, "w") as comp_txt:
        for song in glob.glob(glob_path):
            with open(song, "r") as f:
                lines = f.readlines()
            out = []
            found_first = False
            for l in lines:
                if l.startswith("="):
                    ## barline (new measure): replace it with the @ sign, which is not part of humdrum
                    out.append(REP)
                    found_first = True
                    continue
                if not found_first:
                    ## skip the header and metadata that come before the first measure
                    continue
                if l.startswith("!"):
                    ## ignore comments
                    continue
                out.append(l)
            comp_txt.writelines(out)
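
To make the filtering rules concrete, here is a minimal sketch that applies the same per-line logic to an in-memory list instead of files. The helper `trim_lines` and the sample kern fragment are made up for illustration and are not part of the notebook.

```python
REP = "@\n"

def trim_lines(lines):
    """Apply the same per-line rules as trim_metadata to a list of lines."""
    out = []
    found_first = False
    for line in lines:
        if line.startswith("="):
            # barline: emit the measure marker and start keeping lines
            out.append(REP)
            found_first = True
            continue
        if not found_first:
            # still inside the header and metadata
            continue
        if line.startswith("!"):
            # comment line
            continue
        out.append(line)
    return out

# Hypothetical two-spine kern fragment, invented for this example.
sample = [
    "!!!COM: Example composer\n",
    "**kern\t**kern\n",
    "*clefF4\t*clefG2\n",
    "=1\t=1\n",
    "4.r\t16dL\n",
    "!! a comment inside the music\n",
    ".\t16e\n",
    "=2\t=2\n",
    "4c\t4e\n",
]
trimmed = trim_lines(sample)
print("".join(trimmed))
```

Everything before the first barline is dropped, each barline becomes `@`, and comment lines are removed. Interpretation lines that start with `*` and appear after the first barline are kept, because the rules above only filter `=` and `!` lines.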

### Get kern data from github database
*  [automata/ana-music: Automatic analysis of classical music for generative composition](https://github.com/automata/ana-music)

In [4]:
!git clone https://github.com/automata/ana-music.git

Cloning into 'ana-music'...
remote: Counting objects: 1876, done.
remote: Compressing objects: 100% (1576/1576), done.
remote: Total 1876 (delta 240), reused 1876 (delta 240), pack-reused 0
Receiving objects: 100% (1876/1876), 8.55 MiB | 5.47 MiB/s, done.
Resolving deltas: 100% (240/240), done.
Checking connectivity... done.


In [63]:
composers = ["mozart","beethoven","chopin","scarlatti","haydn"]
for composer in composers:
    output_path = "composer/" + composer + ".txt"
    glob_path = "ana-music/corpus/{composer}/*.krn".format(composer=composer)
    trim_metadata(output_path, glob_path)

In [64]:
!ls composer/*.txt

composer/beethoven.txt	composer/haydn.txt   composer/scarlatti.txt
composer/chopin.txt	composer/mozart.txt


### Get Data from KernScore
* [KernScores](http://kern.humdrum.org/)

In [None]:
%mkdir kernscore
%mkdir kernscore/bach

In [76]:
from urllib.request import urlopen
base_url = "http://kern.humdrum.org/cgi-bin/ksdata?l=osu/classical/bach/inventions&file=%s&f=kern"
for i in range(1, 15 + 1):
    filename = "inven{0:02d}.krn".format(i)
    # download each of the 15 inventions and save it under kernscore/bach/
    with urlopen(base_url % filename) as response, open("kernscore/bach/" + filename, 'wb') as output:
        output.write(response.read())

In [78]:
output_path = "composer/bach.txt"
glob_path = "kernscore/bach/*.krn"
trim_metadata(output_path, glob_path)

## Setup

In [7]:
import numpy as np

In [79]:
filename = 'composer/bach.txt'
with open(filename, 'r') as f:
    text=f.read()
vocab = set(text)

In [80]:
text[:50]

'@\n4.r\t16dL\n.\t16e\n.\t16f\n.\t16g\n.\t16a\n.\t16b-J\n@\n4.r\t1'

In [81]:
vocab_size = len(vocab)
vocab_size

43

In [82]:
char_indices = dict((c, i) for i, c in enumerate(vocab))
indices_char = dict((i, c) for i, c in enumerate(vocab))

In [83]:
idx = [char_indices[c] for c in text]

In [84]:
idx[:10]

[14, 13, 39, 17, 40, 22, 19, 26, 30, 9]
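
As a sanity check on the character encoding, the sketch below round-trips a short string through the two lookup tables. The sample string is a stand-in for the contents of `composer/bach.txt`. Note one deliberate difference from the cell above: the vocabulary is sorted here, which is an assumption added so that the indices come out the same on every run (iteration order of a plain `set` of strings can differ between Python processes).

```python
# Short stand-in for the training text, in the same format as composer/bach.txt.
text = "@\n4.r\t16dL\n.\t16e\n.\t16f\n@\n"

# Sorted vocabulary (assumption: the notebook uses the unsorted set directly).
vocab = sorted(set(text))
char_indices = {c: i for i, c in enumerate(vocab)}
indices_char = {i: c for i, c in enumerate(vocab)}

# Encode to integer indices, then decode back to characters.
idx = [char_indices[c] for c in text]
decoded = "".join(indices_char[i] for i in idx)

print(len(vocab), idx[:10])
print(decoded == text)
```

If the decoded string matches the input, the two dictionaries are consistent inverses of each other, which is what the model's output layer relies on when generated indices are turned back into kern text.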

## Preprocess

In [85]:
maxlen = 40
sentences = []
next_chars = []
for i in range(0, len(idx) - maxlen+1):
    sentences.append(idx[i: i + maxlen])
    next_chars.append(idx[i+1: i+maxlen+1])
print('nb sequences:', len(sentences))

nb sequences: 46435


In [86]:
# drop the trailing windows: the last target slice is one element short, so it cannot be stacked
sentences = np.array(sentences[:-6])
next_chars = np.array(next_chars[:-6])

In [87]:
sentences.shape, next_chars.shape

((46429, 40), (46429, 40))
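
The inputs and targets are the same windows shifted by one character. In the loop above the range runs to `len(idx) - maxlen + 1`, so the final target slice is one element short, and the notebook handles that by dropping the last six windows. A minimal sketch on toy data, assuming you would rather stop the loop one window earlier instead (the toy `idx` and the small `maxlen` are illustrative only):

```python
import numpy as np

idx = list(range(10))  # stand-in for the encoded text
maxlen = 4

# Stop one window early so every target slice has a full maxlen entries.
n = len(idx) - maxlen
x = np.array([idx[i:i + maxlen] for i in range(n)])
y = np.array([idx[i + 1:i + maxlen + 1] for i in range(n)])

print(x.shape, y.shape)
print(x[0], y[0])
```

Each row of `y` is the matching row of `x` moved forward by one position, so at every timestep the model is trained to predict the next character.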

## Build Model

In [88]:
import keras
from keras.models import Sequential
from keras.layers import Embedding, LSTM, TimeDistributed, Activation, Dense, Dropout
from keras.optimizers import Adam

In [89]:
n_fac = 24

In [90]:
model=Sequential([
        Embedding(vocab_size, n_fac, input_length=maxlen),
        LSTM(units=512, return_sequences=True, dropout=0.2, recurrent_dropout=0.2,
             implementation=2),
        Dropout(0.2),
        LSTM(512, return_sequences=True, dropout=0.2, recurrent_dropout=0.2,
             implementation=2),
        Dropout(0.2),
        TimeDistributed(Dense(vocab_size)),
        Activation('softmax')
    ])

In [91]:
model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam())

In [92]:
model.fit(sentences, np.expand_dims(next_chars,-1), batch_size=64, epochs=1)

Epoch 1/1
 1152/46429 [..............................] - ETA: 1754s - loss: 3.2811

KeyboardInterrupt: 
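
To put the reported loss in context, here is a NumPy sketch of the quantity that `sparse_categorical_crossentropy` measures: the mean negative log-probability assigned to the integer targets. It illustrates the formula only and is not Keras's implementation. The function name, the batch shape, and the all-zero targets are chosen for the example.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, targets):
    """Mean negative log-probability of the integer targets.

    probs:   (batch, timesteps, vocab) array of softmax outputs
    targets: (batch, timesteps) array of integer class indices
    """
    # pick, for every timestep, the probability assigned to the true class
    picked = np.take_along_axis(probs, targets[..., None], axis=-1)
    return float(-np.log(picked).mean())

vocab_size = 43  # vocabulary size found above
# A model that spreads probability uniformly over all characters.
probs = np.full((2, 5, vocab_size), 1.0 / vocab_size)
targets = np.zeros((2, 5), dtype=int)

baseline = sparse_categorical_crossentropy(probs, targets)
print(round(baseline, 4))
```

A uniform guess over 43 characters gives a loss of ln(43), about 3.76, so the 3.2811 shown after the first batches is already below that baseline.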