![logo](https://awl.co.jp/wp-content/themes/awl/img/img-header-logo.png)

# AI Music Generator Project (Polyphony_RNN)

In [2]:
from IPython.display import HTML
import random

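def hide_toggle(for_next=False):
    """Return an HTML link that shows or hides the input of this cell.

    With for_next=True the link controls the next cell instead, and the
    current cell's input is hidden permanently.
    """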
    this_cell = """$('div.cell.code_cell.rendered.selected')"""
    next_cell = this_cell + '.next()'

    toggle_text = 'Toggle show/hide'  # text shown on toggle link
    target_cell = this_cell  # target cell to control with toggle
    js_hide_current = ''  # bit of JS to permanently hide code in current cell (only when toggling next cell)

    if for_next:
        target_cell = next_cell
        toggle_text += ' next cell'
        js_hide_current = this_cell + '.find("div.input").hide();'

    js_f_name = 'code_toggle_{}'.format(str(random.randint(1,2**64)))

    html = """
        <script>
            function {f_name}() {{
                {cell_selector}.find('div.input').toggle();
            }}

            {js_hide_current}
        </script>

        <a href="javascript:{f_name}()">{toggle_text}</a>
    """.format(
        f_name=js_f_name,
        cell_selector=target_cell,
        js_hide_current=js_hide_current, 
        toggle_text=toggle_text
    )

    return HTML(html)


hide_toggle()

### Configurations

**Basic**

This configuration acts as a baseline for melody generation with an LSTM model. It uses basic one-hot encoding to represent extracted melodies as input to the LSTM. For training, all examples are transposed to the MIDI pitch range [48, 84] and outputs will also be in this range.

**Mono**

Like Basic, this configuration acts as a baseline for melody generation with an LSTM model and uses basic one-hot encoding to represent extracted melodies as input to the LSTM. Whereas basic_rnn is trained by transposing all inputs to a narrow range, mono_rnn can use the full range of 128 MIDI pitches.
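The one-hot encoding that both baseline configurations rely on can be sketched as follows. This is a minimal illustration, not the library's implementation: the function names, the two special event codes, the octave-wise transposition rule, and the choice to treat the upper bound of the range as exclusive are all assumptions made for the example.

```python
# Illustrative sketch only; names and special-event codes are assumptions.
NO_EVENT = -2   # assumed code for "nothing new happens at this step"
NOTE_OFF = -1   # assumed code for "the current note ends"
NUM_SPECIAL = 2


def transpose_into_range(pitch, min_note=48, max_note=84):
    """Shift a MIDI pitch by whole octaves until min_note <= pitch < max_note."""
    while pitch < min_note:
        pitch += 12
    while pitch >= max_note:
        pitch -= 12
    return pitch


def one_hot(event, min_note=48, max_note=84):
    """Return a one-hot list for a special event or an in-range pitch."""
    num_classes = (max_note - min_note) + NUM_SPECIAL
    if event < 0:
        index = event + NUM_SPECIAL              # -2 -> 0, -1 -> 1
    else:
        index = event - min_note + NUM_SPECIAL   # pitches follow the specials
    vector = [0.0] * num_classes
    vector[index] = 1.0
    return vector


# Narrow range: the pitch is first folded into the range by octaves.
basic_vec = one_hot(transpose_into_range(96))
# Full range: every MIDI pitch gets its own class, so no transposition.
mono_vec = one_hot(96, min_note=0, max_note=128)
```

The narrow range keeps the vector small at the cost of discarding octave information, while the full range preserves the original register with a larger vector.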

**Lookback**


Lookback RNN introduces custom inputs and labels. The custom inputs make it easier for the model to recognize patterns that span 1 and 2 bars, as well as patterns tied to an event's position within the measure. The custom labels reduce the amount of information that the RNN's cell state has to remember by letting the model repeat events from 1 and 2 bars ago directly. The result is melodies that wander less and have more musical structure.
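The idea behind the custom labels can be sketched in a few lines. This is a hedged illustration rather than the actual encoder: the label strings, the function names, and the order in which the two lookback distances are checked are assumptions.

```python
def lookback_labels(events, steps_per_bar=16):
    """Replace events that repeat an earlier bar with a 'repeat' label.

    Assumption for this sketch: the 2-bar repeat is checked before the
    1-bar repeat, and any other event is kept as its own label.
    """
    labels = []
    for t, event in enumerate(events):
        two_bars_ago = t - 2 * steps_per_bar
        one_bar_ago = t - steps_per_bar
        if two_bars_ago >= 0 and events[two_bars_ago] == event:
            labels.append("repeat-2-bars")
        elif one_bar_ago >= 0 and events[one_bar_ago] == event:
            labels.append("repeat-1-bar")
        else:
            labels.append(event)
    return labels


def position_in_bar(t, steps_per_bar=16):
    """Step index within the current bar, usable as an extra input feature."""
    return t % steps_per_bar


# Three short bars of four steps each (steps_per_bar=4 keeps the example small).
melody = [60, 62, 64, 65,
          60, 62, 67, 65,
          60, 62, 64, 69]
labels = lookback_labels(melody, steps_per_bar=4)
```

With these labels the model only has to predict "repeat" for recurring material instead of recalling the exact pitch from its cell state.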

**Attention**

This configuration introduces attention. Attention lets the model access past information without having to store it in the RNN cell's state. This makes longer-term dependencies easier to learn and results in melodies with longer arching themes.
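As a rough sketch of the mechanism, the snippet below weights a handful of past states by their similarity to the current state and mixes them into a single context vector. Dot-product scoring is used here purely for brevity and is an assumption; the scoring function used by the configuration itself is not described in this notebook, and the function name `attend` is hypothetical.

```python
import math


def attend(query, past_states):
    """Return softmax attention weights and the weighted sum of past states.

    Assumption: similarity is a plain dot product between the query and
    each past state; other scoring functions are possible.
    """
    scores = [sum(q * s for q, s in zip(query, state)) for state in past_states]
    max_score = max(scores)                      # subtract max for stability
    exps = [math.exp(score - max_score) for score in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [
        sum(w * state[i] for w, state in zip(weights, past_states))
        for i in range(len(query))
    ]
    return weights, context


past = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([1.0, 0.0], past)
```

States that resemble the query receive larger weights, so relevant earlier material can influence the next prediction even if it is no longer held in the cell state.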



### Training Settings
*   **Number of Training Steps:**  Number of update steps to perform before exiting the training loop. A value of around 10000 to 20000 is usually appropriate.
*   **Training/Evaluation Ratio:**  Fraction of the dataset held out for evaluation. For example, a ratio of 10% allocates 90% of your data to training and 10% to evaluation.
*   **Batch Size:** Number of examples used in one update step. This value affects the speed of training. The recommended value is 64.
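The effect of the ratio can be illustrated with a small sketch. It assumes that each example is assigned to the evaluation set independently with probability equal to the ratio; that assignment rule and the function name `random_partition` are assumptions for illustration, not a description of the tool's internals.

```python
import random


def random_partition(items, eval_ratio, seed=0):
    """Split items into (train, evaluation) lists.

    Assumption: each item goes to the evaluation list with probability
    eval_ratio, so the realised split only approximates the ratio.
    """
    rng = random.Random(seed)
    train, evaluation = [], []
    for item in items:
        if rng.random() < eval_ratio:
            evaluation.append(item)
        else:
            train.append(item)
    return train, evaluation


examples = list(range(100))
train_set, eval_set = random_partition(examples, eval_ratio=0.1)
```

With a small dataset the realised proportions can differ noticeably from the requested ratio, which is worth checking before training.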

## Generate MIDIs from the Trained Model


### Generation Settings
*  **Output_Folder**: Name of the MIDI output folder. Use a different name for each run, for example "run1", "run2", and so on.
*  **Number_of_MIDIs**: Number of MIDI files to generate.
*  **Number_of_Steps**: Length of each generated MIDI file in steps; 128 steps is approximately 15 seconds. If the number of steps is too large, the model may have difficulty generating the MIDI file.
*  **Primer_Pitches**: The notes, given as **MIDI note numbers**, that start the generated sequence, either a single note or a chord. Refer to the images below to convert between MIDI note numbers and note names.
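The relationship between steps and seconds can be worked out with simple arithmetic. The sketch below assumes 4 steps per quarter note and a tempo of 120 quarter notes per minute; neither value is stated in this notebook, so treat both as assumptions.

```python
def steps_to_seconds(num_steps, steps_per_quarter=4, qpm=120.0):
    """Convert a step count to seconds under an assumed resolution and tempo."""
    quarter_notes = num_steps / float(steps_per_quarter)
    return quarter_notes * 60.0 / qpm


duration = steps_to_seconds(128)  # 32 quarter notes at 0.5 s each
```

Under these assumptions 128 steps last 16 seconds, which is in line with the rough figure of 15 seconds quoted above.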

![Pitches1](http://c2rexplugins.weebly.com/uploads/1/4/2/6/14264557/627278711.png)
![Pitches2](https://raw.githubusercontent.com/Ilya-Simkin/MusicGuru-RNN-Composer/master/images/pianopitchMidi.jpg)

[Alternate Reference #1 ](http://www.inspiredacoustics.com/en/MIDI_note_numbers_and_center_frequencies)
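The same conversion can be done in code. The sketch below assumes the convention in which MIDI note 60 is named C4 (some charts label it C3, which the `middle_c_octave` parameter accommodates) and the standard tuning of A above middle C at 440 Hz; the function names are illustrative.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_name(note_number, middle_c_octave=4):
    """Return a note name such as 'C4' for a MIDI note number (0 to 127)."""
    if not 0 <= note_number <= 127:
        raise ValueError("MIDI note numbers range from 0 to 127")
    octave = note_number // 12 - (5 - middle_c_octave)
    return NOTE_NAMES[note_number % 12] + str(octave)


def midi_to_hz(note_number):
    """Return the center frequency in Hz, assuming note 69 is tuned to 440 Hz."""
    return 440.0 * 2.0 ** ((note_number - 69) / 12.0)


primer_names = [midi_to_name(n) for n in [60, 63]]
```

For the default primer `[60,63]` this yields C4 and D#4 under the C4 convention.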



In [11]:
from IPython.display import display
from ipywidgets import widgets


training_steps = widgets.IntSlider(
    value=500,
    min=0,
    max=4000,
    step=10,
    description='Training Steps:',
    disabled=False,
    continuous_update=False,
    orientation='horizontal',
    readout=True,
    readout_format='d'
)

train_validation_percent = widgets.FloatSlider(
    value=0.9,
    min=0.1,
    max=1,
    step=0.05,
    description='Train/Val Ratio',
    disabled=False,
    continuous_update=False,
    orientation='horizontal',
    readout=True,
    readout_format='.2f',
)


batch_size= widgets.RadioButtons(
    options=['64', '128', '264'],
    description='Batch Size:',
    disabled=False
)

output_folder = widgets.Text(
    value='run1',
    placeholder='output folder name',
    description='MIDI Generation Folder Name:',
    disabled=False
)

num_midis = widgets.IntSlider(
    value=2,
    min=1,
    max=10,
    step=1,
    description='Num of Midis:',
    disabled=False,
    continuous_update=False,
    orientation='horizontal',
    readout=True,
    readout_format='d'
)

Number_of_Steps = widgets.IntSlider(
    value=128,
    min=64,
    max=1000,
    step=64,
    description='Number_of_Steps:',
    disabled=False,
    continuous_update=False,
    orientation='horizontal',
    readout=True,
    readout_format='d'
)

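primer_pitches = widgets.Text(
    value="[60,63]",
    placeholder='Type something',
    description='String:',
    disabled=False
)

# Captured once, when this cell runs; edits made in the widget afterwards
# are not reflected in this variable.
primer_pitch = str(primer_pitches.value)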

display(training_steps)
display(train_validation_percent)
display(batch_size)
display(output_folder)
display(num_midis)
display(Number_of_Steps)
display(primer_pitches)
hide_toggle()


IntSlider(value=500, continuous_update=False, description=u'Training Steps:', max=4000, step=10)

FloatSlider(value=0.9, continuous_update=False, description=u'Train/Val Ratio', max=1.0, min=0.1, step=0.05)

RadioButtons(description=u'Batch Size:', options=('64', '128', '264'), value='64')

Text(value=u'run1', description=u'MIDI Generation Folder Name:', placeholder=u'output folder name')

IntSlider(value=2, continuous_update=False, description=u'Num of Midis:', max=10, min=1)

IntSlider(value=128, continuous_update=False, description=u'Number_of_Steps:', max=1000, min=64, step=64)

Text(value=u'[60,63]', description=u'String:', placeholder=u'Type something')
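Because the primer is entered as free text, it can be worth validating before it is handed to the generator. The following is a minimal sketch that assumes the text is a Python-style list literal of integers, as in the default `[60,63]`; the helper name `parse_primer_pitches` is hypothetical.

```python
import ast


def parse_primer_pitches(text):
    """Parse text such as '[60,63]' into a list of valid MIDI note numbers."""
    pitches = ast.literal_eval(text)  # safe evaluation of a literal
    if not isinstance(pitches, (list, tuple)):
        raise ValueError("expected a list such as [60,63]")
    for pitch in pitches:
        is_int = isinstance(pitch, int) and not isinstance(pitch, bool)
        if not is_int or not 0 <= pitch <= 127:
            raise ValueError("invalid MIDI note number: {!r}".format(pitch))
    return list(pitches)


default_primer = parse_primer_pitches("[60,63]")
```

Rejecting bad input at this point gives a clearer error than a failure later in the generation command.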

In [29]:
import os

drive_dir = "/root/magenta-data"

training_dir = os.path.join(drive_dir,"midi-data/training-set")
generated_midi_dir = os.path.join(drive_dir,"midi-data/generated_data/polyphony_rnn")

note_dir = os.path.join(drive_dir,"tmp")
seq_dir = os.path.join(drive_dir,"tmp/polyphony_rnn/sequence_examples")

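# config: `config` is not defined in the cells shown here and CONFIG is not
# used by the polyphony_rnn commands below, so set it only if it exists.
if 'config' in globals():
    os.environ['CONFIG'] = str(config.value)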


# Paths used by the shell commands below.
os.environ['TRAIN_SET_DIR'] = training_dir
os.environ['NOTESEQ_FILE'] = os.path.join(note_dir, "notesequences.tfrecord")
os.environ['SEQ_DIR'] = seq_dir
os.environ['SEQ_FILE'] = os.path.join(seq_dir, "training_poly_tracks.tfrecord")
os.environ['RUN_DIR'] = os.path.join(drive_dir, "tmp/polyphony_rnn/logdir/run1")
# Default output location; overridden below with the per-run folder.
os.environ['OUTPUT_DIR'] = generated_midi_dir


# Training settings. TRAIN_VAL_RATIO is passed to --eval_ratio below, so it is
# the fraction of the data held out for evaluation.
os.environ['TRAINING_STEPS'] = str(training_steps.value)
os.environ['TRAIN_VAL_RATIO'] = str(train_validation_percent.value)
os.environ['BATCH_SIZE'] = str(batch_size.value)

# hparams
hparams = 'batch_size={},rnn_layer_sizes=[64,64]'.format(batch_size.value)
os.environ['HPARAMS'] = hparams

# generation
generated_midi_dir_temp = os.path.join(generated_midi_dir, output_folder.value)

os.environ['OUTPUT_DIR'] = generated_midi_dir_temp
os.environ['MIDI_OUTPUT'] = str(num_midis.value)
os.environ['GEN_STEPS'] = str(Number_of_Steps.value)
# Read the widget here so that edits made after the widget cell ran are used.
os.environ['PRIMER_PITCH'] = str(primer_pitches.value)

hide_toggle()
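The hyperparameter string is a comma-separated list of `name=value` pairs, with list values written in square brackets. A small helper like the one below could build it from structured values; this is an illustrative sketch, and `format_hparams` is a hypothetical name rather than part of any library.

```python
def format_hparams(pairs):
    """Render (name, value) pairs as 'name=value,name=[a,b]'."""
    parts = []
    for name, value in pairs:
        if isinstance(value, (list, tuple)):
            rendered = "[" + ",".join(str(v) for v in value) + "]"
        else:
            rendered = str(value)
        parts.append("{}={}".format(name, rendered))
    return ",".join(parts)


example_hparams = format_hparams([
    ("batch_size", 64),
    ("rnn_layer_sizes", [64, 64]),
])
```

Since the resulting string contains square brackets, it is safest to quote it when it is passed on a shell command line.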

In [7]:
!convert_dir_to_note_sequences \
--input_dir="$TRAIN_SET_DIR" \
--output_file="$NOTESEQ_FILE"

INFO:tensorflow:Converting files in '/root/magenta-data/midi-data/training-set/'.
INFO:tensorflow:0 files converted.
INFO:tensorflow:Converted MIDI file /root/magenta-data/midi-data/training-set/Kimi_wo_Nosete_animenz_synthesia.mid.
INFO:tensorflow:Converted MIDI file /root/magenta-data/midi-data/training-set/kiki26.mid.
INFO:tensorflow:Converted MIDI file /root/magenta-data/midi-data/training-set/kiki24.mid.
INFO:tensorflow:Converted MIDI file /root/magenta-data/midi-data/training-set/kiki29.mid.
INFO:tensorflow:Converted MIDI file /root/magenta-data/midi-data/training-set/Merry Go Round of Life (piano solo).mid.
INFO:tensorflow:Converted MIDI file /root/magenta-data/midi-data/training-set/doom_kumo_no_wana.mid.
INFO:tensorflow:Converted MIDI file /root/magenta-data/midi-data/training-set/laputamidi3.mid.
INFO:tensorflow:Converted MIDI file /root/magenta-data/midi-data/training-set/kiki15.mid.
INFO:tensorflow:Converted MIDI file /root/magenta-data/midi-data/training-set/Joe Hisaishi -

In [30]:
!polyphony_rnn_create_dataset \
--input="$NOTESEQ_FILE" \
--output_dir="$SEQ_DIR" \
--eval_ratio="$TRAIN_VAL_RATIO"

INFO:tensorflow:

Completed.

INFO:tensorflow:Processed 32 inputs total. Produced 162 outputs.
INFO:tensorflow:DAGPipeline_PolyExtractor_eval_polyphonic_track_lengths_in_bars:
  [1,10): 54
  [10,20): 45
  [20,30): 45
  [30,40): 9
INFO:tensorflow:DAGPipeline_PolyExtractor_eval_polyphonic_tracks_discarded_more_than_1_program: 1404
INFO:tensorflow:DAGPipeline_PolyExtractor_eval_polyphonic_tracks_discarded_too_long: 108
INFO:tensorflow:DAGPipeline_PolyExtractor_eval_polyphonic_tracks_discarded_too_short: 9801
INFO:tensorflow:DAGPipeline_PolyExtractor_training_polyphonic_track_lengths_in_bars:
  [20,30): 9
INFO:tensorflow:DAGPipeline_PolyExtractor_training_polyphonic_tracks_discarded_more_than_1_program: 0
INFO:tensorflow:DAGPipeline_PolyExtractor_training_polyphonic_tracks_discarded_too_long: 9
INFO:tensorflow:DAGPipeline_PolyExtractor_training_polyphonic_tracks_discarded_too_short: 0
INFO:tensorflow:DAGPipeline_RandomPartition_eval_poly_tracks_count: 30
INFO:tensorflow:DAGPipeline_RandomP

In [31]:
!polyphony_rnn_train \
--run_dir="$RUN_DIR" \
--sequence_example_file="$SEQ_FILE" \
--hparams="$HPARAMS" \
--num_training_steps=$TRAINING_STEPS

INFO:tensorflow:hparams = {'rnn_layer_sizes': [64, 64], 'attn_length': 0, 'dropout_keep_prob': 0.5, 'batch_size': 64, 'use_cudnn': False, 'clip_norm': 5, 'learning_rate': 0.001, 'residual_connections': False}
INFO:tensorflow:Train dir: /root/magenta-data/tmp/polyphony_rnn/logdir/run1/train
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
INFO:tensorflow:Counting records in /root/magenta-data/tmp/polyphony_rnn/sequence_examples/training_poly_tracks.tfrecord.
INFO:tensorflow:Total records: 9
INFO:tensorflow:[<tf.Tensor 'random_shuffle_queue_Dequeue:0' shape=(?, 259) dtype=float32>, <tf.Tensor 'random_shuffle_queue_Dequeue:1' shape=(?,) dtype=int64>, <tf.Tensor 'random_shuffle_queue_Dequeue:2' shape=() dtype=int32>]
Instructions for updating:
This class is deprecated, please use tf.nn.rnn_cell.LSTMCell, which supports all the feature this cell currently has. Please replace 
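The log above reports 9 training records and a batch size of 64. As a rough piece of arithmetic, the sketch below estimates how many times each record would be seen, assuming the slider's default of 500 training steps; the step count actually used in this run does not appear in the log, so that figure is an assumption.

```python
def passes_over_data(training_steps, batch_size, num_records):
    """Approximate number of times each record is seen during training."""
    return training_steps * batch_size / float(num_records)


# 9 records and batch size 64 are taken from the log; 500 steps is assumed.
passes = passes_over_data(training_steps=500, batch_size=64, num_records=9)
```

A value in the thousands indicates that the model sees the same few examples over and over, which is worth keeping in mind when judging the generated output.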

In [32]:
!polyphony_rnn_generate \
--run_dir="$RUN_DIR" \
--hparams="$HPARAMS" \
--output_dir="$OUTPUT_DIR" \
--num_outputs=$MIDI_OUTPUT \
--num_steps=$GEN_STEPS \
--primer_pitches="$PRIMER_PITCH" \
--condition_on_primer=true \
--inject_primer_during_generation=false


INFO:tensorflow:hparams = {'rnn_layer_sizes': [64, 64], 'attn_length': 0, 'dropout_keep_prob': 0.5, 'batch_size': 1, 'use_cudnn': False, 'clip_norm': 5, 'learning_rate': 0.001, 'residual_connections': False}
Instructions for updating:
This class is deprecated, please use tf.nn.rnn_cell.LSTMCell, which supports all the feature this cell currently has. Please replace the existing code with tf.nn.rnn_cell.LSTMCell(name='basic_lstm_cell').
2019-04-25 05:54:33.778952: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2019-04-25 05:54:33.897522: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 0 with properties: 
name: Quadro RTX 5000 major: 7 minor: 5 memoryClockRate(GHz): 1.815
pciBusID: 0000:2d:00.0
totalMemory: 15.72GiB freeMemory: 15.16GiB
2019-04-25 05:54:33.897564: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0
2019-04-25 