Copyright 2017 Google LLC.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

# MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music
### ___Adam Roberts, Jesse Engel, Colin Raffel, Curtis Hawthorne, and Douglas Eck___

[MusicVAE](https://g.co/magenta/music-vae) learns a latent space of musical scores, providing different modes
of interactive musical creation, including:

* Random sampling from the prior distribution.
* Interpolation between existing sequences.
* Manipulation of existing sequences via attribute vectors.

Examples of these interactions can be generated below, and selections can be heard in our
[YouTube playlist](https://www.youtube.com/playlist?list=PLBUMAYA6kvGU8Cgqh709o5SUvo-zHGTxr).
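
The three interaction modes listed above all reduce to simple operations on latent vectors. The following is a minimal sketch that uses plain Python lists as stand-ins for latent codes. The helper names `sample_prior`, `slerp`, and `apply_attribute` are hypothetical and are not part of the MusicVAE API, the choice of spherical interpolation is an assumption (it is a common option for Gaussian latent spaces), and the decoder that turns a latent vector back into notes is omitted.

```python
import math
import random


def sample_prior(dim, rng):
    """Draws one latent vector from a standard normal prior."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def slerp(z0, z1, t):
    """Spherical interpolation between two latent vectors, t in [0, 1]."""
    dot = sum(a * b for a, b in zip(z0, z1))
    norm0 = math.sqrt(sum(a * a for a in z0))
    norm1 = math.sqrt(sum(b * b for b in z1))
    cos_omega = max(-1.0, min(1.0, dot / (norm0 * norm1)))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < 1e-8:
        # Nearly parallel or anti-parallel vectors: fall back to linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]
    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(z0, z1)]


def apply_attribute(z, attribute_vector, strength=1.0):
    """Moves a latent vector along an attribute direction."""
    return [a + strength * b for a, b in zip(z, attribute_vector)]


rng = random.Random(0)
z_a = sample_prior(4, rng)                            # random sampling
z_b = sample_prior(4, rng)
path = [slerp(z_a, z_b, i / 4) for i in range(5)]     # interpolation
shifted = apply_attribute(z_a, [0.5, 0.0, 0.0, 0.0], strength=2.0)  # attribute edit
print(len(path), shifted[0] - z_a[0])
```

In the real model each vector along `path` would be passed to the decoder to produce one musical sequence per interpolation step.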

For short sequences (e.g., 2-bar "loops"), we use a bidirectional LSTM encoder
and LSTM decoder. For longer sequences, we use a novel hierarchical LSTM
decoder, which helps the model learn longer-term structures.

We also model the interdependencies between instruments by training multiple
decoders on the lowest-level embeddings of the hierarchical decoder.

For additional details, check out our [blog post](https://g.co/magenta/music-vae) and [paper](https://goo.gl/magenta/musicvae-paper).
___

This Colab notebook is self-contained and should run natively on Google Cloud. The [code](https://github.com/tensorflow/magenta/tree/master/magenta/models/music_vae) and [checkpoints](http://download.magenta.tensorflow.org/models/music_vae/checkpoints.tar.gz) can be downloaded separately and run locally, which is required if you want to train your own model.

# Basic Instructions

1. Double click on the hidden cells to make them visible, or select "View > Expand Sections" in the menu at the top.
2. Hover over the "`[ ]`" in the top-left corner of each cell and click on the "Play" button to run it, in order.
3. Listen to the generated samples.
4. Make it your own: copy the notebook, modify the code, train your own models, upload your own MIDI, etc.!

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
# Install hmmlearn before downgrading Python
!pip install hmmlearn

# Downgrade Python
!apt-get update -y
!apt-get install python3.8
!update-alternatives --set python3 /usr/bin/python3.8
!curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
!python get-pip.py
import sys
# This path is Colab-runtime specific, check path in other systems.
_ = (sys.path.append("/usr/local/lib/python3.8/dist-packages"))


# Preinstall legacy packages
!pip install numba==0.48
!pip install numpy==1.23
# Quote the requirement so the shell does not treat ">" as an output redirect.
!pip install "packaging>=21.3"
!pip install librosa==0.7.2

# Install Magenta
!pip install magenta

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting hmmlearn
  Downloading hmmlearn-0.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (160 kB)
Installing collected packages: hmmlearn
Successfully installed hmmlearn-0.3.0
Get:1 https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/ InRelease [3,622 B]
Get:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  InRelease [1,581 B]
Get:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Packages [1,011 kB]
Get:4 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu focal InRelease [18.1 kB]
Get:5 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
Hit:6 http://archive.ubuntu.com/ubuntu focal InRelease
Get:7 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
Hit:8 http

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting magenta
  Downloading magenta-2.1.4-py3-none-any.whl (1.4 MB)
Collecting absl-py==1.2.0 (from magenta)
  Downloading absl_py-1.2.0-py3-none-any.whl (123 kB)
Collecting dm-sonnet==2.0.0 (from magenta)
  Downloading dm_sonnet-2.0.0-py3-none-any.whl (254 kB)
Collecting imageio==2.20.0 (from magenta)
  Downloading imageio-2.20.0-py3-none-any.whl (3.4 MB)
Collecting matplotlib==3.5.2 (from mage

In [3]:
!pip install fluidsynth

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting fluidsynth
  Downloading fluidsynth-0.2.tar.gz (3.7 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: fluidsynth
  Building wheel for fluidsynth (setup.py) ... done
  Created wheel for fluidsynth: filename=fluidsynth-0.2-py3-none-any.whl size=4488 sha256=5eb09318a7f47c5d45f5bd8a906a025aefc982bee1c479ae2a3212046c6c7ba5
  Stored in directory: /root/.cache/pip/wheels/d4/e6/bf/921b2deb780e2681b0e1626a13995e504dbbd455b47e7eedd4
Successfully built fluidsynth
Installing collected packages: fluidsynth
Successfully installed fluidsynth-0.2

# Environment Setup
This step installs the packages needed for sequence synthesis and may take a few minutes.


In [4]:
#@title Setup Environment
#@test {"output": "ignore"}

import glob

BASE_DIR = "gs://download.magenta.tensorflow.org/models/music_vae/colab2"

print('Installing dependencies...')
!apt-get update -qq && apt-get install -qq libfluidsynth2 fluid-soundfont-gm build-essential libasound2-dev libjack-dev
!pip install -q pyfluidsynth
!pip install -qU magenta

# Hack to allow python to pick up the newly-installed fluidsynth lib.
# This is only needed for the hosted Colab environment.
import ctypes.util
orig_ctypes_util_find_library = ctypes.util.find_library
def proxy_find_library(lib):
  if lib == 'fluidsynth':
    return 'libfluidsynth.so.1'
  else:
    return orig_ctypes_util_find_library(lib)
ctypes.util.find_library = proxy_find_library


print('Importing libraries and defining some helper functions...')
from google.colab import files
import magenta.music as mm
from magenta.models.music_vae import configs
from magenta.models.music_vae.trained_model import TrainedModel
import numpy as np
import os
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()

# Necessary until pyfluidsynth is updated (>1.2.5).
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

def play(note_sequence):
  mm.play_sequence(note_sequence, synth=mm.fluidsynth)

# def interpolate(model, start_seq, end_seq, num_steps, max_length=32,
#                 assert_same_length=True, temperature=0.5,
#                 individual_duration=4.0):
#   """Interpolates between a start and end sequence."""
#   note_sequences = model.interpolate(
#       start_seq, end_seq,num_steps=num_steps, length=max_length,
#       temperature=temperature,
#       assert_same_length=assert_same_length)

#   print('Start Seq Reconstruction')
#   play(note_sequences[0])
#   print('End Seq Reconstruction')
#   play(note_sequences[-1])
#   print('Mean Sequence')
#   play(note_sequences[num_steps // 2])
#   print('Start -> End Interpolation')
#   interp_seq = mm.sequences_lib.concatenate_sequences(
#       note_sequences, [individual_duration] * len(note_sequences))
#   play(interp_seq)
#   mm.plot_sequence(interp_seq)
#   return interp_seq if num_steps > 3 else note_sequences[num_steps // 2]

def download(note_sequence, filename):
  mm.sequence_proto_to_midi_file(note_sequence, filename)
  files.download(filename)

print('Done')

Installing dependencies...
Selecting previously unselected package fluid-soundfont-gm.
(Reading database ... 122542 files and directories currently installed.)
Preparing to unpack .../fluid-soundfont-gm_3.1-5.1_all.deb ...
Unpacking fluid-soundfont-gm (3.1-5.1) ...
Selecting previously unselected package libinstpatch-1.0-2:amd64.
Preparing to unpack .../libinstpatch-1.0-2_1.1.2-2build1_amd64.deb ...
Unpacking libinstpatch-1.0-2:amd64 (1.1.2-2build1) ...
Selecting previously unselected package timgm6mb-soundfont.
Preparing to unpack .../timgm6mb-soundfont_1.3-3_all.deb ...
Unpacking timgm6mb-soundfont (1.3-3) ...
Selecting previously unselected package libfluidsynth2:amd64.
Preparing to unpack .../libfluidsynth2_2.1.1-2_amd64.deb ...
Unpacking libfluidsynth2:amd64 (2.1.1-2) ...
Setting up fluid-soundfont-gm (3.1-5.1) ...
Setting up timgm6mb-soundfont (1.3-3) ...
update-alternatives: using /usr/share/sounds/sf2/TimGM6mb.sf2 to provide /usr/share/sounds/sf2/default-GM.sf2 (default-GM.sf2)

  if bend_int is not 0:
  if bend_int is not 0:
Instructions for updating:
non-resource variables are not supported in the long term


Done


# Setup and parameters

In [5]:
!pip install mido
!pip install bokeh
!pip install note_seq

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/

In [6]:
tf.enable_eager_execution()
%matplotlib inline
import note_seq
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

In [7]:
# All paths
all_paths = dict()

# CE-MCTS paths
all_paths['CE-MCTS'] = [
    '/content/drive/MyDrive/Magenta/magenta/data/mcts/CE_MCTS-23-02-06-03-37/ckpt/ckpt_41',
    '/content/drive/MyDrive/Magenta/magenta/data/mcts/CE_MCTS-23-02-07-04-25/ckpt/ckpt_11',
    '/content/drive/MyDrive/Magenta/magenta/data/mcts/CE_MCTS-23-02-07-06-20/ckpt/ckpt_41',
    '/content/drive/MyDrive/Magenta/magenta/data/mcts/CE_MCTS-23-02-15-02-37/ckpt/ckpt_68',
    '/content/drive/MyDrive/Magenta/magenta/data/mcts/CE_MCTS-23-02-15-05-48/ckpt/ckpt_42'
]

# Finetune with last layer paths
all_paths['finetune_last'] = [
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/finetune_big_lastlayer_2000-23-03-05-00-19/train/model.ckpt-2000',
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/finetune_big_lastlayer_2000-23-03-06-04-58/train/model.ckpt-2000',
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/finetune_big_lastlayer_2000-23-03-06-05-38/train/model.ckpt-2000',
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/finetune_big_lastlayer_2000-23-03-07-18-00/train/model.ckpt-2000',
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/finetune_big_lastlayer_2000-23-03-04-22-38/train/model.ckpt-2000'
]

# Finetune with all layers paths
all_paths['finetune_all'] = [
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/finetune_big_all_2000-23-03-07-18-49/train/model.ckpt-2000',
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/finetune_big_all_2000-23-03-07-19-34/train/model.ckpt-2000',
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/finetune_big_all_2000-23-03-07-22-10/train/model.ckpt-2000',
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/finetune_big_all_2000-23-03-07-22-58/train/model.ckpt-2000',
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/finetune_big_all_2000-23-03-08-01-47/train/model.ckpt-2000'
]

# Non-transfer paths
all_paths['non-transfer'] = [
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/from_scratch_big_2000_steps-23-03-06-05-00/train/model.ckpt-2000',
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/from_scratch_big_2000_steps-23-03-06-05-51/train/model.ckpt-2000',
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/from_scratch_big_2000_steps-23-03-07-18-46/train/model.ckpt-2000',
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/from_scratch_big_2000_steps-23-03-07-20-02/train/model.ckpt-2000',
    '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/from_scratch_big_2000_steps-23-03-07-21-55/train/model.ckpt-2000'
]

# Zero-shot paths
all_paths['zero-shot'] = [
    BASE_DIR + '/checkpoints/mel_2bar_big.ckpt'
]

finetune_path = '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/finetune_big_lastlayer_2000-23-03-04-22-38/train/model.ckpt-2000'
finetune_all_path = '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/finetune_big_all_2000-23-03-07-18-49/train/model.ckpt-2000'
nontransfer_path = '/content/drive/MyDrive/Magenta/magenta/data/tmp/Persian/from_scratch_big_2000_steps-23-03-07-21-55/train/model.ckpt-2000'
zeroshot_path = BASE_DIR + '/checkpoints/mel_2bar_big.ckpt'
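# NOTE: this points at the same fold_1 test file as persian_path_test below.
# If a separate training split is intended, use the matching train file instead.
persian_path_train = '/content/drive/MyDrive/Magenta/magenta/data/tfrecord/Persian/persian_100_v1/fold_1_test.tfrecord'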
persian_path_test = '/content/drive/MyDrive/Magenta/magenta/data/tfrecord/Persian/persian_100_v1/fold_1_test.tfrecord'

In [None]:
# A bunch of useful paths
# data_path = './data/tfrecord/Bo_Burnham_eval.tfrecord'
# data_path = './data/tfrecord/Bo_Burnham_train.tfrecord'
# data_path = '/content/drive/MyDrive/Magenta/magenta/data/tfrecord/Persian/persian_100_v1/fold_4_test.tfrecord'
# data_path = '/content/drive/MyDrive/Magenta/magenta/data/tfrecord/Persian/persian_100_v1/fold_4_train.tfrecord'

# Functions and Metrics

In [8]:
# Loads a *.tfrecord dataset and returns its examples as lists of NoteSequences.
def get_dataset_noteseq(data_path):
  """Reads `data_path` with the 'cat-mel_2bar_big' config, one example per batch.

  Returns (input_seqs, output_seqs, control_seqs, sequence_lengths). Only
  input_seqs and sequence_lengths are populated; the other two stay empty.
  """
  from magenta.models.music_vae import data

  mel_2bar_config = configs.CONFIG_MAP['cat-mel_2bar_big']
  mel_2bar_config = configs.update_config(mel_2bar_config, dict(eval_examples_path=data_path))
  mel_2bar_config.hparams.batch_size = 1

  dataset = data.get_dataset(
    mel_2bar_config,
    tf_file_reader=tf.data.TFRecordDataset,
    is_training=False,
    cache_dataset=False)
  dataset = dataset.take(-1)  # -1 keeps every example

  iterator = tf.data.make_one_shot_iterator(dataset)

  input_seqs, output_seqs, control_seqs, sequence_lengths = [], [], [], []

  for i, o, c, sl in iterator:
    # Convert the input tensors back into a NoteSequence.
    input_seqs.append(mel_2bar_config.data_converter.from_tensors(i)[0])
    sequence_lengths.append(sl)

  return input_seqs, output_seqs, control_seqs, sequence_lengths


In [9]:
def note_length(seq_list, lim=np.inf):
  """Returns the mean note duration (seconds) of each non-empty melody."""
  note_length_list = list()
  count = 0
  for mel in seq_list:
    if mel.total_time and count < lim and len(mel.notes)>=1:
      total_duration = sum(note.end_time - note.start_time for note in mel.notes)
      note_length_list.append(total_duration / len(mel.notes))
      count += 1

  return note_length_list

In [10]:
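def gap_length(seq_list, lim=np.inf):
  """Returns the mean silence (seconds) before each note, per non-empty melody.

  The first gap is measured from time 0, so leading silence is counted.
  """
  gap_length_list = list()
  count = 0
  for mel in seq_list:
    if mel.total_time and count < lim and len(mel.notes)>=1:
      end = 0
      total_gaps = 0
      note_count = 0
      for note in mel.notes:
        # Gap between the previous note's end and this note's start.
        gap = note.start_time - end
        total_gaps += gap
        end = note.end_time
        note_count += 1
      gap_length_list.append(total_gaps/note_count)
      count += 1

  return gap_length_list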

In [11]:
def note_diversity(seq_list, lim=np.inf):
  """Returns the number of distinct pitches in each non-empty melody."""
  note_diversity_list = list()
  count = 0
  for mel in seq_list:
    if mel.total_time and count < lim and len(mel.notes)>=1:
      note_diversity_list.append(len({note.pitch for note in mel.notes}))
      count += 1

  return note_diversity_list

In [12]:
def pitch_range(seq_list, lim=np.inf):
  """Returns max pitch minus min pitch (MIDI numbers) of each non-empty melody."""
  pitch_range_list = list()
  count = 0
  for mel in seq_list:
    if mel.total_time and count < lim and len(mel.notes)>=1:
      pitches = [note.pitch for note in mel.notes]
      pitch_range_list.append(max(pitches) - min(pitches))
      count += 1

  return pitch_range_list

In [13]:
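def note_density(seq_list, lim=np.inf):
  """Returns notes per second for each non-empty melody."""
  note_density_list = list()
  count = 0
  for mel in seq_list:
    # Same filter as the other metrics, so all metric lists have equal length.
    if mel.total_time and count < lim and len(mel.notes)>=1:
      note_density_list.append(len(mel.notes) / mel.total_time)
      count += 1

  return note_density_list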

In [14]:
def get_metrics_dataframe(note_seq_list):
  df = pd.DataFrame({
      'note_density': note_density(note_seq_list),
      'pitch_range': pitch_range(note_seq_list),
      'note_diversity': note_diversity(note_seq_list),
      'note_length': note_length(note_seq_list),
      'gap_length' : gap_length(note_seq_list)
  })
  return df
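
To make the five columns concrete, here is a small worked example on a toy melody. It is a sketch only: `Note` is a hypothetical stand-in for the note objects of a NoteSequence, and `melody_metrics` is an illustrative helper that mirrors the per-melody arithmetic of the functions above for a single melody.

```python
from collections import namedtuple

# Hypothetical stand-in for a NoteSequence note (pitch in MIDI numbers, times in seconds).
Note = namedtuple("Note", ["pitch", "start_time", "end_time"])


def melody_metrics(notes, total_time):
    """Computes the five per-melody metrics for one list of stand-in notes."""
    durations = [n.end_time - n.start_time for n in notes]
    gaps, previous_end = [], 0.0
    for n in notes:
        gaps.append(n.start_time - previous_end)  # silence before this note
        previous_end = n.end_time
    pitches = [n.pitch for n in notes]
    return {
        "note_density": len(notes) / total_time,
        "pitch_range": max(pitches) - min(pitches),
        "note_diversity": len(set(pitches)),
        "note_length": sum(durations) / len(notes),
        "gap_length": sum(gaps) / len(notes),
    }


toy = [
    Note(60, 0.0, 0.5),
    Note(64, 0.5, 1.0),
    Note(60, 1.5, 2.0),  # preceded by 0.5 s of silence
    Note(67, 2.0, 4.0),
]
row = melody_metrics(toy, total_time=4.0)
print(row)
```

Four notes over four seconds give a density of 1.0; the pitches 60, 64, and 67 give a range of 7 and a diversity of 3; the single half-second rest averages out to a gap length of 0.125.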


In [15]:
def plot_histogram(density_list, range_list, diversity_dict, length_list, bins=30, lim = 4):
  """Plots histograms of the metrics on a 2x3 grid.

  diversity_dict maps a diversity value to its count; lim is the note-density
  threshold used for the 'less than' / 'greater than' bar chart.
  """
  import matplotlib.pyplot as plt
  name_dict = dict({'less than':0, 'greater than':0})
  for x in density_list:
    if x > lim:
      name_dict['greater than'] += 1
    else:
      name_dict['less than'] += 1

  fig, axes = plt.subplots(2, 3, figsize=(10,5))
  axes[0,0].hist(density_list, bins=bins, color='blue', alpha=0.5)
  axes[0,1].bar(name_dict.keys(), name_dict.values(), color='purple', alpha=0.5)
  axes[0,2].hist(range_list, bins=bins, color='red', alpha=0.5)
  # Expand the {value: count} mapping back into a flat list for hist().
  axes[1,0].hist([key for key, val in diversity_dict.items() for _ in range(val)], bins=50, color='orange', alpha=0.5)
  axes[1,1].hist(length_list, bins=100, color='green', alpha=0.5)

  # set the title and axis labels for each subplot
  axes[0,0].set(xlim=(0, 8), ylim=(0, 14))
  axes[0,0].set_title('Note Density Histogram')
  axes[0,0].set_xlabel('Value')
  axes[0,0].set_ylabel('Frequency')
  axes[0,2].set(xlim=(0, 50), ylim=(0, 16))
  axes[0,2].set_title('Pitch range Histogram')
  axes[0,2].set_xlabel('Value')
  axes[0,2].set_ylabel('Frequency')

  # adjust the layout and spacing of the subplots
  fig.tight_layout()

  # show the plot
  plt.show()

# Generate Samples

## Generate Dataframes

In [None]:
# dataframe containers
baseline_df_dict = dict()
mel_2bar_config = configs.CONFIG_MAP['cat-mel_2bar_big']
sample_count = 25
temperature = 0.9

for baseline in all_paths.keys():
  print(baseline)
  # Split the sample budget evenly across this baseline's checkpoints.
  count = sample_count // len(all_paths[baseline])
  samples = list()
  for fold_path in all_paths[baseline]:
    # Load model
    print(fold_path)
    mel_2bar = TrainedModel(mel_2bar_config, batch_size=4, checkpoint_dir_or_path=fold_path)
    samples += mel_2bar.sample(n=count, length=32, temperature=temperature)

  # create dataframe
  baseline_df_dict[baseline] = get_metrics_dataframe(samples)
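
The loop above divides a fixed sample budget among the checkpoints of each baseline. A minimal sketch of that split follows; `split_budget` is a hypothetical helper, and handing the remainder to the first checkpoints is an assumption about how leftover samples should be assigned.

```python
def split_budget(total, parts):
    """Splits `total` samples across `parts` checkpoints as evenly as possible."""
    base, extra = divmod(total, parts)
    # The first `extra` checkpoints each take one additional sample.
    return [base + 1 if i < extra else base for i in range(parts)]


print(split_budget(25, 5))  # five checkpoints per baseline
print(split_budget(25, 1))  # a single zero-shot checkpoint
print(split_budget(25, 4))  # a budget that does not divide evenly
```

Using integer division keeps every per-checkpoint count an integer, which matters because a sample count such as `25 / 5` evaluates to the float `5.0` in Python 3.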

In [None]:
mel_2bar = TrainedModel(mel_2bar_config, batch_size=4, checkpoint_dir_or_path=all_paths['CE-MCTS'][0])

In [None]:
t_list = list()

In [None]:
t_list += [1,2,3,4]

In [None]:
t_list

In [None]:
mel_2bar.sample(n=100, length=32, temperature=temperature)

## Mean and Std

In [None]:
for name in baseline_df_dict.keys():
  print(f"{name} Baseline")
  # Inside a loop the result is not displayed automatically, so print it.
  print(baseline_df_dict[name].describe())
  print("-"*50)
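
When comparing numbers across cells, note that `DataFrame.describe()` reports the sample standard deviation (dividing by n - 1), whereas `np.std`, which the fold analysis cells in this notebook call, defaults to the population form (dividing by n). The sketch below shows the difference with the standard library; the values are illustrative only.

```python
import math
import statistics

values = [2.5, 2.5, 1.5, 1.5, 1.5]  # illustrative numbers only

mean = statistics.fmean(values)
population_std = statistics.pstdev(values)  # divides by n, like np.std's default
sample_std = statistics.stdev(values)       # divides by n - 1, like describe()
ratio = sample_std / population_std         # equals sqrt(n / (n - 1))
print(mean, population_std, sample_std, ratio)
```

For small sample counts the two conventions differ noticeably, so it helps to pick one (for example by passing `ddof=1` to `np.std`) before comparing baselines.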

In [None]:
# Duplicate entry removed; a plain list keeps the files in a stable order.
filenames = ['/content/mel_2bar_sample_0 (9).mid',
             '/content/mel_2bar_sample_0 (10).mid',
             '/content/mel_2bar_sample_0 (13).mid',
             '/content/mel_2bar_sample_1 (5).mid',
             '/content/mel_2bar_sample_1 (9).mid',
             '/content/mel_2bar_sample_1 (13).mid',
             '/content/mel_2bar_sample_2 (11).mid',
             '/content/mel_2bar_sample_2 (13).mid',
             '/content/mel_2bar_sample_3 (7).mid',
             '/content/mel_2bar_sample_3 (10).mid']

In [None]:
for f in filenames:
  print(f)
  plot_sequence(note_seq.midi_file_to_note_sequence(f))

In [None]:
seq = note_seq.midi_file_to_note_sequence('/content/mel_2bar_sample_0 (10).mid')

In [None]:
note_seq.plot_sequence(seq)

In [None]:
fig, axes = plt.subplots(figsize=(20,10), nrows=2, ncols=3)

# fold_dfs[0].plot(ax=axes[0,0])
# fold_dfs[1].plot(ax=axes[0,1])
# fold_dfs[2].plot(ax=axes[0,2])
# fold_dfs[3].plot(ax=axes[1,0])
# fold_dfs[4].plot(ax=axes[1,1])

axes[0,0].set(xlim=(0, 8), ylim=(0, 100))
# axes[0,1].set(xlim=(0, 8), ylim=(0, 100))
axes[0,2].set(xlim=(0, 15), ylim=(0, 100))
axes[1,0].set(xlim=(0, 25), ylim=(0, 100))



column = ['gap_length']
bins = 30
df.plot.hist(column=['note_density'], bins=range(9), ax=axes[0,0])
df.plot.hist(column=['pitch_range'], bins=30, ax=axes[0,1])
df.plot.hist(column=['note_diversity'], bins=range(17), ax=axes[0,2])
df.plot.hist(column=['note_length'], bins=range(11), ax=axes[1,0])
df.plot.hist(column=['gap_length'], bins=25, ax=axes[1,1])

In [None]:
df

## Generate Samples

In [None]:
def plot_sequence(sequence, show_figure=True):
  """Creates an interactive pianoroll for a NoteSequence.
  Example usage: plot a random melody.
    sequence = mm.Melody(np.random.randint(36, 72, 30)).to_sequence()
    plot_sequence(sequence)
  Args:
     sequence: A NoteSequence.
     show_figure: A boolean indicating whether or not to show the figure.
  Returns:
     If show_figure is False, a Bokeh figure; otherwise None.
  """
  import collections

  import bokeh
  import bokeh.plotting
  from bokeh.models import Range1d

  import numpy as np
  import pandas as pd
  def _sequence_to_pandas_dataframe(sequence):
    """Generates a pandas dataframe from a sequence."""
    pd_dict = collections.defaultdict(list)
    for note in sequence.notes:
      pd_dict['start_time'].append(note.start_time)
      pd_dict['end_time'].append(note.end_time)
      pd_dict['duration'].append(note.end_time - note.start_time)
      pd_dict['pitch'].append(note.pitch)
      pd_dict['bottom'].append(note.pitch - 0.4)
      pd_dict['top'].append(note.pitch + 0.4)
      pd_dict['velocity'].append(note.velocity)
      pd_dict['fill_alpha'].append(note.velocity / 128.0)
      pd_dict['instrument'].append(note.instrument)
      pd_dict['program'].append(note.program)

    # If no velocity differences are found, set alpha to 1.0.
    if np.max(pd_dict['velocity']) == np.min(pd_dict['velocity']):
      pd_dict['fill_alpha'] = [1.0] * len(pd_dict['fill_alpha'])

    return pd.DataFrame(pd_dict)

  # These are hard-coded reasonable values, but the user can override them
  # by updating the figure if need be.
  fig = bokeh.plotting.figure(
      tools='hover,pan,box_zoom,reset,save')
  fig.plot_width = 500
  fig.plot_height = 500
  fig.xaxis.axis_label = 'time (sec)'
  fig.yaxis.axis_label = 'pitch (MIDI)'
  fig.y_range = Range1d(0, 127)
  fig.ygrid.ticker = bokeh.models.BasicTicker(base=10)
  # fig.yaxis.ticker = bokeh.models.SingleIntervalTicker(interval=12)
  # fig.ygrid.ticker = bokeh.models.SingleIntervalTicker(interval=12)
  # Pick indexes that are maximally different in Spectral8 colormap.
  spectral_color_indexes = [7, 0, 6, 1, 5, 2, 3]

  # Create a Pandas dataframe and group it by instrument.
  dataframe = _sequence_to_pandas_dataframe(sequence)
  instruments = sorted(set(dataframe['instrument']))
  grouped_dataframe = dataframe.groupby('instrument')
  for counter, instrument in enumerate(instruments):
    instrument_df = grouped_dataframe.get_group(instrument)
    color_idx = spectral_color_indexes[counter % len(spectral_color_indexes)]
    color = bokeh.palettes.Spectral8[color_idx]
    source = bokeh.plotting.ColumnDataSource(instrument_df)
    fig.quad(top='top', bottom='bottom', left='start_time', right='end_time',
             line_color='black', fill_color=color,
             fill_alpha='fill_alpha', source=source)
  fig.select(dict(type=bokeh.models.HoverTool)).tooltips = (
      {'pitch': '@pitch',
       'program': '@program',
       'velo': '@velocity',
       'duration': '@duration',
       'start_time': '@start_time',
       'end_time': '@end_time',
       'velocity': '@velocity',
       'fill_alpha': '@fill_alpha'})

  if show_figure:
    bokeh.plotting.output_notebook()
    bokeh.plotting.show(fig)
    return None
  return fig


In [None]:
len(mel_2_samples[0].notes) / mel_2_samples[0].total_time

In [None]:
mel_2_samples[0]

In [None]:
plot_sequence(mel_2_samples[0])

In [None]:
#@title Generate 4 samples from the prior.
temperature = 0.9 #@param {type:"slider", min:0.1, max:1.5, step:0.1}
mel_2_samples = mel_2bar.sample(n=4, length=32, temperature=temperature)
# for ns in mel_2_samples:
#   play(ns)

In [None]:
mel_2_samples[0]

In [None]:
#@title Optionally download samples.
for i, ns in enumerate(mel_2_samples):
  download(ns, 'mel_2bar_sample_%d.mid' % i)
  note_seq.plot_sequence(ns)


In [None]:
mm.plot_sequence(mel_2_samples[0])

In [None]:
mm.plot

# Fold Analysis


In [None]:
fold_dfs = list()
for i in [1, 2, 3, 4, 5]:
  data_path = f'/content/drive/MyDrive/Magenta/magenta/data/tfrecord/Persian/persian_100_v1/fold_{i}_test.tfrecord'
  input_seqs = get_dataset_noteseq(data_path=data_path)[0]
  df = get_metrics_dataframe(input_seqs)
  # Min-max normalize each column to [0, 1].
  normalized_df = (df - df.min()) / (df.max() - df.min())
  fold_dfs.append(normalized_df)

In [None]:
fold_dfs[3]
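
The fold analysis normalizes each metric column with `(df - df.min()) / (df.max() - df.min())`. The sketch below applies the same min-max formula to a plain list. `min_max_normalize` is a hypothetical helper, and returning zeros for a constant column is an assumption made for illustration; the pandas expression yields NaN in that case because it divides zero by zero.

```python
def min_max_normalize(values):
    """Rescales values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: the formula would divide by zero, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


pitch_ranges = [0, 9, 45]  # illustrative values
normalized = min_max_normalize(pitch_ranges)
constant = min_max_normalize([3, 3, 3])
print(normalized, constant)
```

Because each fold is normalized with its own minimum and maximum, normalized values are comparable within a fold but not directly across folds.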

In [None]:
import matplotlib.pyplot as plt

In [None]:
normalized_df.plot.hist(column=['note_diversity'], bins=30, alpha=0.5)

In [None]:
plt.show()

In [None]:
normalized_df.index

In [None]:
fig, axes = plt.subplots(figsize=(20,10), nrows=2, ncols=3)

# fold_dfs[0].plot(ax=axes[0,0])
# fold_dfs[1].plot(ax=axes[0,1])
# fold_dfs[2].plot(ax=axes[0,2])
# fold_dfs[3].plot(ax=axes[1,0])
# fold_dfs[4].plot(ax=axes[1,1])

column = ['gap_length']
bins = 30
fold_dfs[0].plot.hist(column=column, bins=bins, ax=axes[0,0])
fold_dfs[1].plot.hist(column=column, bins=bins, ax=axes[0,1])
fold_dfs[2].plot.hist(column=column, bins=bins, ax=axes[0,2])
fold_dfs[3].plot.hist(column=column, bins=bins, ax=axes[1,0])
fold_dfs[4].plot.hist(column=column, bins=bins, ax=axes[1,1])


In [None]:
fold_dfs[0].plot(subplots=True)

In [None]:
plt.show()

In [None]:
fold_dfs[3].plot()

In [None]:
import collections

fold_densities = list()
fold_counts = list()
for i in [1, 2, 3, 4, 5]:
  data_path = f'/content/drive/MyDrive/Magenta/magenta/data/tfrecord/Persian/persian_100_v1/fold_{i}_test.tfrecord'
  input_seqs = get_dataset_noteseq(data_path=data_path)[0]
  # Each metric function returns one value per melody.
  density_list = note_density(input_seqs)
  range_list = pitch_range(input_seqs)
  # plot_histogram expects a {diversity value: count} mapping.
  diversity_dict = collections.Counter(note_diversity(input_seqs))
  length_list = note_length(input_seqs)
  print (f'Analysis for fold #{i}: ')
  print(f'Count_d = {len(density_list)}, Count_r = {len(range_list)}')
  print(f'Mean density = {np.mean(density_list)}, Standard Dev density = {np.std(density_list)}, Median = {np.median(density_list)}')
  print(f'Mean range = {np.mean(range_list)}, Standard Dev range = {np.std(range_list)}')

  plot_histogram(density_list, range_list, diversity_dict, length_list, lim=4.4)
  print("-------------------------------------------------------------------"*2)

In [None]:
range(10)

In [None]:
note_diversity_dict = dict()
for i in range(128):
  note_diversity_dict[i] = 0

In [None]:
import collections

fold_densities = list()
fold_counts = list()
data_path = '/content/drive/MyDrive/Magenta/magenta/data/tfrecord/Bo_Burnham_eval.tfrecord'
input_seqs = get_dataset_noteseq(data_path=data_path)[0]
# Avoid naming a variable `range`: that would shadow the built-in.
density_list = note_density(input_seqs)
range_list = pitch_range(input_seqs)
diversity_dict = collections.Counter(note_diversity(input_seqs))
length_list = note_length(input_seqs)
print(f'Count_d = {len(density_list)}, Count_r = {len(range_list)}')
print(f'Mean density = {np.mean(density_list)}, Standard Dev density = {np.std(density_list)}')
print(f'Mean range = {np.mean(range_list)}, Standard Dev range = {np.std(range_list)}')

plot_histogram(density_list, range_list, diversity_dict, length_list, bins=8)
print("-------------------------------------------------------------------"*2)

In [None]:
for i, seq in enumerate(input_seqs[90:100]):
  download(seq, 'fold1_test_%d.mid' % (i+90))

# Temp


In [16]:
input_seqs = get_dataset_noteseq(data_path=persian_path_train)[0]

In [18]:
fold_dfs = list()
# Combine the train and test files into one list of NoteSequences.
input_seqs = get_dataset_noteseq(data_path=persian_path_train)[0]
input_seqs += get_dataset_noteseq(data_path=persian_path_test)[0]
df = get_metrics_dataframe(input_seqs)
# Min-max normalize each column to [0, 1].
normalized_df = (df - df.min()) / (df.max() - df.min())
fold_dfs.append(normalized_df)

In [19]:
df

| | note_density | pitch_range | note_diversity | note_length | gap_length |
|---|---|---|---|---|---|
| 0 | 2.500000 | 12 | 6 | 0.400000 | 0.000000 |
| 1 | 2.500000 | 10 | 5 | 0.400000 | 0.000000 |
| 2 | 1.500000 | 12 | 5 | 0.645833 | 0.020833 |
| 3 | 1.500000 | 7 | 4 | 0.645833 | 0.020833 |
| 4 | 1.500000 | 7 | 4 | 0.604167 | 0.062500 |
| ... | ... | ... | ... | ... | ... |
| 191 | 1.250000 | 15 | 5 | 0.800000 | 0.000000 |
| 192 | 1.250000 | 5 | 4 | 0.800000 | 0.000000 |
| 193 | 3.750000 | 12 | 4 | 0.125000 | 0.141667 |
| 194 | 5.419355 | 12 | 5 | 0.125000 | 0.059524 |


In [20]:
df.describe()

| | note_density | pitch_range | note_diversity | note_length | gap_length |
|---|---|---|---|---|---|
| count | 196.0 | 196.0 | 196.0 | 196.0 | 196.0 |
| mean | 3.102414 | 10.846939 | 4.663265 | 0.340745 | 0.190655 |
| std | 1.950909 | 8.930905 | 2.864259 | 0.316356 | 0.354935 |
| min | 0.25 | 0.0 | 1.0 | 0.125 | 0.0 |
| 25% | 1.75 | 5.0 | 3.0 | 0.145833 | 0.020833 |
| 50% | 2.36129 | 8.5 | 4.0 | 0.198214 | 0.074176 |
| 75% | 5.0 | 14.0 | 6.0 | 0.4 | 0.25 |
| max | 8.0 | 45.0 | 20.0 | 2.0 | 2.625 |
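
The extremes in this summary are consistent with a 16th-note grid: the minimum `note_length` is 0.125 s and the maximum `note_density` is 8 notes per second. The sketch below checks that arithmetic. The tempo of 120 qpm, the 4/4 meter, and the resolution of 4 steps per quarter note are assumptions, since none of them is stated in this notebook; only the 2-bar length and the 32 steps match the `length=32` sampling calls above.

```python
QPM = 120.0              # assumption: tempo in quarter notes per minute
STEPS_PER_QUARTER = 4    # assumption: 16th-note grid
QUARTERS_PER_BAR = 4     # assumption: 4/4 meter
BARS = 2                 # 2-bar melodies

seconds_per_quarter = 60.0 / QPM
step_seconds = seconds_per_quarter / STEPS_PER_QUARTER      # shortest possible note
total_steps = BARS * QUARTERS_PER_BAR * STEPS_PER_QUARTER   # steps per melody
clip_seconds = total_steps * step_seconds                   # melody duration
max_density = total_steps / clip_seconds                    # one note on every step

print(step_seconds, total_steps, clip_seconds, max_density)
```

Under the same assumptions, the minimum density of 0.25 corresponds to a single note in a four-second melody.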
