# Magenta Model Format Investigation

The `MagentaDemo.ipynb` notebook demonstrates Magenta's libraries and a pre-trained model. For this model (or a similar one) to work in our hardware environment, we need to be able to convert it to a TensorFlow Lite model. Magenta has its own notion of what a "model" is, defined by its `TrainedModel` class, so this conversion is not as straightforward as we would like. This notebook is a sandbox for experimenting with ways to convert the Magenta model into a TensorFlow model that can then be easily converted to TensorFlow Lite.

For the following code to run successfully, you will need to follow the setup steps listed in `MagentaDemo.ipynb`.

In [40]:
#@title Setup Environment and Define all helper functionality
import tensorflow as tf

# General / Math / Sound libraries
import copy, warnings, librosa, numpy as np
warnings.filterwarnings("ignore", category=DeprecationWarning)

# Magenta specific stuff
import magenta
from magenta.models.music_vae import configs
from magenta.models.music_vae.trained_model import TrainedModel
from magenta.models.music_vae import data

# Load model checkpoint
GROOVAE_2BAR_TAP_FIXED_VELOCITY = "model_checkpoints/groovae_2bar_tap_fixed_velocity.tar"
config_2bar_tap = configs.CONFIG_MAP['groovae_2bar_tap_fixed_velocity']
# Create a TrainedModel (Magenta class) from config and checkpoint
# The config specifies the type of the model, and their TrainedModel class constructs the
#   appropriate back-end tensorflow graph for that specific model
groovae_2bar_tap = TrainedModel(config_2bar_tap, 1, checkpoint_dir_or_path=GROOVAE_2BAR_TAP_FIXED_VELOCITY)

print(config_2bar_tap)
print("\n")
print(config_2bar_tap.hparams)
print("\n")
print(config_2bar_tap.hparams.enc_rnn_size)
print(groovae_2bar_tap) # magenta.models.music_vae.trained_model.TrainedModel object
print(config_2bar_tap.model) # magenta.models.music_vae.base_model.MusicVAE
print(config_2bar_tap.model.encoder) # magenta.models.music_vae.lstm_models.BidirectionalLstmEncoder 
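#print(config_2bar_tap.model.encoder._cells) # This gives an error. Likely cause (unverified): TrainedModel appears to
#   deep-copy the config before building, so this original encoder was never built and has no _cells attribute.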

INFO:tensorflow:Building MusicVAE model with BidirectionalLstmEncoder, GrooveLstmDecoder, and hparams:
{'max_seq_len': 32, 'z_size': 256, 'free_bits': 48, 'max_beta': 0.2, 'beta_rate': 0.0, 'batch_size': 1, 'grad_clip': 1.0, 'clip_mode': 'global_norm', 'grad_norm_clip_to_zero': 10000, 'learning_rate': 0.001, 'decay_rate': 0.9999, 'min_learning_rate': 1e-05, 'conditional': True, 'dec_rnn_size': [256, 256], 'enc_rnn_size': [512], 'dropout_keep_prob': 0.3, 'sampling_schedule': 'constant', 'sampling_rate': 0.0, 'use_cudnn': False, 'residual_encoder': False, 'residual_decoder': False, 'control_preprocessing_rnn_size': [256]}
INFO:tensorflow:
Encoder Cells (bidirectional):
  units: [512]

INFO:tensorflow:
Decoder Cells:
  units: [256, 256]





INFO:tensorflow:Unbundling checkpoint.
INFO:tensorflow:Restoring parameters from C:\Users\RYANHE~1\AppData\Local\Temp\tmp9hk47kie\groovae_2bar_tap_fixed_velocity/model.ckpt-3668
Config(model=<magenta.models.music_vae.base_model.MusicVAE object at 0x0000028A4278FE08>, hparams=HParams([('batch_size', 512), ('beta_rate', 0.0), ('clip_mode', 'global_norm'), ('conditional', True), ('control_preprocessing_rnn_size', [256]), ('dec_rnn_size', [256, 256]), ('decay_rate', 0.9999), ('dropout_keep_prob', 0.3), ('enc_rnn_size', [512]), ('free_bits', 48), ('grad_clip', 1.0), ('grad_norm_clip_to_zero', 10000), ('learning_rate', 0.001), ('max_beta', 0.2), ('max_seq_len', 32), ('min_learning_rate', 1e-05), ('residual_decoder', False), ('residual_encoder', False), ('sampling_rate', 0.0), ('sampling_schedule', 'constant'), ('use_cudnn', False), ('z_size', 256)]), note_sequence_augmenter=None, data_converter=<magenta.models.music_vae.data.GrooveConverter object at 0x0000028A4278FE88>, train_examples_path=
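
One possible explanation for the error on the commented-out `_cells` line: the build log above reports `'batch_size': 1`, while the printed `config_2bar_tap.hparams` still shows `('batch_size', 512)`. That suggests `TrainedModel` builds a private copy of the config rather than the object we passed in, so the encoder reachable through `config_2bar_tap.model` may never have had `build()` called on it. The sketch below reproduces that effect with plain Python. All of the `Toy*` classes are hypothetical stand-ins and are not Magenta's code.

```python
import copy


class ToyEncoder:
    """Hypothetical stand-in: _cells only exists after build() is called."""

    def build(self, hparams):
        self._cells = ["fw_cell(%d)" % n for n in hparams["enc_rnn_size"]]


class ToyModel:
    """Hypothetical stand-in for a model that wraps an encoder."""

    def __init__(self):
        self.encoder = ToyEncoder()

    def build(self, hparams):
        self.encoder.build(hparams)


class ToyTrainedModel:
    """Hypothetical wrapper that copies the config before building (assumed behavior)."""

    def __init__(self, config):
        # The caller's config object is left untouched; only the copy gets built.
        self._config = copy.deepcopy(config)
        self._config["model"].build(self._config["hparams"])


config = {"model": ToyModel(), "hparams": {"enc_rnn_size": [512]}}
trained = ToyTrainedModel(config)

original_has_cells = hasattr(config["model"].encoder, "_cells")
copy_has_cells = hasattr(trained._config["model"].encoder, "_cells")
print(original_has_cells, copy_has_cells)  # False True
```

If the same thing is happening in Magenta, the built encoder would live on the wrapper's internal copy of the config rather than on `config_2bar_tap`. Reaching into that private state is untested here and would need to be confirmed against the library source.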

# Initial Conclusions and Next Steps

It does not appear that the underlying TensorFlow graphs can be accessed directly from the library code used in `MagentaDemo.ipynb`, so I dug into the code that makes up that library. Their `TrainedModel` class wraps several model types. For this case, they have defined a `MusicVAE` model, which is itself a wrapper around two internal neural networks: the "encoder" and the "decoder". The encoder / decoder structure is essential to how a VAE (Variational Auto-Encoder) operates. The encoder and decoder are the neural network structures that get constructed from TensorFlow elements, so they are the components we want to convert to TF-Lite. The following code is an attempt to reproduce the TensorFlow structure of their encoder. There is more work to do to figure out how to use their library code, since it does not expose what we need, but reproducing what they have in the back-end is a good start.

I found this all by digging through https://github.com/magenta/magenta/tree/2d0fd456d7faa272733b57d286f5f26998082cf8/magenta/models/music_vae
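
To make the encoder / decoder split concrete, here is a minimal sketch of the standard VAE sampling step that sits between the two networks: the encoder produces a mean and a standard deviation for each latent dimension, a latent vector `z` is sampled from them, and the decoder consumes `z`. This is the textbook formulation and not Magenta's implementation. `sample_latent` and the example values are hypothetical; only `Z_SIZE = 256` comes from the `z_size` hparam printed above.

```python
import random

Z_SIZE = 256  # z_size from the hparams printed above


def sample_latent(mu, sigma, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


rng = random.Random(0)

# Illustrative encoder outputs; a real encoder would compute these from the input sequence.
mu = [0.5] * Z_SIZE
sigma = [0.1] * Z_SIZE

z = sample_latent(mu, sigma, rng)

# With zero standard deviation the sample collapses to the mean.
z_deterministic = sample_latent(mu, [0.0] * Z_SIZE, rng)

print(len(z), z_deterministic == mu)  # 256 True
```

For conversion purposes, the point is that the sampling step is simple arithmetic, so the parts that actually need to be expressed as TensorFlow (and later TF-Lite) graphs are the encoder and decoder networks on either side of it.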

In [41]:
from magenta.models.music_vae.lstm_models import BidirectionalLstmEncoder

# Initialize Encoder
encoder = BidirectionalLstmEncoder()
print(encoder)

# Build the Encoder using the same parameters from the groovae_2bar_tap config
# Notice we are NOT training
encoder.build(config_2bar_tap.hparams, is_training=False) 
# Notice the output this gives is also present in the output from the code above. 
# Hopefully this is enough confidence that the same code is running behind the scenes.

#print(encoder._cells) # tuple of 2 <tensorflow.python.keras.layers.legacy_rnn.rnn_cell_impl.MultiRNNCell objects

fw_encoder_cells = encoder._cells[0][0]
print(fw_encoder_cells)

# Save MultiRNNCell module to a SavedModel (ONLY HAVE TO RUN THIS ONCE)
# tf.saved_model.save(fw_encoder_cells, "./saved_models/lstm_saved") 

#reloaded_encoder = tf.saved_model.load("./saved_models/lstm_saved") 
#print(reloaded_encoder.signatures)

# Translate SavedModel to TF-Lite (CURRENTLY DOES NOT WORK 1/26 6:21 PM)
converter = tf.lite.TFLiteConverter.from_saved_model("./saved_models/lstm_saved", signature_keys=None) # path to the SavedModel directory
tflite_model = converter.convert()


<magenta.models.music_vae.lstm_models.BidirectionalLstmEncoder object at 0x0000028A4443C548>
INFO:tensorflow:
Encoder Cells (bidirectional):
  units: [512]

<tensorflow.python.keras.layers.legacy_rnn.rnn_cell_impl.MultiRNNCell object at 0x0000028A45A846C8>


NotImplementedError: We could not automatically infer the static shape of the layer's output. Please implement the `compute_output_shape` method on your layer (MultiRNNCell).