<a href="https://colab.research.google.com/github/michaellomuscio/LSTM_Sound_Example/blob/main/LSTM_Music_Sequence_Example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Music Sequence Prediction using LSTMs

*Written by Dr. Michael Lomuscio*

This Google Colab notebook demonstrates a basic music sequence prediction model built with a Long Short-Term Memory (LSTM) network, a type of Recurrent Neural Network (RNN). The goal is to predict the next musical note in a sequence from the notes that precede it. The sections below briefly introduce LSTMs and explain how they are used in this project.

## Key Concepts and Models Used

### 1. LSTM Networks
LSTMs are a type of Recurrent Neural Network designed to recognize patterns in sequential data. They suit time-series data such as music, where earlier notes influence later ones. In this notebook, an LSTM learns the relationships between notes and predicts the next note in a sequence. Its memory cells can retain information across many time steps, which is why LSTMs handle data in which the order of elements matters.
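
To make the idea of a memory cell more concrete, here is a minimal NumPy sketch of a single LSTM time step. It is illustrative only: the names `lstm_step`, `W`, `U`, and `b`, the gate ordering, and the random weights are assumptions made for this example, not the internals of the Keras layer used later in the notebook.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """Run one LSTM time step and return the new hidden and cell states."""
    # Compute the pre-activations of all four gates in one pass
    z = W @ x_t + U @ h_prev + b
    i, f, g, o = np.split(z, 4)  # input, forget, candidate, output (assumed order)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    c_t = f * c_prev + i * g     # keep part of the old memory, add new information
    h_t = o * np.tanh(c_t)       # expose a filtered view of the memory
    return h_t, c_t

units, n_features = 4, 1         # small sizes chosen only for readability
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(4 * units, n_features))
U = rng.normal(scale=0.5, size=(4 * units, units))
b = np.zeros(4 * units)

h = np.zeros(units)
c = np.zeros(units)
# Feed three normalized notes (C, D, E) one time step at a time
for value in [0 / 7, 1 / 7, 2 / 7]:
    h, c = lstm_step(np.array([value]), h, c, W, U, b)

print("final hidden state:", h)
```

The cell state `c` is what carries information from one note to the next, while the forget gate `f` decides how much of it survives each step.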

### 2. Music Note Mapping and Sequence Creation
The notebook first converts the musical notes (C, D, E, F, G, A, B) into integers, because the LSTM operates on numbers rather than text labels. A sequence of notes is then defined, and input-output pairs are generated from it to train the model.
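
The mapping can be sketched as a pair of dictionaries with a round trip between notes and integers. The helper names `encode` and `decode` are hypothetical and used only for this illustration.

```python
the_notes = ['C', 'D', 'E', 'F', 'G', 'A', 'B']

# Forward and reverse lookups between note names and integer indices
note_to_int = {note: i for i, note in enumerate(the_notes)}
int_to_note = {i: note for i, note in enumerate(the_notes)}

def encode(notes):
    """Convert note names to integer indices (hypothetical helper)."""
    return [note_to_int[n] for n in notes]

def decode(indices):
    """Convert integer indices back to note names (hypothetical helper)."""
    return [int_to_note[i] for i in indices]

sequence = ['C', 'D', 'E', 'F', 'G', 'A', 'B', 'C', 'D', 'E']
encoded = encode(sequence)
print(encoded)  # [0, 1, 2, 3, 4, 5, 6, 0, 1, 2]
print(decode(encoded) == sequence)  # True
```

Note that the integers are arbitrary labels: B (6) and C (0) are neighbors in the scale but far apart numerically, which is one limitation of this simple encoding.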

### 3. Model Structure
The model is built with the Sequential API from TensorFlow's Keras library as a simple LSTM-based neural network. It has two layers:
- **LSTM Layer:** This layer processes the sequence of notes and captures the temporal dependencies among them.
- **Dense Layer:** The Dense layer with a softmax activation outputs a probability distribution over the possible notes, allowing the model to predict the next note.
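
As a rough sanity check on the size of this architecture, the number of trainable parameters can be worked out by hand. The sketch below assumes the standard LSTM formulation with one bias vector per gate and uses the layer sizes from this notebook (50 LSTM units, 1 input feature, 7 output notes); the function names are hypothetical, and calling `model.summary()` on the built model is the way to confirm the actual figures.

```python
def lstm_param_count(units, n_features):
    # Four gates, each with input weights, recurrent weights, and a bias vector
    return 4 * (units * n_features + units * units + units)

def dense_param_count(n_inputs, n_outputs):
    # One weight per input-output pair plus one bias per output
    return n_inputs * n_outputs + n_outputs

lstm_params = lstm_param_count(units=50, n_features=1)
dense_params = dense_param_count(n_inputs=50, n_outputs=7)

print(lstm_params)                 # 10400
print(dense_params)                # 357
print(lstm_params + dense_params)  # 10757
```

With only seven training pairs available, a model of this size can memorize the training sequence outright, which is worth keeping in mind when interpreting its predictions.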

### 4. Training Process
The model is trained to predict the next note from a sliding window of the previous notes (three, in this notebook). The training data is built by slicing the original sequence into overlapping subsequences. For example, given the sequence [C, D, E, F, G, A, B, C, D, E], the model learns from [C, D, E] to predict F, from [D, E, F] to predict G, and so on, which yields seven training pairs.
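
The windowing described above can be sketched in a few lines of plain Python. The function name `make_windows` is hypothetical; the logic mirrors the loop used in the notebook's data preparation step, applied here to note names instead of integers so the pairs are easy to read.

```python
def make_windows(sequence, seq_length):
    """Split a sequence into (window, next item) training pairs."""
    inputs, targets = [], []
    for i in range(len(sequence) - seq_length):
        inputs.append(sequence[i:i + seq_length])   # the window of previous notes
        targets.append(sequence[i + seq_length])    # the note that follows it
    return inputs, targets

sequence = ['C', 'D', 'E', 'F', 'G', 'A', 'B', 'C', 'D', 'E']
inputs, targets = make_windows(sequence, seq_length=3)

for window, target in zip(inputs, targets):
    print(window, "->", target)
```

A sequence of 10 notes with a window of 3 produces 10 - 3 = 7 pairs, the last of which is `['B', 'C', 'D'] -> E`.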

## How to Use This Code

1. **Dependencies**  
   Make sure TensorFlow and NumPy are installed. Google Colab typically has both preinstalled, so the following commands are mainly needed in other environments:
   ```python
   !pip install tensorflow
   !pip install numpy
   ```


In [2]:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Step 1: Define the set of notes
# Define the musical notes that will be used in the sequence
the_notes = ['C', 'D', 'E', 'F', 'G', 'A', 'B']

# Create mappings from notes to integers and vice versa
# This will allow conversion between notes and numerical representations
note_to_int = dict((note, number) for number, note in enumerate(the_notes))
int_to_note = dict((number, note) for number, note in enumerate(the_notes))

# Step 2: Create a sequence of notes
# Define a sequence of notes that will be used as input data
sequence = ['C', 'D', 'E', 'F', 'G', 'A', 'B', 'C', 'D', 'E']

# Convert the sequence of notes to integers using the mapping
dense_sequence_int = [note_to_int[note] for note in sequence]
print(f"Converted sequence to integers: {dense_sequence_int}")

# Step 3: Prepare the data
# Define the length of the input sequences (number of notes to consider in each step)
seq_length = 3 # Number of previous notes to consider
X = []
y = []

# Create sequences for training the LSTM model
# X will contain sequences of notes, and y will contain the next note in the sequence
for i in range(len(dense_sequence_int) - seq_length):
  X.append(dense_sequence_int[i:i + seq_length]) # Add the input sequence of length `seq_length`
  y.append(dense_sequence_int[i + seq_length]) # Add the next note as the target output
# Note: this print sits after the loop, so it reports only the last input/output pair
print(f"Sequence {i}: Input: {X[-1]}, Output: {y[-1]}")

# Convert the input and output to numpy arrays for LSTM compatibility
X = np.array(X)
y = np.array(y)
print(f"X shape: {X.shape}, y shape: {y.shape}")

# Reshape X to be [samples, time steps, features] as required by LSTM input
X = X.reshape((X.shape[0], X.shape[1], 1))
print(f"Reshaped X: {X.shape}")

# Normalize input data by dividing by the number of notes (to have values between 0 and 1)
X = X / float(len(the_notes))
print(f"Normalized X: {X}")

# Step 4: Build the model
# Create a Sequential LSTM model
model = Sequential()

# Add an LSTM layer with 50 units
# The input shape is defined as (sequence length, 1 feature per time step)
model.add(LSTM(50, input_shape=(X.shape[1], X.shape[2])))
print("Added LSTM layer with 50 units.")

# Add a Dense layer with output size equal to the number of notes
# Use 'softmax' activation to predict the probability of each note
model.add(Dense(len(the_notes), activation='softmax'))
print("Added Dense layer with softmax activation.")

# Compile the model using sparse categorical crossentropy as the loss function
# and 'adam' as the optimizer to train efficiently
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
print("Compiled the model with sparse categorical crossentropy and adam optimizer.")

# Step 5: Train the model
# Train the LSTM model on the prepared data
# Set the number of epochs to 200, with a batch size of 1
print("Starting model training...")
model.fit(X, y, epochs=200, batch_size=1, verbose=2)
print("Model training completed.")

# Step 6: Make a prediction
# Select a random starting point in the input data for prediction
# (np.random.randint excludes the upper bound, so len(X) covers every sample)
start = np.random.randint(0, len(X))
print(f"Random starting point for prediction: {start}")

# Extract the pattern to start the prediction (input sequence of length `seq_length`)
pattern = X[start]
pattern_original = pattern.copy() # Store a copy of the original input pattern
print(f"Original pattern for prediction: {pattern_original}")

# Reshape the pattern to match the LSTM input requirements (1 sample, sequence length, 1 feature)
prediction_input = pattern.reshape(1, seq_length, 1)
print(f"Reshaped pattern for prediction: {prediction_input}")

# Predict the next note using the trained model
# The prediction contains probabilities for each note
prediction = model.predict(prediction_input, verbose=0)
print(f"Prediction probabilities: {prediction}")

# Extract the index of the note with the highest probability
index = np.argmax(prediction)
print(f"Predicted index: {index}")

# Convert the predicted index back to the note using `int_to_note` mapping
result = int_to_note[index]
print(f"Predicted next note: {result}")

# Convert the original input pattern back to note names for printing
# (round before casting so floating-point error cannot shift a note index)
input_notes = [int_to_note[int(round(n * len(the_notes)))] for n in pattern_original.flatten()]

# Print the input sequence and the predicted next note
print(f"Input Sequence: {input_notes}")
print(f"Predicted Next Note: {result}")



Converted sequence to integers: [0, 1, 2, 3, 4, 5, 6, 0, 1, 2]
Sequence 6: Input: [6, 0, 1], Output: 2
X shape: (7, 3), y shape: (7,)
Reshaped X: (7, 3, 1)
Normalized X: [[[0.        ]
  [0.14285714]
  [0.28571429]]

 [[0.14285714]
  [0.28571429]
  [0.42857143]]

 [[0.28571429]
  [0.42857143]
  [0.57142857]]

 [[0.42857143]
  [0.57142857]
  [0.71428571]]

 [[0.57142857]
  [0.71428571]
  [0.85714286]]

 [[0.71428571]
  [0.85714286]
  [0.        ]]

 [[0.85714286]
  [0.        ]
  [0.14285714]]]




Added LSTM layer with 50 units.
Added Dense layer with softmax activation.
Compiled the model with sparse categorical crossentropy and adam optimizer.
Starting model training...
Epoch 1/200
7/7 - 2s - 292ms/step - loss: 1.9631
Epoch 2/200
7/7 - 0s - 6ms/step - loss: 1.9565
Epoch 3/200
7/7 - 0s - 6ms/step - loss: 1.9541
Epoch 4/200
7/7 - 0s - 5ms/step - loss: 1.9509
Epoch 5/200
7/7 - 0s - 5ms/step - loss: 1.9492
Epoch 6/200
7/7 - 0s - 6ms/step - loss: 1.9466
Epoch 7/200
7/7 - 0s - 6ms/step - loss: 1.9445
Epoch 8/200
7/7 - 0s - 5ms/step - loss: 1.9429
Epoch 9/200
7/7 - 0s - 6ms/step - loss: 1.9414
Epoch 10/200
7/7 - 0s - 6ms/step - loss: 1.9396
Epoch 11/200
7/7 - 0s - 6ms/step - loss: 1.9370
Epoch 12/200
7/7 - 0s - 6ms/step - loss: 1.9354
Epoch 13/200
7/7 - 0s - 6ms/step - loss: 1.9330
Epoch 14/200
7/7 - 0s - 6ms/step - loss: 1.9308
Epoch 15/200
7/7 - 0s - 6ms/step - loss: 1.9284
Epoch 16/200
7/7 - 0s - 6ms/step - loss: 1.9258
Epoch 17/200
7/7 - 0s - 6ms/step - loss: 1.9231
Epoch 18/200


## Custom Input Sequence Prediction Using LSTM

This section extends the LSTM-based music prediction model to a custom input sequence of notes. A sequence is fed into the model trained above, and the model predicts the next note. This makes it easy to experiment with different sequences, see how the model interprets them, and explore generating new music from user-defined inputs.

### Overview of Key Steps and Ideas

**1. Define the Custom Sequence of Notes**  
- The code starts with a custom sequence of notes to use for prediction. You can replace the sequence `['E', 'A', 'A']` with any notes from the defined set, which lets you test the model on different musical contexts.

**2. Data Validation and Conversion**  
- The next step checks that every note in the custom sequence belongs to the predefined set of notes (`the_notes`). If a note is not recognized, the code raises a `ValueError` so that invalid input is never passed to the model.
- Once validated, the custom sequence is converted to its integer representation using the mapping dictionary (`note_to_int`). This conversion is necessary because the LSTM model expects numerical input rather than text-based note labels.

**3. Prepare Input Data for the Model**  
- The custom sequence, now represented by integers, is converted into a NumPy array and reshaped to the `[samples, time steps, features]` layout that the LSTM expects: one sample, one time step per note, and one feature per time step. The model was trained on windows of three notes, so a three-note custom sequence matches the training setup.
- The input is then divided by the number of notes (`float(len(the_notes))`), exactly as the training data was, so that all values fall within the range [0, 1]. This step is crucial: the model can only produce meaningful predictions if the custom input is scaled the same way as the data it was trained on.
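
The validation, conversion, reshaping, and normalization steps can be gathered into one helper, sketched below. The function name `prepare_input` and the explicit length check are assumptions added for this illustration; the notebook's own cell performs the same steps inline and does not check the length.

```python
import numpy as np

the_notes = ['C', 'D', 'E', 'F', 'G', 'A', 'B']
note_to_int = {note: i for i, note in enumerate(the_notes)}

def prepare_input(notes, seq_length=3):
    """Validate, encode, reshape, and normalize a note sequence (hypothetical helper)."""
    unknown = [n for n in notes if n not in note_to_int]
    if unknown:
        raise ValueError(f"Notes not in the defined set: {unknown}")
    # Assumption: require the same window length that was used for training
    if len(notes) != seq_length:
        raise ValueError(f"Expected {seq_length} notes, got {len(notes)}")
    arr = np.array([note_to_int[n] for n in notes], dtype=float)
    arr = arr.reshape(1, seq_length, 1)   # [samples, time steps, features]
    return arr / len(the_notes)           # same scaling as the training data

custom_input = prepare_input(['E', 'A', 'A'])
print(custom_input.shape)      # (1, 3, 1)
print(custom_input.flatten())  # approximately [0.2857 0.7143 0.7143]
```

Keeping these steps in one place reduces the risk of scaling the custom input differently from the training data.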

**4. Making Predictions with the Model**  
- After preparing the input, the code uses the trained LSTM model to predict the next note in the sequence. The model outputs a probability distribution over all seven notes, indicating how likely each one is to come next.
- The `np.argmax()` function then returns the index of the highest probability, which is taken as the model's prediction for the next note.
- Finally, the predicted note index is converted back to its original note label using the `int_to_note` mapping dictionary. This allows us to display the result in a human-readable format, showing both the input sequence and the predicted next note.
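
The decoding step can be illustrated without the model by starting from the probabilities printed in the recorded run of this notebook for the input `['E', 'A', 'A']`. The ranking of the top three candidates is an addition for illustration; the notebook itself only takes the single most likely note.

```python
import numpy as np

the_notes = ['C', 'D', 'E', 'F', 'G', 'A', 'B']

# Probabilities copied from the notebook's recorded output for ['E', 'A', 'A']
prediction = np.array([[5.7655270e-04, 5.4495154e-06, 4.8819984e-08, 1.6732538e-06,
                        1.2921812e-02, 9.0109181e-01, 8.5402712e-02]])

# The most likely note is the one at the index of the largest probability
index = int(np.argmax(prediction))
print(index, the_notes[index])  # 5 A

# Rank all notes from most to least likely and show the top three
ranked = np.argsort(prediction[0])[::-1]
for i in ranked[:3]:
    print(f"{the_notes[i]}: {prediction[0, i]:.4f}")
```

Looking beyond the top choice shows how confident the model is: here `A` takes about 90% of the probability mass, with `B` a distant second.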

### How to Use This Code

1. **Modify the Custom Sequence**  
   - You can customize the sequence of notes by changing the `custom_sequence` variable. For example, you might choose `['C', 'G', 'E']` or any other combination of notes to see how the model continues the sequence.

2. **Run the Code**  
   - Ensure that the model has already been trained before running this custom prediction code. Otherwise, the model won't be able to generate meaningful predictions.
   - Execute each step in order to convert the notes, prepare the input, and make a prediction.

3. **Interpreting the Output**  
   - The code prints the custom input sequence and the predicted next note. The prediction reflects only the patterns the LSTM saw during training, which here is a single ascending scale. In the run recorded below, the input `['E', 'A', 'A']` yields `A` with a probability of about 0.90.

### Practical Applications and Extensions

- **Music Composition:** This functionality can be used for music composition by allowing users to iteratively generate new notes, potentially creating entire pieces of music by repeatedly feeding predictions back into the model.
- **Experimenting with Different Sequences:** You can explore how different input sequences lead to different predictions, helping you understand the types of musical patterns the model has learned.
- **Interactive Music Generation:** By wrapping this process in a user interface, you could build an interactive music generation tool where users can input sequences and hear the model's predicted continuations in real time.
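
The iterative generation idea from the first bullet can be sketched as a loop that feeds each prediction back in as input. Everything named here is an assumption for illustration: `generate` is a hypothetical helper, and `scale_stub` is a stand-in for the trained model that simply predicts the next step of the scale, so the example runs without TensorFlow.

```python
import numpy as np

the_notes = ['C', 'D', 'E', 'F', 'G', 'A', 'B']
note_to_int = {note: i for i, note in enumerate(the_notes)}

def generate(predict_fn, seed, n_steps, seq_length=3):
    """Extend `seed` by n_steps notes, feeding each prediction back as input."""
    notes = list(seed)  # the seed must contain at least seq_length notes
    for _ in range(n_steps):
        window = [note_to_int[n] for n in notes[-seq_length:]]
        x = np.array(window, dtype=float).reshape(1, seq_length, 1) / len(the_notes)
        probs = predict_fn(x)  # expected shape: (1, number of notes)
        notes.append(the_notes[int(np.argmax(probs))])
    return notes

def scale_stub(x):
    """Hypothetical stand-in for the trained model: predicts the next scale step."""
    last = int(round(x[0, -1, 0] * len(the_notes)))
    probs = np.zeros((1, len(the_notes)))
    probs[0, (last + 1) % len(the_notes)] = 1.0
    return probs

melody = generate(scale_stub, ['C', 'D', 'E'], n_steps=4)
print(melody)  # ['C', 'D', 'E', 'F', 'G', 'A', 'B']
```

To use the notebook's trained model instead of the stub, one option is to pass `lambda x: model.predict(x, verbose=0)` as `predict_fn`. Because this loop always takes the most likely note, its output is deterministic; sampling from the probabilities would give more varied results.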

Extending the LSTM model to custom input sequences gives a simple way to probe what the model has learned and to experiment with generating music using deep learning.


In [3]:
# --- Custom Input Sequence Prediction ---

# Step 1: Define your custom sequence
# Define a custom sequence of notes to use for prediction
custom_sequence = ['E', 'A', 'A'] # Replace with your desired notes
print(f"Custom input sequence: {custom_sequence}")

# Step 2: Convert notes to integers
# Check if all notes are in the defined set of notes
for note in custom_sequence:
  if note not in note_to_int:
    raise ValueError(f"Note '{note}' is not in the defined set of notes.")
print("All custom sequence notes are valid.")

# Convert the custom sequence to integers using the mapping
custom_sequence_int = [note_to_int[note] for note in custom_sequence]
print(f"Custom sequence converted to integers: {custom_sequence_int}")

# Step 3: Prepare the input data
# Convert the custom sequence into a numpy array and reshape it
custom_input = np.array(custom_sequence_int)
custom_input = custom_input.reshape(1, len(custom_input), 1)
print(f"Reshaped custom input: {custom_input}")

# Normalize the custom input data by dividing by the number of notes (to have values between 0 and 1)
custom_input = custom_input / float(len(the_notes))
print(f"Normalized custom input: {custom_input}")

# Step 4: Make the prediction
# Predict the next note based on the custom input sequence
prediction = model.predict(custom_input, verbose=0)
print(f"Custom input prediction probabilities: {prediction}")

# Extract the index of the note with the highest probability
index = np.argmax(prediction)
print(f"Predicted index for custom input: {index}")

# Convert the predicted index back to the note using `int_to_note` mapping
predicted_note = int_to_note[index]
print(f"Predicted next note for custom input: {predicted_note}")

# Display the result
# Print the custom input sequence and the predicted next note
print(f"Input Sequence: {custom_sequence}")
print(f"Predicted Next Note: {predicted_note}")



Custom input sequence: ['E', 'A', 'A']
All custom sequence notes are valid.
Custom sequence converted to integers: [2, 5, 5]
Reshaped custom input: [[[2]
  [5]
  [5]]]
Normalized custom input: [[[0.28571429]
  [0.71428571]
  [0.71428571]]]
Custom input prediction probabilities: [[5.7655270e-04 5.4495154e-06 4.8819984e-08 1.6732538e-06 1.2921812e-02
  9.0109181e-01 8.5402712e-02]]
Predicted index for custom input: 5
Predicted next note for custom input: A
Input Sequence: ['E', 'A', 'A']
Predicted Next Note: A
