**Social Computing Mini-Project**

**"Graph Neural Networks for Harmonious Music Composition"**

Team Members:
- Vishwanath Sridhar [PES1UG22AM194]
- Venkat Subramanian [PES1UG22AM188]
- Shri Hari [PES1UG22AM154]
- Vismaya Vadana [PES1UG22AM195]

Installing the PyTorch Geometric library, which provides the graph data structures and GNN layers used below.

In [1]:
!pip install torch_geometric

Collecting torch_geometric
  Downloading torch_geometric-2.6.1-py3-none-any.whl.metadata (63 kB)
Downloading torch_geometric-2.6.1-py3-none-any.whl (1.1 MB)
Installing collected packages: torch_geometric
Successfully installed torch_geometric-2.6.1


Extracting the MAESTRO v3.0.0 MIDI archive so the files can be read from disk.

In [2]:
!unzip '/content/maestro-v3.0.0-midi.zip'

Archive:  /content/maestro-v3.0.0-midi.zip
  inflating: maestro-v3.0.0/2004/MIDI-Unprocessed_XP_08_R1_2004_01-02_ORIG_MID--AUDIO_08_R1_2004_01_Track01_wav.midi  
  inflating: maestro-v3.0.0/2004/MIDI-Unprocessed_XP_09_R1_2004_05_ORIG_MID--AUDIO_09_R1_2004_06_Track06_wav.midi  
  inflating: maestro-v3.0.0/2004/MIDI-Unprocessed_XP_14_R1_2004_01-03_ORIG_MID--AUDIO_14_R1_2004_01_Track01_wav.midi  
  inflating: maestro-v3.0.0/2004/MIDI-Unprocessed_XP_01_R1_2004_01-02_ORIG_MID--AUDIO_01_R1_2004_03_Track03_wav.midi  
  inflating: maestro-v3.0.0/2004/MIDI-Unprocessed_SMF_13_01_2004_01-05_ORIG_MID--AUDIO_13_R1_2004_09_Track09_wav.midi  
  inflating: maestro-v3.0.0/2004/MIDI-Unprocessed_XP_18_R1_2004_01-02_ORIG_MID--AUDIO_18_R1_2004_03_Track03_wav.midi  
  inflating: maestro-v3.0.0/2004/MIDI-Unprocessed_XP_19_R1_2004_01-02_ORIG_MID--AUDIO_19_R1_2004_01_Track01_wav.midi  
  inflating: maestro-v3.0.0/2004/MIDI-Unprocessed_XP_04_R1_2004_06_ORIG_MID--AUDIO_04_R1_2004_08_Track08_wav.midi  
  inflatin

Importing the necessary libraries, loading the MAESTRO metadata into a pandas DataFrame, and inspecting it manually.

We choose a small arbitrary subset of 20 files to work with (long training times otherwise!).

In [64]:
import os
import json
import matplotlib.pyplot as plt
from music21 import converter, note, chord, stream, tempo
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.loader import DataLoader  # Importing DataLoader from torch_geometric.data is deprecated
from torch_geometric.nn import GCNConv, global_mean_pool
import networkx as nx
import pandas as pd

# Load MAESTRO metadata (adjust the path to where the dataset is stored)
with open("maestro-v3.0.0/maestro-v3.0.0.json") as f:
    maestro_metadata = json.load(f)

maestro_metadata_df = pd.DataFrame.from_dict(maestro_metadata)
print(maestro_metadata_df.head())

# Select a subset of MIDI files for quick experimentation
midi_files = list(maestro_metadata_df['midi_filename'])[330:350]
print(midi_files)

   canonical_composer                canonical_title       split  year  \
0          Alban Berg                   Sonata Op. 1       train  2018   
1          Alban Berg                   Sonata Op. 1       train  2008   
2          Alban Berg                   Sonata Op. 1       train  2017   
3  Alexander Scriabin  24 Preludes Op. 11, No. 13-24       train  2004   
4  Alexander Scriabin               3 Etudes, Op. 65  validation  2006   

                                       midi_filename  \
0  2018/MIDI-Unprocessed_Chamber3_MID--AUDIO_10_R...   
1  2008/MIDI-Unprocessed_03_R2_2008_01-03_ORIG_MI...   
2  2017/MIDI-Unprocessed_066_PIANO066_MID--AUDIO-...   
3  2004/MIDI-Unprocessed_XP_21_R1_2004_01_ORIG_MI...   
4  2006/MIDI-Unprocessed_17_R1_2006_01-06_ORIG_MI...   

                                      audio_filename    duration  
0  2018/MIDI-Unprocessed_Chamber3_MID--AUDIO_10_R...  698.661160  
1  2008/MIDI-Unprocessed_03_R2_2008_01-03_ORIG_MI...  759.518471  
2  2017/MIDI-Unpr

Another inspection: the first file path in the chosen subset.

In [14]:
print(midi_files[0])

2008/MIDI-Unprocessed_07_R2_2008_01-05_ORIG_MID--AUDIO_07_R2_2008_wav--1.midi


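Converting MIDI files to graphs with this function. Each note or chord becomes a node carrying its pitch, duration and offset, and consecutive events are joined by an edge weighted by the gap between their offsets.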

In [15]:
def midi_to_graph(midi_path):
    midi_data = converter.parse(midi_path)
    notes = []
    durations = []
    offsets = []

    for element in midi_data.flatten().notes:
        if isinstance(element, note.Note):
            notes.append(element.pitch.midi)  # Single note: one MIDI pitch number
        elif isinstance(element, chord.Chord):
            notes.append([p.midi for p in element.pitches])  # Chord: list of MIDI pitch numbers
        else:
            continue  # Skip anything that is neither a note nor a chord
        # music21 may return Fractions here; cast to float so torch.tensor can consume them later
        durations.append(float(element.quarterLength))
        offsets.append(float(element.offset))

    # Build a chain graph using NetworkX: one node per event, in order of appearance
    G = nx.Graph()
    for i, (n, d, o) in enumerate(zip(notes, durations, offsets)):
        G.add_node(i, pitch=n, duration=d, offset=o)

        # Link each event to the previous one, weighted by the gap between their offsets
        if i > 0:
            G.add_edge(i - 1, i, weight=abs(offsets[i] - offsets[i - 1]))

    return G

# Convert all MIDI files to graphs
graphs = [midi_to_graph("maestro-v3.0.0/" + midi_file) for midi_file in midi_files]
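
The construction above yields a simple chain: node i holds one note or chord, and the only edges join consecutive events. As a minimal sketch of that structure, the snippet below rebuilds it from hand-written events using plain dictionaries instead of NetworkX, so it runs without any dependencies. The `events` values and the `build_chain` helper are made up for illustration and are not part of the project code.

```python
# Toy events as (pitch, duration, offset); a list of pitches stands for a chord.
events = [
    (60, 1.0, 0.0),
    (64, 0.5, 1.0),
    ([60, 64, 67], 2.0, 1.5),
    (72, 1.0, 3.5),
]


def build_chain(events):
    """Mirror the structure built by midi_to_graph: one node per event, edges between consecutive events."""
    nodes = {}
    edges = {}
    for i, (pitch, duration, offset) in enumerate(events):
        nodes[i] = {"pitch": pitch, "duration": duration, "offset": offset}
        if i > 0:
            # Edge weight is the time gap between consecutive offsets.
            edges[(i - 1, i)] = abs(offset - events[i - 1][2])
    return nodes, edges


nodes, edges = build_chain(events)
print(len(nodes), "nodes,", len(edges), "edges")
for (u, v), weight in edges.items():
    print(f"{u} -- {v}: gap {weight}")
```

Because every node has at most two neighbours, the graph carries the same information as the ordered list of events; the edges add only the time gaps.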


Extracting melodic intervals (the pitch difference, in semitones, between each single note and the event before it) from the graph.

In [80]:
def extract_melodic_patterns(G):
    melodic_patterns = []
    prev_pitch = None  # Track previous pitch for interval calculation

    for node in G.nodes:
        pitch = G.nodes[node]['pitch']

        if isinstance(pitch, int):  # Single note
            if prev_pitch is not None:  # If there's a previous note to compare with
                melodic_patterns.append(pitch - prev_pitch)  # Store interval between consecutive notes
            prev_pitch = pitch  # Update the previous pitch for the next iteration

        # A chord (a list of pitches) records no interval of its own; its average pitch
        # becomes the reference for the next single note, so that interval may be fractional
        elif isinstance(pitch, list):
            prev_pitch = sum(pitch) / len(pitch)

    return melodic_patterns
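
To make the interval rule concrete, here is a small worked example on a plain list of pitches rather than a graph. The `melodic_intervals` helper is a hypothetical stand-in that applies the same rule as `extract_melodic_patterns`; the pitch values are invented.

```python
def melodic_intervals(pitches):
    """Apply the same rule as extract_melodic_patterns to a plain list of pitches."""
    intervals = []
    prev_pitch = None
    for pitch in pitches:
        if isinstance(pitch, int):
            if prev_pitch is not None:
                intervals.append(pitch - prev_pitch)
            prev_pitch = pitch
        elif isinstance(pitch, list):
            # A chord records no interval itself; it only resets the reference pitch.
            prev_pitch = sum(pitch) / len(pitch)
    return intervals


# C4, E4, a chord whose mean pitch is 64.0, then C5
result = melodic_intervals([60, 64, [60, 64, 68], 72])
print(result)  # [4, 8.0]
```

Four events give only two intervals: the chord contributes none, and the interval after it is a float because it is measured from the chord's mean pitch.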

Extracting rhythmic patterns: the duration of every node, in quarter lengths.

In [None]:
def extract_rhythmic_patterns(G):
    rhythmic_patterns = []
    for node in G.nodes:
        duration = G.nodes[node]['duration']

        # If the duration is a tuple, return the average
        if isinstance(duration, tuple):
            duration = sum(duration) / len(duration)

        rhythmic_patterns.append(duration)

    return rhythmic_patterns

Extracting harmonic patterns: every chord node, stored as a sorted list of MIDI pitches in order of appearance.

In [None]:
def extract_harmonic_patterns(G):
    harmonic_patterns = []
    for node in G.nodes:
        pitch = G.nodes[node]['pitch']
        if isinstance(pitch, list):  # If it's a chord
            harmonic_patterns.append(sorted(pitch))  # Add sorted chord
    return harmonic_patterns

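Extracting chord progressions based on graph connectivity: chord nodes are kept as they are, and single notes are grouped with their neighbouring single notes into three-note sets.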

In [81]:
def extract_chord_progressions(G):
    chords = []
    for node in G.nodes:
        pitch = G.nodes[node]['pitch']
        if isinstance(pitch, list):  # If it's already a chord
            chords.append(sorted(pitch))  # Add sorted pitches
        else:
            # Group a single note with its single-note neighbours. Nodes in this chain graph
            # have at most two neighbours, so the note itself supplies the third pitch.
            neighbors = list(G.neighbors(node))
            if len(neighbors) >= 2:
                triad = sorted(
                    [pitch] + [G.nodes[neighbor]['pitch'] for neighbor in neighbors if isinstance(G.nodes[neighbor]['pitch'], int)]
                )[:3]  # Limit to three pitches
                if len(triad) == 3:
                    chords.append(triad)
    return chords


Converting a graph into a PyTorch Geometric Data object.

This extracts the node features introduced earlier ('pitch', 'duration', 'offset') and the edge attribute ('weight') from the graph, and returns them as tensors in the layout PyTorch Geometric expects. A chord is represented by the mean of its pitches.

In [102]:
def graph_to_data(G):
    x = []
    edge_index = []
    edge_attr = []

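    for node, data in G.nodes(data=True):
        pitch = data['pitch']
        # A chord is collapsed to the mean of its pitches so every node has a single pitch feature
        mean_pitch = pitch if isinstance(pitch, int) else sum(pitch) / len(pitch)
        x.append([float(mean_pitch), float(data['duration']), float(data['offset'])])

    # Each edge is stored once; add the reversed pairs too if messages should flow both ways
    for u, v, attrs in G.edges(data=True):
        edge_index.append([u, v])
        edge_attr.append([float(attrs['weight'])])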

    x = torch.tensor(x, dtype=torch.float)
    edge_index = torch.tensor(edge_index, dtype=torch.long).t().contiguous()
    edge_attr = torch.tensor(edge_attr, dtype=torch.float)

    return Data(x=x, edge_index=edge_index, edge_attr=edge_attr)

# Convert all graphs to PyTorch Geometric Data objects
data_list = [graph_to_data(G) for G in graphs]
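
The layout of the three tensors is easier to see on a tiny example. The sketch below builds the same nested lists that `graph_to_data` passes to `torch.tensor`, but in plain Python so it runs without PyTorch. The `chain_to_arrays` helper and the toy `nodes` and `edges` are assumptions made for illustration.

```python
def chain_to_arrays(nodes, edges):
    """Build the feature rows and the 2 x E edge layout that graph_to_data turns into tensors."""
    x = []
    for attrs in nodes:
        pitch = attrs["pitch"]
        # A chord is reduced to the mean of its pitches.
        mean_pitch = pitch if isinstance(pitch, int) else sum(pitch) / len(pitch)
        x.append([float(mean_pitch), float(attrs["duration"]), float(attrs["offset"])])
    # Transpose [(u, v), ...] into [[u, ...], [v, ...]], the plain-list analogue of .t().
    edge_index = [list(row) for row in zip(*[pair for pair, _ in edges])]
    edge_attr = [[weight] for _, weight in edges]
    return x, edge_index, edge_attr


nodes = [
    {"pitch": 60, "duration": 1.0, "offset": 0.0},
    {"pitch": [60, 64, 68], "duration": 2.0, "offset": 1.0},
    {"pitch": 72, "duration": 0.5, "offset": 3.0},
    {"pitch": 71, "duration": 0.5, "offset": 3.5},
]
edges = [((0, 1), 1.0), ((1, 2), 2.0), ((2, 3), 0.5)]

x, edge_index, edge_attr = chain_to_arrays(nodes, edges)
print(x)           # one row of [pitch, duration, offset] per node
print(edge_index)  # first row: source nodes, second row: target nodes
print(edge_attr)   # one weight per edge
```

The node features form an N x 3 table, the edge index a 2 x E table, and the edge attributes an E x 1 table, which are the shapes `Data` expects for `x`, `edge_index` and `edge_attr`.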


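A Graph Neural Network (GNN) for music analysis and generation.

It uses two Graph Convolutional Network (GCN) layers to process the music graphs, followed by global mean pooling and a linear layer that outputs two values per graph (pitch and duration).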


In [109]:
class MusicGNN(torch.nn.Module):
    def __init__(self, hidden_channels):
        super(MusicGNN, self).__init__()
        self.conv1 = GCNConv(3, hidden_channels)  # Input size: 3 (pitch, duration, offset)
        self.conv2 = GCNConv(hidden_channels, hidden_channels)
        self.lin = torch.nn.Linear(hidden_channels, 2)  # Output size: 2 (pitch, duration)

    def forward(self, data):
        x, edge_index = data.x, data.edge_index

        # Graph Convolutions
        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = self.conv2(x, edge_index)

        # Global mean pooling
        x = global_mean_pool(x, data.batch)

        # Fully connected layer for final output (2 values: pitch, duration)
        return self.lin(x)


# Instantiate the model, optimizer, and define the device
model = MusicGNN(hidden_channels=64)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
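
The pooling step is what turns per-node embeddings into a single prediction per graph: `global_mean_pool` averages the rows that belong to the same graph, as indicated by the batch vector. A dependency-free sketch of that averaging is shown below; the `mean_pool` helper and the numbers are illustrative only and assume graph indices run from 0 with no gaps.

```python
def mean_pool(node_embeddings, batch):
    """Average node embeddings per graph; batch[i] is the graph index of node i."""
    num_graphs = max(batch) + 1
    dim = len(node_embeddings[0])
    sums = [[0.0] * dim for _ in range(num_graphs)]
    counts = [0] * num_graphs
    for embedding, graph_id in zip(node_embeddings, batch):
        counts[graph_id] += 1
        for k, value in enumerate(embedding):
            sums[graph_id][k] += value
    return [[s / counts[g] for s in sums[g]] for g in range(num_graphs)]


# Three nodes: the first two belong to graph 0, the last one to graph 1.
embeddings = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
batch = [0, 0, 1]
pooled = mean_pool(embeddings, batch)
print(pooled)  # [[2.0, 3.0], [10.0, 20.0]]
```

However many notes a piece has, it is reduced to one vector of size `hidden_channels`, so the final linear layer produces one (pitch, duration) pair per graph rather than one per note.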


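Training the model with a DataLoader to batch the data. The target is a placeholder tensor of ones, so the falling loss shows only that the model learns to output values close to 1 for every graph.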

In [110]:
# Create a DataLoader
data_loader = DataLoader(data_list, batch_size=2, shuffle=True)

# Training loop
model.train()
for epoch in range(15):
    total_loss = 0
    for data in data_loader:
        data = data.to(device)
        optimizer.zero_grad()
        out = model(data)
        loss = F.mse_loss(out, torch.ones_like(out))  # Placeholder target: all ones, same shape and device as out
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f'Epoch {epoch+1}, Loss: {total_loss/len(data_loader)}')


Epoch 1, Loss: 848.2954544067383
Epoch 2, Loss: 189.3859145283699
Epoch 3, Loss: 48.897486650943755
Epoch 4, Loss: 18.25457297563553
Epoch 5, Loss: 3.2637901708483694
Epoch 6, Loss: 1.4552348479628563
Epoch 7, Loss: 0.8213270887732506
Epoch 8, Loss: 0.2000082464888692
Epoch 9, Loss: 0.05332934651523828
Epoch 10, Loss: 0.056991237495094535
Epoch 11, Loss: 0.06166550638154149
Epoch 12, Loss: 0.04691046047955751
Epoch 13, Loss: 0.026401508320122957
Epoch 14, Loss: 0.021027505211532117
Epoch 15, Loss: 0.017016294179484248


Generating music from a seed graph: notes are added until the running duration reaches the target, and note lengths are scaled down if the total overshoots it.

In [111]:
def generate_music(model, seed_graph, target_duration=30):
    model.eval()
    generated_stream = stream.Stream()

    # Melodic intervals of the seed graph steer the generated pitches
    melodic_patterns = extract_melodic_patterns(seed_graph)
    if not melodic_patterns:
        raise ValueError("seed_graph yields no melodic intervals")

    total_duration = 0.0  # Running length in seconds
    quarter_length_to_seconds = 0.5  # Assumes a fixed tempo: one quarter note lasts 0.5 seconds

    # The seed graph never changes, so a single forward pass (without gradients) is enough
    seed_data = graph_to_data(seed_graph).to(device)
    with torch.no_grad():
        predicted_pitch, predicted_duration = model(seed_data)[0].tolist()

    # Assumption: floor the duration at a sixteenth note so a non-positive prediction cannot stall the loop
    note_duration = max(predicted_duration, 0.25)

    events = []  # (MIDI pitch, quarter length) pairs
    while total_duration < target_duration:
        # Cycle through the seed's intervals, indexed by elapsed whole seconds
        index = int(total_duration) % len(melodic_patterns)
        # Shift the predicted pitch by the interval, then round and clamp to the MIDI range 0-127
        next_pitch = min(127, max(0, round(predicted_pitch + melodic_patterns[index])))
        events.append((next_pitch, note_duration))
        total_duration += note_duration * quarter_length_to_seconds

    # Scale every note if the last one overshot, so the piece ends exactly at the target duration
    scale_factor = target_duration / total_duration if total_duration > target_duration else 1.0
    for pitch, quarter_length in events:
        generated_stream.append(note.Note(pitch, quarterLength=quarter_length * scale_factor))

    return generated_stream
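
The duration bookkeeping in `generate_music` (accumulate note lengths at an assumed 0.5 seconds per quarter note, then scale if the total overshoots the target) can be checked in isolation. The sketch below does that arithmetic on a plain list of quarter lengths; `fit_to_target` and the input values are hypothetical and exist only for this example.

```python
def fit_to_target(quarter_lengths, target_seconds, seconds_per_quarter=0.5):
    """Take notes until the target is reached, then scale them to end exactly on it."""
    taken = []
    total = 0.0
    for length in quarter_lengths:
        if total >= target_seconds:
            break
        taken.append(length)
        total += length * seconds_per_quarter
    if total > target_seconds:
        # The last note overshot: shrink every note by the same factor.
        scale = target_seconds / total
        taken = [length * scale for length in taken]
    return taken


fitted = fit_to_target([1.0, 2.0, 1.0, 4.0, 1.0], target_seconds=3.0)
print(fitted)             # [0.75, 1.5, 0.75, 3.0]
print(sum(fitted) * 0.5)  # 3.0
```

Here the first four notes add up to 4.0 seconds, so each is multiplied by 3.0 / 4.0 = 0.75 and the fifth note is never used. Scaling keeps the rhythm's proportions but changes every note's length, which is a different trade-off from simply truncating the last note.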


A few sample outputs generated from specific seed graphs.

In [122]:
generated_music = generate_music(model, graphs[4], target_duration=5)
generated_music.show('midi')

In [116]:
generated_music = generate_music(model, graphs[8], target_duration=10)
generated_music.show('midi')

**:D ------THANK YOU!------ :D**