# TECHIN 509: Melody Representation Assignment

**Name:** Rushav  
**Date:** 10/23/25

---

## Part 1: Storing a Sequence of Notes (C D E F G)

### My Choice: Using a List of Strings

I chose to represent a melody as a **list of strings**, where each string represents a note.
This follows the note format defined in the README.md: `[Note][Accidental][Octave]-[Duration]`

In [1]:
# Example: Storing the melody C D E F G
melody = ["C4-q", "D4-q", "E4-q", "F4-q", "G4-q"]

print("My melody:", melody)
print("First note:", melody[0])
print("Last note:", melody[-1])

My melody: ['C4-q', 'D4-q', 'E4-q', 'F4-q', 'G4-q']
First note: C4-q
Last note: G4-q


### Justification

**Answering the guiding questions:**

1. **If a melody is like a sentence, what would each word be?**
   - Each individual note would be a "word" - just like how a sentence is made up of words in order, a melody is made up of notes in order.

2. **Which Python type keeps items in order?**
   - A **list** keeps items in the order they were added.
   - Order matters a lot in music: the same notes in a different order make a different melody.

3. **How to group several melodies together?**
   - I can use a **list of lists** (one inner list per melody) to group multiple melodies:

In [3]:
# Grouping multiple melodies together
melodies = [
    ["C4-q", "D4-q", "E4-q", "F4-q", "G4-q"],   # melody 1
    ["G4-q", "E4-q", "C4-q"],                   # melody 2
    ["C4-h", "E4-h", "G4-w"]                    # melody 3
]

print("Collection of melodies:")
for i, mel in enumerate(melodies, 1):
    print(f"  Melody {i}: {mel}")

Collection of melodies:
  Melody 1: ['C4-q', 'D4-q', 'E4-q', 'F4-q', 'G4-q']
  Melody 2: ['G4-q', 'E4-q', 'C4-q']
  Melody 3: ['C4-h', 'E4-h', 'G4-w']


**Why strings for individual notes?**
- The note representation format (`"C4-q"`) contains multiple pieces of information in one compact form:
  - Note letter (C)
  - Octave (4)
  - Duration (q = quarter note)
- Strings are easy to read and debug
- Because the format is standardized, I can later split or parse the string to extract specific information when needed
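
To illustrate the parsing idea, here is a minimal sketch that splits a note string into its parts. The function name `parse_note`, the regular expression, and the accidental symbols (`#` and `b`) are my own assumptions; the README may define accidentals differently.

```python
import re

# Assumed pattern for "[Note][Accidental][Octave]-[Duration]".
# The accidental symbols (# and b) are an assumption, not taken from the README.
NOTE_PATTERN = re.compile(r"^([A-G])([#b]?)(\d)-([a-z]+)$")

def parse_note(note):
    """Split a note string such as 'C4-q' into its parts."""
    match = NOTE_PATTERN.match(note)
    if match is None:
        raise ValueError(f"Not a valid note string: {note!r}")
    letter, accidental, octave, duration = match.groups()
    return {
        "letter": letter,
        "accidental": accidental,  # empty string when there is none
        "octave": int(octave),
        "duration": duration,
    }

print(parse_note("C4-q"))
print(parse_note("F#5-h"))
```

Returning a dictionary keeps each piece of information labeled, so later code can ask for `["duration"]` instead of remembering positions in the string.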

---

## Part 2: Storing Training Data

### My Preferred Approach: Turn it into one big list

When given multiple melodies for training, I would **flatten them into a single list** of all notes. This makes it easier to analyze patterns across the entire dataset.

In [4]:
# Starting with multiple melodies
melody1 = ["C4-q", "D4-q", "E4-q", "F4-q", "G4-q"]
melody2 = ["G4-q", "E4-q", "C4-q"]
melody3 = ["C4-h", "E4-h", "G4-w"]

print("Original melodies:")
print("  Melody 1:", melody1)
print("  Melody 2:", melody2)
print("  Melody 3:", melody3)

Original melodies:
  Melody 1: ['C4-q', 'D4-q', 'E4-q', 'F4-q', 'G4-q']
  Melody 2: ['G4-q', 'E4-q', 'C4-q']
  Melody 3: ['C4-h', 'E4-h', 'G4-w']


In [8]:
# Option 1: Start with empty list and use extend()
all_notes = []
all_notes.extend(melody1)
all_notes.extend(melody2)
all_notes.extend(melody3)

print("Flattened using extend():")
print(all_notes)

Flattened using extend():
['C4-q', 'D4-q', 'E4-q', 'F4-q', 'G4-q', 'G4-q', 'E4-q', 'C4-q', 'C4-h', 'E4-h', 'G4-w']


In [11]:
# Option 2: Use list concatenation with + operator
all_notes_concat = melody1 + melody2 + melody3

print("\nFlattened using concatenation:")
print(all_notes_concat)

# Verify they're the same
print("\nAre both methods equal:", all_notes == all_notes_concat)


Flattened using concatenation:
['C4-q', 'D4-q', 'E4-q', 'F4-q', 'G4-q', 'G4-q', 'E4-q', 'C4-q', 'C4-h', 'E4-h', 'G4-w']

Are both methods equal: True


### Justification

**Answering the guiding questions:**

1. **How can I "flatten" a collection?**
   - By combining all the individual lists into one continuous sequence using `.extend()` or `+` operator

2. **Can I start with an empty list and add notes?**
   - Yes. The `.extend()` method adds all items from one list to the end of another

3. **What tool combines lists?**
   - The `+` operator concatenates lists
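   - The `.extend()` method adds the items of one list to an existing list
   - A loop that appends each note one by one (this also works, but it is more verbose and usually slower than `.extend()`)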
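
When the melodies are already grouped in a list of lists (as in Part 1), there can be any number of them, so naming each one by hand does not scale. Two other ways to flatten that structure are sketched here; the variable names `flat_comprehension` and `flat_chain` are only illustrative.

```python
from itertools import chain

melodies = [
    ["C4-q", "D4-q", "E4-q", "F4-q", "G4-q"],
    ["G4-q", "E4-q", "C4-q"],
    ["C4-h", "E4-h", "G4-w"],
]

# Option A: nested list comprehension (outer loop first, then inner loop)
flat_comprehension = [note for melody in melodies for note in melody]

# Option B: itertools.chain.from_iterable from the standard library
flat_chain = list(chain.from_iterable(melodies))

print("Same result:", flat_comprehension == flat_chain)
print("Total notes:", len(flat_chain))
```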

**Why flatten the data?**
- Makes it easier to analyze patterns across ALL melodies at once
- I can count how often each note appears in the entire dataset
- I can see which notes tend to follow which other notes across all examples
- Simpler to work with for later analysis
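
One caveat to keep in mind with flattening: joining melodies end to end also creates note pairs that span two different melodies and that nobody actually composed. The sketch below compares pairs from the flattened list against pairs collected melody by melody. The helper `adjacent_pairs` is a hypothetical name, and the simple "not in" comparison only works here because no boundary pair also occurs inside a melody.

```python
melodies = [
    ["C4-q", "D4-q", "E4-q", "F4-q", "G4-q"],
    ["G4-q", "E4-q", "C4-q"],
    ["C4-h", "E4-h", "G4-w"],
]

def adjacent_pairs(notes):
    """Return (current, next) pairs for one sequence of notes."""
    return list(zip(notes, notes[1:]))

# Pairs taken from the single flattened list
flat = [note for melody in melodies for note in melody]
pairs_flat = adjacent_pairs(flat)

# Pairs taken from each melody separately, then combined
pairs_per_melody = []
for melody in melodies:
    pairs_per_melody.extend(adjacent_pairs(melody))

# Pairs that exist only because two melodies were joined end to end
boundary_pairs = [pair for pair in pairs_flat if pair not in pairs_per_melody]
print("Pairs from flattened list:", len(pairs_flat))
print("Pairs per melody:", len(pairs_per_melody))
print("Boundary pairs:", boundary_pairs)
```

For counting how often each note appears, the flattened list is fine. For "what follows what", collecting pairs per melody avoids the boundary pairs.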

---

## Part 3: Information to Extract for Training

### What I Want to Learn from the Training Melodies

To make my music composer generate realistic-sounding melodies, I need to extract several types of information from the training data:

In [None]:
# Sample training data (the flattened list from Part 2)
all_notes = ["C4-q", "D4-q", "E4-q", "F4-q", "G4-q", "G4-q", "E4-q", "C4-q", "C4-h", "E4-h", "G4-w"]

print("Training data:", all_notes)
print("Total notes:", len(all_notes))

Training data: ['C4-q', 'D4-q', 'E4-q', 'F4-q', 'G4-q', 'G4-q', 'E4-q', 'C4-q', 'C4-h', 'E4-h', 'G4-w']
Total notes: 11


### 1. Note Frequency - How often does each note appear?

In [12]:
# Count how many times each note appears
note_counts = {}

for note in all_notes:
    if note in note_counts:
        note_counts[note] += 1
    else:
        note_counts[note] = 1

print("Note frequencies:")
for note, count in note_counts.items():
    print(f"  {note}: {count} times")

Note frequencies:
  C4-q: 2 times
  D4-q: 1 times
  E4-q: 2 times
  F4-q: 1 times
  G4-q: 2 times
  C4-h: 1 times
  E4-h: 1 times
  G4-w: 1 times


### 2. Note Transitions - What note typically follows another?

In [14]:
# Track what note comes after each note
# I used ChatGPT to help me approach this part of the problem and then I modified the code to fit my understanding better
transitions = {}

for i in range(len(all_notes) - 1):
    current_note = all_notes[i]
    next_note = all_notes[i + 1]
    
    if current_note not in transitions:
        transitions[current_note] = []
    transitions[current_note].append(next_note)

print("Note transitions (what comes after each note):")
for note, following_notes in transitions.items():
    print(f"  After {note}: {following_notes}")

Note transitions (what comes after each note):
  After C4-q: ['D4-q', 'C4-h']
  After D4-q: ['E4-q']
  After E4-q: ['F4-q', 'C4-q']
  After F4-q: ['G4-q']
  After G4-q: ['G4-q', 'E4-q']
  After C4-h: ['E4-h']
  After E4-h: ['G4-w']


**Why this matters:**
- Transitions are the key information for generating natural-sounding melodies, since real melodies tend to move between notes in recurring patterns rather than at random
- Instead of picking random notes, we can pick notes that typically follow the current note to make it sound more natural
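
As a rough sketch of how the transitions could drive note selection: because each follower is stored once per occurrence, `random.choice` on the list already favors followers that appeared more often. The function `generate_melody` and its parameters are hypothetical names, not part of the assignment.

```python
import random

all_notes = ["C4-q", "D4-q", "E4-q", "F4-q", "G4-q", "G4-q",
             "E4-q", "C4-q", "C4-h", "E4-h", "G4-w"]

# Build the same transitions dictionary as above, using zip for the pairs
transitions = {}
for current_note, next_note in zip(all_notes, all_notes[1:]):
    transitions.setdefault(current_note, []).append(next_note)

def generate_melody(start_note, length, transitions, rng):
    """Pick each next note from the notes that followed the current one in training."""
    melody = [start_note]
    while len(melody) < length:
        options = transitions.get(melody[-1])
        if not options:  # nothing ever followed this note, so stop early
            break
        melody.append(rng.choice(options))
    return melody

rng = random.Random(509)  # fixed seed so the result is repeatable
print(generate_melody("C4-q", 6, transitions, rng))
```

The early stop matters for notes such as `G4-w`, which only appears at the very end of the training data and therefore has no recorded follower.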

### 3. Starting and Ending Notes

In [15]:
# What notes do melodies start and end with?

melody1 = ["C4-q", "D4-q", "E4-q", "F4-q", "G4-q"]
melody2 = ["G4-q", "E4-q", "C4-q"]
melody3 = ["C4-h", "E4-h", "G4-w"]

melodies_list = [melody1, melody2, melody3]

starting_notes = []
ending_notes = []

for melody in melodies_list:
    starting_notes.append(melody[0])
    ending_notes.append(melody[-1])

print("Starting notes:", starting_notes)
print("Ending notes:", ending_notes)

# Count most common starting/ending notes
from collections import Counter
print("\nMost common starting notes:", Counter(starting_notes))
print("Most common ending notes:", Counter(ending_notes))

Starting notes: ['C4-q', 'G4-q', 'C4-h']
Ending notes: ['G4-q', 'C4-q', 'G4-w']

Most common starting notes: Counter({'C4-q': 1, 'G4-q': 1, 'C4-h': 1})
Most common ending notes: Counter({'G4-q': 1, 'C4-q': 1, 'G4-w': 1})


**Why this matters:**
- Melodies often start and end on specific notes (like the root note of the key)
- In my README, I mentioned ending on the root note or the fifth for resolution
- This helps create satisfying beginnings and endings
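
In the output above every count is 1 because `C4-q` and `C4-h` are treated as different notes. If the question is which pitch melodies start and end on, one option is to drop the duration before counting. This is a sketch; `pitch_of` is a hypothetical helper name.

```python
from collections import Counter

melodies = [
    ["C4-q", "D4-q", "E4-q", "F4-q", "G4-q"],
    ["G4-q", "E4-q", "C4-q"],
    ["C4-h", "E4-h", "G4-w"],
]

def pitch_of(note):
    """Drop the duration part: 'C4-q' becomes 'C4'."""
    return note.split("-")[0]

# Count starting and ending pitches, ignoring duration
start_pitches = Counter(pitch_of(melody[0]) for melody in melodies)
end_pitches = Counter(pitch_of(melody[-1]) for melody in melodies)

print("Starting pitches:", start_pitches.most_common())
print("Ending pitches:", end_pitches.most_common())
```

With the duration removed, C4 starts two of the three melodies and G4 ends two of them, which is closer to the root-or-fifth idea than the all-ones counts.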

### 4. Duration Patterns

In [16]:
# Extract just the durations from notes
durations = []

for note in all_notes:
    # Split by '-' and get the duration part
    duration = note.split('-')[1]
    durations.append(duration)

print("Duration sequence:", durations)

# Count duration frequencies
duration_counts = {}
for dur in durations:
    if dur in duration_counts:
        duration_counts[dur] += 1
    else:
        duration_counts[dur] = 1

print("\nDuration frequencies:")
for dur, count in duration_counts.items():
    print(f"  {dur}: {count} times")

Duration sequence: ['q', 'q', 'q', 'q', 'q', 'q', 'q', 'q', 'h', 'h', 'w']

Duration frequencies:
  q: 8 times
  h: 2 times
  w: 1 times


**Why this matters:**
- Helps create realistic rhythm patterns
- Some durations might be more common than others (e.g., quarter notes)
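
Duration codes can also be turned into numbers, for example to check how long each melody is. In this sketch only `q` = quarter note is stated in this notebook; reading `h` as a half note and `w` as a whole note, and counting a quarter note as one beat, are my assumptions.

```python
# Assumed beat values with a quarter note as one beat.
# Only "q = quarter note" is stated above; "h" and "w" are assumed to be half and whole notes.
BEATS = {"q": 1.0, "h": 2.0, "w": 4.0}

def melody_length_in_beats(melody):
    """Add up the beat value of every note in the melody."""
    return sum(BEATS[note.split("-")[1]] for note in melody)

melodies = [
    ["C4-q", "D4-q", "E4-q", "F4-q", "G4-q"],
    ["G4-q", "E4-q", "C4-q"],
    ["C4-h", "E4-h", "G4-w"],
]

for i, melody in enumerate(melodies, 1):
    print(f"  Melody {i}: {melody_length_in_beats(melody)} beats")
```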

### Summary: Key Information to Extract

1. **Note Frequency** - Which notes appear most often?
2. **Note Transitions** - After note X, what typically comes next?
3. **Starting/Ending Notes** - How do melodies begin and end?
4. **Duration Patterns** - What rhythms are common?
5. **Common Sequences** - Are there repeated patterns?
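
Item 5 is not demonstrated in the cells above, so here is a minimal sketch of one way to look for repeated patterns: count every run of `n` consecutive notes. The function `count_ngrams` is a hypothetical name, and the example melody is made up for illustration because the training melodies above are too short to contain a repeated three-note pattern.

```python
from collections import Counter

def count_ngrams(notes, n):
    """Count every run of n consecutive notes in the sequence."""
    return Counter(tuple(notes[i:i + n]) for i in range(len(notes) - n + 1))

# Hypothetical melody, made up for illustration (not from the training data)
melody = ["C4-q", "E4-q", "G4-q", "C4-q", "E4-q", "G4-q", "E4-q"]

trigrams = count_ngrams(melody, 3)
for pattern, count in trigrams.most_common():
    print(f"  {pattern}: {count} times")
```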

### How This Helps My Music Composer

- **For randomness with rules**: Instead of picking any note at random, I can pick a random note that is likely to follow the previous note based on the training data
- **Sounds more musical**: The generated melody will have patterns similar to real music
- **Respects the key signature**: If I train on C major melodies, the composer learns which notes and patterns a C major melody tends to use
- **Creates natural rhythm**: Duration patterns from real music make the rhythm feel authentic