# Articulation Synthesis for Stringed Instruments
#### By: Shresta Bangaru

##### **Goal:** Synthesize various articulations (vibrato, staccato, etc.) for a note played on a stringed instrument (e.g. violin). Compare the quality of the sound generated by pre-existing libraries against code that I have written myself using techniques from the class. I will also work with samples that I recorded myself, in order to determine how effective each technique is for instruments of a specific quality.

The articulations we will focus on are: vibrato, staccato, legato, and accent. (Articulations such as pizzicato require plucking, and those such as slurs require multiple notes, so they are different in a technical sense.) 

We will focus on violin notes. Because the viola and cello are constructed similarly, the results of this project should apply to them as well.

##### **Why?** Articulation synthesis can let applications such as GarageBand seamlessly integrate a wider range of stringed-instrument techniques. Perfecting and evaluating articulation synthesis is also useful for applications that read and play back existing repertoire, especially for student and professional musicians who compose their own pieces or sight-read pieces that they want to play.

##### **Inspiration:** I am a violinist in the Illini Strings. I have been playing the instrument for ~12 years. I love music and ever since I began my CS major at U of I, I've been meaning to work on an audio computing project related to my passion for the violin. 

### **Part 1: Articulation sound file generation**

### Step 1: Use a pre-existing library to synthesize articulations

In [None]:
# !pip install librosa

Currently using an online audio file that plays the open A string on the violin. 

In [None]:
!wget https://github.com/shresta4/CS448-files/raw/main/Final_Proj/violin_a.mp3

--2023-05-07 03:54:37--  https://github.com/shresta4/CS448-files/raw/main/Final_Proj/violin_a.mp3
Resolving github.com (github.com)... 140.82.113.3
Connecting to github.com (github.com)|140.82.113.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/shresta4/CS448-files/main/Final_Proj/violin_a.mp3 [following]
--2023-05-07 03:54:37--  https://raw.githubusercontent.com/shresta4/CS448-files/main/Final_Proj/violin_a.mp3
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1509805 (1.4M) [audio/mpeg]
Saving to: ‘violin_a.mp3’


2023-05-07 03:54:37 (25.9 MB/s) - ‘violin_a.mp3’ saved [1509805/1509805]



Vibrato 

In [None]:
import librosa
import soundfile as sf
import numpy as np
import matplotlib.pyplot as plt

def create_vibrato(filename, out_filename): 
  # Load the audio file
  y, sr = librosa.load(filename)

  # Set the vibrato parameters
  vibrato_rate = 3 # Hz
  vibrato_depth = 0.0005 # seconds

  # Generate the vibrato effect
  vibrato = np.sin(2 * np.pi * vibrato_rate * np.arange(len(y))/sr)
  vibrato *= vibrato_depth * sr
  y_vibrato = np.interp(np.arange(len(y)), np.arange(len(y))+vibrato, y)

  # Save the output file
  sf.write(out_filename, y_vibrato, sr)

In [None]:
create_vibrato("violin_a.mp3", "violin_a_vibrato.mp3")

Legato

In [None]:
def create_legato(filename, out_filename): 
  # Load the input file using librosa
  y, sr = librosa.load(filename, sr=44100)

  # Compute the envelope of the sound waveform
  mfccs = librosa.feature.mfcc(y=y, sr=sr)
  env = librosa.amplitude_to_db(np.mean(mfccs, axis=0), ref=np.max)

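  # Stretch the frame-level envelope to one value per sample by interpolating
  # between frame positions (zero-padding would leave most samples unscaled)
  frame_pos = librosa.frames_to_samples(np.arange(len(env)))
  env = np.interp(np.arange(len(y)), frame_pos, env)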

  # Apply the envelope to the sound waveform
  y_legato = y * librosa.db_to_amplitude(env)

  # Save the output waveform to a file using soundfile
  sf.write(out_filename, y_legato, sr)

Source: https://www.ee.columbia.edu/~dpwe/LabROSA/doc/HTKBook21/node51.html

In [None]:
create_legato("violin_a.mp3", "violin_a_legato.mp3")

Staccato

In [None]:
def create_staccato(filename, out_filename): 
  # Load the input file using librosa
  y, sr = librosa.load(filename, sr=44100)

  # Compute the onset strength envelope
  onset_env = librosa.onset.onset_strength(y=y, sr=sr)

  # Identify the onset time points
  onset_frames = librosa.onset.onset_detect(y=y, sr=sr, onset_envelope=onset_env, backtrack=False)

  # Set the duration of each staccato note (in seconds)
  note_duration = 0.2

  # Initialize the staccato waveform
  y_staccato = np.zeros_like(y)

  # Iterate over the onset time points and replace the corresponding waveform segments with staccato notes
  for onset_frame in onset_frames:
      start = librosa.frames_to_samples(onset_frame)
      # Clamp the end so a note that starts near the end of the file is kept
      end = min(start + int(note_duration * sr), y.shape[0])
      y_staccato[start:end] = y[start:end]

  # Save the output waveform to a file using soundfile
  sf.write(out_filename, y_staccato, sr)

In [None]:
create_staccato("violin_a.mp3", "violin_a_staccato.mp3")

Accent

In [None]:
def create_accent(filename, out_filename): 
  # Load audio file
  y, sr = librosa.load(filename)

  # Set accent factor
  accent_factor = 0.7

  # Apply accent effect
  y_acc = np.multiply(y, y + np.abs(y) * accent_factor)

  # Save accented audio file
  sf.write(out_filename, y_acc, sr)

In [None]:
create_accent("violin_a.mp3", "violin_a_accent.mp3")

### Step 2: Download all sound files that I played on the violin (16 total notes)

I recorded the following notes: 


* String G: G, A, B, C#
* String D: D, E, F#, G
* String A: A, B, C#, D
* String E: E, F#, G, A


In [None]:
all_notes = [
    "StringENoteA", # 0 
    "StringENoteG", # 1
    "StringENoteFSharp", # 2
    "StringENoteE", # 3
    "StringANoteD", # 4
    "StringANoteCSharp", # 5
    "StringANoteB", # 6
    "StringANoteA", # 7
    "StringDNoteG", # 8
    "StringDNoteFSharp", # 9
    "StringDNoteE", # 10
    "StringDNoteD", # 11
    "StringGNoteCSharp", # 12
    "StringGNoteB", # 13
    "StringGNoteA", # 14
    # "StringGNoteG", # 15 # TODO: uncomment this 
]

In [None]:
for note in all_notes: 
  filename = "https://github.com/shresta4/CS448-files/raw/main/Final_Proj/" + note + ".m4a"
  !wget $filename

--2023-05-07 03:55:37--  https://github.com/shresta4/CS448-files/raw/main/Final_Proj/StringENoteA.m4a
Resolving github.com (github.com)... 140.82.113.3
Connecting to github.com (github.com)|140.82.113.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/shresta4/CS448-files/main/Final_Proj/StringENoteA.m4a [following]
--2023-05-07 03:55:37--  https://raw.githubusercontent.com/shresta4/CS448-files/main/Final_Proj/StringENoteA.m4a
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 80453 (79K) [audio/mp4]
Saving to: ‘StringENoteA.m4a’


2023-05-07 03:55:37 (6.05 MB/s) - ‘StringENoteA.m4a’ saved [80453/80453]

--2023-05-07 03:55:37--  https://github.com/shresta4/CS448-files/raw/main/Final_Proj/StringENot

### Step 2: Convert all sounds to .wav

In [None]:
# !pip install pydub

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pydub
  Downloading pydub-0.25.1-py2.py3-none-any.whl (32 kB)
Installing collected packages: pydub
Successfully installed pydub-0.25.1


In [None]:
from pydub import AudioSegment

def m4a_to_wav(filename, out_filename): 
  audio = AudioSegment.from_file(filename)
  audio.export(out_filename, format="wav")

In [None]:
for note in all_notes: 
  m4a_to_wav(note + ".m4a", note + ".wav")
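
The functions in Step 5 unpack samples with the `struct` format `'h'`, which assumes 16-bit PCM. A quick way to confirm that the converted files match that assumption is to read their headers with the standard-library `wave` module. This is a minimal sketch; `wav_info` is a hypothetical helper name.

```python
import wave

def wav_info(path):
    """Return (channels, sample_width_bytes, frame_rate, duration_seconds)."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()   # 2 bytes means 16-bit samples
        frame_rate = wav.getframerate()
        duration = wav.getnframes() / frame_rate
    return channels, sample_width, frame_rate, duration
```

For example, `wav_info("StringANoteA.wav")` should report a sample width of 2 if the file is 16-bit.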

### Step 3: Save trimmed versions of all these files

In [None]:
def extract_segment(filename, out_filename, start_time, duration, file_type): # milliseconds
  # Load the audio file into a Pydub AudioSegment object
  audio = AudioSegment.from_file(filename, format = file_type)

  if start_time + duration > len(audio):
    duration = len(audio) - start_time
    print("Warning: Duration exceeds the file length. Snipping up to the end of the file.")

  # Extract the specified segment from the audio
  segment = audio[start_time:start_time + duration]

  # Save the extracted segment as a new audio file
  segment.export(out_filename, format = file_type)

Each trimmed clip is at most 2 seconds long.

The `/content/trimmed` directory must exist before the next cell is run.
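
The output folders used in Steps 3 and 4 (`trimmed`, `vibrato`, `legato`, `staccato`, `accented`) can be created up front so that the export calls do not fail on a missing directory. A minimal sketch, where `ensure_output_dirs` is a hypothetical helper and the base directory is passed in by the caller:

```python
import os

def ensure_output_dirs(base_dir, names=("trimmed", "vibrato", "legato", "staccato", "accented")):
    """Create each output folder under base_dir if it does not already exist."""
    paths = []
    for name in names:
        path = os.path.join(base_dir, name)
        os.makedirs(path, exist_ok=True)  # no error if the folder is already there
        paths.append(path)
    return paths
```

In Colab this would be called as `ensure_output_dirs("/content")`.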

In [None]:
for note in all_notes: 
  filename = note + ".wav"

  trimmed_filename = "/content/trimmed/" + note + ".wav"
  extract_segment(filename, trimmed_filename, 2000, 2000, "wav")

### Step 4: Save all articulations of these sounds

In [None]:
for note in all_notes: 
  filename = note + ".wav"

  vibrato_filename = "/content/vibrato/" + note + "_vibrato.wav"
  create_vibrato(filename, vibrato_filename)

  legato_filename = "/content/legato/" + note + "_legato.wav"
  create_legato(filename, legato_filename)

  staccato_filename = "/content/staccato/" + note + "_staccato.wav"
  create_staccato(filename, staccato_filename)

  accented_filename = "/content/accented/" + note + "_accented.wav"
  create_accent(filename, accented_filename)

### Step 5: Design my own implementation of each articulation. 

In [1]:
import wave
import struct
import math

In [2]:
def create_vibrato_v2(input_file, output_file, depth, speed, rate):
    with wave.open(input_file, 'rb') as input_wav:
        num_channels = input_wav.getnchannels()
        sample_width = input_wav.getsampwidth()
        frame_rate = input_wav.getframerate()
        num_frames = input_wav.getnframes()

        omega = 2.0 * math.pi * speed / frame_rate
        phase = 0.0

        with wave.open(output_file, 'wb') as output_wav:
            output_wav.setnchannels(num_channels)
            output_wav.setsampwidth(sample_width)
            output_wav.setframerate(frame_rate)
            output_wav.setnframes(num_frames)

            for _ in range(num_frames):
                frame = input_wav.readframes(1)
                samples = struct.unpack('<' + ('h' * num_channels), frame)

                # Low-frequency sine offset, shared by every channel of this frame
                vibrato = depth * math.sin(phase)

                new_samples = []
                for sample in samples:
                    # Clamp to the 16-bit range so struct.pack cannot overflow
                    new_sample = max(-32768, min(32767, int(sample + vibrato)))
                    new_samples.append(new_sample)

                # Advance the oscillator once per frame so `speed` is in Hz
                phase += omega
                if phase >= 2.0 * math.pi:
                    phase -= 2.0 * math.pi

                output_frame = struct.pack('<' + ('h' * num_channels), *new_samples)
                output_wav.writeframes(output_frame)
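
`create_vibrato_v2` above adds a low-frequency sine to each sample, which offsets the waveform but does not change its pitch. Vibrato on a violin is usually heard as a periodic pitch variation, so an alternative worth comparing is a time-varying delay, similar in spirit to the library-based version in Step 1. The sketch below is one possible pure-Python take that operates on a list of samples; `vibrato_delay` and its default parameter values are assumptions for illustration, not part of the original implementation.

```python
import math

def vibrato_delay(samples, sample_rate, rate_hz=5.0, depth_s=0.0005):
    """Pitch vibrato via a sinusoidally varying read position (linear interpolation)."""
    n = len(samples)
    out = []
    for i in range(n):
        # The read position wobbles around i by up to depth_s seconds
        offset = depth_s * sample_rate * math.sin(2.0 * math.pi * rate_hz * i / sample_rate)
        pos = min(max(i + offset, 0.0), n - 1)  # keep the position inside the signal
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        # Linear interpolation between the two neighbouring samples
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out
```

The output has the same length as the input. Values are floats, so they would need rounding and clamping to the 16-bit range before being packed back into a WAV file.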

In [3]:
def create_legato_v2(input_file, output_file, overlap_ratio):
    with wave.open(input_file, 'rb') as input_wav:
        num_channels = input_wav.getnchannels()
        sample_width = input_wav.getsampwidth()
        frame_rate = input_wav.getframerate()
        num_frames = input_wav.getnframes()

        overlap_frames = int(num_frames * overlap_ratio)

        with wave.open(output_file, 'wb') as output_wav:
            output_wav.setnchannels(num_channels)
            output_wav.setsampwidth(sample_width)
            output_wav.setframerate(frame_rate)
            output_wav.setnframes(num_frames + overlap_frames)

            # Copy every frame of the input unchanged
            frame = input_wav.readframes(1)
            while frame != b'':
                output_wav.writeframes(frame)
                frame = input_wav.readframes(1)

            # Assumption: lengthen the note by replaying its final overlap_frames
            # frames (reading past the end of the input returns no data)
            tail_frames = min(overlap_frames, num_frames)
            input_wav.setpos(num_frames - tail_frames)
            output_wav.writeframes(input_wav.readframes(tail_frames))

In [4]:
def create_staccato_v2(input_file, output_file, gap_ratio):
    with wave.open(input_file, 'rb') as input_wav:
        num_channels = input_wav.getnchannels()
        sample_width = input_wav.getsampwidth()
        frame_rate = input_wav.getframerate()
        num_frames = input_wav.getnframes()

        gap_frames = int(num_frames * gap_ratio)

        with wave.open(output_file, 'wb') as output_wav:
            output_wav.setnchannels(num_channels)
            output_wav.setsampwidth(sample_width)
            output_wav.setframerate(frame_rate)
            output_wav.setnframes(num_frames - gap_frames)

            for _ in range(gap_frames):
                frame = input_wav.readframes(1)

            frame = input_wav.readframes(1)
            while frame != b'':
                samples = struct.unpack('<' + ('h' * num_channels), frame)
                
                for sample in samples:
                    output_wav.writeframes(struct.pack('<h', sample))

                frame = input_wav.readframes(1)

In [5]:
def create_accent_v2(input_file, output_file, accent_ratio, accent_strength):
    with wave.open(input_file, 'rb') as input_wav:
        num_channels = input_wav.getnchannels()
        sample_width = input_wav.getsampwidth()
        frame_rate = input_wav.getframerate()
        num_frames = input_wav.getnframes()

        accent_frames = int(num_frames * accent_ratio)

        with wave.open(output_file, 'wb') as output_wav:
            output_wav.setnchannels(num_channels)
            output_wav.setsampwidth(sample_width)
            output_wav.setframerate(frame_rate)
            output_wav.setnframes(num_frames)

            for i in range(num_frames):
                frame = input_wav.readframes(1)

                samples = struct.unpack('<' + ('h' * num_channels), frame)

                new_samples = []
                for sample in samples:
                    if i < accent_frames:
                        accented_sample = sample * accent_strength
                        # Clamp to the 16-bit range so struct.pack cannot overflow
                        new_sample = max(-32768, min(32767, int(accented_sample)))
                    else:
                        new_sample = sample

                    new_samples.append(new_sample)
                    
                output_frame = struct.pack('<' + ('h' * num_channels), *new_samples)
                output_wav.writeframes(output_frame)
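
To try the Step 5 functions without a recording, a synthetic test tone can stand in for a violin note. The sketch below writes a mono, 16-bit sine wave using only the standard library; `write_sine_wav` and its defaults are hypothetical and chosen only for illustration.

```python
import math
import struct
import wave

def write_sine_wav(path, freq_hz=440.0, duration_s=1.0, sample_rate=44100, amplitude=0.5):
    """Write a mono 16-bit sine tone and return the number of frames written."""
    num_frames = int(duration_s * sample_rate)
    peak = int(amplitude * 32767)  # amplitude is a fraction of full scale (0 to 1)
    frames = bytearray()
    for i in range(num_frames):
        value = int(peak * math.sin(2.0 * math.pi * freq_hz * i / sample_rate))
        frames += struct.pack("<h", value)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return num_frames
```

For example, after `write_sine_wav("tone.wav")`, calling `create_accent_v2("tone.wav", "tone_accent.wav", 0.1, 1.5)` would boost the first 10% of the tone by a factor of 1.5.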