# Digital Signal Processing Project

Instituto Superior Técnico

Digital Signal Processing 

Prof. Luis Caldas de Oliveira

April 2023

## Authors

**Group 03**

Yandi Jiang, 96344

Bruno Pedro, 96363

## Instructions and Description

This notebook contains the final version of the project of the Digital Signal Processing (DSP) course, at Instituto Superior Técnico, University of Lisbon. Its main objective is the extraction of features from audio recordings of instruments interpreting musical melodies and their reinterpretation by a digital polyphonic synthesizer. 

The features to be extracted include the detection of note onsets, the pitch, amplitude and duration of each note and, also, the beat and tempo of the melody. The features extracted are, then, organized and used to synthesize a new audio with a different timbre, representing a new interpretation of the same musical melody. The synthesized audio can be modified in order to represent different instruments, and, also, to change some characteristics of the original melody, such as the beat and tempo, the ADSR (attack, decay, sustain, release) envelope of the sound waveform, the octave of the notes or, finally, the addition of an overlaid delayed sound.

The code of this project is organized in three main parts. The first part, which includes the actual code functions developed, is stored outside this Jupyter Notebook, in separate Python script files (.py) that are imported here from the GitHub repository (https://github.com/brunopedro1/projectPDS.git) where all the project files are stored. The script feature_extration.py contains all the code necessary to perform the onset detection, including functions to detect the onsets, find the best threshold values and evaluate the performance of the detection. It also contains all the code to extract the pitch, amplitude and duration of the notes and organize them in a composition. The script file synth.py contains all the code necessary to perform the polyphonic synthesis of an audio from a composition, which includes functions to find the frequency of notes, generate and synthesize wavetables, define an ADSR envelope of the waveform, synthesize audio from a composition according to the chosen wavetable and envelope, modify the beat, add an overlaid delayed sound, and plot the waveform and spectrogram of the audio. 

The second part is where the developed code is applied to some audio recordings and the performance of the obtained results is assessed. First, concerning the onset detection, the performance of the implemented algorithms is evaluated by comparison with human annotations of the true onsets. The results are presented in tables that show the number of True Positive, False Positive and False Negative onsets, and the Precision, Recall and F-measure values. The average F-measure for each onset detection method is also computed, and the results are shown and commented. Once the best onset detector is selected for each audio, the features are extracted from the audios, namely, the onsets, pitch, amplitude and duration of the notes, and the beat of the melody. Then, the developed polyphonic synthesizer and audio modification features are applied, in detail, to a specific audio recording of piano ('silhouette.wav'). Finally, some results that demonstrate all the capabilities of the code are presented using different audios of viola and piano. In this case, a qualitative evaluation of the results is done, since it is hard to quantify the performance of synthesized audio. 

The third part addresses some simple code tests of the main functions developed, namely, the onset detection, extraction of pitch, amplitude and duration of notes and the capacity of polyphony synthesis. For the code tests, some audios with well-known characteristics are used.

## Video of the Project

In [None]:
%%html
<iframe width="560" height="315" src="https://www.youtube.com/embed/hHfP0OxZERU" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

## Libraries


In [None]:
!rm -rf projectPDS
!git clone https://github.com/brunopedro1/projectPDS.git

In [None]:
!pip install madmom
import librosa
import numpy as np
from tabulate import tabulate
from IPython.display import Audio
from scipy.signal import sawtooth, square

# Functions PDS PROJECT: synth.py, feature_extraction.py
from projectPDS.synth import synthesizer, plot_audio
from projectPDS.feature_extration import find_threshold, onset_analyse_performance
from projectPDS.feature_extration import onset_CNN, onset_RNN, onset_complexFlux, onset_superFlux, onset_detection
from projectPDS.feature_extration import get_notes, get_amplitude, get_duration, make_composition, get_rms

# MADMOM LIBRARY
from madmom.features.onsets import CNNOnsetProcessor
from madmom.audio.signal import FramedSignal
from madmom.features.onsets import RNNOnsetProcessor
from madmom.features.onsets import superflux
from madmom.features.onsets import complex_flux
from madmom.features.onsets import OnsetPeakPickingProcessor
from madmom.evaluation.onsets import OnsetEvaluation

import warnings
warnings.filterwarnings("ignore")

## Load Audio Recordings

The audio recordings are stored in a GitHub repository called "projectPDS" that needs to be cloned before running the code.

In [None]:
audios = ['projectPDS/recordings/minuet.wav',
          'projectPDS/recordings/telemann.wav',
          'projectPDS/recordings/gurenge.wav',
          'projectPDS/recordings/naruto.wav',
          'projectPDS/recordings/silhouette.wav']
human_onsets = ['projectPDS/HumanOnsets/minuet.txt',
                'projectPDS/HumanOnsets/telemann.txt',
                'projectPDS/HumanOnsets/gurenge.txt',
                'projectPDS/HumanOnsets/naruto.txt',
                'projectPDS/HumanOnsets/silhouette.txt']
          
# Load Audio Files
s_minuet, fs_minuet = librosa.load('projectPDS/recordings/minuet.wav', sr=48000)
s_telemann, fs_telemann = librosa.load('projectPDS/recordings/telemann.wav',sr=48000)
s_gurenge, fs_gurenge = librosa.load('projectPDS/recordings/gurenge.wav', sr=48000)
s_naruto, fs_naruto = librosa.load('projectPDS/recordings/naruto.wav', sr=48000)
s_silhouette, fs_silhouette = librosa.load('projectPDS/recordings/silhouette.wav', sr=48000)
#print(color.BOLD + 'Original Audios\n' + color.END)

for audio in audios:
    print(audio.rsplit('/',1)[-1])
    x, fs = librosa.load(audio) # get audio
    display(Audio(data=x, rate=fs));

## Onset Detection

Several onset detection tools from the Madmom library were tried out: detectors based on Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), SuperFlux, a method with a maximum-filter vibrato suppression stage, and ComplexFlux, a method based on SuperFlux that adds a local group delay based tremolo suppression. 

In this part, some important utility functions are used, namely, functions to find the best threshold to use in the onset detectors, to evaluate the onset detection using a Madmom tool, to display the audio recordings and to print the performance results.

Five audio recordings ("minuet.wav", "telemann.wav", "gurenge.wav", "naruto.wav", "silhouette.wav") with their respective human onset annotations are tested. The results of the performance evaluation are presented in tables that show the number of True Positive, False Positive and False Negative onsets, and the Precision, Recall and F-measure values. 
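As an illustration of how these metrics relate to each other, the following minimal sketch matches detected onsets to annotated onsets and derives Precision, Recall and F-measure from the counts. The function names, the greedy matching and the 25 ms tolerance window are assumptions made for this example; the notebook itself relies on `onset_analyse_performance` and the Madmom evaluation tool.

```python
def match_onsets(detected, annotated, window=0.025):
    """Greedy one-to-one matching within +/- window seconds (assumed tolerance)."""
    detected = sorted(detected)
    annotated = sorted(annotated)
    i = j = tp = 0
    while i < len(detected) and j < len(annotated):
        diff = detected[i] - annotated[j]
        if abs(diff) <= window:
            tp += 1          # detection close enough to an annotation
            i += 1
            j += 1
        elif diff < 0:
            i += 1           # detection with no annotation nearby: false positive
        else:
            j += 1           # annotation with no detection nearby: false negative
    fp = len(detected) - tp
    fn = len(annotated) - tp
    return tp, fp, fn


def precision_recall_f(tp, fp, fn):
    """Precision, Recall and F-measure from the three counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy data (not taken from the recordings)
annotated = [0.50, 1.00, 1.50, 2.00]
detected = [0.51, 1.02, 1.30, 2.01]
tp, fp, fn = match_onsets(detected, annotated)
precision, recall, f_measure = precision_recall_f(tp, fp, fn)
print(tp, fp, fn, precision, recall, f_measure)
```

In this toy case the detection at 1.30 s has no annotation nearby and the annotation at 1.50 s is missed, so both Precision and Recall drop below 1.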

### Comment about threshold 


The threshold defines the minimum value of the onset detection function above which a peak is identified as an onset. One way to compute the best threshold automatically is to test a range of values and evaluate the F-measure that results when the onset detector is tested on an audio recording with human onset annotations for comparison. The find_threshold function is made for this purpose. 

This function was used to compute the best threshold for each Madmom onset detection method and each recording. Since only 5 recordings are used here, the resulting values were stored manually, to avoid the unnecessary computational load of recomputing constant threshold values every time the notebook is executed. 
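The idea of the threshold search can be sketched as a simple grid search over candidate values. This is not the code of `find_threshold`; the helper names, the toy activation function, the frame rate and the tolerance window are all assumptions chosen to keep the example small.

```python
def pick_peaks(activation, threshold, fps):
    """Times (s) of local maxima of the activation that reach the threshold."""
    onsets = []
    for n in range(1, len(activation) - 1):
        is_peak = activation[n - 1] < activation[n] >= activation[n + 1]
        if is_peak and activation[n] >= threshold:
            onsets.append(n / fps)
    return onsets


def f_measure(detected, annotated, window=0.05):
    """F-measure where each annotation can be matched by at most one detection."""
    unused = list(annotated)
    tp = 0
    for d in detected:
        hit = next((a for a in unused if abs(a - d) <= window), None)
        if hit is not None:
            unused.remove(hit)
            tp += 1
    if tp == 0:
        return 0.0
    precision = tp / len(detected)
    recall = tp / len(annotated)
    return 2 * precision * recall / (precision + recall)


def best_threshold(activation, annotated, fps, candidates):
    """Grid search: keep the candidate threshold with the highest F-measure."""
    scored = [(f_measure(pick_peaks(activation, th, fps), annotated), th)
              for th in candidates]
    best_f = max(f for f, _ in scored)
    # On ties, keep the lowest threshold that reaches the best score.
    return min(th for f, th in scored if f == best_f), best_f


# Toy activation function sampled at 10 frames per second (hypothetical data)
fps = 10
activation = [0.0, 0.1, 0.9, 0.1, 0.0, 0.3, 0.0, 0.1, 0.8, 0.1, 0.0, 0.2, 0.0]
annotated = [0.2, 0.8]
candidates = [round(0.1 * k, 1) for k in range(1, 10)]
threshold, score = best_threshold(activation, annotated, fps, candidates)
print(threshold, score)
```

Low thresholds accept the spurious peaks of the toy activation and lose Precision, while very high thresholds miss true onsets and lose Recall; the search keeps a value in between.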

In [None]:
#cnn_threshold=find_threshold(audios,CNNOnsetProcessor(),human_onsets, limit=1,fps=100,window_size=0.1)
#rnn_threshold=find_threshold(audios,RNNOnsetProcessor(),human_onsets, limit=1,fps=100,window_size=0.1)
#superflux_threshold=find_threshold(audios,superflux,human_onsets, limit=10,fps=200,window_size=0.1)
#complexflux_threshold=find_threshold(audios,complex_flux,human_onsets,limit=10,fps=200,window_size=0.1)

cnn_threshold = [0.55, 0.69, 0.99, 0.99, 0.99]
rnn_threshold = [0.38, 0.29, 0.48, 0.49, 0.68]
superflux_threshold =[2.8, 2.8, 9.99, 9.99, 9.99]
complexflux_threshold=[1.92, 1.92, 8.24, 9.99, 9.99]

### Evaluate Performance of Onset Detectors

CNN Performance


In [None]:
cnn_performance=onset_analyse_performance(audios, human_onsets,cnn_threshold, onset_CNN , False)

RNN Performance

In [None]:
rnn_performance=onset_analyse_performance(audios, human_onsets,rnn_threshold, onset_RNN , False)

Super Flux Performance

In [None]:
superflux_performance=onset_analyse_performance(audios, human_onsets,superflux_threshold, onset_superFlux , False)

Complex Flux Performance

In [None]:
complexflux_performance=onset_analyse_performance(audios, human_onsets, complexflux_threshold, onset_complexFlux , False)

Onset methods Performance Conclusion

In [None]:
perform_results_fmeasure = [['Madmom CNN', cnn_performance],
                            ['Madmom RNN', rnn_performance],
                            ['Madmom Super Flux', superflux_performance],
                            ['Madmom Complex Flux', complexflux_performance]]

print(tabulate(perform_results_fmeasure, headers=['Onset detection method', 'Average F-measure']))

The previous results show that different onset detection methods give different results for the same audio, and that no single method outperformed all the others on every recording. However, using the F-measure as the main metric to assess the performance of a method, the Madmom CNN and RNN detectors provide the best results in general. Both methods use Deep Learning models that are trained with thousands of annotated onsets.

The best methods to detect onsets for each audio are:


| Instrument | Audio | Onset Method | F-measure | Threshold |
| :-: | :---: | :---: | :---: | :-: |
| Viola | Minuet | RNN | 0.94 | 0.38 |
| Viola | Telemann | CNN | 1.00 | 0.69 |
| Piano | Gurenge | CNN | 1.00 | 0.99 |
| Piano | Naruto | CNN | 1.00 | 0.99 |
| Piano | Silhouette | CNN | 1.00 | 0.99 |



## Extracting the features from Audio Files

Now, the same 5 audio recordings are used to show the results of the extraction of the onsets, pitch, amplitude and duration of the notes, and the beat of the melody.

### Onset Extraction

Knowing the best onset detection methods for each audio recording, their onsets are, now, computed.

In [None]:
#Threshold values
minuet_threshold = 0.38
telemann_threshold = 0.69
gurenge_threshold=0.99
naruto_threshold=0.99
silhouette_threshold = 0.99
# Find onsets
minuet_onsets=onset_detection('projectPDS/recordings/minuet.wav', RNNOnsetProcessor(),lim=minuet_threshold)
telemann_onsets=onset_detection('projectPDS/recordings/telemann.wav', CNNOnsetProcessor(),lim=telemann_threshold)
gurenge_onsets=onset_detection('projectPDS/recordings/gurenge.wav', CNNOnsetProcessor(),lim=gurenge_threshold)
naruto_onsets=onset_detection('projectPDS/recordings/naruto.wav', CNNOnsetProcessor(),lim=naruto_threshold)
silhouette_onsets=onset_detection('projectPDS/recordings/silhouette.wav', CNNOnsetProcessor(),lim=silhouette_threshold)

### Pitch Extraction

To find the pitch of each note, the autocorrelation method is applied to a stationary zone between onsets. Since this zone may fail to yield the correct note (for example, when a false negative onset occurs), a sliding window with a predefined size is moved along the signal until a correct note is found.
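A minimal sketch of pitch estimation by autocorrelation is shown next, applied to a synthetic frame. It is not the `pitch()` function of the project; the search range `fmin`/`fmax`, the frame length and the helper `freq_to_note` are assumptions made for the example.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def autocorr_pitch(frame, fs, fmin=50.0, fmax=2000.0):
    """Estimate the fundamental frequency (Hz) of a stationary frame."""
    frame = frame - np.mean(frame)
    # Full autocorrelation; keep only the non-negative lags.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(acf) - 1)
    # The strongest peak inside the allowed lag range gives the period.
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / lag


def freq_to_note(freq):
    """Nearest equal-tempered note name (A4 = 440 Hz), e.g. 'A4'."""
    midi = int(round(69 + 12 * np.log2(freq / 440.0)))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


fs = 48000
t = np.arange(int(0.05 * fs)) / fs                      # 50 ms frame
tone = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
f0 = autocorr_pitch(tone, fs)
note = freq_to_note(f0)
print(round(f0, 1), note)
```

Because the lag is an integer number of samples, the estimate is quantized; rounding to the nearest note name absorbs this small error.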

In [None]:
minuet_notes = get_notes(s_minuet, fs_minuet, minuet_onsets)
telemann_notes = get_notes(s_telemann, fs_telemann, telemann_onsets)
gurenge_notes = get_notes(s_gurenge, fs_gurenge, gurenge_onsets)
naruto_notes = get_notes(s_naruto, fs_naruto, naruto_onsets)
silhouette_notes = get_notes(s_silhouette, fs_silhouette, silhouette_onsets)

### Amplitude Extraction

The amplitude of the signal is proportional to its RMS value, so the computation of a frame-based RMS is used to detect the amplitude of each note.
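The frame-based RMS can be sketched as follows. This is an assumed implementation, not the code of `get_amplitude`: it supposes that the two integer arguments used in the notebook (256 and 32) are the frame and hop sizes, and that the amplitude of a note is the maximum frame RMS between its onset and the next one.

```python
import numpy as np


def frame_rms(x, frame_size, hop_size):
    """RMS of each frame of frame_size samples, advancing hop_size samples."""
    x = np.asarray(x, dtype=float)
    starts = range(0, len(x) - frame_size + 1, hop_size)
    return np.array([np.sqrt(np.mean(x[s:s + frame_size] ** 2)) for s in starts])


def note_amplitudes(x, fs, onsets, frame_size=256, hop_size=32):
    """Maximum frame RMS of each note; the last note runs to the end of x."""
    bounds = [int(round(t * fs)) for t in onsets] + [len(x)]
    return [float(frame_rms(x[a:b], frame_size, hop_size).max())
            for a, b in zip(bounds[:-1], bounds[1:])]


# Two synthetic notes of 0.5 s each, with peak amplitudes 1.0 and 0.5
fs = 8000
t = np.arange(fs // 2) / fs
x = np.concatenate([1.0 * np.sin(2 * np.pi * 400 * t),
                    0.5 * np.sin(2 * np.pi * 400 * t)])
amps = note_amplitudes(x, fs, onsets=[0.0, 0.5])
print(np.round(amps, 2))
```

For a sine wave, the RMS over a frame that spans several periods is close to the peak amplitude divided by $\sqrt{2}$, which is why the two values keep the 2:1 ratio of the peak amplitudes.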

In [None]:
minuet_amplitude = get_amplitude(s_minuet,fs_minuet, minuet_onsets, 256,32, False)
telemann_amplitude = get_amplitude(s_telemann,fs_telemann, telemann_onsets, 256,32, False)
gurenge_amplitude = get_amplitude(s_gurenge,fs_gurenge, gurenge_onsets, 256,32, False)
naruto_amplitude = get_amplitude(s_naruto,fs_naruto, naruto_onsets, 256,32, False)
silhouette_amplitude = get_amplitude(s_silhouette,fs_silhouette, silhouette_onsets, 256,32, False)

### Duration Extraction

In this project, it is assumed that the audios have no pauses between notes and that the overlap of notes is minimal. So, the duration of a note is defined as the time distance between its onset and the next onset.
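Under this assumption, the durations follow directly from the onset times. The sketch below is illustrative only; in particular, letting the last note last until the end of the audio is an assumption, since the text does not state how `get_duration` handles it.

```python
def durations_from_onsets(onsets, total_duration):
    """Duration of each note = gap to the next onset (last note: until the end)."""
    edges = list(onsets) + [total_duration]
    return [nxt - cur for cur, nxt in zip(edges[:-1], edges[1:])]


onsets = [0.0, 0.5, 0.75, 1.0, 1.25, 1.5]          # seconds
durations = durations_from_onsets(onsets, total_duration=2.0)
print(durations)
```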

In [None]:
minuet_duration = get_duration(s_minuet, fs_minuet, minuet_onsets)
telemann_duration = get_duration(s_telemann, fs_telemann, telemann_onsets)
gurenge_duration = get_duration(s_gurenge, fs_gurenge, gurenge_onsets)
naruto_duration = get_duration(s_naruto, fs_naruto, naruto_onsets)
silhouette_duration=get_duration(s_silhouette, fs_silhouette, silhouette_onsets)

### Beat Extraction

To extract the beat and tempo, a function from Librosa library was directly used.

In [None]:
minuet_tempo, _ = librosa.beat.beat_track(y=s_minuet, sr=fs_minuet)
telemann_tempo, _ = librosa.beat.beat_track(y=s_telemann, sr=fs_telemann)
gurenge_tempo, _ = librosa.beat.beat_track(y=s_gurenge, sr=fs_gurenge)
naruto_tempo, _ = librosa.beat.beat_track(y=s_naruto, sr=fs_naruto)
silhouette_tempo, _ = librosa.beat.beat_track(y=s_silhouette, sr=fs_silhouette)

## Musical Signal Synthesis

Once the features are extracted from the audio and saved in variables, they can be organized in the form of a composition, which the polyphonic synthesizer uses to synthesize a new audio.

### Create composition 

Each note is described by [note, amplitude, duration, onset], and multiple notes form a composition.

In [None]:
comp_minuet = make_composition(minuet_notes, minuet_amplitude, minuet_duration, minuet_onsets)
comp_telemann = make_composition(telemann_notes, telemann_amplitude, telemann_duration, telemann_onsets)
comp_naruto = make_composition(naruto_notes, naruto_amplitude, naruto_duration, naruto_onsets)
comp_gurenge = make_composition(gurenge_notes, gurenge_amplitude, gurenge_duration, gurenge_onsets)
comp_silhouette = make_composition(silhouette_notes, silhouette_amplitude, silhouette_duration, silhouette_onsets)


### Create Synthesizer

The synthesizer is a class with several functionalities. It is instantiated with the desired sampling frequency.

In [None]:
# silhouette synthesized audio will be used as a example
# Create synthesizer
my_synthesizer = synthesizer(fs_silhouette) 

### Wavetables

Another functionality of the class is the generation of wavetables. Four wavetable options are considered:
- Sine (contains a single harmonic, the fundamental frequency, so it sounds rather dull)
- Sawtooth (contains all the integer harmonics of the fundamental frequency, with amplitudes that decay as $ \dfrac{1}{n}$)
- Square (contains only the odd harmonics, with amplitudes that decay as $ \dfrac{1}{n}$)
- Triangle (contains only the odd harmonics, with amplitudes that decay as $ \dfrac{1}{n^2}$)
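The harmonic content of the four waveforms can be checked numerically with an FFT of a single period. In this sketch the waveforms are built directly with NumPy, and the table length of 4096 samples is an arbitrary choice for the example.

```python
import numpy as np

n_samples = 4096
frac = np.arange(n_samples) / n_samples            # position inside one period

saw = 2.0 * frac - 1.0                             # rising ramp from -1 to 1
shapes = {
    "sine": np.sin(2 * np.pi * frac),
    "sawtooth": saw,
    "square": np.where(frac < 0.5, 1.0, -1.0),     # 50% duty cycle
    "triangle": 1.0 - 2.0 * np.abs(saw),           # rises, then falls
}


def harmonic_ratios(wave, count=5):
    """Amplitudes of harmonics 1..count, relative to the fundamental."""
    spectrum = np.abs(np.fft.rfft(wave)) / (len(wave) / 2)
    return spectrum[1:count + 1] / spectrum[1]


ratios = {name: harmonic_ratios(wave) for name, wave in shapes.items()}
for name, values in ratios.items():
    print(f"{name:9s}", np.round(values, 3))
```

The printed ratios show the $\dfrac{1}{n}$ decay of the sawtooth and square waves, the missing even harmonics of the square and triangle waves, and the faster $\dfrac{1}{n^2}$ decay of the triangle wave.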

The synthesizer works by looping through the composition and adding, to a final array of the synthesized audio signal, the signal of each note with the desired wavetable and envelope, at their correct position in time.
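The loop described above can be sketched in a few lines. This is a simplified stand-in for the `synthesize` method, not its actual code: it uses plain sine waves with no envelope, and the `note_freqs` lookup table is a hypothetical helper introduced for the example.

```python
import numpy as np


def render_composition(composition, fs, note_freqs):
    """Sum sine notes into one buffer.

    composition: iterable of (note, amplitude, duration, onset) tuples.
    """
    end = max(onset + duration for _, _, duration, onset in composition)
    out = np.zeros(int(round(end * fs)))
    for note, amplitude, duration, onset in composition:
        start = int(round(onset * fs))
        n = min(int(round(duration * fs)), len(out) - start)
        t = np.arange(n) / fs
        # "+=" lets notes that overlap in time sound together (polyphony).
        out[start:start + n] += amplitude * np.sin(2 * np.pi * note_freqs[note] * t)
    return out


note_freqs = {"A4": 440.0, "C6": 1046.5}            # hypothetical lookup table
comp = (("A4", 0.5, 2, 0), ("C6", 0.7, 1, 0.5))
audio = render_composition(comp, fs=22050, note_freqs=note_freqs)
print(len(audio), round(float(np.abs(audio).max()), 2))
```

Accumulating into a shared buffer, instead of concatenating the notes, is what allows notes to overlap in time.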

In [None]:
wavetables = [np.sin, square, sawtooth, sawtooth]
width = 1
duty_cycle=0.5
for wt in wavetables:
    if wt == np.sin:
        print("Sin Wavetable")
    elif wt == square:
        print("Square Wavetable")
    elif wt == sawtooth and width == 1:
        print("Sawtooth Wavetable")
    else:
        print("Triangle Wavetable")
    # Generate the current wavetable
    my_synthesizer.wavetable_generate(wt, width=width, dc=duty_cycle) 
    if wt == sawtooth:
        width=0.5
    # Create audio vector
    synth_silhouette=my_synthesizer.synthesize(composition=comp_silhouette) 
    # Play the output signal
    display(Audio(synth_silhouette, rate=fs_silhouette))

### Beat Change

In music, the beat is the basic unit of time, and changing the tempo makes the music slower or faster. The 'silhouette.wav' composition originally has 100 BPM; here it is changed to 90 BPM, which makes it sound slower.
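One way to implement such a change is to scale every duration and onset of the composition by the ratio between the old and the new tempo. The function below is a sketch under that assumption and is not the code of `define_beat`; the toy composition is invented for the example.

```python
def change_beat(composition, old_beat, new_beat):
    """Scale durations and onsets so that a piece at old_beat BPM plays at new_beat BPM."""
    scale = old_beat / new_beat          # > 1 slows the piece down
    return [(note, amplitude, duration * scale, onset * scale)
            for note, amplitude, duration, onset in composition]


comp = [("C4", 1.0, 0.6, 0.0), ("E4", 0.8, 0.6, 0.6), ("G4", 0.9, 1.2, 1.2)]
slower = change_beat(comp, old_beat=100, new_beat=90)
for note in slower:
    print(note[0], round(note[2], 3), round(note[3], 3))
```

Going from 100 BPM to 90 BPM stretches every time value by a factor of 100/90, roughly 1.11, while pitches and amplitudes are left untouched.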

In [None]:
# Generate sawtooth wavetable
my_synthesizer.wavetable_generate(sawtooth, dc=0.5) 

# Change the beat to 90 BPM
comp_nbeat_silhouette=my_synthesizer.define_beat(composition=comp_silhouette, new_beat = 90, old_beat=silhouette_tempo)

# Create audio vector
synth_nbeat_silhouette=my_synthesizer.synthesize(comp_nbeat_silhouette) 

# Play the output signal
display(Audio(synth_nbeat_silhouette, rate=fs_silhouette))

In [None]:
# Confirmation using librosa beat track
new_tempo, _ = librosa.beat.beat_track(y=synth_nbeat_silhouette, sr=fs_silhouette)
print("original beat value: ", silhouette_tempo)
print("new beat value: ", new_tempo)

### Envelope Change

The ADSR envelope describes the evolution of amplitude of each note. Changing it will modify the timbre of the note.
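A piecewise-linear ADSR envelope can be sketched as below. The parameter names mirror those passed to `define_adsr_envelope`, but their interpretation here is an assumption: attack, decay and release are taken as times in seconds and sustain as a level relative to `height`. The project may define them differently.

```python
import numpy as np


def adsr_envelope(n_samples, fs, attack, decay, sustain, release, height=1.0):
    """Piecewise-linear ADSR envelope with n_samples samples.

    attack, decay, release: times in seconds (assumed); sustain: level in [0, 1].
    """
    a, d, r = (int(round(x * fs)) for x in (attack, decay, release))
    s = n_samples - (a + d + r)
    if s < 0:
        raise ValueError("note is shorter than attack + decay + release")
    env = np.concatenate([
        np.linspace(0.0, 1.0, a, endpoint=False),       # attack: 0 -> peak
        np.linspace(1.0, sustain, d, endpoint=False),   # decay: peak -> sustain level
        np.full(s, float(sustain)),                     # sustain: hold the level
        np.linspace(sustain, 0.0, r),                   # release: sustain -> 0
    ])
    return height * env


fs = 1000
env = adsr_envelope(n_samples=1000, fs=fs, attack=0.1, decay=0.2,
                    sustain=0.8, release=0.4)
note = env * np.sin(2 * np.pi * 50 * np.arange(1000) / fs)   # enveloped tone
print(len(env), env[0], env[100], env[450], env[-1])
```

Multiplying the waveform of a note by this envelope shapes its loudness over time, which is what changes the perceived timbre.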

In [None]:
# Generate triangle wavetable (sawtooth with width=0.5)
my_synthesizer.wavetable_generate(sawtooth, width=0.5) 

# Define envelope format
my_synthesizer.define_adsr_envelope(attack=0.01, decay=0.4, sustain=0.01, release=0.2, height=1) 

# Create audio vector
synth_adsr_silhouette = my_synthesizer.synthesize(comp_silhouette) 

print("Original signal")
plot_audio(synth_silhouette,spectrogram=False)
display(Audio(synth_silhouette, rate=fs_silhouette))
print("Signal with envelope change")
plot_audio(synth_adsr_silhouette,spectrogram=False)
# Play the output signal
display(Audio(synth_adsr_silhouette, rate=fs_silhouette))

### Octave Change

Changing the octave of the notes corresponds to multiplying their frequency by $2^n$, where $n$ is the number of octaves to shift. If $n$ is positive, the notes change to a higher octave and, if $n$ is negative, they change to a lower octave.
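The relation can be illustrated with a small sketch that converts a note name to its equal-tempered frequency and then shifts it by octaves. The helper names and the A4 = 440 Hz reference are assumptions for the example, and note names are assumed to end in a single-digit octave, as in the compositions of this notebook.

```python
NOTE_INDEX = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
              "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}


def note_to_freq(note, a4=440.0):
    """Equal-tempered frequency (Hz) of a note name such as 'A4' or 'F#2'."""
    name, octave = note[:-1], int(note[-1])
    semitones_from_a4 = NOTE_INDEX[name] - 9 + 12 * (octave - 4)
    return a4 * 2 ** (semitones_from_a4 / 12)


def shift_octaves(freq, n):
    """Multiply the frequency by 2**n (n > 0: higher, n < 0: lower)."""
    return freq * 2 ** n


print(note_to_freq("A4"))                          # reference pitch
print(round(note_to_freq("C4"), 2))                # middle C
print(shift_octaves(note_to_freq("A4"), -1))       # one octave below A4
```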

In [None]:
# Shift the notes one octave down
comp_oct_silhouette = my_synthesizer.octave_change(comp_silhouette, change=-1)

# Generate triangle wavetable (sawtooth with width=0.5)
my_synthesizer.wavetable_generate(sawtooth, width=0.5) 

# Define envelope format
my_synthesizer.define_adsr_envelope(attack=0.1, decay=0.2, sustain=0.8, release=0.4, height=1) 

# Create audio vector
synth_oct_silhouette = my_synthesizer.synthesize(comp_oct_silhouette) 

# Plot the output signal
plot_audio(synth_oct_silhouette,spectrogram=False)

# Play the output signal
display(Audio(synth_oct_silhouette, rate=fs_silhouette))

### Delay

A delay can be added to the start of the audio.

In [None]:
# Add a delay of 2 seconds
comp_delay_silhouette = my_synthesizer.delay_t(comp_silhouette,t_delay=2) 

# Define envelope format
my_synthesizer.define_adsr_envelope(attack=0.1, decay=0.2, sustain=0.8, release=0.4, height=1) 

# Create audio vector
synth_delay_silhouette = my_synthesizer.synthesize(comp_delay_silhouette) 

# Plot the output signal
plot_audio(synth_delay_silhouette,spectrogram=False)

# Play the output signal
display(Audio(synth_delay_silhouette, rate=fs_silhouette))

### Piano Sound

It is also possible to create a synthesized piano version. This is done by using previously recorded audios of each note played on a real piano.

In [None]:
synth_piano_silhouette=my_synthesizer.piano(comp_silhouette, fs_silhouette)

display(Audio(synth_piano_silhouette, rate=fs_silhouette))

## Code Tests

To test the implemented onset detectors and the extractors of pitch, amplitude and duration, a test composition was written by hand, so that the onset, pitch, amplitude and duration of every note are known. The composition was previously synthesized as a simple sawtooth wave audio that is loaded here.

In [None]:
# Define composition of test (note, amplitude, duration, initial instant)
comp_tminuet = (('G4', 1, 0.5,0), ('C4',0.5,0.25,0.5), ('D4', 0.6,0.25,0.75), ('E4', 0.7,0.25,1),('F4',0.8, 0.25,1.25), 
          ('G4',0.9,0.5,1.5), ('C4',1,0.5, 2),('C4', 1, 0.5 ,2.5),
        ('A4',1,0.5,3), ('F4',0.5,0.25,3.5), ('G4', 0.6, 0.25,3.75), ('A4', 0.7,0.25, 4), ('B4', 0.8,0.25,4.25), 
          ('C5',0.9,0.5,4.5), ('C4',1,0.5,5),('C4',1,0.5,5.5))
    

# Load file with composition of test
s_tminuet, fs_tminuet = librosa.load('projectPDS/recordings/tminuet.wav')

# Display composition of test
print('Composition of test:')
display(Audio(s_tminuet, rate=fs_tminuet))

### Onset Detection Test

The code test of the onset detection is performed by comparison of the onsets detected using the implemented algorithms and the true onsets.

In [None]:
# Save known onset locations in a list
tminuet_true_onsets=[]
for note in comp_tminuet:
    tminuet_true_onsets.append(note[3])

# MADMOM CNN Onset detector
tminuet_onsets_CNN = onset_detection('projectPDS/recordings/tminuet.wav', CNNOnsetProcessor(),lim=0.25)

# MADMOM RNN Onset detector
tminuet_onsets_RNN = onset_detection('projectPDS/recordings/tminuet.wav', RNNOnsetProcessor(),lim=0.2)

# Test
np.testing.assert_almost_equal(tminuet_onsets_CNN, tminuet_true_onsets, decimal=2)
np.testing.assert_almost_equal(tminuet_onsets_RNN, tminuet_true_onsets, decimal=2)
print('Onset detection test completed successfully')

### Pitch Extraction Test

The code test of the pitch extraction is performed by comparison of the notes detected using the function get_notes(), which resorts to the function pitch() to find the frequency of each note, with the true notes.

In [None]:
# Save true notes in a list
tminuet_true_notes=[]
for note in comp_tminuet:
    tminuet_true_notes.append(note[0])

# Extraction of notes from the audio 
tminuet_notes = get_notes(s_tminuet, fs_tminuet, tminuet_onsets_CNN)

# Test
np.testing.assert_array_equal(tminuet_notes, tminuet_true_notes)

print('Pitch extraction test completed successfully')

### Amplitude Extraction Test

The code test of the amplitude extraction is performed by comparison of the notes amplitude, which is found by computation of the maximum RMS value with unitary frame and hop sizes, with the true amplitudes.

In [None]:
# Save true notes amplitude in a list
tminuet_amp = []
for note in comp_tminuet:
    tminuet_amp.append(note[1])

# Extraction of amplitudes from the audio 
test_rms = get_amplitude(s_tminuet, fs_tminuet, tminuet_onsets_CNN, 1, 1, False)

# Test
np.testing.assert_almost_equal(tminuet_amp, test_rms, decimal=1)
print('Amplitude extraction test completed successfully')

### Duration Extraction Test

The code test of the duration extraction is performed by comparison of the notes duration, which is found as the difference between onsets, with the true durations.


In [None]:
# Save true notes duration in a list
tminuet_dur = []
for note in comp_tminuet:
    tminuet_dur.append(note[2])

# Extraction of durations from the audio 
tminuet_duration = get_duration(s_tminuet, fs_tminuet, tminuet_onsets_CNN)

# Test
np.testing.assert_almost_equal(tminuet_dur, tminuet_duration, decimal=2)
print('Duration extraction test completed successfully')

### Polyphony Test

In order to test the functions developed for the polyphonic synthesizer, a simple test composition is used. The test composition has 2 notes: A4, with amplitude 0.5 and played for 2 seconds, and C6, with amplitude 0.7 and played for 1 second while A4 is still sounding. An audio of the composition is synthesized using a sine wavetable. An envelope with null attack, decay and release, and unitary sustain, is used because, for the test, the amplitude of each note must stay constant while it is played.

When 2 notes are reproduced simultaneously, their signals add, so the peak of the resulting signal approaches the sum of the amplitudes of the two notes. The function get_amplitude(), with unitary frame and hop sizes, is used to find the maximum RMS value while each note is played; with a frame of a single sample, the RMS reduces to the absolute value of that sample, so its maximum is the peak amplitude. Then, an assert check verifies that the value obtained for the first note corresponds to its true amplitude and that the value obtained for the second note corresponds to the sum of the true amplitudes of the two notes, since, at that time, both notes are being reproduced.
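The reasoning behind this check can be reproduced with NumPy alone. The sketch below does not call the synthesizer or `get_amplitude`; it only assumes that an RMS computed over frames of one sample equals the absolute value of each sample.

```python
import numpy as np

fs = 22050
t = np.arange(fs) / fs                                # one second of overlap
a4 = 0.5 * np.sin(2 * np.pi * 440.0 * t)              # A4, amplitude 0.5
c6 = 0.7 * np.sin(2 * np.pi * 1046.5 * t)             # C6, amplitude 0.7


def rms_unit_frame(x):
    """RMS with frame size 1: sqrt(mean(x**2)) over one sample is |x|."""
    return np.sqrt(x ** 2)


peak_single = rms_unit_frame(a4).max()                # A4 alone
peak_mix = rms_unit_frame(a4 + c6).max()              # both notes together
print(round(float(peak_single), 3), round(float(peak_mix), 3))
```

Since the two frequencies are not harmonically related, their maxima nearly coincide at some instant within the overlap, so the peak of the mix comes close to 0.5 + 0.7 = 1.2. This is why a tolerance is needed in the assert check.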


In [None]:
comp_polyphonic = (
    ('A4', 0.5, 2, 0), 
    ('C6',0.7, 1,0.5))

onsets_polyphonic = [0, 0.5]

# Save true notes amplitude in a list
polyphonic_amp = []
for note in comp_polyphonic:
    polyphonic_amp.append(note[1])

# ---Synthesize composition---
# Create synthesizer
fs=22050
my_synthesizer = synthesizer(fs)

# Generate sin wavetable
my_synthesizer.wavetable_generate(np.sin) 

# Change envelope
my_synthesizer.define_adsr_envelope(attack=0.0, decay=0, sustain=1, release=0, height=1) 

# Create audio vector
synth_polyphonic = my_synthesizer.synthesize(comp_polyphonic) 

# Play the output signal
display(Audio(synth_polyphonic, rate=fs))

# Compute amplitude of detected notes to get the maximum RMS values of the signal
rms_polyphonic = get_amplitude(synth_polyphonic, fs, onsets_polyphonic, 1, 1, False)

# Test
assert np.isclose(rms_polyphonic[0], polyphonic_amp[0], atol=0.05)
assert np.isclose(rms_polyphonic[1], polyphonic_amp[0]+polyphonic_amp[1], atol=0.05)
print('Polyphony test completed successfully')

## Some Interesting results

In [None]:
# Create synthesizer
fs = 48000
interesting_synth = synthesizer(fs=fs)

### Minuet different versions

The original version of Minuet is played on a viola. Here it is synthesized with two different timbres:
- piano
- sawtooth

In [None]:
# synthesizer settings
interesting_synth.wavetable_generate(sawtooth, width=0.8) 
interesting_synth.define_adsr_envelope(attack=0.1, decay=0.4, sustain=0.2, release=0.2, height=1) 

synth_minuet = interesting_synth.synthesize(comp_minuet)

synth_piano_minuet = interesting_synth.piano(comp_minuet, fs_minuet)

print("Original Viola version:")
display(Audio(s_minuet, rate=fs_minuet))
print("Synthesized Piano version:")
display(Audio(synth_piano_minuet, rate= fs_minuet))
print("Synthesized Sawtooth version:")
display(Audio(synth_minuet, rate = fs_minuet))

The result is not perfect, because the onset detection for this recording did not reach an F-measure of 1. 

### Simple boring version of telemann

In [None]:
interesting_synth.wavetable_generate(np.sin) 
interesting_synth.define_adsr_envelope(attack=0.1, decay=0.4, sustain=0.01, release=0.1, height=1) 

synth_boring_telemann = interesting_synth.synthesize(comp_telemann)

print("Original Version:")
display(Audio(s_telemann, rate=fs_telemann))
print("Boring Version:")
display(Audio(synth_boring_telemann, rate=fs_telemann))

The Telemann audio is synthesized with a sine wavetable only, so, as expected, without harmonics the sound is not very interesting. However, in contrast to the Minuet audio, the onset detection is perfect for Telemann, which gives better results after synthesis. 

### Gurenge lower tone plus its delayed version

In [None]:
# Composition Change
# lower one octave
interesting_synth.wavetable_generate(sawtooth, width=0.5) 
interesting_synth.define_adsr_envelope(attack=0.01, decay=0.4, sustain=0.01, release=0.1, height=1) 
comp_oct_gurenge = interesting_synth.octave_change(comp_gurenge, change=-1)
synth_adsr_gurenge = interesting_synth.synthesize(comp_oct_gurenge) 
# delay 0.1 seconds
interesting_synth.wavetable_generate(sawtooth, width=1) 
interesting_synth.define_adsr_envelope(attack=0.2, decay=0.4, sustain=0.1, release=0.1, height=1) 
comp_delay_gurenge = interesting_synth.delay_t(comp_gurenge, t_delay=0.1) 
synth_delay_gurenge = interesting_synth.synthesize(comp_delay_gurenge)

synth_final_gurenge = interesting_synth.add_two_signal(synth_delay_gurenge, synth_adsr_gurenge)
print("Original Version")
display(Audio(s_gurenge, rate=fs_gurenge))
print("Delayed version one octave lower")
display(Audio(synth_final_gurenge, rate=fs_gurenge))

The Gurenge audio is synthesized twice, once lowered by one octave and once delayed by 0.1 seconds, and the two signals are added, which produces an echo-like effect. The lower octave also helps to distinguish the result from the original. 
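The effect of adding a delayed copy can be sketched at the sample level. Note that this differs from the notebook, where `delay_t` is applied to the composition and the signals are combined with `add_two_signal`; the functions `delay_signal` and `mix` below are hypothetical helpers, and the test signal is a constant so that the result is easy to inspect.

```python
import numpy as np


def delay_signal(x, fs, t_delay):
    """Prepend t_delay seconds of silence to the signal."""
    return np.concatenate([np.zeros(int(round(t_delay * fs))), x])


def mix(a, b):
    """Sample-wise sum of two signals, zero-padding the shorter one."""
    out = np.zeros(max(len(a), len(b)))
    out[:len(a)] += a
    out[:len(b)] += b
    return out


fs = 1000
dry = np.ones(500)                                   # 0.5 s test signal
echo = 0.5 * delay_signal(dry, fs, t_delay=0.1)      # quieter copy, 0.1 s later
wet = mix(dry, echo)
print(len(wet), wet[0], wet[100], wet[550])
```

The mixed signal is longer than the original by the delay time: it contains only the dry signal at first, then both signals, and finally only the tail of the delayed copy.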

### Silhouette Polyphonic

Below, the Silhouette audio is combined with a hand-composed secondary composition, resulting in a polyphonic piano audio.

In [None]:
# Making Secondary Composition
Beat = 150 # Beat value needs to be manually adjusted
Quaver_duration = (60/Beat)/2 # each note is a quaver (half a beat)

silhouette_sec_amplitude = 0.05 
silhouette_sec_notes =('F#2', 'D2', 'F#2', 'A2', 'D3', 'A2', 'F#2','D2',
                  'G2', 'D2', 'G2','B2', 'D3', 'B2', 'G2', 'D2',
                  'A2', 'E2', 'A2', 'E3', 'A3', 'E3', 'A2', 'E2',
                  'A#2', 'F#2', 'A#2', 'F#3', 'A#3', 'F#3', 'A#2', 'F#2',
                  'B2','F#2', 'C3', 'F#3', 'B3', 'F#3', 'B2', 'F#2',
                  'F#2', 'A2', 'D3', 'A3', 'D4', 'A3', 'D3', 'A2')

comp_silhouette_sec=[]
prev_note_start = silhouette_onsets[0]
for note in silhouette_sec_notes:
    features =[]
    features.append(note) # note

    features.append(silhouette_sec_amplitude) # amplitude
    features.append(Quaver_duration) # duration
    features.append(prev_note_start)

    prev_note_start = prev_note_start+Quaver_duration
    comp_silhouette_sec.append(features)

# Synthesizing
# Main Composition
print("Silhouette main comp")
synth_silhouette = interesting_synth.piano(composition = comp_silhouette, fs=fs_silhouette)
display(Audio(synth_silhouette, rate=fs_silhouette))

# Secondary Composition
print("Silhouette second comp")
synth_silhouette_sec=interesting_synth.piano(composition=comp_silhouette_sec, fs=fs_silhouette) 
display(Audio(synth_silhouette_sec, rate=fs_silhouette))

# Complete Composition
synth_silhouette_complete=interesting_synth.add_two_signal(synth_silhouette_sec, synth_silhouette)
print("Silhouette complete")
display(Audio(synth_silhouette_complete, rate=fs_silhouette))

The polyphony makes the sound much richer, as can be heard when the two audios are mixed. 

### Fast Piano version of Naruto two octaves lower

In [None]:
# lower two octaves
comp_new_naruto = interesting_synth.octave_change(comp_naruto, change=-2)
comp_new_naruto = interesting_synth.define_beat(comp_new_naruto, new_beat=240, old_beat=naruto_tempo)
synth_new_naruto = interesting_synth.piano(composition=comp_new_naruto, fs=fs_naruto)

print("Original Version:")
display(Audio(s_naruto, rate=fs_naruto))

print("Fast synthesized piano version two octaves lower:")
display(Audio(synth_new_naruto, rate=fs_naruto))

Some compositions sound better with low-frequency notes, which is the case of the Naruto composition: it is synthesized at 240 BPM and two octaves lower.

# References:

Audios:<br />
https://www.youtube.com/watch?v=S8RZJ8GndBA <br/>
https://www.youtube.com/watch?v=qGjFGaWJPi8 <br/>
https://www.youtube.com/watch?v=qdvYO8HgbEQ <br/>
https://www.youtube.com/watch?v=9mIUEN1GIRw 

Single Piano Notes Collection:<br />
https://github.com/plemaster01/PythonPiano