<a href="https://colab.research.google.com/github/Bateyjosue/AI-Saturday/blob/main/hw1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Style Transfer Using a Linear Transformation

Here we have three pieces of audio. The first two are `Synth.wav` (audio A) and `Piano.wav` (audio B), which are recordings of a chromatic scale in a single octave played by a synthesizer and a piano respectively. The third piece of audio is the intro melody of “Blinding Lights” (audio C) by The Weeknd, played with the same synth tone used to generate `Synth.wav`.

All audio files are in the `hw1/audio` folder. 

From these files, you can obtain the spectrograms $M_{A}$, $M_{B}$ and $M_{C}$. Your objective is to find the spectrogram of the piano version of the song “Blinding Lights” ($M_{D}$).

In this problem, we assume that style can be transferred using a linear transformation. Formally, we need
to find the matrix **T** such that:
$$
TM_{A} \approx M_{B}
$$
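
One common way to make this approximation precise is to pick **T** as the least-squares minimizer of the Frobenius error. The following is a sketch of that reasoning, not a prescribed method; $M_{A}^{+}$ denotes the Moore-Penrose pseudoinverse of $M_{A}$, a notation the problem statement itself does not use:

$$
T^{*} = \arg\min_{T} \lVert TM_{A} - M_{B} \rVert^{2}_{F} = M_{B} M_{A}^{+}
$$

When $M_{A}$ has full row rank, $M_{A}^{+} = M_{A}^{\top}(M_{A}M_{A}^{\top})^{-1}$ and the minimizer is unique. Otherwise $M_{B}M_{A}^{+}$ is the minimizer with the smallest Frobenius norm.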

1. Write code to determine matrix **T** and report the value of $\lVert TM_{A} - M_{B} \rVert^{2}_{F}$.
**Submit the matrix T as `problem3t.csv` and your code**
2. Our model assumes that **T** can transfer style from synthesizer music to piano music. Applying **T** to $M_{C}$ should give an estimate of “Blinding Lights” played on the piano, that is, an estimate of $M_{D}$. Using this matrix and the phase matrix of C, synthesize an audio signal.
**Submit your code, your estimate of the matrix $M_{D}$ as `problem3md.csv` and the synthesized audio named `problem3.wav`**

---

In [None]:
from google.colab import drive

drive.mount("/content/gdrive")
%cd /content/gdrive/MyDrive/AI Saturnday/hw1

In [2]:
import numpy as np
import librosa

In [3]:
# Audio A spectrogram (magnitude and phase)
audioA, sr = librosa.load('./audio/Synth.wav', sr=None)
spectoA = librosa.stft(audioA, n_fft=2048, hop_length=256, center=False, win_length=2048)
MA = np.abs(spectoA)
phase_A = spectoA / (MA + 2.2204e-16)  # small constant avoids division by zero

# Audio B spectrogram (magnitude and phase)
audioB, sr = librosa.load('./audio/Piano.wav', sr=None)
spectoB = librosa.stft(audioB, n_fft=2048, hop_length=256, center=False, win_length=2048)
MB = np.abs(spectoB)
phase_B = spectoB / (MB + 2.2204e-16)

# Audio C spectrogram (magnitude and phase)
# hop_length must be 256 here as well, to match A, B and the later istft call
audioC, sr = librosa.load('./audio/BlindingLights.wav', sr=None)
spectoC = librosa.stft(audioC, n_fft=2048, hop_length=256, center=False, win_length=2048)
MC = np.abs(spectoC)
phase_C = spectoC / (MC + 2.2204e-16)
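
The split of each STFT into a magnitude and a unit-modulus phase factor can be checked on a small synthetic complex matrix. This is a minimal NumPy-only sketch; the helper name `split_mag_phase` and the toy matrix are illustrative assumptions and not part of the assignment code.

```python
import numpy as np

EPS = np.finfo(np.float64).eps  # about 2.2204e-16, the constant used above


def split_mag_phase(spec):
    """Split a complex spectrogram into magnitude and phase factor (hypothetical helper)."""
    mag = np.abs(spec)
    phase = spec / (mag + EPS)  # EPS avoids division by zero in silent bins
    return mag, phase


# Toy complex "spectrogram" standing in for a real STFT (assumption for illustration)
rng = np.random.default_rng(0)
toy_spec = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
mag, phase = split_mag_phase(toy_spec)

# Multiplying the two parts back together recovers the original matrix,
# which is why MD * phase_C can later be passed to an inverse STFT.
reconstruction_ok = np.allclose(mag * phase, toy_spec)
```

Because the phase factor has modulus close to one, any change applied to the magnitude alone leaves the timing information of the original signal untouched.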

In [5]:
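# Least-squares solution of T @ MA ≈ MB via the pseudoinverse of MA
T = np.matmul(MB, np.linalg.pinv(MA))
np.savetxt('problem3t.csv', T, delimiter=',')  # comma-separated, as the .csv name implies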
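
As a cross-check on the pseudoinverse approach, the same least-squares problem can be handed to `np.linalg.lstsq` after transposing, since $TM_{A} ≈ M_{B}$ is equivalent to $M_{A}^{\top}T^{\top} ≈ M_{B}^{\top}$. The sketch below uses small random matrices as stand-ins for the real spectrograms; the names `A_toy` and `B_toy` and their shapes are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A_toy = rng.random((5, 20))  # stand-in for MA: 5 frequency bins x 20 frames
B_toy = rng.random((5, 20))  # stand-in for MB, same shape as A_toy

# Route 1: pseudoinverse, as in the cell above
T_pinv = B_toy @ np.linalg.pinv(A_toy)

# Route 2: solve A_toy.T @ X = B_toy.T in the least-squares sense, then transpose
X, *_ = np.linalg.lstsq(A_toy.T, B_toy.T, rcond=None)
T_lstsq = X.T

same_solution = np.allclose(T_pinv, T_lstsq)
```

Both routes require the two spectrograms to have the same number of frames, which holds here only if the two recordings have the same length.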


In [12]:
# Squared Frobenius norm of the residual, as requested in part 1
residual = np.matmul(T, MA) - MB
error = np.linalg.norm(residual, ord='fro') ** 2

print(T.shape)

(1025, 1025)
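
Part 1 asks for the squared Frobenius norm, which is the sum of the squared entries of the residual, so the value returned by `np.linalg.norm(..., ord='fro')` has to be squared. The following sketch illustrates this on toy matrices (all names and shapes here are assumptions for illustration) and also checks that perturbing the pseudoinverse solution does not lower the error.

```python
import numpy as np

rng = np.random.default_rng(1)
A_toy = rng.random((4, 10))  # stand-in for MA
B_toy = rng.random((4, 10))  # stand-in for MB
T_toy = B_toy @ np.linalg.pinv(A_toy)

residual = T_toy @ A_toy - B_toy

# Two equivalent ways to obtain the squared Frobenius norm
err_from_norm = np.linalg.norm(residual, ord="fro") ** 2
err_from_sum = np.sum(residual ** 2)

# A slightly perturbed T should give an error at least as large,
# because T_toy minimizes the least-squares objective.
T_perturbed = T_toy + 0.01 * rng.standard_normal(T_toy.shape)
err_perturbed = np.linalg.norm(T_perturbed @ A_toy - B_toy, ord="fro") ** 2
```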


In [14]:
MD = np.matmul(T, MC)
np.savetxt('problem3md.csv', MD, delimiter=',')  # comma-separated, as the .csv name implies

In [8]:
import soundfile as sf

MD_signal = librosa.istft(MD * phase_C, hop_length=256, center=False, win_length=2048)

# Write with the sample rate returned by librosa.load instead of a hard-coded value
sf.write('problem3.wav', MD_signal, sr)

### Bonus: Listen to your reconstructed signal



In [9]:
import IPython.display as ipd
ipd.Audio('problem3.wav')