<div align="right"><i>COM418 - Computers and Music</i></div>
<div align="right"><a href="https://people.epfl.ch/paolo.prandoni">Lucie Perrotta</a>, <a href="https://www.epfl.ch/labs/lcav/">LCAV, EPFL</a></div>

<p style="font-size: 30pt; font-weight: bold; color: #B51F1F;">Channel Vocoder</p>

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import Audio
from IPython.display import IFrame
from scipy import signal

import import_ipynb
from Helpers import * 

figsize=(10,5)
import matplotlib
matplotlib.rcParams.update({'font.size': 16});

In [None]:
fs=44100

In this notebook, we will implement and test a simple **channel vocoder**. A channel vocoder is a musical device that lets a performer sing while playing notes on a keyboard. The vocoder blends the voice (called the modulator) with the notes played on the keyboard (called the carrier) so that the resulting voice sings the notes played on the keyboard. The result has a robotic, artificial sound that is popular in electronic music, with notable uses by bands such as Daft Punk and Kraftwerk.

<img src="https://www.bhphotovideo.com/images/images2000x2000/waldorf_stvc_string_synthesizer_1382081.jpg" alt="Drawing" style="width: 35%;"/>

The implementation of a channel vocoder is in fact quite simple. It takes two inputs, the carrier and the modulator signals, which must have the same length. It divides each signal into frequency bands called **channels** (hence the name) using a bank of parallel bandpass filters. The channels can have equal widths, or logarithmically spaced widths that match the way the human ear perceives frequency. For each channel, the envelope of the modulator signal is then computed, for instance using a rectifier and a moving average. This envelope is multiplied with the carrier signal of the same channel, and all channels are then summed back together.

<img src="https://i.imgur.com/aIePutp.png" alt="Drawing" style="width: 65%;"/>
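
The block diagram can also be summarized in equation form. The notation below is not part of the original material and is only an assumption made for illustration: $m[n]$ is the modulator, $c[n]$ the carrier, $h_k[n]$ the impulse response of the $k$-th bandpass filter, $K$ the number of channels, and $L$ the length of the moving average. With a rectifier followed by a moving average as envelope detector, a sketch of the processing is:

$$
e_k[n] = \frac{1}{L}\sum_{l=0}^{L-1} \bigl|(h_k * m)[n-l]\bigr|, \qquad
y[n] = \sum_{k=1}^{K} e_k[n]\,(h_k * c)[n]
$$

Each channel of the carrier is therefore scaled, sample by sample, by how much energy the voice currently has in that same channel.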

To improve the intelligibility of the speech, it is also possible to add white noise (such as AWGN) to the carrier of each band, which helps produce unvoiced sounds such as "s" or "f".

As an example signal to test our vocoder with, we are going to use dry voice samples from the song "Nightcall" by French artist Kavinsky.

![Nightcall](https://upload.wikimedia.org/wikipedia/en/5/5b/Kavinsky_Nightcall_2010.png)

First, let's listen to the original song: 

In [None]:
IFrame(src="https://www.youtube.com/embed/46qo_V1zcOM?start=30", width="560", height="315", frameborder="0", allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture")

## 1. The modulator and the carrier signals


We are now going to recreate the lead vocoder part using two signals: a modulator signal, which is a voice pronouncing the lyrics, and a carrier signal, which is a synthesizer playing the notes that set the pitch.

### 1.1. The modulator

Let's first import the modulator signal. It is simply the lyrics spoken at the right rhythm. There is no need to sing or pay attention to the pitch: only the pronunciation and the rhythm of the text matter. Note that the voice sample is available for free on **Splice**, an online resource for audio production.

In [None]:
nightcall_modulator = open_audio('snd/nightcall_modulator.wav')
Audio('snd/nightcall_modulator.wav', autoplay=False)

### 1.2. The carrier

Second, we import a carrier signal, which is simply a synthesizer playing the chords that are going to be used for the vocoder. Note that the carrier signal does not need to feature silent parts, since the modulator's silences will automatically mute the final vocoded track. The carrier and the modulator simply need to be in sync with each other.

In [None]:
nightcall_carrier = open_audio('snd/nightcall_carrier.wav')
Audio("snd/nightcall_carrier.wav", autoplay=False)

## 2. The channel vocoder

### 2.1. The channeler

Let's now start implementing the channel vocoder. The first tool we need is an efficient filter to decompose both the carrier and the modulator signals into channels (or bands). Let's call this function the **channeler**, since it decomposes the input signals into frequency channels. It takes as input the signal to be filtered, an integer representing the number of bands, and a boolean that controls whether white noise is added to each band (used for the carrier).

In [None]:
def channeler(x, n_bands, add_noise=False):
    """
    Separate a signal into log-sized frequency channels.
    x: the input signal
    n_bands: the number of frequency channels
    add_noise: whether to add white noise to each channel
    """
    band_freqs = np.logspace(2, 14, n_bands+1, base=2) # get all the limits between the bands, in log space
    
    x_bands = np.zeros((n_bands, x.size)) # Placeholder for all bands
    
    for i in range(n_bands):
        noise = 0.7*np.random.random(x.size) if add_noise else 0 # Uniform white noise in [0, 0.7), or no noise
        x_bands[i] = butter_pass_filter(x + noise, np.array((band_freqs[i], band_freqs[i+1])), fs, btype="band", order=5).astype(np.float32) # Band-pass the signal (plus noise, if any)

    return x_bands
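
The `butter_pass_filter` helper comes from `Helpers` and is not shown in this notebook. As a rough illustration only, the sketch below shows what one band of the channeler might look like, assuming a Butterworth design from `scipy.signal.butter` in second-order sections. The names `bandpass_sketch` and `fs_demo` are hypothetical and are not part of `Helpers`. With 8 bands, the limits go from 4 Hz to 16384 Hz in steps of 1.5 octaves.

```python
import numpy as np
from scipy import signal

def bandpass_sketch(x, low_hz, high_hz, fs, order=5):
    """Hypothetical stand-in for butter_pass_filter(..., btype="band")."""
    # Second-order sections are used here for numerical robustness (an assumption,
    # the real helper may be implemented differently)
    sos = signal.butter(order, [low_hz, high_hz], btype="band", fs=fs, output="sos")
    return signal.sosfilt(sos, x)

fs_demo = 44100  # assumed sampling rate, same value as fs above

# Same band limits as in channeler for n_bands=8: 4 Hz to 16384 Hz
edges = np.logspace(2, 14, 8 + 1, base=2)

# Test signal: one tone inside the band 256 Hz to about 724 Hz, one tone outside
t = np.arange(fs_demo) / fs_demo
x = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 3000 * t)
band = bandpass_sketch(x, edges[4], edges[5], fs_demo)

# With a one-second signal, FFT bin k corresponds to k Hz
spectrum = np.abs(np.fft.rfft(band))
print("440 Hz bin:", spectrum[440], " 3000 Hz bin:", spectrum[3000])
```

The 440 Hz tone falls inside the selected band and passes almost unchanged, while the 3000 Hz tone is strongly attenuated.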

In [None]:
# Example plot
plt.figure(figsize=figsize)
plt.magnitude_spectrum(nightcall_carrier)
plt.title("Carrier signal before channeling")
plt.xscale("log")
plt.xlim(1e-4)
plt.show()

carrier_bands = channeler(nightcall_carrier, 8, add_noise=True)
plt.figure(figsize=figsize)
for i in range(8):
    plt.magnitude_spectrum(carrier_bands[i], alpha=.7)
plt.title("Carrier channels after channeling and noise addition")
plt.xscale("log")
plt.xlim(1e-4)
plt.show()    

### 2.2. The envelope computer

Next, we can implement a simple envelope computer. Given a signal, this function computes its temporal envelope.

In [None]:
def envelope_computer(x):
    """
    Envelope computation of one channel of the modulator
    x: the input signal
    """
    x = np.abs(x) # Rectify the signal to positive
    x = moving_average(x, 1000) # Smooth the signal
    return 3*x # Apply a fixed gain to the envelope
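
The `moving_average` helper is also imported from `Helpers` and not shown here. The sketch below is one possible way to write it, assuming a simple boxcar average computed with `np.convolve`. The name `moving_average_sketch` and the test tone are hypothetical. It also checks a known property: the rectified average of a sine of amplitude $A$ is $2A/\pi$.

```python
import numpy as np

def moving_average_sketch(x, width):
    """Hypothetical stand-in for the moving_average helper: boxcar smoothing."""
    kernel = np.ones(width) / width
    # mode="same" keeps the output the same length as the input
    return np.convolve(x, kernel, mode="same")

fs_demo = 44100  # assumed sampling rate
t = np.arange(fs_demo) / fs_demo
amplitude = 0.5
tone = amplitude * np.sin(2 * np.pi * 200 * t)

# Rectify, then smooth over 1000 samples (about 23 ms at 44.1 kHz)
envelope = moving_average_sketch(np.abs(tone), 1000)

print("measured:", envelope[fs_demo // 2], " expected about:", 2 * amplitude / np.pi)
```

Away from the edges of the signal, the smoothed envelope of a steady tone is nearly constant, which is what lets it act as a slowly varying gain on the carrier.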

In [None]:
plt.figure(figsize=figsize)
plt.plot(np.abs(nightcall_modulator)[:150000] , label="Modulator")
plt.plot(envelope_computer(nightcall_modulator)[:150000], label="Modulator envelope")
plt.legend(loc="best")
plt.title("Modulator signal and its envelope")
plt.show()

### 2.3. The channel vocoder (itself)

We can now implement the channel vocoder itself! It takes as input both signals presented above, as well as an integer controlling the number of channels (bands) of the vocoder. A larger number of channels results in a finer-grained vocoded sound, but also takes more time to compute. Some artists may deliberately use a lower number of bands to increase the artificial effect of the vocoder. Try playing with it!

In [None]:
def channel_vocoder(modulator, carrier, n_bands=32):
    """
    Channel vocoder
    modulator: the modulator signal
    carrier: the carrier signal
    n_bands: the number of bands of the vocoder (better to be a power of 2)
    """
    # Decompose both modulator and carrier signals into frequency channels
    modul_bands = channeler(modulator, n_bands, add_noise=False)
    carrier_bands = channeler(carrier, n_bands, add_noise=True)
    
    # Compute envelope of the modulator
    modul_bands = np.array([envelope_computer(modul_bands[i]) for i in range(n_bands)])

    # Multiply carrier and modulator
    result_bands = np.prod([modul_bands, carrier_bands], axis=0)

    # Merge back all channels together and normalize
    result = np.sum(result_bands, axis=0)
    return normalize(result) # Normalize
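
Before running the vocoder on the song, the same pipeline can be sanity-checked on synthetic signals that do not need any audio file or helper. The sketch below is only an illustration under stated assumptions: `toy_vocoder`, `fs_demo`, the band limits (100 Hz to 6400 Hz), the filter order and the window length are all hypothetical choices, not the settings used in this notebook. The modulator is silent during the first half second, so the output should be silent there too.

```python
import numpy as np
from scipy import signal

fs_demo = 16000  # assumed sampling rate for this toy example
t = np.arange(fs_demo) / fs_demo  # one second of audio

# Carrier: a 110 Hz sawtooth, rich in harmonics
carrier = signal.sawtooth(2 * np.pi * 110 * t)

# Modulator: white noise that is switched on only in the second half
rng = np.random.default_rng(0)
modulator = rng.standard_normal(t.size) * (t >= 0.5)

def toy_vocoder(modulator, carrier, fs, n_bands=8, win=400):
    """Minimal channel vocoder sketch: band-pass, envelope, multiply, sum."""
    edges = np.logspace(np.log2(100), np.log2(6400), n_bands + 1, base=2)
    kernel = np.ones(win) / win
    out = np.zeros_like(carrier)
    for low, high in zip(edges[:-1], edges[1:]):
        sos = signal.butter(4, [low, high], btype="band", fs=fs, output="sos")
        m_band = signal.sosfilt(sos, modulator)
        c_band = signal.sosfilt(sos, carrier)
        # Envelope of the modulator band: rectifier followed by a moving average
        env = np.convolve(np.abs(m_band), kernel, mode="same")
        out += env * c_band
    return out / np.max(np.abs(out))  # peak normalization

y = toy_vocoder(modulator, carrier, fs_demo)
print("peak in first 0.4 s:", np.max(np.abs(y[:6400])))
print("peak in last 0.4 s:", np.max(np.abs(y[-6400:])))
```

The carrier plays during the whole second, but it is only heard when the modulator has energy, which is the behavior described for the silent parts of the voice track.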

In [None]:
nightcall_vocoder = channel_vocoder(nightcall_modulator, nightcall_carrier, n_bands=32)
Audio(nightcall_vocoder, rate=fs)

The vocoded voice is still intelligible, and the lyrics remain easy to understand. However, the pitch of the voice now follows the chords played by the synthesizer! One can try to deactivate the noise (by passing `add_noise=False` for the carrier inside `channel_vocoder`) and compare the results. Finally, we plot the STFT of all three signals. Notice that the vocoded signal keeps the overall time-frequency shape of the voice (modulator) signal, but uses the frequency content of the carrier!

In [None]:
# Plot
f, t, Zxx = signal.stft(nightcall_modulator[:7*fs], fs, nperseg=1000)
plt.figure(figsize=figsize)
plt.pcolormesh(t, f[:100], np.abs(Zxx[:100,:]), cmap='nipy_spectral', shading='gouraud')
plt.title("Original voice (modulator)")
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()

f, t, Zxx = signal.stft(nightcall_vocoder[:7*fs], fs, nperseg=1000)
plt.figure(figsize=figsize)
plt.pcolormesh(t, f[:100], np.abs(Zxx[:100,:]), cmap='nipy_spectral', shading='gouraud')
plt.title("Vocoded voice")
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()

f, t, Zxx = signal.stft(nightcall_carrier[:7*fs], fs, nperseg=1000)
plt.figure(figsize=figsize)
plt.pcolormesh(t, f[:100], np.abs(Zxx[:100,:]), cmap='nipy_spectral', shading='gouraud')
plt.title("Carrier")
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()

## 3. Playing it together with the music

Finally, let's try to play it with the background music to see if it sounds like the original!

In [None]:
nightcall_instru = open_audio('snd/nightcall_instrumental.wav')

nightcall_final = nightcall_vocoder + 0.6*nightcall_instru
nightcall_final = normalize(nightcall_final) # Normalize

Audio(nightcall_final, rate=fs)