<a href="https://colab.research.google.com/github/magenta/ddsp/blob/main/ddsp/colab/tutorials/4_core_functions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


##### Copyright 2021 Google LLC.

Licensed under the Apache License, Version 2.0 (the "License");





In [None]:
# Copyright 2021 Google LLC. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

# DDSP Core Functions

This notebook provides some simple demonstrations of using DDSP primitives for synthesis, filtering, and interpolation. Keep in mind that all of these components are fully differentiable and can be integrated with neural networks and end-to-end learning.

While the preferred API is to use the Synthesizer and Effect Processors that are built around these central components, it is of course possible to call the core functions directly as well.

In [None]:
#@title Install and import dependencies

%tensorflow_version 2.x
!pip install -qU ddsp

# Ignore a bunch of deprecation warnings
import warnings
warnings.filterwarnings("ignore")

import ddsp
import ddsp.training
from ddsp.colab.colab_utils import (play, specplot, transfer_function, 
                                    plot_impulse_responses, DEFAULT_SAMPLE_RATE)
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds

sample_rate = DEFAULT_SAMPLE_RATE  # 16000

# Generation

## `oscillator_bank()`



Synthesize audio with an array of sinusoidal oscillators. Frequencies and amplitudes must be provided at audio rate.
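
As a rough illustration of the idea (not DDSP's actual implementation), an oscillator bank can be sketched in plain Python by accumulating each component's phase from its per-sample frequency and summing the amplitude-scaled sines. The function name `oscillator_bank_sketch` and the choice to silence components above Nyquist are assumptions made for this sketch.

```python
import math

def oscillator_bank_sketch(frequencies, amplitudes, sample_rate):
  """Sum of sinusoids from per-sample frequencies (Hz) and amplitudes.

  frequencies, amplitudes: nested lists of shape [n_samples][n_components].
  """
  n_components = len(frequencies[0])
  phases = [0.0] * n_components
  audio = []
  for freqs, amps in zip(frequencies, amplitudes):
    sample = 0.0
    for k in range(n_components):
      # Accumulate phase: angular frequency in radians per sample.
      phases[k] += 2.0 * math.pi * freqs[k] / sample_rate
      # Assumption for this sketch: silence components above Nyquist.
      if freqs[k] < sample_rate / 2.0:
        sample += amps[k] * math.sin(phases[k])
    audio.append(sample)
  return audio

sr = 16000
n = 64
freqs = [[220.0, 440.0, 660.0]] * n
amps = [[0.3, 0.3, 0.3]] * n
audio_sketch = oscillator_bank_sketch(freqs, amps, sr)
```

Because the phase is accumulated sample by sample, the frequency can change over time without causing discontinuities in the waveform, which is why the controls are needed at audio rate.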


### Ex: Simple harmonic sound

In [None]:
n_samples = int(sample_rate * 4.0)
n_components = 3

# Amplitudes [n_batch, n_samples, n_components].
# Linearly decay in time.
amps = np.linspace(0.3, 0.0, n_samples)
amps = np.tile(amps[np.newaxis, :, np.newaxis], [1, 1, n_components])

# Frequencies in Hz [n_batch, n_samples, n_components].
frequencies = np.ones([1, n_samples, 1]) * np.array([[[220, 440, 660]]])

# Synthesize.
audio = ddsp.core.oscillator_bank(frequencies, amps, sample_rate)

# Listen.
play(audio)
specplot(audio)

### Ex: Random frequencies

In [None]:
n_samples = int(sample_rate * 4.0)
n_components = 6
n_frames = 100

# Amplitudes [n_batch, n_samples, n_components].
# Linearly decay in time.
amps = np.linspace(0.3, 0.0, n_samples)
amps = np.tile(amps[np.newaxis, :, np.newaxis], [1, 1, n_components])

# Frequencies in Hz [n_batch, n_samples, n_components].
frequencies = []
for _ in range(n_components):
  f_start = np.random.uniform(20, 4000)
  f_end = np.random.uniform(20, 4000)
  frequencies.append(np.linspace(f_start, f_end, n_frames))
frequencies = np.stack(frequencies).T[np.newaxis, ...] 
frequencies = ddsp.core.resample(frequencies, n_samples)

# Synthesize.
audio = ddsp.core.oscillator_bank(frequencies, amps, sample_rate)
audio /= np.abs(audio).max()

# Listen.
play(audio)
specplot(audio)

### Ex: Swarm of sinusoids

Just for fun...

In [None]:
def smooth(x, window_size=2000):
  """Smooth signal with box filter. For random frequency modulation."""
  window = np.ones(window_size) / window_size
  return np.convolve(window, x, mode='same')

n_samples = int(sample_rate * 6.0)
n_components = 100

# Time points for the frequency ramp.
n_start = int(sample_rate * 1.5)
n_stop = int(sample_rate * 4.0)
n_ramp = n_stop - n_start
n_level = n_samples - n_stop

# Amplitudes [n_batch, n_samples, n_components].
# Decrease amplitude for higher components.
amps = np.ones([1, n_samples, 1])
amps = amps * np.logspace(0, -2, n_components)[np.newaxis, np.newaxis, :]
# Fade in at the start, out at end.
amps[:, :n_start, :] *= np.logspace(-2, 0, n_start)[np.newaxis, :, np.newaxis]
amps[:, -2000:, :] *= np.logspace(0, -2, 2000)[np.newaxis, :, np.newaxis]

# Frequencies in Hz [n_batch, n_samples, n_components].
# Sweep frequencies from random initial frequencies to fixed final frequencies.
freq_initial = np.random.uniform(low=240.0, high=280.0, size=10)
harmonics = np.arange(1, 11)
f0 = np.array([0.5, 1, 1, 2.5, 3, 3.5, 4, 4.5, 5, 5.25])
freq_final = 150 * f0

# Treat each frequency sweep separately.
frequencies = []

for i, f in zip(freq_initial, freq_final):
  # Sweep the frequency.
  freq = np.concatenate(
      [i * np.ones(n_start), np.linspace(i, f, n_ramp), f * np.ones(n_level),])
  # Modulate the frequency.
  d_freq = smooth(np.concatenate([
      np.random.uniform(low=0.1, high=1.9, size=n_start),
      np.random.uniform(low=0.5, high=1.5, size=n_ramp + n_level),
  ]))
  freq *= d_freq
  # Add harmonics for each fundamental.
  frequencies.append([freq * h for h in harmonics])

# Rearrange to [n_batch, n_samples, n_components].
frequencies = np.transpose(np.stack(frequencies), (2, 1, 0))
frequencies = np.reshape(frequencies, [1, n_samples, -1])

In [None]:
# Synthesize.
audio = ddsp.core.oscillator_bank(frequencies, amps, sample_rate)

# Listen.
audio /= np.abs(audio).max()
play(audio)
specplot(audio)

## `linear_lookup()`


Look up values from a wavetable using linear interpolation. The lookup phase must be in the range [0, 1] and, like the oscillator controls above, is provided at audio rate.


### Ex: Sinusoidal lookup

As a simple example, looking up from a sine-wave wavetable produces a sine wave at the lookup frequency.
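
To make the lookup itself concrete, here is a minimal pure-Python sketch of reading a table at fractional positions with linear interpolation. The helper `linear_lookup_sketch` is hypothetical, and the way it maps phase to a table index is an assumption; `ddsp.core.linear_lookup()` may handle the endpoints and wrap-around differently.

```python
import math

def linear_lookup_sketch(phase, wavetable):
  """Read a wavetable at fractional positions with linear interpolation.

  phase: list of floats in [0, 1]. wavetable: list of floats.
  """
  n = len(wavetable)
  out = []
  for p in phase:
    position = p * (n - 1)            # Fractional index into the table.
    lo = int(math.floor(position))
    hi = min(lo + 1, n - 1)
    frac = position - lo
    # Weighted average of the two neighboring table entries.
    out.append((1.0 - frac) * wavetable[lo] + frac * wavetable[hi])
  return out

n_table = 2048
table = [math.sin(2.0 * math.pi * i / (n_table - 1)) for i in range(n_table)]
lookup = linear_lookup_sketch([0.0, 0.25, 0.5, 0.75], table)
```

The four phases land at the zero crossings and the extrema of the sine, so the result is close to, but not exactly, `[0, 1, 0, -1]`; the small deviation comes from interpolating between table entries.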

In [None]:
n_samples = int(sample_rate * 1.0)
n_wavetable = 2048
n_cycles = 440

# Sin wave
wavetable = tf.sin(tf.linspace(0.0, 2.0 * np.pi, n_wavetable))
wavetable = wavetable[tf.newaxis, tf.newaxis, :]

phase = tf.linspace(0.0, n_cycles, n_samples) % 1.0
phase = phase[tf.newaxis, :, tf.newaxis]

output = ddsp.core.linear_lookup(phase, wavetable)
target = np.sin(np.linspace(0.0, 2.0 * np.pi * n_cycles, n_samples))

# For plotting.
output = output[0]
phase = phase[0, :, 0]

In [None]:
# Plot the results
plt.figure(figsize=(12, 6))

plt.subplot(121)
plt.plot(wavetable[0, 0, :])
plt.title('Wavetable')

plt.subplot(122)
plt.plot(output[:200], label='Output')
plt.plot(phase[:200], label='Oscillator Phase')
plt.plot(target[:200], label='Target')
plt.title('Wavetable lookup')
plt.legend(loc='lower left')

print('Target')
play(target)
print('Output')
play(output)

There are small artifacts due to the linear interpolation and implicit resampling of the signal.

In [None]:
plt.plot(target[:200] - output[:200])

You can also use any arbitrary waveform as the lookup signal to get more interesting outputs.

In [None]:
modulation = tf.linspace(0.0, 0.5, n_samples)
modulation = modulation[tf.newaxis, :, tf.newaxis]
phase2 = (tf.sin(np.pi * phase[tf.newaxis, :, tf.newaxis]) + modulation)**2.0 % 1.0
output2 = ddsp.core.linear_lookup(phase2, wavetable)

# For plotting
output2 = output2[0]
phase2 = phase2[0, :, 0]

# Plot the results
plt.figure(figsize=(6, 6))

plt.plot(output2[:200], label='Output')
plt.plot(phase2[:200], label='Oscillator Phase')
plt.title('Wavetable lookup')
plt.ylim(-1.5, 1.5)
plt.legend(loc='lower left')

print('Output')
play(output2)

### Ex: Wavetable Synthesis

We can also use this linear lookup to build a wavetable synthesizer. Here, we pass in a series of wavetables (one for each timestep) and look up from the changing wavetables over time.
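
The idea of a wavetable that changes at every timestep can be sketched without DDSP as a crossfade between two tables that are read at the same phase. The helper `morphing_lookup` is hypothetical, and for brevity it reads the nearest table entry below the phase rather than interpolating.

```python
import math

n_table = 64
sine = [math.sin(2.0 * math.pi * i / n_table) for i in range(n_table)]
square = [1.0 if s > 0.0 else -1.0 for s in sine]

def morphing_lookup(phase, table_a, table_b):
  """Look up from a table that linearly morphs from table_a to table_b."""
  n_samples = len(phase)
  out = []
  for t, p in enumerate(phase):
    mix = t / (n_samples - 1)                     # 0 at the start, 1 at the end.
    index = int(p * len(table_a)) % len(table_a)  # Nearest entry below the phase.
    out.append((1.0 - mix) * table_a[index] + mix * table_b[index])
  return out

n_samples = 400
n_cycles = 10
phase = [(n_cycles * t / n_samples) % 1.0 for t in range(n_samples)]
morphed = morphing_lookup(phase, sine, square)
```

The output starts as a pure sine and ends as a pure square wave, with every sample in between drawn from a table that is a weighted mix of the two.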

In [None]:
n_secs = 3
n_samples = int(sample_rate * n_secs)
n_wavetable = 2048
n_cycles = 110 * n_secs

phase = tf.linspace(0.0, n_cycles, n_samples) % 1.0
phase = phase[tf.newaxis, :, tf.newaxis]

# Sin wave
wavetable_sin = tf.sin(tf.linspace(0.0, 2.0 * np.pi, n_wavetable))
wavetable_sin = wavetable_sin[tf.newaxis, tf.newaxis, :]

# Square wave
wavetable_square = tf.cast(wavetable_sin > 0.0, tf.float32) * 2.0 - 1.0

# Combine them
wavetables = tf.concat([wavetable_sin, wavetable_square], axis=1)
wavetables = ddsp.core.resample(wavetables, n_samples)

wavetables *= 0.5

output_multiwave = ddsp.core.linear_lookup(phase, wavetables)

# For plotting
wavetables = wavetables[0]
output_multiwave = output_multiwave[0]
phase = phase[0, :, 0]

In [None]:
# Plot the results
plt.figure(figsize=(12, 6))

plt.subplot(121)
plt.plot(wavetables[0, :])
plt.plot(wavetables[16000, :])
plt.plot(wavetables[32000, :])
plt.title('Wavetable')

plt.subplot(122)
plt.plot(output_multiwave[:200], label='Output')
plt.plot(phase[:200], label='Oscillator Phase')
plt.title('Wavetable lookup')
plt.legend(loc='lower left')

print('Output')
play(output_multiwave)

## `wavetable_synthesis()`

We also have a convenience function to make wavetable synthesis easier. `wavetable_synthesis()` takes a frame-based frequency and amplitude of the oscillator.
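
One step such a convenience function has to perform is turning a frequency control into the lookup phase. A minimal sketch of that step, assuming the frequency has already been upsampled to audio rate, integrates the frequency and wraps it to [0, 1). The name `frequency_to_phase` is hypothetical and not part of DDSP.

```python
def frequency_to_phase(frequencies_hz, sample_rate):
  """Integrate per-sample frequency (Hz) into a phase in [0, 1)."""
  phase, accum = [], 0.0
  for f in frequencies_hz:
    phase.append(accum % 1.0)
    accum += f / sample_rate   # Cycles advanced during one sample.
  return phase

sr = 16000
freq = [110.0] * sr            # Constant 110 Hz for one second.
phase_sketch = frequency_to_phase(freq, sr)
# Each wrap of the phase back toward zero marks one completed cycle.
wraps = sum(1 for a, b in zip(phase_sketch, phase_sketch[1:]) if b < a)
```

The resulting phase can be used directly as the lookup position into a wavetable, and the amplitude control then scales the looked-up values.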

In [None]:
n_frames = 100
frequencies = 110 * tf.linspace(1.5, 1, n_frames)[tf.newaxis, :, tf.newaxis]
amplitudes = 0.5 * tf.linspace(0.7, 0.001, n_frames)[tf.newaxis, :, tf.newaxis]

n_secs = 3
n_samples = int(sample_rate * n_secs)
n_wavetable = 2048

# Sin wave
wavetable_sin = tf.sin(tf.linspace(0.0, 2.0 * np.pi, n_wavetable))
wavetable_sin = wavetable_sin[tf.newaxis, tf.newaxis, :]

# Square wave
wavetable_square = tf.cast(wavetable_sin > 0.0, tf.float32) * 2.0 - 1.0

# Combine them
wavetables = tf.concat([wavetable_sin, wavetable_square, wavetable_sin], axis=1)
wavetables = ddsp.core.resample(wavetables, n_samples)

output_multiwave = ddsp.core.wavetable_synthesis(frequencies, 
                                                 amplitudes, 
                                                 wavetables, 
                                                 n_samples=n_samples,
                                                 sample_rate=sample_rate)

# For plotting
wavetables = wavetables[0]
output_multiwave = output_multiwave[0]

In [None]:
# Plot the results
plt.figure(figsize=(12, 6))

plt.subplot(121)
plt.plot(wavetables[0, :])
plt.plot(wavetables[16000, :])
plt.plot(wavetables[32000, :])
plt.title('Wavetable')

print('Output')
play(output_multiwave)

## `variable_length_delay()`

If we instead treat a moving window as a "wavetable" we can implement a variable time delay in a forward pass using `linear_lookup()`. Variable time delays are the key component to time modulation effects such as [vibrato](https://en.wikipedia.org/wiki/Vibrato), [chorus](https://en.wikipedia.org/wiki/Chorus_effect), and [flanging](https://en.wikipedia.org/wiki/Flanging). 
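
A fractional, time-varying delay can be sketched in plain Python as reading the signal at a position in the past and interpolating between the two nearest samples. This is only an illustration of the concept; `variable_delay_sketch` is a hypothetical helper that takes delays in samples, whereas `variable_length_delay()` takes a phase in [0, 1] relative to `max_length`.

```python
import math

def variable_delay_sketch(audio, delay_samples):
  """Delay each sample by a time-varying, possibly fractional, amount."""
  out = []
  for t, d in enumerate(delay_samples):
    position = t - d                  # Where to read in the past.
    lo = math.floor(position)
    frac = position - lo
    # Samples outside the signal are treated as silence.
    a = audio[lo] if 0 <= lo < len(audio) else 0.0
    b = audio[lo + 1] if 0 <= lo + 1 < len(audio) else 0.0
    out.append((1.0 - frac) * a + frac * b)
  return out

ramp = [float(i) for i in range(10)]
delayed = variable_delay_sketch(ramp, [2.5] * 10)
```

With a constant delay of 2.5 samples, a ramp input comes out shifted by 2.5 once the delay line has filled. Modulating the delay over time stretches and compresses the signal, which is what produces the pitch wobble in vibrato.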

In [None]:
# Get a single example from NSynth.
# Takes a few seconds to load from GCS.
data_provider = ddsp.training.data.NSynthTfds(split='train')
batch = data_provider.get_batch(batch_size=1, shuffle=False).skip(1)
audio = next(iter(tfds.as_numpy(batch)))['audio']

specplot(audio)
play(audio)

In [None]:
n_samples = audio.shape[1]
n_seconds = n_samples / sample_rate

def sin_phase(mod_rate):
  phase = tf.sin(tf.linspace(0.0, mod_rate * n_seconds * 2.0 * np.pi, n_samples))
  phase = (phase[tf.newaxis, :, tf.newaxis] + 1.0) / 2.0  # Scale to [0, 1.0]
  return phase

### Ex. Flanger

In [None]:
mod_rate = 0.25  # Hz
mod_ms = 1.5
center_ms = 0.0
delay_ms = mod_ms + center_ms
max_length = int(sample_rate / 1000.0 * delay_ms)

phase = sin_phase(mod_rate) * (mod_ms / delay_ms) + (center_ms / delay_ms)

audio_wet = ddsp.core.variable_length_delay(phase, 
                                            audio, 
                                            max_length=max_length)

audio_out = 0.5 * (audio + audio_wet)

# Listen.
play(audio_out)
specplot(audio_out)

### Ex. Chorus

In [None]:
mod_rate = 2.0  # Hz
mod_ms = 1.0
center_ms = 25.0
delay_ms = mod_ms + center_ms
max_length = int(sample_rate / 1000.0 * delay_ms)

phase = sin_phase(mod_rate) * (mod_ms / delay_ms) + (center_ms / delay_ms)

audio_wet = ddsp.core.variable_length_delay(phase, 
                                            audio, 
                                            max_length=max_length)

audio_out = 0.5 * (audio + audio_wet)

# Listen.
play(audio_out)
specplot(audio_out)

### Ex. Vibrato

In [None]:
mod_rate = 1.0  # Hz
mod_ms = 20.0
center_ms = 0.0
delay_ms = mod_ms + center_ms
max_length = int(sample_rate / 1000.0 * delay_ms)

phase = sin_phase(mod_rate) * (mod_ms / delay_ms) + (center_ms / delay_ms)

audio_wet = ddsp.core.variable_length_delay(phase, 
                                            audio, 
                                            max_length=max_length)

audio_out = audio_wet

# Listen.
play(audio_out)
specplot(audio_out)

# Filtering


Time-varying differentiable linear filters (parameterized in frequency space). Impulse responses are designed by `sinc_impulse_response()` and `frequency_impulse_response()` and then applied by `fft_convolve()`.

`sinc_filter()` and `frequency_filter()` are thin wrappers around filter design and `fft_convolve()`.

## `fft_convolve()`


Time-varying filter. Given audio [batch, n_samples], and a series of impulse responses [batch, n_frames, n_impulse_response], splits the audio into frames, applies filters, and then overlap-and-adds audio back together. 
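
The frame-wise procedure described above can be sketched with direct (time-domain) convolution in place of FFTs. This is a simplified illustration under the assumption of non-overlapping rectangular frames; the helper `framewise_filter_sketch` is hypothetical, and `fft_convolve()` additionally handles batching, padding, and output cropping.

```python
def framewise_filter_sketch(audio, impulse_responses):
  """Filter each frame with its own impulse response, then overlap-add."""
  n_frames = len(impulse_responses)
  frame_size = -(-len(audio) // n_frames)        # Ceiling division.
  ir_size = len(impulse_responses[0])
  out = [0.0] * (len(audio) + ir_size - 1)
  for i, ir in enumerate(impulse_responses):
    start = i * frame_size
    frame = audio[start:start + frame_size]
    # Direct convolution; the tail spills into the next frame (overlap-add).
    for n, x in enumerate(frame):
      for k, h in enumerate(ir):
        out[start + n + k] += x * h
  return out

signal = [1.0] * 8
irs = [[1.0, 0.0], [0.5, 0.5]]   # Identity, then a two-tap average.
filtered_sketch = framewise_filter_sketch(signal, irs)
```

The first half of the signal passes through unchanged, while the second half is smoothed by the averaging filter; the dip at the frame boundary shows where the second filter starts from silence.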

### Ex: Low-pass sweep

In [None]:
# Low-pass sweep in normalized frequency [0, 1] (no sample_rate is passed).
noise = np.random.uniform(-0.5, 0.5, [1, sample_rate * 4])
f_cutoff = np.linspace(0., 1.0, 200)[np.newaxis, :, np.newaxis]

ir = ddsp.core.sinc_impulse_response(f_cutoff , 2048)
filtered = ddsp.core.fft_convolve(noise, ir)

specplot(noise)
specplot(filtered)
play(noise)
play(filtered)

## `sinc_impulse_response()`


Simple FIR low-pass filter design using sinc functions.
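
For readers unfamiliar with the technique, a windowed-sinc low-pass design can be sketched in a few lines: sample the ideal low-pass impulse response (a sinc) and taper it with a window. The helper `windowed_sinc_sketch` and the use of a Hann window are assumptions of this sketch; DDSP's own window choice and scaling may differ.

```python
import math

def windowed_sinc_sketch(cutoff, window_size):
  """Low-pass FIR taps for a normalized cutoff in (0, 1), where 1 is Nyquist."""
  half = (window_size - 1) / 2.0
  taps = []
  for n in range(window_size):
    t = n - half
    # Ideal low-pass impulse response: cutoff * sinc(cutoff * t).
    x = math.pi * cutoff * t
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    # Hann window limits the response to a finite length.
    window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (window_size - 1))
    taps.append(cutoff * sinc * window)
  return taps

taps = windowed_sinc_sketch(cutoff=0.5, window_size=101)
dc_gain = sum(taps)   # Should be close to 1 for a low-pass filter.
```

A longer window gives a sharper transition at the cutoff at the cost of a longer impulse response, which is the time-frequency tradeoff the examples below explore.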

### Ex: Brick-wall filter


In [None]:
# Brick-wall filter
f_cutoff = 4000
window_size = 2000

# True filter.
impulse_response = ddsp.core.sinc_impulse_response(f_cutoff, 
                                                   window_size, 
                                                   sample_rate)

# Ideal brick-wall filter
half_nyquist = int(window_size / 2)
desired_magnitudes = np.concatenate([np.ones([half_nyquist]), 
                                     np.zeros([half_nyquist]) + 1e-6], axis=0) 

plot_impulse_responses(impulse_response, desired_magnitudes)

In [None]:
## Normalized frequency [0, 1] works as well, without needing sample_rate.
f_cutoff = 0.5

# True filter.
impulse_response = ddsp.core.sinc_impulse_response(f_cutoff, window_size)

plot_impulse_responses(impulse_response, desired_magnitudes)

In [None]:
# Changing window size changes the time-frequency characteristics.
impulse_response = ddsp.core.sinc_impulse_response(f_cutoff, window_size=250)

plot_impulse_responses(impulse_response, desired_magnitudes)

## `sinc_filter()`


Thin wrapper around `sinc_impulse_response()` and `fft_convolve`. Filter audio with a low-pass filter.


### Ex: Bandlimited Upsampling

Let's start with a sawtooth wave at 100 Hz, sampled at 10kHz.

In [None]:
original_sample_rate = 10000
# One second of audio at the original sample rate.
n_samples = original_sample_rate + 1

# Let's start with a sawtooth wave at 100 Hz.
time = tf.linspace(0.0, 1.0, n_samples)
signal = (tf.linspace(0.0, 100.0, n_samples) % 1.0) - 0.5

# Look at FFT of signal.
frequencies, magnitudes = transfer_function(signal[tf.newaxis, tf.newaxis, :],
                                            sample_rate=original_sample_rate)

plt.figure(figsize=(12, 6))
plt.subplot(121)
plt.plot(time[:200], signal[:200])
plt.title('Amplitude (time)')
plt.subplot(122)
plt.semilogy(frequencies, magnitudes[0, 0, :])
plt.title('Magnitude (frequency)')

print('Original')
play(signal, sample_rate=original_sample_rate)

If we naively double the sample rate to 20kHz, we introduce upsampling artifacts.

In [None]:
upsample = 2
upsample_rate = int(original_sample_rate * upsample)
n_upsample = int(n_samples * upsample)

time_up = tf.linspace(0.0, 1.0, n_upsample)

# Box upsampling
signal_up = tf.compat.v1.image.resize_nearest_neighbor(
    signal[tf.newaxis, :, tf.newaxis, tf.newaxis], [n_upsample, 1]
    )[0, :, 0, 0]
frequencies_up, magnitudes_up = transfer_function(signal_up[tf.newaxis, tf.newaxis, :],
                                                      sample_rate=upsample_rate)

# Bilinear upsampling
signal_up_bl = ddsp.core.resample(signal[tf.newaxis, :, tf.newaxis], n_upsample)[0, :, 0]
frequencies_up_bl, magnitudes_up_bl = transfer_function(signal_up_bl[tf.newaxis, tf.newaxis, :],
                                                        sample_rate=upsample_rate)

plt.figure(figsize=(12, 6))
plt.subplot(121)
plt.semilogy(frequencies_up, magnitudes_up[0, 0, :], label='box upsample')
plt.semilogy(frequencies_up_bl, magnitudes_up_bl[0, 0, :], label='bilinear upsample')
plt.semilogy(frequencies, magnitudes[0, 0, :], label='original')
plt.ylim(1e-3, 1e4)
plt.title('Magnitude (frequency)')
plt.legend()

print('Box upsample')
play(signal_up, sample_rate=upsample_rate)
print('Bilinear upsample')
play(signal_up_bl, sample_rate=upsample_rate)
print('Original')
play(signal, sample_rate=original_sample_rate)

By applying a brick-wall low-pass filter as above, we can remove aliasing artifacts.

In [None]:
n_frequencies = 1024
half_nyquist = int(n_frequencies / 2)

# Normalized cutoff frequency, [n_batch, n_frames, 1].
cutoff_frequency = tf.ones([1, 1, 1]) * 0.5 

signal_filt = ddsp.core.sinc_filter(signal_up[tf.newaxis, :],
                               cutoff_frequency,
                               window_size=1024)[0]

frequencies_filt, magnitudes_filt = transfer_function(signal_filt[tf.newaxis, tf.newaxis, :],
                                                      sample_rate=upsample_rate)


plt.figure(figsize=(12, 6))
plt.subplot(121)
plt.semilogy(frequencies_up, magnitudes_up[0, 0, :], label='box upsample')
plt.semilogy(frequencies_filt, magnitudes_filt[0, 0, :], label='anti-aliased')
plt.ylim(1e-3, 1e4)
plt.title('Magnitude (frequency)')
plt.legend()

print('Box upsample')
play(signal_up, sample_rate=upsample_rate)
print('Anti-aliased')
play(signal_filt, sample_rate=upsample_rate)
print('Original')
play(signal, sample_rate=original_sample_rate)


## `frequency_impulse_response()`


FIR filter design method used by `ddsp.core.frequency_filter()`. Uses the frequency sampling method of filter design as described [here](https://ccrma.stanford.edu/~jos/sasp/Windowing_Desired_Impulse_Response.html).
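
The core of the frequency sampling method can be sketched as an inverse real DFT of the desired magnitudes (taken as a zero-phase spectrum), followed by a rotation that centers the symmetric response. The helper `frequency_sampling_sketch` is hypothetical, and the windowing step is left out of this sketch for brevity.

```python
import math

def frequency_sampling_sketch(magnitudes):
  """Impulse response from magnitudes sampled from 0 to Nyquist (inclusive)."""
  n_bins = len(magnitudes)
  n_fft = 2 * (n_bins - 1)
  ir = []
  for n in range(n_fft):
    # Inverse real DFT of a real, zero-phase spectrum.
    total = magnitudes[0] + magnitudes[-1] * math.cos(math.pi * n)
    for k in range(1, n_bins - 1):
      total += 2.0 * magnitudes[k] * math.cos(2.0 * math.pi * k * n / n_fft)
    ir.append(total / n_fft)
  # Rotate so the symmetric response is centered (causal form).
  half = n_fft // 2
  return ir[half:] + ir[:half]

flat = frequency_sampling_sketch([1.0] * 9)   # All-pass target.
```

A flat magnitude target yields a single centered impulse, i.e. a pure delay, which is a quick sanity check that the design leaves all frequencies untouched.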

### Ex: Arbitrary filter design


In [None]:
# Arbitrary magnitude response
n_frequencies = 512
half_nyquist = int(n_frequencies / 2)

# Bandpass filters, [n_batch, n_frames, n_frequencies].
magnitudes = (tf.linspace(1.0, 0.001, n_frequencies) + 
              0.1 * tf.sin(tf.linspace(0.0, 2.0 * np.pi * 8, n_frequencies)))
magnitudes = magnitudes[tf.newaxis, tf.newaxis, :]
desired_magnitudes = magnitudes[0, 0, :]

# Designed filter.
impulse_response = ddsp.core.frequency_impulse_response(magnitudes, window_size=0)

plot_impulse_responses(impulse_response, desired_magnitudes)

In [None]:
# Changing window size changes the time-frequency characteristics.
impulse_response = ddsp.core.frequency_impulse_response(magnitudes, 
                                                        window_size=80)

plot_impulse_responses(impulse_response, desired_magnitudes)

## `frequency_filter()`



Thin wrapper around `frequency_impulse_response()` and `fft_convolve`. Filter audio with a finite impulse response linear time-varying filter, designed using the frequency sampling method.


### Ex: Arbitrary time-varying filter

Let's start with a time-invariant filter. The `magnitudes` have a single frame and `n_frequencies` bands linearly spaced between 0 and Nyquist.

In [None]:
n_samples = int(sample_rate * 4.0)
n_frequencies = 1000

# White noise.
audio_in = tf.random.uniform([1, n_samples], -0.5, 0.5)

# Bandpass filters, [n_batch, n_frames, n_frequencies].
magnitudes = tf.sin(tf.linspace(0.0, 10.0, n_frequencies))**4.0
magnitudes = magnitudes[tf.newaxis, tf.newaxis, :]

# Filter.
audio_out = ddsp.core.frequency_filter(audio_in, magnitudes)

# Listen.
print('Original')
play(audio_in)
specplot(audio_in)

print('Filtered')
play(audio_out)
specplot(audio_out)

`ddsp.core.frequency_filter()` uses the frequency sampling method of filter design as described [here](https://ccrma.stanford.edu/~jos/sasp/Windowing_Desired_Impulse_Response.html).

Reducing `n_frequencies` thus reduces frequency resolution.

`window_size` crops the impulse responses to also determine the time-frequency tradeoff.
`window_size` must be > the fft_size which is the power of 2 >= `n_frequencies` * 2.
Setting `window_size` < 1, automatically sets it at `n_frequencies`.


In [None]:
# Fewer frequencies, less frequency resolution.
n_frequencies = 32

# Bandpass filters, [n_batch, n_frames, n_frequencies].
magnitudes = tf.sin(tf.linspace(0.0, 10.0, n_frequencies))**4.0
magnitudes = magnitudes[tf.newaxis, tf.newaxis, :]

# Filter.
audio_out = ddsp.core.frequency_filter(audio_in, magnitudes, window_size=0)

# Listen.
print('Less frequency resolution')
play(audio_out)
specplot(audio_out)

In [None]:
# Smaller window_size, less frequency resolution (more temporal resolution).
n_frequencies = 1000

# Bandpass filters, [n_batch, n_frames, n_frequencies].
magnitudes = tf.sin(tf.linspace(0.0, 10.0, n_frequencies))**4.0
magnitudes = magnitudes[tf.newaxis, tf.newaxis, :]

# Filter.
audio_out = ddsp.core.frequency_filter(audio_in, magnitudes, window_size=32)

# Listen.
print('Smaller window')
play(audio_out)
specplot(audio_out)

In [None]:
# Now let's try a time-varying filter.
n_frames = 250
n_frequencies = 1000

# Bandpass filters, [n_batch, n_frames, n_frequencies].
magnitudes = [tf.sin(tf.linspace(0.0, w, n_frequencies))**4.0 for w in np.linspace(4.0, 40.0, n_frames)]
magnitudes = tf.stack(magnitudes)
magnitudes = magnitudes[tf.newaxis, :, :]

# Filter.
audio_out = ddsp.core.frequency_filter(audio_in, magnitudes)

# Listen.
print('Time-varying Filter')
play(audio_out)
specplot(audio_out)

In [None]:
# The filter equally spaces the frames in time, so if you don't have enough, you'll hear transitions.
n_frames = 15

# Bandpass filters, [n_batch, n_frames, n_frequencies].
magnitudes = [tf.sin(tf.linspace(0.0, w, n_frequencies))**4.0 for w in np.linspace(4.0, 40.0, n_frames)]
magnitudes = tf.stack(magnitudes)
magnitudes = magnitudes[tf.newaxis, :, :]

# Filter.
audio_out = ddsp.core.frequency_filter(audio_in, magnitudes)

# Listen.
print('Time-varying Filter, Low temporal resolution')
play(audio_out)
specplot(audio_out)

# Resampling

Many functions require controls to be provided at the audio sample rate, but often one will want the network to output controls at a coarser rate.

## `resample()`


Simple bilinear upsampling of control signal based on `tf.image.resize()`.

With `add_endpoint=False`, uses the last timestep as the endpoint, producing `n_frames - 1` segments, each with a length of `n_timesteps / (n_frames - 1)`.
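
The endpoint-aligned interpolation described above can be sketched in plain Python. This illustrates the segment arithmetic only; `linear_upsample_sketch` is a hypothetical helper and not how `resample()` is implemented internally.

```python
def linear_upsample_sketch(frames, n_timesteps):
  """Linearly interpolate so the first and last frames land on the ends."""
  n_frames = len(frames)
  out = []
  for t in range(n_timesteps):
    # Fractional frame index: n_frames - 1 segments span the whole output.
    position = t * (n_frames - 1) / (n_timesteps - 1)
    lo = min(int(position), n_frames - 2)
    frac = position - lo
    out.append((1.0 - frac) * frames[lo] + frac * frames[lo + 1])
  return out

upsampled = linear_upsample_sketch([0.0, 1.0, 0.0], 9)
```

With three frames there are two segments, so the middle frame lands exactly at the center of the output and the last frame lands on the final timestep.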


In [None]:
n_coarse = 9
n_fine = 16000

coarse = 1.0 - np.sin(np.linspace(0, np.pi, n_coarse))[np.newaxis, :, np.newaxis]
fine = ddsp.core.resample(coarse, n_fine, add_endpoint=False)

plt.plot(np.linspace(0, n_fine, n_coarse), coarse[0, :, 0], 'o', label='coarse')
plt.plot(np.linspace(0, n_fine, n_fine), fine[0, :, 0], label='fine')
plt.title('Bilinear upsampling ({} points, {} intervals)'.format(n_coarse, n_coarse - 1))
plt.legend(loc='lower right')
_ = plt.ylim(-0.1, 1.1)

With `add_endpoint=True`, holds the last timestep for an additional step as the endpoint. 
Then, `n_timesteps` is divided evenly into `n_frames` segments of size `n_timesteps / n_frames`. This is the default behavior, as it matches the default behavior of `fft_convolve`.
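
The effect of the added endpoint can be sketched the same way: repeat the last frame once, then split the output into `n_frames` equal segments. The helper `upsample_with_endpoint_sketch` is hypothetical and only mirrors the description above.

```python
def upsample_with_endpoint_sketch(frames, n_timesteps):
  """Hold the last frame one extra step, then use n_frames equal segments."""
  padded = list(frames) + [frames[-1]]
  n_frames = len(frames)
  segment = n_timesteps / n_frames        # Timesteps per segment.
  out = []
  for t in range(n_timesteps):
    position = t / segment                # Fractional frame index.
    lo = int(position)
    frac = position - lo
    out.append((1.0 - frac) * padded[lo] + frac * padded[lo + 1])
  return out

held = upsample_with_endpoint_sketch([0.0, 1.0], 8)
```

Each frame now owns a segment of equal length, and the final segment simply holds the last value because it interpolates between the last frame and its copy.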

In [None]:
fine = ddsp.core.resample(coarse, n_fine)

n_adjusted = int(n_fine / n_coarse * (n_coarse - 1))
plt.plot(np.linspace(0, n_adjusted, n_coarse), coarse[0, :, 0], 'o', label='coarse')
plt.plot(np.linspace(0, n_fine, n_fine), fine[0, :, 0], label='fine')
plt.title('Bilinear upsampling ({} points, {} intervals)'.format(n_coarse, n_coarse))
plt.legend(loc='lower right')
_ = plt.ylim(-0.1, 1.1)

You can also do cubic interpolation.

In [None]:
fine = ddsp.core.resample(coarse, n_fine, method='cubic', add_endpoint=False)

plt.plot(np.linspace(0, n_fine, n_coarse), coarse[0, :, 0], 'o', label='coarse')
plt.plot(np.linspace(0, n_fine, n_fine), fine[0, :, 0], label='fine')
plt.title('Bicubic upsampling ({} points, {} intervals)'.format(n_coarse, n_coarse - 1))
plt.legend(loc='lower right')
_ = plt.ylim(-0.1, 1.1)

Resampling also works for downsampling.

In [None]:
n_coarse = 9
n_fine = 16000

fine = 1.0 - np.sin(np.linspace(0, np.pi, n_fine))[np.newaxis, :, np.newaxis]
coarse = ddsp.core.resample(fine, n_coarse, add_endpoint=False)

plt.plot(np.linspace(0, n_coarse, n_fine), fine[0, :, 0],  label='fine')
plt.plot(np.linspace(0, n_coarse, n_coarse), coarse[0, :, 0], 'o', label='coarse')
plt.title('Bilinear downsampling ({} points, {} intervals)'.format(n_coarse, n_coarse - 1))
plt.legend(loc='lower right')
plt.xlim(-0.5, 10.5)
_ = plt.ylim(-0.1, 1.1)

When downsampling, `add_endpoint=True` interpolates up to an added endpoint, which effectively drops the final endpoint from the downsampled signal. The result still has the same number of points and segments.

In [None]:
n_intervals = (n_fine - 1)
n_forward = int(n_coarse / n_fine * n_intervals)

fine = 1.0 - np.sin(np.linspace(0, np.pi, n_fine))[np.newaxis, :, np.newaxis]
coarse = ddsp.core.resample(fine, n_coarse)

plt.plot(np.linspace(0, n_coarse, n_fine), fine[0, :, 0], label='fine')
plt.plot(np.linspace(0, n_coarse - 1, n_coarse), coarse[0, :, 0], 'o', label='coarse')
plt.title('Bilinear downsampling ({} points, {} intervals)'.format(n_coarse, n_coarse))
plt.legend(loc='lower right')
plt.xlim(-0.5, 10.5)
_ = plt.ylim(-0.1, 1.1)

## `upsample_with_windows()`

Upsample signal with overlapping Hann windows (like an inverse STFT). Good for smooth amplitude envelopes.
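
The overlap-add idea can be sketched in plain Python: each frame value scales a Hann window centered on that frame, and neighboring windows overlap by half so that their weights sum to one. The helper `hann_upsample_sketch` and its `hop` argument are assumptions of this sketch, which corresponds roughly to the `add_endpoint=False` layout.

```python
import math

def hann_upsample_sketch(frames, hop):
  """Overlap-add frame values weighted by Hann windows spaced `hop` apart."""
  n_out = (len(frames) - 1) * hop
  out = [0.0] * n_out
  for i, value in enumerate(frames):
    center = i * hop
    for t in range(max(0, center - hop), min(n_out, center + hop)):
      # Hann window of width 2 * hop, peaking at the frame's center.
      weight = 0.5 + 0.5 * math.cos(math.pi * (t - center) / hop)
      out[t] += value * weight
  return out

constant = hann_upsample_sketch([1.0, 1.0, 1.0], hop=4)
envelope = hann_upsample_sketch([0.0, 1.0, 0.0], hop=4)
```

A constant input stays constant because the overlapping window weights always sum to one, and a single peak rises and falls smoothly without exceeding the frame value.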

In [None]:
n_coarse = 5
n_fine = 16000

coarse = 1.0 - np.sin(np.linspace(0, np.pi, n_coarse))[np.newaxis, :, np.newaxis]
fine = ddsp.core.upsample_with_windows(coarse, n_fine, add_endpoint=False)

plt.plot(np.linspace(0, n_fine, n_coarse), coarse[0, :, 0], 'o', label='coarse')
plt.plot(np.linspace(0, n_fine, n_fine), fine[0, :, 0], label='fine')
plt.title('Upsample with windows ({} points, {} intervals)'.format(n_coarse, n_coarse - 1))
plt.legend(loc='lower right')
_ = plt.ylim(-0.1, 1.1)

`add_endpoint` has the same behavior and defaults to True, as it matches the behavior of `fft_convolve`.

In [None]:
coarse = 1.0 - np.sin(np.linspace(0, np.pi, n_coarse))[np.newaxis, :, np.newaxis]
fine = ddsp.core.upsample_with_windows(coarse, n_fine)

n_intervals = (n_coarse - 1)
n_forward = int(n_fine / n_coarse * n_intervals)
plt.plot(np.linspace(0, n_forward, n_coarse), coarse[0, :, 0], 'o', label='coarse')
plt.plot(np.linspace(0, n_fine, n_fine), fine[0, :, 0], label='fine')
plt.title('Upsample with windows ({} points, {} intervals)'.format(n_coarse, n_coarse))
plt.legend(loc='lower right')
_ = plt.ylim(-0.1, 1.1)

You can also invoke `upsample_with_windows()` through `resample(method='window')`.

In [None]:
fine = ddsp.core.resample(coarse, n_fine, method='window')

plt.plot(np.linspace(0, n_forward, n_coarse), coarse[0, :, 0], 'o', label='coarse')
plt.plot(np.linspace(0, n_fine, n_fine), fine[0, :, 0], label='fine')
plt.title('Upsample with windows ({} points, {} intervals)'.format(n_coarse, n_coarse))
plt.legend(loc='lower right')
_ = plt.ylim(-0.1, 1.1)

The Hann window transitions are smooth like bicubic, but more gradual and don't overshoot.

In [None]:
fine = ddsp.core.resample(coarse, n_fine, method='cubic')

plt.plot(np.linspace(0, n_forward, n_coarse), coarse[0, :, 0], 'o', label='coarse')
plt.plot(np.linspace(0, n_fine, n_fine), fine[0, :, 0], label='fine')
plt.title('Bicubic upsampling ({} points, {} intervals)'.format(n_coarse, n_coarse))
plt.legend(loc='lower right')
_ = plt.ylim(-0.1, 1.1)