# Writing an Oscillator in PyTorch

In [None]:
import torch
import IPython.display as ipd
import matplotlib.pyplot as plt

Our sinusoidal oscillator receives a sequence of amplitude
and angular frequency values, and optionally an initial phase.

The equation for this oscillator is as follows:

$$
y[n] = a[n] \sin(\phi[n])
$$

where $a[n]$ and $\phi[n]$ are the amplitude and phase at the $n^\text{th}$ sample. However,
we receive a time-varying frequency as input to our function, so we need to convert that
to a time-varying phase. Recall the relationship between frequency and phase:

$$
\omega = \frac{d\phi}{dt}
$$

Frequency is the derivative of phase. Therefore, phase can be calculated by integrating
frequency (summing in the discrete-time case).

$$
\phi[n] = \phi_0 + \sum_{k=0}^{n-1}\omega[k]
$$

where $\omega[n]$ is angular frequency at the $n^\text{th}$ sample and $\phi[n]$ is computed
as the sum of all previous frequency values plus an initial phase, $\phi_0$.
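
As a quick sanity check of this running sum, the sketch below compares an explicit loop against a vectorized version built on `torch.cumsum`. The four-sample `omega`, the value of `phi_0`, and the variable names are illustrative assumptions chosen only for this example. Each phase value is the initial phase plus the sum of the frequency values that came before it.

```python
import torch

omega = torch.tensor([0.1, 0.2, 0.3, 0.4])  # angular frequency per sample (illustrative)
phi_0 = 0.5  # initial phase (illustrative)

# Explicit running sum: phase[n] = phi_0 + omega[0] + ... + omega[n - 1]
phase_loop = []
running = phi_0
for w in omega.tolist():
    phase_loop.append(running)
    running += w
phase_loop = torch.tensor(phase_loop)

# Vectorized: prepend phi_0, drop the last frequency value, then cumsum
shifted = torch.cat([torch.tensor([phi_0]), omega[:-1]])
phase_cumsum = torch.cumsum(shifted, dim=0)

print(phase_loop)
print(phase_cumsum)
```

Both tensors should agree, and the first phase value is simply the initial phase.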

## Creating a Sinusoidal Oscillator

Let's implement the oscillator equation above as a Python function using the PyTorch
API.

In [None]:
def sinusoid(
    amp: torch.Tensor,  # Amplitude (batch_size, n_steps)
    omega: torch.Tensor,  # Angular frequency (batch_size, n_steps)
    phi: torch.Tensor = None,  # Initial phase (batch_size,), if None then 0
) -> torch.Tensor:
    """
    Implementation of a sinusoidal oscillator function. Receives time-varying
    amplitude and angular frequency as input and synthesizes a sinusoidal signal.
    An optional initial phase can also be specified.
    """
    # If phi is not specified, set it to 0
    if phi is None:
        phi = torch.zeros(amp.shape[0], dtype=amp.dtype, device=amp.device)

    # Unsqueeze the last dimension of phi to give it a time step dimension equal to 1
    # Then add the initial phase to the beginning of the angular frequency tensor
    # This sets the initial phase of the oscillator
    # We then discard the last element of the angular frequency tensor to maintain
    # the correct number of time steps
    phi = phi.unsqueeze(-1)
    omega = torch.cat([phi, omega], dim=1)[..., :-1]

    # torch.cumsum performs the cumulative summation of the angular frequency
    phase = torch.cumsum(omega, dim=1)
    return amp * torch.sin(phase)

### Using our oscillator

To use our new sinusoidal oscillator we need to create the control arguments for the
function, which are `Tensor`s of time-varying angular frequency and amplitude values.

Let's first make a static tone at 440Hz at an audio sampling rate of 16kHz.

In [None]:
sr = 16000  # Audio sampling rate
frequency = 440.0  # Frequency of the sinusoid in Hz

Make one second (16k samples) of angular frequency control signal. First create a
static signal at 440.0Hz, and then convert it to angular frequency in radians per sample,
which has a range of $[0, 2\pi]$, where $2\pi$ corresponds to the sampling rate.
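
The conversion can be checked by hand with plain Python. This is a minimal sketch; the names `omega_value`, `samples_per_cycle`, and `nyquist_omega` are illustrative and not used elsewhere in the notebook.

```python
import math

sr = 16000  # audio sampling rate in Hz
frequency = 440.0  # frequency in Hz

# Radians advanced per sample
omega_value = 2 * math.pi * frequency / sr

# Number of samples needed to complete one full cycle of 2*pi radians
samples_per_cycle = sr / frequency

# The Nyquist frequency (sr / 2) maps to pi radians per sample
nyquist_omega = 2 * math.pi * (sr / 2) / sr

print(omega_value)
print(samples_per_cycle)
print(nyquist_omega)
```

For 440Hz at 16kHz this gives roughly 0.1728 radians per sample, or about 36.4 samples per cycle.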

In [None]:
f = torch.ones(1, sr) * frequency
omega = f * 2 * torch.pi / sr

Similarly, make a static amplitude signal with an amplitude of 0.5.

In [None]:
amp = torch.ones(1, sr) * 0.5

Synthesize it!

In [None]:
y = sinusoid(amp, omega)
ipd.Audio(y[0].numpy(), rate=sr, normalize=False)

### Varying the frequency and amplitude

Because our amplitude and frequency tensors represent time-varying values, we can synthesize
a sound with modulating amplitude and frequency. For example, let's have the frequency
decrease by an octave from 440Hz to 220Hz over one second while the amplitude increases
from zero.

In [None]:
# Frequency envelope
f = torch.linspace(440, 220, sr).unsqueeze(0)
omega = f * 2 * torch.pi / sr

# Amplitude envelope
amp = torch.linspace(0, 1, sr).unsqueeze(0)

# Plot the amplitude and angular frequency envelopes
fig, ax1 = plt.subplots()

ax1.plot(amp[0].numpy(), color="tab:blue", label="Amplitude")
ax1.set_ylabel("Amplitude")
ax1.set_xlabel("Time (samples)")

ax2 = ax1.twinx()
ax2.plot(omega[0].numpy(), color="tab:red", label="Angular Frequency")
ax2.set_ylabel("Angular Frequency (rads / sample)")
ax2.set_ylim(0, 0.5)

fig.legend()
plt.show()

In [None]:
y = sinusoid(amp, omega)
ipd.Audio(y[0].numpy(), rate=sr, normalize=False)
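
A side note on why the phase has to be accumulated rather than computed as frequency times sample index: for a time-varying frequency the two are not equivalent. The sketch below is an illustration under assumed names (`phase_naive`, `inst_freq_hz`), using `float64` to keep the long cumulative sum precise. It estimates the instantaneous frequency at the end of the sweep for both approaches.

```python
import torch

sr = 16000
f = torch.linspace(440, 220, sr, dtype=torch.float64)
omega = 2 * torch.pi * f / sr

# Correct: cumulatively sum the frequency to obtain the phase
phase_cumsum = torch.cumsum(omega, dim=0)

# Naive: multiply the frequency at each sample by the sample index
phase_naive = omega * torch.arange(sr, dtype=torch.float64)


def inst_freq_hz(phase: torch.Tensor) -> torch.Tensor:
    # Instantaneous frequency as the sample-to-sample phase difference, in Hz
    return torch.diff(phase) * sr / (2 * torch.pi)


print(inst_freq_hz(phase_cumsum)[-1].item())
print(inst_freq_hz(phase_naive)[-1].item())
```

The accumulated phase ends at the intended 220Hz, while the naive phase ends close to 0Hz, because the falling frequency is also multiplied by an ever larger sample index.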

## Optimizing Parameters for Our Oscillator

In [None]:
# Create a random amplitude envelope with 8 random points and interpolate it to the
# audio sampling rate
rand_amp = torch.rand(1, 1, 8)
rand_amp = torch.nn.functional.interpolate(rand_amp, size=sr, mode="linear").squeeze(0)

# Turn the random amplitude envelope into a learnable parameter
amp_param = torch.nn.Parameter(rand_amp)

y_hat = sinusoid(amp_param, omega)

# Now that the tensor is a learnable parameter, it is marked as requiring a gradient.
# We need to detach it from the computation graph before plotting and rendering audio.
plt.plot(y[0].numpy(), label="Ground Truth")
plt.plot(y_hat[0].detach().numpy(), alpha=0.75, label="Random Init Amp")
plt.legend()

ipd.Audio(y_hat[0].detach().numpy(), rate=sr, normalize=False)

We now have a learnable amplitude envelope, and because our oscillator is written in
PyTorch, it is differentiable via PyTorch's autodiff engine. This means we can run
gradient descent to learn an amplitude envelope that matches our ground truth.
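
To see what the autodiff engine provides, the following sketch compares the gradient computed by `loss.backward()` with a hand-derived gradient of the mean absolute error with respect to each amplitude value. The 64-sample signal, the fixed phase, and the variable names are assumptions made for illustration; this is not the training setup used below.

```python
import torch

torch.manual_seed(0)
n = 64
phase = torch.cumsum(torch.full((n,), 0.3), dim=0)  # fixed phase (illustrative)
target = 0.5 * torch.sin(phase)  # target with a constant amplitude of 0.5

amp = torch.rand(n, requires_grad=True)  # learnable amplitude values
y_hat = amp * torch.sin(phase)
loss = torch.mean(torch.abs(y_hat - target))
loss.backward()  # autograd stores d(loss)/d(amp) in amp.grad

# Hand-derived gradient: sign of the error times sin(phase), averaged over n samples
analytic = torch.sign(y_hat.detach() - target) * torch.sin(phase) / n

print(torch.allclose(amp.grad, analytic))
```

Each amplitude value only influences its own output sample, so its gradient depends only on the sign of the error and the value of the sinusoid at that sample.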

First, we will create an optimizer.

In [None]:
optimizer = torch.optim.Adam([amp_param], lr=0.001)

In [None]:
loss_log = []
for i in range(1000):
    # 1. Compute a forward pass using our learned parameter
    y_hat = sinusoid(amp_param, omega)

    # 2. Compute the MAE loss directly on the time domain signal
    loss = torch.mean(torch.abs(y_hat - y))

    # Store the current loss value for plotting later
    loss_log.append(loss.item())

    # 3. Reset gradients
    optimizer.zero_grad()

    # 4. Compute the gradients
    loss.backward()

    # 5. Update the parameters
    optimizer.step()

In [None]:
# Synthesize audio with the optimized amplitude envelope. No gradients are needed
# here, so we disable gradient tracking for the forward pass.
with torch.no_grad():
    y_hat = sinusoid(amp_param, omega)

# Compare the result of the optimization against the ground truth
plt.plot(y[0].numpy(), label="Ground Truth")
plt.plot(y_hat[0].numpy(), alpha=0.75, label="Learned Amp")
plt.legend()

ipd.Audio(y_hat[0].numpy(), rate=sr, normalize=False)

## Optimizing Frequency

Great! We made a differentiable oscillator and we can use gradient descent to learn an
amplitude envelope to match a target.

Now how about optimizing a frequency envelope?

Unfortunately, here we run into one of the fundamental challenges in gradient-based
optimization: learning periodic functions. Due to the nature of sinusoids,
the gradient of the loss with respect to frequency parameters also oscillates,
which means that the error landscape is full of nasty local minima. In other words, we have
a non-convex optimization problem. While this is an active area of research, diving into
it is beyond the scope of this part of the tutorial.
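
To make the local minima concrete, here is a minimal sketch that evaluates the time-domain MAE loss between a 440Hz target and sinusoids at a grid of candidate static frequencies, then counts the local minima along that grid. The 0.1 second window, the 2Hz grid spacing, and the variable names are illustrative assumptions.

```python
import math

import torch

sr = 16000
n = torch.arange(1600, dtype=torch.float32)  # 0.1 seconds of samples
target_hz = 440.0
target = torch.sin(2 * math.pi * target_hz * n / sr)

# Evaluate the MAE loss for a grid of candidate static frequencies
candidates = torch.arange(100.0, 1000.0, 2.0)
pred = torch.sin(2 * math.pi * candidates[:, None] * n[None, :] / sr)
losses = torch.mean(torch.abs(pred - target), dim=1)

# A grid point is a local minimum if its loss is below both neighbours
interior = losses[1:-1]
is_min = (interior < losses[:-2]) & (interior < losses[2:])

print("best candidate (Hz):", candidates[torch.argmin(losses)].item())
print("number of local minima:", int(is_min.sum()))
```

The global minimum sits at the target frequency, but the loss curve has many other dips. Gradient descent started more than a few Hz away from the target is likely to settle in one of them.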

Despite this problem, lots of interesting work has been conducted in DDSP that doesn't rely
on directly learning frequencies of oscillators. In the next section we'll see how this
simple sinusoidal oscillator will form the basis of a more complex synthesizer that can
generate realistic instrumental sounds.