<a href="https://colab.research.google.com/github/youngmoo/DSP_ColabUtils/blob/main/Animation_with_Sound.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction and motivation

This is a short example notebook for creating animations of audio visualizations (using matplotlib) and exporting the result (with sound) as an .mp4 video file.
* By default, matplotlib animation does not produce video files with sound.
* Most notebooks will use the JavaScript widget to display animation results inline, since it allows for greater interaction (play / reverse, scrubbing, frame-by-frame advance, etc.). But this widget doesn't incorporate sound playback.
* Here, we create separate files for the animation and the sound, and use ffmpeg to combine them into a single video, which can be displayed inline.

# Initialization

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import animation, rc
import IPython.display as ipd
%matplotlib inline
import soundfile as sf

# Render animations as video (rather than jshtml)
rc('animation', html='html5')

## My plot style defaults (optional)

In [2]:
rc('font', family='Liberation Serif')
rc('font', size=20)

# Step-by-step: output animation with sound (to .mp4 video)

## 1. Make and save an audio signal (a chirp up-and-down)

In [3]:
dur = 1       # Length of ramp in seconds (output is up & down, so final duration will be 2 * dur)
fs = 22050    # Sampling rate (22050 Hz)
f1 = 100      # Starting frequency (100 Hz)
f2 = 1000     # Highest frequency (1000 Hz)

t_up = np.arange(fs*dur) / fs
t_down = np.arange(fs*dur,fs*dur*2) / fs

f_up = (f2 - f1) / dur     # Sweep rate going up (Hz per second)
f_down = (f1 - f2) / dur   # Sweep rate going down (Hz per second)

chirp_up = np.sin(np.pi*f_up*t_up*t_up + 2*np.pi*f1*t_up)
chirp_down = np.sin(np.pi*f_down*t_down*t_down + 2*np.pi*(2*f2-f1)*t_down)

# Join the two halves into a single 1-D signal
chirp = np.concatenate([chirp_up, chirp_down])

ipd.Audio(chirp, rate=fs)
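
The phase expressions passed to `np.sin` can be checked numerically, since the instantaneous frequency of a chirp is the derivative of its phase divided by 2π. The following is a minimal sketch (not part of the original notebook) that rebuilds the two phase terms with the same parameters and estimates the frequency at both ends of each half. The helper name `inst_freq` is an assumption introduced for illustration.

```python
import numpy as np

dur, fs, f1, f2 = 1, 22050, 100, 1000   # Same values as the chirp cell

t_up = np.arange(fs*dur) / fs
t_down = np.arange(fs*dur, fs*dur*2) / fs

# Phase arguments passed to np.sin() for each half of the chirp
phase_up = np.pi*(f2 - f1)/dur*t_up**2 + 2*np.pi*f1*t_up
phase_down = np.pi*(f1 - f2)/dur*t_down**2 + 2*np.pi*(2*f2 - f1)*t_down

def inst_freq(phase, fs):
    # Hypothetical helper: finite-difference estimate of d(phase)/dt / (2*pi), in Hz
    return np.diff(phase) * fs / (2*np.pi)

freq_up = inst_freq(phase_up, fs)
freq_down = inst_freq(phase_down, fs)

print(freq_up[0], freq_up[-1])       # Approximately f1, then f2
print(freq_down[0], freq_down[-1])   # Approximately f2, then f1
```

With these parameters the two phase expressions differ at `t = dur` by 2π·dur·(f2 − f1), which is a whole number of cycles, so the waveform is continuous at the turnaround. For other choices of `dur`, `f1`, and `f2` that may not hold.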

In [4]:
# Save chirp to local (temp) WAV file
sf.write("chirp.wav", chirp, fs)

## 2. Create waveform animation frames from the audio
We first create a silent video from the animation frames.

In [5]:
fps = 30            # Frame rate (frames-per-second)
frameSize = 1024    # Size (in audio samples) of waveform data per frame

fig = plt.figure(figsize = (16,6))  # Create figure (wide aspect ratio) and axes
ax = plt.axes(xlim=(0,frameSize/fs*1000), ylim=(-1.1, 1.1))
line, = ax.plot([], [], lw=2)   # Line object (linewidth set to 2, just to make it bolder)
ax.set_xlabel('Time (milliseconds)')
ax.set_ylabel('Amplitude')
fig.tight_layout()  # Tightens the figure layout to remove extra border space

# Number of animation frames (2 x duration x frame rate, minus 1 so the last frame doesn't run past the end of the audio vector)
num_frames = 2*dur*fps - 1

# Our animation function. This is called repeatedly (per frame), until num_frames
def animate(n):
    n1 = int(n * fs / fps)            # Start index of current frame
    n2 = int(n1 + frameSize)          # Ending index of current frame
    x = np.arange(frameSize)/fs*1000  # x-data (in milliseconds)
    y = chirp[n1:n2]                  # y-data (current frame)
    line.set_data(x, y)               # Set line data to new wave data frame 

    # Uncomment next line to save frame images to 'chirp/' folder (you must manually create the 'chirp/' folder)    
    #fig.savefig("chirp/frame%04d.png" % n, dpi=150, transparent=True)

    if n==num_frames-1:         # Close the plot on last frame (prevents an additional plot below our animation)
        plt.close()

    return line,

# Call the animator (blit=True means only re-draw the parts that have changed)
# interval is the delay between frames in milliseconds, derived here from fps
anim = animation.FuncAnimation(fig, animate, frames=num_frames, interval=1000/fps, blit=True)

anim  # Display animation
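
The "minus 1" in `num_frames` follows from requiring every frame to hold a full `frameSize` samples: frame `n` starts at sample `int(n * fs / fps)`, and a shorter final slice would no longer match the length of the x-data. As a rough check (a sketch, not part of the original notebook; `count_full_frames` is a hypothetical helper), we can count how many frames fit inside the audio:

```python
def count_full_frames(num_samples, frame_size, fs, fps):
    # Hypothetical helper: count frames whose window
    # [n1, n1 + frame_size) lies entirely inside the audio
    n = 0
    while int(n * fs / fps) + frame_size <= num_samples:
        n += 1
    return n

# Values from this notebook: a 2-second chirp at 22050 Hz, 1024-sample frames, 30 fps
total_frames = count_full_frames(num_samples=2 * 22050, frame_size=1024, fs=22050, fps=30)
print(total_frames)
```

For the chirp above this gives 59, which matches `2*dur*fps - 1`. With a different `frameSize` or frame rate the number of frames to drop may differ, so computing it this way is a bit more robust than hard-coding the subtraction.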

## 3. Save the animation as .mp4 video to a (temp) local file

In [6]:
anim.save("chirp.mp4")

## 4. Combine temp video file with previously saved audio (wav) file
This uses ffmpeg to add the audio file as a soundtrack to the temp video.

In [7]:
!ffmpeg -i chirp.mp4 -i chirp.wav -map 0 -map 1:a -c:v copy -shortest 'chirp+sound.mp4'

In [8]:
ipd.Video("chirp+sound.mp4", embed=True)
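
To make the ffmpeg options in step 4 easier to read, here is a minimal sketch that builds the same command as a Python argument list, with one comment per option. It is not part of the original notebook; `build_mux_command` and `mux` are hypothetical helpers, and `mux` assumes ffmpeg is installed and on the PATH.

```python
import subprocess

def build_mux_command(video_file, audio_file, output_file):
    # Hypothetical helper: returns the ffmpeg argument list used in step 4
    return [
        "ffmpeg",
        "-i", video_file,    # Input 0: the silent animation
        "-i", audio_file,    # Input 1: the audio
        "-map", "0",         # Keep all streams from input 0
        "-map", "1:a",       # Keep the audio stream(s) from input 1
        "-c:v", "copy",      # Copy the video stream without re-encoding
        "-shortest",         # Stop writing when the shortest stream ends
        output_file,
    ]

def mux(video_file, audio_file, output_file):
    # Runs ffmpeg; raises an exception if ffmpeg is missing or exits with an error
    subprocess.run(build_mux_command(video_file, audio_file, output_file), check=True)
    return output_file

cmd = build_mux_command("chirp.mp4", "chirp.wav", "chirp+sound.mp4")
print(" ".join(cmd))
```

Passing a list to `subprocess.run` avoids shell quoting issues with filenames, and because `mux` is plain Python its return value could be handed straight to `ipd.Video(..., embed=True)`.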

# A single function to combine animation and sound
This function returns the filename of the combined video file.
* I wish it could also display the final combined video, but because it uses the shell to run ffmpeg, I don't think there's an easy way to do that.
* This intentionally quiets ffmpeg (with `-hide_banner -loglevel error`), since its default output is long and not very useful.

In [10]:
def animWithSound(anim_frames, audio_data, sample_rate=44100):
  """Save the animation and audio to temp files, combine them with ffmpeg, and return the output filename."""
  # This is just a hack to create unique filenames (based on the current timestamp)
  dt_suffix = str( int( np.datetime64('now').astype(np.timedelta64) / np.timedelta64(1, 's') ) ) # Current date/time in seconds
  anim_filename = 'temp_anim_' + dt_suffix + '.mp4'
  audio_filename = 'temp_audio_' + dt_suffix + '.wav'
  output_filename = 'temp+sound_' + dt_suffix + '.mp4'

  anim_frames.save(anim_filename)                     # Silent video
  sf.write(audio_filename, audio_data, sample_rate)   # Audio track
  # Global options (-hide_banner, -loglevel) go first; the rest matches step 4
  !ffmpeg -hide_banner -loglevel error -i $anim_filename -i $audio_filename -map 0 -map 1:a -c:v copy -shortest $output_filename
  return output_filename  # Return the filename of the temp output

## Example usage: Down to 2 lines

In [11]:
out_file = animWithSound(anim, chirp, 22050)
ipd.Video(out_file, embed=True)