# Use internal microphone to plot waveforms (Part 2: add spectrum)

This script draws a continuously updating line plot of the sound picked up by the computer's microphone, plus a spectrum to show power.  It is derived from https://www.youtube.com/watch?v=aQKX3mrDFoY

This is a follow-on example from the My_first_audio notebook.

Note that if the speaker volume is too high, the graph will look broken (values beyond roughly ±128 wrap around).

This uses the Python module "pyaudio", which in turn needs "portaudio", as well as numpy and scipy (for the FFT).

First, as always, we import the needed packages: pyaudio to grab sound from the microphone (either internal or external), struct to convert the digital sound from packed binary to integers, numpy for array handling, scipy for the FFT (to make the spectrum), and matplotlib to plot.

In [1]:
# as always..
import numpy as np
import matplotlib.pyplot as plt

# like the previous demo, add the sound package and struct to convert raw bits to integers
import pyaudio
import struct

# new for this demo, we add the fft module
from scipy.fftpack import fft

# this time add modules for timing the frame rate and for detecting when the plot window is closed
import time
from tkinter import TclError

This "backend" will allow plots to come up outside the jupyter browser (as popups)

In [2]:
%matplotlib tk

Here we define the segment of sound to process at one time.  This cuts the continuous time series into a set of finite-length signals, each CHUNK samples long; here CHUNK is 1024 * 4, or 4096 samples per frame.  The format is 16-bit integer, the channel count is 1 because the machine's internal mic is mono (not stereo), and the rate is the fairly standard 44.1 kHz.

Like the last demo, the variable "RATE" is the sampling rate, e.g., 44100 samples per second.  The amount of data that we will process at one time is CHUNK.  CHUNK/RATE is therefore the length of time of each processed segment, about 0.093 seconds for the values used here.

In [3]:
CHUNK = 1024 * 4
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
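
As a quick check of the CHUNK/RATE relationship described above, the sketch below works out the segment length together with two related quantities that matter for the spectrum plot: the spacing between FFT frequency bins and the Nyquist limit.  The variable names `segment_seconds`, `bin_width_hz` and `nyquist_hz` are made up for this illustration and are not used elsewhere in the notebook.

```python
# Same values as the notebook cell above
CHUNK = 1024 * 4      # samples per processed segment
RATE = 44100          # samples per second

segment_seconds = CHUNK / RATE   # length of one segment in time
bin_width_hz = RATE / CHUNK      # spacing between neighbouring FFT bins
nyquist_hz = RATE / 2            # highest frequency the spectrum can represent

print("segment length : {:.1f} ms".format(segment_seconds * 1000))
print("bin width      : {:.2f} Hz".format(bin_width_hz))
print("Nyquist limit  : {:.0f} Hz".format(nyquist_hz))
```

A larger CHUNK gives finer frequency resolution but a longer segment, so the display responds more slowly to changes in the sound.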

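Next, we create a stream using the variables defined above.  Note that the stream is opened as 16-bit integers, but the read loop further down keeps only one byte of each sample and treats it as a signed 8-bit value.  On a little-endian machine that byte is the low-order one, which is why loud signals wrap at about ±128.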

In [4]:
p = pyaudio.PyAudio()

stream = p.open(
    format = FORMAT,
    channels = CHANNELS,
    rate = RATE,
    input = True,
    output = True,
    frames_per_buffer = CHUNK
)
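
The 16-bit versus 8-bit question can be made concrete without a microphone.  The sketch below packs a few made-up 16-bit samples the way a little-endian machine would store them (an assumption about the platform), then decodes them two ways: as full 16-bit values, and by keeping only the first byte of each sample as a signed 8-bit number, which is what the read loop later in this notebook effectively does.

```python
import struct

# Hypothetical 16-bit samples standing in for microphone data
samples = [0, 100, 127, 128, 200, 300, -1, -129]
count = len(samples)

# Pack as little-endian signed 16-bit integers (assumed byte order)
raw = struct.pack('<' + str(count) + 'h', *samples)

# Decoding two bytes per sample recovers the original values
as_int16 = list(struct.unpack('<' + str(count) + 'h', raw))

# Keeping only the first byte of each sample, read as signed 8-bit
low_bytes = raw[::2]
as_int8 = list(struct.unpack(str(count) + 'b', low_bytes))

print(as_int16)   # [0, 100, 127, 128, 200, 300, -1, -129]
print(as_int8)    # [0, 100, 127, -128, -56, 44, -1, 127]
```

Values between -128 and 127 survive unchanged; anything larger wraps around, which matches the broken-looking waveform mentioned at the top of this notebook.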

Now that we have our sound object, we can make a plot.  This time we'll have one plot of the raw waveform (top) and another plot of the spectrum of this signal (bottom).  This way we can look at the sound in time and in frequency space at the same time.

In [5]:
# define our figure to have 2 panels (subplots)
fig, (ax1, ax2) = plt.subplots(2, figsize=(15,8))

# again, like the last demo, we'll fill the initial plot with random data; this will be overwritten as soon
# as the next block is executed; here we will use time for the first plot and frequency for the second

# sample k is taken at k/RATE seconds, so leave out the endpoint
x_time = np.linspace(0,CHUNK/RATE,CHUNK,endpoint=False)
line, = ax1.plot(x_time, np.random.rand(CHUNK), '-', lw=1)
ax1.set_title('Audio Waveform')
ax1.set_xlabel('time (seconds)')
ax1.set_ylabel('amplitude')
ax1.set_xlim(0,CHUNK/RATE)
ax1.set_ylim(-150, 150)
# in case we want to gussy up the axis
#plt.setp(ax1, xticks=[0, CHUNK/2, CHUNK, 3* CHUNK/2, 2*CHUNK], yticks=[-128, 0, 128])


# FFT bin k sits at k*RATE/CHUNK Hz, so leave out the endpoint
x_freq = np.linspace(0, RATE, CHUNK, endpoint=False)
line_fft, = ax2.semilogx(x_freq, np.random.rand(CHUNK), '-', lw=1)
ax2.set_title('Audio Spectrum')
ax2.set_xlabel('frequency (cycles/second [Hz])')
ax2.set_ylabel('power')
ax2.set_xlim(20,RATE/2)
ax2.set_ylim(0,0.25)

plt.show(block=False)
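
To see what the spectrum panel should show, the frequency axis and the scaling can be checked against a synthetic tone instead of live audio.  The sketch below is only an illustration: the bin index `k0` and the tone amplitude are arbitrary choices, `np.fft.fft` stands in for scipy's `fft` so the example needs only numpy, and the factor 2/(256*CHUNK) is copied from the read loop later in this notebook.

```python
import numpy as np

CHUNK = 1024 * 4
RATE = 44100

# Hypothetical test tone that falls exactly on one FFT bin
k0 = 93                          # arbitrary bin index, 0 < k0 < CHUNK/2
tone_hz = k0 * RATE / CHUNK      # roughly 1001 Hz
amplitude = 64                   # in signed 8-bit units

n = np.arange(CHUNK)
tone = amplitude * np.sin(2 * np.pi * k0 * n / CHUNK)

# Frequency of FFT bin k is k * RATE / CHUNK
freqs = np.arange(CHUNK) * RATE / CHUNK

# Same scaling as the notebook's read loop
spectrum = np.abs(np.fft.fft(tone)) * 2 / (256 * CHUNK)

# Only bins below the Nyquist frequency are shown in the plot
half = CHUNK // 2
peak_bin = int(np.argmax(spectrum[:half]))
peak_hz = freqs[peak_bin]
peak_height = spectrum[peak_bin]

print("peak at {:.1f} Hz, height {:.3f}".format(peak_hz, peak_height))
```

With this scaling a pure tone of amplitude A plots at a height of A/256, so an amplitude of 64 just reaches the top of the spectrum panel's y-axis (0.25).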

In [6]:
print('stream started')

stream started


In [7]:
frame_count = 0
start_time = time.time()

Start the animation part.  The plots are already set up; here we just update the lines.  Unlike the prior example, here we catch an exception (TclError, raised when the plot window is closed) so we can stop the animation a little more cleanly and also report the frame rate.

In [8]:
while True:
    data = stream.read(CHUNK,exception_on_overflow=False)
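    # unpack the raw bytes as unsigned 8-bit values (two bytes per 16-bit sample)
    data_int = struct.unpack(str(2*CHUNK)+'B',data)
    # reinterpret as signed 8-bit and keep one byte per sample; going through uint8
    # avoids the out-of-range error newer NumPy raises for values such as 200 with dtype='b'
    data_np = np.array(data_int,dtype=np.uint8).view(np.int8)[::2]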

    line.set_ydata(data_np)
    y_fft = fft(data_np)
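#   note the fft routine returns complex numbers; np.abs returns the magnitude, sqrt(a**2 + b**2)
#   in effect the factor 2/(256*CHUNK) rescales the result: dividing by CHUNK undoes the sum over
#   samples, the 2 restores the half of a real tone's amplitude that lands at negative frequency,
#   and 256 scales the 8-bit range so a full-scale tone (amplitude 128) plots at 0.5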
    line_fft.set_ydata(np.abs(y_fft)*2/(256*CHUNK))
  
    try:
        fig.canvas.draw()
        fig.canvas.flush_events()
        frame_count += 1

    except TclError:
        frame_rate = frame_count / (time.time() - start_time)
        print('stream stopped')
        print('average frame rate = {:.0f} FPS'.format(frame_rate))
        break
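
The loop above leaves the audio stream open when it exits.  A minimal clean-up sketch follows; `close_audio` is a hypothetical helper name, and it assumes `stream` and `p` are the objects created earlier with `p.open(...)` and `pyaudio.PyAudio()`.

```python
def close_audio(stream, p):
    """Stop and release a PyAudio stream, then shut down PortAudio."""
    stream.stop_stream()   # stop reading from the device
    stream.close()         # release the stream
    p.terminate()          # release PortAudio resources
```

Calling `close_audio(stream, p)` once the plot window has been closed frees the microphone for other programs.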

