# Exploring WAV files

## Decode standard compressed formats to `.wav`

In [112]:
import sys, os, wave, contextlib, audioread
import numpy as np

Here's how to create a `.wav` file from a compressed file (say, an `.mp3`) using the `audioread` library:

In [15]:
# This is from https://github.com/beetbox/audioread/blob/master/decode.py

def decode(filename):
    filename = os.path.abspath(os.path.expanduser(filename))
    if not os.path.exists(filename):
        print("File not found.", file=sys.stderr)
        sys.exit(1)

    try:
        with audioread.audio_open(filename) as f:
            print('Input file: %i channels at %i Hz; %.1f seconds.' %
                  (f.channels, f.samplerate, f.duration),
                  file=sys.stderr)
            print('Backend:', str(type(f).__module__).split('.')[1],
                  file=sys.stderr)

            with contextlib.closing(wave.open(filename + '.wav', 'w')) as of:
                of.setnchannels(f.channels)
                of.setframerate(f.samplerate)
                of.setsampwidth(2)

                for buf in f:
                    of.writeframes(buf)

    except audioread.DecodeError:
        print("File could not be decoded.", file=sys.stderr)
        sys.exit(1)


In [16]:
decode("example.mp3")

Input file: 2 channels at 44100 Hz; 9.9 seconds.
Backend: macca


## The format

Use the standard-library `wave` module to read the file's parameters and its raw frame bytes.

In [None]:
wav_file = "example.mp3.wav"

In [107]:
wav_file = os.path.abspath(os.path.expanduser(wav_file))
with wave.open(wav_file, mode=None) as f:
    print(f.getparams())
    frames = f.readframes(16) # Get the first 16 frames

_wave_params(nchannels=2, sampwidth=2, framerate=44100, nframes=437919, comptype='NONE', compname='not compressed')
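
The parameters printed above determine the byte layout: each frame holds one sample per channel, so a frame occupies `nchannels * sampwidth` bytes. The following is a minimal sketch of that relationship, assuming 16-bit little-endian PCM. It uses a tiny synthetic WAV built in memory rather than the example file, and all names in it (`demo_samples`, `buf`, `raw`, `decoded`) are illustrative only.

```python
import io
import struct
import wave

# Synthetic interleaved stereo samples: L0, R0, L1, R1, L2, R2.
demo_samples = [-1, 0, 1, -2, 0, 2]

# Write a small 16-bit stereo WAV into an in-memory buffer.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)
    w.setframerate(44100)
    w.writeframes(struct.pack("<6h", *demo_samples))

# Read it back and inspect the frame layout.
buf.seek(0)
with wave.open(buf, "rb") as r:
    bytes_per_frame = r.getnchannels() * r.getsampwidth()
    raw = r.readframes(2)  # first two frames

# Two bytes per sample, little-endian, signed.
decoded = struct.unpack("<%dh" % (len(raw) // 2), raw)
print(bytes_per_frame, len(raw), decoded)  # 4 8 (-1, 0, 1, -2)
```

Two frames of stereo 16-bit audio therefore occupy 8 bytes and decode to 4 integers.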


Convert the bytes to integers. `sampwidth=2` means each sample is two bytes (16-bit encoding), and `nchannels=2` means stereo, so each frame is four bytes: one left sample followed by one right sample.

In [133]:
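vals = np.array([], dtype=np.int16)
# Each sample is 2 bytes and each stereo frame is 4 bytes, so this loop
# decodes len(frames) // 4 = 16 samples, i.e. the first 8 of the 16 frames read.
for idx in range(len(frames) // 4):
    vals = np.append(vals, int.from_bytes(frames[2*idx:2*idx+2], byteorder='little', signed=True))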

In [137]:
vals.reshape([-1,2])

array([[-1,  0],
       [ 1, -2],
       [ 0,  2],
       [-1, -1],
       [ 1,  0],
       [-1,  1],
       [ 0, -2],
       [ 0,  2]])

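Note that `int.from_bytes` returns a plain Python `int`, and `np.append` then promotes the array to NumPy's default integer type (`int64` here), so the initial `dtype=np.int16` is lost.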

In [138]:
vals.dtype

dtype('int64')
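
One way to avoid both the loop and the promotion to a wider integer type is to let NumPy interpret the buffer directly with `np.frombuffer`. This is a sketch under the assumption that the data is 16-bit little-endian stereo PCM. The bytes here are a synthetic stand-in for the output of `readframes`, and `demo_frames` and `demo_vals` are illustrative names.

```python
import numpy as np

# Synthetic stand-in for the bytes returned by readframes():
# four stereo frames of 16-bit little-endian samples.
demo_frames = np.array(
    [[-1, 0], [1, -2], [0, 2], [-1, -1]], dtype="<i2"
).tobytes()

# Interpret the buffer as little-endian int16, then split it into
# rows of (left, right) samples, one row per frame.
demo_vals = np.frombuffer(demo_frames, dtype="<i2").reshape(-1, 2)
print(demo_vals.dtype, demo_vals.shape)
```

The array returned by `np.frombuffer` is a read-only view of the bytes; call `.copy()` on it if the samples need to be modified.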

## Read and decode to numpy array using `scipy.io.wavfile`

This is more direct: `wavfile.read` returns the sample rate and the samples as a NumPy array in a single call.

In [42]:
from scipy.io import wavfile

In [90]:
_, vals = wavfile.read(wav_file)

In [93]:
vals[:8]

array([[-1,  0],
       [ 1, -2],
       [ 0,  2],
       [-1, -1],
       [ 1,  0],
       [-1,  1],
       [ 0, -2],
       [ 0,  2]], dtype=int16)
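
The first return value of `wavfile.read`, discarded above, is the sample rate. The round trip below is a small sketch that shows both return values. It assumes 16-bit stereo data, writes synthetic samples to a temporary file, and uses a hypothetical file name (`demo.wav`).

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

# Synthetic stereo signal: 3 frames, 2 channels, 16-bit samples.
data = np.array([[-1, 0], [1, -2], [0, 2]], dtype=np.int16)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.wav")  # hypothetical file name
    wavfile.write(path, 44100, data)      # rate in Hz, then the samples
    rate, loaded = wavfile.read(path)     # returns (rate, samples)

print(rate, loaded.dtype, loaded.shape)
```

The loaded array has one row per frame and one column per channel, and it keeps the `int16` dtype of the file.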

## WAV files in TensorFlow?

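I expect it can be done directly in TensorFlow, for example with `tf.audio.decode_wav`, which decodes 16-bit PCM WAV data into a `float32` tensor scaled to the range [-1.0, 1.0] (not tried here).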