<h1 align="center"> pyLDPC: Sound Coding & Decoding Module Construction </h1> 

## update:  03/28/16 - v.0.8.0

<b><font color="red"> Since version 0.7: Coding and decoding functions take tG (Transposed G) instead of G, the coding matrix. Functions that construct it (CodingMatrix and CodingMatrix_systematic) return tG instead of G as well. </font></b> 

<br> 

<font color=#0101DF><b> Note: </b> </font> 

 <b> GitHub doesn't render HTML audio in Jupyter cells, so I invite you to open this notebook in <a href="http://nbviewer.jupyter.org/github/janatiH/pyldpc/blob/master/pyLDPC-Sound-Construction.ipynb?flush_cache=true"> nbviewer</a>.</b> 

This notebook introduces the sub-module <i>Sound</i> of pyLDPC: ldpc_sound.
It explains what each function does and how to use it. If you are mainly interested in using the module, you may follow the <a href="http://nbviewer.jupyter.org/github/janatiH/pyldpc/blob/master/pyLDPC-Tutorial-Sound.ipynb?flush_cache=true"> User's guide: pyLDPC-Sound Tutorial</a>.

All the functions defined below can be called by importing the sub-module ldpc_sound:
```python
import pyldpc.ldpc_sound as ldpc_sound
ldpc_sound.SoundCoding(....)
```
Or import everything once and for all, so that you can follow the tutorial mentioned above:
```python
from pyldpc.ldpc_sound import *
SoundCoding(....)
```

We'll consider audio files in <b><font color=#0101DF> wav </font></b> format. To read wav files, we'll use the module <i><font color=#0101DF> wavfile </font></i> in 
the package <i><font color=#0101DF> scipy.io </font></i>.

If you're familiar with audio files (read as numpy.arrays) you should probably skip the first section.

In [6]:
from scipy.io import wavfile
import scipy.sparse
import numpy as np
import pyldpc
import warnings
from math import ceil
from time import time

## 1 - What does a wav file look like?

Here's a wavfile you might (or even should) recognize:

<br>

<audio controls src="Sound/got/got.wav" type="audio/wav">
</audio>

<br>

The function <i> read </i> in the module <i> wavfile </i> takes one argument (the wav file's path) and returns the tuple <i> freq, data </i>, where freq is the track's sampling frequency (or sample rate, in samples per second) and data is the numpy.array storing the sound's samples. Changing the frequency does not change the data, only the speed at which it is played back: the greater the frequency, the more samples are read per second. In the example below, dividing <i>length</i> (the number of samples) by <i>frequency</i> gives: 


In [7]:
4482048 / 44100

101.63374149659865

This is the track's duration in seconds (about 1 min 42 s).
In practice, we'll code and decode slices of an audio array, using multiples of <i>frequency</i> as slice bounds. 
Let's see what <i> got </i> looks like in numbers. 

In [8]:
freq_got, got_data = wavfile.read("Sound/got/got.wav")
print("Frequency: {} samples/sec".format(freq_got))
print("\nData:\n",got_data)
print("\nData's shape:",got_data.shape)
print("\nData's type:",got_data.dtype)

Frequency: 44100 samples/sec

Data:
 [[0 0]
 [0 0]
 [0 0]
 ..., 
 [0 0]
 [0 0]
 [0 0]]

Data's shape: (4482048, 2)

Data's type: int16


To conclude about frequency, the sliced array
```python
got_data[freq_got : 3*freq_got] 
```
for example, is an array of <font color=#0101DF><b>2*freq_got</b></font> samples. If it is saved with the <font color=#0101DF><b>same frequency freq_got</b></font> and then played, the resulting wav file lasts <font color=#0101DF><b>2 seconds.</b></font> In other words, coding and decoding a 1-second track amounts to coding and decoding an array of freq_got int16 numbers. 
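As a rough illustration of the relation between slice length, frequency, and duration, the sketch below uses a synthetic stereo array in place of a real file. The names `freq`, `data`, and `clip` are placeholders chosen for this example, and 44100 is simply the sample rate quoted above.

```python
import numpy as np

freq = 44100                                     # samples per second (rate quoted above)
data = np.zeros((5 * freq, 2), dtype=np.int16)   # stand-in for 5 s of stereo audio

clip = data[freq : 3 * freq]                     # samples from second 1 up to second 3
duration = clip.shape[0] / freq                  # number of samples / sample rate

# Same arithmetic applied to the full track length given earlier.
minutes, seconds = divmod(4482048 / freq, 60)
print(clip.shape, duration, int(minutes), round(seconds, 1))
```

Slicing along the first axis keeps both channels, so the clip is still a 2D-array; only its number of rows changes.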

So <i> got'</i>s data is a 2D-array of int16 elements. If you keep only one of the columns of <i>got</i>, you may not notice any change in the track. That's because the number of columns is actually the number of <i>channels</i>. One channel is enough to construct a wav file; the second channel is only needed for the <i>stereo effect</i>. 

That's why <b>this submodule only codes and decodes the first channel (i.e. the first column) of a wav file</b> when the file is stereo.

Let's make <i>got</i> a 1D-array version (no stereo effect) of the original track.

In [9]:
got = got_data[:,0]
print("Frequency: {} samples/sec".format(freq_got))
print("\nData:\n",got)
print("\nData's shape:",got.shape)
print("\nData's type:",got.dtype)

Frequency: 44100 samples/sec

Data:
 [0 0 0 ..., 0 0 0]

Data's shape: (4482048,)

Data's type: int16


## 2 - Audio-array to binary array


We now have an array of 4482048 <b>int16</b> numbers. These are signed (they also cover negative integers), i.e. each number is in:

<img src="Equations/SoundC1.png">

To be converted to binary arrays, these numbers must first be made <b> non-negative </b>. 
That's why we'll apply a <i>translation</i> (shift) by 2^15 so that each number of the array is included in:

<img src="Equations/SoundC2.png">

Then, each number can be written as a binary array of 17 bits. In binary, <i>got</i> will be written as a 2D-array shaped (<i>length</i>,17).
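To make the shift concrete, here is a small sketch on a few hand-picked int16 values. The array `samples` is made up for the example. The cast to a wider integer type before adding 2^15 is a precaution of this sketch, since 2^15 itself is not representable in int16.

```python
import numpy as np

# Extreme and middle values of the int16 range.
samples = np.array([-2**15, -1, 0, 2**15 - 1], dtype=np.int16)

# Widen to int32 first: 2**15 itself does not fit in int16.
shifted = samples.astype(np.int32) + 2**15

print(shifted)                 # the smallest sample maps to 0, the largest to 2**16 - 1
print(shifted.max() < 2**17)   # every shifted value fits in 17 bits
```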

In [10]:
def Audio2Bin(audio_array):
    
    """
    Converts the first audio channel (first column) of an int16 audio_array to a 17-bits binary form.
    
    Parameters:
    - audio_array: must be int16. May be a 2D-array, but the function only converts the first channel. 
    
    returns:
    - 17 bits binary audio-array shaped (length,17) where length is the audio_array's length. 
    
    """
    
    #### Keep the first channel of the audio file only:
    if len(audio_array.shape)>1:
        audio = audio_array[:,0]
    else:
        audio = audio_array
        
    length = audio.size 
    
    #### Translate audio by 2^15 so as to make its values non-negative.
    #### Cast to a wider type first: 2**15 does not fit in int16.
    audio = audio.astype(np.int32) + 2**15
    
    audio_bin = np.zeros(shape=(length,17),dtype=int)
    for i in range(length):
        audio_bin[i,:] = pyldpc.ldpcalgebra.int2bitarray(audio[i],17)
        
    return audio_bin

In [11]:
def Bin2Audio(audio_bin):
    
    """
    Converts a 17-bits binary audio array to an int16 1D-(one channel) audio_array.
    
    Parameters:
    - audio_bin: 17 bits binary array shaped (length,17). 
    
    returns:
    - int16 1D-audio-array of size = length.
    
    """
            
    length = audio_bin.shape[0] 
    
    audio = np.zeros(length,dtype=int)
    
    for i in range(length):
        audio[i] = pyldpc.ldpcalgebra.bitarray2int(audio_bin[i,:])
    
    #### Translate audio by - 2^15 so as to make its dtype signed int16.

    audio = audio - 2**15

    return audio.astype(np.int16)
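The per-sample loops above can be slow on long tracks. One possible vectorized variant is sketched below. `to_bits` and `from_bits` are hypothetical helpers written for this example, not part of pyLDPC, and the most-significant-bit-first ordering is an assumption that may differ from the one used by `int2bitarray`.

```python
import numpy as np

def to_bits(audio, width=17):
    """Hypothetical helper: int16 samples -> (length, width) array of 0/1."""
    shifted = audio.astype(np.int64) + 2**15       # make values non-negative
    powers = np.arange(width - 1, -1, -1)          # bit positions, MSB first
    return (shifted[:, None] >> powers) & 1

def from_bits(bits):
    """Hypothetical helper: inverse of to_bits, returns int16 samples."""
    width = bits.shape[1]
    weights = 1 << np.arange(width - 1, -1, -1)    # 2**(width-1), ..., 2, 1
    return (bits @ weights - 2**15).astype(np.int16)

audio = np.array([-32768, -1, 0, 1, 32767], dtype=np.int16)
bits = to_bits(audio)
restored = from_bits(bits)
print(bits.shape, np.array_equal(audio, restored))
```

A round trip like this is a cheap sanity check to run before coding a whole track.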

# Coding & Decoding using small matrices 

## 3 - Coding Audio files

Now that our audio file is a (length,17) binary array, we can encode each 17-bit row with a 17-row coding matrix G (for other values of k, <i>SoundCoding</i> reshapes the bits into rows of k bits). But in order to "listen to" the noisy audio, G must be systematic so that the first bits of each codeword can still be identified as the message after coding. The coding function <i>SoundCoding</i> adds an AWGN (additive white Gaussian noise) with a specified SNR (Signal-to-Noise Ratio), where SNR = 10 log10(1/variance) in decibels.

The coding function <i>SoundCoding</i> is similar to <i>ImageCoding</i>: it gathers the first k noisy bits (the systematic part) of each n-sized codeword so that the "noisy" form of the original song is reconstructed. The rest of each noisy codeword (i.e. the redundant bits) is dropped from that noisy version.
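The sketch below walks through these steps on a single short message: systematic encoding, noise at a given SNR, and hard thresholding of the systematic part. It is an illustration under stated assumptions rather than the code of `pyldpc.Coding`. The small matrix `P` is arbitrary, and the mapping of bit 0 to +1 and bit 1 to -1 is inferred from the `< 0` threshold used in `SoundCoding`.

```python
import numpy as np

rng = np.random.default_rng(0)

k, n = 4, 7
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])                          # arbitrary parity part (assumption)
tG = np.vstack([np.identity(k, dtype=int), P.T])   # systematic: first k rows are I_k

v = np.array([1, 0, 1, 1])                         # message bits
d = tG @ v % 2                                     # codeword bits, d[:k] equals v
x = (-1) ** d                                      # assumed mapping: 0 -> +1, 1 -> -1

snr = 8                                            # in decibels
sigma = 10 ** (-snr / 20)                          # from SNR = 10*log10(1/variance)
y = x + rng.normal(0.0, sigma, size=n)             # noisy codeword

noisy_bits = (y[:k] < 0).astype(int)               # hard decision on the systematic part
print(d[:k], noisy_bits)
```

Lowering `snr` increases `sigma`, so more of the thresholded bits differ from the message.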

<font color="red">Since v0.7, SoundCoding takes tG, the transposed coding matrix, instead of G. </font>
<br>

In [129]:
def SoundCoding(tG,audio_bin,snr):
    
    """ 
    
    Codes a binary audio array (a 2D-array shaped (length,17)). It is reshaped so as to match tG's dimensions. 
    Then a gaussian noise N(0,variance), with variance = 10^(-snr/10), is added to each codeword.
    
    Remember SNR: Signal-Noise Ratio: SNR = 10log10(1/variance) in decibels of the AWGN used in coding.

    Of course, "listening" to an audio file made of n-bits codewords is impossible. That's why, if the coding matrix G is systematic,
    the noisy sound can be read by gathering the k first bits of each 
    n-bits codeword; the redundant bits are dropped. 
    
    returns a tuple: the (rows+1,n) coded audio (its last row stores length), and the noisy audio in binary form, shaped (length,17).
    
    Parameters:

    tG: Transposed Coding Matrix G - must be systematic. See CodingMatrix_systematic.
    audio_bin: 2D-array of a binary audio shaped (length,17).
    snr: Signal-Noise Ratio, SNR = 10log10(1/variance) in decibels of the AWGN used in coding.
    
    
    Returns:
    Tuple: coded_audio, binary noisy_audio
    
    
    """
    
    n,k = tG.shape
    length = audio_bin.shape[0]
    
    ratio = (length*17)//k
    rows = ceil(length*17/k)
    rest = (length*17)%k
        
    audio_bin_reshaped = np.zeros((rows,k),dtype=int)

    audio_bin_reshaped[:ratio,:k] = audio_bin.flatten()[:ratio*k].reshape(ratio,k)
    
    if rest >0:
        audio_bin_reshaped[-1,:rest] = audio_bin.flatten()[-rest:]
        
    if n>=100 and not isinstance(tG, scipy.sparse.csr_matrix):
        warnings.warn("Using scipy.sparse.csr_matrix format is highly recommended when computing coding and decoding with large matrices to speed up calculations.")
        
    if n>=100 and not (tG[:k,:]==np.identity(k)).all():
        raise ValueError("G must be Systematic if n>=100 (for later decoding, solving tG.tv = tx for audio has a O(n^3) complexity.)")
       

    coded_audio = np.zeros(shape=(rows+1,n))
    coded_audio[-1,0]=length
    
    
    for i in range(rows):
        coded_number = pyldpc.Coding(tG,audio_bin_reshaped[i,:],snr)
        coded_audio[i,:] = coded_number
                
    noisy_audio = (coded_audio[:,:k]<0).astype(int).flatten()[:length*17].reshape(length,17)
    
    return coded_audio,noisy_audio
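
The reshaping step is the part of the function that is easiest to get wrong, so here it is in isolation. This is a sketch of the same idea on toy sizes, not the function itself: `length*17` bits are regrouped into rows of `k` bits, the last row is zero-padded, and the original shape is recovered by truncating. The sizes and the random bits are made up for the example.

```python
import numpy as np
from math import ceil

length, k = 5, 4                         # 5 samples of 17 bits, messages of k = 4 bits
audio_bin = np.random.default_rng(1).integers(0, 2, size=(length, 17))

total = length * 17                      # 85 bits in total
ratio, rest = divmod(total, k)           # 21 full rows, 1 leftover bit
rows = ceil(total / k)                   # 22 rows once the leftover is padded

flat = audio_bin.flatten()
padded = np.zeros(rows * k, dtype=int)
padded[:total] = flat                    # trailing zeros fill the last row
messages = padded.reshape(rows, k)

# Undo the packing: drop the padding, then restore the (length, 17) shape.
recovered = messages.flatten()[:total].reshape(length, 17)
print(messages.shape, np.array_equal(recovered, audio_bin))
```

Truncating to `length*17` bits is only possible if `length` is known at decoding time, which is why `SoundCoding` stores it in the extra last row of `coded_audio`.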
    

## 4 - Decoding audio files

The decoding function <i> SoundDecoding </i> uses the parity-check matrix H to decode y = x + e and find the noise-free codeword x. It also takes the coding matrix tG, which is needed to recover the original message v from x through the coding equation tG.tv = tx; when G is systematic, v is simply the first k bits of x. 

When all codewords are decoded, the function returns the audio in binary form, shaped (length,17). <i>Bin2Audio</i> then converts it back to its "reading" format int16.
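
The role of each matrix can be checked on a toy code. The sketch below is not the belief-propagation decoder used by pyLDPC. It only illustrates, on a small systematic code with an arbitrary parity part `P` (an assumption of this example), that H annihilates every codeword, that a flipped bit yields a non-zero syndrome, and that the message is the first k bits of the codeword.

```python
import numpy as np

k, n = 4, 7
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])                                  # arbitrary parity part
tG = np.vstack([np.identity(k, dtype=int), P.T])           # (n, k), systematic
H = np.hstack([P.T, np.identity(n - k, dtype=int)])        # (n - k, n) parity-check

v = np.array([0, 1, 1, 0])
x = tG @ v % 2                                             # noise-free codeword

syndrome = H @ x % 2                                       # all zeros for a valid codeword
x_err = x.copy()
x_err[2] ^= 1                                              # flip one bit
print(syndrome, H @ x_err % 2, x[:k])
```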

<br> 
<font color="blue"> SoundDecoding uses a slightly different version of decoding functions (pyldpc.Decoding_BP_ext, pyldpc.Decoding_logBP_ext) where some "static" parameters related to H are passed to the decoding functions in order to speed up computing. These optimization changes are hidden so that the user will not suffer from radical syntax changes in new releases.</font>

In [120]:
def SoundDecoding(tG,H,audio_coded,snr,max_iter=1,log=1):
    
    """ 
    Sound Decoding Function. Takes the 2D coded audio array, where each row is an n-bits noisy codeword, and decodes 
    every one of them. Needs H to decode and tG to recover the message v from the decoded codeword x (tG.tv = tx); 
    since G is systematic, v is read as the first k bits of x. The decoded audio is returned in binary form. It can then be compared to audio_bin 
    with BER_audio function.
    
    Parameters: 
    
    tG: Transposed Coding Matrix must be systematic.
    H: Parity-Check Matrix (Decoding matrix). 
    audio_coded: coded audio returned by the function SoundCoding. Must be shaped (rows+1, n) where n is
                the length of a codeword (also the number of H's columns); the last row stores the audio length.
    
    snr: Signal-Noise Ratio: SNR = 10log(1/variance) in decibels of the AWGN used in coding.
    
    log: (optional, default = True), if True, Full-log version of BP algorithm is used. 
    max_iter: (optional, default =1), number of iterations of decoding. increase if snr is < 5db. 

    
    """
    
    n,k = tG.shape
    rows,N = audio_coded.shape
    length = int(audio_coded[-1,0])
    
    if N!=n:
        raise ValueError('Coded audio must have the same number of columns as H')
        
    if n>=100 and not isinstance(tG, scipy.sparse.csr_matrix):
        warnings.warn("Using scipy.sparse.csr_matrix format is highly recommended when computing coding and decoding with large matrices to speed up calculations.")
        
    if n>=100 and not (tG[:k,:]==np.identity(k)).all():
        raise ValueError("G must be Systematic. Solving tG.tv = tx for audio has a O(n^3) complexity.")
            
    audio_decoded = np.zeros(shape=(rows-1,k),dtype = int)

    #### Use the "_ext" decoders, which take the precomputed BitsNodes as second argument.
    if log:
        DecodingFunction = pyldpc.Decoding_logBP_ext
    else:
        DecodingFunction = pyldpc.Decoding_BP_ext
       
    BitsNodes = pyldpc.ldpcalgebra.BitsAndNodes(H)

    for j in range(rows-1):

        decoded = DecodingFunction(H,BitsNodes,audio_coded[j,:],snr,max_iter)

        audio_decoded[j,:] = decoded[:k]
    
    audio_decoded = audio_decoded.flatten()[:length*17].reshape(length,17)
 
    return audio_decoded

### Bit Error Rate - Audio: 
This function gives an idea of how accurate the decoding is by comparing the original and the decoded audio files bit by bit. It returns the ratio of erroneous bits to the total number of bits.

In [14]:
def BER_audio(original_audio_bin,decoded_audio_bin):
    """ 
    
    Computes Bit-Error-Rate (BER) by comparing 2 binary audio arrays.
    The ratio of bit errors over total number of bits is returned.
    
    """
    if not original_audio_bin.shape == decoded_audio_bin.shape:
        raise ValueError('Original and decoded audio files\' shapes don\'t match !')
        
    length, k = original_audio_bin.shape 
    
    total_bits  = np.prod(original_audio_bin.shape)

    errors_bits = np.abs(original_audio_bin-decoded_audio_bin).sum()
    
    BER = errors_bits/total_bits 
    
    return(BER)
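
For a quick check of the definition, the same ratio can be computed on two tiny hand-made arrays. The values of `original` and `decoded` below are invented for the example, and `np.mean` over the mismatch mask is an equivalent formulation rather than the function above.

```python
import numpy as np

original = np.array([[0, 1, 1, 0],
                     [1, 0, 0, 1]])
decoded = np.array([[0, 1, 0, 0],
                    [1, 1, 0, 1]])

ber = np.mean(original != decoded)   # fraction of positions that differ
print(ber)                           # 2 mismatches out of 8 bits
```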


## 5 - Application: 
<a href="http://nbviewer.jupyter.org/github/janatiH/pyldpc/blob/master/pyLDPC-Tutorial-Sound.ipynb?flush_cache=true"> User's guide: pyLDPC-Sound Tutorial</a>