# Supervised Learning and Image Compression using SVD

This notebook was completed as part of the Computational Methods for Data Analysis class at UW.

The purpose of this notebook is to work more effectively with image and music data. One algorithm, singular value decomposition (SVD), was applied to both to make the data smaller and easier to work with.


## Theoretical Background: 
### Principal Component Analysis and Singular Value Decomposition

Both the image and the music analyses start with a singular value decomposition (SVD) of the data matrix. SVD is similar to an eigendecomposition, but it can be performed on any real or complex matrix, including non-square ones.

<center>$A = U\Sigma V^*$ (Equation 1)</center>  
  
Equation 1 shows the general form of the SVD of a real or complex matrix $A$ of any size. $A$ is factored into two unitary matrices, $U$ and $V^*$, and a diagonal matrix $\Sigma$. The diagonal entries of $\Sigma$ are the singular values of $A$; the nonzero ones are the square roots of the nonzero eigenvalues of $AA^*$. When each column of $A$ is a mean-centered data point, the columns of $U$ are the principal components of $A$, and $V$ holds the weights of each principal component on each data point in $A$.

In Python this is easy to do with the `svd` function in NumPy's `linalg` module:

```python
from numpy.linalg import svd
```
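As a minimal sketch of how the decomposition can be used for compression, the example below factors a small random matrix, rebuilds it, and forms a rank-`k` approximation. The matrix `A`, the seed, and the choice `k = 2` are illustrative assumptions, not the data used in this notebook. Note that `svd` returns the singular values as a 1-D array and returns $V^*$ directly, not $V$.

```python
import numpy as np
from numpy.linalg import svd

# Hypothetical 6x4 data matrix (stand-in for an image or spectrogram matrix).
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))

# Economy-size SVD: U is 6x4, s holds 4 singular values, Vh is V* (4x4).
U, s, Vh = svd(A, full_matrices=False)

# Rebuild A from its factors; np.diag turns the 1-D array s into Sigma.
A_rebuilt = U @ np.diag(s) @ Vh

# Keep only the k largest singular values for a low-rank approximation.
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]

# In the spectral norm, the approximation error is the first discarded
# singular value (Eckart-Young theorem).
err = np.linalg.norm(A - A_k, 2)
```

Storing `U[:, :k]`, `s[:k]`, and `Vh[:k, :]` instead of `A` is what makes the data smaller: the saving grows with the matrix size when `k` is much smaller than either dimension.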

### Gabor Filtering

All of the music data were first converted into spectrograms using Gabor filtering. Gabor filtering is a windowed variant of the discrete Fourier transform that makes it possible to study both which frequencies a signal contains and when they appear in that signal.

One issue with the Fourier transform is that it has no time component, i.e. it gives no information about when the frequencies appear in the signal. To get around this, Gabor filtering performs a series of Fourier transforms on different time slices of the signal. A filter isolates a specific time window, and the Fourier transform is applied to the filtered signal, producing only the frequencies that appear in that slice of time. Sliding the window along the signal and stacking the results gives the spectrogram.
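The windowing idea can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `gabor_spectrogram`, the Gaussian window shape and its `width`, and the two-tone test signal are all hypothetical and are not taken from the notebook's own code.

```python
import numpy as np

def gabor_spectrogram(signal, sample_rate, window_centers, width):
    """Return (freqs, spec); spec[:, j] is the spectrum near window_centers[j]."""
    t = np.arange(len(signal)) / sample_rate
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spec = np.empty((len(freqs), len(window_centers)))
    for j, tau in enumerate(window_centers):
        # Gaussian filter centered at time tau (an assumed window shape).
        window = np.exp(-((t - tau) ** 2) / (2 * width ** 2))
        # Fourier transform of only the part of the signal near tau.
        spec[:, j] = np.abs(np.fft.rfft(signal * window))
    return freqs, spec

# Hypothetical test signal: a 50 Hz tone for one second, then 120 Hz.
sample_rate = 1000
t = np.arange(2 * sample_rate) / sample_rate
signal = np.where(t < 1.0,
                  np.sin(2 * np.pi * 50 * t),
                  np.sin(2 * np.pi * 120 * t))

centers = np.linspace(0, 2, 21)            # window centers every 0.1 s
freqs, spec = gabor_spectrogram(signal, sample_rate, centers, width=0.1)

early_peak = freqs[np.argmax(spec[:, 5])]   # window centered at t = 0.5 s
late_peak = freqs[np.argmax(spec[:, 15])]   # window centered at t = 1.5 s
```

A plain Fourier transform of `signal` would show both tones with no indication of their order. Here `early_peak` and `late_peak` pick out the dominant frequency in each half separately. A narrower `width` gives better time resolution at the cost of frequency resolution, and vice versa.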


### Machine Learning

Two machine learning techniques were used to predict the genre or band of songs. In both techniques, the algorithm is trained on a training set and then evaluated on a separate set of test data. The training set consists of data points together with their labels, which makes both techniques examples of supervised machine learning.

#### Naive Bayes
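The first algorithm used is a naive Bayes classifier. The naive Bayes classifier calculates the probability that a new data point belongs to each cluster (class), based on the training set. The prior probability of each class is estimated from how often that class appears in the training data, so the classifier handles classes with different amounts of training data well.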
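To make the calculation concrete, here is a minimal Gaussian naive Bayes sketch in NumPy. It assumes each feature is normally distributed within a class, which is one common variant and not necessarily the one used in this notebook. The function names and the synthetic two-class data are hypothetical.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate class priors and per-feature Gaussian parameters."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])   # class frequencies
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Small constant keeps the variances strictly positive.
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, variances

def predict_gaussian_nb(model, X):
    """Pick the class with the largest log posterior for each row of X."""
    classes, priors, means, variances = model
    diff = X[:, None, :] - means[None, :, :]        # shape (n, classes, features)
    # "Naive" step: features are treated as independent, so the
    # per-feature log-likelihoods are simply summed.
    log_lik = -0.5 * np.sum(np.log(2 * np.pi * variances) + diff ** 2 / variances,
                            axis=2)
    log_post = np.log(priors) + log_lik
    return classes[np.argmax(log_post, axis=1)]

# Hypothetical training data with unequal class sizes (80 vs. 20 points).
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 1.0, (80, 2)),
                     rng.normal(4.0, 1.0, (20, 2))])
y_train = np.array([0] * 80 + [1] * 20)

model = fit_gaussian_nb(X_train, y_train)
X_test = np.array([[0.2, -0.1], [3.8, 4.1]])
predictions = predict_gaussian_nb(model, X_test)
```

The priors here come out as 0.8 and 0.2, mirroring the class sizes in the training set. Working in log probabilities avoids numerical underflow when many features are multiplied together.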

#### Linear Discriminant Analysis
The second algorithm used was linear discriminant analysis (LDA). LDA looks for the linear cut that best separates the known, labeled clusters of data. It can then predict which cluster a new data point belongs to by checking which side of the linear cuts it falls on.
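For two clusters, the linear cut can be sketched with Fisher's formulation of LDA in NumPy. This is an illustrative sketch only: the function names and synthetic data are hypothetical, and placing the threshold midway between the projected class means assumes the two classes are equally likely.

```python
import numpy as np

def fit_lda_two_class(X, y):
    """Return the projection direction w and decision threshold for labels 0/1."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix.
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    # Direction that best separates the class means relative to the scatter.
    w = np.linalg.solve(Sw, mu1 - mu0)
    # Assumed equal priors: cut halfway between the projected means.
    threshold = w @ (mu0 + mu1) / 2
    return w, threshold

def predict_lda(w, threshold, X):
    """Label 1 if the projection falls on the class-1 side of the cut."""
    return (X @ w > threshold).astype(int)

# Hypothetical training data: two well-separated 2-D clusters.
rng = np.random.default_rng(1)
X_train = np.vstack([rng.normal(0.0, 1.0, (50, 2)),
                     rng.normal(4.0, 1.0, (50, 2))])
y_train = np.array([0] * 50 + [1] * 50)

w, threshold = fit_lda_two_class(X_train, y_train)
X_test = np.array([[0.0, 0.0], [4.0, 4.0]])
predictions = predict_lda(w, threshold, X_test)
```

The decision boundary is the set of points where `X @ w` equals `threshold`, which is a line in 2-D and a hyperplane in higher dimensions. With more than two classes, one such cut is needed between each pair of clusters.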