# Singular Value Decomposition


SVD is a technique for finding the orthogonal axes that capture the maximum variance in the data. It factors the original data matrix into three component matrices, as shown by the following equation:

 $A = U \Sigma V^{T}$
 

Here, A is an m x n data matrix with m observations (rows) and n attributes (columns). To better understand U, Σ, and $V^{T}$, let's take a deliberately exaggerated dataset of food item ratings given by users. It is a 12 x 9 dataset: 12 users' ratings of 9 food items:

In [1]:
# loading necessary libraries
import pandas as pd
import numpy as np

In [4]:
rating = pd.read_csv("MyFoodRatings.csv")
rating

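|   | Name | Chicken | Mutton | Paneer | ChowMein | SpringRolls | Momo | Sushi | Ramen | Tempura |
|---|------|---------|--------|--------|----------|-------------|------|-------|-------|---------|
| 0 | A | 5 | 5 | 5 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | B | 4 | 4 | 4 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | C | 3 | 3 | 3 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | D | 2 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | E | 0 | 0 | 0 | 2 | 2 | 2 | 0 | 0 | 0 |
| 5 | F | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 |
| 6 | G | 0 | 0 | 0 | 5 | 3 | 4 | 0 | 0 | 0 |
| 7 | H | 0 | 0 | 0 | 4 | 4 | 4 | 0 | 0 | 0 |
| 8 | I | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 2 | 4 |
| 9 | J | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |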


The table has deliberately been kept unrealistic to drive home some key concepts. You may already have noticed a pattern.
 
The first three food items (Chicken, Mutton, and Paneer) are Indian dishes; Chow Mein, Spring Rolls, and Momo are Chinese dishes; and the last three (Sushi, Ramen, and Tempura) are Japanese dishes.
 
Notice that the first four users (A, B, C, and D) mainly eat Indian food, and their ratings for these items are highly correlated with each other. Similarly, users E, F, G, and H prefer Chinese food, and their ratings for the Chinese dishes are highly correlated.
 
Now, do we really need nine columns to represent these observations? Can't we reduce the total number of columns to three, one for each cuisine: Indian, Chinese, and Japanese? This is what SVD can help us accomplish. It finds the columns that share a common theme and combines them linearly, giving each one a certain weight.
 
On applying SVD to the original matrix, you'll get the matrices U, Σ, and $V^{T}$.

U is a Users x Themes matrix: each row describes how strongly one user is associated with each latent theme.

Σ is a diagonal matrix, where each diagonal entry represents the weight of a theme (in decreasing order from left to right).

$V^{T}$ maps the latent themes to the original columns and is a Themes x Original features matrix. 
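
To make the three factors concrete, here is a minimal sketch on a small made-up matrix (the 4 x 3 array `A` below is hypothetical and is not the food-ratings data). It uses NumPy's thin SVD (`full_matrices=False`) to check the shapes, the orthonormality of the factors, and that multiplying them back together recovers `A`.

```python
import numpy as np

# Hypothetical 4 x 3 ratings matrix (4 users, 3 items), for illustration only
A = np.array([
    [5.0, 5.0, 0.0],
    [4.0, 4.0, 0.0],
    [0.0, 0.0, 3.0],
    [0.0, 0.0, 1.0],
])

# Thin SVD: U is (4, 3), S holds 3 singular values, VT is (3, 3)
U, S, VT = np.linalg.svd(A, full_matrices=False)
print(U.shape, S.shape, VT.shape)

# Columns of U and rows of VT are orthonormal
print(np.allclose(U.T @ U, np.eye(3)))
print(np.allclose(VT @ VT.T, np.eye(3)))

# Singular values are returned sorted from largest to smallest
print(np.all(np.diff(S) <= 0))

# Multiplying the factors back together recovers A
print(np.allclose(U @ np.diag(S) @ VT, A))
```

Note that `S` comes back as a 1-D array, so `np.diag(S)` is needed to turn it into the diagonal matrix Σ before multiplying.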

In [6]:
# let's decompose the matrix using SVD now
# (iloc[:, 1:] drops the non-numeric Name column)

U, S, VT = np.linalg.svd(rating.iloc[:, 1:], full_matrices=True, compute_uv=True)

In [10]:
print("The shape of U:  ", U.shape)
print("The shape of S:  ", S.shape)
print("The shape of VT: ", VT.shape)

The shape of U:   (12, 12)
The shape of S:   (9,)
The shape of VT:  (9, 9)
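
These shapes follow from `full_matrices=True`: `U` is square (12 x 12), `VT` is square (9 x 9), and the 9 singular values come back as a flat array. To multiply the factors back together, the singular values first have to be placed on the diagonal of a 12 x 9 matrix. A minimal sketch, using a randomly generated stand-in for the ratings matrix (the array `A` and the seed are assumptions, not the notebook's data):

```python
import numpy as np

# Hypothetical stand-in for the 12 x 9 ratings matrix (random integers 0..5)
rng = np.random.default_rng(0)
A = rng.integers(0, 6, size=(12, 9)).astype(float)

U, S, VT = np.linalg.svd(A, full_matrices=True)

# Build the 12 x 9 Sigma: singular values on the diagonal, zeros elsewhere
Sigma = np.zeros(A.shape)
Sigma[:len(S), :len(S)] = np.diag(S)

# (12 x 12) @ (12 x 9) @ (9 x 9) gives back the 12 x 9 matrix
A_rebuilt = U @ Sigma @ VT
print(U.shape, Sigma.shape, VT.shape)
print(np.allclose(A_rebuilt, A))
```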


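`np.linalg.svd` returns S as a 1-D array holding the diagonal entries of Σ (the singular values), sorted in decreasing order. Its non-zero entries are the weights given to the different themes.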

In [30]:
# display the first five singular values as a diagonal matrix
np.diag(S[:5])

array([[12.72792206,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        , 10.57703788,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  8.84826058,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  1.24205263,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  1.06125858]])

In [32]:
print("The maximum weightage is the first element of the S Matrix :{}".format(S[:1]))

The maximum weightage is the first element of the S Matrix :[12.72792206]


In [36]:
# the smallest of the first five weights (S is sorted in decreasing order)

print("The lowest weight in the first 5 elements of the S Matrix :{}".format(S[4]))

The lowest weight in the first 5 elements of the S Matrix :1.0612585765265095
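
The sharp drop after the third singular value suggests that three themes carry most of the information. One common way to quantify this is the share of the total squared singular values captured by the first k themes. The sketch below is an illustration rather than part of the original notebook: it rebuilds only the ten rows visible in the ratings table (users A to J), so its numbers will not exactly match the 12-row output above, and the 95% threshold is an arbitrary choice.

```python
import numpy as np

# Ratings for the ten users (A..J) visible in the table;
# the full 12-row CSV is not reproduced here
A = np.zeros((10, 9))
A[0:4, 0:3] = [[5, 5, 5], [4, 4, 4], [3, 3, 3], [2, 2, 2]]   # Indian dishes
A[4:8, 3:6] = [[2, 2, 2], [1, 1, 1], [5, 3, 4], [4, 4, 4]]   # Chinese dishes
A[8:10, 6:9] = [[2, 2, 4], [1, 1, 1]]                        # Japanese dishes

S = np.linalg.svd(A, compute_uv=False)   # singular values only

energy = S**2 / np.sum(S**2)             # share of squared weight per theme
cumulative = np.cumsum(energy)

threshold = 0.95  # arbitrary cut-off chosen for illustration
k = int(np.argmax(cumulative >= threshold)) + 1

print(np.round(cumulative[:5], 3))
print("themes needed:", k)
```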


In [41]:
# the elements of VT matrix

VT_df = pd.DataFrame(VT)
VT_df

|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.57735 | -0.57735 | -0.57735 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 |
| 1 | 0.0 | 0.0 | 0.0 | -0.637875 | -0.512261 | -0.575068 | 0.0 | 0.0 | 0.0 |
| 2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -0.476163 | -0.557638 | -0.679933 |
| 3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.29611 | 0.626381 | -0.721087 |
| 4 | -0.0 | -0.0 | -0.0 | 0.65303 | -0.755593 | -0.051282 | -0.0 | -0.0 | -0.0 |
| 5 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | 0.828002 | -0.54469 | -0.133137 |
| 6 | 0.0 | 0.0 | 0.0 | -0.408248 | -0.408248 | 0.816497 | 0.0 | 0.0 | 0.0 |
| 7 | -0.816497 | 0.408248 | 0.408248 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 |
| 8 | 0.0 | -0.707107 | 0.707107 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
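
Each row of `VT` is one theme, and the magnitude of an entry shows how strongly the corresponding dish contributes to that theme. In the output above, row 0 loads only on the first three columns, row 1 on the middle three, and row 2 on the last three. The sketch below lists the dishes with a non-negligible loading for each of the first three themes. It uses only the ten rows visible in the ratings table (so the exact loadings differ from the 12-row output), and the 0.1 cut-off is an assumption made for illustration.

```python
import numpy as np

dishes = ["Chicken", "Mutton", "Paneer", "ChowMein", "SpringRolls",
          "Momo", "Sushi", "Ramen", "Tempura"]

# Ratings for the ten users (A..J) visible in the table
A = np.zeros((10, 9))
A[0:4, 0:3] = [[5, 5, 5], [4, 4, 4], [3, 3, 3], [2, 2, 2]]   # Indian dishes
A[4:8, 3:6] = [[2, 2, 2], [1, 1, 1], [5, 3, 4], [4, 4, 4]]   # Chinese dishes
A[8:10, 6:9] = [[2, 2, 4], [1, 1, 1]]                        # Japanese dishes

_, _, VT = np.linalg.svd(A, full_matrices=False)

cutoff = 0.1  # illustrative threshold for "contributes to this theme"
themes = []
for i, row in enumerate(VT[:3]):
    # The sign of a singular vector is arbitrary, so compare magnitudes
    members = [dish for dish, w in zip(dishes, row) if abs(w) > cutoff]
    themes.append(members)
    print("theme", i, members)
```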


In [43]:
# the first three columns of U matrix
U[:,:3]

array([[-6.80413817e-01,  3.02164435e-17, -1.54169932e-17],
       [-5.44331054e-01,  9.78791554e-18, -4.99397711e-18],
       [-4.08248290e-01,  7.34093665e-18, -3.74548283e-18],
       [-2.72165527e-01, -1.06128345e-16,  5.41486614e-17],
       [ 0.00000000e+00, -3.26217015e-01, -7.66682207e-17],
       [ 0.00000000e+00, -1.63108508e-01,  5.08164222e-17],
       [ 0.00000000e+00, -6.64310100e-01,  6.16297582e-33],
       [ 0.00000000e+00, -6.52434031e-01,  2.56300048e-17],
       [ 0.00000000e+00,  0.00000000e+00, -5.41047938e-01],
       [ 0.00000000e+00,  0.00000000e+00, -1.93680321e-01],
       [ 0.00000000e+00,  0.00000000e+00, -7.20906926e-01],
       [ 0.00000000e+00,  0.00000000e+00, -3.87360641e-01]])
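
The first three columns of `U` show that every user has a meaningful entry in exactly one theme; the values around 1e-17 are floating-point noise. Scaling those columns by the matching singular values gives each user's coordinates in the three-theme space, which is the reduced representation that replaces the nine original columns. A minimal sketch, assuming only the ten rows visible in the ratings table (so values differ from the 12-row output above):

```python
import numpy as np

# Ratings for the ten users (A..J) visible in the table
A = np.zeros((10, 9))
A[0:4, 0:3] = [[5, 5, 5], [4, 4, 4], [3, 3, 3], [2, 2, 2]]   # Indian dishes
A[4:8, 3:6] = [[2, 2, 2], [1, 1, 1], [5, 3, 4], [4, 4, 4]]   # Chinese dishes
A[8:10, 6:9] = [[2, 2, 4], [1, 1, 1]]                        # Japanese dishes

U, S, VT = np.linalg.svd(A, full_matrices=False)

k = 3
user_coords = U[:, :k] * S[:k]          # 10 x 3: each user in theme space

# Theme with the largest magnitude for each user
dominant_theme = np.argmax(np.abs(user_coords), axis=1)
print(dominant_theme)

# Rank-3 approximation of the original 10 x 9 matrix and its relative error
A_k = user_coords @ VT[:k]
rel_error = np.linalg.norm(A - A_k) / np.linalg.norm(A)
print(round(float(rel_error), 3))
```

A small relative error here indicates that three columns retain most of what the nine original columns contained.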