In [23]:
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.decomposition import sparse_encode
import pandas as pd
import numpy as np

## Loading the iris dataset
### The Iris dataset contains:
1. iris.data, an array in which each row is the feature vector for one sample.
2. iris.target, a vector of class labels, one for each corresponding row of iris.data.

In [24]:
iris = datasets.load_iris()
X = iris.data
Y = iris.target
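
As a quick sanity check of the layout described above, the following sketch prints the shapes of the two arrays and the distinct class labels. It reloads the dataset so that it runs on its own; the variable names are only illustrative.

```python
from sklearn import datasets

iris = datasets.load_iris()

# One row per sample, one column per feature
print(iris.data.shape)

# One class label per row of iris.data
print(iris.target.shape)

# Distinct class labels present in the dataset
labels = sorted(set(iris.target.tolist()))
print(labels)
```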

## Principal components analysis
Each sample in the Iris dataset is represented by a four-dimensional feature vector, and 4D data is difficult to visualise directly.
One way around this is to apply Principal Components Analysis (PCA) to reduce the dimensionality of the data.

## Performing PCA on the Iris dataset:
PCA computes vectors that you can project your data onto in order to reduce the dimension of your data.

Since each row of your data is 4-dimensional, there can be at most 4 vectors onto which the data is projected, and each of those vectors is itself 4-dimensional.

Each row of the fitted PCA's components_ attribute is a single vector onto which the data gets projected; its length equals the number of columns in your training data.

When you call transform, you ask sklearn to actually do the projection: each row of your data is projected into the vector space that was learned when fit was called. For each row you pass to transform you get one row in the output, and the number of columns in that row equals the number of vectors learned in the fit phase, which is the value of n_components you passed.
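
To make the relationship between fit, components_ and transform concrete, here is a minimal sketch that reproduces the projection by hand. It assumes the default PCA settings (no whitening), in which case transform amounts to centring the data with the learned mean_ and multiplying by the transposed components_.

```python
import numpy as np
from sklearn import datasets
from sklearn.decomposition import PCA

X = datasets.load_iris().data

pca = PCA(n_components=2).fit(X)

# Two learned vectors, each with one entry per input feature
print(pca.components_.shape)

# Manual projection: centre the data, then project onto each component
manual = (X - pca.mean_) @ pca.components_.T

# One output row per input row, n_components columns
print(manual.shape)

# The manual projection should agree with sklearn's transform
matches = np.allclose(manual, pca.transform(X))
print(matches)
```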

In [25]:
pca_vectors = np.array([[7.2,2.8,4.4,0.3],
                        [5.1,4.1,5.9,2.3],
                        [7.9,3.0,2.5,0.6],
                        [6.1,3.4,6.6,2.2],
                        [6.1,2.6,4.7,0.9]])

pca = PCA(n_components=2)
pca.fit(X)  # PCA is unsupervised, so the class labels Y are not needed
pca.transform(pca_vectors).round(4)

array([[ 0.7398,  0.6595],
       [ 1.8726, -0.1812],
       [-0.5443,  1.5719],
       [ 2.857 , -0.1495],
       [ 0.8311, -0.3061]])

## Sparse Coding
The sparse codes are computed with the "Orthogonal Matching Pursuit" (OMP) method, which is available through the sparse_encode function from the sklearn.decomposition module.
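
As a small illustration of what OMP returns, the sketch below encodes a made-up vector against a made-up identity dictionary. Both are assumptions chosen so the result is easy to read, and they are not part of the Iris experiment. With n_nonzero_coefs=2, only two atoms are allowed to carry non-zero coefficients.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

# Toy dictionary: 4 atoms (rows), each a 4-dimensional unit vector
toy_dictionary = np.eye(4)

# Toy signal whose two largest entries sit on atoms 0 and 3
toy_signal = np.array([[3.0, 0.0, 0.5, 2.0]])

toy_code = sparse_encode(
    toy_signal,
    toy_dictionary,
    algorithm='omp',
    n_nonzero_coefs=2,
)

# One row of coefficients per signal, one column per dictionary atom
print(toy_code.shape)

# Only two atoms are used; the small 0.5 entry is left unexplained
print(toy_code)
```

The output has one coefficient per dictionary row, which is why the class dictionaries used later produce codes with as many entries as there are training samples in the class.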

In [27]:
sparse_coef_vectors = np.array([[6.3,2.3,2.6,2.0],
                                [5.6,2.9,5.9,1.0],
                                [7.6,3.9,4.5,1.3],
                                [4.8,3.2,3.3,1.1],
                                [5.9,2.4,6.1,2.3]])

num_non_zero = 2
tolerance = 1e-5

## Defining a dictionary
Each class gets its own dictionary, which contains all the samples from the Iris dataset that belong to that class.

In [28]:
class_0_data = X[Y == 0]  # rows of X whose label is 0
class_1_data = X[Y == 1]  # rows of X whose label is 1
class_2_data = X[Y == 2]  # rows of X whose label is 2

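## Applying sparse_encode with the newly defined dictionaries and defining the cost function
The cost associated with each encoding is calculated as $\|x - V^T y\|_2 + \lambda \|y\|_0$, where $\lambda = 0.1$. Here $x$ is the sample being encoded, $V$ is the class dictionary (one training sample per row) and $y$ is the sparse code returned by sparse_encode.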
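
The cost can be worked through on a tiny example. The helper sparse_cost below is a hypothetical function written for illustration, and the dictionary, code and sample are made-up values. It counts the non-zero entries of the code directly instead of assuming a fixed number.

```python
import numpy as np

def sparse_cost(x, V, y, lam=0.1):
    """Return ||x - V^T y||_2 + lam * ||y||_0 for a single sample."""
    x = np.asarray(x, dtype=float)   # shape (n_features,)
    V = np.asarray(V, dtype=float)   # shape (n_atoms, n_features)
    y = np.asarray(y, dtype=float)   # shape (n_atoms,)
    residual = x - V.T @ y           # what the code fails to reconstruct
    return float(np.linalg.norm(residual) + lam * np.count_nonzero(y))

# Made-up dictionary with 2 atoms in 3 dimensions
V = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
y = np.array([2.0, 0.0])             # uses only the first atom
x = np.array([2.0, 0.0, 1.0])

# Residual is [0, 0, 1] (norm 1) and the code has one non-zero entry
cost = sparse_cost(x, V, y)
print(round(cost, 4))  # 1.1
```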

In [33]:
class_dictionaries = [class_0_data, class_1_data, class_2_data]

for idx, sample in enumerate(sparse_coef_vectors, start=1):
    costs = []
    for class_dict in class_dictionaries:
        class_pred = sparse_encode(
            [sample],
            class_dict, 
            algorithm='omp', 
            n_nonzero_coefs=num_non_zero, 
            alpha=tolerance
        )
        VT_y = class_dict.T.dot(class_pred.T)  # Reconstruction V^T y from the sparse code
        x_minus_VT_y = sample - VT_y.T  # Residual between the sample and its reconstruction
        first_part = np.linalg.norm(x_minus_VT_y)  # Euclidean norm (square root of the sum of squares)
        second_part = 0.1 * num_non_zero  # lambda * ||y||_0, assuming OMP uses all num_non_zero coefficients
        
        cost = np.round(first_part + second_part, 4)
        
        costs.append(cost)
        
    print(f'Sample {idx} costs for dictionaries for class 0, 1 and 2: {costs}')
    print(f'Predicted class for sample {idx} is: {np.argmin(costs)}')

Sample 1 costs for dictionaries for class 0, 1 and 2: [1.9066, 1.5721, 1.4768]
Predicted class for sample 1 is: 2
Sample 2 costs for dictionaries for class 0, 1 and 2: [1.7357, 1.1069, 0.47]
Predicted class for sample 2 is: 2
Sample 3 costs for dictionaries for class 0, 1 and 2: [2.5009, 0.5616, 1.7803]
Predicted class for sample 3 is: 1
Sample 4 costs for dictionaries for class 0, 1 and 2: [1.3761, 0.7526, 0.736]
Predicted class for sample 4 is: 2
Sample 5 costs for dictionaries for class 0, 1 and 2: [3.006, 0.6005, 0.9553]
Predicted class for sample 5 is: 1
