### LDA (Linear Discriminant Analysis)

1) Instead of finding new axes (dimensions) that maximize the variation in the data, as PCA does, LDA focuses on maximizing the separability among the known categories (classes) in the target variable.

2) Unlike Principal Component Analysis (PCA), LDA requires you to provide features and class labels for your target. Hence, despite being a dimensionality reduction technique similar to PCA, it sits within the supervised branch of Machine Learning.

<img src="lda1.png">

3) The goal of PCA is to capture the maximum amount of variance, which is achieved by the algorithm finding a line that minimizes the distances from data points to that line. Interestingly, this is equivalent to maximizing the spread of data point projections on that same line, which is why we can capture the maximum amount of variance.
<img src="lda2.png">
<img src="lda3.png">
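
The equivalence in point 3 follows from the Pythagorean theorem: for centered data, each point's squared length splits into its squared projection onto the line plus its squared perpendicular distance to the line, and the total does not depend on the line. A minimal sketch of this, assuming synthetic 2-D data (the data and the helper name `projection_stats` are illustrative, not part of this notebook):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic, correlated 2-D data (an assumption for illustration only)
X = rng.normal(size=(200, 2)) @ np.array([[2.0, 0.0], [1.2, 0.5]])
Xc = X - X.mean(axis=0)  # centre the data so lines pass through the mean

def projection_stats(Xc, angle):
    """Mean squared projection and mean squared perpendicular distance
    for the line through the origin at the given angle."""
    w = np.array([np.cos(angle), np.sin(angle)])  # unit direction vector
    proj = Xc @ w                                 # signed projection lengths
    resid = Xc - np.outer(proj, w)                # perpendicular offsets
    return np.mean(proj ** 2), np.mean(np.sum(resid ** 2, axis=1))

angles = np.linspace(0.0, np.pi, 181)
stats = np.array([projection_stats(Xc, a) for a in angles])
proj_var, resid_dist = stats[:, 0], stats[:, 1]

# The two quantities always add up to the same total
total = np.mean(np.sum(Xc ** 2, axis=1))
print(np.allclose(proj_var + resid_dist, total))

# So the angle with the largest spread is the angle with the smallest distance
best_by_variance = int(np.argmax(proj_var))
best_by_distance = int(np.argmin(resid_dist))
print(angles[best_by_variance], angles[best_by_distance])
```

Because the sum is constant, pushing the projected variance up necessarily pushes the perpendicular distances down.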

4) The goal of LDA is to maximize the separability of the known categories in our target variable (“cheap”, “expensive”) while at the same time reducing the number of dimensions.

<img src="lda4.png">
<img src="lda5.png">

### How does LDA find the right axis?

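<b>1) Maximize the distance between class means (d²).</b><br>
2 categories: when you have two classes in your target variable, the distance is the difference between the mean (μ) of class 1 and the mean of class 2, measured along the new axis.<br>
More than 2 categories: when you have three or more classes in your target variable, the algorithm first finds a central point for all of the data and then measures the distance from each category mean (μ) to that central point.

<b>2) Minimize the variation within each class, also known as “scatter” in LDA (s²).</b><br>
LDA picks the axis that maximizes the ratio of the two quantities, d² / (s₁² + s₂²) in the two-class case, so that the class means land far apart while each class stays tightly clustered.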

<img src="lda6.png">
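
The two-class criterion can be checked numerically. The sketch below is not the notebook's method: it assumes two synthetic Gaussian classes and a hypothetical helper `fisher_ratio`, and compares the ratio d² / (s₁² + s₂²) along the closed-form two-class LDA direction, S_w⁻¹(μ₁ − μ₂), with the ratio along the line that simply joins the two means.

```python
import numpy as np

rng = np.random.default_rng(1)
# Two synthetic classes with a shared, correlated covariance (assumption)
cov = np.array([[3.0, 2.0], [2.0, 3.0]])
X1 = rng.multivariate_normal([0.0, 0.0], cov, size=300)
X2 = rng.multivariate_normal([3.0, 1.0], cov, size=300)

def fisher_ratio(w, X1, X2):
    """d^2 / (s1^2 + s2^2) for the 1-D projections of both classes onto w."""
    p1, p2 = X1 @ w, X2 @ w
    d2 = (p1.mean() - p2.mean()) ** 2                      # distance between means
    scatter = np.sum((p1 - p1.mean()) ** 2) + np.sum((p2 - p2.mean()) ** 2)
    return d2 / scatter

mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
# Within-class scatter matrix: sum of the per-class scatter matrices
S_w = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)

w_lda = np.linalg.solve(S_w, mu1 - mu2)  # closed-form two-class direction
w_means = mu1 - mu2                      # naive axis joining the two means

ratio_lda = fisher_ratio(w_lda, X1, X2)
ratio_means = fisher_ratio(w_means, X1, X2)
print("ratio along LDA direction  :", ratio_lda)
print("ratio along mean difference:", ratio_means)
```

The ratio does not depend on the length of `w`, only on its direction, which is why no normalization is needed here.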

In [1]:
# PCA
# 1) Standardize the dataset
# 2) Compute the covariance matrix
# 3) Generate the eigenvalues and eigenvectors of the covariance matrix
#    The eigenvectors are arranged in decreasing order of eigenvalue,
#    which is the order of significance of the principal components
# 4) The chosen number of principal components forms the feature vector/map
# 5) Resultant principal_comps = np.dot(StandardizedOriginalDataSet, FeatureVector^T)
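
The five steps above can be sketched in NumPy as follows. This is a minimal illustration under assumptions, not a replacement for `sklearn.decomposition.PCA`: the data is synthetic, `k = 2` is an arbitrary choice, and the feature vector is assumed to store the selected eigenvectors as rows so that step 5 matches the formula above.

```python
import numpy as np

rng = np.random.default_rng(42)
# Synthetic stand-in for a dataset: 100 samples, 4 correlated features
X = rng.normal(size=(100, 4)) @ rng.normal(size=(4, 4))

# 1) Standardize each feature to zero mean and unit variance
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2) Covariance matrix of the standardized features (columns are features)
cov = np.cov(X_std, rowvar=False)

# 3) Eigen decomposition; eigh suits symmetric matrices and returns
#    eigenvalues in ascending order, so reverse to get decreasing order
eig_vals, eig_vecs = np.linalg.eigh(cov)
order = np.argsort(eig_vals)[::-1]
eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]

# 4) Keep the top-k eigenvectors as the rows of the feature vector
k = 2
feature_vector = eig_vecs[:, :k].T  # shape (k, n_features)

# 5) Project the standardized data onto the selected components
principal_comps = np.dot(X_std, feature_vector.T)
print(principal_comps.shape)  # (100, 2)
```

The projected columns are uncorrelated, and their variances equal the two largest eigenvalues.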

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [3]:
from sklearn.datasets import load_wine

In [11]:
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
# LDA is a supervised algorithm: fit() needs the class labels as well as the features

In [12]:
dt = load_wine()
x = dt.data
y = dt.target
print(x.shape)
print(y.shape)

(178, 13)
(178,)


In [13]:
print(dt.feature_names)

['alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash', 'magnesium', 'total_phenols', 'flavanoids', 'nonflavanoid_phenols', 'proanthocyanins', 'color_intensity', 'hue', 'od280/od315_of_diluted_wines', 'proline']


In [14]:
print(list(set(y)))  # distinct class labels in the target variable

[0, 1, 2]


In [15]:
lda = LinearDiscriminantAnalysis()
lda_res = lda.fit_transform(x,y)
print(lda_res.shape)
print(type(lda_res))

(178, 2)
<class 'numpy.ndarray'>


#### Note
1) n_components cannot be larger than min(n_features, n_classes - 1)<br>
2) Ex: min(13, 3 - 1) = min(13, 2) = 2
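
One way to see where the `n_classes - 1` limit comes from: the discriminant axes are built from the between-class scatter matrix, which is a sum of one outer product per class mean, and those mean offsets are tied together by the overall mean. The sketch below uses synthetic data shaped loosely like the wine set (13 features, 3 classes); the data and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_classes, n_per_class = 13, 3, 50

# Synthetic data: each class is a Gaussian blob around its own random mean
X = np.vstack([
    rng.normal(loc=rng.normal(scale=3.0, size=n_features),
               size=(n_per_class, n_features))
    for _ in range(n_classes)
])
y = np.repeat(np.arange(n_classes), n_per_class)

# Between-class scatter: weighted outer products of (class mean - overall mean)
overall_mean = X.mean(axis=0)
S_b = np.zeros((n_features, n_features))
for c in range(n_classes):
    X_c = X[y == c]
    diff = (X_c.mean(axis=0) - overall_mean).reshape(-1, 1)
    S_b += len(X_c) * (diff @ diff.T)

# Only n_classes - 1 directions carry between-class information
rank = int(np.linalg.matrix_rank(S_b, tol=1e-8))
print(rank, min(n_features, n_classes - 1))
```

With 13 features but only 3 classes, the matrix is 13 × 13 yet has rank 2, so there are at most two useful discriminant axes.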

In [16]:
df_lda = pd.DataFrame(lda_res,columns=['lda1','lda2'])
df_lda.head()

       lda1      lda2
0 -4.700244  1.979138
1 -4.301958  1.170413
2 -3.420720  1.429101
3 -4.205754  4.002871
4 -1.509982  0.451224


In [17]:
df_lda.shape

(178, 2)

In [19]:
x = df_lda
print(x.shape)
print(y.shape)

(178, 2)
(178,)


In [21]:
from sklearn.model_selection import train_test_split

In [23]:
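# Note: LDA above was fit on all 178 rows, so the test rows already influenced the projection.
# No random_state is set, so the split (and the scores below) change from run to run.
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.25)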
print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_test.shape)

(133, 2)
(45, 2)
(133,)
(45,)
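
Since the cells above fit LDA on the full dataset before splitting, the test rows have already shaped the projection. A possible leakage-free variant is sketched below: it splits first and wraps LDA and the classifier in a pipeline so that LDA is fitted on the training rows only. The `random_state=0`, the `stratify` argument and the default `C` for `SVC` are assumptions added here, not settings from this notebook, so the scores will differ from those reported below.

```python
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X_raw, y_raw = load_wine(return_X_y=True)

# Split BEFORE fitting LDA so the test rows never influence the projection.
# random_state and stratify are assumptions added for reproducibility.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_raw, y_raw, test_size=0.25, random_state=0, stratify=y_raw
)

# The pipeline fits LDA on the training data, then trains the SVM on the
# 2-D projection; at predict time the same fitted projection is reused.
pipe = make_pipeline(
    LinearDiscriminantAnalysis(n_components=2),
    SVC(kernel="linear"),  # default C=1.0 (assumption, not the notebook's value)
)
pipe.fit(X_tr, y_tr)

train_score = pipe.score(X_tr, y_tr)
test_score = pipe.score(X_te, y_te)
print("Train score", train_score)
print("Test score", test_score)
```

Keeping the transformer inside the pipeline also means cross-validation utilities refit LDA on each training fold automatically.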


In [33]:
from sklearn.svm import SVC

In [42]:
# Linear SVM; C=0.001 applies very strong regularization (scikit-learn's default is C=1.0)
m1 = SVC(kernel='linear',C=0.001)
m1.fit(x_train,y_train)

In [43]:
# Accuracy on the train and test sets
print('Train score',m1.score(x_train,y_train))
print('Test score',m1.score(x_test,y_test))

Train score 0.8646616541353384
Test score 0.8444444444444444


In [44]:
ypred_m1 = m1.predict(x_test)

In [45]:
from sklearn.metrics import confusion_matrix,classification_report

In [46]:
print(confusion_matrix(y_test,ypred_m1))
print(classification_report(y_test,ypred_m1))

[[13  2  0]
 [ 0 16  0]
 [ 0  5  9]]
              precision    recall  f1-score   support

           0       1.00      0.87      0.93        15
           1       0.70      1.00      0.82        16
           2       1.00      0.64      0.78        14

    accuracy                           0.84        45
   macro avg       0.90      0.84      0.84        45
weighted avg       0.89      0.84      0.84        45
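
The figures in the classification report can be re-derived directly from the confusion matrix above, where rows are true classes and columns are predicted classes. A short check, with the matrix copied from the output above and all variable names chosen here for illustration:

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true, columns: predicted)
cm = np.array([[13, 2, 0],
               [0, 16, 0],
               [0, 5, 9]])

true_positives = np.diag(cm)
precision = true_positives / cm.sum(axis=0)  # column sums = predicted count per class
recall = true_positives / cm.sum(axis=1)     # row sums = support per class
f1 = 2 * precision * recall / (precision + recall)
accuracy = true_positives.sum() / cm.sum()

print("precision", np.round(precision, 2))
print("recall   ", np.round(recall, 2))
print("f1       ", np.round(f1, 2))
print("accuracy ", round(float(accuracy), 2))
```

Class 1 has perfect recall but the lowest precision because 2 samples of class 0 and 5 samples of class 2 were predicted as class 1.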

