# <center> Linear Discriminant Analysis
    
<b>LDA</b> makes predictions by estimating the probability that a new set of inputs belongs to each class. The class with the highest probability is returned as the prediction.

The model uses <b>Bayes’ Theorem</b> to estimate these probabilities. Briefly, Bayes’ Theorem gives the probability of the output class (k) given the input (x) in terms of the prior probability of each class and the probability of observing x within each class:

<b>P(Y=k|X=x) = (PIk * fk(x)) / sum(PIl * fl(x))</b>

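Here PIk is the base probability of class k observed in your training data (e.g. 0.5 for a 50-50 split in a two-class problem), and the sum in the denominator runs over all classes l. In Bayes’ Theorem this is called the prior probability. It is estimated as nk, the number of training instances in class k, divided by n, the total number of training instances: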

<b>PIk = nk/n</b>
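
As a minimal sketch of the prior estimate, the snippet below counts class frequencies with NumPy. The label array `y_toy` is a hypothetical example, not data from this notebook.

```python
import numpy as np

# Hypothetical toy labels: 6 training instances in class 0, 4 in class 1
y_toy = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])

# counts holds nk for each class; counts.sum() is n
classes, counts = np.unique(y_toy, return_counts=True)
priors = counts / counts.sum()

for k, pi_k in zip(classes, priors):
    print(f"PI{k} = {pi_k:.1f}")
# PI0 = 0.6
# PI1 = 0.4
```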

The fk(x) above is the estimated probability density of x within class k. LDA uses a Gaussian distribution for fk(x), with a separate mean muk for each class and a variance sigma^2 shared by all classes. Plugging the Gaussian into the equation above, taking the logarithm and dropping the terms that are the same for every class gives the equation below for a single input variable. This is called the discriminant function, and the class with the largest value is the output classification (y):

<b>Dk(x) = x * (muk/sigma^2) - (muk^2/(2*sigma^2)) + ln(PIk)</b>

Dk(x) is the discriminant function for class k given input x; muk, sigma^2 and PIk are all estimated from your data.  
![LDA.jpg](attachment:LDA.jpg)
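
The discriminant function can be sketched for a single input variable as follows. The class means, shared variance, priors and the input `x_new` are hypothetical values chosen for illustration. The sketch also computes the Bayes posterior directly, to show that both routes pick the same class.

```python
import numpy as np

def discriminant_scores(x, means, variance, priors):
    """Dk(x) = x * (muk / sigma^2) - muk^2 / (2 * sigma^2) + ln(PIk), for every class k."""
    means = np.asarray(means, dtype=float)
    priors = np.asarray(priors, dtype=float)
    return x * means / variance - means**2 / (2 * variance) + np.log(priors)

def posteriors(x, means, variance, priors):
    """P(Y=k|X=x) from Bayes' Theorem with Gaussian fk(x) sharing one variance."""
    means = np.asarray(means, dtype=float)
    priors = np.asarray(priors, dtype=float)
    densities = np.exp(-(x - means) ** 2 / (2 * variance)) / np.sqrt(2 * np.pi * variance)
    weighted = priors * densities
    return weighted / weighted.sum()

# Hypothetical parameters for a two-class problem
means = [1.0, 3.0]     # mu0, mu1
variance = 1.0         # shared sigma^2
priors = [0.5, 0.5]    # PI0, PI1
x_new = 1.5

scores = discriminant_scores(x_new, means, variance, priors)
probs = posteriors(x_new, means, variance, priors)
predicted = int(np.argmax(scores))

print("discriminant scores:", scores)
print("posterior probabilities:", probs)
print("predicted class:", predicted)
```

Because the discriminant function is the log of the posterior's numerator with class-independent terms removed, the difference between two scores equals the log of the ratio of the corresponding posteriors.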


### An implementation of LDA is given below
### >>First, import the required libraries

In [20]:
import pandas as pd
# from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn import metrics


In [27]:
df = pd.read_csv('heart_stalog.csv')


### >>This CSV file contains string values, as the output of print(df.head()) shows below

![image.png](123.png)

### The class column contains string values. First convert them to integer labels with a label encoder, as shown below

In [28]:
label_encoder = preprocessing.LabelEncoder()
df['class'] = label_encoder.fit_transform(df['class'])
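
To illustrate what the encoder does, here is a small self-contained sketch. The label strings are hypothetical; the actual values in `heart_stalog.csv` may differ.

```python
from sklearn import preprocessing

# Hypothetical label values, used only for illustration
labels = ["present", "absent", "absent", "present"]

encoder = preprocessing.LabelEncoder()
encoded = encoder.fit_transform(labels)

# Classes are sorted, so "absent" maps to 0 and "present" maps to 1
print(encoder.classes_.tolist())   # ['absent', 'present']
print(encoded.tolist())            # [1, 0, 0, 1]

# inverse_transform maps the integers back to the original strings
print(encoder.inverse_transform([0, 1]).tolist())   # ['absent', 'present']
```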

In [22]:
print(df.head())

    age  sex  chest  resting_blood_pressure  serum_cholestoral  \
0  70.0  1.0    4.0                   130.0              322.0   
1  67.0  0.0    3.0                   115.0              564.0   
2  57.0  1.0    2.0                   124.0              261.0   
3  64.0  1.0    4.0                   128.0              263.0   
4  74.0  0.0    2.0                   120.0              269.0   

   fasting_blood_sugar  resting_electrocardiographic_results  \
0                  0.0                                   2.0   
1                  0.0                                   2.0   
2                  0.0                                   0.0   
3                  0.0                                   0.0   
4                  0.0                                   2.0   

   maximum_heart_rate_achieved  exercise_induced_angina  oldpeak  slope  \
0                        109.0                      0.0      2.4    2.0   
1                        160.0                      0.0      1.6    

In [30]:
# Write the encoded data to a new CSV file, without the index column
df.to_csv("new.csv", index=False)

### >>After saving the new CSV file, split the data into feature data and target data

In [31]:
x = df.iloc[:, 3:13].values   # feature data: columns 3 to 12 (the first three columns, age, sex and chest, are not used)
y = df.iloc[:, 13].values     # target data (column 13)
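
Positional slices depend on the column order of the file. As an alternative sketch, the features and target can be selected by column name. The miniature DataFrame `toy` below is a hypothetical stand-in with only a few columns, and using every non-target column as a feature is an assumption, not necessarily what this notebook intends.

```python
import pandas as pd

# Hypothetical miniature frame with the same kind of layout: features plus a "class" column
toy = pd.DataFrame({
    "age": [70.0, 67.0, 57.0],
    "sex": [1.0, 0.0, 1.0],
    "oldpeak": [2.4, 1.6, 0.3],
    "class": [1, 0, 1],
})

x_all = toy.drop(columns="class").values   # every column except the target
y_all = toy["class"].values                # the target column, selected by name

print(x_all.shape, y_all.shape)   # (3, 3) (3,)
```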


## >>Split the data into training and test sets

In [32]:
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2)
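
The split above is random, so the predictions and accuracy change from run to run. A minimal sketch of a reproducible, class-balanced split is shown below. The arrays `x_demo` and `y_demo` are hypothetical stand-ins, and the use of `random_state` and `stratify` is a suggestion rather than part of the original notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 20 samples, 2 features, balanced binary labels
x_demo = np.arange(40, dtype=float).reshape(20, 2)
y_demo = np.array([0, 1] * 10)

# random_state fixes the shuffle; stratify keeps the class ratio in both parts
x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=0.2, random_state=0, stratify=y_demo
)

print(x_tr.shape, x_te.shape)   # (16, 2) (4, 2)
```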

## >>Fit the LDA model and get predictions, as given below

In [33]:
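# n_components only affects transform() and cannot exceed min(n_features, n_classes - 1),
# which is 1 for two classes, so the default is used here instead of n_components=3
output = LDA()
output.fit(x_train, y_train)
y_pred = output.predict(x_test)
print(y_pred)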

[0 0 1 1 1 0 0 1 0 0 0 1 1 0 0 1 1 1 1 0 1 1 1 0 1 1 1 0 1 0 0 0 1 0 0 0 0
 1 1 1 0 1 1 1 1 1 1 1 1 0 0 0 0 0]
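
The predicted labels come from the class probabilities described at the top of this notebook. The sketch below shows how to inspect the estimated priors and the per-class probabilities with scikit-learn. It uses synthetic data from `make_classification` as a stand-in, since `heart_stalog.csv` is not bundled here.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

# Synthetic two-class data, used only as a stand-in for the heart data
x_syn, y_syn = make_classification(
    n_samples=200, n_features=10, n_informative=5, n_redundant=0, random_state=0
)

model = LDA()
model.fit(x_syn, y_syn)

print("estimated priors (PIk):", model.priors_)

# One row per sample, one column per class; each row sums to 1
proba = model.predict_proba(x_syn[:5])
pred = model.predict(x_syn[:5])
print(proba)
print(pred)
```

For each row, the predicted label is the class with the highest probability.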


## >>Compute the accuracy of the model

In [34]:
from sklearn import metrics

print("accuracy:", metrics.accuracy_score(y_test,y_pred))

accuracy: 0.8703703703703703
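
Accuracy alone does not show which class the errors fall in. As a small sketch, a confusion matrix breaks the predictions down by true and predicted class. The label lists below are hypothetical and are not the notebook's `y_test` and `y_pred`.

```python
from sklearn import metrics

# Hypothetical true and predicted labels for six samples
y_true_demo = [0, 0, 1, 1, 1, 0]
y_pred_demo = [0, 1, 1, 1, 0, 0]

# Rows are true classes, columns are predicted classes
cm = metrics.confusion_matrix(y_true_demo, y_pred_demo)
print(cm.tolist())   # [[2, 1], [1, 2]]

acc = metrics.accuracy_score(y_true_demo, y_pred_demo)
print("accuracy:", acc)
```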


## Research Infinite Solutions LLP

by [Research Infinite Solutions](http://www.researchinfinitesolutions.com/)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.