
| Step                                              | Explanation                                                                                                                     |
|---------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------|
| 1. Applying LDA                                   | Linear Discriminant Analysis (LDA) is a supervised dimensionality reduction technique that projects the features onto a lower-dimensional space while maximizing class separability. It finds the linear combinations of features that best discriminate between the classes. In this step we request 2 linear discriminants (LD1 and LD2). scikit-learn allows at most min(n_classes - 1, n_features) components, so this setting requires a dataset with at least 3 classes. LDA is fitted on the training set only, and the fitted transform is then applied to the test set. |
| 2. Training the Logistic Regression model        | After LDA has transformed the training set, we train a Logistic Regression model on the reduced features (LD1 and LD2). Logistic Regression is a widely used classifier for binary and multiclass problems. It learns linear decision boundaries that separate the classes in the input feature space. |
| 3. Making the Confusion Matrix                    | The Confusion Matrix is a table used to evaluate a classification model on the test set. Each row corresponds to an actual class and each column to a predicted class, so diagonal entries count correct predictions and off-diagonal entries count misclassifications. In the binary case these counts are the true positives, true negatives, false positives, and false negatives. |
| 4. Visualising the Test set results (Decision boundary) | To visualize the decision boundary of the Logistic Regression model, we create a grid of points spanning the range of LD1 and LD2 in the transformed feature space. We then apply the trained classifier to each point on the grid to predict the class and create a contour plot to visualize the decision boundary. This allows us to observe how well the model separates the data points into their respective classes. |
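
The steps above assume that `X_train`, `X_test`, `y_train`, and `y_test` already exist. The notebook does not say which dataset it uses, so the following is only a minimal sketch of one way to prepare them. The Wine dataset, the 80/20 split, and the `StandardScaler` step are all assumptions chosen for illustration. The sketch also checks how many linear discriminants LDA can produce for that data.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: the Wine dataset (13 features, 3 classes) stands in for the real data.
X, y = load_wine(return_X_y=True)

# Hold out 20% of the samples for testing; stratify keeps class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Assumption: features are standardized. Fit the scaler on the training set only,
# then reuse the fitted scaler on the test set.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# LDA can produce at most min(n_classes - 1, n_features) discriminants.
n_classes = len(np.unique(y_train))
max_components = min(n_classes - 1, X_train.shape[1])
print("Maximum number of linear discriminants:", max_components)
```

With three classes the bound is 2, which matches the `n_components=2` setting used for LD1 and LD2.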


In [None]:
# Assuming you have already split your dataset into X_train, X_test, y_train, and y_test

# Applying LDA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
lda = LDA(n_components=2)
X_train = lda.fit_transform(X_train, y_train)
X_test = lda.transform(X_test)

# Training the Logistic Regression model on the Training set
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state=0)
classifier.fit(X_train, y_train)

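# Making the Confusion Matrix
# Predict on the test set and compare the predictions with the true labels
from sklearn.metrics import confusion_matrix
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)  # rows: actual classes, columns: predicted classes
print(cm)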

# Visualising the Test set results
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
X_set, y_set = X_test, y_test
X1, X2 = np.meshgrid(np.arange(start=X_set[:, 0].min() - 1, stop=X_set[:, 0].max() + 1, step=0.01),
                     np.arange(start=X_set[:, 1].min() - 1, stop=X_set[:, 1].max() + 1, step=0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha=0.75, cmap=ListedColormap(('red', 'green', 'blue')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
# Plot the test points, one colour per class (the three colours assume three classes)
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                color=ListedColormap(('red', 'green', 'blue'))(i), label=j)
plt.title('Logistic Regression (Test set)')
plt.xlabel('LD1')
plt.ylabel('LD2')
plt.legend()
plt.show()
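
The `plt.contourf` call above packs the grid construction, prediction, and reshaping into one expression. The sketch below unpacks those steps without plotting, so the array shapes can be inspected. It is illustrative only: the Wine dataset is an assumption, the model is fitted on all samples for brevity (no train/test split), and a coarse step of 0.5 replaces the 0.01 used above to keep the arrays small.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression

# Assumption: the Wine dataset stands in for the notebook's data.
X, y = load_wine(return_X_y=True)
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)
clf = LogisticRegression(random_state=0).fit(X_lda, y)

# Build a grid covering the LD1/LD2 range with a one-unit margin.
ld1 = np.arange(X_lda[:, 0].min() - 1, X_lda[:, 0].max() + 1, 0.5)
ld2 = np.arange(X_lda[:, 1].min() - 1, X_lda[:, 1].max() + 1, 0.5)
X1, X2 = np.meshgrid(ld1, ld2)

# Flatten the grid into an (n_points, 2) array, one row per grid point.
grid_points = np.column_stack([X1.ravel(), X2.ravel()])

# Predict a class for every grid point, then restore the grid shape
# so the result could be passed to plt.contourf(X1, X2, Z).
Z = clf.predict(grid_points).reshape(X1.shape)

print("Grid shape:", X1.shape)
print("Classes predicted on the grid:", np.unique(Z))
```

A smaller step gives a smoother boundary in the plot, at the cost of more grid points to classify.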

