## What Are Discriminant Functions?

Discriminant functions and decision rules are fundamental concepts in the context of classification problems. Let's break down each term:

### Discriminant Function:
A discriminant function is a mathematical function that takes input features and maps them to a decision or classification. The primary purpose of the discriminant function is to discriminate between different classes or categories. The form of the discriminant function depends on the specific classification algorithm being used. Here are a few examples:

1. **Linear Discriminant Function (LDF):** In linear discriminant analysis (LDA), the discriminant function is a linear combination of the input features. It can be represented as $ Y(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + b $, where $\mathbf{w}$ is a weight vector, $\mathbf{x}$ is the input feature vector, and $b$ is a bias term.

2. **Quadratic Discriminant Function (QDF):** In quadratic discriminant analysis (QDA), the discriminant function involves quadratic terms in addition to linear terms, providing more flexibility.

3. **Support Vector Machines (SVM):** SVMs use a discriminant function based on a hyperplane that maximally separates different classes in the feature space. The decision boundary is determined by support vectors.

4. **Logistic Regression:** Logistic regression applies the logistic (sigmoid) function to a linear discriminant $\mathbf{w}^T \mathbf{x} + b$, turning it into an estimate of the probability that the input belongs to a particular class.
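
As a minimal sketch of the linear and logistic forms listed above, the snippet below evaluates $Y(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + b$ and passes the result through a sigmoid. The weights, bias, and input point are hypothetical values chosen only for illustration; they are not fitted to any data.

```python
import numpy as np

# Hypothetical weights and bias, chosen only for illustration
w = np.array([2.0, -1.0])
b = 0.5

def linear_discriminant(x, w, b):
    """Return Y(x) = w^T x + b for one point (or for each row of x)."""
    return np.asarray(x) @ w + b

def sigmoid(z):
    """Map a discriminant value to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 2.0])
y = linear_discriminant(x, w, b)  # 2*1 - 1*2 + 0.5 = 0.5
p = sigmoid(y)                    # logistic-regression style probability
print(y, p)
```

A positive discriminant value maps to a probability above 0.5 and a negative one to a probability below 0.5, so thresholding $Y(\mathbf{x})$ at 0 and thresholding the sigmoid output at 0.5 give the same class.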

### Decision Rule:
The decision rule is a criterion or condition based on the output of the discriminant function that determines the final class assignment. It's the rule that specifies how to make decisions or predictions based on the computed discriminant values. The decision rule typically involves comparing the output of the discriminant function to a threshold or using some criteria to assign the input to a specific class.

For example, in a binary classification problem:

- If $ Y(\mathbf{x}) > \text{Threshold} $, assign the input to Class 1.
- If $ Y(\mathbf{x}) \leq \text{Threshold} $, assign the input to Class 2.

The choice of the decision rule can impact the performance of the classification model, and it may be adjusted based on the specific requirements of the application.
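
The threshold rule above can be sketched in a few lines. The scores below are hypothetical discriminant values, and `decide` is an illustrative helper name, not part of any library.

```python
import numpy as np

def decide(scores, threshold=0.0):
    """Class 1 where the score exceeds the threshold, Class 2 otherwise."""
    return np.where(np.asarray(scores) > threshold, 1, 2)

# Hypothetical discriminant values for five inputs
scores = np.array([-1.2, -0.1, 0.0, 0.4, 2.5])

print(decide(scores, threshold=0.0))   # [2 2 2 1 1]
print(decide(scores, threshold=-0.5))  # [2 1 1 1 1]
```

Lowering the threshold moves two borderline inputs from Class 2 to Class 1, which is how a decision rule can be tuned to trade one kind of error for the other.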

In summary, discriminant functions map input features to decision values, and decision rules determine how those values are translated into class assignments. These concepts are foundational in pattern recognition, machine learning, and statistics.

In the context of Gaussian distribution and discriminant functions, there are three common cases that are used in pattern recognition and statistical classification:

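1. **Single-Class (Single Gaussian) Discriminant Function ($C_1 = C_2 = C$):**
   - **Assumption:** All classes share one covariance matrix, $\Sigma_1 = \Sigma_2 = \ldots = \Sigma_k = \Sigma$; only the class means $\mu_i$ differ.
   - **Discriminant Function:** The quadratic term $\mathbf{x}^T \Sigma^{-1} \mathbf{x}$ is identical for every class and cancels, leaving a function that is linear in $\mathbf{x}$: $g_i(\mathbf{x}) = \mu_i^T \Sigma^{-1} \mathbf{x} - \tfrac{1}{2} \mu_i^T \Sigma^{-1} \mu_i + \ln \pi_i$, where $\pi_i$ is the prior probability of class $i$.
   - **Considerations:**
     - The classes are assumed to have the same spread and orientation and to differ only in location.
     - The decision boundaries are hyperplanes.
   - **Decision Rule:** Assign the input to the class with the largest $g_i(\mathbf{x})$, which is the class with the highest posterior probability.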

2. **Diagonal Covariance Matrix Discriminant Function ($C_i = \text{diag}(\sigma_{i1}^2, \ldots, \sigma_{ip}^2)$):**
   - **Assumption:** Each class has its own diagonal covariance matrix (variances along the coordinate axes), $\Sigma_i = \text{diag}(\sigma_{i1}^2, \sigma_{i2}^2, \ldots, \sigma_{ip}^2)$.
   - **Discriminant Function:** With no off-diagonal terms, the Gaussian log-density splits into a sum over the $p$ features: $g_i(\mathbf{x}) = -\tfrac{1}{2} \sum_{j=1}^{p} \left[ \frac{(x_j - \mu_{ij})^2}{\sigma_{ij}^2} + \ln \sigma_{ij}^2 \right] + \ln \pi_i$, where $\pi_i$ is the prior probability of class $i$.
   - **Considerations:**
     - The variances of the features (dimensions) may differ from class to class.
     - The features are treated as uncorrelated within each class.
   - **Decision Rule:** As in the single-class case, assign the input to the class with the largest $g_i(\mathbf{x})$, which is the class with the highest posterior probability.

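3. **General Covariance Matrix Discriminant Function ($C_1 \neq C_2$):**
   - **Assumption:** Each class has its own general (non-diagonal) covariance matrix $\Sigma_i$.
   - **Discriminant Function:** The quadratic term no longer cancels between classes, so the discriminant stays quadratic in $\mathbf{x}$: $g_i(\mathbf{x}) = -\tfrac{1}{2} (\mathbf{x} - \mu_i)^T \Sigma_i^{-1} (\mathbf{x} - \mu_i) - \tfrac{1}{2} \ln |\Sigma_i| + \ln \pi_i$, where $\pi_i$ is the prior probability of class $i$.
   - **Considerations:**
     - The variances of the features may differ between classes.
     - The correlations between different features are taken into account.
     - Each class can have its own shape and orientation, so the decision boundaries are quadratic curves or surfaces.
   - **Decision Rule:** Assign the input to the class with the largest $g_i(\mathbf{x})$. The first term is the squared Mahalanobis distance to the class mean, which accounts for correlations between features. Choosing the smallest Mahalanobis distance gives the same answer only when the $\ln |\Sigma_i|$ and prior terms are equal across classes.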

In all cases, the discriminant functions are derived from the assumption that the data in each class follows a multivariate Gaussian (normal) distribution. The shared-covariance case corresponds to Linear Discriminant Analysis (LDA), the general case to Quadratic Discriminant Analysis (QDA), and the per-class diagonal case to Gaussian naive Bayes.

The choice between these cases depends on what can reasonably be assumed about the data. The single-class case is the simplest and has the fewest parameters to estimate, since one covariance matrix serves all classes. The diagonal and general cases model the shape of each class separately at the cost of more parameters, and the general case in particular needs enough samples per class to estimate a full covariance matrix reliably.
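
To make the three cases concrete, here is a minimal sketch that evaluates the Gaussian log-discriminant $-\tfrac{1}{2}(\mathbf{x}-\mu)^T \Sigma^{-1} (\mathbf{x}-\mu) - \tfrac{1}{2}\ln|\Sigma| + \ln(\text{prior})$ under each covariance assumption. The class parameters and the query point are hypothetical, the function name `gaussian_log_discriminant` is illustrative, and equal priors of 0.5 are assumed.

```python
import numpy as np

def gaussian_log_discriminant(x, mean, cov, prior=0.5):
    """-0.5 (x-mu)^T cov^{-1} (x-mu) - 0.5 ln|cov| + ln(prior)."""
    diff = np.asarray(x, dtype=float) - mean
    maha_sq = diff @ np.linalg.solve(cov, diff)  # squared Mahalanobis distance
    _, logdet = np.linalg.slogdet(cov)           # numerically stable ln|cov|
    return -0.5 * maha_sq - 0.5 * logdet + np.log(prior)

# Hypothetical class parameters, for illustration only
mu1, mu2 = np.array([1.0, 2.0]), np.array([4.0, 3.0])
cov1 = np.array([[1.0, 0.5], [0.5, 1.0]])
cov2 = np.array([[1.0, -0.8], [-0.8, 1.0]])
shared = 0.5 * (cov1 + cov2)

# One (class 1, class 2) covariance pair per assumption
cases = {
    "shared": (shared, shared),
    "diagonal": (np.diag(np.diag(cov1)), np.diag(np.diag(cov2))),
    "general": (cov1, cov2),
}

x = np.array([2.0, 2.5])
for name, (c1, c2) in cases.items():
    g1 = gaussian_log_discriminant(x, mu1, c1)
    g2 = gaussian_log_discriminant(x, mu2, c2)
    print(name, "-> class", 1 if g1 > g2 else 2)
```

The same function covers all three cases; only the covariance matrix passed in changes, which is why the choice of assumption is really a choice of how to estimate $\Sigma_i$.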

## Random Data Point Generation

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

# Set seed for reproducibility
np.random.seed(42)

# Generate random data for two classes
num_samples = 100
mean_class1 = [1, 2]
mean_class2 = [4, 3]
covariance_class1 = [[1, 0.5], [0.5, 1]]
covariance_class2 = [[1, -0.8], [-0.8, 1]]

class1_data = np.random.multivariate_normal(mean_class1, covariance_class1, num_samples)
class2_data = np.random.multivariate_normal(mean_class2, covariance_class2, num_samples)

# Combine the data
all_data = np.vstack([class1_data, class2_data])
labels = np.hstack([np.zeros(num_samples), np.ones(num_samples)])

# Plot the generated data
plt.scatter(class1_data[:, 0], class1_data[:, 1], label='Class 1', marker='o')
plt.scatter(class2_data[:, 0], class2_data[:, 1], label='Class 2', marker='x')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.title('Generated 2D Random Dataset')
plt.show()


## Case 1 : Single-Class (Single Gaussian) Discriminant Function:($C_1=C_2=C$)

In [None]:
# Estimate each class mean and a single covariance matrix shared by both classes
mean_class1 = np.mean(class1_data, axis=0)
mean_class2 = np.mean(class2_data, axis=0)
# Pooled within-class covariance; a plain average because the classes have equal size
covariance_all = 0.5 * (np.cov(class1_data, rowvar=False) + np.cov(class2_data, rowvar=False))
print(mean_class1, mean_class2, covariance_all)

In [None]:
# Create one Gaussian per class; both use the shared covariance matrix
distribution_class1 = multivariate_normal(mean_class1, covariance_all)
distribution_class2 = multivariate_normal(mean_class2, covariance_all)

In [None]:
# Discriminant function for single-class case: the two class-conditional densities
def single_class_discriminant_function(x):
    return distribution_class1.pdf(x), distribution_class2.pdf(x)

In [None]:
# Evaluate the discriminant function for a new data point
new_point = np.array([5, 5])
pdf_class1, pdf_class2 = single_class_discriminant_function(new_point)

print("Discriminant Function Values for the New Data Point:", pdf_class1, pdf_class2)

In [None]:
# Decision Rule: with equal priors, the class with the larger density has the highest posterior probability
predicted_class = 1 if pdf_class1 > pdf_class2 else 2
print("Predicted Class for the New Data Point:", predicted_class)

In [None]:
# Plot the generated data
plt.scatter(class1_data[:, 0], class1_data[:, 1], label='Class 1', marker='o')
plt.scatter(class2_data[:, 0], class2_data[:, 1], label='Class 2', marker='x')
plt.scatter(new_point[0], new_point[1], label='New Point', marker='^')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.title('Generated 2D Random Dataset')
plt.show()
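
When both classes share one covariance matrix, the difference of the two Gaussian discriminants reduces to a linear function $\mathbf{w}^T \mathbf{x} + b$. The sketch below illustrates this on freshly generated data; the seed, sample size, shared covariance, and the helper name `predict` are assumptions made for this example and are separate from the notebook's dataset.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch

# Hypothetical data: two classes drawn with the same covariance matrix
cov = np.array([[1.0, 0.3], [0.3, 1.0]])
X1 = rng.multivariate_normal([1.0, 2.0], cov, size=200)
X2 = rng.multivariate_normal([4.0, 3.0], cov, size=200)

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
# Pooled within-class covariance (equal class sizes, so a plain average)
pooled = 0.5 * (np.cov(X1, rowvar=False) + np.cov(X2, rowvar=False))

# With equal priors, g1(x) - g2(x) = w^T x + b
w = np.linalg.solve(pooled, m1 - m2)
b = -0.5 * (m1 + m2) @ w

def predict(x):
    """Return 1 on the class-1 side of the hyperplane, else 2."""
    return 1 if x @ w + b > 0 else 2

print(predict(m1), predict(m2))  # 1 2
```

The midpoint between the two class means lies exactly on the boundary, since $\mathbf{w}^T \tfrac{1}{2}(m_1 + m_2) + b = 0$ by construction.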

## Case 2 : Diagonal Covariance Matrix Discriminant Function ($C_i = \text{diag}(\sigma_{i1}^2, \ldots, \sigma_{ip}^2)$)

In [None]:
# Split the combined data back into the two classes (first 100 rows, last 100 rows)
class1_data = all_data[:100, :]
class2_data = all_data[100:, :]

class1_data.shape, class2_data.shape

In [None]:
# Calculate the mean and the per-feature variances (diagonal covariance) for each class
mean_class1 = np.mean(class1_data, axis=0)
mean_class2 = np.mean(class2_data, axis=0)
covariance_class1 = np.diag(np.var(class1_data, axis=0))
covariance_class2 = np.diag(np.var(class2_data, axis=0))

print(mean_class1, covariance_class1)
print(mean_class2, covariance_class2)

In [None]:
# Create multivariate normal distributions for each class, each with its own diagonal covariance
distribution_class1 = multivariate_normal(mean_class1, covariance_class1)
distribution_class2 = multivariate_normal(mean_class2, covariance_class2)

In [None]:
# Discriminant function for diagonal covariance matrix case
def diagonal_covariance_discriminant_function(x):
    return distribution_class1.pdf(x), distribution_class2.pdf(x)

In [None]:
# Decision Rule: with equal priors, the class with the larger density has the highest posterior probability
def predict_class(x):
    pdf_class1, pdf_class2 = diagonal_covariance_discriminant_function(x)
    if pdf_class1 > pdf_class2:
        return 1, pdf_class1
    else:
        return 2, pdf_class2

In [None]:
# Predict the class for a new data point
new_point = np.array([1, 1])
predicted_class, predicted_density = predict_class(new_point)
print("Predicted Class for the New Data Point:", predicted_class)
print("Class-conditional density at the new point:", predicted_density)

In [None]:
# Plot the generated data
plt.scatter(class1_data[:, 0], class1_data[:, 1], label='Class 1', marker='o')
plt.scatter(class2_data[:, 0], class2_data[:, 1], label='Class 2', marker='x')
plt.scatter(new_point[0], new_point[1], label='New Point', marker='^')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.title('Generated 2D Random Dataset')
plt.show()
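
A diagonal covariance matrix means the joint Gaussian density factorizes into independent one-dimensional Gaussians, so its log-density is a plain sum over features. The following sketch checks that numerically against the general multivariate formula. The mean, variances, query point, and both function names are hypothetical and used only for illustration.

```python
import numpy as np

def diag_gaussian_logpdf(x, mean, variances):
    """Log density under a diagonal covariance: a sum of 1-D Gaussian terms."""
    return np.sum(-0.5 * np.log(2 * np.pi * variances)
                  - 0.5 * (x - mean) ** 2 / variances)

def full_gaussian_logpdf(x, mean, cov):
    """Log density under an arbitrary covariance matrix."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha_sq = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet + maha_sq)

mean = np.array([1.0, 2.0])       # hypothetical class mean
variances = np.array([0.8, 1.5])  # hypothetical per-feature variances
x = np.array([1.5, 1.0])

per_feature = diag_gaussian_logpdf(x, mean, variances)
joint = full_gaussian_logpdf(x, mean, np.diag(variances))
print(per_feature, joint)  # the two values agree up to rounding
```

Because no matrix inverse is needed in the diagonal form, this case stays cheap even with many features, at the price of ignoring correlations.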

## Case 3 : General Covariance Matrix Discriminant Function:($C_1 \neq C_2$)

In [None]:
# Split the combined data back into the two classes (first 100 rows, last 100 rows)
class1_data = all_data[:100, :]
class2_data = all_data[100:, :]

class1_data.shape, class2_data.shape

((100, 2), (100, 2))

In [None]:
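# Calculate mean and covariance for each class separately
# Note: np.diag(np.var(...)) keeps only the per-feature variances, so these matrices are diagonal;
# np.cov(class1_data, rowvar=False) would also estimate the off-diagonal (correlation) terms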
mean_class1 = np.mean(class1_data, axis=0)
mean_class2 = np.mean(class2_data, axis=0)
covariance_class1 = np.diag(np.var(class1_data, axis=0))
covariance_class2 = np.diag(np.var(class2_data, axis=0))

print(mean_class1, covariance_class1)
print(mean_class2, covariance_class2)

[1.08307042 2.11709274] [[0.8150929  0.        ]
 [0.         0.76768711]]
[3.89208458 3.13541943] [[1.09769667 0.        ]
 [0.         1.00118854]]


In [None]:
# Create multivariate normal distributions for each class
distribution_class1 = multivariate_normal(mean_class1, covariance_class1)
distribution_class2 = multivariate_normal(mean_class2, covariance_class2)

# Discriminant function for general covariance matrix case
def general_covariance_discriminant_function(x):
    return distribution_class1.pdf(x), distribution_class2.pdf(x)

# Decision Rule: assign the input to the class with the larger class-conditional density
# (equal priors assumed); returns label 0 for Class 1 and label 1 for Class 2
def predict_class_case3(x):
    pdf_class1, pdf_class2 = general_covariance_discriminant_function(x)
    return 0 if pdf_class1 > pdf_class2 else 1


In [None]:
# Predict the class for a new data point
new_point = np.array([3, 3.5])
predicted_class = predict_class_case3(new_point)

print("Predicted Class for the New Data Point:", predicted_class)

Predicted Class for the New Data Point: 1


In [None]:
# Plot the generated data
plt.scatter(class1_data[:, 0], class1_data[:, 1], label='Class 1', marker='o')
plt.scatter(class2_data[:, 0], class2_data[:, 1], label='Class 2', marker='x')
plt.scatter(new_point[0], new_point[1], label='New Point', marker='^')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.title('Generated 2D Random Dataset')
plt.show()
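
For the general case, a full covariance estimate comes from `np.cov(..., rowvar=False)`, and the decision compares the full quadratic discriminant, not the Mahalanobis distance alone. The sketch below is one way to set this up. It redraws data with NumPy's `default_rng` using the generating parameters from the data generation cell, so its samples differ from the notebook's, and the helper names (`fit`, `mahalanobis_sq`, `log_discriminant`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)  # a different generator than np.random.seed(42)
X1 = rng.multivariate_normal([1, 2], [[1, 0.5], [0.5, 1]], size=100)
X2 = rng.multivariate_normal([4, 3], [[1, -0.8], [-0.8, 1]], size=100)

def fit(X):
    """Sample mean and full (non-diagonal) covariance of one class."""
    return X.mean(axis=0), np.cov(X, rowvar=False)

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance from x to the class mean."""
    diff = x - mean
    return diff @ np.linalg.solve(cov, diff)

def log_discriminant(x, mean, cov):
    """Quadratic discriminant; equal priors assumed, so the prior term is omitted."""
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * mahalanobis_sq(x, mean, cov) - 0.5 * logdet

params = [fit(X1), fit(X2)]
x = np.array([3.0, 3.5])

# Labels follow the notebook's convention: 0 for Class 1, 1 for Class 2
by_distance = int(np.argmin([mahalanobis_sq(x, m, c) for m, c in params]))
by_discriminant = int(np.argmax([log_discriminant(x, m, c) for m, c in params]))
print("Mahalanobis only:", by_distance, "| full discriminant:", by_discriminant)
```

The two rules can disagree for points near the boundary, because the $\ln|\Sigma_i|$ term penalizes the class with the larger covariance determinant.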

## Testing Dataset with Case 3

In [None]:
# Generate a small test set; Class 2 is centered at [6, 7] here, not at the training mean [4, 3]
num_samples = 20
mean_class1 = [1, 2]
mean_class2 = [6, 7]
covariance_class1 = [[1, 0.5], [0.5, 1]]
covariance_class2 = [[1, -0.8], [-0.8, 1]]

class1_data = np.random.multivariate_normal(mean_class1, covariance_class1, num_samples)
class2_data = np.random.multivariate_normal(mean_class2, covariance_class2, num_samples)

# Combine the data
test_all_data = np.vstack([class1_data, class2_data])
test_labels = np.hstack([np.zeros(num_samples), np.ones(num_samples)])

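# Plot the generated test data (first num_samples rows are Class 1, the rest are Class 2)
plt.scatter(test_all_data[:num_samples, 0], test_all_data[:num_samples, 1], label='Class 1', marker='o')
plt.scatter(test_all_data[num_samples:, 0], test_all_data[num_samples:, 1], label='Class 2', marker='x')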

plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.title('Generated 2D Random Test Dataset')
plt.show()

In [None]:
# Predict a label (0 or 1) for every test point
pred_label = [predict_class_case3(d) for d in test_all_data]

In [None]:
from sklearn import metrics

In [None]:
acc=metrics.accuracy_score(test_labels, pred_label)
acc

0.95

In [None]:
f1=metrics.f1_score(test_labels, pred_label)
f1

0.9523809523809523
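
Both metrics can be reproduced by hand from the confusion counts, which helps when reading the scores above. The sketch below uses a small set of hypothetical labels, and `binary_metrics` is an illustrative helper, not a scikit-learn function.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy and F1 for labels in {0, 1}, treating 1 as the positive class."""
    y_true = np.asarray(y_true).astype(int)
    y_pred = np.asarray(y_pred).astype(int)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    accuracy = (tp + tn) / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1

# Hypothetical labels, for illustration only
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 1, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))  # both values equal 2/3
```

By the same formulas, an accuracy of 0.95 on 40 test points together with an F1 of 20/21 ≈ 0.952 corresponds to 20 true positives, 2 false positives, and no false negatives.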