<img src="../../../images/banners/ml-algorithms.jpg" width="600"/>

<a class="anchor" id="intro_to_data_structures"></a>
# <img src="../../../images/logos/ml-logo.png" width="23"/> Classification

## <img src="../../../images/logos/toc.png" width="20"/> Table of Contents
* [Logistic Regression](#)
* [Generative vs Discriminative Setting](#)

---

## Logistic Regression

In [1]:
# Imports for the examples below
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

In [2]:
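# Load and preprocess the dataset
data = load_breast_cancer()
X = data.data
y = data.target
# Standardize the features first, then prepend a column of ones as the bias term.
# (Scaling after adding the ones would turn that constant column into zeros.)
X = StandardScaler().fit_transform(X)
X = np.hstack((np.ones((len(y), 1)), X))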

In [3]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

### Building a Logistic Regression Model in Python

In [4]:
class LogisticRegression:
    def __init__(self, learning_rate=0.01, num_iterations=1000):
        self.learning_rate = learning_rate
        self.num_iterations = num_iterations
        self.weights = None

    def fit(self, X, y):
        num_samples, num_features = X.shape

        # Initialize all weights to zero; there is no separate bias term,
        # so X must carry a constant column if an intercept is needed
        self.weights = np.zeros(num_features)

        # Gradient descent
        for _ in range(self.num_iterations):
            linear_model = np.dot(X, self.weights)
            y_predicted = self.sigmoid(linear_model)

            # Gradient of the mean cross-entropy loss with respect to the weights
            dw = (1 / num_samples) * np.dot(X.T, (y_predicted - y))

            self.weights -= self.learning_rate * dw

    def predict(self, X):
        linear_model = np.dot(X, self.weights)
        y_predicted = self.sigmoid(linear_model)
        # Threshold the predicted probabilities at 0.5 to get class labels
        y_predicted_cls = (y_predicted > 0.5).astype(int)
        return y_predicted_cls

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))
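
The update in `fit` can be read as gradient descent on the mean cross-entropy loss, whose gradient with respect to the weights is `X.T @ (sigmoid(X @ w) - y) / n`. As a minimal sketch, the snippet below checks that expression against a finite-difference approximation on random data. The helper names (`mean_cross_entropy`, `analytic_gradient`, `numerical_gradient`) and the toy arrays are assumptions made for this illustration and are not part of the class above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mean_cross_entropy(w, X, y):
    # Average negative log-likelihood of the labels under the model
    p = sigmoid(X @ w)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def analytic_gradient(w, X, y):
    # Same expression as `dw` in the fit method
    return X.T @ (sigmoid(X @ w) - y) / len(y)

def numerical_gradient(w, X, y, eps=1e-6):
    # Central finite differences, one coordinate at a time
    grad = np.zeros_like(w)
    for j in range(len(w)):
        step = np.zeros_like(w)
        step[j] = eps
        loss_plus = mean_cross_entropy(w + step, X, y)
        loss_minus = mean_cross_entropy(w - step, X, y)
        grad[j] = (loss_plus - loss_minus) / (2 * eps)
    return grad

# Hypothetical toy data, used only for the check
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = rng.integers(0, 2, size=50).astype(float)
w_demo = rng.normal(size=3)

max_gap = np.max(np.abs(
    analytic_gradient(w_demo, X_demo, y_demo)
    - numerical_gradient(w_demo, X_demo, y_demo)
))
print(f"Largest difference between the two gradients: {max_gap:.2e}")
```

If the two gradients agree up to a small numerical error, the update rule in `fit` is consistent with minimizing the mean cross-entropy.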

In [5]:
# Create an instance of LogisticRegression and fit the training data
model = LogisticRegression()
model.fit(X_train, y_train)

In [6]:
# Predict the labels for the test set
y_pred = model.predict(X_test)

In [7]:
# Evaluate the model
accuracy = np.mean(y_pred == y_test)
print(f"Accuracy: {accuracy}")

Accuracy: 0.9649122807017544


### `sklearn` Logistic Regression

In [8]:
# Import scikit-learn's implementation under an alias so it does not clash with the class defined above
from sklearn.linear_model import LogisticRegression as SKLogisticRegression

In [9]:
# Create an instance of LogisticRegression and fit the training data
model = SKLogisticRegression()
model.fit(X_train, y_train)

LogisticRegression()

In [10]:
# Predict the labels for the test set
y_pred = model.predict(X_test)

In [11]:
# Evaluate the model
accuracy = np.mean(y_pred == y_test)

In [12]:
print(f"Accuracy: {accuracy}")

Accuracy: 0.9736842105263158
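
The two accuracies differ in part because the two models are not fitted in the same way. As a rough sketch, the snippet below inspects two defaults of scikit-learn's `LogisticRegression`: it fits its own intercept and applies regularization with inverse strength `C=1.0`, whereas the hand-written class runs plain gradient descent for a fixed number of steps. It also calls `predict_proba` to show the class probabilities behind the hard labels. The variable names (`sk_model`, `params`, `proba`) are illustrative, and fitting the scaler on the training split only is a choice made for this sketch rather than what the cells above do.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression as SKLogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training split only, then apply it to both splits
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

sk_model = SKLogisticRegression()
sk_model.fit(X_train_s, y_train)

# Defaults that distinguish it from the hand-written class
params = sk_model.get_params()
print("C:", params["C"], "| fit_intercept:", params["fit_intercept"])

# One row per test sample, one column per class; each row sums to 1
proba = sk_model.predict_proba(X_test_s)
print(proba[:3].round(3))
```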


## Generative vs Discriminative Setting

Discriminative and generative approaches are two different ways of tackling problems in machine learning.

In simple terms, a discriminative approach learns the decision boundary between classes. It models the mapping from inputs to labels directly, typically the conditional distribution p(y | x), without modeling how the inputs themselves are distributed.

<img src="../images/generative-vs-discriminative-models.png" width="600"/>

For example, let's say we have a dataset of images with cats and dogs, and we want to build a model that can classify new images as either cats or dogs. In a discriminative approach, the model would learn the features or patterns in the images that distinguish cats from dogs. It would then use these learned features to make predictions on new, unseen images.

On the other hand, a generative approach models the joint distribution of inputs and labels, p(x, y), typically as a class prior p(y) together with a class-conditional distribution p(x | y). It aims to capture how the data in each class is generated. Once learned, the model can classify an input with Bayes' rule and can also generate new data samples.

Using the same example, a generative model would first learn the distribution of features for cats and for dogs separately. To classify a new image, it asks which class's distribution explains the image better. The same learned distributions can also be sampled to generate new images that resemble either cats or dogs.
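
To make the generative recipe concrete, here is a minimal sketch using a Gaussian naive Bayes style model, chosen purely for illustration. Two made-up numeric features stand in for image features, and the names `X_cat`, `X_dog`, `log_joint`, `predict`, and `sample` are hypothetical. The model estimates a prior and a per-feature Gaussian for each class, classifies by picking the class with the larger joint probability, and reuses the same estimates to draw new samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-feature toy data standing in for "cat" (0) and "dog" (1)
X_cat = rng.normal(loc=[-2.0, 0.0], scale=1.0, size=(200, 2))
X_dog = rng.normal(loc=[2.0, 1.0], scale=1.0, size=(200, 2))
X_toy = np.vstack([X_cat, X_dog])
y_toy = np.array([0] * 200 + [1] * 200)

# Learn p(y) and a per-feature Gaussian p(x | y) for each class
classes = np.array([0, 1])
priors = np.array([np.mean(y_toy == c) for c in classes])
means = np.array([X_toy[y_toy == c].mean(axis=0) for c in classes])
variances = np.array([X_toy[y_toy == c].var(axis=0) for c in classes])

def log_joint(X):
    """Return log p(x, y=c) with one row per sample and one column per class."""
    diff = X[:, None, :] - means[None, :, :]
    log_lik = -0.5 * np.sum(
        np.log(2 * np.pi * variances)[None, :, :] + diff**2 / variances[None, :, :],
        axis=2,
    )
    return log_lik + np.log(priors)

def predict(X):
    # Bayes' rule: the class with the largest joint probability wins
    return classes[np.argmax(log_joint(X), axis=1)]

def sample(c, n):
    # Generate n new feature vectors from the learned distribution of class c
    return rng.normal(loc=means[c], scale=np.sqrt(variances[c]), size=(n, 2))

toy_accuracy = np.mean(predict(X_toy) == y_toy)
new_dogs = sample(1, 5)
print(f"Accuracy on the toy data: {toy_accuracy:.3f}")
print(new_dogs)
```

A discriminative model such as the logistic regression above would skip the `means`, `variances`, and `priors` entirely and fit the boundary between the two classes directly, which is why it has no counterpart to `sample`.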

<img src="../images/generative-vs-discriminative-models-2.png" width="600"/>

To summarize:
- Discriminative approach: Learns the boundary or decision-making process directly between different classes. It focuses on the relationship between input and output without explicitly modeling the underlying distribution.
- Generative approach: Models the joint distribution of the input data and output labels. It focuses on understanding how the data is generated from different classes and can generate new samples.
- Both approaches have their advantages and are useful in different scenarios. Discriminative models are often simpler and can be more efficient for classification tasks. Generative models, on the other hand, can be useful when we want to understand the underlying structure of the data or generate new samples that resemble the given data distribution.

> Note: When we say discriminative models are often simpler, we are referring to the fact that they directly learn the decision boundary between different classes or categories without explicitly modeling the underlying data distribution. This direct mapping from input to output makes the modeling task more focused and potentially easier.

Here's a table summarizing the advantages and disadvantages of the discriminative and generative approaches in machine learning:

|                    | Generative Approach                               | Discriminative Approach                                          |
|--------------------|--------------------------------------------------|------------------------------------------------------------------|
| Advantages         | - **Data generation**: Ability to generate new data samples | - **Simplicity**: Direct mapping from input to output            |
|                    | - **Handling missing data**: Can handle missing or incomplete data | - **Computational efficiency**: Fewer parameters                |
|                    | - **Anomaly detection**: Can identify outliers or anomalies | - **Feature relevance**: Focus on discriminative features       |
|                    | - **Data synthesis**: Can generate data that resembles the given distribution | - **Generalization**: Good at predicting on new data       |
| Disadvantages      | - **Complexity**: More parameters and computational resources required | - **Limited information**: Ignores data distribution         |
|                    | - **Overfitting risk**: Prone to overfitting if not enough data available | - **Data imbalance**: Sensitive to imbalanced data          |
|                    | - **Computational cost**: More complex and computationally expensive | - **No data generation**: Cannot generate new data         |
|                    | - **Interpretability**: More challenging to interpret and understand | - **Task-specific**: More suitable for classification tasks |
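
The anomaly detection entry in the table can be illustrated with a small sketch. It assumes a single Gaussian with independent features fitted to made-up "normal" data, and it flags a point as anomalous when its log-density falls below the lowest 1% of the training points. The 1% cutoff and the names `log_density`, `threshold`, and `candidates` are arbitrary choices for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "normal" observations with two features
X_normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))

# Fit an independent Gaussian to each feature
mean = X_normal.mean(axis=0)
var = X_normal.var(axis=0)

def log_density(X):
    """Log of the fitted Gaussian density at each row of X."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (X - mean) ** 2 / var, axis=1)

# Points less likely than 99% of the training data are flagged
threshold = np.quantile(log_density(X_normal), 0.01)

candidates = np.array([[0.1, -0.2], [6.0, 6.0]])
is_anomaly = log_density(candidates) < threshold
print(is_anomaly)  # the point near the mean is kept, the distant one is flagged
```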


<img src="../images/chatgpt.jpg" width="400"/>

It's interesting to note that ChatGPT is a generative model. It belongs to a family of models called generative pre-trained transformers (GPT), which are designed to generate text based on the patterns and structures they have learned from their training data. Given an input, ChatGPT uses the context to produce a coherent and relevant response.