# 🧩 Problem Statement

## 1. The Problem: Concept Drift
Imagine you are a student preparing for a math exam. You study hard and learn all the formulas for **Algebra**. But when you reach the exam hall, the questions turn out to be about **Geometry**! Your Algebra formulas no longer work well. You need to **adapt**: unlearn some old rules and learn new ones quickly.

In Machine Learning, this is called **Concept Drift**: the relationship between a model's inputs and outputs changes over time, so a model fitted to old data becomes less accurate on new data.
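
To make the definition concrete, here is a minimal sketch in plain Python (standard library only). The two labeling rules (`x0 > 0` before drift, `x1 > 0` after) and the helper names `fixed_model` and `accuracy` are assumptions chosen for illustration; they are not part of the notebook's dataset.

```python
import random

rng = random.Random(0)
# 1000 random 2-D points in the square [-1, 1] x [-1, 1]
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(1000)]

def fixed_model(point):
    # A "trained" model that learned the old concept: class 1 when x0 > 0
    return 1 if point[0] > 0 else 0

def accuracy(labels):
    # Fraction of points where the fixed model agrees with the given labels
    hits = sum(fixed_model(p) == y for p, y in zip(points, labels))
    return hits / len(points)

# Old concept: the label depends on the first feature
labels_before = [1 if p[0] > 0 else 0 for p in points]
# New concept: same inputs, but the label now depends on the second feature
labels_after = [1 if p[1] > 0 else 0 for p in points]

acc_before = accuracy(labels_before)
acc_after = accuracy(labels_after)
print(f"accuracy before drift: {acc_before:.2f}")
print(f"accuracy after drift:  {acc_after:.2f}")
```

The inputs are identical in both cases; only the input-to-label rule changed, and the unchanged model falls from perfect accuracy to roughly chance level.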

## 2. Our Goal
Build a **Perceptron** that can:
- Learn from a stream of data.
- Detect when its accuracy drops (Validation Check).
- **Reset itself** and retrain when validation accuracy falls below 70%.

## 3. Visual Flow
```mermaid
graph TD
    A[Start: Data Stream] -->|Batch 1| B(Train Perceptron)
    B --> C{Check Accuracy}
    C -->|High Accuracy ≥ 70%| D[Keep Learning Parameters]
    C -->|Low Accuracy < 70%| E[⚠️ RESET WEIGHTS]
    D --> F[Next Batch]
    E --> F
    F -->|Batch 2: Drift!| B
```
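
The decision node in the diagram can be sketched as a small function. The name `next_action` and its string return values are hypothetical; the 0.70 threshold is the one shown in the diagram, and an accuracy of exactly 0.70 is treated as "keep", matching the `val_acc < 0.70` check used in the experiment code.

```python
def next_action(accuracy, threshold=0.70):
    """Return 'reset' when accuracy is below the threshold, else 'keep'."""
    return "reset" if accuracy < threshold else "keep"

for acc in (0.91, 0.70, 0.55):
    print(f"accuracy={acc:.2f} -> {next_action(acc)}")
```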

### 🔹 Line Explanation
#### 2.1 What the line does
Imports the libraries needed for array manipulation (`numpy`), data handling (`pandas`), plotting (`matplotlib`), and dataset generation, train/validation splitting, and accuracy scoring (`sklearn`).
#### 2.2 Why it is used
These are the standard tools for Data Science in Python. We need `numpy` for the perceptron math and `matplotlib` to see the results.
#### 2.3 When to use it
At the beginning of every data science project.
#### 2.4 Where to use it
Top of the file/notebook.
#### 2.5 How to use it
`import numpy as np`
#### 2.6 How it works internally
Python loads each module once, including the compiled C extensions that make NumPy's math fast, and caches it in `sys.modules` so later imports reuse the same module object.
#### 2.7 Output with sample examples
None (execution is silent).

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

### 🔹 Function Explanation: drifting_stream
#### 3.1 What it does
Generates a stream of data batches where the data distribution shifts (drifts) over time.
#### 3.2 Why it is used
To simulate a changing environment so we can test if our model adapts.
#### 3.3 When to use it
When testing online learning algorithms.
#### 3.4 Where to use it
In synthetic data generation steps.
#### 3.5 How to use it
`batches = drifting_stream()`
#### 3.6 How it works internally
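It generates a fresh two-feature classification dataset for each batch (each with a different random seed, so the class layout changes between batches) and adds a fixed offset from `shifts` to each of the two feature columns.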
#### 3.7 Output impact with examples
Returns a list of `(X, y)` tuples.
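
The offset mechanism can be checked in isolation. This sketch uses plain Python and Gaussian points instead of `make_classification`, and it reuses one base dataset for all three batches so that only the shift differs; both are assumptions made to keep the example small and dependency-free. The `shifts` values are the ones used in `drifting_stream`.

```python
import random
from statistics import mean

rng = random.Random(99)
# Stand-in base data: 500 two-feature points (not the sklearn dataset)
base = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
shifts = [(0.0, 0.0), (0.8, -0.6), (1.2, 0.9)]

batch_means = []
for drift_x, drift_y in shifts:
    # Add the offset to every point, one value per feature column
    batch = [(x0 + drift_x, x1 + drift_y) for x0, x1 in base]
    batch_means.append((mean(p[0] for p in batch), mean(p[1] for p in batch)))

for (dx, dy), (m0, m1) in zip(shifts, batch_means):
    print(f"shift=({dx:+.1f}, {dy:+.1f}) -> feature means=({m0:+.3f}, {m1:+.3f})")
```

Each batch's feature means move by exactly the offset applied, which is the kind of change the model has to cope with from one batch to the next.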

In [None]:
def drifting_stream(seed=99):
    rng = np.random.default_rng(seed)
    batches = []
    shifts = [(0.0, 0.0), (0.8, -0.6), (1.2, 0.9)]
    
    for drift_x, drift_y in shifts:
        X, y = make_classification(
            n_samples=500,
            n_features=2,
            n_informative=2,
            n_redundant=0,
            class_sep=1.2,
            random_state=rng.integers(1000),
        )
        X[:, 0] += drift_x
        X[:, 1] += drift_y
        batches.append((X, y))
    return batches

### 🔹 Class Explanation: AdaptivePerceptron
#### 2.1 What this Class does
It implements a Perceptron (a linear classifier) that stores learning-rate decay settings and can re-initialize its weights and bias on demand.
#### 2.2 Why it is used
Standard Perceptrons don't reset or decay explicitly. We need these features for our specific Concept Drift task.
#### 2.3 When to use it
When handling streaming data that might change drastically.
#### 2.4 Where to use it
As the core machine learning model.
#### 2.5 How to use it
`model = AdaptivePerceptron(learning_rate=0.1)`
#### 2.6 How it works internally
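It stores `weights` and `bias`. `update_weights` adjusts them whenever a sample is misclassified, scaling the correction by the learning rate. `reset_model` re-initializes the weights to small random values, zeroes the bias, and restores the initial learning rate. The decay itself is applied by the training loop, which multiplies `learning_rate` by `decay_rate` every `decay_steps` epochs.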
#### 2.7 Output with sample examples
An object that can `.predict(X)`.
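
The update rule can be traced by hand on a single sample. The following is a minimal standalone sketch in plain Python, not the class itself; the helper name `perceptron_step` and the sample values are assumptions for illustration.

```python
def perceptron_step(weights, bias, x, y_true, learning_rate=0.1):
    """Apply one perceptron update and return the new (weights, bias)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    y_pred = 1 if z >= 0 else 0          # step activation
    error = y_true - y_pred              # one of -1, 0, +1
    if error != 0:
        update = learning_rate * error
        weights = [w + update * xi for w, xi in zip(weights, x)]
        bias = bias + update
    return weights, bias

# Start from zero weights: z = 0, so the model predicts 1, but the label is 0
w, b = perceptron_step([0.0, 0.0], 0.0, x=[1.0, 2.0], y_true=0)
print(w, b)

# The same sample is now classified correctly, so nothing changes
w2, b2 = perceptron_step(w, b, x=[1.0, 2.0], y_true=0)
print(w2 == w and b2 == b)
```

After the first call the weights and bias have moved against the misclassified input (by the learning rate times the input), and the second call is a no-op because the error is zero.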

In [None]:
class AdaptivePerceptron:
    def __init__(self, learning_rate=0.1, decay_rate=0.9, decay_steps=5):
        self.initial_learning_rate = learning_rate
        self.learning_rate = learning_rate
        self.decay_rate = decay_rate
        self.decay_steps = decay_steps
        self.weights = None
        self.bias = 0
        self.reset_count = 0 
        self.learning_rates_log = []

    def activation(self, z):
        # Step function: 1 if z >= 0, else 0
        return 1 if z >= 0 else 0

    def predict(self, X):
        # Before any weights exist, predict class 0 for every sample
        if self.weights is None:
            return np.zeros(X.shape[0], dtype=int)
        linear_output = np.dot(X, self.weights) + self.bias
        # Vectorized step function, equivalent to calling activation() per row
        return (linear_output >= 0).astype(int)

    def update_weights(self, x_i, y_true):
        linear_output = np.dot(x_i, self.weights) + self.bias
        y_pred = self.activation(linear_output)
        error = y_true - y_pred
        if error != 0:
            update = self.learning_rate * error
            self.weights += update * x_i
            self.bias += update

    def reset_model(self, n_features):
        rng = np.random.default_rng(42)
        self.weights = rng.random(n_features) * 0.01
        self.bias = 0
        self.learning_rate = self.initial_learning_rate 
        self.reset_count += 1
        print("  [RESET] TRIGGERED: Weights re-initialized!")

### 🔹 Experiment Block Explanation
#### 2.1 What this block does
It runs the main loop: Get Batches -> Train -> Evaluate -> Reset if needed.
#### 2.2 Why it is used
To put everything together and see if our model works.
#### 2.3 When to use it
After defining data and model classes.
#### 2.4 Where to use it
At the end of the script/notebook.
#### 2.5 How to use it
Just run the cell.
#### 2.6 How it works internally
It loops through the generated batches, trains for several epochs on each one, checks validation accuracy, and resets and retrains the model when accuracy falls below 0.70.
#### 2.7 Output with sample examples
Prints per-batch accuracy logs. The plot is produced by the Analysis & Visualization block that follows.
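
The learning-rate decay inside the training loop can be traced on its own. The sketch below assumes no reset happens (a reset restores the initial rate); the function name `decay_schedule` is hypothetical, and the default values mirror those passed to `AdaptivePerceptron` in the experiment.

```python
def decay_schedule(initial_lr=0.1, decay_rate=0.9, decay_steps=5, total_epochs=45):
    """Return the learning rate used at each epoch, assuming no resets."""
    lr = initial_lr
    log = []
    for epoch in range(1, total_epochs + 1):
        # Every `decay_steps` epochs, shrink the rate by `decay_rate`
        if epoch % decay_steps == 0:
            lr *= decay_rate
        log.append(lr)
    return log

# 3 batches x 15 epochs per batch = 45 epochs in total
rates = decay_schedule()
print(f"epoch 1:  {rates[0]:.4f}")
print(f"epoch 5:  {rates[4]:.4f}")
print(f"epoch 45: {rates[-1]:.4f}")
```

With these settings the rate is multiplied by 0.9 nine times over 45 epochs, ending near 0.0387.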

In [None]:
# 1. Get Data Stream
batches = drifting_stream(seed=99)
print(f"Generated {len(batches)} batches of data.\n")

# 2. Initialize Model
model = AdaptivePerceptron(learning_rate=0.1, decay_rate=0.9, decay_steps=5)

global_accuracies = []
reset_points = [] 
EPOCHS_PER_BATCH = 15
overall_epoch_counter = 0

for batch_idx, (X, y) in enumerate(batches):
    print(f"=== BATCH {batch_idx + 1} ===")
    # Split data
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=200, shuffle=True, random_state=42)
    
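    # Initialize weights on the first batch. reset_model() also increments
    # reset_count and prints a reset message, so zero the counter afterwards:
    # this is initialization, not a drift-triggered reset.
    if model.weights is None:
        model.reset_model(n_features=X.shape[1])
        model.reset_count = 0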
    
    # Train Loop
    for epoch in range(EPOCHS_PER_BATCH):
        overall_epoch_counter += 1
        if overall_epoch_counter % model.decay_steps == 0:
            model.learning_rate *= model.decay_rate
        model.learning_rates_log.append(model.learning_rate)

        for i in range(len(X_train)):
            model.update_weights(X_train[i], y_train[i])
    
    # Validation Loop
    val_predictions = model.predict(X_val)
    val_acc = accuracy_score(y_val, val_predictions)
    global_accuracies.append(val_acc)
    print(f"  Batch {batch_idx+1} Validation Accuracy: {val_acc:.4f}")
    
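    # Reset Condition
    if val_acc < 0.70:
        print(f"  [DRIFT] Accuracy {val_acc:.2f} < 0.70. Drift likely detected.")
        # reset_model() re-initializes weights and bias and restores the initial learning rate
        model.reset_model(n_features=X.shape[1])
        reset_points.append(batch_idx + 1)
        print("  [RETRAIN] Retraining on current batch after reset...")
        for epoch in range(EPOCHS_PER_BATCH):
            for i in range(len(X_train)):
                model.update_weights(X_train[i], y_train[i])
        # Report the post-retrain accuracy; global_accuracies keeps the pre-reset value
        retrain_acc = accuracy_score(y_val, model.predict(X_val))
        print(f"  [RETRAIN] Validation Accuracy after retraining: {retrain_acc:.4f}")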

### 🔹 Analysis & Visualization Block
#### 3.1 What it does
Plots the accuracy over time and marks where Resets happened.
#### 3.2 Why it is used
To show at a glance how validation accuracy changes across batches and where resets were triggered.
#### 3.3 When to use it
At the very end.
#### 3.4 Where to use it
Final cell.
#### 3.5 How to use it
See code below.
#### 3.6 How it works internally
Uses `matplotlib.pyplot` to draw lines and markers.
#### 3.7 Output impact with examples
Displays a graph inline.

In [None]:
# Analysis Output
print("\n=== FINAL ANALYSIS ===")
print(f"Total Resets: {model.reset_count}")
print(f"Final Batch Accuracy: {global_accuracies[-1]:.4f}")

plt.figure(figsize=(10, 6))
plt.plot(range(1, len(batches) + 1), global_accuracies, marker='o', linestyle='-', label='Validation Accuracy')
plt.axhline(y=0.70, color='r', linestyle='--', label='Reset Threshold (0.70)')
plt.axhline(y=0.80, color='g', linestyle='--', label='Success Criteria (0.80)')

for rp in reset_points:
    plt.axvline(x=rp, color='orange', linestyle=':', label='Weight Reset' if rp == reset_points[0] else "")
    
plt.title('Adaptive Perceptron Accuracy over Drifting Batches')
plt.xlabel('Batch Index')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True)
plt.show()