
# Normalisation
---
We’ll normalize the features so that each input variable is on a similar scale, which helps with convergence during training. I’ll use MinMaxScaler for simplicity (you could also use StandardScaler).

---
<b>What is Normalisation? ➡</b>

Imagine you’re training a group of athletes 🏃‍♀️🏃‍♂️ for a relay race.
*  One is measured in meters per second,
*  Another in minutes per kilometer,
*  Another in miles per hour.

If you don’t convert them into the same scale, comparing their performance is confusing.

👉 Normalization = putting all features on a common scale so the model can learn fairly.

---
<b>🛠 Why It Matters</b>
*  Faster training – Gradient descent works better when features have similar ranges.
*  Avoid bias – A large-scale feature (like salary in ₹100,000s) shouldn’t dominate a small-scale feature (like age in 20s).
*  Better accuracy – Helps the model find general patterns instead of latching onto magnitudes.
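
To make the "avoid bias" point concrete, here is a small sketch with made-up numbers. The two people and the assumed feature ranges are illustrative only and do not come from any dataset in this notebook. It compares the Euclidean distance between two samples before and after min-max scaling.

```python
import numpy as np

# Hypothetical samples: [age in years, salary in rupees]
a = np.array([25.0, 500_000.0])
b = np.array([55.0, 510_000.0])

# Raw Euclidean distance: the salary gap (10,000) swamps the age gap (30)
raw_dist = np.linalg.norm(a - b)

# Assumed feature ranges used for scaling (illustrative only)
mins = np.array([20.0, 200_000.0])
maxs = np.array([60.0, 1_200_000.0])

a_scaled = (a - mins) / (maxs - mins)
b_scaled = (b - mins) / (maxs - mins)
scaled_dist = np.linalg.norm(a_scaled - b_scaled)

print(f"raw distance:    {raw_dist:.2f}")
print(f"scaled distance: {scaled_dist:.4f}")
```

Before scaling, the distance is almost entirely the salary difference. After scaling, the 30-year age gap is what separates the two samples.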

Most common types of normalization:

<b>1. Min-Max Normalization (Scaling)</b>
*  Rescales values into a fixed range, usually [0,1].
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

*  Example:
  *  Age 0–100 → 0–1
  *  Salary 20k–120k → 0–1

⚡ Best when you know the min/max and data has no extreme outliers.
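
The formula above can be sketched in a few lines of NumPy. The helper name `min_max_scale` and the toy age/salary values are assumptions for illustration; in practice `MinMaxScaler` from scikit-learn does the same job and remembers the min/max for later use.

```python
import numpy as np

def min_max_scale(x, feature_range=(0.0, 1.0)):
    """Rescale each column of x into feature_range using that column's min and max."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    # Constant columns have max == min; use a span of 1 to avoid dividing by zero
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    lo, hi = feature_range
    return lo + (x - x_min) / span * (hi - lo)

# Columns: age (years), salary (thousands)
data = np.array([[0.0, 20.0],
                 [50.0, 70.0],
                 [100.0, 120.0]])
scaled = min_max_scale(data)
print(scaled)
```

Each column is scaled independently, so age and salary both end up spanning 0 to 1 regardless of their original units.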

---
<b>2. Z-score Normalization (Standardization)</b>
*  Centers data around 0 with standard deviation 1.
$$x' = \frac{x - \mu}{\sigma}$$
*  Example: If the mean height is 170 cm and the standard deviation is 10 cm, a person who is 180 cm tall has a z-score of (180 − 170)/10 = 1.0.

Meaning: “1 standard deviation above average.”

⚡ A common default for ML/DL models (especially neural nets), since it centers every feature at zero with unit spread and is usually less distorted by outliers than min-max scaling.
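
Here is a minimal NumPy sketch of the same calculation. The helper names `z_score` and `standardize` and the sample heights are assumptions for illustration. The column version uses the population standard deviation (`ddof=0`), which is also the convention scikit-learn's `StandardScaler` follows.

```python
import numpy as np

def z_score(x, mean, std):
    """Z-score of a value given a known mean and standard deviation."""
    return (x - mean) / std

# The height example: 180 cm with mean 170 cm and std 10 cm
print(z_score(180.0, 170.0, 10.0))  # 1.0

def standardize(x):
    """Standardize each column of x to mean 0 and standard deviation 1."""
    x = np.asarray(x, dtype=float)
    mu = x.mean(axis=0)
    sigma = x.std(axis=0)  # population std (ddof=0)
    sigma = np.where(sigma > 0, sigma, 1.0)  # leave constant columns at 0
    return (x - mu) / sigma

# Hypothetical heights in cm, one feature column
heights = np.array([[150.0], [160.0], [170.0], [180.0], [190.0]])
standardized = standardize(heights)
print(standardized.ravel())
```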


---
<b>3. Unit Vector Normalization</b>
*  Scale each row (sample) to length 1.
$$x' = \frac{x}{\|x\|}$$
*  Useful in text embeddings, where direction matters more than magnitude.
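
A short sketch of row-wise scaling, assuming the norm in the formula is the L2 (Euclidean) norm. The helper name `l2_normalize_rows` and the small `eps` guard for all-zero rows are illustrative choices.

```python
import numpy as np

def l2_normalize_rows(x, eps=1e-12):
    """Scale every row of x to unit L2 length; all-zero rows stay zero."""
    x = np.asarray(x, dtype=float)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

# Two vectors pointing the same way but with different magnitudes
vectors = np.array([[3.0, 4.0],
                    [30.0, 40.0]])
unit = l2_normalize_rows(vectors)
print(unit)  # both rows become [0.6, 0.8]
```

Both rows map to the same unit vector, which is why this form suits cases where only direction matters.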

---
## <u>Deep Learning Normalisation Techniques</u>
Normalization happens before training (input features) and sometimes inside the network:
*  Batch Normalization: Normalizes layer outputs during training to keep activations stable.
*  Layer Normalization / Group Normalization: Variants that compute the statistics within each sample instead of across the batch; layer normalization is the usual choice in transformers.
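
The practical difference between batch and layer normalization is which axis the mean and variance are taken over. The sketch below shows this on a random `(batch, features)` array. The array shape, the seed, and `eps` are arbitrary illustrative choices, and the learnable scale and shift are left out for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
acts = rng.normal(loc=5.0, scale=3.0, size=(4, 6))  # (batch, features)
eps = 1e-5

# Batch normalization: one mean/variance per feature, computed across the batch
bn = (acts - acts.mean(axis=0)) / np.sqrt(acts.var(axis=0) + eps)

# Layer normalization: one mean/variance per sample, computed across its features
ln = (acts - acts.mean(axis=1, keepdims=True)) / np.sqrt(
    acts.var(axis=1, keepdims=True) + eps
)

print(bn.mean(axis=0))  # close to 0 for every feature column
print(ln.mean(axis=1))  # close to 0 for every sample row
```

Because layer normalization never looks at other samples, it behaves the same for any batch size, including a batch of one.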

---

### <u>Batch Normalization (BN)</u>
Intuition

Normalizes activations across the batch dimension.

Think: "Look at all samples in the batch, compute mean & std, then normalize each feature channel."

Very similar to Z-score normalization — but applied per mini-batch during training.

Formula

For an activation $x$ in a batch:
$$\hat{x} = \frac{x - \mu_{\text{batch}}}{\sqrt{\sigma^2_{\text{batch}} + \epsilon}}$$

Then apply learnable scale and shift:
$$y = \gamma \hat{x} + \beta$$

$\mu_{\text{batch}}$: mean of the batch

$\sigma^2_{\text{batch}}$: variance of the batch

$\epsilon$: small constant for numerical stability

$\gamma, \beta$: trainable parameters (so the model can “undo” the normalization if useful)

<u><b>Analogy to earlier normalizations</b></u>

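Like Z-score normalization, but the mean and standard deviation are computed per mini-batch during training rather than once over the whole dataset. At inference time, running averages collected during training are used instead.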

Keeps features stable across training steps.
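
The two BN formulas can be written out directly in NumPy. This is a training-mode sketch only, assuming a 2-D `(batch, features)` input; the function name `batch_norm_train` and the random input are illustrative, and a real layer such as `tf.keras.layers.BatchNormalization` also tracks running statistics for inference.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Training-mode batch norm for a (batch, features) array.

    Returns the output together with the batch mean and variance.
    """
    mu = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                     # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # normalize
    return gamma * x_hat + beta, mu, var    # scale and shift

rng = np.random.default_rng(42)
x = rng.normal(loc=10.0, scale=4.0, size=(32, 3))

# With gamma = 1 and beta = 0 the output is just the normalized activations
y, mu, var = batch_norm_train(x, np.ones(3), np.zeros(3))
print(y.mean(axis=0))
print(y.std(axis=0))

# Choosing gamma = sqrt(var + eps) and beta = mu "undoes" the normalization
y_identity, _, _ = batch_norm_train(x, np.sqrt(var + 1e-5), mu)
print(np.allclose(y_identity, x))
```

The last check illustrates why $\gamma$ and $\beta$ are learnable: the network can recover the original activations if that turns out to be the better representation.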

In [None]:
import pandas as pd
import numpy as np
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# 1. Load dataset
df = pd.read_csv("/mnt/data/goldstock.csv")

# 2. Feature engineering
df['open-close'] = df['Open'] - df['Close']
df['low-high'] = df['Low'] - df['High']
df['is_quarter_end'] = np.where(df['Date'].str.endswith(('03-31','06-30','09-30','12-31')), 1, 0)

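# Target: 1 if the next day's close is higher than today's (binary classification).
# This assumes the rows are sorted by ascending date.
df['target'] = np.where(df['Close'].shift(-1) > df['Close'], 1, 0)
# The last row has no next-day close; np.where labels it 0 rather than NaN,
# so dropna() alone would keep it. Drop it explicitly, then drop any other NaN rows.
df = df.iloc[:-1].dropna()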

# 3. Features and target
X = df[['open-close', 'low-high', 'is_quarter_end']].values
y = df['target'].values

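# 4. Train-validation split (chronological, so no shuffling)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# 5. Normalization: fit the scaler on the training split only, then reuse
# the same min/max on the validation split so no validation data leaks in
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)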

# 6. TensorFlow model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(X_train.shape[1],)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=['accuracy'])

# 7. Train the model
history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    epochs=50,
                    batch_size=16,
                    verbose=1)

# 8. Plot accuracy and loss
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Val Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Val Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()

plt.show()

# 9. Evaluation
loss, acc = model.evaluate(X_val, y_val, verbose=0)
print(f"Validation Accuracy: {acc:.4f}")
