In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# RGB → Font Color Classifier (NN) — Plan

## 1. Data Preparation
- [ ] Load `data_a2_mc1_vta_hs25.csv` with `sep=";"`.
- [ ] Inspect columns: `RED`, `GREEN`, `BLUE`, `LIGHT_OR_DARK_FONT_IND`.
- [ ] Normalize RGB to \([0,1]\): `R/255`, `G/255`, `B/255`.
- [ ] Shuffle rows to remove ordering bias.
- [ ] Split: **80% train**, **20% test** (stratify by label if possible).
- [ ] Verify label semantics (0 ↔ light? 1 ↔ dark?) by quick EDA (class means or a few samples).
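
The preparation steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's final code: the helper name `prepare_data`, its parameters, and the synthetic `demo` frame (including its made-up labels) are assumptions introduced only so the example runs without the CSV.

```python
import numpy as np
import pandas as pd

def prepare_data(df, test_frac=0.2, seed=42):
    """Scale RGB to [0, 1], shuffle, and split train/test stratified by label."""
    rng = np.random.default_rng(seed)
    X = df[["RED", "GREEN", "BLUE"]].to_numpy(dtype=float) / 255.0
    y = df["LIGHT_OR_DARK_FONT_IND"].to_numpy(dtype=float).reshape(-1, 1)

    train_idx, test_idx = [], []
    for label in np.unique(y):
        # Shuffle the rows of this class, then cut off the test share
        idx = rng.permutation(np.flatnonzero(y[:, 0] == label))
        n_test = int(round(test_frac * len(idx)))
        test_idx.append(idx[:n_test])
        train_idx.append(idx[n_test:])

    # Shuffle again so the classes are interleaved in both splits
    train_idx = rng.permutation(np.concatenate(train_idx))
    test_idx = rng.permutation(np.concatenate(test_idx))
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# With the real file this would be:
# df = pd.read_csv("data_a2_mc1_vta_hs25.csv", sep=";")
# Synthetic stand-in (ASSUMPTION: random colors, invented labels) so the sketch runs:
demo_rng = np.random.default_rng(0)
demo = pd.DataFrame(demo_rng.integers(0, 256, size=(100, 3)),
                    columns=["RED", "GREEN", "BLUE"])
demo["LIGHT_OR_DARK_FONT_IND"] = (
    demo[["RED", "GREEN", "BLUE"]].mean(axis=1) < 128
).astype(int)

X_train, y_train, X_test, y_test = prepare_data(demo)
print(X_train.shape, X_test.shape)
```

Splitting per class before concatenating keeps the label ratio roughly equal in both splits, which is what "stratify" in the checklist asks for.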

## 2. Neural Network Architecture
- Parameters:
  - Input size \(n_{\text{in}}=3\) (R,G,B)
  - Hidden size \(n_h=8\) (tunable)
  - Output size \(n_{\text{out}}=1\)
- Init:
  - **He** for ReLU: \(W_1 \sim \mathcal{N}(0,\,2/n_{\text{in}})\), \(W_2 \sim \mathcal{N}(0,\,2/n_h)\) (second argument is the variance, i.e. standard deviations \(\sqrt{2/n_{\text{in}}}\) and \(\sqrt{2/n_h}\))
  - Biases: \(b_1=\mathbf{0}\), \(b_2=\mathbf{0}\)
- Activations:
  - \(\mathrm{ReLU}(z)=\max(0,z)\), \(\mathrm{ReLU}'(z)=\mathbf{1}_{z>0}\)
  - \(\sigma(z)=\frac{1}{1+e^{-z}}\)
- Forward pass (return all intermediates):
  - \(z_1 = X W_1 + b_1\)
  - \(a_1 = \mathrm{ReLU}(z_1)\)
  - \(z_2 = a_1 W_2 + b_2\)
  - \(a_2 = \sigma(z_2)\)

## 3. Loss Function (Binary Cross-Entropy)
\[
\mathcal{L} = -\left[y\cdot \log(\hat{y}) + (1-y)\cdot \log(1-\hat{y})\right]
\]
- Numerical stability: clip \(\hat{y}\) to \([10^{-12},\,1-10^{-12}]\).
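
A possible implementation of the clipped loss is sketched below. The function name `bce_loss` and the choice to average over the batch are assumptions; the formula above is written per sample.

```python
import numpy as np

def bce_loss(y_true, y_hat, eps=1e-12):
    """Binary cross-entropy, averaged over the batch (averaging is an assumption)."""
    # Clip predictions away from 0 and 1 so log() never receives 0
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    per_sample = -(y_true * np.log(y_hat) + (1.0 - y_true) * np.log(1.0 - y_hat))
    return float(np.mean(per_sample))

y = np.array([[1.0], [0.0]])
loss_uncertain = bce_loss(y, np.array([[0.5], [0.5]]))  # equals log(2)
loss_perfect = bce_loss(y, np.array([[1.0], [0.0]]))    # finite thanks to clipping
print(loss_uncertain, loss_perfect)
```

Without the clip, the second call would evaluate `log(0)` and produce `nan` from `0 * -inf`.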

## 4. Backpropagation
- \(dZ_2 = a_2 - y\)
- \(dW_2 = a_1^\top dZ_2\),  \(db_2 = \sum_{i=1}^{m} dZ_2^{(i)}\) (sum over the batch rows)
- \(dA_1 = dZ_2 W_2^\top\)
- \(dZ_1 = dA_1 \odot \mathrm{ReLU}'(z_1)\)
- \(dW_1 = X^\top dZ_1\),  \(db_1 = \sum_{i=1}^{m} dZ_1^{(i)}\) (sum over the batch rows)
- **Shape sanity checks** (batch \(m\)):
  - \(X:[m,3]\), \(W_1:[3,8]\), \(b_1:[1,8]\), \(a_1:[m,8]\)
  - \(W_2:[8,1]\), \(b_2:[1,1]\), \(a_2:[m,1]\)
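
The gradient formulas above translate almost line by line into NumPy. The sketch below is one way to write them; the names `backward`, `forward`, and `summed_bce`, as well as the random test data, are assumptions for illustration. It also compares one analytic gradient entry against a finite difference, which is a cheap way to catch sign or transpose mistakes.

```python
import numpy as np

def relu(z):    return np.maximum(0.0, z)
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    z1 = X @ W1 + b1
    a1 = relu(z1)
    a2 = sigmoid(a1 @ W2 + b2)
    return z1, a1, a2

def backward(X, y, z1, a1, a2, W2):
    """Gradients of the summed BCE loss (the 1/m factor is applied in the update)."""
    dZ2 = a2 - y                              # (m, 1)
    dW2 = a1.T @ dZ2                          # (n_h, 1)
    db2 = dZ2.sum(axis=0, keepdims=True)      # (1, 1)
    dA1 = dZ2 @ W2.T                          # (m, n_h)
    dZ1 = dA1 * (z1 > 0)                      # ReLU'(z1) as a 0/1 mask
    dW1 = X.T @ dZ1                           # (3, n_h)
    db1 = dZ1.sum(axis=0, keepdims=True)      # (1, n_h)
    return dW1, db1, dW2, db2

# Random toy batch (ASSUMPTION: not real data), shapes as in the checklist
rng = np.random.default_rng(0)
m, n_in, n_h = 4, 3, 8
X = rng.random((m, n_in))
y = rng.integers(0, 2, size=(m, 1)).astype(float)
W1 = rng.normal(0, np.sqrt(2.0 / n_in), size=(n_in, n_h)); b1 = np.zeros((1, n_h))
W2 = rng.normal(0, np.sqrt(2.0 / n_h), size=(n_h, 1));     b2 = np.zeros((1, 1))

z1, a1, a2 = forward(X, W1, b1, W2, b2)
dW1, db1, dW2, db2 = backward(X, y, z1, a1, a2, W2)

def summed_bce(W2_try):
    _, _, out = forward(X, W1, b1, W2_try, b2)
    return float(-np.sum(y * np.log(out) + (1 - y) * np.log(1 - out)))

# Central finite difference on a single entry of W2
h = 1e-6
E = np.zeros_like(W2); E[0, 0] = h
numeric = (summed_bce(W2 + E) - summed_bce(W2 - E)) / (2 * h)
print(dW2[0, 0], numeric)
```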

## 5. Parameter Update (Training)
- Optimizer: **SGD**, learning rate \(\eta = 0.05\) (tune if needed).
- Loop **100,000** iterations:
  - Sample one example (or a mini-batch) from the training set.
  - Forward → Backward → Update:
    - \(W \leftarrow W - \eta \cdot \frac{1}{m_{\text{batch}}} dW\)
    - \(b \leftarrow b - \eta \cdot \frac{1}{m_{\text{batch}}} db\)
- Log: print loss every **1,000** iters.
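
Put together, the loop might look like the sketch below. The function `train`, its signature, and the synthetic data (labels derived from a luminance threshold) are assumptions made so the example is runnable on its own; the demo call also uses far fewer than 100,000 iterations to stay quick.

```python
import numpy as np

def relu(z):    return np.maximum(0.0, z)
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))

def bce(y, y_hat, eps=1e-12):
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return float(np.mean(-(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))))

def train(X, y, n_h=8, lr=0.05, n_iters=100_000, batch_size=1,
          log_every=1_000, seed=42):
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W1 = rng.normal(0, np.sqrt(2.0 / n_in), size=(n_in, n_h)); b1 = np.zeros((1, n_h))
    W2 = rng.normal(0, np.sqrt(2.0 / n_h), size=(n_h, 1));     b2 = np.zeros((1, 1))
    history = []
    for it in range(n_iters):
        if it % log_every == 0:
            # Log the loss on the full training set, before this step's update
            history.append(bce(y, sigmoid(relu(X @ W1 + b1) @ W2 + b2)))
            print(f"iter {it:>7d}  train loss {history[-1]:.4f}")
        # Sample a mini-batch (with replacement)
        idx = rng.integers(0, X.shape[0], size=batch_size)
        Xb, yb = X[idx], y[idx]
        # Forward
        z1 = Xb @ W1 + b1
        a1 = relu(z1)
        a2 = sigmoid(a1 @ W2 + b2)
        # Backward (batch sums)
        dZ2 = a2 - yb
        dW2 = a1.T @ dZ2
        db2 = dZ2.sum(axis=0, keepdims=True)
        dZ1 = (dZ2 @ W2.T) * (z1 > 0)
        dW1 = Xb.T @ dZ1
        db1 = dZ1.sum(axis=0, keepdims=True)
        # Update with the 1/m_batch factor
        W1 -= lr * dW1 / batch_size; b1 -= lr * db1 / batch_size
        W2 -= lr * dW2 / batch_size; b2 -= lr * db2 / batch_size
    history.append(bce(y, sigmoid(relu(X @ W1 + b1) @ W2 + b2)))
    return (W1, b1, W2, b2), history

# Synthetic stand-in data (ASSUMPTION): label 1 when luminance exceeds 0.5
demo_rng = np.random.default_rng(0)
X_demo = demo_rng.random((500, 3))
y_demo = (X_demo @ np.array([0.2126, 0.7152, 0.0722]) > 0.5).astype(float).reshape(-1, 1)

params, history = train(X_demo, y_demo, n_iters=5_000, batch_size=32)
print("initial vs final loss:", history[0], history[-1])
```

Setting `batch_size=1` reproduces the single-sample variant from the plan; larger batches trade noisier steps for smoother loss curves.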

## 6. Evaluation
- `predict_proba(X)`: return sigmoid outputs \(a_2\).
- `predict(X)`: threshold at **0.5** → \(\{0,1\}\).
- Report **accuracy** on train & test.
- (Optional) **Confusion matrix** \([TP, FP; FN, TN]\).
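
The evaluation helpers could be written as below. Passing the weights as a `params` tuple, treating label 1 as the positive class, and using `>=` at the threshold are assumptions of this sketch, and the random parameters are untrained and serve only to show shapes.

```python
import numpy as np

def relu(z):    return np.maximum(0.0, z)
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))

def predict_proba(X, params):
    W1, b1, W2, b2 = params
    return sigmoid(relu(X @ W1 + b1) @ W2 + b2)      # a2, shape (m, 1)

def predict(X, params, threshold=0.5):
    return (predict_proba(X, params) >= threshold).astype(int)

def accuracy(y_true, y_pred):
    return float(np.mean(np.ravel(y_true) == np.ravel(y_pred)))

def confusion_matrix(y_true, y_pred):
    """Layout [[TP, FP], [FN, TN]] with label 1 as the positive class."""
    t = np.ravel(y_true).astype(int)
    p = np.ravel(y_pred).astype(int)
    tp = int(np.sum((t == 1) & (p == 1)))
    fp = int(np.sum((t == 0) & (p == 1)))
    fn = int(np.sum((t == 1) & (p == 0)))
    tn = int(np.sum((t == 0) & (p == 0)))
    return np.array([[tp, fp], [fn, tn]])

# Hand-made labels to exercise the metrics
y_true = np.array([1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 1])
acc = accuracy(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred)
print(acc)
print(cm)

# Untrained random parameters (ASSUMPTION), only to show the output shapes
rng = np.random.default_rng(42)
params = (rng.normal(size=(3, 8)), np.zeros((1, 8)),
          rng.normal(size=(8, 1)), np.zeros((1, 1)))
X_demo = rng.random((5, 3))
proba = predict_proba(X_demo, params)
labels = predict(X_demo, params)
print(proba.shape, labels.shape)
```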

## 7. Inference Function
- `predict_text_color(R, G, B)`:
  - Scale to \([0,1]\).
  - Compute probability \(p=\hat{y}\).
  - Return `"hell"` if predicted **light**, `"dunkel"` if **dark** (match your verified label mapping).
  - Print \(p\) for transparency.
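
A sketch of the inference function follows. Two things here are assumptions rather than part of the plan: the extra `params` argument (the plan's signature takes only `R, G, B`) and the constant `LIGHT_LABEL = 1`, which must be replaced by whatever the label check on the data shows. The `toy_params` are hand-picked, not trained.

```python
import numpy as np

# ASSUMPTION: label 1 means "light font"; verify against the data and flip if needed
LIGHT_LABEL = 1

def relu(z):    return np.maximum(0.0, z)
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))

def predict_text_color(R, G, B, params, threshold=0.5):
    W1, b1, W2, b2 = params
    x = np.array([[R, G, B]], dtype=float) / 255.0           # scale to [0, 1]
    p = float(sigmoid(relu(x @ W1 + b1) @ W2 + b2)[0, 0])    # probability of label 1
    label = int(p >= threshold)
    print(f"P(label=1) = {p:.4f}")
    return "hell" if label == LIGHT_LABEL else "dunkel"

# Hand-picked toy parameters (one hidden unit) that favor label 1 on dark backgrounds
toy_params = (np.ones((3, 1)), np.zeros((1, 1)),
              np.array([[-4.0]]), np.array([[6.0]]))

on_white = predict_text_color(255, 255, 255, toy_params)
on_black = predict_text_color(0, 0, 0, toy_params)
print(on_white, on_black)
```

With these toy weights a white background yields `"dunkel"` and a black background yields `"hell"`; a trained network would be passed in the same way.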

---

### Notes & Tips
- If loss plateaus or oscillates, try: lower \(\eta\), use mini-batches (e.g., 32), or increase \(n_h\).
- Keep a fixed random seed for reproducibility.
- Validate the **label mapping** once and document it at the top of your code.


Prepare the data

In [1]:
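import pandas as pd  # imported here as well, so this cell runs even if the import cell has not been executed
df = pd.read_csv("data/data_a2_mc1_vta_hs25.csv", sep=";")
# Relative luminance (Rec. 709 weights) as a quick proxy for background brightness
lum = 0.2126*df.RED + 0.7152*df.GREEN + 0.0722*df.BLUE
# Mean luminance per label value, to see which label goes with bright vs. dark backgrounds
print(lum.groupby(df.LIGHT_OR_DARK_FONT_IND).mean())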

In [8]:
# Hyperparameters (change as needed)
n_in  = 3        # R,G,B
n_h   = 8        # number of hidden neurons (tunable)
n_out = 1        # single probability output
rng = np.random.default_rng(42)

# He initialization (for ReLU)
W1 = rng.normal(0, np.sqrt(2.0/n_in), size=(n_in, n_h))
b1 = np.zeros((1, n_h))
W2 = rng.normal(0, np.sqrt(2.0/n_h),  size=(n_h, n_out))
b2 = np.zeros((1, n_out))

# Activations
def relu(z):      return np.maximum(0.0, z)
def relu_grad(z): return (z > 0).astype(z.dtype)
def sigmoid(z):   return 1.0 / (1.0 + np.exp(-z))

# (Optional) single forward pass for quickly trying out the architecture
def forward_pass(X):
    # X: (N,3), scaled (÷255)
    z1 = X @ W1 + b1        # (N, n_h)
    a1 = relu(z1)           # (N, n_h)
    z2 = a1 @ W2 + b2       # (N, 1)
    a2 = sigmoid(z2)        # (N, 1)  → read as something like the light-font probability
    return z1, a1, z2, a2

# Quick shape check (sanity check)
_dummy = np.zeros((5, 3))   # 5 samples, 3 features
z1,a1,z2,a2 = forward_pass(_dummy)
print(z1.shape, a1.shape, z2.shape, a2.shape)  # (5, n_h) (5, n_h) (5, 1) (5, 1)

(5, 8) (5, 8) (5, 1) (5, 1)
