In [3]:
import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

In [5]:
# Create a small made-up dataset for illustration
data = {
    "Height": [160, 165, 170, 175, 180, 185],   # in cm
    "Weight": [55, 65, 70, 75, 85, 90],        # in kg
    "Age":    [25, 30, 35, 40, 45, 50],        # in years
    "BP":     [120, 125, 130, 135, 140, 150],  # blood pressure
    "Diabetes": ["No", "No", "Yes", "No", "Yes", "Yes"]
}

df = pd.DataFrame(data)
print("Dataset:\n", df)


Dataset:
    Height  Weight  Age   BP Diabetes
0     160      55   25  120       No
1     165      65   30  125       No
2     170      70   35  130      Yes
3     175      75   40  135       No
4     180      85   45  140      Yes
5     185      90   50  150      Yes


In [7]:

# Extract only feature columns (X)
X = df[["Height", "Weight", "Age", "BP"]]

# Compute RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
gamma = 0.001  # controls how fast similarity decays with distance (you can tune this)
rbf_matrix = rbf_kernel(X, X, gamma=gamma)

print(f"\nRBF Kernel Matrix (gamma={gamma}):\n", np.round(rbf_matrix, 4))



RBF Kernel Matrix (gamma=0.001):
 [[1.     0.8395 0.5916 0.3413 0.1225 0.0342]
 [0.8395 1.     0.9048 0.6703 0.3413 0.1287]
 [0.5916 0.9048 1.     0.9048 0.5916 0.2865]
 [0.3413 0.6703 0.9048 1.     0.8395 0.522 ]
 [0.1225 0.3413 0.5916 0.8395 1.     0.8395]
 [0.0342 0.1287 0.2865 0.522  0.8395 1.    ]]



---

## 🔎 Steps SVM Takes After Kernel Matrix Computation

### 1. **Formulate Optimization Problem**

SVM looks for the **maximum-margin separating hyperplane** in the transformed feature space that the kernel creates implicitly.
In its dual form, the soft-margin optimization problem is:

$$
\min_{\alpha} \; \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_i \alpha_i
$$

subject to:

$$
\sum_i \alpha_i y_i = 0, \quad 0 \leq \alpha_i \leq C
$$

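* Here, $\alpha_i$ are the **Lagrange multipliers** (dual variables), one per training point.
* $y_i$ are the class labels ($+1$ or $-1$); for this dataset, Diabetes "Yes" and "No" could be encoded as $+1$ and $-1$.
* $C$ is the regularization parameter; it caps each $\alpha_i$.
* $K(x_i, x_j)$ is the kernel matrix entry for points $i$ and $j$.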
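As a rough illustration of the objective and constraints above, the sketch below evaluates them for a candidate $\alpha$ on the example dataset. The $\pm 1$ encoding of Diabetes, the choice `C = 1.0`, and the helper names `dual_objective` and `is_feasible` are assumptions made for this example, not part of any library.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# Same feature rows as the dataset above (Height, Weight, Age, BP)
X = np.array([
    [160, 55, 25, 120],
    [165, 65, 30, 125],
    [170, 70, 35, 130],
    [175, 75, 40, 135],
    [180, 85, 45, 140],
    [185, 90, 50, 150],
], dtype=float)

# Assumed encoding: Diabetes "No" -> -1, "Yes" -> +1
y = np.array([-1, -1, 1, -1, 1, 1], dtype=float)

K = rbf_kernel(X, X, gamma=0.001)

# Q[i, j] = y_i * y_j * K(x_i, x_j), the matrix inside the double sum
Q = np.outer(y, y) * K

def dual_objective(alpha):
    """0.5 * sum_ij alpha_i alpha_j y_i y_j K_ij - sum_i alpha_i."""
    return 0.5 * alpha @ Q @ alpha - alpha.sum()

def is_feasible(alpha, C=1.0, tol=1e-9):
    """Check sum_i alpha_i y_i = 0 and 0 <= alpha_i <= C."""
    balanced = abs(alpha @ y) <= tol
    in_box = np.all(alpha >= -tol) and np.all(alpha <= C + tol)
    return bool(balanced and in_box)

alpha_zero = np.zeros(len(y))        # trivially feasible, objective 0
alpha_trial = np.full(len(y), 0.5)   # feasible here because classes are 3 vs 3

print("objective at alpha = 0:  ", dual_objective(alpha_zero))
print("objective at alpha = 0.5:", dual_objective(alpha_trial))
print("trial feasible:", is_feasible(alpha_trial))
```

The trial point is not the optimum; it only shows that a feasible $\alpha$ can already score lower than $\alpha = 0$, which is what the solver in the next step improves on.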

---

### 2. **Solve Quadratic Optimization**

Using algorithms like **SMO (Sequential Minimal Optimization)** or other quadratic programming methods, the SVM solves for the $\alpha_i$.

* After solving, only some of the $\alpha_i$ are greater than 0; the rest are 0.
* The data points with $\alpha_i > 0$ are called **support vectors**.
* They “support” the decision boundary: only they contribute to it.
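One way to try this step is to hand the kernel matrix to scikit-learn's `SVC` with `kernel="precomputed"`, which uses an SMO-type solver from libsvm internally. This is a minimal sketch: the $\pm 1$ label encoding and `C=1.0` are assumptions for the example.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

X = np.array([
    [160, 55, 25, 120],
    [165, 65, 30, 125],
    [170, 70, 35, 130],
    [175, 75, 40, 135],
    [180, 85, 45, 140],
    [185, 90, 50, 150],
], dtype=float)
y = np.array([-1, -1, 1, -1, 1, 1])  # assumed encoding: "No" -> -1, "Yes" -> +1

K = rbf_kernel(X, X, gamma=0.001)

# kernel="precomputed" tells SVC that the first argument to fit is K, not X
clf = SVC(kernel="precomputed", C=1.0)
clf.fit(K, y)

sv_idx = clf.support_                  # row indices of the support vectors
signed_alpha = clf.dual_coef_.ravel()  # alpha_i * y_i for each support vector
alpha = np.abs(signed_alpha)           # the alpha_i themselves

print("support vector indices:", sv_idx)
print("alpha_i * y_i:", np.round(signed_alpha, 4))
print("sum of alpha_i * y_i:", signed_alpha.sum())
```

Points whose index is missing from `support_` have $\alpha_i = 0$ and play no role in the decision function.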

---

### 3. **Build the Decision Function**

Once support vectors are found, the decision function is:

$$
f(x) = \text{sign}\Big( \sum_i \alpha_i y_i K(x_i, x) + b \Big)
$$

* $x_i$: support vectors (points with $\alpha_i = 0$ drop out of the sum)
* $K(x_i, x)$: kernel similarity between support vector $x_i$ and the new point $x$
* $b$: bias term (obtained when the optimization problem is solved)
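The formula above can be checked against a fitted model. The sketch below rebuilds the score $\sum_i \alpha_i y_i K(x_i, x) + b$ by hand from `dual_coef_`, `support_`, and `intercept_`, and compares it with `decision_function`. The label encoding and `C=1.0` are again assumptions for the example.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

X = np.array([
    [160, 55, 25, 120],
    [165, 65, 30, 125],
    [170, 70, 35, 130],
    [175, 75, 40, 135],
    [180, 85, 45, 140],
    [185, 90, 50, 150],
], dtype=float)
y = np.array([-1, -1, 1, -1, 1, 1])  # assumed encoding: "No" -> -1, "Yes" -> +1

K = rbf_kernel(X, X, gamma=0.001)
clf = SVC(kernel="precomputed", C=1.0).fit(K, y)

signed_alpha = clf.dual_coef_.ravel()  # alpha_i * y_i, support vectors only
b = clf.intercept_[0]

# Keep only the kernel columns that belong to support vectors,
# then apply sum_i alpha_i y_i K(x_i, x) + b to every training point
scores_manual = K[:, clf.support_] @ signed_alpha + b
scores_sklearn = clf.decision_function(K)

labels_manual = np.where(scores_manual > 0, 1, -1)  # the sign(...) step

print("manual scores: ", np.round(scores_manual, 4))
print("sklearn scores:", np.round(scores_sklearn, 4))
print("labels:", labels_manual)
```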

---

### 4. **Prediction on New Data**

When you feed a new point $x_{\text{new}}$, SVM:

1. Computes $K(x_i, x_{\text{new}})$ for each support vector.
2. Weighs them with $\alpha_i y_i$.
3. Adds bias $b$.
4. Applies sign → gives class label (+1 or –1).
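The four steps above can be sketched for a single unseen point. Here the model is fitted with `kernel="rbf"` so that `support_vectors_` holds the actual feature rows; the new patient `x_new`, the label encoding, and `C=1.0` are hypothetical values chosen for the example.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

X = np.array([
    [160, 55, 25, 120],
    [165, 65, 30, 125],
    [170, 70, 35, 130],
    [175, 75, 40, 135],
    [180, 85, 45, 140],
    [185, 90, 50, 150],
], dtype=float)
y = np.array([-1, -1, 1, -1, 1, 1])  # assumed encoding: "No" -> -1, "Yes" -> +1

gamma = 0.001
clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)

# Hypothetical new point: Height, Weight, Age, BP
x_new = np.array([[172.0, 72.0, 38.0, 132.0]])

# Step 1: kernel similarity to each support vector, shape (1, n_support)
k_new = rbf_kernel(x_new, clf.support_vectors_, gamma=gamma)

# Steps 2 and 3: weigh by alpha_i * y_i, then add the bias b
score_manual = (k_new @ clf.dual_coef_.ravel() + clf.intercept_[0])[0]

# Step 4: apply sign to get the class label
label_manual = 1 if score_manual > 0 else -1

print("manual score:", score_manual)
print("manual label:", label_manual)
print("sklearn label:", clf.predict(x_new)[0])
```

Note that only the support vectors are needed at prediction time, not the full training set.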

---

## ⚖️ Intuition

* **Linear Kernel** → hyperplane in original space.
* **RBF Kernel** → hyperplane in infinite-dimensional feature space → looks like **nonlinear curved boundaries** in the original space.
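To make the linear case concrete: with a linear kernel the sum over support vectors collapses into a single weight vector $w = \sum_i \alpha_i y_i x_i$, so the decision function is just $w \cdot x + b$ in the original feature space. The sketch below checks this on the example data (label encoding and `C=1.0` assumed); no such explicit $w$ exists for the RBF kernel, which is why `coef_` is only available for linear models.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([
    [160, 55, 25, 120],
    [165, 65, 30, 125],
    [170, 70, 35, 130],
    [175, 75, 40, 135],
    [180, 85, 45, 140],
    [185, 90, 50, 150],
], dtype=float)
y = np.array([-1, -1, 1, -1, 1, 1])  # assumed encoding: "No" -> -1, "Yes" -> +1

clf_lin = SVC(kernel="linear", C=1.0).fit(X, y)

# w = sum_i (alpha_i * y_i) * x_i over the support vectors, shape (1, 4)
w = clf_lin.dual_coef_ @ clf_lin.support_vectors_
b = clf_lin.intercept_[0]

# Hyperplane score computed directly in the original 4-D feature space
scores_plane = X @ w.ravel() + b

print("w:", np.round(w.ravel(), 4))
print("b:", round(float(b), 4))
print("hyperplane scores:", np.round(scores_plane, 4))
```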

---

✅ So the next step after building the **RBF kernel matrix** is:

* Plugging it into the **optimization problem**,
* Finding the **support vectors and weights**,
* Then using those to build the **decision function** for classification.

---
