# Real-World Use Case: Pulsar Star Detection

## 1. The Problem
Astronomers collect radio signals from space. Most candidate detections are noise, but a few come from pulsars (rapidly rotating neutron stars that emit periodic radio pulses). We need to classify each signal as "Pulsar" or "Noise" to aid discovery.

## 2. Why SVM?
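*   **Signal Separation**: We want a decision boundary between signal and noise in feature space. An SVM chooses the boundary with the largest margin between the two classes.
*   **Kernel Trick**: The boundary might be curved or complex. An SVM with an RBF kernel can model such nonlinear boundaries, provided its hyperparameters (`C`, `gamma`) are chosen sensibly.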
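
To make the kernel idea concrete, here is a minimal sketch of the RBF kernel, k(a, b) = exp(-gamma * ||a - b||^2), which scores two points as similar when they are close together. The helper name `rbf_similarity`, the sample points, and `gamma=0.5` are illustrative assumptions, not part of the pulsar example; the result is cross-checked against scikit-learn's `rbf_kernel`.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel


def rbf_similarity(a, b, gamma=0.5):
    """Illustrative helper: k(a, b) = exp(-gamma * ||a - b||^2)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))


origin = [0.0, 0.0]
x_far = [3.0, 4.0]  # Euclidean distance 5 from the origin

k_same = rbf_similarity(origin, origin)  # identical points: similarity 1
k_far = rbf_similarity(origin, x_far)    # distant points: similarity near 0

# Cross-check against scikit-learn's implementation with the same gamma
k_sklearn = float(rbf_kernel([origin], [x_far], gamma=0.5)[0, 0])

print(f"k(origin, origin) = {k_same:.4f}")
print(f"k(origin, x_far)  = {k_far:.2e}")
print(f"sklearn rbf_kernel = {k_sklearn:.2e}")
```

Because similarity decays with squared distance, a larger `gamma` makes each training point's influence more local, which allows a more curved boundary but raises the risk of overfitting.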

## 3. Data Simulation (HTRU2 Proxy)
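In HTRU2, features are statistical properties of the integrated pulse profile (mean, standard deviation, excess kurtosis, skewness). To keep the example small, the simulation below generates only two of them: the mean and the standard deviation.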
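
As a rough illustration of how such features could be derived, the sketch below computes the four statistics from a single synthetic profile. The profile shape (a narrow Gaussian pulse on a noisy baseline), the 256 phase bins, and the noise level are assumptions made for this example; they are not taken from HTRU2.

```python
import numpy as np
from scipy.stats import kurtosis, skew

rng = np.random.default_rng(0)

# Hypothetical integrated profile: one narrow pulse at phase 0.5 plus Gaussian noise
phase = np.linspace(0.0, 1.0, 256)
pulse = np.exp(-0.5 * ((phase - 0.5) / 0.02) ** 2)
profile = pulse + rng.normal(0.0, 0.05, phase.size)

features = {
    "mean": float(np.mean(profile)),
    "std": float(np.std(profile)),
    "excess_kurtosis": float(kurtosis(profile)),  # Fisher definition: normal -> 0
    "skewness": float(skew(profile)),
}

for name, value in features.items():
    print(f"{name:>16}: {value:.3f}")
```

A profile dominated by one sharp peak is strongly right-skewed and heavy-tailed, so its skewness and excess kurtosis come out positive, whereas a pure-noise profile would give values near zero.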

In [None]:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import confusion_matrix

# 1. Generate Data
np.random.seed(42)
n_noise, n_pulsar = 450, 50  # 500 samples in total, 10% pulsars
# Noise (Class 0)
mean_noise = np.random.normal(100, 20, n_noise)
std_noise = np.random.normal(40, 10, n_noise)
# Pulsar (Class 1) - distinct profile: lower mean and lower spread
mean_pulsar = np.random.normal(50, 10, n_pulsar)
std_pulsar = np.random.normal(20, 5, n_pulsar)

X_mean = np.concatenate([mean_noise, mean_pulsar])
X_std = np.concatenate([std_noise, std_pulsar])
y = np.concatenate([np.zeros(n_noise), np.ones(n_pulsar)])

X = np.column_stack([X_mean, X_std])
# stratify=y keeps the 90/10 class ratio in both the training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# 2. Pipeline (Standardize -> SVM RBF)
# Scale first: the RBF kernel depends on distances between points,
# so a feature with a larger numeric range would otherwise dominate
clf = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=10))

# 3. Train
clf.fit(X_train, y_train)

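# 4. Evaluate
# About 90% of samples are noise, so always predicting "Noise" already scores ~0.90 accuracy.
# Read the confusion matrix to see how many pulsars are actually recovered.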
print(f"Accuracy: {clf.score(X_test, y_test):.4f}")
print("Confusion Matrix:\n", confusion_matrix(y_test, clf.predict(X_test)))
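
Because pulsars make up only 10% of the samples, accuracy alone can look good even for a model that never finds one. The sketch below is a self-contained variant of the evaluation that adds a majority-class baseline and reports precision and recall for the pulsar class. It regenerates the data with `np.random.default_rng` and uses a stratified split, so its numbers will not match the cell above exactly; the baseline comparison is an addition for illustration.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same distributions as the simulation above, drawn from a separate generator
rng = np.random.default_rng(42)
noise = np.column_stack([rng.normal(100, 20, 450), rng.normal(40, 10, 450)])
pulsar = np.column_stack([rng.normal(50, 10, 50), rng.normal(20, 5, 50)])
X = np.vstack([noise, pulsar])
y = np.concatenate([np.zeros(450), np.ones(50)])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Baseline: always predict the majority class ("Noise")
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
baseline_acc = baseline.score(X_test, y_test)
baseline_recall = recall_score(y_test, baseline.predict(X_test), pos_label=1)

# Same pipeline as above: standardize, then RBF SVM
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10)).fit(X_train, y_train)
y_pred = clf.predict(X_test)
svm_acc = clf.score(X_test, y_test)
pulsar_precision = precision_score(y_test, y_pred, pos_label=1, zero_division=0)
pulsar_recall = recall_score(y_test, y_pred, pos_label=1)

print(f"Baseline accuracy: {baseline_acc:.2f} (pulsar recall {baseline_recall:.2f})")
print(f"SVM accuracy:      {svm_acc:.2f}")
print(f"Pulsar precision:  {pulsar_precision:.2f}")
print(f"Pulsar recall:     {pulsar_recall:.2f}")
```

Recall on the pulsar class is the number to watch for discovery work, since a missed pulsar is usually costlier than a false alarm that a human can later reject.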