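# 5.1.2 Object Classification

**WBS 5.1.2**: HOG Feature Extraction and SVM Classifier

This module covers traditional machine learning for computer vision, walking through the complete pipeline from feature engineering to model training.

## Learning Objectives
- Understand how HOG (Histogram of Oriented Gradients) feature extraction works
- Train an SVM (Support Vector Machine) classifier
- Implement a complete machine learning workflow
- Learn model evaluation and optimization techniques
- Apply the pipeline to a real object classification problem

## Prerequisites
- Image processing basics (Stage 2 & 3)
- Feature detection concepts (Stage 4)
- Machine learning basics in Python
- NumPy and Scikit-learn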

In [None]:
# Import required libraries
import cv2
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path
import pickle
import time
from collections import Counter

# Machine learning libraries
from sklearn.svm import SVC, LinearSVC
from sklearn.model_selection import train_test_split, GridSearchCV, cross_val_score, learning_curve
from sklearn.metrics import (
    classification_report, confusion_matrix, accuracy_score, 
    precision_recall_fscore_support, roc_curve, auc
)
from sklearn.preprocessing import StandardScaler, label_binarize
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from skimage.feature import hog
from skimage import exposure
import joblib

# Configure matplotlib for Chinese display
plt.rcParams['font.sans-serif'] = ['Arial Unicode MS', 'SimHei']
plt.rcParams['axes.unicode_minus'] = False
plt.rcParams['figure.figsize'] = (14, 8)

# Disable warnings for cleaner output
import warnings
warnings.filterwarnings('ignore')

print("\u2705 Libraries imported successfully")
print(f"OpenCV version: {cv2.__version__}")
print(f"NumPy version: {np.__version__}")

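## 1. HOG Feature Extraction Basics

### What Is HOG?

**Histogram of Oriented Gradients (HOG)** is a feature descriptor for object detection, introduced by Dalal and Triggs in 2005.

### How HOG Works

1. **Gradient computation**: compute the gradient magnitude and orientation at each pixel
2. **Cell division**: split the image into small cells
3. **Orientation histograms**: build a histogram of gradient orientations for each cell
4. **Block normalization**: normalize over blocks of neighboring cells
5. **Feature vector**: concatenate all block histograms into the final descriptor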
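
Steps 1 to 3 can be sketched in plain NumPy for a single cell. This is a simplified illustration, not scikit-image's implementation: it assumes unsigned orientations (0° to 180°) and hard-assigns each pixel to one bin, without the interpolation between neighboring bins that production implementations often apply. The helper name `cell_histogram` is hypothetical.

```python
import numpy as np

def cell_histogram(cell, orientations=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell."""
    cell = cell.astype(np.float64)
    # Step 1: gradients (for 2-D input, np.gradient returns d/drow then d/dcol)
    gy, gx = np.gradient(cell)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in degrees, folded into [0, 180)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    # Step 3: hard-assign each pixel to a bin, weighted by gradient magnitude
    bin_width = 180.0 / orientations
    bin_idx = np.minimum((angle // bin_width).astype(int), orientations - 1)
    hist = np.zeros(orientations)
    np.add.at(hist, bin_idx.ravel(), magnitude.ravel())
    return hist

# An 8x8 cell whose intensity increases from left to right:
# every gradient points along +x (angle 0 degrees) with magnitude 1
cell = np.tile(np.arange(8, dtype=np.float64), (8, 1))
hist = cell_histogram(cell)
print(hist)
```

For this ramp image, all 64 pixels fall into the first bin, so the histogram concentrates its entire mass there. Block normalization (step 4) would then rescale groups of such histograms.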

### Key HOG Parameters

- **orientations**: number of gradient orientation bins (typically 9)
- **pixels_per_cell**: cell size in pixels (e.g. 8x8)
- **cells_per_block**: number of cells per block (e.g. 2x2)
- **block_norm**: normalization method ('L1', 'L2', 'L2-Hys')
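
These parameters determine the length of the feature vector. The sketch below computes that length under the assumption that blocks slide one cell at a time, which is the block stride scikit-image's `hog` uses; the helper name `hog_feature_length` is hypothetical.

```python
def hog_feature_length(image_shape, orientations=9,
                       pixels_per_cell=(8, 8), cells_per_block=(2, 2)):
    """Length of the flattened HOG vector, assuming a block stride of one cell."""
    rows, cols = image_shape
    cells_r = rows // pixels_per_cell[0]
    cells_c = cols // pixels_per_cell[1]
    # Number of block positions along each axis
    blocks_r = cells_r - cells_per_block[0] + 1
    blocks_c = cells_c - cells_per_block[1] + 1
    values_per_block = cells_per_block[0] * cells_per_block[1] * orientations
    return blocks_r * blocks_c * values_per_block

print(hog_feature_length((128, 64)))                               # 15 * 7 * 36 = 3780
print(hog_feature_length((128, 128)))                              # 15 * 15 * 36 = 8100
print(hog_feature_length((128, 128), pixels_per_cell=(16, 16)))    # 7 * 7 * 36 = 1764
print(hog_feature_length((128, 128), cells_per_block=(3, 3)))      # 14 * 14 * 81 = 15876
```

Note that `image_shape` is (rows, columns), so a 64x128 (width x height) window is passed as `(128, 64)`.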

In [None]:
# Load and prepare a sample image for HOG demonstration
def load_sample_image():
    """
    Load sample image from dataset
    
    Returns:
        image: grayscale image
    """
    dataset_path = Path('../assets/datasets/dlib_ObjectCategories10')
    
    # Try to load from dataset
    if dataset_path.exists():
        categories = [d for d in dataset_path.iterdir() if d.is_dir()]
        if categories:
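            images = sorted(categories[0].glob('*.jpg'))
            if images:
                img = cv2.imread(str(images[0]))
                # cv2.imread returns None for unreadable files; fall through to the synthetic image
                if img is not None:
                    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                    return cv2.resize(img_gray, (128, 128))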
    
    # Fallback: create synthetic image
    print("Dataset not found, creating synthetic image...")
    img = np.zeros((128, 128), dtype=np.uint8)
    cv2.rectangle(img, (30, 30), (98, 98), 255, 2)
    cv2.circle(img, (64, 64), 20, 200, -1)
    return img

# Load sample image
sample_img = load_sample_image()

# Extract HOG features with visualization
features, hog_image = hog(
    sample_img,
    orientations=9,
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
    block_norm='L2-Hys',
    visualize=True,
    feature_vector=True
)

# Rescale HOG image for better visualization
hog_image_rescaled = exposure.rescale_intensity(hog_image, in_range=(0, 10))

# Display original image and HOG visualization
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

axes[0].imshow(sample_img, cmap='gray')
axes[0].set_title('Original Image (128x128)', fontsize=12)
axes[0].axis('off')

axes[1].imshow(hog_image_rescaled, cmap='hot')
axes[1].set_title('HOG Feature Visualization', fontsize=12)
axes[1].axis('off')

plt.tight_layout()
plt.show()

print(f"HOG feature vector dimension: {features.shape[0]}")
print(f"Feature vector range: [{features.min():.4f}, {features.max():.4f}]")

### Effect of HOG Parameters

In [None]:
# Compare different HOG parameter configurations
configs = [
    {'orientations': 9, 'pixels_per_cell': (8, 8), 'cells_per_block': (2, 2)},
    {'orientations': 9, 'pixels_per_cell': (16, 16), 'cells_per_block': (2, 2)},
    {'orientations': 12, 'pixels_per_cell': (8, 8), 'cells_per_block': (2, 2)},
    {'orientations': 9, 'pixels_per_cell': (8, 8), 'cells_per_block': (3, 3)},
]

fig, axes = plt.subplots(2, 2, figsize=(14, 10))
axes = axes.ravel()

for idx, config in enumerate(configs):
    features, hog_img = hog(
        sample_img,
        visualize=True,
        feature_vector=True,
        block_norm='L2-Hys',
        **config
    )
    
    hog_img_rescaled = exposure.rescale_intensity(hog_img, in_range=(0, 10))
    
    axes[idx].imshow(hog_img_rescaled, cmap='hot')
    axes[idx].set_title(
        f"Orient={config['orientations']}, "
        f"Cell={config['pixels_per_cell']}, "
        f"Block={config['cells_per_block']}\n"
        f"Dim={features.shape[0]}",
        fontsize=9
    )
    axes[idx].axis('off')

plt.tight_layout()
plt.show()

print("\nParameter impact summary:")
print("- More orientations → Higher angular resolution, larger feature vector")
print("- Larger cells → Lower spatial resolution, smaller feature vector")
print("- Larger blocks → More context, better normalization, larger overlap")

## 2. Dataset Preparation and Loading

### Dataset Structure

We use the dlib ObjectCategories10 dataset:
```
dlib_ObjectCategories10/
├── accordion/
│   ├── image_0001.jpg
│   ├── image_0002.jpg
│   └── ...
├── camera/
│   ├── image_0001.jpg
│   └── ...
└── .../
```

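### Data Loading Strategy

1. **Image reading**: read images with OpenCV
2. **Preprocessing**: convert to grayscale and resize
3. **Label encoding**: map class names to integers
4. **Data splitting**: split into training and test sets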

In [None]:
def load_dataset(dataset_path, img_size=(64, 128), max_samples_per_class=None):
    """
    Load image dataset from directory structure
    
    Parameters:
    -----------
    dataset_path : str or Path
        Path to dataset root directory
    img_size : tuple
        Target image size (width, height)
    max_samples_per_class : int, optional
        Maximum samples to load per class (for quick testing)
        
    Returns:
    --------
    images : list
        List of processed images
    labels : list
        List of corresponding labels
    class_names : list
        List of class names
    """
    dataset_path = Path(dataset_path)
    
    if not dataset_path.exists():
        raise FileNotFoundError(f"Dataset not found at {dataset_path}")
    
    images = []
    labels = []
    class_names = []
    
    # Get all category directories
    categories = sorted([d for d in dataset_path.iterdir() if d.is_dir()])
    
    print(f"Found {len(categories)} categories:")
    
    for label_idx, category_dir in enumerate(categories):
        class_name = category_dir.name
        class_names.append(class_name)
        
        # Get all images in category (sorted so the order is reproducible across runs)
        image_files = sorted(list(category_dir.glob('*.jpg')) + list(category_dir.glob('*.png')))
        
        # Limit samples if specified
        if max_samples_per_class:
            image_files = image_files[:max_samples_per_class]
        
        print(f"  - {class_name}: {len(image_files)} images")
        
        for img_path in image_files:
            # Read image
            img = cv2.imread(str(img_path))
            
            if img is None:
                continue
            
            # Convert to grayscale
            img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            
            # Resize to standard size
            img_resized = cv2.resize(img_gray, img_size)
            
            images.append(img_resized)
            labels.append(label_idx)
    
    print(f"\nTotal samples loaded: {len(images)}")
    
    return images, labels, class_names


# Load dataset
DATASET_PATH = '../assets/datasets/dlib_ObjectCategories10'
IMG_SIZE = (64, 128)  # Standard HOG size (width, height)

print("Loading dataset...")
images, labels, class_names = load_dataset(
    DATASET_PATH, 
    img_size=IMG_SIZE,
    max_samples_per_class=None  # Set to 20 for quick testing
)

print(f"\nDataset statistics:")
print(f"  Number of classes: {len(class_names)}")
print(f"  Class names: {', '.join(class_names)}")
print(f"  Image shape: {images[0].shape}")
print(f"  Total samples: {len(images)}")

# Check class distribution
label_counts = Counter(labels)
print("\nClass distribution:")
for idx, class_name in enumerate(class_names):
    print(f"  {class_name}: {label_counts[idx]} samples")

### Data Visualization

In [None]:
# Visualize sample images from each class
def visualize_samples(images, labels, class_names, samples_per_class=5):
    """
    Visualize sample images from each class
    """
    n_classes = len(class_names)
    
    # squeeze=False keeps axes 2-D even with a single class or a single column
    fig, axes = plt.subplots(n_classes, samples_per_class,
                             figsize=(15, 3 * n_classes), squeeze=False)
    
    for class_idx, class_name in enumerate(class_names):
        # Get indices of this class
        class_indices = [i for i, label in enumerate(labels) if label == class_idx]
        
        # Randomly sample
        sample_indices = np.random.choice(
            class_indices, 
            min(samples_per_class, len(class_indices)), 
            replace=False
        )
        
        for col_idx, img_idx in enumerate(sample_indices):
            ax = axes[class_idx, col_idx]
            ax.imshow(images[img_idx], cmap='gray')
            # Hide ticks instead of calling axis('off'), which would also hide the row label
            ax.set_xticks([])
            ax.set_yticks([])

            if col_idx == 0:
                ax.set_ylabel(
                    class_name,
                    fontsize=12,
                    rotation=0,
                    labelpad=40,
                    va='center'
                )
    
    plt.tight_layout()
    plt.show()

# Visualize samples
visualize_samples(images, labels, class_names, samples_per_class=5)

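## 3. HOG Feature Extraction Pipeline

### Batch Feature Extraction

Convert every image into a HOG feature vector and stack the vectors into a feature matrix for training.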

In [None]:
def extract_hog_features(images, orientations=9, pixels_per_cell=(8, 8), 
                         cells_per_block=(2, 2), block_norm='L2-Hys'):
    """
    Extract HOG features from multiple images
    
    Parameters:
    -----------
    images : list
        List of grayscale images
    orientations : int
        Number of orientation bins
    pixels_per_cell : tuple
        Size of a cell (in pixels)
    cells_per_block : tuple
        Number of cells in each block
    block_norm : str
        Block normalization method
        
    Returns:
    --------
    features : numpy.ndarray
        Feature matrix (n_samples, n_features)
    """
    features_list = []
    
    print(f"Extracting HOG features from {len(images)} images...")
    start_time = time.time()
    
    for idx, img in enumerate(images):
        # Extract HOG features
        features = hog(
            img,
            orientations=orientations,
            pixels_per_cell=pixels_per_cell,
            cells_per_block=cells_per_block,
            block_norm=block_norm,
            visualize=False,
            feature_vector=True
        )
        
        features_list.append(features)
        
        # Progress indicator
        if (idx + 1) % 20 == 0:
            print(f"  Processed {idx + 1}/{len(images)} images...", end='\r')
    
    elapsed = time.time() - start_time
    print(f"\n  Completed in {elapsed:.2f}s ({elapsed/len(images)*1000:.2f}ms per image)")
    
    features_array = np.array(features_list)
    print(f"  Feature matrix shape: {features_array.shape}")
    
    return features_array


# Extract HOG features from all images
X = extract_hog_features(
    images,
    orientations=9,
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
    block_norm='L2-Hys'
)

y = np.array(labels)

print(f"\nFeature extraction complete:")
print(f"  X shape: {X.shape} (samples × features)")
print(f"  y shape: {y.shape}")
print(f"  Feature range: [{X.min():.4f}, {X.max():.4f}]")

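### Data Preprocessing

**Standardization**: SVMs are sensitive to feature scale, so the features should be standardized.

$$z = \frac{x - \mu}{\sigma}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation of each feature, both estimated on the training set.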
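
A small worked example of the formula, using made-up numbers. It is a sketch of what `StandardScaler` does per feature (population standard deviation, statistics taken from the training data only), not a replacement for it.

```python
import numpy as np

# Toy feature matrices: rows are samples, columns are features (values are made up)
X_train_toy = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test_toy = np.array([[2.0, 40.0]])

# Statistics come from the training set only, so no test information leaks in
mu = X_train_toy.mean(axis=0)
sigma = X_train_toy.std(axis=0)  # population std (ddof=0)

X_train_z = (X_train_toy - mu) / sigma
X_test_z = (X_test_toy - mu) / sigma

print(X_train_z.mean(axis=0), X_train_z.std(axis=0))
print(X_test_z)
```

After the transform, each training column has mean 0 and standard deviation 1. The test row is scaled with the same `mu` and `sigma`, so its values are not guaranteed to fall in any particular range.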

In [None]:
# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, 
    test_size=0.2, 
    random_state=42,
    stratify=y  # Maintain class distribution
)

print("Dataset split:")
print(f"  Training set: {X_train.shape[0]} samples")
print(f"  Testing set: {X_test.shape[0]} samples")

# Check class distribution in splits
print("\nClass distribution:")
print(f"  Training: {Counter(y_train)}")
print(f"  Testing: {Counter(y_test)}")

# Feature standardization
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("\nFeature standardization:")
print(f"  Before: mean={X_train.mean():.4f}, std={X_train.std():.4f}")
print(f"  After: mean={X_train_scaled.mean():.4f}, std={X_train_scaled.std():.4f}")

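## 4. Training the SVM Classifier

### SVM Basics

A **Support Vector Machine (SVM)** is a supervised learning algorithm that looks for the maximum-margin hyperplane separating the classes.

### SVM Kernels

1. **Linear**: $K(x, y) = x^T y$
   - Suited to linearly separable problems
   - Fast to train

2. **RBF (Radial Basis Function)**: $K(x, y) = e^{-\gamma ||x-y||^2}$
   - The most commonly used kernel
   - Suited to nonlinear problems

3. **Polynomial**: $K(x, y) = (\gamma x^T y + r)^d$
   - Suited to certain nonlinear problems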
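
The three kernel formulas can be evaluated directly for a pair of vectors. The functions below are an illustrative sketch with hypothetical names, not scikit-learn's internals.

```python
import numpy as np

def linear_kernel(x, y):
    return float(x @ y)

def rbf_kernel(x, y, gamma):
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))

def poly_kernel(x, y, gamma, r, d):
    return float((gamma * (x @ y) + r) ** d)

x = np.array([1.0, 2.0])
y = np.array([2.0, 0.0])

print(linear_kernel(x, y))                        # 1*2 + 2*0 = 2.0
print(rbf_kernel(x, y, gamma=0.1))                # exp(-0.1 * 5)
print(rbf_kernel(x, y, gamma=1.0))                # exp(-1.0 * 5), much smaller
print(poly_kernel(x, y, gamma=1.0, r=1.0, d=2))   # (2 + 1)^2 = 9.0
```

Comparing the two RBF values shows that a larger `gamma` makes similarity decay faster with distance.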

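### Key Parameters

- **C**: penalty parameter controlling how strongly misclassifications are penalized
  - Large C → low bias, high variance (risk of overfitting)
  - Small C → high bias, low variance (risk of underfitting)

- **gamma**: RBF kernel coefficient
  - Large gamma → each sample has a short reach, giving complex boundaries
  - Small gamma → each sample has a long reach, giving smooth boundaries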

In [None]:
# Train baseline SVM with default parameters
print("Training baseline SVM classifier...")

# Use LinearSVC for faster training on large datasets
baseline_svm = LinearSVC(random_state=42, max_iter=1000)

start_time = time.time()
baseline_svm.fit(X_train_scaled, y_train)
train_time = time.time() - start_time

# Evaluate on training and testing sets
train_acc = baseline_svm.score(X_train_scaled, y_train)
test_acc = baseline_svm.score(X_test_scaled, y_test)

print(f"\nBaseline SVM Results:")
print(f"  Training time: {train_time:.2f}s")
print(f"  Training accuracy: {train_acc:.4f}")
print(f"  Testing accuracy: {test_acc:.4f}")

# Make predictions
y_pred = baseline_svm.predict(X_test_scaled)

# Detailed classification report
print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=class_names))

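### Confusion Matrix

The confusion matrix shows in detail how the model performs on each class: rows are true labels and columns are predicted labels.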
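
Per-class precision and recall can be read off a confusion matrix. The sketch below uses a made-up 3-class matrix in scikit-learn's convention (rows are true classes, columns are predicted classes).

```python
import numpy as np

# Hypothetical confusion matrix: rows = true class, columns = predicted class
cm_toy = np.array([[9, 1, 0],
                   [2, 7, 1],
                   [0, 2, 8]])

true_positives = np.diag(cm_toy)
recall = true_positives / cm_toy.sum(axis=1)      # divide by row sums (true counts)
precision = true_positives / cm_toy.sum(axis=0)   # divide by column sums (predicted counts)
accuracy = true_positives.sum() / cm_toy.sum()

print("recall:", recall)
print("precision:", precision)
print("accuracy:", accuracy)
```

Row-normalizing the matrix, as `plot_confusion_matrix(..., normalize=True)` does, puts the per-class recall on the diagonal.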

In [None]:
def plot_confusion_matrix(y_true, y_pred, class_names, normalize=False):
    """
    Plot confusion matrix
    
    Parameters:
    -----------
    y_true : array-like
        True labels
    y_pred : array-like
        Predicted labels
    class_names : list
        List of class names
    normalize : bool
        Whether to normalize the matrix
    """
    cm = confusion_matrix(y_true, y_pred)
    
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
    
    fig, ax = plt.subplots(figsize=(10, 8))
    im = ax.imshow(cm, interpolation='nearest', cmap='Blues')
    ax.figure.colorbar(im, ax=ax)
    
    # Configure ticks
    ax.set(xticks=np.arange(cm.shape[1]),
           yticks=np.arange(cm.shape[0]),
           xticklabels=class_names,
           yticklabels=class_names,
           ylabel='True Label',
           xlabel='Predicted Label')
    
    # Rotate x labels
    plt.setp(ax.get_xticklabels(), rotation=45, ha="right", rotation_mode="anchor")
    
    # Add text annotations
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, format(cm[i, j], fmt),
                   ha="center", va="center",
                   color="white" if cm[i, j] > thresh else "black")
    
    title = 'Normalized Confusion Matrix' if normalize else 'Confusion Matrix'
    ax.set_title(title, fontsize=14, pad=20)
    
    plt.tight_layout()
    plt.show()

# Plot confusion matrices
plot_confusion_matrix(y_test, y_pred, class_names, normalize=False)
plot_confusion_matrix(y_test, y_pred, class_names, normalize=True)

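## 5. Hyperparameter Tuning

### Grid Search

Grid search combined with cross-validation is used to find the best hyperparameter combination.

**GridSearchCV** tries every parameter combination and evaluates each one with k-fold cross-validation.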
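
The cost of an exhaustive search grows with the product of the grid sizes. The sketch below enumerates the 4 x 4 x 1 grid used in this notebook and counts the model fits needed with 5-fold cross-validation; it only illustrates the bookkeeping and does not train anything.

```python
from itertools import product

param_grid_demo = {
    'C': [0.1, 1, 10, 100],
    'gamma': [0.001, 0.01, 0.1, 1],
    'kernel': ['rbf'],
}
n_folds = 5

# Cartesian product of all parameter values, one dict per candidate
combinations = [
    dict(zip(param_grid_demo, values))
    for values in product(*param_grid_demo.values())
]
n_fits = len(combinations) * n_folds

print(len(combinations), "candidates,", n_fits, "cross-validation fits")
print(combinations[0])
```

With `refit=True`, the default in `GridSearchCV`, one additional fit on the full training set follows for the best candidate.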

In [None]:
# Define parameter grid for SVM with RBF kernel
print("Starting hyperparameter tuning with GridSearchCV...")
print("This may take several minutes...\n")

param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': [0.001, 0.01, 0.1, 1],
    'kernel': ['rbf']
}

# Create SVM classifier
svm = SVC(random_state=42)

# Grid search with 5-fold cross-validation
grid_search = GridSearchCV(
    svm, 
    param_grid, 
    cv=5,
    scoring='accuracy',
    n_jobs=-1,  # Use all CPU cores
    verbose=1
)

start_time = time.time()
grid_search.fit(X_train_scaled, y_train)
tuning_time = time.time() - start_time

print(f"\nHyperparameter tuning completed in {tuning_time:.2f}s")
print(f"\nBest parameters: {grid_search.best_params_}")
print(f"Best cross-validation score: {grid_search.best_score_:.4f}")

# Evaluate best model on test set
best_svm = grid_search.best_estimator_
test_acc_tuned = best_svm.score(X_test_scaled, y_test)
print(f"Test accuracy (tuned): {test_acc_tuned:.4f}")

# Compare with baseline
print(f"\nImprovement over baseline: {(test_acc_tuned - test_acc):.4f}")

### Visualizing the Grid Search Results

In [None]:
# Visualize grid search results
results = grid_search.cv_results_

# Extract C and gamma values
C_values = [params['C'] for params in results['params']]
gamma_values = [params['gamma'] for params in results['params']]
mean_scores = results['mean_test_score']

# Create pivot table for heatmap
C_unique = sorted(set(C_values))
gamma_unique = sorted(set(gamma_values))

score_matrix = np.zeros((len(gamma_unique), len(C_unique)))

for c, g, score in zip(C_values, gamma_values, mean_scores):
    c_idx = C_unique.index(c)
    g_idx = gamma_unique.index(g)
    score_matrix[g_idx, c_idx] = score

# Plot heatmap
fig, ax = plt.subplots(figsize=(10, 8))

im = ax.imshow(score_matrix, cmap='YlOrRd', aspect='auto')

ax.set_xticks(np.arange(len(C_unique)))
ax.set_yticks(np.arange(len(gamma_unique)))
ax.set_xticklabels(C_unique)
ax.set_yticklabels(gamma_unique)

ax.set_xlabel('C (Regularization)', fontsize=12)
ax.set_ylabel('Gamma (Kernel coefficient)', fontsize=12)
ax.set_title('Grid Search Results: Cross-Validation Accuracy', fontsize=14, pad=20)

# Add colorbar
cbar = plt.colorbar(im, ax=ax)
cbar.set_label('Accuracy', fontsize=12)

# Add text annotations
for i in range(len(gamma_unique)):
    for j in range(len(C_unique)):
        text = ax.text(j, i, f'{score_matrix[i, j]:.3f}',
                      ha="center", va="center", color="black", fontsize=9)

plt.tight_layout()
plt.show()

## 6. Model Evaluation and Analysis

### Cross-Validation

K-fold cross-validation gives a more reliable estimate of model performance than a single train/test split.
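
The splitting logic behind k-fold can be sketched in a few lines. This simplified version uses contiguous, unshuffled folds and ignores stratification; for classifiers, scikit-learn uses stratified folds when `cv` is an integer, so treat this as an illustration of the idea only. The helper name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) lists for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds get one extra sample
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print("validate on", val_idx, "train on", len(train_idx), "samples")
```

Every sample is used for validation exactly once and for training in the remaining k - 1 rounds.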

In [None]:
# Perform k-fold cross-validation
print("Performing 10-fold cross-validation...")

cv_scores = cross_val_score(
    best_svm, 
    X_train_scaled, 
    y_train, 
    cv=10,
    scoring='accuracy',
    n_jobs=-1
)

print(f"\nCross-validation results:")
print(f"  Individual fold scores: {cv_scores}")
print(f"  Mean accuracy: {cv_scores.mean():.4f}")
print(f"  Standard deviation: {cv_scores.std():.4f}")
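# Note: this is the spread of fold scores, not a confidence interval for the mean
print(f"  Approx. 95% range of fold scores (mean ± 1.96 std): [{cv_scores.mean() - 1.96*cv_scores.std():.4f}, "
      f"{cv_scores.mean() + 1.96*cv_scores.std():.4f}]")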

# Visualize CV scores
fig, ax = plt.subplots(figsize=(12, 6))

ax.bar(range(1, len(cv_scores) + 1), cv_scores, alpha=0.7, color='skyblue', edgecolor='black')
ax.axhline(y=cv_scores.mean(), color='red', linestyle='--', linewidth=2, label=f'Mean: {cv_scores.mean():.4f}')
ax.axhline(y=cv_scores.mean() + cv_scores.std(), color='orange', linestyle=':', linewidth=1, label=f'+1 std')
ax.axhline(y=cv_scores.mean() - cv_scores.std(), color='orange', linestyle=':', linewidth=1, label=f'-1 std')

ax.set_xlabel('Fold', fontsize=12)
ax.set_ylabel('Accuracy', fontsize=12)
ax.set_title('10-Fold Cross-Validation Results', fontsize=14, pad=20)
ax.set_ylim([0, 1])
ax.legend(fontsize=10)
ax.grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

### Learning Curve

A learning curve shows how training set size affects model performance, which helps diagnose bias and variance problems.

In [None]:
# Compute learning curve
print("Computing learning curve...")

train_sizes, train_scores, val_scores = learning_curve(
    best_svm,
    X_train_scaled,
    y_train,
    cv=5,
    n_jobs=-1,
    train_sizes=np.linspace(0.1, 1.0, 10),
    scoring='accuracy',
    random_state=42
)

# Calculate mean and std
train_mean = train_scores.mean(axis=1)
train_std = train_scores.std(axis=1)
val_mean = val_scores.mean(axis=1)
val_std = val_scores.std(axis=1)

# Plot learning curve
fig, ax = plt.subplots(figsize=(12, 7))

ax.plot(train_sizes, train_mean, 'o-', color='blue', label='Training score', linewidth=2)
ax.fill_between(train_sizes, train_mean - train_std, train_mean + train_std, alpha=0.2, color='blue')

ax.plot(train_sizes, val_mean, 'o-', color='red', label='Cross-validation score', linewidth=2)
ax.fill_between(train_sizes, val_mean - val_std, val_mean + val_std, alpha=0.2, color='red')

ax.set_xlabel('Training Set Size', fontsize=12)
ax.set_ylabel('Accuracy', fontsize=12)
ax.set_title('Learning Curve', fontsize=14, pad=20)
ax.legend(loc='lower right', fontsize=11)
ax.grid(alpha=0.3)
ax.set_ylim([0, 1.05])

plt.tight_layout()
plt.show()

# Diagnose bias/variance
final_train_score = train_mean[-1]
final_val_score = val_mean[-1]
gap = final_train_score - final_val_score

print("\nLearning curve analysis:")
print(f"  Final training score: {final_train_score:.4f}")
print(f"  Final validation score: {final_val_score:.4f}")
print(f"  Gap: {gap:.4f}")

if gap > 0.1:
    print("  \u26a0\ufe0f High variance detected - model may be overfitting")
    print("  Suggestions: regularization, more data, reduce model complexity")
elif final_val_score < 0.7:
    print("  \u26a0\ufe0f High bias detected - model may be underfitting")
    print("  Suggestions: more features, more complex model, reduce regularization")
else:
    print("  \u2705 Good bias-variance tradeoff")

### Error Analysis

In [None]:
# Analyze misclassified examples
y_pred_tuned = best_svm.predict(X_test_scaled)
misclassified_idx = np.where(y_pred_tuned != y_test)[0]

print(f"Total misclassified samples: {len(misclassified_idx)}")
print(f"Misclassification rate: {len(misclassified_idx) / len(y_test):.2%}")

# Visualize some misclassified examples
if len(misclassified_idx) > 0:
    n_examples = min(8, len(misclassified_idx))
    sample_idx = np.random.choice(misclassified_idx, n_examples, replace=False)
    
    # Get original test indices
    test_indices = np.arange(len(labels))
    _, test_indices_split = train_test_split(
        test_indices, test_size=0.2, random_state=42, stratify=y
    )
    
    fig, axes = plt.subplots(2, 4, figsize=(16, 8))
    axes = axes.ravel()
    
    for i, idx in enumerate(sample_idx):
        img_idx = test_indices_split[idx]
        
        axes[i].imshow(images[img_idx], cmap='gray')
        axes[i].set_title(
            f"True: {class_names[y_test[idx]]}\n"
            f"Pred: {class_names[y_pred_tuned[idx]]}",
            fontsize=10,
            color='red'
        )
        axes[i].axis('off')
    
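    # Hide any subplot slots left unused when fewer than 8 samples are misclassified
    for ax in axes[n_examples:]:
        ax.axis('off')

    plt.suptitle('Misclassified Examples', fontsize=14, y=1.02)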
    plt.tight_layout()
    plt.show()
else:
    print("\u2705 Perfect classification - no errors!")

## 7. Saving and Loading the Model

### Saving the Complete Pipeline

Save the preprocessor (scaler) together with the model so both can be reused later.

In [None]:
# Save model and scaler
MODEL_DIR = Path('../assets/models/object_classification')
MODEL_DIR.mkdir(parents=True, exist_ok=True)

model_path = MODEL_DIR / 'hog_svm_classifier.pkl'
scaler_path = MODEL_DIR / 'feature_scaler.pkl'
config_path = MODEL_DIR / 'model_config.pkl'

# Save model
joblib.dump(best_svm, model_path)
print(f"Model saved to: {model_path}")

# Save scaler
joblib.dump(scaler, scaler_path)
print(f"Scaler saved to: {scaler_path}")

# Save configuration
config = {
    'class_names': class_names,
    'img_size': IMG_SIZE,
    'hog_params': {
        'orientations': 9,
        'pixels_per_cell': (8, 8),
        'cells_per_block': (2, 2),
        'block_norm': 'L2-Hys'
    },
    'test_accuracy': test_acc_tuned,
    'best_params': grid_search.best_params_
}

joblib.dump(config, config_path)
print(f"Configuration saved to: {config_path}")

print(f"\nTotal model size: {(model_path.stat().st_size + scaler_path.stat().st_size) / 1024:.2f} KB")

### Loading and Testing the Model

In [None]:
# Load saved model
loaded_svm = joblib.load(model_path)
loaded_scaler = joblib.load(scaler_path)
loaded_config = joblib.load(config_path)

print("Model loaded successfully!")
print(f"\nModel configuration:")
print(f"  Classes: {loaded_config['class_names']}")
print(f"  Image size: {loaded_config['img_size']}")
print(f"  HOG params: {loaded_config['hog_params']}")
print(f"  Test accuracy: {loaded_config['test_accuracy']:.4f}")

# Verify model works
test_acc_loaded = loaded_svm.score(X_test_scaled, y_test)
print(f"\nVerification: loaded model accuracy = {test_acc_loaded:.4f}")
assert test_acc_loaded == test_acc_tuned, "Model mismatch!"
print("\u2705 Model verification passed!")

## 8. Real-Time Classification

### Complete Inference Pipeline

In [None]:
def classify_image(img_path, model, scaler, config):
    """
    Classify a single image using trained model
    
    Parameters:
    -----------
    img_path : str or Path
        Path to input image
    model : sklearn model
        Trained SVM classifier
    scaler : sklearn scaler
        Fitted StandardScaler
    config : dict
        Model configuration
        
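    Returns:
    --------
    pred_class : str
        Predicted class name
    confidence : numpy.ndarray or None
        Softmax over decision_function scores (not calibrated probabilities)
    img : numpy.ndarray
        Original BGR image as loaded
    img_resized : numpy.ndarray
        Preprocessed grayscale image passed to HOG
    """
    # Load and preprocess image
    img = cv2.imread(str(img_path))
    if img is None:
        raise ValueError(f"Could not read image: {img_path}")
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_resized = cv2.resize(img_gray, config['img_size'])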
    
    # Extract HOG features
    features = hog(
        img_resized,
        visualize=False,
        feature_vector=True,
        **config['hog_params']
    )
    
    # Standardize features
    features_scaled = scaler.transform(features.reshape(1, -1))
    
    # Predict
    pred_label = model.predict(features_scaled)[0]
    pred_class = config['class_names'][pred_label]
    
    # Get confidence if model supports decision_function
    confidence = None
    if hasattr(model, 'decision_function'):
        decision_scores = model.decision_function(features_scaled)[0]
        # Convert to probability-like scores
        exp_scores = np.exp(decision_scores - np.max(decision_scores))
        confidence = exp_scores / exp_scores.sum()
    
    return pred_class, confidence, img, img_resized


# Test classification on random images
print("Testing real-time classification...\n")

# Get some test images
test_image_paths = []
for class_dir in (Path(DATASET_PATH)).iterdir():
    if class_dir.is_dir():
        imgs = list(class_dir.glob('*.jpg'))[:1]
        test_image_paths.extend(imgs)

n_display = min(6, len(test_image_paths))
sample_paths = np.random.choice(test_image_paths, n_display, replace=False)

fig, axes = plt.subplots(2, 3, figsize=(15, 10))
axes = axes.ravel()

for idx, img_path in enumerate(sample_paths):
    pred_class, confidence, img_orig, img_processed = classify_image(
        img_path, loaded_svm, loaded_scaler, loaded_config
    )
    
    true_class = img_path.parent.name
    
    # Display original image
    axes[idx].imshow(cv2.cvtColor(img_orig, cv2.COLOR_BGR2RGB))
    
    # Title with prediction
    title = f"True: {true_class}\nPred: {pred_class}"
    color = 'green' if true_class == pred_class else 'red'
    axes[idx].set_title(title, fontsize=11, color=color, weight='bold')
    
    # Add confidence bar if available
    if confidence is not None:
        conf_text = f"Conf: {confidence.max():.2%}"
        axes[idx].text(0.5, 0.95, conf_text, 
                      transform=axes[idx].transAxes,
                      ha='center', va='top',
                      bbox=dict(boxstyle='round', facecolor='yellow', alpha=0.7),
                      fontsize=9)
    
    axes[idx].axis('off')

plt.suptitle('Real-time Classification Results', fontsize=14, y=0.98)
plt.tight_layout()
plt.show()
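
The confidence shown above comes from a softmax over the SVM's `decision_function` scores. The sketch below isolates that step with made-up scores. The result sums to 1 and preserves the ranking of the scores, but it is not a calibrated probability; if calibrated estimates are needed, scikit-learn's `CalibratedClassifierCV` is one option.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - np.max(scores)  # subtracting the max avoids overflow in exp
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Made-up decision scores for three classes
demo_scores = np.array([2.0, 0.5, -1.0])
demo_probs = softmax(demo_scores)

print(demo_probs, demo_probs.sum())
print(softmax(np.array([1000.0, 1000.0])))  # stable even for very large scores
```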

## 9. Advanced Techniques and Optimization

### Data Augmentation

In [None]:
def augment_image(img):
    """
    Apply random augmentation to image
    
    Parameters:
    -----------
    img : numpy.ndarray
        Input grayscale image
        
    Returns:
    --------
    augmented : numpy.ndarray
        Augmented image
    """
    # Random flip
    if np.random.random() > 0.5:
        img = cv2.flip(img, 1)
    
    # Random rotation (-15 to +15 degrees)
    angle = np.random.uniform(-15, 15)
    h, w = img.shape
    M = cv2.getRotationMatrix2D((w/2, h/2), angle, 1.0)
    img = cv2.warpAffine(img, M, (w, h), borderMode=cv2.BORDER_REFLECT)
    
    # Random brightness adjustment
    brightness = np.random.uniform(0.8, 1.2)
    img = np.clip(img * brightness, 0, 255).astype(np.uint8)
    
    # Random noise
    if np.random.random() > 0.7:
        noise = np.random.normal(0, 5, img.shape)
        img = np.clip(img + noise, 0, 255).astype(np.uint8)
    
    return img

# Demonstrate augmentation
sample_img = images[0]

fig, axes = plt.subplots(2, 4, figsize=(16, 8))
axes = axes.ravel()

axes[0].imshow(sample_img, cmap='gray')
axes[0].set_title('Original', fontsize=12)
axes[0].axis('off')

for i in range(1, 8):
    aug_img = augment_image(sample_img.copy())
    axes[i].imshow(aug_img, cmap='gray')
    axes[i].set_title(f'Augmented {i}', fontsize=12)
    axes[i].axis('off')

plt.suptitle('Data Augmentation Examples', fontsize=14, y=0.98)
plt.tight_layout()
plt.show()

print("Data augmentation techniques:")
print("  1. Horizontal flip (50% probability)")
print("  2. Random rotation (-15° to +15°)")
print("  3. Brightness adjustment (0.8x to 1.2x)")
print("  4. Gaussian noise (30% probability)")

### Ensemble Learning

Ensemble methods such as Bagging and Boosting can improve model performance. The code below demonstrates Bagging.
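
The two ingredients of Bagging, bootstrap resampling and vote aggregation, can be sketched without any classifier. The predictions below are made up, and the helper `majority_vote` is hypothetical; `BaggingClassifier` handles both steps internally.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(42)
n_samples = 10
n_estimators = 3

# Bootstrap: each base estimator trains on indices drawn with replacement
bootstrap_sets = [
    rng.choice(n_samples, size=n_samples, replace=True)
    for _ in range(n_estimators)
]

def majority_vote(predictions):
    """Return the most common label among the base estimators' predictions."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical predictions from three base classifiers for one test sample
vote = majority_vote([2, 0, 2])
print(vote)
```

Because each bootstrap set repeats some samples and omits others, the base estimators differ slightly, and voting averages out part of their individual variance.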

In [None]:
# Bagging Classifier
print("Training Bagging ensemble...")
bagging_clf = BaggingClassifier(
    estimator=LinearSVC(random_state=42),
    n_estimators=10,
    max_samples=0.8,
    max_features=0.8,
    random_state=42,
    n_jobs=-1
)

bagging_clf.fit(X_train_scaled, y_train)
bagging_acc = bagging_clf.score(X_test_scaled, y_test)
print(f"Bagging accuracy: {bagging_acc:.4f}")

# Compare with base model
print(f"\nModel comparison:")
print(f"  Baseline SVM: {test_acc:.4f}")
print(f"  Tuned SVM: {test_acc_tuned:.4f}")
print(f"  Bagging SVM: {bagging_acc:.4f}")
print(f"  Improvement: {(bagging_acc - test_acc):.4f}")

### Performance Optimization Summary

In [None]:
# Performance comparison
models = ['Baseline', 'Tuned', 'Ensemble']
accuracies = [test_acc, test_acc_tuned, bagging_acc]

fig, ax = plt.subplots(figsize=(10, 6))

bars = ax.bar(models, accuracies, color=['lightblue', 'lightgreen', 'lightcoral'], 
              edgecolor='black', linewidth=1.5)

# Add value labels on bars
for bar, acc in zip(bars, accuracies):
    height = bar.get_height()
    ax.text(bar.get_x() + bar.get_width()/2., height,
           f'{acc:.4f}',
           ha='center', va='bottom', fontsize=12, weight='bold')

ax.set_ylabel('Accuracy', fontsize=12)
ax.set_title('Model Performance Comparison', fontsize=14, pad=20)
ax.set_ylim([0, 1])
ax.grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

print("\nOptimization strategies summary:")
print("  1. Hyperparameter tuning → Improved by", f"{(test_acc_tuned - test_acc):.4f}")
print("  2. Ensemble learning → Improved by", f"{(bagging_acc - test_acc):.4f}")
print("  3. Data augmentation → Can further improve with more data")
print("  4. Feature engineering → Try different feature descriptors (SIFT, SURF, ORB)")

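## 10. Summary and Next Topics

### Key Takeaways

1. **HOG features**
   - A powerful traditional feature descriptor
   - Sensitive to shape and contour
   - Computationally efficient
   - Parameter choices affect feature dimension and performance

2. **SVM classifier**
   - Well suited to small and medium datasets
   - Kernel choice matters
   - Requires feature standardization
   - Hyperparameter tuning is critical

3. **Machine learning workflow**
   - Data preparation → feature extraction → model training → evaluation → optimization
   - Cross-validation makes estimates more reliable
   - Learning curves help diagnose problems
   - Error analysis guides improvements

4. **Practical techniques**
   - Data augmentation improves generalization
   - Ensemble learning can raise performance
   - Saving the model simplifies deployment
   - Packaging steps as a pipeline simplifies maintenance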

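### Comparison with Other Feature Methods

| Feature | Strengths | Weaknesses | Typical use |
|---------|------|------|----------|
| **HOG** | Shape-sensitive, fast, stable | Insensitive to texture, fixed window | Pedestrian detection, object classification |
| **SIFT** | Scale- and rotation-invariant | Slow to compute (its patent expired in 2020) | Image matching, panorama stitching |
| **SURF** | Faster than SIFT with similar performance | Patent-encumbered | Real-time applications |
| **ORB** | Very fast, patent-free | Slightly lower accuracy | Mobile devices, real-time tracking |
| **LBP** | Very fast, robust to illumination | Sensitive to rotation | Texture classification, face recognition |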

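### When to Use HOG+SVM

**Good fit**:
- Small and medium datasets (< 100,000 samples)
- Interpretability is required
- Compute resources are limited
- Fast deployment is needed
- Object shape is a strong cue

**Poor fit**:
- Large-scale datasets (deep learning does better)
- Complex scene understanding
- End-to-end learning is required
- Tasks that depend heavily on texture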

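### Further Learning Directions

1. **Deep learning methods**
   - CNN classifiers (LeNet, VGG, ResNet)
   - Transfer learning
   - Fine-tuning pretrained models

2. **Advanced features**
   - Fisher Vector
   - Bag of Visual Words
   - Deep features (intermediate CNN layers)

3. **Other applications**
   - Object detection (R-CNN, YOLO)
   - Semantic segmentation
   - Instance segmentation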

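### Suggested Projects

1. **Pedestrian detection system**
   - The classic HOG+SVM application
   - Combine with a sliding window
   - Multi-scale detection

2. **Vehicle classifier**
   - Multi-class classification
   - Real-time inference
   - Deployment to edge devices

3. **Medical image classification**
   - Incorporate domain knowledge
   - Data augmentation strategy
   - Handling class imbalance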
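
For the sliding-window idea in the pedestrian detection project, the window positions can be generated as below. This is a minimal sketch under assumed values: a 64x128 (width x height) window, a 16-pixel stride, and a hypothetical helper name `sliding_windows`. Each window would then be cropped, passed through HOG, and scored by the classifier.

```python
def sliding_windows(image_shape, window=(128, 64), stride=(16, 16)):
    """Yield (row, col) top-left corners of every window that fits in the image.

    image_shape and window are (rows, cols); stride is (row_step, col_step).
    """
    rows, cols = image_shape
    win_r, win_c = window
    for r in range(0, rows - win_r + 1, stride[0]):
        for c in range(0, cols - win_c + 1, stride[1]):
            yield r, c

# A 160-row by 96-column image admits 3 vertical x 3 horizontal positions
corners = list(sliding_windows((160, 96)))
print(len(corners), corners[0], corners[-1])
```

Multi-scale detection repeats the same scan over progressively resized copies of the image, so that a fixed-size window covers objects of different apparent sizes.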

### References

- **Papers**:
  - Dalal & Triggs (2005): "Histograms of Oriented Gradients for Human Detection"
  - Cortes & Vapnik (1995): "Support-Vector Networks"

- **Documentation**:
  - Scikit-learn SVM Guide: https://scikit-learn.org/stable/modules/svm.html
  - Scikit-image HOG: https://scikit-image.org/docs/stable/auto_examples/features_detection/plot_hog.html

- **Datasets**:
  - MNIST: handwritten digit classification
  - CIFAR-10: general object classification
  - Caltech 101: object recognition
  - INRIA Person: pedestrian detection

---

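## Next Steps

After completing this module, continue with:
- **5.1.3 Introduction to Deep Learning**: CNN basics and implementation
- **5.2.1 Object Detection**: YOLO/SSD in practice
- **5.2.2 Transfer Learning**: applying pretrained models

**Suggested exercises**:
1. Train a classifier on your own dataset
2. Implement sliding-window object detection
3. Compare the performance of different feature methods
4. Try a deep learning approach and compare the results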