# Module 7.2 – Wafer Map Pattern Recognition

## Interactive Tutorial: Classical + Deep Learning Approaches

This notebook demonstrates end-to-end wafer map pattern recognition for semiconductor manufacturing, combining classical computer vision features with deep learning for automated defect classification.

### Learning Objectives
- Understand common wafer defect patterns and their manufacturing implications
- Implement classical feature extraction (radial histograms, GLCM, HOG)
- Build and train a compact CNN for pattern classification
- Evaluate models using semiconductor-specific metrics (PWS, Estimated Loss)
- Apply explainability techniques (SHAP, Grad-CAM) for production deployment

In [None]:
# Import required libraries
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import json

# Add the module path for imports
sys.path.append(str(Path().resolve()))

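# Import our pattern recognition pipeline
# Note: run this notebook from the module-7 directory so the script can be found.
# The script name is not a valid module name, so it is executed rather than imported.
exec(Path('7.2-pattern-recognition-pipeline.py').read_text())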

# Set up plotting
plt.style.use('default')
sns.set_palette("husl")
%matplotlib inline

## 1. Generate and Visualize Synthetic Wafer Maps

First, let's generate synthetic wafer maps representing different defect patterns commonly seen in semiconductor manufacturing.

In [None]:
# Generate synthetic wafer maps
print("Generating synthetic wafer map dataset...")
data = generate_synthetic_wafer_maps(n_samples=300, map_size=64, seed=42)

wafer_maps = data['wafer_maps']
labels = data['labels']
pattern_names = data['pattern_names']

print(f"Generated {len(wafer_maps)} wafer maps with {len(pattern_names)} pattern types")
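# minlength keeps the counts aligned with pattern_names even if a class has no samples
print(f"Pattern distribution: {dict(zip(pattern_names, np.bincount(labels, minlength=len(pattern_names)).tolist()))}")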
print(f"Wafer map shape: {wafer_maps[0].shape}")
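
`generate_synthetic_wafer_maps` lives in the pipeline script and is not shown in this notebook. As a rough illustration of how one pattern class might be synthesized, the sketch below builds a ring pattern by raising the per-die failure probability inside an annulus. The function name `make_ring_wafer`, its parameters, and the binary pass/fail encoding are assumptions made for illustration, not the pipeline's actual implementation.

```python
import numpy as np

def make_ring_wafer(map_size=64, ring_radius=0.6, ring_width=0.08,
                    defect_rate=0.8, background_rate=0.02, seed=0):
    """Hypothetical generator: binary wafer map (1 = failing die) with a ring pattern."""
    rng = np.random.default_rng(seed)
    # Normalized distance of every die from the wafer center (wafer edge = 1.0)
    ys, xs = np.mgrid[0:map_size, 0:map_size]
    center = (map_size - 1) / 2.0
    r = np.hypot(xs - center, ys - center) / center
    on_wafer = r <= 1.0
    in_ring = np.abs(r - ring_radius) <= ring_width / 2
    # Per-die failure probability: high inside the ring, low elsewhere
    p_fail = np.where(in_ring, defect_rate, background_rate)
    wafer = (rng.random((map_size, map_size)) < p_fail).astype(np.uint8)
    wafer[~on_wafer] = 0  # positions outside the circular wafer hold no dies
    return wafer

ring = make_ring_wafer()
print(ring.shape, ring.mean())
```

Other patterns could follow the same recipe by swapping the `in_ring` mask for a center disk, an edge band, or a thin line for a scratch.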

In [None]:
# Visualize examples of each pattern type
fig, axes = plt.subplots(1, len(pattern_names), figsize=(20, 4))
fig.suptitle('Wafer Map Defect Patterns', fontsize=16, fontweight='bold')

for i, pattern_name in enumerate(pattern_names):
    # Find first example of this pattern
    pattern_idx = np.where(labels == i)[0][0]
    wafer_map = wafer_maps[pattern_idx]
    
    # Plot the wafer map
    im = axes[i].imshow(wafer_map, cmap='RdYlBu_r', interpolation='nearest')
    axes[i].set_title(f'{pattern_name}\n(Class {i})', fontweight='bold')
    axes[i].set_xticks([])
    axes[i].set_yticks([])
    
    # Add colorbar
    plt.colorbar(im, ax=axes[i], fraction=0.046, pad=0.04)

plt.tight_layout()
plt.show()

## 2. Classical Feature Extraction

Let's extract classical computer vision features and understand what they capture about different defect patterns.

In [None]:
# Initialize feature extractor
feature_extractor = ClassicalFeatureExtractor(n_radial_bins=10, n_angular_bins=8)

# Extract features for a few examples
example_indices = [np.where(labels == i)[0][0] for i in range(len(pattern_names))]
example_features = {}

for i, idx in enumerate(example_indices):
    wafer_map = wafer_maps[idx]
    pattern_name = pattern_names[i]
    
    # Extract individual feature types
    radial_hist = feature_extractor.extract_radial_histogram(wafer_map)
    angular_hist = feature_extractor.extract_angular_histogram(wafer_map)
    texture_features = feature_extractor.extract_texture_features(wafer_map)
    region_props = feature_extractor.extract_region_properties(wafer_map)
    
    example_features[pattern_name] = {
        'radial': radial_hist,
        'angular': angular_hist,
        'texture': texture_features,
        'region': region_props
    }

print("Classical features extracted for each pattern type")
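
The extractor methods used above are defined in the pipeline script. To make the radial histogram concrete, here is a minimal sketch of one way to compute it: assign each die to a concentric bin by its normalized distance from the wafer center, then take the fraction of defective dies per bin. The function name `radial_histogram` and the convention that any value above zero counts as a defect are assumptions; `ClassicalFeatureExtractor.extract_radial_histogram` may normalize differently.

```python
import numpy as np

def radial_histogram(wafer_map, n_bins=10):
    """Fraction of defective dies in each concentric bin, ordered center to edge."""
    h, w = wafer_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Normalized radius: 0 at the center, 1.0 at the wafer edge
    r = np.hypot((ys - cy) / cy, (xs - cx) / cx)
    on_wafer = r <= 1.0
    # Bin index 0..n_bins-1 for every position
    bin_idx = np.minimum((r * n_bins).astype(int), n_bins - 1)
    defects = (wafer_map > 0) & on_wafer
    counts = np.bincount(bin_idx[on_wafer], minlength=n_bins)
    hits = np.bincount(bin_idx[defects], minlength=n_bins)
    # Divide by the die count per bin so outer (larger) bins are not over-weighted
    return hits / np.maximum(counts, 1)

# Toy map: defects only in a small block at the center
demo = np.zeros((64, 64), dtype=np.uint8)
demo[28:36, 28:36] = 1
hist = radial_histogram(demo)
print(np.round(hist, 2))
```

For this toy map the density is concentrated in the innermost bins and is zero elsewhere, which is the signature a Center pattern would be expected to produce.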

In [None]:
# Visualize radial and angular histograms
fig, axes = plt.subplots(2, len(pattern_names), figsize=(20, 8))
fig.suptitle('Classical Feature Analysis by Pattern Type', fontsize=16, fontweight='bold')

for i, pattern_name in enumerate(pattern_names):
    features = example_features[pattern_name]
    
    # Radial histogram (top row)
    axes[0, i].bar(range(len(features['radial'])), features['radial'], alpha=0.7)
    axes[0, i].set_title(f'{pattern_name}\nRadial Density')
    axes[0, i].set_xlabel('Radial Bin')
    axes[0, i].set_ylabel('Defect Density')
    
    # Angular histogram (bottom row)
    axes[1, i].bar(range(len(features['angular'])), features['angular'], alpha=0.7, color='orange')
    axes[1, i].set_title('Angular Density')
    axes[1, i].set_xlabel('Angular Bin')
    axes[1, i].set_ylabel('Defect Density')

plt.tight_layout()
plt.show()

In [None]:
# Texture feature comparison
texture_names = ['Contrast', 'Dissimilarity', 'Homogeneity', 'Energy', 'Correlation']
texture_data = []

for pattern_name in pattern_names:
    for j, texture_name in enumerate(texture_names):
        texture_data.append({
            'Pattern': pattern_name,
            'Feature': texture_name,
            'Value': example_features[pattern_name]['texture'][j]
        })

texture_df = pd.DataFrame(texture_data)

# Create heatmap
pivot_texture = texture_df.pivot(index='Pattern', columns='Feature', values='Value')
plt.figure(figsize=(10, 6))
sns.heatmap(pivot_texture, annot=True, cmap='viridis', fmt='.3f')
plt.title('GLCM Texture Features by Pattern Type', fontweight='bold')
plt.tight_layout()
plt.show()
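
The texture values in the heatmap come from a gray-level co-occurrence matrix (GLCM), which counts how often pairs of gray levels occur at a fixed pixel offset. The sketch below computes four of the five properties for a single horizontal offset using their standard definitions. It is a simplified stand-in: the function name `glcm_features`, the single offset, and the two-level quantization are assumptions, the correlation property is omitted for brevity, and the pipeline's `extract_texture_features` may use different offsets or a library implementation.

```python
import numpy as np

def glcm_features(image, levels=2):
    """GLCM statistics for the horizontal offset (0, 1), symmetric and normalized.

    Assumes integer pixel values in the range 0..levels-1.
    """
    img = np.asarray(image, dtype=int)
    left, right = img[:, :-1].ravel(), img[:, 1:].ravel()
    glcm = np.zeros((levels, levels), dtype=float)
    np.add.at(glcm, (left, right), 1)   # count each (i, j) neighbor pair
    glcm = glcm + glcm.T                # count pairs in both directions
    glcm /= glcm.sum()                  # convert counts to probabilities
    i, j = np.indices(glcm.shape)
    return {
        'contrast': float(np.sum(glcm * (i - j) ** 2)),
        'dissimilarity': float(np.sum(glcm * np.abs(i - j))),
        'homogeneity': float(np.sum(glcm / (1.0 + (i - j) ** 2))),
        'energy': float(np.sqrt(np.sum(glcm ** 2))),
    }

flat = np.zeros((4, 4), dtype=int)              # uniform map
board = np.indices((4, 4)).sum(axis=0) % 2      # checkerboard
print(glcm_features(flat))
print(glcm_features(board))
```

The uniform map gives zero contrast and maximal homogeneity, while the checkerboard, where every horizontal neighbor differs, gives the opposite extreme.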

## 3. Train Classical Models

Now let's train classical machine learning models using the extracted features.

In [None]:
# Train classical SVM model
print("Training classical SVM model...")
classical_pipeline = PatternRecognitionPipeline(
    approach='classical',
    model='svm',
    C=1.0
)

# Train on a random subset for speed (seeded so the split is reproducible)
rng = np.random.default_rng(42)
train_indices = rng.choice(len(wafer_maps), size=200, replace=False)
X_train = wafer_maps[train_indices]
y_train = labels[train_indices]
wafer_ids_train = [f"W{i//20:03d}" for i in train_indices]

classical_pipeline.fit(X_train, y_train, wafer_ids_train, pattern_names)
print("Classical model training completed!")

In [None]:
# Evaluate classical model
test_indices = np.setdiff1d(np.arange(len(wafer_maps)), train_indices)[:50]  # Small test set
X_test = wafer_maps[test_indices]
y_test = labels[test_indices]

classical_metrics = classical_pipeline.evaluate(X_test, y_test)

print("Classical Model Performance:")
for metric, value in classical_metrics.items():
    print(f"  {metric}: {value:.3f}")

## 4. Train Deep Learning CNN

Let's compare with a deep learning approach using a compact CNN.

In [None]:
# Check if PyTorch is available
if HAS_TORCH:
    print("Training CNN model...")
    
    cnn_pipeline = PatternRecognitionPipeline(
        approach='deep_learning',
        model='cnn',
        epochs=10,
        batch_size=32,
        learning_rate=0.001
    )
    
    # Train CNN
    cnn_pipeline.fit(X_train, y_train, wafer_ids_train, pattern_names)
    
    # Evaluate CNN
    cnn_metrics = cnn_pipeline.evaluate(X_test, y_test)
    
    print("\nCNN Model Performance:")
    for metric, value in cnn_metrics.items():
        print(f"  {metric}: {value:.3f}")
        
else:
    print("PyTorch not available - skipping CNN training")
    cnn_metrics = None

## 5. Model Comparison and Analysis

Let's compare the performance of classical vs. deep learning approaches.

In [None]:
# Compare model performance
comparison_data = []

# Classical metrics
for metric, value in classical_metrics.items():
    comparison_data.append({'Model': 'Classical SVM', 'Metric': metric, 'Value': value})

# CNN metrics (if available)
if cnn_metrics:
    for metric, value in cnn_metrics.items():
        comparison_data.append({'Model': 'CNN', 'Metric': metric, 'Value': value})

comparison_df = pd.DataFrame(comparison_data)

# Create comparison plot
key_metrics = ['accuracy', 'f1_weighted', 'roc_auc_weighted', 'pws', 'estimated_loss']
plot_data = comparison_df[comparison_df['Metric'].isin(key_metrics)]

if not plot_data.empty:
    plt.figure(figsize=(12, 6))
    sns.barplot(data=plot_data, x='Metric', y='Value', hue='Model')
    plt.title('Model Performance Comparison', fontweight='bold')
    plt.ylabel('Score')
    plt.xticks(rotation=45)
    plt.legend()
    plt.tight_layout()
    plt.show()
else:
    print("Performance comparison plot not available")

## 6. Manufacturing-Specific Analysis

Let's analyze the results from a semiconductor manufacturing perspective.

In [None]:
# Get detailed predictions for analysis
classical_preds = classical_pipeline.predict(X_test)
classical_probs = classical_pipeline.predict_proba(X_test)

# Create confusion matrix
from sklearn.metrics import confusion_matrix, classification_report

cm = confusion_matrix(y_test, classical_pipeline.label_encoder.transform(classical_preds))

# Plot confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', 
            xticklabels=pattern_names, yticklabels=pattern_names)
plt.title('Classical Model Confusion Matrix', fontweight='bold')
plt.xlabel('Predicted Pattern')
plt.ylabel('True Pattern')
plt.tight_layout()
plt.show()

# Print classification report
print("\nDetailed Classification Report:")
print(classification_report(y_test, classical_pipeline.label_encoder.transform(classical_preds), 
                          target_names=pattern_names))

In [None]:
# Analyze confidence scores by pattern type
confidence_data = []
for i, (true_label, pred_probs) in enumerate(zip(y_test, classical_probs)):
    max_prob = np.max(pred_probs)
    confidence_data.append({
        'Sample': i,
        'True_Pattern': pattern_names[true_label],
        'Confidence': max_prob
    })

confidence_df = pd.DataFrame(confidence_data)

# Plot confidence distribution by pattern
plt.figure(figsize=(10, 6))
sns.boxplot(data=confidence_df, x='True_Pattern', y='Confidence')
plt.title('Prediction Confidence by Pattern Type', fontweight='bold')
plt.ylabel('Confidence Score')
plt.xlabel('Pattern Type')
plt.xticks(rotation=45)
plt.axhline(y=0.7, color='r', linestyle='--', alpha=0.7, label='Min Confidence Threshold')
plt.legend()
plt.tight_layout()
plt.show()
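
A confidence threshold like the 0.7 line in the plot trades automation against accuracy: a higher threshold sends more wafers to human review but should leave fewer errors among the automated decisions. The sketch below quantifies that trade-off for a given threshold. The function name `review_tradeoff` and the toy probabilities are made up for illustration, and it assumes the probability columns are ordered by class index.

```python
import numpy as np

def review_tradeoff(probs, y_true, threshold=0.7):
    """Return (auto_rate, auto_accuracy) for a max-probability confidence threshold."""
    probs = np.asarray(probs, dtype=float)
    y_true = np.asarray(y_true)
    confidence = probs.max(axis=1)
    predicted = probs.argmax(axis=1)
    auto = confidence >= threshold   # decided automatically, no human review
    auto_rate = float(auto.mean())
    if auto.any():
        auto_accuracy = float((predicted[auto] == y_true[auto]).mean())
    else:
        auto_accuracy = float('nan')  # nothing was decided automatically
    return auto_rate, auto_accuracy

# Toy example with three classes (values are illustrative only)
toy_probs = [[0.9, 0.05, 0.05], [0.4, 0.35, 0.25], [0.1, 0.8, 0.1], [0.2, 0.1, 0.7]]
toy_true = [0, 1, 1, 0]
toy_rate, toy_accuracy = review_tradeoff(toy_probs, toy_true, threshold=0.7)
print(f"automated: {toy_rate:.2f}, accuracy on automated: {toy_accuracy:.2f}")
```

Sweeping `threshold` over a range and plotting both numbers is one way to choose an operating point instead of fixing 0.7 up front.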

## 7. Feature Importance Analysis (Explainability)

For production deployment, it's crucial to understand what features drive model decisions.

In [None]:
# Extract feature names for interpretation
feature_names_detailed = (
    [f'radial_bin_{i}' for i in range(10)] +
    [f'angular_bin_{i}' for i in range(8)] +
    ['glcm_contrast', 'glcm_dissimilarity', 'glcm_homogeneity', 'glcm_energy', 'glcm_correlation'] +
    [f'hog_feature_{i}' for i in range(len(feature_extractor.extract_hog_features(X_test[0])))] +
    ['total_area', 'total_perimeter', 'n_components', 'largest_area', 'largest_perimeter', 'compactness']
)

print(f"Total features: {len(feature_names_detailed)}")

# For a linear-kernel SVM, coef_ holds the feature weights (other kernels do not expose it)
if hasattr(classical_pipeline.model, 'coef_'):
    # Average the absolute weight of each feature across all rows of coef_
    feature_importance = np.abs(classical_pipeline.model.coef_).mean(axis=0)
    
    # Get top 20 most important features
    top_indices = np.argsort(feature_importance)[-20:]
    top_features = [feature_names_detailed[i] for i in top_indices]
    top_importance = feature_importance[top_indices]
    
    # Plot feature importance
    plt.figure(figsize=(10, 8))
    plt.barh(range(len(top_features)), top_importance)
    plt.yticks(range(len(top_features)), top_features)
    plt.xlabel('Feature Importance (|Coefficient|)')
    plt.title('Top 20 Most Important Features (Classical SVM)', fontweight='bold')
    plt.tight_layout()
    plt.show()
    
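    print("\nTop 10 Most Important Features:")
    # top_features is sorted in ascending order, so reverse it to rank the most important first
    for i, (feat, imp) in enumerate(zip(top_features[::-1][:10], top_importance[::-1][:10])):
        print(f"{i+1:2d}. {feat}: {imp:.4f}")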
else:
    print("Feature importance not available for this model type")
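
The `coef_` check above only succeeds for linear models, so a kernel SVM falls through to the "not available" branch. A model-agnostic alternative is permutation importance: shuffle one feature column at a time and measure how much accuracy drops. The sketch below is a minimal NumPy version under stated assumptions: `permutation_importance_sketch` is a hypothetical helper, `predict_fn` stands for any function mapping a feature matrix to predicted labels, and the toy data is synthetic. scikit-learn also provides `sklearn.inspection.permutation_importance` for fitted estimators.

```python
import numpy as np

def permutation_importance_sketch(predict_fn, X, y, n_repeats=5, seed=0):
    """Mean drop in accuracy when each feature column is shuffled independently."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    baseline = np.mean(predict_fn(X) == y)
    importances = np.zeros(X.shape[1])
    for col in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Break the link between this feature and the labels
            X_perm[:, col] = rng.permutation(X_perm[:, col])
            drops.append(baseline - np.mean(predict_fn(X_perm) == y))
        importances[col] = np.mean(drops)
    return importances

# Toy check: the label depends only on feature 0
toy_rng = np.random.default_rng(1)
X_toy = toy_rng.normal(size=(200, 3))
y_toy = (X_toy[:, 0] > 0).astype(int)

def toy_predict(X):
    return (X[:, 0] > 0).astype(int)

toy_importance = permutation_importance_sketch(toy_predict, X_toy, y_toy)
print(np.round(toy_importance, 3))
```

In this notebook the same idea would be applied to the extracted feature matrix rather than to raw wafer maps, so that each score maps back to a named feature such as a radial bin.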

## 8. Production Deployment Simulation

Let's simulate how this model would work in a production environment.

In [None]:
def simulate_production_decision(wafer_map, model, threshold_confidence=0.7):
    """Simulate production decision making based on pattern recognition."""
    
    # Make prediction
    prediction = model.predict(wafer_map[np.newaxis, ...])[0]
    probabilities = model.predict_proba(wafer_map[np.newaxis, ...])[0]
    confidence = np.max(probabilities)
    
    # Production decision logic
    if confidence < threshold_confidence:
        action = "HUMAN_REVIEW"
        reason = f"Low confidence ({confidence:.3f}) - requires expert inspection"
    elif prediction == "Scratch" and confidence > 0.8:
        action = "HOLD_LOT"
        reason = "Critical scratch pattern detected - investigate handling equipment"
    elif prediction == "Center" and confidence > 0.7:
        action = "ADJUST_PROCESS"
        reason = "Center pattern detected - check chuck temperature and gas flow"
    elif prediction == "Edge" and confidence > 0.7:
        action = "MONITOR_CLOSELY"
        reason = "Edge pattern detected - monitor edge bead removal process"
    elif prediction == "Ring" and confidence > 0.7:
        action = "MAINTENANCE_ALERT"
        reason = "Ring pattern detected - schedule equipment vibration check"
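    elif prediction == "Normal":
        action = "CONTINUE"
        reason = "Normal pattern or acceptable variation"
    else:
        # A defect pattern that cleared the review threshold but not its own action threshold
        action = "HUMAN_REVIEW"
        reason = f"{prediction} pattern at moderate confidence ({confidence:.3f}) - requires expert inspection"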
    
    return {
        'pattern': prediction,
        'confidence': confidence,
        'action': action,
        'reason': reason,
        'probabilities': dict(zip(pattern_names, probabilities))
    }

# Test production simulation on a few examples
print("Production Decision Simulation:")
print("=" * 50)

for i in range(min(5, len(X_test))):
    true_pattern = pattern_names[y_test[i]]
    decision = simulate_production_decision(X_test[i], classical_pipeline)
    
    print(f"\nWafer {i+1}:")
    print(f"  True Pattern: {true_pattern}")
    print(f"  Predicted: {decision['pattern']} (confidence: {decision['confidence']:.3f})")
    print(f"  Action: {decision['action']}")
    print(f"  Reason: {decision['reason']}")

## 9. Cost-Benefit Analysis

Let's analyze the business impact of our pattern recognition system.

In [None]:
# Define cost structure (example values)
pattern_costs = {
    'Normal': {'miss_cost': 0, 'false_alarm_cost': 5},
    'Center': {'miss_cost': 50, 'false_alarm_cost': 10},
    'Edge': {'miss_cost': 40, 'false_alarm_cost': 8},
    'Scratch': {'miss_cost': 100, 'false_alarm_cost': 15},
    'Ring': {'miss_cost': 60, 'false_alarm_cost': 12}
}

def calculate_business_impact(y_true, y_pred, pattern_names, cost_structure):
    """Calculate business impact of pattern recognition system."""
    
    total_cost = 0
    pattern_costs_detail = {name: {'miss': 0, 'false_alarm': 0, 'correct': 0} 
                           for name in pattern_names}
    
    for true_idx, pred_name in zip(y_true, y_pred):
        true_name = pattern_names[true_idx]
        
        if true_name == pred_name:
            # Correct prediction - no cost
            pattern_costs_detail[true_name]['correct'] += 1
        else:
            # Misclassification
            if true_name != 'Normal':
                # False negative - missed a defect pattern
                cost = cost_structure[true_name]['miss_cost']
                pattern_costs_detail[true_name]['miss'] += 1
            else:
                # False positive - false alarm
                cost = cost_structure[pred_name]['false_alarm_cost']
                pattern_costs_detail[pred_name]['false_alarm'] += 1
            
            total_cost += cost
    
    return total_cost, pattern_costs_detail

# Calculate cost for our model
total_cost, cost_breakdown = calculate_business_impact(
    y_test, classical_preds, pattern_names, pattern_costs
)

print(f"Total Business Cost: ${total_cost}")
print(f"Average Cost per Wafer: ${total_cost/len(y_test):.2f}")

print("\nCost Breakdown by Pattern:")
for pattern, costs in cost_breakdown.items():
    print(f"  {pattern}:")
    print(f"    Correct: {costs['correct']}")
    print(f"    Missed: {costs['miss']}")
    print(f"    False Alarms: {costs['false_alarm']}")
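
The cost table also suggests where an alarm threshold could sit. Under a simplified two-way decision (flag the wafer or let it pass) with a calibrated defect probability p, flagging has the lower expected cost when p * miss_cost > (1 - p) * false_alarm_cost, that is, when p exceeds false_alarm_cost / (false_alarm_cost + miss_cost). The helper name `break_even_probability` is hypothetical, and the costs are the example values from `pattern_costs`.

```python
# Example costs copied from the pattern_costs table in this notebook
example_costs = {
    'Center': {'miss_cost': 50, 'false_alarm_cost': 10},
    'Edge': {'miss_cost': 40, 'false_alarm_cost': 8},
    'Scratch': {'miss_cost': 100, 'false_alarm_cost': 15},
    'Ring': {'miss_cost': 60, 'false_alarm_cost': 12},
}

def break_even_probability(miss_cost, false_alarm_cost):
    """Defect probability above which flagging is cheaper in expectation."""
    return false_alarm_cost / (false_alarm_cost + miss_cost)

for pattern, costs in example_costs.items():
    p_star = break_even_probability(costs['miss_cost'], costs['false_alarm_cost'])
    print(f"{pattern}: flag when P(defect) > {p_star:.3f}")
```

With these example costs the break-even probability is about 0.13 for Scratch and about 0.17 for the other defect patterns, which reflects how much more a miss costs than a false alarm.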

## 10. Summary and Next Steps

### Key Findings

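1. **Classical vs. Deep Learning**: On this synthetic dataset the two approaches can be compared directly using the Section 5 metrics; they differ in trade-offs such as training cost and interpretability
2. **Feature Importance**: Radial and angular distributions are expected to be key discriminators for spatially structured patterns such as Center, Edge, and Ring
3. **Manufacturing Integration**: Pattern recognition enables automated decision-making, with low-confidence cases routed to human review
4. **Cost Considerations**: In the example cost structure, a missed defect costs $40 to $100 versus $5 to $15 for a false alarm, so each false negative is several times more expensive than a false positive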

### Production Recommendations

1. **Start with Classical**: Lower complexity, good interpretability
2. **Monitor Confidence**: Flag low-confidence predictions for human review
3. **Pattern-Specific Thresholds**: Different confidence thresholds by defect type
4. **Continuous Learning**: Update models as new patterns emerge
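
The pattern-specific thresholds in recommendation 3 could be expressed as a small lookup table. The sketch below is one possible shape for it; the names `PATTERN_THRESHOLDS`, `DEFAULT_THRESHOLD`, and `needs_human_review` are hypothetical, and the threshold values are placeholders rather than tuned settings.

```python
# Hypothetical per-pattern confidence thresholds (placeholder values, not tuned)
PATTERN_THRESHOLDS = {
    'Normal': 0.6,
    'Center': 0.7,
    'Edge': 0.7,
    'Scratch': 0.8,
    'Ring': 0.7,
}
DEFAULT_THRESHOLD = 0.7  # used for any pattern without its own entry

def needs_human_review(pattern, confidence, thresholds=PATTERN_THRESHOLDS):
    """True when confidence is below the threshold configured for the predicted pattern."""
    return confidence < thresholds.get(pattern, DEFAULT_THRESHOLD)

print(needs_human_review('Scratch', 0.75))  # below the Scratch threshold
print(needs_human_review('Center', 0.75))   # above the Center threshold
```

Keeping the thresholds in data rather than in branching logic makes them easier to review and adjust as validation results from real fab data come in.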

### Future Enhancements

1. **Real Dataset Integration**: Connect to WM-811K or proprietary wafer map data
2. **Temporal Analysis**: Track pattern evolution across time/lots
3. **Multi-Resolution**: Analyze patterns at different spatial scales
4. **Active Learning**: Incorporate human feedback to improve models

In [None]:
# Final performance summary
print("Final Model Performance Summary:")
print("=" * 40)
print(f"Accuracy: {classical_metrics['accuracy']:.3f}")
print(f"F1-Score (Weighted): {classical_metrics['f1_weighted']:.3f}")
print(f"ROC-AUC: {classical_metrics['roc_auc_weighted']:.3f}")
print(f"PWS (Prediction Within Spec): {classical_metrics['pws']:.3f}")
print(f"Estimated Loss: {classical_metrics['estimated_loss']:.3f}")
print(f"Business Cost: ${total_cost} (${total_cost/len(y_test):.2f} per wafer)")

print("\n🎯 Model ready for production validation!")
print("\nNext steps:")
print("1. Validate on real fab data")
print("2. Set up production monitoring")
print("3. Train operators on new system")
print("4. Establish retraining schedule")