# 📕 PYTORCH FILE 3-C: PRODUCTION & BEST PRACTICES

**Part:** ADVANCED & PROFESSIONAL - FINAL

**Objectives:**
- ✅ Clean ML Pipeline
- ✅ Reproducibility
- ✅ Model Evaluation & Metrics
- ✅ Save & Load Models
- ✅ Inference Pipeline
- ✅ Performance Optimization
- ✅ Production Best Practices
- ✅ Common Anti-patterns

**Duration:** 2-3 weeks

---

## 📚 Table of Contents

### PART 1: CLEAN ML PIPELINE
1. Pipeline Architecture
2. Config Management
3. Data Pipeline
4. Training Pipeline
5. Experiment Tracking

### PART 2: REPRODUCIBILITY
1. Random Seeds
2. Deterministic Operations
3. Environment Management
4. Version Control

### PART 3: MODEL EVALUATION
1. Classification Metrics
2. Confusion Matrix
3. ROC & AUC
4. Cross-validation

### PART 4: SAVE & LOAD MODELS
1. torch.save() & torch.load()
2. State Dict
3. Checkpoints
4. ONNX Export

### PART 5: PRODUCTION BEST PRACTICES
1. Model Versioning
2. Monitoring
3. Performance Optimization
4. Common Anti-patterns

---

In [None]:
# Import libraries
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import json
import yaml
import random
import os
from datetime import datetime
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    confusion_matrix, roc_curve, auc, classification_report
)

print(f"✅ PyTorch version: {torch.__version__}")
print(f"✅ CUDA available: {torch.cuda.is_available()}")

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"✅ Using device: {device}")

---

# PART 1: CLEAN ML PIPELINE

## 1.1 Pipeline Architecture

### Clean ML Pipeline

```
Config → Data Pipeline → Model → Training → Evaluation → Save/Export
```

### Benefits

- ✅ **Reproducible**: Same config → same results
- ✅ **Maintainable**: Easy to debug and update
- ✅ **Scalable**: Easy to scale up to production
- ✅ **Collaborative**: Easy for a team to work together
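
The stage diagram above can be sketched end to end on synthetic data. This is only an illustration under stated assumptions: `make_data`, `run_pipeline`, the `cfg` keys, and the tiny linear model are hypothetical names chosen for the example, not part of the pipeline class defined later in this notebook.

```python
import io
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def make_data(n, seed):
    """Synthetic 2-class data (hypothetical stand-in for a real dataset)."""
    g = torch.Generator().manual_seed(seed)
    x = torch.randn(n, 4, generator=g)
    y = (x.sum(dim=1) > 0).long()
    return TensorDataset(x, y)

def run_pipeline(cfg):
    # Config: seed first so model init and shuffling are repeatable
    torch.manual_seed(cfg["seed"])
    # Data pipeline
    train_loader = DataLoader(make_data(256, cfg["seed"]),
                              batch_size=cfg["batch_size"], shuffle=True)
    val_loader = DataLoader(make_data(64, cfg["seed"] + 1),
                            batch_size=cfg["batch_size"])
    # Model
    model = nn.Linear(4, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=cfg["lr"])
    criterion = nn.CrossEntropyLoss()
    # Training
    for _ in range(cfg["epochs"]):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            criterion(model(x), y).backward()
            optimizer.step()
    # Evaluation
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in val_loader:
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.size(0)
    # Save/Export (to an in-memory buffer here; a file path works the same way)
    buffer = io.BytesIO()
    torch.save(model.state_dict(), buffer)
    return {"val_acc": correct / total, "weights_bytes": buffer.getbuffer().nbytes}

cfg = {"seed": 42, "batch_size": 32, "lr": 0.1, "epochs": 5}
result = run_pipeline(cfg)
result_again = run_pipeline(cfg)
print(result)
```

Because every stage reads from one `cfg` dictionary, rerunning with the same values on the same machine should repeat the run, which is the "same config → same results" property listed above.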

## 1.2 Config Management

In [None]:
# Config class

class Config:
    """Configuration for ML pipeline"""
    
    # Data
    DATA_DIR = './data'
    BATCH_SIZE = 32
    NUM_WORKERS = 4
    
    # Model
    MODEL_NAME = 'resnet18'
    NUM_CLASSES = 10
    PRETRAINED = True
    
    # Training
    EPOCHS = 50
    LEARNING_RATE = 0.001
    WEIGHT_DECAY = 1e-4
    OPTIMIZER = 'adam'
    
    # Scheduler
    SCHEDULER = 'cosine'
    T_MAX = 50
    ETA_MIN = 1e-6
    
    # Paths
    MODEL_DIR = './models'
    LOG_DIR = './logs'
    CHECKPOINT_DIR = './checkpoints'
    
    # Reproducibility
    SEED = 42
    
    @classmethod
    def to_dict(cls):
        """Convert to dictionary (upper-case settings only)"""
        # classmethod objects are not callable, so a callable() filter would
        # let to_dict/save/load through and make json.dump fail
        return {k: v for k, v in vars(cls).items() if k.isupper()}
    
    @classmethod
    def save(cls, path):
        """Save config to JSON"""
        with open(path, 'w') as f:
            json.dump(cls.to_dict(), f, indent=2)
        print(f"✅ Config saved to {path}")
    
    @classmethod
    def load(cls, path):
        """Load config from JSON"""
        with open(path, 'r') as f:
            config_dict = json.load(f)
        
        for key, value in config_dict.items():
            setattr(cls, key, value)
        
        print(f"✅ Config loaded from {path}")

# Test
config = Config()
print("📋 Config:")
print(json.dumps(config.to_dict(), indent=2))
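
The class-attribute config above is shared global state: `Config.load` overwrites values for every user of the class. One possible alternative, sketched here as an assumption rather than as this notebook's approach, is an immutable dataclass instance per run. `RunConfig` and its field names are hypothetical and mirror only a few of the settings above.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path

@dataclass(frozen=True)
class RunConfig:
    """Hypothetical per-run config; frozen so a run cannot mutate it."""
    batch_size: int = 32
    epochs: int = 50
    learning_rate: float = 0.001
    optimizer: str = "adam"
    seed: int = 42

    def save(self, path):
        """Write all fields to a JSON file."""
        Path(path).write_text(json.dumps(asdict(self), indent=2))

    @classmethod
    def load(cls, path):
        """Build a new instance from a JSON file; unknown keys raise TypeError."""
        return cls(**json.loads(Path(path).read_text()))

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    original = RunConfig(epochs=10, learning_rate=0.01)
    original.save(path)
    restored = RunConfig.load(path)

print(restored == original)
```

Two experiments can then hold different `RunConfig` objects in the same process without interfering with each other.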

## 1.3 Training Pipeline Class

In [None]:
class TrainingPipeline:
    """
    Clean training pipeline
    """
    def __init__(self, config):
        self.config = config
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        
        # Create directories
        self._create_directories()
        
        # Set seeds
        self._set_seeds()
        
        # Initialize
        self.model = None
        self.optimizer = None
        self.scheduler = None
        self.criterion = None
        self.history = {'train_loss': [], 'val_loss': [], 'val_acc': []}
    
    def _create_directories(self):
        """Create necessary directories"""
        Path(self.config.MODEL_DIR).mkdir(parents=True, exist_ok=True)
        Path(self.config.LOG_DIR).mkdir(parents=True, exist_ok=True)
        Path(self.config.CHECKPOINT_DIR).mkdir(parents=True, exist_ok=True)
        print("✅ Directories created")
    
    def _set_seeds(self):
        """Set random seeds for reproducibility"""
        torch.manual_seed(self.config.SEED)
        torch.cuda.manual_seed_all(self.config.SEED)
        np.random.seed(self.config.SEED)
        random.seed(self.config.SEED)
        
        # Deterministic behavior
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
        
        print(f"✅ Seeds set to {self.config.SEED}")
    
    def build_model(self):
        """Build model from config"""
        import torchvision.models as models
        
        # Load base model
        if self.config.MODEL_NAME == 'resnet18':
            # torchvision >= 0.13 uses `weights=`; `pretrained=` is deprecated
            weights = models.ResNet18_Weights.DEFAULT if self.config.PRETRAINED else None
            base_model = models.resnet18(weights=weights)
            num_features = base_model.fc.in_features
            base_model.fc = nn.Linear(num_features, self.config.NUM_CLASSES)
            self.model = base_model
        else:
            raise ValueError(f"Unsupported MODEL_NAME: {self.config.MODEL_NAME}")
        
        self.model = self.model.to(self.device)
        print(f"✅ Model built: {self.config.MODEL_NAME}")
        return self.model
    
    def build_optimizer(self):
        """Build optimizer from config"""
        if self.config.OPTIMIZER == 'adam':
            self.optimizer = optim.Adam(
                self.model.parameters(),
                lr=self.config.LEARNING_RATE,
                weight_decay=self.config.WEIGHT_DECAY
            )
        elif self.config.OPTIMIZER == 'sgd':
            self.optimizer = optim.SGD(
                self.model.parameters(),
                lr=self.config.LEARNING_RATE,
                momentum=0.9,
                weight_decay=self.config.WEIGHT_DECAY
            )
        
        else:
            raise ValueError(f"Unsupported OPTIMIZER: {self.config.OPTIMIZER}")
        
        print(f"✅ Optimizer: {self.config.OPTIMIZER}")
        return self.optimizer
    
    def build_scheduler(self):
        """Build LR scheduler from config"""
        if self.config.SCHEDULER == 'cosine':
            self.scheduler = optim.lr_scheduler.CosineAnnealingLR(
                self.optimizer,
                T_max=self.config.T_MAX,
                eta_min=self.config.ETA_MIN
            )
        elif self.config.SCHEDULER == 'step':
            self.scheduler = optim.lr_scheduler.StepLR(
                self.optimizer,
                step_size=30,
                gamma=0.1
            )
        
        print(f"✅ Scheduler: {self.config.SCHEDULER}")
        return self.scheduler
    
    def train_epoch(self, train_loader):
        """Train one epoch"""
        self.model.train()
        total_loss = 0.0
        
        for inputs, targets in train_loader:
            inputs, targets = inputs.to(self.device), targets.to(self.device)
            
            self.optimizer.zero_grad()
            outputs = self.model(inputs)
            loss = self.criterion(outputs, targets)
            loss.backward()
            self.optimizer.step()
            
            total_loss += loss.item()
        
        return total_loss / len(train_loader)
    
    def validate(self, val_loader):
        """Validate model"""
        self.model.eval()
        total_loss = 0.0
        correct = 0
        total = 0
        
        with torch.no_grad():
            for inputs, targets in val_loader:
                inputs, targets = inputs.to(self.device), targets.to(self.device)
                
                outputs = self.model(inputs)
                loss = self.criterion(outputs, targets)
                
                total_loss += loss.item()
                _, predicted = outputs.max(1)
                total += targets.size(0)
                correct += predicted.eq(targets).sum().item()
        
        avg_loss = total_loss / len(val_loader)
        accuracy = 100. * correct / total
        
        return avg_loss, accuracy
    
    def train(self, train_loader, val_loader):
        """Complete training loop"""
        self.criterion = nn.CrossEntropyLoss()
        
        print(f"\n🚀 Starting training for {self.config.EPOCHS} epochs...\n")
        
        for epoch in range(self.config.EPOCHS):
            # Train
            train_loss = self.train_epoch(train_loader)
            
            # Validate
            val_loss, val_acc = self.validate(val_loader)
            
            # Scheduler step
            if self.scheduler is not None:
                self.scheduler.step()
            
            # Record history
            self.history['train_loss'].append(train_loss)
            self.history['val_loss'].append(val_loss)
            self.history['val_acc'].append(val_acc)
            
            # Print progress
            print(f"Epoch {epoch+1}/{self.config.EPOCHS}:")
            print(f"  Train Loss: {train_loss:.4f}")
            print(f"  Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.2f}%")
            
            # Save checkpoint
            if (epoch + 1) % 10 == 0:
                self.save_checkpoint(epoch + 1)
        
        print("\n✅ Training completed!")
        return self.history
    
    def save_checkpoint(self, epoch):
        """Save checkpoint"""
        checkpoint_path = Path(self.config.CHECKPOINT_DIR) / f'checkpoint_epoch_{epoch}.pth'
        torch.save({
            'epoch': epoch,
            'model_state_dict': self.model.state_dict(),
            'optimizer_state_dict': self.optimizer.state_dict(),
            'scheduler_state_dict': self.scheduler.state_dict() if self.scheduler else None,
            'history': self.history
        }, checkpoint_path)
        print(f"  💾 Checkpoint saved: {checkpoint_path}")
    
    def save_model(self, name='final_model'):
        """Save final model"""
        model_path = Path(self.config.MODEL_DIR) / f'{name}.pth'
        torch.save(self.model.state_dict(), model_path)
        print(f"✅ Model saved to {model_path}")

print("✅ TrainingPipeline class defined!")
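
The pipeline keeps its `history` only in memory. A minimal sketch of file-based experiment tracking follows, assuming a history dictionary shaped like the one above. The helper name `log_run`, the record layout, and the numbers in the demo are illustrative assumptions, not measured results.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path

def log_run(log_dir, config_dict, history, run_name=None):
    """Write the config and per-epoch history of one run to a JSON file."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    run_name = run_name or datetime.now().strftime("run_%Y%m%d_%H%M%S")
    # Index of the epoch with the highest validation accuracy (first one on ties)
    best = max(range(len(history["val_acc"])), key=history["val_acc"].__getitem__)
    record = {
        "run_name": run_name,
        "config": config_dict,
        "history": history,
        "best_epoch": best + 1,  # 1-based, matching the printed epoch numbers
        "best_val_acc": history["val_acc"][best],
    }
    path = log_dir / f"{run_name}.json"
    path.write_text(json.dumps(record, indent=2))
    return path

# Made-up numbers, only to exercise the function
history = {
    "train_loss": [0.9, 0.6, 0.5],
    "val_loss": [1.0, 0.7, 0.8],
    "val_acc": [61.0, 74.5, 72.0],
}
with tempfile.TemporaryDirectory() as tmp:
    path = log_run(tmp, {"LEARNING_RATE": 0.001, "SEED": 42}, history, run_name="demo")
    record = json.loads(path.read_text())

print(record["best_epoch"], record["best_val_acc"])
```

With the pipeline above, a call such as `log_run(config.LOG_DIR, config.to_dict(), pipeline.history)` after training would store both in one file per run.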

---

# PART 2: REPRODUCIBILITY

## 2.1 Random Seeds

### Why does reproducibility matter?

- ✅ **Debug**: Consistent results make bugs easier to find
- ✅ **Research**: Results can be validated
- ✅ **Production**: Deployment behaves the same as training
- ✅ **Collaboration**: Teammates can reproduce results

### Set All Seeds

In [None]:
def set_seed(seed=42):
    """
    Set all random seeds for reproducibility
    
    Args:
        seed: Random seed value
    """
    # Python random
    random.seed(seed)
    
    # NumPy
    np.random.seed(seed)
    
    # PyTorch
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # multi-GPU
    
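    # cuDNN: prefer deterministic kernels and disable autotuning
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    
    # Hash seed: only affects subprocesses started after this point; to fix
    # hashing in the current interpreter, set PYTHONHASHSEED before launching Python
    os.environ['PYTHONHASHSEED'] = str(seed)
    
    print(f"✅ All seeds set to {seed}")
    print("⚠️  Deterministic mode may be slower; the cost depends on the model and hardware")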

# Test
set_seed(42)

# Generate random numbers
print("\n🎲 Test reproducibility:")
print(f"Python random: {random.random()}")
print(f"NumPy random: {np.random.rand()}")
print(f"PyTorch random: {torch.rand(1).item()}")

# Reset and test again
set_seed(42)
print("\n🎲 After reset (should be same):")
print(f"Python random: {random.random()}")
print(f"NumPy random: {np.random.rand()}")
print(f"PyTorch random: {torch.rand(1).item()}")
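
Global seeds do not by themselves fix the shuffle order of a `DataLoader` or the NumPy/Python RNG state inside worker processes. The sketch below follows the pattern from PyTorch's reproducibility notes: a dedicated `torch.Generator` for shuffling plus a `worker_init_fn`. The helper names `seed_worker` and `make_loader` are chosen for this example.

```python
import random
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

def seed_worker(worker_id):
    """Seed NumPy and Python RNGs inside each DataLoader worker process."""
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)

def make_loader(dataset, seed, num_workers=0):
    """DataLoader whose shuffle order depends only on `seed`."""
    g = torch.Generator()
    g.manual_seed(seed)
    return DataLoader(
        dataset,
        batch_size=4,
        shuffle=True,
        num_workers=num_workers,
        worker_init_fn=seed_worker,  # only called when num_workers > 0
        generator=g,
    )

dataset = TensorDataset(torch.arange(12))
order_a = [batch[0].tolist() for batch in make_loader(dataset, seed=42)]
order_b = [batch[0].tolist() for batch in make_loader(dataset, seed=42)]
print(order_a == order_b)
```

Both loaders visit the samples in the same shuffled order because each one owns a generator seeded with the same value.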

## 2.2 Environment Info Logging

In [None]:
def log_environment_info(save_path='environment_info.json'):
    """
    Log environment information for reproducibility
    """
    import platform
    import sys
    
    env_info = {
        'timestamp': datetime.now().isoformat(),
        'python_version': sys.version,
        'platform': platform.platform(),
        'pytorch_version': torch.__version__,
        'cuda_available': torch.cuda.is_available(),
        'cuda_version': torch.version.cuda if torch.cuda.is_available() else 'N/A',
        'cudnn_version': torch.backends.cudnn.version() if torch.cuda.is_available() else 'N/A',
        'device_name': torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU'
    }
    
    # Save
    with open(save_path, 'w') as f:
        json.dump(env_info, f, indent=2)
    
    print("✅ Environment info saved!")
    print("\n📋 Environment:")
    for key, value in env_info.items():
        print(f"   {key}: {value}")
    
    return env_info

# Log
env_info = log_environment_info()
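
The environment log above records the PyTorch and CUDA versions. Other packages can be captured with the standard library's `importlib.metadata`, as in this small sketch. The helper name `collect_package_versions` and the package list are assumptions for the example.

```python
from importlib import metadata

def collect_package_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

names = ["torch", "numpy", "scikit-learn", "not-a-real-package-xyz"]
packages = collect_package_versions(names)
for name, version in packages.items():
    print(f"{name}: {version}")
```

The resulting dictionary is JSON-serializable, so it could be merged into `env_info` before that dictionary is written to disk.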

---

# PART 3: MODEL EVALUATION

## 3.1 Classification Metrics

In [None]:
def evaluate_classification(model, test_loader, device):
    """
    Comprehensive classification evaluation
    """
    model.eval()
    all_preds = []
    all_labels = []
    all_probs = []
    
    with torch.no_grad():
        for inputs, labels in test_loader:
            inputs = inputs.to(device)
            outputs = model(inputs)
            probs = torch.softmax(outputs, dim=1)
            _, preds = outputs.max(1)
            
            all_preds.extend(preds.cpu().numpy())
            all_labels.extend(labels.numpy())
            all_probs.extend(probs.cpu().numpy())
    
    all_preds = np.array(all_preds)
    all_labels = np.array(all_labels)
    all_probs = np.array(all_probs)
    
    # Calculate metrics
    accuracy = accuracy_score(all_labels, all_preds)
    precision = precision_score(all_labels, all_preds, average='weighted')
    recall = recall_score(all_labels, all_preds, average='weighted')
    f1 = f1_score(all_labels, all_preds, average='weighted')
    
    print("📊 EVALUATION RESULTS:")
    print("=" * 50)
    print(f"Accuracy:  {accuracy:.4f} ({accuracy*100:.2f}%)")
    print(f"Precision: {precision:.4f}")
    print(f"Recall:    {recall:.4f}")
    print(f"F1-Score:  {f1:.4f}")
    print("=" * 50)
    
    return {
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1': f1,
        'predictions': all_preds,
        'labels': all_labels,
        'probabilities': all_probs
    }

print("✅ Evaluation function defined!")
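
To make `average='weighted'` concrete, the following sketch recomputes the three metrics by hand with NumPy: per-class precision, recall and F1, averaged with weights proportional to each class's number of true samples. The function name `weighted_metrics` and the toy labels are illustrative. Undefined ratios are set to 0 here, which is an assumption of this sketch.

```python
import numpy as np

def weighted_metrics(y_true, y_pred, num_classes):
    """Support-weighted precision, recall and F1 from raw label arrays."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    precision = np.zeros(num_classes)
    recall = np.zeros(num_classes)
    f1 = np.zeros(num_classes)
    support = np.zeros(num_classes)
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision[c] = tp / (tp + fp) if tp + fp > 0 else 0.0
        recall[c] = tp / (tp + fn) if tp + fn > 0 else 0.0
        denom = precision[c] + recall[c]
        f1[c] = 2 * precision[c] * recall[c] / denom if denom > 0 else 0.0
        support[c] = tp + fn  # number of true samples of class c
    weights = support / support.sum()
    return {
        "precision": float(weights @ precision),
        "recall": float(weights @ recall),
        "f1": float(weights @ f1),
    }

y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
metrics = weighted_metrics(y_true, y_pred, num_classes=2)
accuracy = float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
print(metrics, accuracy)
```

Support-weighted recall always equals accuracy, since the weights cancel the per-class denominators. The recall line in the report above therefore carries no information beyond the accuracy line, and `average='macro'` is the usual choice when minority classes should count equally.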

## 3.2 Confusion Matrix

In [None]:
def plot_confusion_matrix(y_true, y_pred, class_names=None, normalize=False):
    """
    Plot confusion matrix
    """
    cm = confusion_matrix(y_true, y_pred)
    
    if normalize:
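        # Row-normalize; guard against classes with no true samples (row sum 0)
        row_sums = cm.sum(axis=1, keepdims=True)
        cm = cm.astype('float') / np.maximum(row_sums, 1)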
        fmt = '.2f'
        title = 'Normalized Confusion Matrix'
    else:
        fmt = 'd'
        title = 'Confusion Matrix'
    
    plt.figure(figsize=(10, 8))
    sns.heatmap(cm, annot=True, fmt=fmt, cmap='Blues',
                xticklabels=class_names, yticklabels=class_names,
                cbar_kws={'label': 'Count' if not normalize else 'Proportion'})
    plt.xlabel('Predicted', fontsize=12)
    plt.ylabel('True', fontsize=12)
    plt.title(title, fontsize=14, fontweight='bold')
    plt.tight_layout()
    plt.show()

print("✅ Confusion matrix function defined!")
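
The imports at the top of this notebook include `roc_curve` and `auc`, and `evaluate_classification` returns class probabilities. One way to combine them for a multi-class model is a one-vs-rest AUC per class, sketched below. The helper name `one_vs_rest_auc` and the toy probabilities are assumptions for illustration.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve

def one_vs_rest_auc(y_true, probs):
    """AUC per class, treating each class in turn as the positive label."""
    y_true = np.asarray(y_true)
    probs = np.asarray(probs)
    scores = {}
    for c in range(probs.shape[1]):
        # Binary problem: class c vs. everything else, scored by P(class c)
        fpr, tpr, _ = roc_curve((y_true == c).astype(int), probs[:, c])
        scores[c] = float(auc(fpr, tpr))
    return scores

# Toy probabilities; each row sums to 1
y_true = [0, 0, 1, 1, 2, 2]
probs = np.array([
    [0.80, 0.10, 0.10],
    [0.60, 0.30, 0.10],
    [0.20, 0.70, 0.10],
    [0.50, 0.25, 0.25],  # a class-1 sample the model is unsure about
    [0.10, 0.20, 0.70],
    [0.20, 0.20, 0.60],
])
scores = one_vs_rest_auc(y_true, probs)
print(scores)
```

In this toy data, classes 0 and 2 are ranked perfectly, while class 1 loses some AUC because one negative sample scores higher than the uncertain positive.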

---

# PART 4: SAVE & LOAD MODELS

## 4.1 Save & Load State Dict (Recommended)

In [None]:
# SAVE - State Dict (recommended)

# Create example model
model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 2)
)

# Save state dict
save_path = 'model_state_dict.pth'
torch.save(model.state_dict(), save_path)

print(f"✅ Model state dict saved to {save_path}")
print("\n💡 State dict = Only weights, not architecture")

In [None]:
# LOAD - State Dict

# Must create model with SAME architecture
model_loaded = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 2)
)

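# Load state dict (weights_only=True restricts unpickling to tensors and plain containers)
model_loaded.load_state_dict(torch.load(save_path, map_location='cpu', weights_only=True))
model_loaded.eval()

print(f"✅ Model state dict loaded from {save_path}")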

# Verify
x = torch.randn(5, 10)
with torch.no_grad():
    out1 = model(x)
    out2 = model_loaded(x)

print(f"\n🔍 Outputs match: {torch.allclose(out1, out2)}")
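
What "only weights, not architecture" means can be seen by inspecting the state dict directly: it is an ordered mapping from parameter names to tensors. The sketch below prints it for the same three-layer model and round-trips it through an in-memory buffer. It assumes a PyTorch version that supports the `weights_only` argument of `torch.load` (1.13 or later).

```python
import io
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 2),
)

# Names come from the module's position in the Sequential; ReLU has no parameters
state = model.state_dict()
for name, tensor in state.items():
    print(f"{name:10s} {tuple(tensor.shape)}")

# Round trip through memory instead of a file
buffer = io.BytesIO()
torch.save(state, buffer)
buffer.seek(0)
restored = torch.load(buffer, map_location="cpu", weights_only=True)

print(all(torch.equal(state[k], restored[k]) for k in state))
```

Since the keys encode the layer layout, `load_state_dict` raises an error when the receiving model's structure does not match. This is why the loading cell has to rebuild the same architecture first.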

## 4.2 Save & Load Complete Model

In [None]:
# SAVE - Complete Model

save_path = 'complete_model.pth'
torch.save(model, save_path)

print(f"✅ Complete model saved to {save_path}")
print("\n💡 Complete model = Architecture + Weights")

In [None]:
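# LOAD - Complete Model

# A pickled module needs full unpickling. PyTorch 2.6+ defaults to
# weights_only=True, so it must be disabled here. Only do this for trusted files.
model_loaded = torch.load(save_path, weights_only=False)
model_loaded.eval()

print(f"✅ Complete model loaded from {save_path}")
print("\n⚠️  Warning: Less flexible; loading depends on the original class definitions and may break when code or PyTorch versions change")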

## 4.3 Save Checkpoint (Best Practice)

In [None]:
# SAVE - Checkpoint with the full training state

optimizer = optim.Adam(model.parameters(), lr=0.001)
epoch = 10

checkpoint = {
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': 0.123,
}

checkpoint_path = 'checkpoint.pth'
torch.save(checkpoint, checkpoint_path)

print(f"✅ Checkpoint saved to {checkpoint_path}")
print("\n💡 Checkpoint includes:")
print("   - Epoch number")
print("   - Model weights")
print("   - Optimizer state")
print("   - Loss value")
print("   → Can resume training!")

In [None]:
# LOAD - Checkpoint and resume training

# Create model and optimizer
model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 2)
)
optimizer = optim.Adam(model.parameters())

# Load checkpoint
checkpoint = torch.load(checkpoint_path, weights_only=True)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
start_epoch = checkpoint['epoch']
loss = checkpoint['loss']

model.train()  # Set to training mode

print(f"✅ Checkpoint loaded from {checkpoint_path}")
print(f"   Resume from epoch: {start_epoch}")
print(f"   Previous loss: {loss}")
print("\n🔄 Ready to continue training!")
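
Resuming is only faithful if the loop restarts at the saved epoch and the scheduler state is restored along with the model and optimizer. The sketch below checks this on a toy problem by comparing a resumed run against an uninterrupted one. `build` and `train_epochs` are hypothetical helpers, and the checkpoint goes to an in-memory buffer instead of a file.

```python
import io
import torch
import torch.nn as nn
import torch.optim as optim

def build():
    """Create model, optimizer and scheduler identically in every session."""
    torch.manual_seed(0)
    model = nn.Linear(4, 2)
    optimizer = optim.SGD(model.parameters(), lr=0.1)
    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)
    return model, optimizer, scheduler

def train_epochs(model, optimizer, scheduler, start_epoch, end_epoch):
    """One toy full-batch update per epoch on fixed data."""
    x = torch.ones(8, 4)
    y = torch.zeros(8, dtype=torch.long)
    for _ in range(start_epoch, end_epoch):
        optimizer.zero_grad()
        nn.functional.cross_entropy(model(x), y).backward()
        optimizer.step()
        scheduler.step()

# Session 1: run 3 epochs, then checkpoint
model, optimizer, scheduler = build()
train_epochs(model, optimizer, scheduler, 0, 3)
buffer = io.BytesIO()
torch.save({
    "epoch": 3,
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
    "scheduler_state_dict": scheduler.state_dict(),
}, buffer)

# Session 2: rebuild, restore all three states, continue at the saved epoch
buffer.seek(0)
ckpt = torch.load(buffer, weights_only=True)
model2, optimizer2, scheduler2 = build()
model2.load_state_dict(ckpt["model_state_dict"])
optimizer2.load_state_dict(ckpt["optimizer_state_dict"])
scheduler2.load_state_dict(ckpt["scheduler_state_dict"])
train_epochs(model2, optimizer2, scheduler2, ckpt["epoch"], 5)

# Reference: 5 uninterrupted epochs
ref_model, ref_optimizer, ref_scheduler = build()
train_epochs(ref_model, ref_optimizer, ref_scheduler, 0, 5)

resumed_lr = optimizer2.param_groups[0]["lr"]
same_weights = torch.allclose(model2.weight, ref_model.weight)
print(resumed_lr, same_weights)
```

If the scheduler state were left out, the resumed run would count its decay steps from zero again and the learning-rate schedule would drift from the reference run.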

## 4.4 ONNX Export

In [None]:
# Export to ONNX (cross-platform format)

model.eval()

# Dummy input
dummy_input = torch.randn(1, 10)

# Export
onnx_path = 'model.onnx'
torch.onnx.export(
    model,
    dummy_input,
    onnx_path,
    export_params=True,
    opset_version=11,
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={'input': {0: 'batch_size'},
                  'output': {0: 'batch_size'}}
)

print(f"✅ Model exported to ONNX: {onnx_path}")
print("\n💡 The ONNX format enables:")
print("   - Deployment on multiple runtimes (TensorRT, ONNX Runtime, etc.)")
print("   - Cross-platform compatibility")
print("   - Inference optimization")

---

# PART 5: PRODUCTION BEST PRACTICES

## 5.1 Model Versioning

In [None]:
class ModelRegistry:
    """
    Simple model registry for versioning
    """
    def __init__(self, base_dir='model_registry'):
        self.base_dir = Path(base_dir)
        self.base_dir.mkdir(exist_ok=True)
        
        # Create subdirectories
        for subdir in ['production', 'staging', 'archive']:
            (self.base_dir / subdir).mkdir(exist_ok=True)
    
    def save_model(self, model, version, stage='staging', metadata=None):
        """
        Save model with version
        """
        # Create version directory
        model_dir = self.base_dir / stage / f'v{version}'
        model_dir.mkdir(parents=True, exist_ok=True)
        
        # Save model
        model_path = model_dir / 'model.pth'
        torch.save(model.state_dict(), model_path)
        
        # Save metadata
        if metadata is None:
            metadata = {}
        
        metadata.update({
            'version': version,
            'stage': stage,
            'saved_at': datetime.now().isoformat(),
            'pytorch_version': torch.__version__
        })
        
        metadata_path = model_dir / 'metadata.json'
        with open(metadata_path, 'w') as f:
            json.dump(metadata, f, indent=2)
        
        print(f"✅ Model v{version} saved to {stage}")
        return model_dir
    
    def load_model(self, model_class, version=None, stage='production'):
        """
        Load model from registry
        """
        stage_dir = self.base_dir / stage
        
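        if version is None:
            # Load latest: sort numerically so v10 comes after v2
            # (assumes dotted numeric versions such as 3 or 1.2.0)
            versions = sorted(
                (p for p in stage_dir.iterdir() if p.is_dir()),
                key=lambda p: tuple(int(x) for x in p.name.lstrip('v').split('.'))
            )
            if not versions:
                raise ValueError(f"No models in {stage}")
            model_dir = versions[-1]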
        else:
            model_dir = stage_dir / f'v{version}'
        
        # Load model
        model_path = model_dir / 'model.pth'
        model = model_class()
        model.load_state_dict(torch.load(model_path, map_location='cpu', weights_only=True))
        
        # Load metadata
        metadata_path = model_dir / 'metadata.json'
        with open(metadata_path, 'r') as f:
            metadata = json.load(f)
        
        print(f"✅ Loaded model v{metadata['version']} from {stage}")
        return model, metadata

print("✅ ModelRegistry class defined!")
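
The registry creates `staging` and `production` directories but has no step that moves a model between them. A possible promotion step is sketched below using only the standard library. The function `promote` is hypothetical and not part of `ModelRegistry`. It assumes the directory layout and `metadata.json` file written by `save_model` above, and the demo uses placeholder bytes instead of real weights.

```python
import json
import shutil
import tempfile
from pathlib import Path

def promote(base_dir, version):
    """Copy staging/v{version} to production/v{version} and update its metadata."""
    base_dir = Path(base_dir)
    src = base_dir / "staging" / f"v{version}"
    dst = base_dir / "production" / f"v{version}"
    if not src.is_dir():
        raise FileNotFoundError(f"No staged model at {src}")
    # copytree raises FileExistsError if dst exists, so a released
    # version is never silently overwritten
    shutil.copytree(src, dst)
    meta_path = dst / "metadata.json"
    metadata = json.loads(meta_path.read_text())
    metadata["stage"] = "production"
    meta_path.write_text(json.dumps(metadata, indent=2))
    return dst

with tempfile.TemporaryDirectory() as tmp:
    staged = Path(tmp) / "staging" / "v3"
    staged.mkdir(parents=True)
    (staged / "model.pth").write_bytes(b"placeholder weights")
    (staged / "metadata.json").write_text(json.dumps({"version": 3, "stage": "staging"}))

    promoted = promote(tmp, 3)
    promoted_meta = json.loads((promoted / "metadata.json").read_text())
    staged_meta = json.loads((staged / "metadata.json").read_text())
    promoted_files = sorted(p.name for p in promoted.iterdir())

print(promoted_meta["stage"], promoted_files)
```

Copying instead of moving keeps the staged artifact in place, which makes it easy to compare or roll back.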

## 5.2 Common Anti-patterns

### ❌ ANTI-PATTERN 1: Not setting random seeds

```python
# ❌ WRONG
model = create_model()
# Results are not reproducible!

# ✅ CORRECT
set_seed(42)
model = create_model()
```

### ❌ ANTI-PATTERN 2: Not versioning models

```python
# ❌ WRONG
torch.save(model.state_dict(), 'model.pth')  # Overwrites the previous model!

# ✅ CORRECT
torch.save(model.state_dict(), f'model_v{version}.pth')
```

### ❌ ANTI-PATTERN 3: Not normalizing data at inference

```python
# ❌ WRONG
# Training: normalize
train_data = train_data / 255.0
# Inference: no normalization!
prediction = model(test_data)  # Bug!

# ✅ CORRECT
def preprocess(data):
    return data / 255.0

train_data = preprocess(train_data)
test_data = preprocess(test_data)
```
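
One way to make this mistake harder is to bundle preprocessing, `eval()` mode and `no_grad` into a single inference entry point, so callers never touch the raw model. The `Predictor` class below is a hypothetical sketch of that idea with a toy model, not an API from this notebook.

```python
import torch
import torch.nn as nn

def preprocess(data):
    """Same scaling as in training: raw 0-255 values to the range [0, 1]."""
    return data.float() / 255.0

class Predictor:
    """Hypothetical inference wrapper: preprocess -> model -> probabilities."""

    def __init__(self, model, device="cpu"):
        self.device = torch.device(device)
        self.model = model.to(self.device).eval()  # inference mode set once

    @torch.no_grad()
    def predict(self, raw_batch):
        inputs = preprocess(raw_batch).to(self.device)
        probs = torch.softmax(self.model(inputs), dim=1)
        confidence, labels = probs.max(dim=1)
        return labels.cpu(), confidence.cpu()

# Toy model on 8x8 "images" with 3 classes (untrained, so labels are arbitrary)
model = nn.Sequential(nn.Flatten(), nn.Linear(8 * 8, 3))
raw = torch.randint(0, 256, (5, 8, 8), dtype=torch.uint8)
labels, confidence = Predictor(model).predict(raw)
print(labels.shape, confidence.shape)
```

Training code can import the same `preprocess` function, which keeps a single definition of the normalization step.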

### ❌ ANTI-PATTERN 4: eval() mode during training

```python
# ❌ WRONG
model.eval()
for data in train_loader:
    loss = train_step(data)  # BatchNorm and Dropout stay in inference behavior!

# ✅ CORRECT
model.train()  # Set training mode
for data in train_loader:
    loss = train_step(data)
```
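
The effect of the mode switch is easy to observe on a single `nn.Dropout` layer. In training mode it zeroes elements at random and rescales the survivors by `1 / (1 - p)`. In eval mode it passes the input through unchanged.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Dropout(p=0.5)
x = torch.ones(1000)

layer.train()
train_out = layer(x)  # each element is either 0 or 1 / (1 - 0.5) = 2

layer.eval()
eval_out = layer(x)   # identity

print(sorted(train_out.unique().tolist()))
print(torch.equal(eval_out, x))
```

`BatchNorm` layers change behavior in a similar way: batch statistics are used in training mode and stored running statistics in eval mode.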

### ❌ ANTI-PATTERN 5: Forgetting zero_grad()

```python
# ❌ WRONG
for data in train_loader:
    loss = compute_loss(data)
    loss.backward()  # Gradients accumulate!
    optimizer.step()

# ✅ CORRECT
for data in train_loader:
    optimizer.zero_grad()  # Clear gradients
    loss = compute_loss(data)
    loss.backward()
    optimizer.step()
```
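
The accumulation is visible on a single tensor. In this sketch the loss is `sum(w * x)`, so each `backward()` call adds exactly `x` to `w.grad`.

```python
import torch

w = torch.zeros(3, requires_grad=True)
x = torch.tensor([1.0, 2.0, 3.0])

(w * x).sum().backward()
first = w.grad.clone()        # tensor([1., 2., 3.])

(w * x).sum().backward()      # no zeroing in between
accumulated = w.grad.clone()  # tensor([2., 4., 6.]): the two gradients summed

w.grad.zero_()                # what optimizer.zero_grad() does for every parameter
(w * x).sum().backward()
after_reset = w.grad.clone()  # tensor([1., 2., 3.]) again

print(first, accumulated, after_reset)
```

This summing behavior is also what makes deliberate gradient accumulation over several small batches possible. The bug is only in doing it unintentionally.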

### ❌ ANTI-PATTERN 6: Device mismatch

```python
# ❌ WRONG
model = model.to('cuda')
data = torch.randn(10, 3)  # On CPU
output = model(data)  # Error!

# ✅ CORRECT
model = model.to(device)
data = data.to(device)
output = model(data)
```

---

# 🎓 Summary of FILE 3-C & THE ENTIRE PYTORCH SERIES

## ✅ FILE 3-C: Production & Best Practices

### 1. Clean ML Pipeline
- **Config management**: Centralized configuration
- **TrainingPipeline**: Modular, reproducible
- **Experiment tracking**: Log everything

### 2. Reproducibility
- **Set all seeds**: Python, NumPy, PyTorch
- **Deterministic mode**: cudnn.deterministic
- **Environment logging**: Version tracking

### 3. Model Evaluation
- **Classification metrics**: Accuracy, Precision, Recall, F1
- **Confusion matrix**: Visualize errors
- **Comprehensive evaluation**: All metrics

### 4. Save & Load Models
- **State dict**: Recommended (weights only)
- **Checkpoints**: Resume training
- **ONNX**: Cross-platform deployment

### 5. Production Best Practices
- **Versioning**: Track model versions
- **Anti-patterns**: Common mistakes to avoid

---

## 🎉 PYTORCH ADVANCED SERIES COMPLETED!

### FILE 3-A: Advanced Training
- ✅ Custom Loss (Focal, Contrastive)
- ✅ Custom Layers (ResidualBlock, Attention)
- ✅ Autograd (Gradient Accumulation, Clipping)
- ✅ LR Scheduling (Cosine, Warmup)

### FILE 3-B: Transfer Learning & Mixed Precision
- ✅ Transfer Learning (Feature Extraction, Fine-tuning)
- ✅ Pretrained Models (ResNet, MobileNet)
- ✅ Mixed Precision (AMP, GradScaler)
- ✅ Up to 2-3x speedup on supported GPUs

### FILE 3-C: Production
- ✅ Clean Pipeline
- ✅ Reproducibility
- ✅ Evaluation
- ✅ Save/Load
- ✅ Best Practices

---

## 🚀 You are now ready for:

1. **Production ML Projects**
   - Build end-to-end pipelines
   - Deploy models
   - Monitor performance

2. **Advanced Topics**
   - Distributed training
   - Model compression
   - MLOps

3. **Specialized Domains**
   - Computer Vision
   - NLP
   - Reinforcement Learning

---

## üí° Top 10 Key Takeaways

1. **Set seeds** for reproducibility
2. **Transfer Learning** saves time and data
3. **Mixed Precision** can give a 2-3x speedup on supported GPUs
4. **State dict** is the recommended way to save models
5. **model.train()** vs **model.eval()** matters
6. **optimizer.zero_grad()** every iteration
7. **Device consistency** (CPU vs CUDA)
8. **Version models** properly
9. **Config-driven** architecture
10. **Test everything** before production

---

**🎉 Congratulations on completing the PyTorch Advanced Series! 🎉**

**You are now ready to build production-ready PyTorch models! 🚀**