# 08 transfer learning applications
**Location: TensorVerseHub/notebooks/03_computer_vision/08_transfer_learning_applications.ipynb**

# Transfer Learning Applications with tf.keras

**File Location:** `notebooks/03_computer_vision/08_transfer_learning_applications.ipynb`

Master transfer learning with tf.keras.applications, implementing fine-tuning strategies, feature extraction, and domain adaptation techniques. Build production-ready models by leveraging pre-trained networks and optimizing for custom datasets.

## Learning Objectives
- Master tf.keras.applications pre-trained models
- Implement feature extraction and fine-tuning strategies
- Apply progressive unfreezing and discriminative learning rates
- Handle domain adaptation and custom dataset challenges
- Build production transfer learning pipelines
- Optimize models for different data constraints

---

## 1. tf.keras.applications and Pre-trained Models

```python
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import layers, models, applications
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os
import urllib.request
from PIL import Image
import warnings
warnings.filterwarnings('ignore')

print(f"TensorFlow version: {tf.__version__}")
tf.random.set_seed(42)
np.random.seed(42)

# Setup directories and data
def setup_transfer_learning_environment():
    """Setup environment for transfer learning experiments"""
    
    # Create directories
    os.makedirs('transfer_learning_data', exist_ok=True)
    os.makedirs('models', exist_ok=True)
    
    print("=== Available Pre-trained Models in tf.keras.applications ===")
    
    # List of available models with their properties
    models_info = {
        'VGG16': {'params': '138M', 'top1_acc': '71.3%', 'size': '528MB'},
        'VGG19': {'params': '143.7M', 'top1_acc': '71.3%', 'size': '549MB'},
        'ResNet50': {'params': '25.6M', 'top1_acc': '74.9%', 'size': '98MB'},
        'ResNet101': {'params': '44.7M', 'top1_acc': '76.4%', 'size': '171MB'},
        'ResNet152': {'params': '60.4M', 'top1_acc': '76.6%', 'size': '232MB'},
        'InceptionV3': {'params': '23.9M', 'top1_acc': '77.9%', 'size': '92MB'},
        'InceptionResNetV2': {'params': '55.9M', 'top1_acc': '80.3%', 'size': '215MB'},
        'Xception': {'params': '22.9M', 'top1_acc': '79.0%', 'size': '88MB'},
        'DenseNet121': {'params': '8.1M', 'top1_acc': '75.0%', 'size': '33MB'},
        'DenseNet201': {'params': '20.2M', 'top1_acc': '77.3%', 'size': '80MB'},
        'MobileNet': {'params': '4.3M', 'top1_acc': '70.4%', 'size': '16MB'},
        'MobileNetV2': {'params': '3.5M', 'top1_acc': '71.3%', 'size': '14MB'},
        'EfficientNetB0': {'params': '5.3M', 'top1_acc': '77.1%', 'size': '29MB'},
        'EfficientNetB7': {'params': '66.7M', 'top1_acc': '84.3%', 'size': '256MB'}
    }
    
    print(f"{'Model':<20} {'Parameters':<12} {'Top-1 Acc':<12} {'Size':<10}")
    print("-" * 60)
    for model_name, info in models_info.items():
        print(f"{model_name:<20} {info['params']:<12} {info['top1_acc']:<12} {info['size']:<10}")
    
    return models_info

models_info = setup_transfer_learning_environment()

# Load and prepare datasets
def prepare_datasets():
    """Prepare CIFAR-10 and CIFAR-100 for transfer learning"""
    
    # CIFAR-10
    (x_train_10, y_train_10), (x_test_10, y_test_10) = tf.keras.datasets.cifar10.load_data()
    
    # CIFAR-100
    (x_train_100, y_train_100), (x_test_100, y_test_100) = tf.keras.datasets.cifar100.load_data()
    
    # Normalize to [0, 1]
    x_train_10 = x_train_10.astype('float32') / 255.0
    x_test_10 = x_test_10.astype('float32') / 255.0
    x_train_100 = x_train_100.astype('float32') / 255.0
    x_test_100 = x_test_100.astype('float32') / 255.0
    
    # Flatten labels
    y_train_10 = y_train_10.flatten()
    y_test_10 = y_test_10.flatten()
    y_train_100 = y_train_100.flatten()
    y_test_100 = y_test_100.flatten()
    
    print("Dataset Information:")
    print(f"CIFAR-10: {x_train_10.shape} train, {x_test_10.shape} test, {len(np.unique(y_train_10))} classes")
    print(f"CIFAR-100: {x_train_100.shape} train, {x_test_100.shape} test, {len(np.unique(y_train_100))} classes")
    
    return (x_train_10, y_train_10, x_test_10, y_test_10), (x_train_100, y_train_100, x_test_100, y_test_100)

cifar10_data, cifar100_data = prepare_datasets()
(x_train_10, y_train_10, x_test_10, y_test_10) = cifar10_data
(x_train_100, y_train_100, x_test_100, y_test_100) = cifar100_data

# Demonstrate loading pre-trained models
def demonstrate_pretrained_models():
    """Demonstrate loading and using pre-trained models"""
    
    print("\n=== Loading Pre-trained Models ===")
    
    # Load different models
    models_to_test = [
        ('ResNet50', applications.ResNet50),
        ('MobileNetV2', applications.MobileNetV2),
        ('EfficientNetB0', applications.EfficientNetB0)
    ]
    
    for name, model_class in models_to_test:
        print(f"\n--- {name} ---")
        
        # Load with ImageNet weights
        model = model_class(
            weights='imagenet',
            include_top=False,  # Exclude final classification layer
            input_shape=(224, 224, 3)
        )
        
        print(f"Parameters: {model.count_params():,}")
        print(f"Trainable parameters: {sum([tf.keras.backend.count_params(w) for w in model.trainable_weights]):,}")
        print(f"Layers: {len(model.layers)}")
        
        # Test with random input
        test_input = tf.random.normal((1, 224, 224, 3))
        features = model(test_input)
        print(f"Feature shape: {features.shape}")
        
        # Show some layer names
        print(f"First 5 layers: {[layer.name for layer in model.layers[:5]]}")
        print(f"Last 5 layers: {[layer.name for layer in model.layers[-5:]]}")

demonstrate_pretrained_models()
```
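
Each backbone in `tf.keras.applications` was trained with its own input convention, so the scaling applied before the base model matters as much as the weights themselves. The NumPy sketch below re-implements two common conventions for illustration only: the "caffe" style (RGB to BGR plus per-channel ImageNet mean subtraction, used by ResNet50 and VGG16) and the "tf" style (scaling to [-1, 1], used by MobileNetV2 and InceptionV3). The function names `caffe_style` and `tf_style` are hypothetical; in real code the model-specific `preprocess_input` helpers are the safer choice.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by the 'caffe' convention
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style(pixels):
    """RGB pixels in [0, 255] -> BGR, zero-centered per channel (no scaling)."""
    bgr = pixels[..., ::-1].astype(np.float32)
    return bgr - IMAGENET_BGR_MEAN

def tf_style(pixels):
    """RGB pixels in [0, 255] -> values in [-1, 1]."""
    return pixels.astype(np.float32) / 127.5 - 1.0

# One white and one black pixel, shape (batch, height, width, channels)
batch = np.array([[[[255, 255, 255], [0, 0, 0]]]], dtype=np.float32)

caffe_out = caffe_style(batch)
tf_out = tf_style(batch)

print("caffe-style range:", caffe_out.min(), caffe_out.max())
print("tf-style range:", tf_out.min(), tf_out.max())
```

Both conventions start from pixels in [0, 255]. Images that were already divided by 255, as in `prepare_datasets()`, need to be multiplied back before either transform; otherwise the "caffe" output is almost entirely the negative channel mean and carries very little image signal.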

## 2. Feature Extraction Strategy

```python
# Feature extraction implementation
class FeatureExtractor:
    """Feature extraction using pre-trained models"""
    
    def __init__(self, model_name='ResNet50', input_shape=(224, 224, 3)):
        self.model_name = model_name
        self.input_shape = input_shape
        self.base_model = None
        self.feature_extractor = None
        
    def build_feature_extractor(self, pooling='avg'):
        """Build feature extractor from pre-trained model"""
        
        # Map model names to classes
        model_map = {
            'ResNet50': applications.ResNet50,
            'MobileNetV2': applications.MobileNetV2,
            'EfficientNetB0': applications.EfficientNetB0,
            'VGG16': applications.VGG16,
            'InceptionV3': applications.InceptionV3
        }
        
        if self.model_name not in model_map:
            raise ValueError(f"Model {self.model_name} not supported")
            
        # Load base model
        self.base_model = model_map[self.model_name](
            weights='imagenet',
            include_top=False,
            input_shape=self.input_shape,
            pooling=pooling
        )
        
        # Freeze all layers for feature extraction
        self.base_model.trainable = False
        
        print(f"Feature extractor built with {self.model_name}")
        print(f"Feature dimension: {self.base_model.output.shape[-1]}")
        
        return self.base_model
    
    def create_classifier(self, num_classes, dropout_rate=0.2):
        """Create classifier on top of feature extractor"""
        
        if self.base_model is None:
            raise ValueError("Build feature extractor first")
            
        # Create full model
        inputs = tf.keras.Input(shape=self.input_shape)
        
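        # Inputs arrive in [0, 1] (see prepare_datasets), while the ImageNet helpers
        # expect [0, 255] pixels, so rescale before the backbone-specific preprocessing
        preprocess_input = {
            'ResNet50': applications.resnet50.preprocess_input,
            'MobileNetV2': applications.mobilenet_v2.preprocess_input,
            'EfficientNetB0': applications.efficientnet.preprocess_input,
            'VGG16': applications.vgg16.preprocess_input,
            'InceptionV3': applications.inception_v3.preprocess_input
        }[self.model_name]
        x = preprocess_input(layers.Rescaling(255.0)(inputs))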
        
        # Feature extraction
        features = self.base_model(x, training=False)
        
        # Classification head
        x = layers.Dropout(dropout_rate)(features)
        predictions = layers.Dense(num_classes, activation='softmax')(x)
        
        model = tf.keras.Model(inputs, predictions)
        return model
    
    def extract_features(self, images, batch_size=32):
        """Extract features from images"""
        
        if self.base_model is None:
            raise ValueError("Build feature extractor first")
            
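        # Preprocess images: inputs are expected in [0, 1], so restore [0, 255] first.
        # Note: this generic helper applies the 'caffe' convention (ResNet50, VGG16);
        # other backbones need their own preprocess_input.
        preprocessed = applications.imagenet_utils.preprocess_input(images * 255.0)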
        
        # Resize to model input size
        if images.shape[1:3] != self.input_shape[:2]:
            preprocessed = tf.image.resize(preprocessed, self.input_shape[:2])
        
        # Extract features in batches
        features = self.base_model.predict(preprocessed, batch_size=batch_size, verbose=0)
        
        return features

# Demonstrate feature extraction
def demonstrate_feature_extraction():
    """Demonstrate feature extraction approach"""
    
    print("\n=== Feature Extraction Demonstration ===")
    
    # Create feature extractor
    extractor = FeatureExtractor('ResNet50')
    base_model = extractor.build_feature_extractor()
    
    # Create classifier for CIFAR-10
    model = extractor.create_classifier(num_classes=10, dropout_rate=0.3)
    
    print(f"Full model parameters: {model.count_params():,}")
    print(f"Trainable parameters: {sum([tf.keras.backend.count_params(w) for w in model.trainable_weights]):,}")
    
    # Compile and test
    model.compile(
        optimizer=tf.keras.optimizers.Adam(0.001),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    
    # Resize CIFAR-10 to 224x224 for pre-trained models
    def resize_data(x, target_size=(224, 224)):
        return tf.image.resize(x, target_size).numpy()
    
    x_train_resized = resize_data(x_train_10[:1000])  # Use subset for demo
    x_test_resized = resize_data(x_test_10[:200])
    
    # Train feature extraction model
    history = model.fit(
        x_train_resized, y_train_10[:1000],
        validation_data=(x_test_resized, y_test_10[:200]),
        epochs=10,
        batch_size=32,
        verbose=0
    )
    
    print(f"Feature extraction best val accuracy: {max(history.history['val_accuracy']):.4f}")
    
    return model, history

feature_model, feature_history = demonstrate_feature_extraction()
```
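
With the base model frozen, only the classification head is trained. As a quick sanity check on the "Trainable parameters" figure printed above, the head size can be worked out by hand. The sketch below assumes the 2048-dimensional pooled output of ResNet50 and the 10 CIFAR-10 classes used in the demo; `dense_params` is a hypothetical helper, not part of Keras.

```python
def dense_params(in_features, out_features, use_bias=True):
    """Number of weights in a fully connected layer."""
    return in_features * out_features + (out_features if use_bias else 0)

RESNET50_POOLED_FEATURES = 2048  # output size of ResNet50 with pooling='avg'
NUM_CLASSES = 10                 # CIFAR-10

# Dropout adds no weights, so the head is just one Dense layer
head_params = dense_params(RESNET50_POOLED_FEATURES, NUM_CLASSES)
print(f"Trainable head parameters: {head_params:,}")
```

If the printed trainable count of the full model is much larger than this, some base layers are probably still unfrozen.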

## 3. Fine-tuning Strategies

```python
# Advanced fine-tuning implementations
class AdvancedFineTuner:
    """Advanced fine-tuning with progressive unfreezing and discriminative learning rates"""
    
    def __init__(self, model_name='ResNet50', input_shape=(224, 224, 3)):
        self.model_name = model_name
        self.input_shape = input_shape
        self.base_model = None
        self.model = None
        
    def build_model(self, num_classes):
        """Build model for fine-tuning"""
        
        model_map = {
            'ResNet50': applications.ResNet50,
            'MobileNetV2': applications.MobileNetV2,
            'EfficientNetB0': applications.EfficientNetB0,
            'VGG16': applications.VGG16
        }
        
        # Load base model
        self.base_model = model_map[self.model_name](
            weights='imagenet',
            include_top=False,
            input_shape=self.input_shape
        )
        
        # Initially freeze all layers
        self.base_model.trainable = False
        
        # Build complete model
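        inputs = tf.keras.Input(shape=self.input_shape)
        # Inputs arrive in [0, 1] (see prepare_datasets); restore [0, 255] and apply
        # the preprocessing that matches the chosen backbone
        preprocess_input = {
            'ResNet50': applications.resnet50.preprocess_input,
            'MobileNetV2': applications.mobilenet_v2.preprocess_input,
            'EfficientNetB0': applications.efficientnet.preprocess_input,
            'VGG16': applications.vgg16.preprocess_input
        }[self.model_name]
        x = preprocess_input(layers.Rescaling(255.0)(inputs))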
        
        # Base model features
        x = self.base_model(x, training=False)
        
        # Global pooling and regularization
        x = layers.GlobalAveragePooling2D()(x)
        x = layers.BatchNormalization()(x)
        x = layers.Dropout(0.3)(x)
        
        # Additional dense layers for fine-tuning
        x = layers.Dense(512, activation='relu')(x)
        x = layers.BatchNormalization()(x)
        x = layers.Dropout(0.3)(x)
        
        # Final classification
        predictions = layers.Dense(num_classes, activation='softmax')(x)
        
        self.model = tf.keras.Model(inputs, predictions)
        return self.model
    
    def progressive_unfreeze(self, stage=1):
        """Progressively unfreeze layers for fine-tuning"""
        
        if self.base_model is None:
            raise ValueError("Build model first")
            
        total_layers = len(self.base_model.layers)
        
        if stage == 1:
            # Stage 1: Only train classifier
            self.base_model.trainable = False
            print("Stage 1: Training classifier only")
            
        elif stage == 2:
            # Stage 2: Unfreeze top 25% of layers
            unfreeze_from = int(total_layers * 0.75)
            self.base_model.trainable = True
            
            for layer in self.base_model.layers[:unfreeze_from]:
                layer.trainable = False
                
            print(f"Stage 2: Unfrozen layers from {unfreeze_from} to {total_layers}")
            
        elif stage == 3:
            # Stage 3: Unfreeze top 50% of layers
            unfreeze_from = int(total_layers * 0.5)
            self.base_model.trainable = True
            
            for layer in self.base_model.layers[:unfreeze_from]:
                layer.trainable = False
                
            print(f"Stage 3: Unfrozen layers from {unfreeze_from} to {total_layers}")
            
        elif stage == 4:
            # Stage 4: Fine-tune all layers
            self.base_model.trainable = True
            print("Stage 4: All layers unfrozen")
        
        # Count trainable parameters
        trainable_params = sum([tf.keras.backend.count_params(w) for w in self.model.trainable_weights])
        print(f"Trainable parameters: {trainable_params:,}")
    
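    def compile_with_discriminative_lr(self, stage=1):
        """Compile with one global learning rate that shrinks as more layers unfreeze.

        This is a stage-wise stand-in for discriminative learning rates; it does
        not assign a separate rate to each layer group.
        """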
        
        if stage == 1:
            # High learning rate for classifier
            optimizer = tf.keras.optimizers.Adam(0.001)
        elif stage == 2:
            # Lower learning rate for pre-trained layers
            optimizer = tf.keras.optimizers.Adam(0.0001)
        elif stage == 3:
            # Even lower learning rate
            optimizer = tf.keras.optimizers.Adam(0.00005)
        else:
            # Very low learning rate for full fine-tuning
            optimizer = tf.keras.optimizers.Adam(0.00001)
        
        self.model.compile(
            optimizer=optimizer,
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy']
        )
        
        print(f"Compiled with learning rate: {float(optimizer.learning_rate.numpy()):g}")

# Custom callbacks for fine-tuning
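class ProgressiveUnfreezeCallback(tf.keras.callbacks.Callback):
    """Callback to progressively unfreeze layers during training.

    Caution: Keras builds its training function when fit() starts, so changing
    `trainable` flags and recompiling from inside a callback may not take effect
    (or may fail), depending on the version. Separate fit() calls per stage, as in
    demonstrate_fine_tuning(), are the more reliable pattern.
    """
    
    def __init__(self, fine_tuner, unfreeze_epochs=(10, 20, 30)):
        super().__init__()
        self.fine_tuner = fine_tuner
        self.unfreeze_epochs = tuple(unfreeze_epochs)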
        self.current_stage = 1
        
    def on_epoch_begin(self, epoch, logs=None):
        if epoch in self.unfreeze_epochs and self.current_stage < 4:
            self.current_stage += 1
            print(f"\nEpoch {epoch}: Progressing to stage {self.current_stage}")
            
            # Unfreeze more layers
            self.fine_tuner.progressive_unfreeze(self.current_stage)
            
            # Recompile with new learning rate
            self.fine_tuner.compile_with_discriminative_lr(self.current_stage)

class GradualWarmupScheduler(tf.keras.callbacks.Callback):
    """Gradual warmup learning rate scheduler"""
    
    def __init__(self, warmup_epochs=5, base_lr=0.001, warmup_lr=0.0001):
        super().__init__()
        self.warmup_epochs = warmup_epochs
        self.base_lr = base_lr
        self.warmup_lr = warmup_lr
        
    def on_epoch_begin(self, epoch, logs=None):
        if epoch < self.warmup_epochs:
            # Use (epoch + 1) so the final warmup epoch lands exactly on base_lr
            lr = self.warmup_lr + (self.base_lr - self.warmup_lr) * (epoch + 1) / self.warmup_epochs
            self.model.optimizer.learning_rate = lr
            print(f"Warmup LR: {lr:.6f}")

# Demonstrate advanced fine-tuning
def demonstrate_fine_tuning():
    """Demonstrate advanced fine-tuning strategies"""
    
    print("\n=== Advanced Fine-tuning Demonstration ===")
    
    # Create fine-tuner
    fine_tuner = AdvancedFineTuner('MobileNetV2')
    model = fine_tuner.build_model(num_classes=10)
    
    # Prepare data (resize for ImageNet models)
    def prepare_fine_tune_data(x, y, subset_size=1500):
        x_resized = tf.image.resize(x[:subset_size], (224, 224)).numpy()
        return x_resized, y[:subset_size]
    
    x_train_ft, y_train_ft = prepare_fine_tune_data(x_train_10, y_train_10)
    x_val_ft, y_val_ft = prepare_fine_tune_data(x_test_10, y_test_10, 300)
    
    print(f"Fine-tuning data: {x_train_ft.shape} train, {x_val_ft.shape} val")
    
    # Stage 1: Train classifier only
    fine_tuner.progressive_unfreeze(stage=1)
    fine_tuner.compile_with_discriminative_lr(stage=1)
    
    print("Stage 1: Training classifier...")
    stage1_history = model.fit(
        x_train_ft, y_train_ft,
        validation_data=(x_val_ft, y_val_ft),
        epochs=8,
        batch_size=32,
        verbose=0,
        callbacks=[
            tf.keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True),
            tf.keras.callbacks.ReduceLROnPlateau(patience=2, factor=0.5)
        ]
    )
    
    stage1_acc = max(stage1_history.history['val_accuracy'])
    print(f"Stage 1 best accuracy: {stage1_acc:.4f}")
    
    # Stage 2: Fine-tune top layers
    fine_tuner.progressive_unfreeze(stage=2)
    fine_tuner.compile_with_discriminative_lr(stage=2)
    
    print("Stage 2: Fine-tuning top layers...")
    stage2_history = model.fit(
        x_train_ft, y_train_ft,
        validation_data=(x_val_ft, y_val_ft),
        epochs=8,
        batch_size=32,
        verbose=0,
        callbacks=[
            tf.keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)
        ]
    )
    
    stage2_acc = max(stage2_history.history['val_accuracy'])
    print(f"Stage 2 best accuracy: {stage2_acc:.4f}")
    
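    # Rough comparison with the earlier feature-extraction run. The two runs use
    # different backbones and subset sizes, so treat the gap as indicative only.
    improvement = stage2_acc - max(feature_history.history['val_accuracy'])
    print(f"Difference vs. feature extraction: {improvement:+.4f}")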
    
    return model, fine_tuner, (stage1_history, stage2_history)

fine_tuned_model, fine_tuner, fine_tune_histories = demonstrate_fine_tuning()
```
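
The staged schedule is easier to reason about when the layer boundaries and learning rates are computed separately from the model. The sketch below mirrors the fractions used in `progressive_unfreeze` (stages 1 to 4 freeze 100%, 75%, 50% and 0% of the base layers) and shows one possible linear warmup that reaches the base rate on the last warmup epoch. The helper names and the layer count of 100 are illustrative assumptions, not values taken from a real model.

```python
def unfreeze_boundary(total_layers, stage):
    """Index of the first trainable base layer for a given stage (1 to 4)."""
    frozen_fraction = {1: 1.0, 2: 0.75, 3: 0.5, 4: 0.0}[stage]
    return int(total_layers * frozen_fraction)

def linear_warmup(epoch, warmup_epochs=5, base_lr=1e-3, start_lr=1e-4):
    """Learning rate for an epoch: linear ramp, then constant at base_lr."""
    if epoch >= warmup_epochs:
        return base_lr
    progress = (epoch + 1) / warmup_epochs
    return start_lr + (base_lr - start_lr) * progress

TOTAL_LAYERS = 100  # hypothetical layer count, chosen for round numbers

boundaries = [unfreeze_boundary(TOTAL_LAYERS, stage) for stage in (1, 2, 3, 4)]
for stage, start in zip((1, 2, 3, 4), boundaries):
    print(f"Stage {stage}: layers [{start}, {TOTAL_LAYERS}) trainable")

schedule = [linear_warmup(epoch) for epoch in range(6)]
print("Warmup schedule:", [f"{lr:.6f}" for lr in schedule])
```

Keeping these as plain functions makes it simple to check a schedule before committing to a long training run.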

## 4. Domain Adaptation Techniques

```python
# Domain adaptation for different datasets
class DomainAdapter:
    """Domain adaptation techniques for transfer learning"""
    
    def __init__(self, source_model, target_classes):
        self.source_model = source_model
        self.target_classes = target_classes
        self.adapted_model = None
        
    def layer_wise_adaptation(self, adaptation_layers=('top', 'middle')):
        """Adapt specific layers for new domain"""
        
        # Reuse the source graph up to the layer that feeds its classification head.
        # Calling the layers one by one breaks on preprocessing ops that take extra
        # arguments, so tap an intermediate output instead. The index assumes the head
        # built by AdvancedFineTuner.build_model (Dense -> BatchNorm -> Dropout -> Dense);
        # adjust it for other source models.
        inputs = self.source_model.input
        x = self.source_model.layers[-5].output
            
        # Add adaptation layers
        if 'middle' in adaptation_layers:
            # Add middle adaptation layer
            x = layers.Dense(1024, activation='relu', name='adaptation_middle')(x)
            x = layers.BatchNormalization(name='bn_adaptation_middle')(x)
            x = layers.Dropout(0.5, name='dropout_adaptation_middle')(x)
        
        # Global pooling if needed
        if len(x.shape) > 2:
            x = layers.GlobalAveragePooling2D(name='adaptation_pool')(x)
        
        if 'top' in adaptation_layers:
            # Add top adaptation layer
            x = layers.Dense(512, activation='relu', name='adaptation_top')(x)
            x = layers.BatchNormalization(name='bn_adaptation_top')(x)
            x = layers.Dropout(0.4, name='dropout_adaptation_top')(x)
        
        # Final classification for target domain
        outputs = layers.Dense(
            self.target_classes, 
            activation='softmax', 
            name='target_classification'
        )(x)
        
        self.adapted_model = tf.keras.Model(inputs, outputs, name='domain_adapted_model')
        return self.adapted_model
    
    def freeze_source_layers(self, freeze_ratio=0.8):
        """Freeze bottom layers from source model"""
        
        if self.adapted_model is None:
            raise ValueError("Create adapted model first")
            
        total_layers = len(self.adapted_model.layers)
        freeze_until = int(total_layers * freeze_ratio)
        
        for i, layer in enumerate(self.adapted_model.layers):
            if i < freeze_until and 'adaptation' not in layer.name:
                layer.trainable = False
            else:
                layer.trainable = True
                
        trainable_params = sum([tf.keras.backend.count_params(w) 
                               for w in self.adapted_model.trainable_weights])
        print(f"Frozen {freeze_until}/{total_layers} layers")
        print(f"Trainable parameters: {trainable_params:,}")

class MultiSourceAdapter:
    """Combine knowledge from multiple pre-trained models"""
    
    def __init__(self, model_names=['ResNet50', 'MobileNetV2'], input_shape=(224, 224, 3)):
        self.model_names = model_names
        self.input_shape = input_shape
        self.source_models = []
        self.ensemble_model = None
        
    def load_source_models(self):
        """Load multiple source models"""
        
        model_map = {
            'ResNet50': applications.ResNet50,
            'MobileNetV2': applications.MobileNetV2,
            'EfficientNetB0': applications.EfficientNetB0
        }
        
        for name in self.model_names:
            model = model_map[name](
                weights='imagenet',
                include_top=False,
                input_shape=self.input_shape,
                pooling='avg'
            )
            model.trainable = False  # Freeze for feature extraction
            self.source_models.append((name, model))
            print(f"Loaded {name}: {model.output.shape[-1]} features")
    
    def create_ensemble_adapter(self, num_classes):
        """Create ensemble model combining multiple sources"""
        
        if not self.source_models:
            self.load_source_models()
            
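        inputs = tf.keras.Input(shape=self.input_shape)
        # Inputs are in [0, 1]; restore [0, 255] once, then give each backbone
        # its own preprocessing instead of one shared 'caffe'-style transform
        pixels = layers.Rescaling(255.0)(inputs)
        preprocess_map = {
            'ResNet50': applications.resnet50.preprocess_input,
            'MobileNetV2': applications.mobilenet_v2.preprocess_input,
            'EfficientNetB0': applications.efficientnet.preprocess_input
        }
        
        # Extract features from each source model
        features = []
        for name, model in self.source_models:
            feature = model(preprocess_map[name](pixels), training=False)
            features.append(feature)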
        
        # Combine features
        if len(features) > 1:
            combined = layers.Concatenate(name='feature_fusion')(features)
        else:
            combined = features[0]
        
        # Adaptation layers
        x = layers.Dense(1024, activation='relu', name='fusion_dense1')(combined)
        x = layers.BatchNormalization(name='fusion_bn1')(x)
        x = layers.Dropout(0.5, name='fusion_dropout1')(x)
        
        x = layers.Dense(512, activation='relu', name='fusion_dense2')(x)
        x = layers.BatchNormalization(name='fusion_bn2')(x)
        x = layers.Dropout(0.3, name='fusion_dropout2')(x)
        
        # Final classification
        outputs = layers.Dense(num_classes, activation='softmax', name='ensemble_output')(x)
        
        self.ensemble_model = tf.keras.Model(inputs, outputs, name='multi_source_adapter')
        return self.ensemble_model

# Test domain adaptation
def test_domain_adaptation():
    """Test domain adaptation techniques"""
    
    print("\n=== Domain Adaptation Testing ===")
    
    # Single source adaptation
    print("--- Single Source Adaptation ---")
    adapter = DomainAdapter(fine_tuned_model, target_classes=100)  # CIFAR-100 has 100 classes
    adapted_model = adapter.layer_wise_adaptation(['top', 'middle'])
    adapter.freeze_source_layers(freeze_ratio=0.7)
    
    print(f"Adapted model parameters: {adapted_model.count_params():,}")
    
    # Compile and test on CIFAR-100
    adapted_model.compile(
        optimizer=tf.keras.optimizers.Adam(0.0001),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    
    # Prepare CIFAR-100 data (resize for ImageNet models)
    x_train_100_resized = tf.image.resize(x_train_100[:800], (224, 224)).numpy()
    x_test_100_resized = tf.image.resize(x_test_100[:200], (224, 224)).numpy()
    
    # Train adapted model
    adaptation_history = adapted_model.fit(
        x_train_100_resized, y_train_100[:800],
        validation_data=(x_test_100_resized, y_test_100[:200]),
        epochs=10,
        batch_size=16,
        verbose=0,
        callbacks=[tf.keras.callbacks.EarlyStopping(patience=3)]
    )
    
    adaptation_acc = max(adaptation_history.history['val_accuracy'])
    print(f"Domain adaptation accuracy (CIFAR-100): {adaptation_acc:.4f}")
    
    # Multi-source adaptation
    print("\n--- Multi-Source Adaptation ---")
    multi_adapter = MultiSourceAdapter(['ResNet50', 'MobileNetV2'])
    ensemble_model = multi_adapter.create_ensemble_adapter(num_classes=10)
    
    print(f"Ensemble model parameters: {ensemble_model.count_params():,}")
    
    # Compile and test
    ensemble_model.compile(
        optimizer=tf.keras.optimizers.Adam(0.001),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    
    # Test on CIFAR-10
    x_train_resized = tf.image.resize(x_train_10[:800], (224, 224)).numpy()
    x_test_resized = tf.image.resize(x_test_10[:200], (224, 224)).numpy()
    
    ensemble_history = ensemble_model.fit(
        x_train_resized, y_train_10[:800],
        validation_data=(x_test_resized, y_test_10[:200]),
        epochs=8,
        batch_size=16,
        verbose=0
    )
    
    ensemble_acc = max(ensemble_history.history['val_accuracy'])
    print(f"Multi-source ensemble accuracy: {ensemble_acc:.4f}")
    
    return adapted_model, ensemble_model

adapted_model, ensemble_model = test_domain_adaptation()
```
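
The multi-source adapter fuses backbones by concatenating their pooled feature vectors, so the width of the first fusion layer grows with every model added. The NumPy sketch below uses random arrays as stand-ins for real features, assuming the 2048-dimensional pooled output of ResNet50 and the 1280-dimensional pooled output of MobileNetV2, and works out the size of a 1024-unit dense layer on top of the fused vector.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size = 4

# Random stand-ins for pooled backbone outputs (not real features)
resnet_features = rng.normal(size=(batch_size, 2048))
mobilenet_features = rng.normal(size=(batch_size, 1280))

# Feature fusion: concatenate along the feature axis
fused = np.concatenate([resnet_features, mobilenet_features], axis=-1)
print("Fused feature shape:", fused.shape)

# Weights plus biases of a Dense(1024) layer applied to the fused vector
fusion_dense_params = fused.shape[-1] * 1024 + 1024
print(f"Parameters in first fusion layer: {fusion_dense_params:,}")
```

With small datasets such as the 800-image subset used above, a fusion layer of this size is easy to overfit, which is one reason the adapter applies heavy dropout after it.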

## 5. Production Transfer Learning Pipeline

```python
# Production-ready transfer learning pipeline
class ProductionTransferLearningPipeline:
    """Complete production pipeline for transfer learning"""
    
    def __init__(self, config):
        self.config = config
        self.model = None
        self.history = {}
        self.best_weights = None
        
    def build_model(self):
        """Build transfer learning model based on config"""
        
        # Load base model
        model_map = {
            'ResNet50': applications.ResNet50,
            'MobileNetV2': applications.MobileNetV2,
            'EfficientNetB0': applications.EfficientNetB0,
            'VGG16': applications.VGG16,
            'InceptionV3': applications.InceptionV3
        }
        
        base_model = model_map[self.config['base_model']](
            weights='imagenet',
            include_top=False,
            input_shape=self.config['input_shape']
        )
        
        # Build complete model
        inputs = tf.keras.Input(shape=self.config['input_shape'])
        
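        # Preprocessing: use the backbone's own convention. These helpers expect
        # pixels in [0, 255]; rescale beforehand if the data is already in [0, 1].
        preprocess_module = {
            'ResNet50': applications.resnet50,
            'MobileNetV2': applications.mobilenet_v2,
            'EfficientNetB0': applications.efficientnet,
            'VGG16': applications.vgg16,
            'InceptionV3': applications.inception_v3
        }[self.config['base_model']]
        x = preprocess_module.preprocess_input(inputs)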
        
        # Base model
        if self.config['strategy'] == 'feature_extraction':
            base_model.trainable = False
            x = base_model(x, training=False)
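        else:  # fine_tuning
            base_model.trainable = True
            # Keep training=False so BatchNormalization layers stay in inference mode
            # and their ImageNet statistics are not overwritten during fine-tuning
            x = base_model(x, training=False)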
        
        # Custom head
        x = layers.GlobalAveragePooling2D()(x)
        
        # Add custom layers based on config
        for units in self.config['dense_layers']:
            x = layers.Dense(units, activation='relu')(x)
            x = layers.BatchNormalization()(x)
            x = layers.Dropout(self.config['dropout_rate'])(x)
        
        # Final classification
        predictions = layers.Dense(
            self.config['num_classes'],
            activation='softmax'
        )(x)
        
        self.model = tf.keras.Model(inputs, predictions)
        
        # Freeze layers for fine-tuning strategy
        if self.config['strategy'] == 'fine_tuning':
            self._setup_layer_freezing()
        
        return self.model
    
    def _setup_layer_freezing(self):
        """Setup layer freezing for fine-tuning"""
        
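        # Locate the nested backbone by type; its position shifts when preprocessing ops are present
        base_model = next(layer for layer in self.model.layers if isinstance(layer, tf.keras.Model))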
        total_layers = len(base_model.layers)
        
        if self.config['freeze_ratio'] > 0:
            freeze_until = int(total_layers * self.config['freeze_ratio'])
            
            for layer in base_model.layers[:freeze_until]:
                layer.trainable = False
        
        print(f"Frozen {int(total_layers * self.config['freeze_ratio'])}/{total_layers} base layers")
    
    def create_data_generators(self, train_data, val_data):
        """Create data generators with augmentation"""
        
        # Training data generator with augmentation
        train_datagen = ImageDataGenerator(
            rotation_range=self.config['augmentation']['rotation_range'],
            width_shift_range=self.config['augmentation']['width_shift_range'],
            height_shift_range=self.config['augmentation']['height_shift_range'],
            horizontal_flip=self.config['augmentation']['horizontal_flip'],
            zoom_range=self.config['augmentation']['zoom_range'],
            fill_mode='nearest'
        )
        
        # Validation data generator (no augmentation)
        val_datagen = ImageDataGenerator()
        
        # Create generators
        train_generator = train_datagen.flow(
            train_data[0], train_data[1],
            batch_size=self.config['batch_size']
        )
        
        val_generator = val_datagen.flow(
            val_data[0], val_data[1],
            batch_size=self.config['batch_size']
        )
        
        return train_generator, val_generator
    
    def train(self, train_data, val_data):
        """Execute training pipeline"""
        
        print("=== Starting Production Transfer Learning Pipeline ===")
        
        # Build model
        model = self.build_model()
        
        # Setup training
        optimizer_map = {
            'adam': tf.keras.optimizers.Adam,
            'adamw': tf.keras.optimizers.AdamW,
            'sgd': tf.keras.optimizers.SGD
        }
        
        optimizer = optimizer_map[self.config['optimizer']](
            learning_rate=self.config['learning_rate']
        )
        
        model.compile(
            optimizer=optimizer,
            loss='sparse_categorical_crossentropy',
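            # Labels are integer-encoded, so the top-k metric must be the sparse variant
            metrics=['accuracy', tf.keras.metrics.SparseTopKCategoricalAccuracy(k=5, name='top5_accuracy')]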
        )
        
        print(f"Model built: {model.count_params():,} parameters")
        print(f"Trainable: {sum([tf.keras.backend.count_params(w) for w in model.trainable_weights]):,}")
        
        # Create data generators
        train_gen, val_gen = self.create_data_generators(train_data, val_data)
        
        # Callbacks
        callbacks = [
            tf.keras.callbacks.ModelCheckpoint(
                'best_model.h5',
                monitor='val_accuracy',
                save_best_only=True,
                mode='max'
            ),
            tf.keras.callbacks.EarlyStopping(
                monitor='val_loss',
                patience=self.config['patience'],
                restore_best_weights=True
            ),
            tf.keras.callbacks.ReduceLROnPlateau(
                monitor='val_loss',
                factor=0.2,
                patience=self.config['lr_patience'],
                min_lr=1e-7
            ),
            tf.keras.callbacks.CSVLogger('training_log.csv')
        ]
        
        # Training
        steps_per_epoch = len(train_data[0]) // self.config['batch_size']
        validation_steps = len(val_data[0]) // self.config['batch_size']
        
        self.history = model.fit(
            train_gen,
            epochs=self.config['epochs'],
            steps_per_epoch=steps_per_epoch,
            validation_data=val_gen,
            validation_steps=validation_steps,
            callbacks=callbacks,
            verbose=1
        )
        
        # Results
        best_acc = max(self.history.history['val_accuracy'])
        print(f"\nTraining completed!")
        print(f"Best validation accuracy: {best_acc:.4f}")
        
        return model, self.history
    
    def evaluate_and_save(self, test_data, save_path='production_model'):
        """Evaluate model and save for production"""
        
        if self.model is None:
            raise ValueError("Train model first")
        
        # Final evaluation
        test_loss, test_acc, test_top5 = self.model.evaluate(
            test_data[0], test_data[1], verbose=0
        )
        
        print(f"\nFinal Test Results:")
        print(f"Test Accuracy: {test_acc:.4f}")
        print(f"Test Top-5 Accuracy: {test_top5:.4f}")
        print(f"Test Loss: {test_loss:.4f}")
        
        # Save model in multiple formats
        self.model.save(f'{save_path}.h5')  # Full model
        self.model.save(save_path)  # SavedModel format
        
        # Convert to TFLite for mobile deployment
        converter = tf.lite.TFLiteConverter.from_keras_model(self.model)
        tflite_model = converter.convert()
        
        with open(f'{save_path}.tflite', 'wb') as f:
            f.write(tflite_model)
        
        print(f"Model saved in multiple formats at {save_path}")
        
        # Generate model summary
        summary = {
            'config': self.config,
            'test_accuracy': float(test_acc),
            'test_top5_accuracy': float(test_top5),
            'test_loss': float(test_loss),
            'total_parameters': int(self.model.count_params()),
            'trainable_parameters': int(sum([tf.keras.backend.count_params(w) 
                                           for w in self.model.trainable_weights])),
            'training_epochs': len(self.history.history['loss']),
            'best_val_accuracy': float(max(self.history.history['val_accuracy']))
        }
        
        import json
        with open(f'{save_path}_summary.json', 'w') as f:
            json.dump(summary, f, indent=2)
        
        return summary

# Production configuration
production_config = {
    'base_model': 'MobileNetV2',
    'input_shape': (224, 224, 3),
    'num_classes': 10,
    'strategy': 'fine_tuning',  # 'feature_extraction' or 'fine_tuning'
    'freeze_ratio': 0.7,  # Freeze bottom 70% of layers
    'dense_layers': [512, 256],  # Custom dense layers
    'dropout_rate': 0.3,
    'optimizer': 'adam',
    'learning_rate': 0.0001,
    'batch_size': 32,
    'epochs': 20,
    'patience': 5,
    'lr_patience': 3,
    'augmentation': {
        'rotation_range': 20,
        'width_shift_range': 0.2,
        'height_shift_range': 0.2,
        'horizontal_flip': True,
        'zoom_range': 0.2
    }
}

# Run production pipeline
def run_production_pipeline():
    """Run complete production transfer learning pipeline"""
    
    print("\n=== Running Production Pipeline ===")
    
    # Prepare data
    def prepare_production_data(x, y, target_size=(224, 224)):
        x_resized = tf.image.resize(x, target_size).numpy()
        return x_resized, y
    
    # Use larger subsets for production demo
    train_data = prepare_production_data(x_train_10[:2000], y_train_10[:2000])
    val_data = prepare_production_data(x_test_10[:400], y_test_10[:400])
    test_data = prepare_production_data(x_test_10[400:600], y_test_10[400:600])
    
    # Create pipeline
    pipeline = ProductionTransferLearningPipeline(production_config)
    
    # Train
    model, history = pipeline.train(train_data, val_data)
    
    # Evaluate and save
    summary = pipeline.evaluate_and_save(test_data, 'cifar10_transfer_model')
    
    return pipeline, summary

# Execute production pipeline
production_pipeline, production_summary = run_production_pipeline()

print(f"\n🎉 Production pipeline completed!")
print(f"📊 Final test accuracy: {production_summary['test_accuracy']:.4f}")
print(f"📈 Model efficiency: {production_summary['trainable_parameters']:,} trainable params")
```
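
The pipeline above trains with `sparse_categorical_crossentropy`, so labels are integer class ids rather than one-hot vectors, and the top-5 metric has to compare against integer ids as well. The sketch below is a plain NumPy illustration of what such a metric computes; the helper name `sparse_top_k_accuracy` and the toy data are assumptions for illustration, not part of the notebook's pipeline.

```python
import numpy as np

def sparse_top_k_accuracy(y_true, y_prob, k=5):
    """Fraction of samples whose integer label is among the k highest-scoring classes."""
    y_true = np.asarray(y_true).reshape(-1)
    y_prob = np.asarray(y_prob)
    # Indices of the k largest scores per row (order inside the top k is irrelevant).
    top_k = np.argpartition(-y_prob, k - 1, axis=1)[:, :k]
    hits = (top_k == y_true[:, None]).any(axis=1)
    return float(hits.mean())

# Toy example: 3 samples, 4 classes, integer labels.
probs = np.array([
    [0.10, 0.60, 0.25, 0.05],   # label 1 has the highest score
    [0.40, 0.30, 0.20, 0.10],   # label 2 is ranked third
    [0.02, 0.08, 0.10, 0.80],   # label 0 is ranked last
])
labels = np.array([1, 2, 0])

top1 = sparse_top_k_accuracy(labels, probs, k=1)
top3 = sparse_top_k_accuracy(labels, probs, k=3)
print(f"top-1: {top1:.3f}, top-3: {top3:.3f}")
```

In Keras, the built-in counterpart for integer labels is `tf.keras.metrics.SparseTopKCategoricalAccuracy`; the non-sparse `top_k_categorical_accuracy` expects one-hot targets.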

## 6. Transfer Learning Best Practices and Comparison

```python
# Comprehensive comparison of transfer learning strategies
class TransferLearningComparison:
    """Compare different transfer learning approaches"""
    
    def __init__(self):
        self.results = {}
        
    def compare_strategies(self, train_data, val_data, strategies, epochs=8):
        """Compare different transfer learning strategies"""
        
        print("=== Transfer Learning Strategy Comparison ===")
        
        for strategy_name, config in strategies.items():
            print(f"\nTesting {strategy_name}...")
            
            # Build model based on strategy
            if config['type'] == 'feature_extraction':
                model = self._build_feature_extraction_model(config)
            elif config['type'] == 'fine_tuning':
                model = self._build_fine_tuning_model(config)
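            elif config['type'] == 'from_scratch':
                model = self._build_from_scratch_model(config)
            else:
                # Fail early instead of reusing a model from a previous iteration
                raise ValueError(f"Unknown strategy type: {config['type']}")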
            
            # Compile
            model.compile(
                optimizer=tf.keras.optimizers.Adam(config['lr']),
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy']
            )
            
            # Train
            import time
            start_time = time.time()
            
            history = model.fit(
                train_data[0], train_data[1],
                validation_data=val_data,
                epochs=epochs,
                batch_size=32,
                verbose=0,
                callbacks=[tf.keras.callbacks.EarlyStopping(patience=3)]
            )
            
            training_time = time.time() - start_time
            
            # Store results
            self.results[strategy_name] = {
                'best_val_acc': max(history.history['val_accuracy']),
                'final_val_acc': history.history['val_accuracy'][-1],
                'training_time': training_time,
                'parameters': model.count_params(),
                'trainable_params': sum([tf.keras.backend.count_params(w) 
                                       for w in model.trainable_weights]),
                'epochs_trained': len(history.history['loss'])
            }
            
            print(f"  Best accuracy: {self.results[strategy_name]['best_val_acc']:.4f}")
            print(f"  Training time: {training_time:.2f}s")
            print(f"  Trainable params: {self.results[strategy_name]['trainable_params']:,}")
    
    def _build_feature_extraction_model(self, config):
        """Build feature extraction model"""
        base_model = applications.MobileNetV2(
            weights='imagenet', include_top=False, 
            input_shape=(224, 224, 3), pooling='avg'
        )
        base_model.trainable = False
        
        inputs = tf.keras.Input(shape=(224, 224, 3))
        # MobileNetV2 expects inputs scaled to [-1, 1]; this assumes pixel values in [0, 255]
        x = applications.mobilenet_v2.preprocess_input(inputs)
        x = base_model(x, training=False)
        x = layers.Dropout(0.3)(x)
        outputs = layers.Dense(10, activation='softmax')(x)
        
        return tf.keras.Model(inputs, outputs)
    
    def _build_fine_tuning_model(self, config):
        """Build fine-tuning model"""
        base_model = applications.MobileNetV2(
            weights='imagenet', include_top=False, input_shape=(224, 224, 3)
        )
        
        # Freeze bottom layers
        for layer in base_model.layers[:-20]:
            layer.trainable = False
        
        inputs = tf.keras.Input(shape=(224, 224, 3))
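        # MobileNetV2 expects inputs scaled to [-1, 1]; this assumes pixel values in [0, 255]
        x = applications.mobilenet_v2.preprocess_input(inputs)
        # training=False keeps BatchNorm layers in inference mode;
        # the unfrozen layers still receive gradient updates
        x = base_model(x, training=False)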
        x = layers.GlobalAveragePooling2D()(x)
        x = layers.Dropout(0.3)(x)
        outputs = layers.Dense(10, activation='softmax')(x)
        
        return tf.keras.Model(inputs, outputs)
    
    def _build_from_scratch_model(self, config):
        """Build model from scratch"""
        model = tf.keras.Sequential([
            layers.Conv2D(32, 3, activation='relu', input_shape=(224, 224, 3)),
            layers.MaxPooling2D(),
            layers.Conv2D(64, 3, activation='relu'),
            layers.MaxPooling2D(),
            layers.Conv2D(128, 3, activation='relu'),
            layers.MaxPooling2D(),
            layers.GlobalAveragePooling2D(),
            layers.Dense(512, activation='relu'),
            layers.Dropout(0.5),
            layers.Dense(10, activation='softmax')
        ])
        return model
    
    def print_comparison(self):
        """Print detailed comparison"""
        
        print("\n=== Transfer Learning Comparison Results ===")
        print(f"{'Strategy':<20} {'Best Acc':<10} {'Train Time':<12} {'Trainable Params':<15} {'Efficiency':<10}")
        print("-" * 80)
        
        for name, result in self.results.items():
            efficiency = result['best_val_acc'] / (result['trainable_params'] / 1000000)  # Acc per million params
            print(f"{name:<20} {result['best_val_acc']:<10.4f} "
                  f"{result['training_time']:<12.2f} {result['trainable_params']:<15,} "
                  f"{efficiency:<10.2f}")
        
        # Best in each category
        print("\n=== Best Performers ===")
        
        best_acc = max(self.results.items(), key=lambda x: x[1]['best_val_acc'])
        fastest = min(self.results.items(), key=lambda x: x[1]['training_time'])
        most_efficient = min(self.results.items(), key=lambda x: x[1]['trainable_params'])
        
        print(f"Best Accuracy: {best_acc[0]} ({best_acc[1]['best_val_acc']:.4f})")
        print(f"Fastest Training: {fastest[0]} ({fastest[1]['training_time']:.2f}s)")
        print(f"Most Parameter Efficient: {most_efficient[0]} ({most_efficient[1]['trainable_params']:,} params)")

# Define strategies to compare
comparison_strategies = {
    'Feature_Extraction': {
        'type': 'feature_extraction',
        'lr': 0.001
    },
    'Fine_Tuning_Conservative': {
        'type': 'fine_tuning',
        'lr': 0.0001
    },
    'Fine_Tuning_Aggressive': {
        'type': 'fine_tuning', 
        'lr': 0.001
    },
    'From_Scratch': {
        'type': 'from_scratch',
        'lr': 0.001
    }
}

# Run comparison
comparison = TransferLearningComparison()

# Prepare comparison data (smaller subset for speed)
comp_train = (
    tf.image.resize(x_train_10[:800], (224, 224)).numpy(),
    y_train_10[:800]
)
comp_val = (
    tf.image.resize(x_test_10[:200], (224, 224)).numpy(),
    y_test_10[:200]
)

comparison.compare_strategies(comp_train, comp_val, comparison_strategies, epochs=6)
comparison.print_comparison()

# Transfer Learning Best Practices Summary
def print_transfer_learning_best_practices():
    """Print comprehensive transfer learning best practices"""
    
    print("\n" + "="*70)
    print("TRANSFER LEARNING BEST PRACTICES")
    print("="*70)
    
    practices = {
        "🎯 STRATEGY SELECTION": [
            "Use feature extraction for small datasets (<1000 samples per class)",
            "Apply fine-tuning for medium datasets (1000-10000 samples per class)",
            "Consider from-scratch training for very large datasets (>100k samples)",
            "Start with feature extraction, then progress to fine-tuning",
            "Use ensemble of multiple pre-trained models for best accuracy"
        ],
        
        "🏗️ MODEL ARCHITECTURE": [
            "Choose pre-trained model based on your domain similarity to ImageNet",
            "Use global average pooling instead of flattening before classifier",
            "Add batch normalization and dropout in custom layers",
            "Keep classifier simple (1-2 dense layers) to avoid overfitting",
            "Consider the efficiency vs accuracy trade-off for deployment"
        ],
        
        "❄️ LAYER FREEZING STRATEGIES": [
            "Freeze all layers for feature extraction",
            "Unfreeze top 10-25% layers for fine-tuning",
            "Use progressive unfreezing: start conservative, gradually unfreeze",
            "Never unfreeze batch normalization layers in early stages",
            "Consider layer-wise discriminative learning rates"
        ],
        
        "📚 LEARNING RATE GUIDELINES": [
            "Use 10x lower LR for pre-trained layers vs new layers",
            "Start with 1e-4 for fine-tuning, 1e-3 for feature extraction",
            "Apply learning rate warmup for the first few epochs",
            "Use cosine decay or reduce-on-plateau scheduling",
            "Monitor gradient norms to detect learning rate issues"
        ],
        
        "💾 DATA HANDLING": [
            "Match input preprocessing to pre-trained model requirements",
            "Apply data augmentation carefully - avoid changing domain too much",
            "Use progressive resizing: start small, gradually increase size",
            "Consider domain-specific augmentations over generic ones",
            "Maintain original aspect ratios when possible"
        ],
        
        "🔄 TRAINING PROCESS": [
            "Train in stages: classifier → top layers → more layers",
            "Use early stopping and model checkpointing",
            "Monitor both training and validation metrics closely",
            "Save model at each stage for rollback capability",
            "Apply regularization (dropout, weight decay) appropriately"
        ],
        
        "🎪 ADVANCED TECHNIQUES": [
            "Use mixup or cutmix for improved generalization",
            "Apply label smoothing for better calibration",
            "Consider pseudo-labeling for semi-supervised learning",
            "Use knowledge distillation from larger to smaller models",
            "Implement test-time augmentation for better predictions"
        ],
        
        "📱 PRODUCTION DEPLOYMENT": [
            "Quantize models for mobile/edge deployment",
            "Use TensorFlow Lite for mobile optimization",
            "Consider model pruning for size reduction",
            "Profile inference speed vs accuracy trade-offs",
            "Implement proper model versioning and A/B testing"
        ]
    }
    
    for category, items in practices.items():
        print(f"\n{category}")
        for item in items:
            print(f"  • {item}")
    
    print("\n" + "="*70)
    print("DECISION FLOWCHART")
    print("="*70)
    
    flowchart = {
        "Small Dataset (<1k/class)": "Feature Extraction + Data Augmentation",
        "Medium Dataset (1k-10k/class)": "Fine-tuning with Progressive Unfreezing", 
        "Large Dataset (>10k/class)": "Full Fine-tuning or Training from Scratch",
        "Similar Domain to ImageNet": "Any pre-trained CNN (ResNet, EfficientNet)",
        "Very Different Domain": "Generic features + Domain Adaptation",
        "Mobile/Edge Deployment": "MobileNet, EfficientNet-B0 with quantization",
        "High Accuracy Required": "EfficientNet-B4+, Ensemble methods",
        "Fast Training Required": "Feature Extraction with MobileNet",
        "Limited Compute": "MobileNet + Feature Extraction"
    }
    
    for scenario, recommendation in flowchart.items():
        print(f"{scenario:<30}: {recommendation}")

print_transfer_learning_best_practices()
```
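
The learning rate guidelines above mention warmup followed by cosine decay. As a rough sketch of how those two pieces fit together, the function below computes the rate for a given step in plain Python. The function name, the `warmup_steps=100` default, and the step counts are assumptions chosen for illustration; the `1e-4` starting rate and `1e-7` floor mirror values used earlier in this notebook.

```python
import math

def warmup_cosine_lr(step, total_steps, base_lr=1e-4, warmup_steps=100, min_lr=1e-7):
    """Linear warmup from 0 to base_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        # Linear ramp that reaches base_lr exactly at step == warmup_steps.
        return base_lr * step / warmup_steps
    # Fraction of the decay phase completed, clipped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Inspect a few points of a hypothetical 1000-step run.
for s in (0, 50, 100, 550, 1000):
    print(f"step {s:>4}: lr = {warmup_cosine_lr(s, total_steps=1000):.2e}")
```

One way to use a schedule like this per epoch is to wrap it in `tf.keras.callbacks.LearningRateScheduler`, passing the epoch index as `step`.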

---

## Summary

This notebook covered transfer learning with tf.keras.applications:

### Core Transfer Learning Strategies:
1. **Feature Extraction**: Freeze pre-trained model, train only classifier
2. **Fine-tuning**: Unfreeze top layers, use discriminative learning rates  
3. **Progressive Unfreezing**: Gradually unfreeze layers during training
4. **Domain Adaptation**: Adapt models to different domains/datasets
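
The freezing logic behind strategies 1 to 3 can be sketched without TensorFlow. The example below uses a hypothetical `FakeLayer` stand-in that exposes only a `trainable` flag, and the stage fractions are illustrative assumptions rather than values from the notebook.

```python
from dataclasses import dataclass

@dataclass
class FakeLayer:
    """Hypothetical stand-in for a Keras layer (only the fields this sketch needs)."""
    name: str
    is_batch_norm: bool = False
    trainable: bool = False

def unfreeze_top_fraction(layers, fraction, keep_batch_norm_frozen=True):
    """Make the top `fraction` of layers trainable and freeze everything below."""
    n_unfrozen = int(round(len(layers) * fraction))
    cutoff = len(layers) - n_unfrozen
    for index, layer in enumerate(layers):
        layer.trainable = index >= cutoff
        if keep_batch_norm_frozen and layer.is_batch_norm:
            layer.trainable = False  # batch norm statistics stay fixed
    return [layer.name for layer in layers if layer.trainable]

# Ten toy layers; indices 2, 5 and 8 play the role of batch normalization.
base_layers = [FakeLayer(f"layer_{i}", is_batch_norm=(i % 3 == 2)) for i in range(10)]

# Stage 0 trains only the new classifier head; later stages open up more of the base.
stage_fractions = [0.0, 0.2, 0.5]
trainable_per_stage = [unfreeze_top_fraction(base_layers, f) for f in stage_fractions]
for fraction, names in zip(stage_fractions, trainable_per_stage):
    print(f"unfreeze top {fraction:.0%}: {names}")
```

With real Keras layers, the batch norm check would be `isinstance(layer, tf.keras.layers.BatchNormalization)`, and the model needs to be recompiled after `trainable` flags change for the new setting to take effect.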

### Pre-trained Models Mastered:
- **Classic CNNs**: VGG16/19, ResNet50/101/152, InceptionV3
- **Efficient Models**: MobileNet, MobileNetV2, EfficientNet series
- **Specialized**: Xception, DenseNet, InceptionResNetV2
- **Multi-source**: Ensemble approaches combining multiple models

### Advanced Techniques Implemented:
- **Progressive Unfreezing**: Stage-wise layer unfreezing
- **Discriminative Learning Rates**: Different LR for different layers
- **Domain Adaptation**: Layer-wise adaptation for new domains
- **Multi-source Fusion**: Combining multiple pre-trained models
- **Production Pipeline**: Complete automated transfer learning system
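
To make the discriminative learning rate idea concrete, here is a small sketch that assigns one rate per layer group, dividing by a fixed factor for each step away from the new classifier head. The function name and the three-group split are assumptions for illustration; the factor of 10 and the `1e-3` head rate follow the guidelines stated in the best practices section.

```python
def discriminative_lrs(num_groups, head_lr=1e-3, factor=10.0):
    """Learning rate per layer group, ordered from the earliest layers to the new head."""
    return [head_lr / (factor ** (num_groups - 1 - i)) for i in range(num_groups)]

# Hypothetical grouping of a model into three blocks.
group_names = ["early_base_layers", "late_base_layers", "classifier_head"]
lrs = discriminative_lrs(len(group_names))

for name, lr in zip(group_names, lrs):
    print(f"{name:<20} lr = {lr:.0e}")
```

Applying different rates per group in Keras typically needs either a custom training loop with one optimizer per group or staged training with a different rate per stage; the sketch only computes the rates.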

### Key Performance Insights:
- **Feature extraction** works well for small datasets (<1k samples/class)
- **Fine-tuning** often improves accuracy over feature extraction (roughly 5-15%, depending on dataset size and domain similarity)
- **Progressive unfreezing** helps limit catastrophic forgetting
- **MobileNet** offers a strong efficiency/accuracy trade-off for mobile
- **EfficientNet** achieves high accuracy with reasonable compute

### Production Best Practices:
- Match preprocessing to pre-trained model requirements
- Use progressive training strategies to prevent overfitting
- Apply appropriate data augmentation for target domain
- Implement proper model versioning and deployment pipelines
- Consider inference speed vs accuracy trade-offs
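
Matching preprocessing matters because the `tf.keras.applications` model families use different input conventions. The NumPy sketch below reproduces the two most common ones by hand so the difference is visible; the function names are illustrative, and in practice each model's own `preprocess_input` (for example `applications.mobilenet_v2.preprocess_input`) should be used instead.

```python
import numpy as np

def preprocess_tf_mode(images):
    """Scale [0, 255] pixels to [-1, 1] (the convention used by MobileNetV2)."""
    return np.asarray(images, dtype=np.float32) / 127.5 - 1.0

def preprocess_caffe_mode(images):
    """Convert RGB to BGR, then subtract ImageNet channel means (VGG16, ResNet50)."""
    bgr = np.asarray(images, dtype=np.float32)[..., ::-1]
    return bgr - np.array([103.939, 116.779, 123.68], dtype=np.float32)

# One 2x2 all-white RGB image with pixel values in [0, 255].
batch = np.full((1, 2, 2, 3), 255.0, dtype=np.float32)
tf_out = preprocess_tf_mode(batch)
caffe_out = preprocess_caffe_mode(batch)

print("tf mode range:   ", tf_out.min(), tf_out.max())
print("caffe mode range:", caffe_out.min(), caffe_out.max())
```

Feeding a network inputs in the wrong convention usually does not raise an error; the model simply sees values outside the range it was trained on, which tends to hurt accuracy silently.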

### Strategic Decision Framework:
- **Small data**: Feature extraction + aggressive augmentation
- **Medium data**: Fine-tuning with progressive unfreezing
- **Large data**: Full fine-tuning or training from scratch
- **Mobile deployment**: MobileNet + quantization
- **High accuracy**: EfficientNet + ensemble methods
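
The framework above can be written down as a small lookup function. This is only a sketch of the rules of thumb listed in this notebook (the 1k and 10k samples-per-class thresholds come from the decision flowchart); the function name and the returned dictionary layout are assumptions.

```python
def recommend_strategy(samples_per_class, mobile_deployment=False):
    """Map dataset size and deployment target to a suggested starting point."""
    if samples_per_class < 1_000:
        strategy = "feature_extraction"
    elif samples_per_class <= 10_000:
        strategy = "fine_tuning_progressive_unfreezing"
    else:
        strategy = "full_fine_tuning_or_from_scratch"
    # MobileNet-style backbones for mobile/edge targets, EfficientNet otherwise.
    base_model = "MobileNetV2" if mobile_deployment else "EfficientNet"
    return {"strategy": strategy, "base_model": base_model}

for n in (200, 5_000, 50_000):
    print(n, recommend_strategy(n, mobile_deployment=(n < 1_000)))
```

These thresholds are starting points rather than hard rules; domain similarity to ImageNet and the available compute budget can shift the choice.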

### Next Steps:
- Apply to semantic segmentation tasks (U-Net, DeepLab)
- Implement object detection with transfer learning
- Explore vision transformers and hybrid approaches
- Deploy optimized models to production environments

This foundation lets you apply pre-trained models to a wide range of computer vision tasks with limited training data and compute.