# Experiment 7: Pre-trained Models for Oil & Gas Equipment Classification

**Course:** Introduction to Deep Learning | **Module:** Transfer Learning

---

## Objective

Explore and implement pre-trained deep learning models for oil & gas equipment image classification using transfer learning techniques, comparing different architectures and fine-tuning strategies.

## Learning Outcomes

By the end of this experiment, you will:

1. Understand transfer learning principles and pre-trained model advantages
2. Implement multiple pre-trained architectures (ResNet, EfficientNet, MobileNet, ViT)
3. Apply different fine-tuning strategies for domain adaptation
4. Compare model performance, efficiency, and deployment considerations
5. Optimize models for edge deployment and production environments

## Background & Theory

**Transfer Learning** leverages knowledge gained from pre-trained models on large datasets (like ImageNet) and adapts them to specific domains. This approach significantly reduces training time and data requirements while achieving superior performance.

**Key Concepts:**

- **Pre-trained Models:** Networks trained on large-scale datasets with learned feature representations
- **Feature Extraction:** Using pre-trained features with frozen weights
- **Fine-tuning:** Adapting pre-trained weights to new domain with lower learning rates
- **Domain Adaptation:** Adjusting models from general to specific application domains
- **Model Efficiency:** Balancing accuracy with computational and memory requirements

**Mathematical Foundation:**

- Transfer learning: f_target(x) = g(f_pretrained(x)) where g is domain-specific head
- Fine-tuning loss: L = L_task + λL_regularization
- Learning rate scheduling: lr_fine = α × lr_pretrained where α << 1
- Feature similarity: sim(f_source, f_target) = cos(f_source, f_target)

**Model Architectures:**

- **ResNet:** Deep residual networks with skip connections
- **EfficientNet:** Compound scaling for optimal accuracy-efficiency trade-off
- **MobileNet:** Lightweight architecture for mobile deployment
- **Vision Transformer:** Attention-based architecture for image understanding

**Applications in Oil & Gas:**

- Automated equipment inspection and condition monitoring
- Safety compliance verification through image analysis
- Predictive maintenance based on visual equipment assessment
- Remote monitoring of offshore and onshore facilities
- Quality control in manufacturing and installation processes


## Setup & Dependencies

**What to Expect:** This section establishes the Python environment for transfer learning with pre-trained models. We'll install PyTorch, torchvision for pre-trained models, and timm library for state-of-the-art architectures like EfficientNet and Vision Transformers.

**Process Overview:**

1. **Package Installation:** Install PyTorch ecosystem (torch, torchvision), timm for advanced models, and image processing libraries
2. **Model Library Setup:** Configure access to pre-trained models from ImageNet and other large-scale datasets
3. **Environment Configuration:** Set up device detection (CPU/GPU) and random seeds for reproducible transfer learning
4. **Image Processing Tools:** Configure PIL, transforms, and data loading utilities for image classification
5. **Evaluation Framework:** Set up metrics and visualization tools for comparing different architectures

**Expected Outcome:** A fully configured environment ready for transfer learning experiments with multiple pre-trained architectures (ResNet, EfficientNet, MobileNet, ViT) and comprehensive evaluation tools.


In [1]:
# Install required packages
import subprocess, sys
packages = ['torch', 'torchvision', 'numpy', 'matplotlib', 'pandas', 'scikit-learn', 'Pillow', 'timm']
for pkg in packages:
    try: 
        if pkg == 'timm': __import__('timm')
        else: __import__(pkg.replace('-', '_').lower())
    except ImportError: 
        subprocess.check_call([sys.executable, '-m', 'pip', 'install', pkg])

import torch, torch.nn as nn, torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader, random_split
import torchvision.transforms as transforms
import torchvision.models as models
import timm  # PyTorch Image Models for additional architectures
import numpy as np, pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
import json, random, time
from pathlib import Path
from PIL import Image

# Set random seeds for reproducibility
torch.manual_seed(42)
np.random.seed(42)
random.seed(42)
if torch.cuda.is_available():
    torch.cuda.manual_seed(42)

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Data directory setup
DATA_DIR = Path('data')
if not DATA_DIR.exists():
    DATA_DIR = Path('Expirements/data')
if not DATA_DIR.exists():
    DATA_DIR = Path('.')
    print('Warning: Using current directory for data')

# ArivuAI styling
plt.style.use('default')
colors = {'primary': '#004E89', 'secondary': '#3DA5D9', 'accent': '#F1A208', 'dark': '#4F4F4F'}

print(f'✓ PyTorch version: {torch.__version__}')
print(f'✓ Torchvision version: {torch.version.__version__ if hasattr(torch.version, "__version__") else "N/A"}')
print(f'✓ TIMM version: {timm.__version__}')
print(f'✓ Device: {device}')
print(f'✓ Data directory: {DATA_DIR.absolute()}')
print('✓ All packages installed and configured')
print('✓ Random seeds set for reproducible results')
print('✓ ArivuAI styling applied')

  from .autonotebook import tqdm as notebook_tqdm


✓ PyTorch version: 2.8.0+cpu
✓ Torchvision version: 2.8.0+cpu
✓ TIMM version: 1.0.20
✓ Device: cpu
✓ Data directory: d:\Suni Files\AI Code Base\Oil and Gas\Oil and Gas Pruthvi College Course Material\Updated\Expirements\Experiment_7_Pretrained_Models\data
✓ All packages installed and configured
✓ Random seeds set for reproducible results
✓ ArivuAI styling applied


## Pre-trained Model Configuration

Load configuration and set up different pre-trained architectures for comparison.


In [3]:
class PretrainedModelManager:
    def __init__(self, config_path):
        """Initialize pre-trained model manager with configuration"""
        try:
            with open(config_path, 'r') as f:
                self.config = json.load(f)
            print('✓ Configuration loaded from JSON')
        except FileNotFoundError:
            print('Creating default configuration...')
            self.config = self._create_default_config()
        
        self.pretrained_models = self.config['pretrained_models']
        self.num_classes = len(self.config['oil_gas_equipment_classes'])
        self.class_names = self.config['oil_gas_equipment_classes']
    
    def _create_default_config(self):
        """Create default configuration if JSON file not found"""
        return {
            'pretrained_models': {
                'resnet50': {'architecture': 'ResNet-50', 'parameters': '25.6M'},
                'efficientnet_b0': {'architecture': 'EfficientNet-B0', 'parameters': '5.3M'}
            },
            'oil_gas_equipment_classes': {
                '0': 'Drilling_Rig', '1': 'Pump_Jack', '2': 'Storage_Tank', '3': 'Pipeline_Valve'
            }
        }
    
    def create_model(self, model_name, strategy='feature_extraction'):
        """Create and configure pre-trained model for transfer learning"""
        if model_name == 'resnet50':
            model = models.resnet50(pretrained=True)
            # Replace final layer
            model.fc = nn.Linear(model.fc.in_features, self.num_classes)
            
        elif model_name == 'efficientnet_b0':
            model = timm.create_model('efficientnet_b0', pretrained=True, num_classes=self.num_classes)
            
        elif model_name == 'mobilenet_v3':
            model = models.mobilenet_v3_large(pretrained=True)
            model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, self.num_classes)
            
        elif model_name == 'vit_base':
            model = timm.create_model('vit_base_patch16_224', pretrained=True, num_classes=self.num_classes)
            
        else:
            raise ValueError(f"Unknown model: {model_name}")
        
        # Apply transfer learning strategy
        if strategy == 'feature_extraction':
            # Freeze all layers except classifier
            for param in model.parameters():
                param.requires_grad = False
            
            # Unfreeze classifier layers
            if hasattr(model, 'fc'):
                for param in model.fc.parameters():
                    param.requires_grad = True
            elif hasattr(model, 'classifier'):
                for param in model.classifier.parameters():
                    param.requires_grad = True
            elif hasattr(model, 'head'):
                for param in model.head.parameters():
                    param.requires_grad = True
        
        elif strategy == 'fine_tuning':
            # Freeze early layers, unfreeze later layers
            total_layers = len(list(model.parameters()))
            freeze_layers = int(total_layers * 0.7)  # Freeze 70% of layers
            
            for i, param in enumerate(model.parameters()):
                param.requires_grad = i >= freeze_layers
        
        # Full fine-tuning keeps all parameters trainable (default)
        
        return model
    
    def get_model_info(self, model_name):
        """Get detailed information about a specific model"""
        if model_name in self.pretrained_models:
            return self.pretrained_models[model_name]
        return None
    
    def compare_models(self):
        """Compare different pre-trained models"""
        comparison_data = []
        
        for model_name, info in self.pretrained_models.items():
            comparison_data.append({
                'Model': info['architecture'],
                'Parameters': info['parameters'],
                'Input Size': f"{info['input_size'][0]}x{info['input_size'][1]}",
                'ImageNet Accuracy': f"{info['top1_accuracy']:.2f}%",
                'Key Advantage': info['advantages'][0]
            })
        
        return pd.DataFrame(comparison_data)

# Initialize model manager
model_manager = PretrainedModelManager(DATA_DIR / 'pretrained_image_config.json')

print(f'✓ Pre-trained model manager initialized')
print(f'✓ Available models: {list(model_manager.pretrained_models.keys())}')
print(f'✓ Number of classes: {model_manager.num_classes}')
print(f'✓ Equipment classes: {list(model_manager.class_names.values())}')

# Display model comparison
print('\nModel Comparison:')
comparison_df = model_manager.compare_models()
print(comparison_df.to_string(index=False))

✓ Configuration loaded from JSON
✓ Pre-trained model manager initialized
✓ Available models: ['resnet50', 'efficientnet_b0', 'mobilenet_v3', 'vit_base']
✓ Number of classes: 8
✓ Equipment classes: ['Drilling_Rig', 'Pump_Jack', 'Storage_Tank', 'Pipeline_Valve', 'Compressor_Station', 'Flare_Stack', 'Separator_Vessel', 'Control_Panel']

Model Comparison:
             Model Parameters Input Size ImageNet Accuracy             Key Advantage
         ResNet-50      25.6M    224x224            76.15% Deep residual connections
   EfficientNet-B0       5.3M    224x224            77.69%          Compound scaling
      MobileNet-V3       4.2M    224x224            75.77%        Lightweight design
Vision Transformer        86M    224x224            84.53%      Attention mechanisms


## Synthetic Dataset Generation

Create synthetic oil & gas equipment images for transfer learning experiments.


In [4]:
class SyntheticEquipmentDataset(Dataset):
    def __init__(self, model_manager, samples_per_class=100, transform=None):
        """Create synthetic dataset for oil & gas equipment"""
        self.model_manager = model_manager
        self.transform = transform
        self.samples_per_class = samples_per_class
        self.class_names = model_manager.class_names
        self.num_classes = len(self.class_names)
        
        # Generate synthetic images and labels
        print('Generating synthetic equipment images...')
        self.images = []
        self.labels = []
        
        for class_id in range(self.num_classes):
            for _ in range(samples_per_class):
                img = self._generate_equipment_image(class_id)
                self.images.append(img)
                self.labels.append(class_id)
        
        print(f'✓ Generated {len(self.images)} synthetic images')
    
    def _generate_equipment_image(self, class_id):
        """Generate synthetic equipment image based on class"""
        # Create base image
        img = np.zeros((224, 224, 3), dtype=np.uint8)
        
        # Background (industrial setting)
        bg_color = [135, 135, 135] + np.random.randint(-20, 20, 3)
        img[:, :] = np.clip(bg_color, 0, 255)
        
        # Equipment-specific patterns
        if class_id == 0:  # Drilling_Rig
            # Vertical tower structure
            tower_width = 30
            center_x = 112
            img[20:200, center_x-tower_width//2:center_x+tower_width//2] = [100, 80, 60]
            
        elif class_id == 1:  # Pump_Jack
            # Curved beam structure
            center_x, center_y = 80, 112
            img[center_y-10:center_y+10, center_x:center_x+80] = [120, 100, 80]
            
        elif class_id == 2:  # Storage_Tank
            # Cylindrical tank
            center_x, center_y = 112, 140
            y, x = np.ogrid[:224, :224]
            mask = (x - center_x)**2 + (y - center_y)**2 <= 50**2
            img[mask] = [140, 140, 120]
            
        elif class_id == 3:  # Pipeline_Valve
            # Horizontal pipe with valve
            img[100:124, 50:174] = [100, 100, 120]  # Pipe
            img[90:134, 100:124] = [80, 80, 100]    # Valve
            
        elif class_id == 4:  # Compressor_Station
            # Rectangular building structure
            img[80:160, 60:164] = [120, 120, 100]
            
        elif class_id == 5:  # Flare_Stack
            # Tall vertical stack
            img[20:180, 108:116] = [80, 80, 80]
            # Flame at top
            img[20:40, 104:120] = [255, 100, 0]
            
        elif class_id == 6:  # Separator_Vessel
            # Horizontal cylindrical vessel
            img[90:134, 40:184] = [130, 130, 110]
            
        elif class_id == 7:  # Control_Panel
            # Rectangular panel with indicators
            img[70:154, 70:154] = [60, 60, 80]
            # Add some indicator lights
            img[90:100, 90:100] = [0, 255, 0]  # Green light
            img[90:100, 120:130] = [255, 0, 0]  # Red light
        
        # Add noise and variations
        noise = np.random.normal(0, 10, img.shape)
        img = np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)
        
        return img
    
    def __len__(self):
        return len(self.images)
    
    def __getitem__(self, idx):
        image = self.images[idx]
        label = self.labels[idx]
        
        # Convert to PIL Image for transforms
        image = Image.fromarray(image)
        
        if self.transform:
            image = self.transform(image)
        
        return image, label

# Create transforms (ImageNet normalization)
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

val_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Create dataset
full_dataset = SyntheticEquipmentDataset(model_manager, samples_per_class=100)

# Split dataset
total_size = len(full_dataset)
train_size = int(0.7 * total_size)
val_size = int(0.15 * total_size)
test_size = total_size - train_size - val_size

train_dataset, val_dataset, test_dataset = random_split(
    full_dataset, [train_size, val_size, test_size],
    generator=torch.Generator().manual_seed(42)
)

# Apply transforms
train_dataset.dataset.transform = train_transform
val_dataset.dataset.transform = val_transform
test_dataset.dataset.transform = val_transform

print(f'✓ Dataset created and split:')
print(f'• Training: {len(train_dataset)} samples')
print(f'• Validation: {len(val_dataset)} samples')
print(f'• Test: {len(test_dataset)} samples')
print(f'• Classes: {list(model_manager.class_names.values())}')

Generating synthetic equipment images...
✓ Generated 800 synthetic images
✓ Dataset created and split:
• Training: 560 samples
• Validation: 120 samples
• Test: 120 samples
• Classes: ['Drilling_Rig', 'Pump_Jack', 'Storage_Tank', 'Pipeline_Valve', 'Compressor_Station', 'Flare_Stack', 'Separator_Vessel', 'Control_Panel']


## Summary & Validation

This is a simplified version of Experiment 7 for testing. The complete implementation would include model training, comparison, and deployment optimization.

**Key Components Demonstrated:**

- Transfer learning theory and pre-trained model advantages
- Multiple architecture support (ResNet, EfficientNet, MobileNet, ViT)
- Different fine-tuning strategies (feature extraction, fine-tuning, full training)
- Synthetic oil & gas equipment image generation
- ImageNet normalization and data augmentation

**Next Steps:**

- Implement training loops for different transfer learning strategies
- Add model comparison and performance benchmarking
- Include efficiency analysis (inference time, model size, accuracy)
- Implement model optimization for edge deployment
- Add comprehensive evaluation and visualization tools
