# Computer Vision for Scientific Research

**DOST-ITDI AI Training Workshop**  
**Day 1-2 Bridge Session: From Molecules to Images**

---

## Learning Objectives
1. Understand images as numerical data
2. Apply basic image processing techniques
3. Implement image augmentation strategies
4. Build and train Convolutional Neural Networks (CNNs)
5. Apply transfer learning to scientific images
6. Connect image analysis to molecular structure analysis

## Why Computer Vision for Chemistry?
- Analyze microscopy images (crystals, cells, materials)
- Classify chemical structures from images
- Quality control in manufacturing
- Automated lab equipment reading
- Document digitization and OCR
- Molecular visualization and analysis

**Key Insight**: Images are just another type of numerical data, like the molecular descriptors we've been using!

## Section 1: Images as Data

### 1.1 Setup and Installation

In [None]:
# Install required libraries
!pip install opencv-python pillow scikit-image torch torchvision rdkit -q

print("[OK] Libraries installed successfully!")

In [None]:
# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from PIL import Image
import cv2
from skimage import filters, feature, exposure
import warnings
warnings.filterwarnings('ignore')

# Deep Learning
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import transforms, models
from torch.utils.data import Dataset, DataLoader

# Chemistry
from rdkit import Chem
from rdkit.Chem import Draw

# Scikit-learn
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Set seeds for reproducibility
np.random.seed(42)
torch.manual_seed(42)

# Plotting style
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)

print("[OK] All libraries imported!")

### 1.2 Load and Display Images

In [None]:
# Create a sample molecular structure image
# Using Aspirin - a molecule we're familiar with!
mol = Chem.MolFromSmiles('CC(=O)Oc1ccccc1C(=O)O')  # Aspirin
img = Draw.MolToImage(mol, size=(400, 400))

# Display
plt.figure(figsize=(8, 8))
plt.imshow(img)
plt.title('Aspirin Molecular Structure', fontsize=14, fontweight='bold')
plt.axis('off')
plt.show()

print(f"Image type: {type(img)}")
print(f"Image size: {img.size}")
print(f"Image mode: {img.mode}")

### 1.3 Images as Numpy Arrays

In [None]:
# Convert to numpy array
img_array = np.array(img)

print(f"Array shape: {img_array.shape}")  # (height, width, channels)
print(f"Data type: {img_array.dtype}")
print(f"Value range: [{img_array.min()}, {img_array.max()}]")

# Visualize RGB channels
fig, axes = plt.subplots(1, 4, figsize=(16, 4))

# Original
axes[0].imshow(img_array)
axes[0].set_title('Original Image')
axes[0].axis('off')

# Red channel
axes[1].imshow(img_array[:,:,0], cmap='Reds')
axes[1].set_title('Red Channel')
axes[1].axis('off')

# Green channel
axes[2].imshow(img_array[:,:,1], cmap='Greens')
axes[2].set_title('Green Channel')
axes[2].axis('off')

# Blue channel
axes[3].imshow(img_array[:,:,2], cmap='Blues')
axes[3].set_title('Blue Channel')
axes[3].axis('off')

plt.tight_layout()
plt.show()

print("\n[KEY INSIGHT] Images are just 3D arrays of numbers!")
print("Similar to how molecules are arrays of descriptors.")

### 1.4 Grayscale Conversion

In [None]:
# Convert to grayscale
img_gray = np.array(img.convert('L'))

print(f"Grayscale shape: {img_gray.shape}")  # (height, width) - no channels!
print(f"Values range: [{img_gray.min()}, {img_gray.max()}]")

# Display comparison
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

axes[0].imshow(img_array)
axes[0].set_title('RGB Image (3 channels)')
axes[0].axis('off')

axes[1].imshow(img_gray, cmap='gray')
axes[1].set_title('Grayscale Image (1 channel)')
axes[1].axis('off')

plt.tight_layout()
plt.show()

print("[INFO] Grayscale simplifies processing - one number per pixel instead of three!")
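
For reference, Pillow documents `convert('L')` as using the ITU-R 601-2 luma transform, `L = R * 299/1000 + G * 587/1000 + B * 114/1000`. The sketch below applies the same weights with NumPy to a made-up 2x2 image (the pixel values are illustrative and not taken from the notebook); Pillow's result may differ by a rounding step.

```python
import numpy as np

# Hypothetical 2x2 RGB image: red, green, blue, and white pixels
rgb = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

# ITU-R 601-2 luma weights for the R, G, and B channels
weights = np.array([0.299, 0.587, 0.114])

# Weighted sum over the channel axis: (H, W, 3) -> (H, W)
gray = rgb.astype(np.float64) @ weights

print(gray.shape)
print(np.round(gray, 1))
```

Green contributes most and blue least, so a pure green pixel ends up brighter in grayscale than a pure red or pure blue one.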

### 1.5 Generate Dataset of Molecular Images

In [None]:
# Generate images for molecules from ESOL dataset
# These are the same molecules we used in notebooks 01, 02, and 04!

url = "https://raw.githubusercontent.com/deepchem/deepchem/master/datasets/delaney-processed.csv"
df = pd.read_csv(url)

print(f"Loaded {len(df)} molecules from ESOL dataset")
print("These are the same molecules we used in previous notebooks!")
print(f"\nDataset columns: {df.columns.tolist()}")

# Generate images for first 100 molecules
n_samples = 100
molecular_images = []
molecular_labels = []

for idx in range(n_samples):
    row = df.iloc[idx]
    mol = Chem.MolFromSmiles(row['smiles'])

    if mol is not None:
        # Generate image
        img = Draw.MolToImage(mol, size=(200, 200))
        img_array = np.array(img.convert('L'))  # Grayscale
        molecular_images.append(img_array)

        # Create label: molecules with rings (1) vs. without rings (0)
        # Ask RDKit for the ring count; counting lowercase 'c' in the SMILES
        # would only catch aromatic carbons and miss rings such as cyclohexane
        has_ring = mol.GetRingInfo().NumRings() > 0
        label = 1 if has_ring else 0
        molecular_labels.append(label)

molecular_images = np.array(molecular_images)
molecular_labels = np.array(molecular_labels)

print(f"\nGenerated {len(molecular_images)} molecular structure images")
print(f"Image array shape: {molecular_images.shape}")
print(f"Labels: {len(molecular_labels)}")
print(f"\nLabel distribution:")
print(f"  Cyclic (with rings): {(molecular_labels == 1).sum()}")
print(f"  Acyclic (no rings): {(molecular_labels == 0).sum()}")
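
The "Cyclic" versus "Acyclic" label depends on how rings are detected. As a sketch of the difference between "contains an aromatic atom" and "contains any ring", the two helper functions below (hypothetical names, plain string heuristics, not RDKit's ring perception) are compared on three hand-written SMILES strings.

```python
import re

def strip_brackets(smiles: str) -> str:
    """Remove bracket atoms such as [13CH4] so their digits and letters are ignored."""
    return re.sub(r'\[[^\]]*\]', '', smiles)

def has_aromatic_carbon(smiles: str) -> bool:
    """True if a lowercase (aromatic) carbon appears outside brackets."""
    return 'c' in strip_brackets(smiles)

def has_ring_closure(smiles: str) -> bool:
    """Heuristic: digits or '%' outside brackets mark ring-closure bonds."""
    return bool(re.search(r'[0-9%]', strip_brackets(smiles)))

examples = {
    'benzene': 'c1ccccc1',
    'cyclohexane': 'C1CCCCC1',
    'hexane': 'CCCCCC',
}
for name, smi in examples.items():
    print(name, has_aromatic_carbon(smi), has_ring_closure(smi))
```

Cyclohexane has a ring but no aromatic carbon, so the two checks disagree on it. For real work, RDKit's `mol.GetRingInfo().NumRings()` is the more reliable test.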

### 1.6 Visualize Dataset

In [None]:
# Display sample molecules
fig, axes = plt.subplots(3, 6, figsize=(15, 8))
axes = axes.ravel()

for idx in range(18):
    axes[idx].imshow(molecular_images[idx], cmap='gray')
    label_text = 'Cyclic' if molecular_labels[idx] == 1 else 'Acyclic'
    axes[idx].set_title(f'{label_text}', fontsize=9)
    axes[idx].axis('off')

plt.suptitle('Sample Molecular Structure Images', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

print("[INFO] These are visual representations of molecules we've analyzed before!")
print("We predicted their properties from SMILES and descriptors.")
print("Now we'll use their visual structure directly.")

## Section 2: Image Processing & Enhancement

### 2.1 Edge Detection - Sobel

In [None]:
# Edge detection finds boundaries (like molecular bonds!)
test_img = molecular_images[10]

# Apply Sobel edge detection
edges_sobel = filters.sobel(test_img)

# Display
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

axes[0].imshow(test_img, cmap='gray')
axes[0].set_title('Original Molecular Structure')
axes[0].axis('off')

axes[1].imshow(edges_sobel, cmap='gray')
axes[1].set_title('Sobel Edge Detection')
axes[1].axis('off')

axes[2].imshow(test_img, cmap='gray', alpha=0.7)
axes[2].imshow(edges_sobel, cmap='Reds', alpha=0.5)
axes[2].set_title('Overlay: Edges Highlighted')
axes[2].axis('off')

plt.tight_layout()
plt.show()

print("[INFO] Edge detection highlights molecular bonds and structure!")
print("Similar to how we extracted structural features (rings, bonds) from SMILES.")
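
To see what a Sobel filter computes, the sketch below slides the two 3x3 Sobel kernels over a toy 6x6 image by hand. It uses unnormalized kernels and no padding, so the absolute values will not match `filters.sobel`; the function name `filter3x3` and the toy image are illustrative assumptions.

```python
import numpy as np

# Sobel kernels: KX responds to left-right changes, KY to top-bottom changes
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def filter3x3(image, kernel):
    """Slide a 3x3 kernel over the image (cross-correlation, no padding)."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(image[i:i + 3, j:j + 3] * kernel)
    return out

# Toy image: dark left half, bright right half (a single vertical edge)
toy = np.zeros((6, 6))
toy[:, 3:] = 1.0

gx = filter3x3(toy, KX)
gy = filter3x3(toy, KY)
magnitude = np.hypot(gx, gy)   # gradient strength at each position
print(magnitude)
```

The response is nonzero only in the columns where the window straddles the dark-to-bright boundary, which is exactly where the edge lies.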

### 2.2 Edge Detection - Canny

In [None]:
# Canny edge detection (more sophisticated)
edges_canny = feature.canny(test_img/255.0, sigma=2)

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

axes[0].imshow(edges_sobel, cmap='gray')
axes[0].set_title('Sobel Edge Detection')
axes[0].axis('off')

axes[1].imshow(edges_canny, cmap='gray')
axes[1].set_title('Canny Edge Detection')
axes[1].axis('off')

plt.suptitle('Comparing Edge Detection Methods', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

print("[INFO] Canny produces cleaner edges - useful for identifying molecular features!")

### 2.3 Image Enhancement - Contrast

In [None]:
# Enhance contrast to make features more visible

# Create low-contrast version by squeezing the full 0-255 range into 50-200
low_contrast = exposure.rescale_intensity(
    test_img, in_range=(0, 255), out_range=(50, 200)
).astype(np.uint8)

# Enhance with different methods
eq_hist = exposure.equalize_hist(low_contrast)
eq_adaptive = exposure.equalize_adapthist(low_contrast)

# Display
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

axes[0,0].imshow(test_img, cmap='gray')
axes[0,0].set_title('Original')
axes[0,0].axis('off')

# Fix the display range; otherwise imshow rescales and hides the lost contrast
axes[0,1].imshow(low_contrast, cmap='gray', vmin=0, vmax=255)
axes[0,1].set_title('Low Contrast')
axes[0,1].axis('off')

axes[1,0].imshow(eq_hist, cmap='gray')
axes[1,0].set_title('Histogram Equalization')
axes[1,0].axis('off')

axes[1,1].imshow(eq_adaptive, cmap='gray')
axes[1,1].set_title('Adaptive Histogram Equalization')
axes[1,1].axis('off')

plt.suptitle('Image Enhancement Techniques', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

print("[INFO] Contrast enhancement makes molecular features more visible!")
print("Like feature scaling in ML - brings out important patterns.")
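
Histogram equalization can be sketched in a few lines: each gray level is replaced by the fraction of pixels at or below it (the normalized cumulative histogram). The function `equalize` and the random toy image below are illustrative assumptions in the spirit of `exposure.equalize_hist`, not a copy of its implementation.

```python
import numpy as np

def equalize(image_u8):
    """Map each gray level through the normalized cumulative histogram."""
    hist = np.bincount(image_u8.ravel(), minlength=256)
    cdf = hist.cumsum() / image_u8.size   # fraction of pixels <= each level
    return cdf[image_u8]                  # output values lie in (0, 1]

# Toy low-contrast image: all values squeezed into 100..139
rng = np.random.default_rng(0)
low = rng.integers(100, 140, size=(50, 50)).astype(np.uint8)

eq = equalize(low)
print("input range: ", low.min(), low.max())
print("output range:", eq.min(), eq.max())
```

The narrow band of input values is spread across almost the full output range, which is why faint features become easier to see.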

## Section 3: Image Augmentation

### 3.1 Introduction to Image Augmentation

Remember data augmentation from Classification (SMOTE, noise, mixup)?  
**Same principle for images!**

#### Why Augment Images?
- Increase training data
- Improve model robustness
- Prevent overfitting
- Simulate real-world variations

#### Common Transformations:
1. **Geometric:** rotation, flipping, scaling
2. **Color:** brightness, contrast, saturation
3. **Noise:** Gaussian, salt-and-pepper
4. **Cropping and padding**

**Key Insight**: These simulate how the same molecule might appear in different conditions:
- Different orientations
- Different lighting
- Different image quality
- Different scales
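
As a minimal sketch of the geometric and noise transformations listed above, the NumPy calls below act on a toy 4x4 array (an assumption for illustration; it is not one of the molecular images) so the effect of each operation is easy to trace.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy 4x4 grayscale "image": white background, one dark pixel at top-left
img = np.full((4, 4), 255, dtype=np.uint8)
img[0, 0] = 0

flipped_h = np.fliplr(img)    # mirror left-right
flipped_v = np.flipud(img)    # mirror top-bottom
rotated_90 = np.rot90(img)    # rotate 90 degrees counter-clockwise

# Additive Gaussian noise, clipped back to the valid 0..255 range
noise = rng.normal(loc=0.0, scale=10.0, size=img.shape)
noisy = np.clip(img.astype(np.float64) + noise, 0, 255).astype(np.uint8)

print(flipped_h)
print(rotated_90)
print(noisy)
```

The flips and the rotation only move pixels around, so the content is unchanged; the noise step alters pixel values but keeps them in range.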

### 3.2 Geometric Transformations

In [None]:
# Define transformations
transform_rotate = transforms.RandomRotation(degrees=30)
transform_hflip = transforms.RandomHorizontalFlip(p=1.0)
transform_vflip = transforms.RandomVerticalFlip(p=1.0)
transform_affine = transforms.RandomAffine(degrees=0, translate=(0.1, 0.1))

# Convert to PIL Image
test_pil = Image.fromarray(test_img)

# Apply transformations
rotated = transform_rotate(test_pil)
hflipped = transform_hflip(test_pil)
vflipped = transform_vflip(test_pil)
affine = transform_affine(test_pil)

# Display
fig, axes = plt.subplots(2, 3, figsize=(15, 10))

axes[0,0].imshow(test_img, cmap='gray')
axes[0,0].set_title('Original')
axes[0,0].axis('off')

axes[0,1].imshow(rotated, cmap='gray')
axes[0,1].set_title('Rotated (+/- 30 degrees)')
axes[0,1].axis('off')

axes[0,2].imshow(hflipped, cmap='gray')
axes[0,2].set_title('Horizontal Flip')
axes[0,2].axis('off')

axes[1,0].imshow(vflipped, cmap='gray')
axes[1,0].set_title('Vertical Flip')
axes[1,0].axis('off')

axes[1,1].imshow(affine, cmap='gray')
axes[1,1].set_title('Translation (shifted)')
axes[1,1].axis('off')

axes[1,2].axis('off')

plt.suptitle('Geometric Augmentations', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

print("[INFO] Geometric transformations create different views of the same molecule!")
print("The molecular structure is the same, just oriented differently.")

### 3.3 Complete Augmentation Pipeline

In [None]:
# Create comprehensive augmentation pipeline
augmentation_pipeline = transforms.Compose([
    transforms.RandomRotation(degrees=15),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),
])

# Generate augmented versions
n_augmentations = 8
augmented_samples = []

for i in range(n_augmentations):
    aug_img = augmentation_pipeline(test_pil)
    augmented_samples.append(aug_img)

# Display original + augmentations
fig, axes = plt.subplots(3, 3, figsize=(12, 12))
axes = axes.ravel()

axes[0].imshow(test_pil, cmap='gray')
axes[0].set_title('Original', fontsize=11, fontweight='bold')
axes[0].axis('off')

for idx in range(n_augmentations):
    axes[idx+1].imshow(augmented_samples[idx], cmap='gray')
    axes[idx+1].set_title(f'Augmented {idx+1}', fontsize=10)
    axes[idx+1].axis('off')

plt.suptitle('Augmentation Pipeline: 1 Image -> 8 Variations',
             fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

print(f"From 1 original image, created {n_augmentations} augmented versions!")
print(f"Dataset size increased by {n_augmentations+1}x")
print("\nEach augmentation maintains the molecular structure")
print("but presents it from different perspectives.")
print("\n[CONNECTION] Remember molecular augmentation (SMOTE, noise)?")
print("Same principle - expand dataset to improve model performance!")

## Section 4: Convolutional Neural Networks (CNNs)

### 4.1 What are CNNs?

Neural networks designed specifically for images.

#### Key Difference from Regular Neural Networks:
- **Regular NN:** Treats image as flat array (loses spatial structure)
- **CNN:** Preserves spatial relationships between pixels

#### Layers in a CNN:
1. **Convolutional Layer:** Learns local patterns (edges, textures)
2. **Pooling Layer:** Reduces size, keeps important features
3. **Fully Connected Layer:** Makes final classification

#### Why CNNs for Images?
- Automatically learn features (edges, shapes, objects)
- Parameter efficient (shared weights)
- Translation equivariant: shared filters find a pattern anywhere in the image, and pooling adds some tolerance to small shifts

**Connection**: Like how neural networks learned molecular patterns automatically,  
CNNs learn visual patterns automatically!
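
The ideas of shared weights and pooling can be sketched without a deep learning library. In the toy example below (function names and the 7x7 images are illustrative assumptions), one small filter is slid over two images containing the same pattern at different positions; the peak response moves with the pattern, and 2x2 max pooling then shrinks the feature map.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Cross-correlate an image with a kernel, no padding (bias omitted)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(fmap):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h, w = fmap.shape
    trimmed = fmap[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# A 2x2 diagonal pattern and a filter with the same shape
pattern = np.array([[1.0, 0.0],
                    [0.0, 1.0]])
kernel = pattern.copy()

# The same pattern placed at two different positions
img_a = np.zeros((7, 7))
img_a[1:3, 1:3] = pattern
img_b = np.zeros((7, 7))
img_b[3:5, 4:6] = pattern

resp_a = conv2d_valid(img_a, kernel)
resp_b = conv2d_valid(img_b, kernel)
peak_a = np.unravel_index(resp_a.argmax(), resp_a.shape)
peak_b = np.unravel_index(resp_b.argmax(), resp_b.shape)
pooled_a = max_pool_2x2(resp_a)

print("peak in image A:", peak_a, "peak in image B:", peak_b)
print("feature map:", resp_a.shape, "-> pooled:", pooled_a.shape)
```

The same four weights detect the pattern in both images, which is the parameter saving that weight sharing provides.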

### 4.2 Simple CNN Architecture

In [None]:
# Define a simple CNN for molecular image classification
class SimpleMolecularCNN(nn.Module):
    def __init__(self, num_classes=2):
        super(SimpleMolecularCNN, self).__init__()

        # Convolutional layers
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)  # 1 channel (grayscale) -> 16
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(2, 2)  # Reduce size by half

        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)  # 16 -> 32 channels
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(2, 2)

        self.conv3 = nn.Conv2d(32, 64, kernel_size=3, padding=1)  # 32 -> 64 channels
        self.relu3 = nn.ReLU()
        self.pool3 = nn.MaxPool2d(2, 2)

        # Fully connected layers
        # After 3 pooling layers: 200/2/2/2 = 25
        self.fc1 = nn.Linear(64 * 25 * 25, 128)
        self.relu4 = nn.ReLU()
        self.dropout = nn.Dropout(0.5)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x):
        # Convolutional layers
        x = self.pool1(self.relu1(self.conv1(x)))
        x = self.pool2(self.relu2(self.conv2(x)))
        x = self.pool3(self.relu3(self.conv3(x)))

        # Flatten
        x = x.view(x.size(0), -1)

        # Fully connected layers
        x = self.relu4(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)

        return x

# Create model
model = SimpleMolecularCNN(num_classes=2)
print(model)
print(f"\nTotal parameters: {sum(p.numel() for p in model.parameters()):,}")

# Test forward pass
test_input = torch.randn(1, 1, 200, 200)  # Batch=1, Channels=1, H=200, W=200
output = model(test_input)
print(f"\nInput shape: {test_input.shape}")
print(f"Output shape: {output.shape}")
print(f"Output values: {output}")
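
The flattened size `64 * 25 * 25` and the total parameter count can be checked by hand. The sketch below repeats the arithmetic for the layer sizes defined in `SimpleMolecularCNN`; the helper names `conv_params` and `linear_params` are illustrative.

```python
def conv_params(in_ch, out_ch, k):
    """Weights (in_ch * out_ch * k * k) plus one bias per output channel."""
    return in_ch * out_ch * k * k + out_ch

def linear_params(in_features, out_features):
    """Weight matrix plus one bias per output unit."""
    return in_features * out_features + out_features

size = 200                  # input height and width
for _ in range(3):          # three 2x2 max-pool layers, each halves the size
    size //= 2              # the 3x3 convs with padding=1 keep the size
flat = 64 * size * size     # channels * height * width after pool3

counts = {
    'conv1': conv_params(1, 16, 3),
    'conv2': conv_params(16, 32, 3),
    'conv3': conv_params(32, 64, 3),
    'fc1': linear_params(flat, 128),
    'fc2': linear_params(128, 2),
}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name}: {n:,}")
print(f"total: {total:,}")
```

Almost all parameters sit in `fc1`, so the spatial size reaching the flatten step dominates the size of the model.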

### 4.3 Prepare Data for Training

In [None]:
# Custom Dataset class
class MolecularImageDataset(Dataset):
    """Custom dataset for molecular images"""
    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx]
        label = int(self.labels[idx])  # plain int, so batches collate to int64 as CrossEntropyLoss expects

        # Convert to PIL Image
        image = Image.fromarray(image)

        # Apply transformations
        if self.transform:
            image = self.transform(image)
        else:
            # Default: convert to a float tensor scaled to [0, 1] (no mean/std normalization)
            image = transforms.ToTensor()(image)

        return image, label

# Define transformations
train_transform = transforms.Compose([
    transforms.RandomRotation(15),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])  # Normalize to [-1, 1]
])

test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    molecular_images, molecular_labels, test_size=0.2, random_state=42, stratify=molecular_labels
)

print(f"Training set: {len(X_train)} images")
print(f"Test set: {len(X_test)} images")
print(f"\nTraining class distribution:")
print(f"  Cyclic: {(y_train == 1).sum()}")
print(f"  Acyclic: {(y_train == 0).sum()}")

# Create datasets
train_dataset = MolecularImageDataset(X_train, y_train, transform=train_transform)
test_dataset = MolecularImageDataset(X_test, y_test, transform=test_transform)

# Create dataloaders
batch_size = 16
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

print(f"\nDataLoaders created:")
print(f"  Training batches: {len(train_loader)}")
print(f"  Test batches: {len(test_loader)}")
print(f"  Batch size: {batch_size}")
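
The two preprocessing steps in `test_transform` amount to simple arithmetic on pixel values. The NumPy sketch below mimics them on four sample values (chosen for illustration) and also shows the inverse mapping, which is useful before plotting a normalized image.

```python
import numpy as np

# Sample 8-bit pixel values
pixels = np.array([0, 64, 128, 255], dtype=np.uint8)

# ToTensor step: scale to [0, 1]
scaled = pixels.astype(np.float32) / 255.0

# Normalize(mean=[0.5], std=[0.5]) step: (x - mean) / std, giving [-1, 1]
normalized = (scaled - 0.5) / 0.5

# Inverse mapping back to 8-bit values
restored = np.round((normalized * 0.5 + 0.5) * 255.0).astype(np.uint8)

print(normalized)
print(restored)
```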

### 4.4 Train the CNN

In [None]:
# Training setup
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

model = SimpleMolecularCNN(num_classes=2).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training function
def train_epoch(model, train_loader, criterion, optimizer, device):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0

    for images, labels in train_loader:
        images = images.to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Statistics
        running_loss += loss.item()
        predicted = outputs.argmax(dim=1)  # index of the highest logit per image
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    epoch_loss = running_loss / len(train_loader)
    epoch_acc = 100 * correct / total
    return epoch_loss, epoch_acc

# Validation function
def validate(model, test_loader, criterion, device):
    model.eval()
    running_loss = 0.0
    correct = 0
    total = 0

    with torch.no_grad():
        for images, labels in test_loader:
            images = images.to(device)
            labels = labels.to(device)

            outputs = model(images)
            loss = criterion(outputs, labels)

            running_loss += loss.item()
            predicted = outputs.argmax(dim=1)  # index of the highest logit per image
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    epoch_loss = running_loss / len(test_loader)
    epoch_acc = 100 * correct / total
    return epoch_loss, epoch_acc

# Train model
num_epochs = 20
train_losses = []
train_accs = []
val_losses = []
val_accs = []

print("Training CNN...")
print("="*60)

for epoch in range(num_epochs):
    train_loss, train_acc = train_epoch(model, train_loader, criterion, optimizer, device)
    # The held-out test split doubles as the validation set in this small demo
    val_loss, val_acc = validate(model, test_loader, criterion, device)

    train_losses.append(train_loss)
    train_accs.append(train_acc)
    val_losses.append(val_loss)
    val_accs.append(val_acc)

    if (epoch + 1) % 5 == 0:
        print(f"Epoch [{epoch+1}/{num_epochs}]")
        print(f"  Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.2f}%")
        print(f"  Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.2f}%")

print("\n[OK] Training complete!")
print(f"Final validation accuracy: {val_accs[-1]:.2f}%")
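
`nn.CrossEntropyLoss` combines a log-softmax with the negative log-likelihood of the true class, averaged over the batch by default. The NumPy sketch below reproduces that calculation, and the accuracy bookkeeping from the training loop, on made-up logits for three images; the function name and the numbers are illustrative assumptions.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean negative log-probability of the true class (log-softmax + NLL)."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

# Hypothetical logits for 3 images and 2 classes (0 = Acyclic, 1 = Cyclic)
logits = np.array([[2.0, 0.0],    # confident and correct
                   [0.0, 2.0],    # confident and correct
                   [0.0, 0.0]])   # undecided
targets = np.array([0, 1, 1])

loss = cross_entropy(logits, targets)
predicted = logits.argmax(axis=1)
accuracy = 100 * (predicted == targets).mean()

print(f"loss: {loss:.4f}")
print(f"accuracy: {accuracy:.2f}%")
```

The undecided third image contributes `log(2)` to the loss, far more than the two confident correct predictions, which is what pushes the optimizer to separate the classes.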

### 4.5 Visualize Training Progress

In [None]:
# Plot training curves
fig, axes = plt.subplots(1, 2, figsize=(15, 5))

# Loss
axes[0].plot(train_losses, label='Training Loss', linewidth=2)
axes[0].plot(val_losses, label='Validation Loss', linewidth=2)
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss')
axes[0].set_title('Training and Validation Loss')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Accuracy
axes[1].plot(train_accs, label='Training Accuracy', linewidth=2)
axes[1].plot(val_accs, label='Validation Accuracy', linewidth=2)
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Accuracy (%)')
axes[1].set_title('Training and Validation Accuracy')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Check for overfitting
final_train_acc = train_accs[-1]
final_val_acc = val_accs[-1]
gap = final_train_acc - final_val_acc  # positive when training accuracy is higher

print(f"\nFinal Training Accuracy: {final_train_acc:.2f}%")
print(f"Final Validation Accuracy: {final_val_acc:.2f}%")
print(f"Gap: {gap:.2f}%")

if gap > 15:
    print("\n[!] Large gap suggests overfitting. Try:")
    print("  - More data augmentation")
    print("  - Higher dropout rate")
    print("  - Simpler architecture")
else:
    print("\n[OK] Model appears to be generalizing well!")

### 4.6 Evaluate Model

In [None]:
# Get predictions on test set
model.eval()
all_preds = []
all_labels = []

with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        outputs = model(images)
        predicted = outputs.argmax(dim=1)  # index of the highest logit per image

        all_preds.extend(predicted.cpu().numpy())
        all_labels.extend(labels.numpy())

all_preds = np.array(all_preds)
all_labels = np.array(all_labels)

# Confusion matrix
cm = confusion_matrix(all_labels, all_preds)

plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', cbar=False,
            xticklabels=['Acyclic', 'Cyclic'],
            yticklabels=['Acyclic', 'Cyclic'])
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix: Molecular Structure Classification')
plt.tight_layout()
plt.show()

# Detailed metrics
print("\nClassification Report:")
print("="*60)
print(classification_report(all_labels, all_preds,
                           target_names=['Acyclic', 'Cyclic']))
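
The numbers in the classification report follow directly from the confusion matrix. The sketch below derives precision, recall, F1, and accuracy for the Cyclic class from a hypothetical matrix; the counts are invented for illustration and are not results from this notebook.

```python
import numpy as np

# Hypothetical confusion matrix (rows = actual, columns = predicted),
# in the same [Acyclic, Cyclic] order as the heatmap
cm_example = np.array([[6, 2],
                       [1, 11]])

tn, fp = cm_example[0]   # actual Acyclic: predicted Acyclic, predicted Cyclic
fn, tp = cm_example[1]   # actual Cyclic:  predicted Acyclic, predicted Cyclic

precision = tp / (tp + fp)   # of the predicted Cyclic, how many were right
recall = tp / (tp + fn)      # of the actual Cyclic, how many were found
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / cm_example.sum()

print(f"precision: {precision:.3f}")
print(f"recall:    {recall:.3f}")
print(f"f1:        {f1:.3f}")
print(f"accuracy:  {accuracy:.3f}")
```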

### 5.4 Summary: Understanding How CNNs Learn

**What We Discovered:**

1. **Learned Filters (5.1)**
   - Each filter is a small pattern detector (3×3 pixels)
   - First layer learns basic features: edges, lines, corners
   - These are automatically learned, not hand-coded!

2. **Feature Maps (5.2)**
   - Shows how each filter responds to the input image
   - Earlier layers → Simple features (edges)
   - Deeper layers → Complex features (shapes, structures)
   - Progressive abstraction from pixels to concepts

3. **Filter Preferences (5.3)**
   - Different filters activate for different patterns
   - Filters specialize in detecting specific molecular features
   - This is how the network distinguishes cyclic from acyclic molecules

**Key Insights:**

- **Hierarchical Learning**:
  - Layer 1: Edges and textures
  - Layer 2: Shapes and parts
  - Layer 3: Complex molecular structures
  
- **Automatic Feature Engineering**:
  - No manual feature extraction needed
  - Network learns what's important for the task
  - Compare with: Manual RDKit descriptors (LogP, MW, etc.)

- **Interpretability**:
  - We can peek inside the "black box"
  - Understand what the network learned
  - Debug and improve architecture

**Connection to Other Concepts:**

| Concept | Manual Approach | CNN Approach |
|---------|----------------|--------------|
| **Feature Extraction** | RDKit descriptors | Learned filters |
| **Feature Selection** | Correlation analysis | Network learns importance |
| **Hierarchy** | Expert knowledge | Automatic layer-by-layer |
| **Interpretability** | Clear feature names | Visualize activations |

**Why This Matters:**

Understanding how CNNs learn helps us:
- Design better architectures
- Diagnose model failures
- Build trust in predictions
- Combine with domain knowledge

**Next Steps:**

If you want to go deeper:
- Try visualizing filters from conv2 and conv3
- Use Grad-CAM to see which image regions matter most
- Apply transfer learning with pretrained networks
- Combine CNN features with RDKit descriptors

## Summary

### What We Learned:

1. **Images as Data**
   - Images are numerical arrays (height × width × channels)
   - Can be processed like any other ML data
   - Generated molecular structure images from familiar ESOL dataset

2. **Image Processing**
   - Edge detection: Find molecular structures
   - Enhancement: Improve image quality
   - Preprocessing improves ML performance

3. **Image Augmentation**
   - Geometric: rotation, flipping, translation
   - **Connection**: Same principle as molecular augmentation!
   - Expands dataset, improves robustness

4. **Convolutional Neural Networks**
   - Automatic feature learning
   - Preserve spatial structure
   - Hierarchical pattern recognition
   - Trained CNN to classify molecular structures

### Key Takeaways:

1. **Multi-Modal Learning**: Same molecules, different representations
   - SMILES notation → Molecular descriptors → Images
   - Each modality captures different aspects

2. **Augmentation Everywhere**:
   - Molecular features: noise, SMOTE, mixup
   - Images: rotation, brightness, noise
   - **Universal principle**: Expand small datasets

3. **Automatic vs Manual Features**:
   - Manual: RDKit descriptors, image filters
   - Automatic: Neural networks, CNNs
   - Trade-off: interpretability vs performance

### Connections to Other Notebooks:

| Notebook | Connection |
|----------|------------|
| **01_EDA** | Same ESOL molecules, visualized as images |
| **02_Regression** | Predicted properties from descriptors |
| **03_Classification** | Same augmentation principles (SMOTE, noise) |
| **04_PyTorch** | Same training loops, Dataset classes |
| **04b_HuggingFace** | Transfer learning concept (ChemBERTa) |
| **05_LLMs** | Image generation completes the workflow! |

### Resources:

- [OpenCV](https://opencv.org/)
- [scikit-image](https://scikit-image.org/)
- [PyTorch Vision](https://pytorch.org/vision/)
- [CS231n](http://cs231n.stanford.edu/) - Stanford CV course

---

**Great job! You now understand computer vision fundamentals for scientific research!**