# GNN Molecular Toxicity Prediction - Training & Grid Search

This notebook demonstrates training GIN models for molecular toxicity prediction using both single experiments and hyperparameter grid search.

## Features

- **Google Colab support**: automatic setup and dependency installation
- **Single model training**: configurable hyperparameters and data splitting
- **Grid search**: systematic hyperparameter optimization
- **Comprehensive analysis**: training curves, model comparison, and results visualization

---

## 1. Environment Setup (Google Colab)

Run the following cell to set up the environment in Google Colab:

In [None]:
import os
import shutil
import sys

# Detect whether the notebook is running in Google Colab or locally
try:
    import google.colab
    IN_COLAB = True
    print("Running in Google Colab...")
except ImportError:
    IN_COLAB = False
    print("Running in local environment...")

if IN_COLAB:
    # 1. Remove existing folder if it already exists
    repo_name = 'gnn-molecule-prediction'
    repo_path = os.path.join('/content', repo_name)

    if os.path.exists(repo_path):
        shutil.rmtree(repo_path)
        print(f"Removed existing directory: {repo_path}")
    else:
        print("No existing directory found.")

    # 2. Clone the GitHub repository
    %cd /content
    !git clone https://github.com/sth-s/gnn-molecule-prediction.git

    # 3. Change working directory to the project root
    %cd gnn-molecule-prediction

    # 4. Install dependencies via pip
    !pip install torch torch-geometric scikit-learn matplotlib seaborn deepchem
    !pip install rdkit  # For molecular processing ("rdkit-pypi" is the legacy package name)
    
    print("\nDependencies installed successfully!")
    print(f"Working directory: {os.getcwd()}")
    
else:
    # For local development, just change to the parent directory
    if os.path.basename(os.getcwd()) == 'notebooks':
        os.chdir('../')
    print(f"Working directory: {os.getcwd()}")

# Verify the setup
print("\n=== Environment Verification ===")
print(f"Python version: {sys.version}")
print(f"Current directory: {os.getcwd()}")
print(f"Directory contents: {os.listdir('.')}")

## 2. Import Libraries and Setup

In [None]:
# Core libraries
import torch
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import json
import subprocess
import time
from datetime import datetime
from pathlib import Path

# Set style for plots
sns.set_theme(style="whitegrid")
sns.set_palette('muted')
plt.rcParams['figure.figsize'] = (12, 8)

# Check device availability
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")

# Set random seeds for reproducibility
torch.manual_seed(42)
np.random.seed(42)

print("\n=== Setup Complete ===")

---

## 3. Basic Model Training

Let's start with basic training examples using the standalone script:

# Training GIN Model with Command-Line Script

This section demonstrates how to use the standalone training script `scripts/train.py` for molecular toxicity prediction. The script provides a command-line interface with configurable hyperparameters and data splitting methods.

## Features

- **Multiple data splitting methods**: random, scaffold-based, and index-based
- **Configurable hyperparameters**: learning rate, batch size, model architecture, etc.
- **Automatic model saving**: saves best model and optional checkpoints
- **Comprehensive logging**: detailed training progress and results
- **Reproducible results**: configurable random seed

## Basic Usage

Let's start with a basic training run using default parameters:

In [None]:
# Run basic training with default parameters
# (the setup cell leaves the working directory at the project root)
!python scripts/train.py --epochs 50 --exp_name "basic_demo"

## View Available Options

Let's see all available command-line options:

In [None]:
# View help information
!python scripts/train.py --help

## Custom Hyperparameters

Now let's try different hyperparameters:

In [None]:
# Training with custom hyperparameters
!python scripts/train.py \
    --epochs 100 \
    --lr 5e-4 \
    --hidden_channels 128 \
    --num_layers 4 \
    --dropout 0.3 \
    --batch_size 64 \
    --exp_name "custom_hyperparams"
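
The same run can also be launched from Python, which is convenient when looping over several configurations. The sketch below is only an illustration: `build_train_command` is a hypothetical helper (not part of the repository), it assumes the working directory is the project root, and it uses only the flags shown in the cell above.

```python
import shlex
import sys

def build_train_command(params, script="scripts/train.py"):
    """Turn a dict of hyperparameters into an argv list for subprocess.run."""
    cmd = [sys.executable, script]
    for name, value in params.items():
        # Each dict entry becomes "--name value"
        cmd += [f"--{name}", str(value)]
    return cmd

# Same settings as the shell command above
params = {
    "epochs": 100,
    "lr": 5e-4,
    "hidden_channels": 128,
    "num_layers": 4,
    "dropout": 0.3,
    "batch_size": 64,
    "exp_name": "custom_hyperparams",
}

cmd = build_train_command(params)
# Show the command (without the interpreter path) before running it
print(shlex.join(cmd[1:]))
```

To actually start training, pass the list to `subprocess.run(cmd, check=True)`.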

## Different Data Splitting Methods

### Scaffold-based Split
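This method groups molecules by their Murcko scaffolds and keeps each group within a single split. Because the test set then contains scaffolds the model has not seen, the resulting metrics give a more realistic (and usually lower) estimate of generalization than a random split: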

In [None]:
# Scaffold-based splitting
!python scripts/train.py \
    --epochs 75 \
    --split_method scaffold \
    --exp_name "scaffold_split_demo"
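
To make the idea concrete, here is a minimal sketch of a greedy scaffold split. It assumes the scaffold of every molecule has already been computed (for example as a scaffold SMILES string) and is passed in as a plain list. The function name, the largest-group-first heuristic, and the toy scaffold labels are illustrative assumptions. The actual logic in `scripts/train.py` may differ.

```python
from collections import defaultdict

def scaffold_split(scaffolds, train_ratio=0.8, val_ratio=0.1):
    """Assign whole scaffold groups to train/val/test, largest groups first."""
    groups = defaultdict(list)
    for idx, scaffold in enumerate(scaffolds):
        groups[scaffold].append(idx)

    # Largest groups first; ties broken by first index so the result is deterministic
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    n = len(scaffolds)
    train_cap = round(train_ratio * n)
    val_cap = round((train_ratio + val_ratio) * n)

    train, val, test = [], [], []
    for group in ordered:
        if len(train) + len(group) <= train_cap:
            train += group
        elif len(train) + len(val) + len(group) <= val_cap:
            val += group
        else:
            test += group
    return train, val, test

# Hypothetical scaffold labels for ten molecules
scaffolds = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "D"]
train, val, test = scaffold_split(scaffolds, train_ratio=0.7, val_ratio=0.2)
print(train, val, test)
```

No scaffold label appears in more than one split, which is the property that distinguishes this from a random split.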

### Index-based Split
This method splits based on molecule indices (deterministic):

In [None]:
# Index-based splitting
!python scripts/train.py \
    --epochs 75 \
    --split_method index \
    --exp_name "index_split_demo"
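
An index-based split can be sketched as slicing the dataset into contiguous ranges. This is an assumption about what "index-based" means here, and `index_split` is a hypothetical helper, not a function from the repository.

```python
def index_split(n, train_ratio=0.8, val_ratio=0.1):
    """Split indices 0..n-1 into contiguous train/val/test ranges."""
    if not (0 < train_ratio < 1 and 0 <= val_ratio < 1):
        raise ValueError("ratios must lie in [0, 1)")
    if train_ratio + val_ratio > 1:
        raise ValueError("train_ratio + val_ratio must not exceed 1")

    # round() avoids off-by-one errors from floating-point products
    train_end = round(train_ratio * n)
    val_end = round((train_ratio + val_ratio) * n)
    return range(0, train_end), range(train_end, val_end), range(val_end, n)

train_idx, val_idx, test_idx = index_split(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because no shuffling is involved, the same call always yields the same split, which is what makes this method deterministic.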

## Custom Split Ratios

You can also customize the train/validation/test split ratios:

In [None]:
# Custom split ratios (80/10/10)
!python scripts/train.py \
    --epochs 50 \
    --train_ratio 0.8 \
    --val_ratio 0.1 \
    --test_ratio 0.1 \
    --exp_name "custom_split_ratios"

## Viewing Results

Let's explore the saved results from our training runs:

In [None]:
import json
import os
from pathlib import Path

# List all experiment directories
# Relative to the project root, which the setup cell makes the working directory
models_dir = Path("models")
if models_dir.exists():
    experiments = [d.name for d in models_dir.iterdir() if d.is_dir()]
    print("Available experiments:")
    for exp in sorted(experiments):
        print(f"  - {exp}")
else:
    print("No experiments found. Run a training script first.")

## Compare Different Experiments

Let's compare the performance of different experiments:

In [None]:
# Load and display results from the most recent experiment
import matplotlib.pyplot as plt
import numpy as np

def load_experiment_results(exp_name):
    """Load results.json for the named experiment, or return None if it is missing."""
    results_path = models_dir / exp_name / "results.json"
    if results_path.exists():
        with open(results_path, 'r') as f:
            return json.load(f)
    return None

def plot_training_curves(results):
    """Plot training and validation curves."""
    history = results['history']
    epochs = range(1, len(history['train_loss']) + 1)
    
    fig, axes = plt.subplots(2, 2, figsize=(12, 8))
    
    # Loss
    axes[0, 0].plot(epochs, history['train_loss'], label='Train', alpha=0.8)
    axes[0, 0].plot(epochs, history['val_loss'], label='Validation', alpha=0.8)
    axes[0, 0].set_title('Loss')
    axes[0, 0].set_xlabel('Epoch')
    axes[0, 0].set_ylabel('Loss')
    axes[0, 0].legend()
    axes[0, 0].grid(True, alpha=0.3)
    
    # ROC AUC
    axes[0, 1].plot(epochs, history['train_roc_auc'], label='Train', alpha=0.8)
    axes[0, 1].plot(epochs, history['val_roc_auc'], label='Validation', alpha=0.8)
    axes[0, 1].set_title('ROC AUC')
    axes[0, 1].set_xlabel('Epoch')
    axes[0, 1].set_ylabel('ROC AUC')
    axes[0, 1].legend()
    axes[0, 1].grid(True, alpha=0.3)
    
    # PR AUC
    axes[1, 0].plot(epochs, history['train_pr_auc'], label='Train', alpha=0.8)
    axes[1, 0].plot(epochs, history['val_pr_auc'], label='Validation', alpha=0.8)
    axes[1, 0].set_title('PR AUC')
    axes[1, 0].set_xlabel('Epoch')
    axes[1, 0].set_ylabel('PR AUC')
    axes[1, 0].legend()
    axes[1, 0].grid(True, alpha=0.3)
    
    # Learning Rate
    if history['learning_rates']:
        axes[1, 1].plot(history['learning_rates'], alpha=0.8)
        axes[1, 1].set_title('Learning Rate')
        axes[1, 1].set_xlabel('Step')
        axes[1, 1].set_ylabel('Learning Rate')
        axes[1, 1].set_yscale('log')
        axes[1, 1].grid(True, alpha=0.3)
    else:
        axes[1, 1].text(0.5, 0.5, 'No LR history', ha='center', va='center')
        axes[1, 1].set_title('Learning Rate')
    
    plt.tight_layout()
    plt.show()

# Load and plot results from the most recently modified experiment
exp_dirs = [d for d in models_dir.iterdir() if d.is_dir()] if models_dir.exists() else []
if exp_dirs:
    latest_exp = max(exp_dirs, key=lambda d: d.stat().st_mtime).name
    print(f"Loading results from: {latest_exp}")
    
    results = load_experiment_results(latest_exp)
    if results:
        print("\nExperiment Summary:")
        print(f"  Best validation ROC AUC: {results['training_info']['best_val_roc_auc']:.4f} (epoch {results['training_info']['best_epoch']})")
        print(f"  Final test ROC AUC: {results['final_results']['test']['roc_auc']:.4f}")
        print(f"  Final test PR AUC: {results['final_results']['test']['pr_auc']:.4f}")
        print(f"  Training time: {results['training_info']['total_time_minutes']:.1f} min")
        
        # Plot training curves
        plot_training_curves(results)
    else:
        print("Could not load results.")
else:
    print("No experiments available to analyze.")
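
The cell above reads several nested keys from `results.json`, so a missing key shows up as a `KeyError` in the middle of plotting. The following sketch checks for those keys up front. The list of paths is taken from the keys this notebook accesses, `missing_keys` is a hypothetical helper, and the example dictionary holds placeholder values only (not real results).

```python
# Key paths read by the summary and plotting code in this notebook
REQUIRED_KEYS = [
    ("history", "train_loss"),
    ("history", "val_loss"),
    ("history", "train_roc_auc"),
    ("history", "val_roc_auc"),
    ("history", "train_pr_auc"),
    ("history", "val_pr_auc"),
    ("history", "learning_rates"),
    ("training_info", "best_val_roc_auc"),
    ("training_info", "best_epoch"),
    ("training_info", "total_time_minutes"),
    ("final_results", "test", "roc_auc"),
    ("final_results", "test", "pr_auc"),
]

def missing_keys(results, required=REQUIRED_KEYS):
    """Return the dotted paths from `required` that are absent in `results`."""
    missing = []
    for path in required:
        node = results
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing

# Placeholder values that only illustrate the expected shape
example = {
    "history": {
        "train_loss": [0.0], "val_loss": [0.0],
        "train_roc_auc": [0.0], "val_roc_auc": [0.0],
        "train_pr_auc": [0.0], "val_pr_auc": [0.0],
        "learning_rates": [],
    },
    "training_info": {"best_val_roc_auc": 0.0, "best_epoch": 1, "total_time_minutes": 0.0},
    "final_results": {"test": {"roc_auc": 0.0, "pr_auc": 0.0}},
}
print(missing_keys(example))
```

Calling `missing_keys(results)` before `plot_training_curves(results)` gives a readable list of what is absent instead of a traceback.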

In [None]:
# Compare experiment results
def compare_experiments():
    """Compare results from all experiments."""
    if not models_dir.exists():
        print("No experiments found.")
        return
    
    comparison_data = []
    
    for exp_dir in models_dir.iterdir():
        if exp_dir.is_dir():
            results = load_experiment_results(exp_dir.name)
            if results:
                comparison_data.append({
                    'experiment': exp_dir.name,
                    'split_method': results['args']['split_method'],
                    'lr': results['args']['lr'],
                    'hidden_channels': results['args']['hidden_channels'],
                    'num_layers': results['args']['num_layers'],
                    'dropout': results['args']['dropout'],
                    # Same key paths as the experiment summary cell above
                    'best_val_roc': results['training_info']['best_val_roc_auc'],
                    'test_roc': results['final_results']['test']['roc_auc'],
                    'test_pr': results['final_results']['test']['pr_auc'],
                    'training_time': results['training_info']['total_time_minutes']
                })
    
    if comparison_data:
        import pandas as pd
        
        df = pd.DataFrame(comparison_data)
        df = df.sort_values('test_roc', ascending=False)
        
        print("Experiment Comparison (sorted by test ROC AUC):")
        print("=" * 80)
        
        for _, row in df.iterrows():
            print(f"Experiment: {row['experiment']}")
            print(f"  Split: {row['split_method']:8s} | LR: {row['lr']:.1e} | Hidden: {row['hidden_channels']:3d} | Layers: {row['num_layers']} | Dropout: {row['dropout']:.2f}")
            print(f"  Val ROC: {row['best_val_roc']:.4f} | Test ROC: {row['test_roc']:.4f} | Test PR: {row['test_pr']:.4f} | Time: {row['training_time']:.1f} min")
            print()
    else:
        print("No experiment results found.")

compare_experiments()

## Loading Saved Models

You can also load the saved models for inference or further analysis:

In [None]:
import sys
import torch
sys.path.append('src')  # relative to the project root (the working directory after setup)

from model import GIN
from data_utils import load_tox21

def load_trained_model(exp_name, device='cpu'):
    """Load a trained model from an experiment."""
    exp_dir = models_dir / exp_name
    
    # Load configuration
    config_path = exp_dir / "config.json"
    with open(config_path, 'r') as f:
        config = json.load(f)
    
    # Load dataset to get dimensions
    dataset = load_tox21(
        root=config['data_root'],
        filename=config['filename'],
        smiles_col="smiles",
        mol_id_col="mol_id",
        cache_file="data.pt",
        recreate=False,
        auto_download=True,
        device=device
    )
    
    # Create model
    model = GIN(
        in_channels=dataset.num_node_features,
        hidden_channels=config['hidden_channels'],
        num_classes=dataset.num_classes,
        num_layers=config['num_layers'],
        dropout=config['dropout']
    )
    
    # Load weights
    model_path = exp_dir / "best_model.pt"
    model.load_state_dict(torch.load(model_path, map_location=device))
    model.eval()
    
    return model, dataset, config

# Example: load one of the trained models
if 'experiments' in locals() and experiments:
    # Use the first listed experiment as an example
    exp_name = experiments[0]
    print(f"Loading model from experiment: {exp_name}")
    
    try:
        model, dataset, config = load_trained_model(exp_name)
        print(f"Successfully loaded model with {sum(p.numel() for p in model.parameters()):,} parameters")
        print(f"Model configuration: {config['hidden_channels']} hidden channels, {config['num_layers']} layers")
    except Exception as e:
        print(f"Error loading model: {e}")
else:
    print("No experiments available to load.")

## Summary

The `scripts/train.py` script provides a comprehensive command-line interface for training GIN models on molecular toxicity prediction tasks. Key features include:

1. **Flexible data splitting**: Choose between random, scaffold-based, or index-based splitting
2. **Configurable hyperparameters**: Easily adjust learning rate, model architecture, training settings
3. **Automatic model management**: Saves best models and optional checkpoints
4. **Comprehensive logging**: Detailed progress tracking and result saving
5. **Reproducible experiments**: Consistent random seeding and configuration saving

This approach allows for systematic hyperparameter tuning and comparison of different training configurations while maintaining compatibility with the existing notebook-based workflow.

---

## 4. Hyperparameter Grid Search

Now let's perform systematic hyperparameter optimization using our grid search script.

### 4.1 Grid Search Options

First, let's see the available options for grid search:

In [None]:
# View grid search options
!python scripts/hyp_search.py --help

### 4.2 Small Grid Search Example

Let's run a small grid search with just a few experiments to test the system:

In [None]:
# Run a small grid search (limited to 4 experiments for testing)
!python scripts/hyp_search.py \
    --max_experiments 4 \
    --fast_search \
    --output_dir test_grid_search \
    --device auto

### 4.3 Full Grid Search

**Warning**: The full grid search will run 48 experiments.

Uncomment and run the cell below only if you want to perform the complete hyperparameter search:

In [None]:
# Full grid search (48 experiments) - UNCOMMENT TO RUN
# This will take several hours to complete!

# !python scripts/hyp_search.py \
#     --fast_search \
#     --output_dir full_grid_search \
#     --device auto

print("Full grid search is commented out. Uncomment the above lines to run all 48 experiments.")
print("\nGrid search configuration:")
print("- hidden_channels: [32, 128]")
print("- num_layers: [3, 4]")
print("- dropout: [0.2, 0.5]")
print("- batch_size: [32, 64]")
print("- lr: [0.01, 0.005, 0.001]")
print("- Total combinations: 2 × 2 × 2 × 2 × 3 = 48 experiments")
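
The count of 48 follows from taking the Cartesian product of the value lists printed above. A minimal sketch using `itertools.product` is shown below. The `iter_configs` helper and the order in which combinations are generated are assumptions for illustration. `scripts/hyp_search.py` may enumerate them differently.

```python
from itertools import islice, product

# Values copied from the grid search configuration above
grid = {
    "hidden_channels": [32, 128],
    "num_layers": [3, 4],
    "dropout": [0.2, 0.5],
    "batch_size": [32, 64],
    "lr": [0.01, 0.005, 0.001],
}

def iter_configs(grid, max_experiments=None):
    """Yield one dict per hyperparameter combination, optionally capped."""
    names = list(grid)
    combos = (dict(zip(names, values)) for values in product(*grid.values()))
    # islice with stop=None yields every combination
    return islice(combos, max_experiments)

all_configs = list(iter_configs(grid))
print(len(all_configs))  # 48

# A capped run, analogous to --max_experiments 4
first_four = list(iter_configs(grid, max_experiments=4))
```

Adding one more value to any list multiplies the total, so the grid grows quickly: a third `dropout` value would raise the count from 48 to 72.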

### 4.4 Grid Search Results Analysis

After running grid search, let's analyze the results:

In [None]:
def load_and_display_grid_search_results(results_dir="test_grid_search"):
    """
    Load and display grid search results from JSON file.
    """
    results_file = Path(results_dir) / "grid_search_results.json"
    
    if not results_file.exists():
        print(f"Results file not found: {results_file}")
        print("Run the grid search first!")
        return None
    
    with open(results_file, 'r') as f:
        results = json.load(f)
    
    print("=" * 60)
    print("GRID SEARCH RESULTS SUMMARY")
    print("=" * 60)
    
    info = results['grid_search_info']
    print(f"Timestamp: {info['timestamp']}")
    print(f"Total experiments: {info['total_experiments']}")
    print(f"Successful: {info['successful_experiments']}")
    print(f"Failed: {info['failed_experiments']}")
    print(f"Total time: {info['total_time_hours']:.2f} hours ({info['total_time_minutes']:.1f} minutes)")
    print(f"Fast search mode: {info['fast_search_mode']}")
    
    print("\n" + "="*40)
    print("TOP RESULTS:")
    print("="*40)
    
    for i, result in enumerate(results['best_results'][:5], 1):
        hp = result['hyperparameters']
        print(f"{i}. Experiment {result['experiment_id']}:")
        print(f"   ROC AUC: {result['val_roc_auc']:.4f} (test: {result['test_roc_auc']:.4f})")
        print(f"   Config: hidden={hp['hidden_channels']}, layers={hp['num_layers']}, "
              f"dropout={hp['dropout']}, batch={hp['batch_size']}, lr={hp['lr']}")
        print(f"   Time: {result['experiment_time_minutes']:.1f}min, "
              f"Epochs: {result['total_epochs']}/{result.get('planned_epochs', 'N/A')}")
        if result.get('early_stopped', False):
            print(f"   Early stopped at epoch {result['best_epoch']}")
        print()
    
    return results

# Load and display results
results = load_and_display_grid_search_results("test_grid_search")

In [None]:
def visualize_grid_search_results(results):
    """
    Create visualizations for grid search results.
    """
    if results is None:
        print("No results to visualize. Run grid search first.")
        return
    
    # Extract data for plotting
    experiments = results['all_experiments']
    successful_experiments = [exp for exp in experiments if exp['success']]
    
    if not successful_experiments:
        print("No successful experiments to visualize.")
        return
    
    # Prepare data
    val_roc_aucs = []
    test_roc_aucs = []
    experiment_times = []
    total_epochs = []
    hyperparams = []
    
    for exp in successful_experiments:
        if 'results' in exp and 'final_results' in exp['results']:
            # Get validation ROC AUC from training info
            val_roc = exp['results']['training_info']['best_val_roc_auc']
            # Get test ROC AUC from final results
            test_roc = exp['results']['final_results']['test']['roc_auc']
            
            val_roc_aucs.append(val_roc)
            test_roc_aucs.append(test_roc)
            experiment_times.append(exp['experiment_time_minutes'])
            total_epochs.append(exp['results']['training_info']['total_epochs'])
            
            # Store hyperparameters
            hp = exp['hyperparameters']
            hyperparams.append(f"H{hp['hidden_channels']}_L{hp['num_layers']}_D{hp['dropout']}_B{hp['batch_size']}_LR{hp['lr']}")
    
    # Create subplots
    fig, axes = plt.subplots(2, 2, figsize=(16, 12))
    
    # Plot 1: Validation vs Test ROC AUC
    axes[0, 0].scatter(val_roc_aucs, test_roc_aucs, alpha=0.7, s=100, c='blue')
    axes[0, 0].plot([0.5, 1.0], [0.5, 1.0], 'r--', alpha=0.5, label='Val = Test (y = x)')
    axes[0, 0].set_xlabel('Validation ROC AUC')
    axes[0, 0].set_ylabel('Test ROC AUC')
    axes[0, 0].set_title('Validation vs Test Performance')
    axes[0, 0].grid(True, alpha=0.3)
    axes[0, 0].legend()
    
    # Add correlation coefficient
    corr = np.corrcoef(val_roc_aucs, test_roc_aucs)[0, 1]
    axes[0, 0].text(0.05, 0.95, f'Correlation: {corr:.3f}', transform=axes[0, 0].transAxes, 
                    bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.8))
    
    # Plot 2: ROC AUC vs Training Time
    axes[0, 1].scatter(experiment_times, val_roc_aucs, alpha=0.7, s=100, c='green')
    axes[0, 1].set_xlabel('Training Time (minutes)')
    axes[0, 1].set_ylabel('Validation ROC AUC')
    axes[0, 1].set_title('Performance vs Training Time')
    axes[0, 1].grid(True, alpha=0.3)
    
    # Plot 3: ROC AUC vs Epochs
    axes[1, 0].scatter(total_epochs, val_roc_aucs, alpha=0.7, s=100, c='purple')
    axes[1, 0].set_xlabel('Total Epochs Trained')
    axes[1, 0].set_ylabel('Validation ROC AUC')
    axes[1, 0].set_title('Performance vs Training Epochs')
    axes[1, 0].grid(True, alpha=0.3)
    
    # Plot 4: Performance distribution
    axes[1, 1].hist(val_roc_aucs, bins=min(10, len(val_roc_aucs)), alpha=0.7, color='orange', edgecolor='black')
    axes[1, 1].axvline(np.mean(val_roc_aucs), color='red', linestyle='--', label=f'Mean: {np.mean(val_roc_aucs):.3f}')
    axes[1, 1].axvline(np.median(val_roc_aucs), color='blue', linestyle='--', label=f'Median: {np.median(val_roc_aucs):.3f}')
    axes[1, 1].set_xlabel('Validation ROC AUC')
    axes[1, 1].set_ylabel('Number of Experiments')
    axes[1, 1].set_title('Performance Distribution')
    axes[1, 1].legend()
    axes[1, 1].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Print statistics
    print("\n=== PERFORMANCE STATISTICS ===")
    print(f"Best validation ROC AUC: {max(val_roc_aucs):.4f}")
    print(f"Worst validation ROC AUC: {min(val_roc_aucs):.4f}")
    print(f"Mean validation ROC AUC: {np.mean(val_roc_aucs):.4f} ± {np.std(val_roc_aucs):.4f}")
    print(f"Median validation ROC AUC: {np.median(val_roc_aucs):.4f}")
    print(f"\nMean training time: {np.mean(experiment_times):.1f} ± {np.std(experiment_times):.1f} minutes")
    print(f"Mean epochs: {np.mean(total_epochs):.1f} ± {np.std(total_epochs):.1f}")

# Visualize results if available
if results:
    visualize_grid_search_results(results)

In [None]:
def analyze_hyperparameter_impact(results):
    """
    Analyze the impact of different hyperparameters on model performance.
    """
    if results is None:
        print("No results to analyze. Run grid search first.")
        return
    
    experiments = results['all_experiments']
    successful_experiments = [exp for exp in experiments if exp['success']]
    
    if not successful_experiments:
        print("No successful experiments to analyze.")
        return
    
    # Collect data
    data = []
    for exp in successful_experiments:
        if 'results' in exp and 'training_info' in exp['results']:
            hp = exp['hyperparameters']
            performance = exp['results']['training_info']['best_val_roc_auc']
            
            data.append({
                'hidden_channels': hp['hidden_channels'],
                'num_layers': hp['num_layers'],
                'dropout': hp['dropout'],
                'batch_size': hp['batch_size'],
                'lr': hp['lr'],
                'val_roc_auc': performance
            })
    
    if not data:
        print("No valid data found for analysis.")
        return
    
    # Convert to a DataFrame for analysis
    import pandas as pd
    df = pd.DataFrame(data)
    
    print("\n=== HYPERPARAMETER IMPACT ANALYSIS ===")
    
    # Analyze each hyperparameter
    hyperparams = ['hidden_channels', 'num_layers', 'dropout', 'batch_size', 'lr']
    
    fig, axes = plt.subplots(2, 3, figsize=(18, 12))
    axes = axes.flatten()
    
    for i, param in enumerate(hyperparams):
        # Group by parameter value and calculate mean performance
        grouped = df.groupby(param)['val_roc_auc'].agg(['mean', 'std', 'count']).reset_index()
        
        # Plot
        ax = axes[i]
        ax.bar(grouped[param].astype(str), grouped['mean'], 
               yerr=grouped['std'], capsize=5, alpha=0.7)
        ax.set_xlabel(param)
        ax.set_ylabel('Mean Validation ROC AUC')
        ax.set_title(f'Impact of {param}')
        ax.grid(True, alpha=0.3)
        
        # Add value labels on bars
        for j, (val, mean_val, count) in enumerate(zip(grouped[param], grouped['mean'], grouped['count'])):
            ax.text(j, mean_val + 0.01, f'{mean_val:.3f}\n(n={count})', 
                   ha='center', va='bottom', fontsize=10)
        
        # Print statistics
        print(f"\n{param.upper()}:")
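        # Iterate column-wise so integer values are not upcast to float by iterrows()
        for value, mean_val, std_val, count in zip(
                grouped[param], grouped['mean'], grouped['std'], grouped['count']):
            print(f"  {value}: {mean_val:.4f} ± {std_val:.4f} (n={count})")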
    
    # Remove empty subplot
    axes[-1].remove()
    
    plt.tight_layout()
    plt.show()
    
    # Find best hyperparameter combinations
    print("\n=== BEST HYPERPARAMETER COMBINATIONS ===")
    top_experiments = df.nlargest(3, 'val_roc_auc')
    
    for i, (_, exp) in enumerate(top_experiments.iterrows(), 1):
        print(f"{i}. ROC AUC: {exp['val_roc_auc']:.4f}")
        # iterrows() upcasts an all-numeric row to float, so cast integer fields back
        print(f"   Config: hidden={int(exp['hidden_channels'])}, layers={int(exp['num_layers'])}, "
              f"dropout={exp['dropout']}, batch={int(exp['batch_size'])}, lr={exp['lr']}")

# Analyze hyperparameter impact if results are available
if results:
    try:
        import pandas as pd
        analyze_hyperparameter_impact(results)
    except ImportError:
        print("pandas not available. Install with: pip install pandas")
        print("Skipping hyperparameter analysis...")

---

## 5. Conclusion and Next Steps

This notebook demonstrated:

1. **Environment setup** for both local and Google Colab environments
2. **Basic model training** with customizable hyperparameters
3. **Systematic grid search** for hyperparameter optimization
4. **Comprehensive analysis** of results and hyperparameter impact

### Key Findings

- The grid search explores 48 hyperparameter combinations
- Fast search mode uses early stopping, which shortens experiments whose validation score has stopped improving
- Results are automatically saved and can be analyzed systematically
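
As an illustration of the early stopping mentioned above, the sketch below applies a patience rule to a list of per-epoch validation scores. The function name, the `patience` and `min_delta` parameters, and the scores are made up for the example. This is not the implementation in `scripts/hyp_search.py`.

```python
def run_with_early_stopping(val_scores, patience=3, min_delta=0.0):
    """Return (best_epoch, epochs_run) for a sequence of per-epoch validation scores."""
    best_score = float("-inf")
    best_epoch = 0
    epochs_run = 0
    for epoch, score in enumerate(val_scores, start=1):
        epochs_run = epoch
        if score > best_score + min_delta:
            # New best: remember it and reset the patience window
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop training
            break
    return best_epoch, epochs_run

# Made-up validation ROC AUC values, one per epoch
scores = [0.70, 0.74, 0.78, 0.77, 0.78, 0.76, 0.79, 0.80]
best_epoch, epochs_run = run_with_early_stopping(scores, patience=3)
print(best_epoch, epochs_run)  # 3 6
```

With `patience=3` the run stops after epoch 6 and never reaches the later, higher scores. That is the trade-off of this rule: a small patience saves time but can end a run before a late improvement.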

### Useful Commands

```bash
# Single training run
python scripts/train.py --epochs 100 --lr 1e-3 --hidden_channels 128

# Grid search (fast mode)
python scripts/hyp_search.py --fast_search --max_experiments 10

# Full grid search
python scripts/hyp_search.py --fast_search
```