# 🔮 Automated Inference with OmniGenBench AutoInfer

This tutorial covers **automated inference** with OmniGenBench's AutoInfer functionality: making predictions on genomic sequences with pre-trained and fine-tuned models.

### 1. The Computational Challenge: Making Predictions at Scale

**AutoInfer** streamlines the process of making predictions on genomic sequences by providing:
- **Unified Interface**: Single API for all model types and tasks
- **Batch Processing**: Efficient handling of thousands of sequences
- **Flexible Input**: Support for various file formats (JSON, CSV, TXT)
- **Rich Output**: Predictions, logits, and confidence scores

Applications across genomic analysis:
- **Transcription Factor Binding**: Predict TF binding sites in regulatory regions
- **Translation Efficiency**: Estimate mRNA translation rates
- **Promoter Detection**: Identify promoter regions in genomic sequences
- **RNA Structure**: Predict secondary structure elements
- **Variant Effect**: Assess functional impact of genetic variants

### 2. The Data: From Sequences to Predictions

AutoInfer handles various input formats:

| Input Type | Format | Use Case |
|-----------|--------|----------|
| Single sequence | String | Quick predictions |
| Multiple sequences | Comma-separated | Small batches |
| JSON file | `{"sequences": [...]}` | Structured data |
| CSV file | `sequence,label,...` | Tabular data |
| Text file | One per line | Simple lists |
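
The file-based formats in the table can be normalized into a plain list of sequences before inference. The helper below is a minimal sketch, not part of OmniGenBench: `load_sequences` is a hypothetical name, and it assumes the JSON layout `{"sequences": [...]}`, a CSV with a `sequence` column, and a text file with one sequence per line, as listed above.

```python
import csv
import json
from pathlib import Path


def load_sequences(path):
    """Return a list of sequence strings from a .json, .csv, or .txt file.

    Hypothetical helper for illustration; not an OmniGenBench function.
    """
    path = Path(path)
    suffix = path.suffix.lower()

    if suffix == ".json":
        # Assumes the layout {"sequences": ["ACGT", ...]}
        with path.open() as f:
            return list(json.load(f)["sequences"])

    if suffix == ".csv":
        # Assumes a header row that contains a "sequence" column
        with path.open(newline="") as f:
            return [row["sequence"] for row in csv.DictReader(f)]

    if suffix == ".txt":
        # One sequence per line; blank lines are skipped
        with path.open() as f:
            return [line.strip() for line in f if line.strip()]

    raise ValueError(f"Unsupported input format: {suffix!r}")
```

Dispatching on the file extension keeps the rest of the pipeline format-agnostic: everything downstream only sees a list of strings.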

### 3. The Tool: Pre-trained and Fine-tuned Models

#### Model Types
- **Foundation Models**: General genomic understanding (e.g., OmniGenome-186M)
- **Fine-tuned Models**: Task-specific models (e.g., TFB prediction, TE prediction)
- **Custom Models**: Your own fine-tuned models

All models support:
1. **Sequence Classification**: Single or multi-label classification
2. **Token Classification**: Per-nucleotide predictions
3. **Regression**: Continuous value prediction
4. **Embedding Extraction**: Vector representations

### 4. The Workflow: 6-Step Guide to Inference

```mermaid
flowchart TD
    subgraph "6-Step Workflow for AutoInfer"
        A["📥 Step 1: Environment Setup<br/>Install dependencies"] --> B["⚙️ Step 2: Configuration<br/>Set device, seeds, batch size"]
        B --> C["📊 Step 3: Data Preparation<br/>Create/load input sequences"]
        C --> D["🔧 Step 4: Load Model<br/>Initialize fine-tuned model"]
        D --> E["🎓 Step 5: Run Inference<br/>Make predictions"]
        E --> F["Step 6: Analysis & Visualization<br/>Interpret and export results"]
    end

    style A fill:#e1f5fe,stroke:#333,stroke-width:2px
    style B fill:#f3e5f5,stroke:#333,stroke-width:2px
    style C fill:#e8f5e8,stroke:#333,stroke-width:2px
    style D fill:#fff3e0,stroke:#333,stroke-width:2px
    style E fill:#ffe0f0,stroke:#333,stroke-width:2px
    style F fill:#f0ffe0,stroke:#333,stroke-width:2px
```

Let's start making predictions!

## 🚀 Step 1: Setup and Configuration

First, let's set up our environment and prepare sample data for inference.

### 1.1: Environment Setup

Install required packages for genomic inference.

In [None]:
!pip install omnigenbench torch transformers -U

### 1.2: Import Required Libraries

Import essential libraries for inference and analysis.

In [None]:
import json
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import torch
import warnings

# Suppress warnings for cleaner output
warnings.filterwarnings('ignore')

# Import OmniGenBench components
from omnigenbench import (
    ModelHub,
    OmniTokenizer,
    OmniModelForSequenceClassification,
    OmniModelForTokenClassification,
)

# Set up plotting style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

print("[SUCCESS] Libraries imported successfully!")
print(f"\n[INFO] Environment Information:")
print(f"  PyTorch version: {torch.__version__}")
print(f"  CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"  CUDA version: {torch.version.cuda}")
    print(f"  GPU device: {torch.cuda.get_device_name(0)}")
else:
    print(f"  Running on CPU")

### 1.3: Set Reproducibility Configuration

Ensure reproducible results by fixing random seeds and configuring device management.

In [None]:
# Configuration for reproducibility
RANDOM_SEED = 42
BATCH_SIZE = 16

# Set random seeds for reproducibility
torch.manual_seed(RANDOM_SEED)
np.random.seed(RANDOM_SEED)
if torch.cuda.is_available():
    torch.cuda.manual_seed(RANDOM_SEED)
    torch.cuda.manual_seed_all(RANDOM_SEED)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

# Auto-detect device with fallback
def get_device():
    """
    Automatically detect the best available device.
    Returns 'cuda' if GPU is available, otherwise 'cpu'.
    """
    if torch.cuda.is_available():
        device = 'cuda'
        print(f"[INFO] GPU detected: {torch.cuda.get_device_name(0)}")
    else:
        device = 'cpu'
        print(f"[INFO] No GPU detected, using CPU")
    return device

DEVICE = get_device()

print(f"\n[SUCCESS] Configuration complete!")
print(f"  Random seed: {RANDOM_SEED}")
print(f"  Device: {DEVICE}")
print(f"  Batch size: {BATCH_SIZE}")

### 1.4: Prepare Sample Data

Let's create various types of sample data to demonstrate different inference scenarios.

#### Data Types:
- **Promoter sequences**: For binary classification
- **Regulatory elements**: For multi-class classification
- **Long genomic regions**: For token-level predictions

In [None]:
# Sample genomic sequences for inference
sample_sequences = {
    "promoter_candidates": [
        "ATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCG",
        "GCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGCGC",
        "TATATATATATATATATATATATATATATATATATATATATATATATAT",
        "ATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC",
        "CGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGAT",
    ],
    "regulatory_elements": [
        "CGGCGCGCCATATAAGCATCGAGCGCGCACGTGCGCTGCGCGCGCGCTACGCGCGCATGTGCGCGCACGTACGCGCG",
        "GCGCGCGCACGTGCGCACGTGCGCGCACGTGCGCGCGCACGTGCGCGCACGTGCGCGCACGTGCGCGCACGTGCGCGC",
        "GCGCGCCACCAATGCGCGCGCCACCATGTGCGCGCCACCATGTGCGCGCCACCATGTGCGCGCCACCATGTGCGCGCC",
    ],
    "long_sequences": [
        "ATCGATCGATCG" * 20,  # 240 bp
        "GCTAGCTAGCTA" * 20,  # 240 bp
    ]
}

# Create JSON format data
json_data = {
    "sequences": sample_sequences["promoter_candidates"]
}

# Save to file
with open("inference_sequences.json", "w") as f:
    json.dump(json_data, f, indent=2)

# Create JSON with metadata
json_with_metadata = {
    "data": [
        {
            "sequence": seq,
            "gene_id": f"GENE_{i:03d}",
            "description": f"Sample sequence {i+1}",
            "organism": "Arabidopsis thaliana"
        }
        for i, seq in enumerate(sample_sequences["promoter_candidates"])
    ]
}

with open("inference_with_metadata.json", "w") as f:
    json.dump(json_with_metadata, f, indent=2)

# Create CSV format data
csv_data = pd.DataFrame({
    "sequence": sample_sequences["promoter_candidates"],
    "gene_id": [f"GENE_{i:03d}" for i in range(len(sample_sequences["promoter_candidates"]))],
    "organism": "Arabidopsis thaliana"
})
csv_data.to_csv("inference_sequences.csv", index=False)

# Create text file (one sequence per line)
with open("inference_sequences.txt", "w") as f:
    for seq in sample_sequences["promoter_candidates"]:
        f.write(seq + "\n")

print("[SUCCESS] Sample data files created:")
print("  - inference_sequences.json")
print("  - inference_with_metadata.json")
print("  - inference_sequences.csv")
print("  - inference_sequences.txt")
print(f"\n[INFO] Total sequences prepared: {len(sample_sequences['promoter_candidates'])}")

## 🔧 Step 2: Load Model

Now let's load a pre-trained or fine-tuned model for inference.

### Model Loading Options

We'll demonstrate three approaches:
1. **Load from HuggingFace Hub**: Use publicly available fine-tuned models
2. **Load from local path**: Use your own trained models
3. **Use ModelHub**: Simplified loading with automatic configuration

### 2.1: Load Fine-tuned Model (Recommended)

Let's use a fine-tuned model for transcription factor binding prediction.

In [None]:
# Configuration
inference_config = {
    "model_name": "yangheng/ogb_tfb_finetuned",  # Fine-tuned TFB model
    "batch_size": BATCH_SIZE,
    "device": DEVICE,  # Use auto-detected device
}

print("[INFO] Loading fine-tuned model for inference...")
print(f"  Model: {inference_config['model_name']}")
print(f"  Device: {inference_config['device']}")

try:
    # Load model using ModelHub (simplified approach)
    model = ModelHub.load(
        inference_config["model_name"],
        device=inference_config["device"],
    )
    
    # Set to evaluation mode
    model = model.eval()
    
    print("\n[SUCCESS] Model loaded successfully!")
    print(f"  Model type: {type(model).__name__}")
    print(f"  Evaluation mode: {not model.training}")
    
except Exception as e:
    print(f"\n[ERROR] Failed to load model: {e}")
    print(f"[INFO] Please check:")
    print(f"  1. Internet connection (for HuggingFace Hub download)")
    print(f"  2. Model name is correct: {inference_config['model_name']}")
    print(f"  3. Sufficient disk space for model caching")
    raise

### 2.2: Alternative - Load from Local Path

If you have a locally trained model, you can load it directly.

In [None]:
# Example: Load from local directory
# Uncomment to use your own model

# local_model_path = "./my_trained_model"
# model = OmniModelForSequenceClassification.from_pretrained(
#     local_model_path,
#     trust_remote_code=True
# )
# model = model.eval()

print("[INFO] To use a local model, uncomment the code above and specify the path.")

## 🎓 Step 3: Run Inference

Now let's make predictions on our sample sequences using various methods.

### Inference Methods:
1. **Single Sequence**: Quick predictions on individual sequences
2. **Batch Inference**: Efficient processing of multiple sequences
3. **File-based Inference**: Process sequences from JSON/CSV files

### 3.1: Single Sequence Inference

In [None]:
# Single sequence prediction
single_sequence = "ATCGATCGATCGATCGATCGATCGATCGATCG"

print("[INFO] Running inference on single sequence...")
print(f"  Sequence: {single_sequence[:40]}...")
print(f"  Length: {len(single_sequence)} bp")

try:
    # Make prediction
    result = model.inference(single_sequence)
    
    print("\n[SUCCESS] Prediction completed!")
    print(f"\n[INFO] Result structure:")
    print(f"  Available keys: {list(result.keys())}")
    
    # Extract prediction (string label for classification)
    prediction = result['predictions']
    print(f"\n  Prediction: {prediction}")
    
    # Extract confidence (tensor scalar)
    if 'confidence' in result:
        confidence = float(result['confidence'])  # Convert tensor to Python float
        print(f"  Confidence: {confidence:.4f}")
    
    # Extract logits if available
    if 'logits' in result:
        logits = result['logits']
        print(f"  Logits shape: {logits.shape}")
        print(f"  Logits values: {logits.detach().cpu().numpy()}")
        
except Exception as e:
    print(f"\n[ERROR] Inference failed: {e}")
    print(f"[INFO] Troubleshooting:")
    print(f"  - Check sequence format (only A, T, C, G allowed)")
    print(f"  - Verify model is in eval mode")
    print(f"  - Check device memory availability")
    raise
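
The first troubleshooting hint above (only A, T, C, G allowed) can be checked before the model is called. This is a minimal sketch: `find_invalid_bases` is a hypothetical helper, and the four-letter DNA alphabet is taken from that hint. Some models may also accept characters such as `N` or `U`, which this sketch would flag.

```python
VALID_BASES = frozenset("ACGT")


def find_invalid_bases(sequence):
    """Return (position, character) pairs for characters outside A/C/G/T.

    The check is case-insensitive. Hypothetical helper for illustration.
    """
    return [
        (i, ch)
        for i, ch in enumerate(sequence)
        if ch.upper() not in VALID_BASES
    ]


print(find_invalid_bases("ATCGATCG"))   # []
print(find_invalid_bases("ATCGNXTCG"))  # [(4, 'N'), (5, 'X')]
```

Returning positions rather than a bare boolean makes it easier to report exactly where an input file is malformed.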

### 3.2: Batch Inference

To process several sequences, the cell below calls `model.inference` on each one in turn and collects the predictions and confidence scores for a summary.

In [None]:
# Batch inference on multiple sequences
sequences = sample_sequences["promoter_candidates"]

print(f"[INFO] Running batch inference on {len(sequences)} sequences...")

# Collect all predictions
batch_results = []

try:
    for i, seq in enumerate(sequences):
        result = model.inference(seq)
        
        # Extract relevant information
        prediction = result['predictions']
        confidence = float(result['confidence']) if 'confidence' in result else None
        
        batch_results.append({
            "sequence_id": i,
            "sequence": seq[:30] + "...",  # Truncate for display
            "prediction": prediction,
            "confidence": confidence
        })
    
    # Display results
    print("\n[SUCCESS] Batch inference completed!")
    print(f"\nResults ({len(batch_results)} sequences):")
    print("-" * 80)
    
    for res in batch_results:
        print(f"\nSequence {res['sequence_id']}: {res['sequence']}")
        print(f"  Prediction: {res['prediction']}")
        if res['confidence'] is not None:
            print(f"  Confidence: {res['confidence']:.4f}")
    
    # Summary statistics
    if all(r['confidence'] is not None for r in batch_results):
        confidences = [r['confidence'] for r in batch_results]
        print(f"\n[INFO] Confidence Statistics:")
        print(f"  Mean: {np.mean(confidences):.4f}")
        print(f"  Std:  {np.std(confidences):.4f}")
        print(f"  Min:  {np.min(confidences):.4f}")
        print(f"  Max:  {np.max(confidences):.4f}")
        
except Exception as e:
    print(f"\n[ERROR] Batch inference failed: {e}")
    print(f"[INFO] Processing stopped at sequence {i}/{len(sequences)}")
    raise
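
The loop above handles one sequence per call. When working through thousands of sequences, it can help to walk the input in fixed-size chunks (for example `BATCH_SIZE` at a time) so that progress reporting and memory cleanup happen at chunk boundaries. The sketch below shows only the chunking; `iter_batches` is a hypothetical name, and whether `model.inference` accepts a whole chunk at once is not assumed here.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


demo_sequences = ["ATCG", "GCGC", "TATA", "ATGC", "CGAT"]
for n, batch in enumerate(iter_batches(demo_sequences, batch_size=2), start=1):
    print(f"batch {n}: {batch}")
```

The last chunk may be shorter than `batch_size`; downstream code should not assume every chunk is full.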

### 3.3: File-based Inference

Process sequences from a JSON file while keeping each record's metadata alongside its prediction.

In [None]:
# Load sequences from JSON file
print("[INFO] Loading sequences from JSON file...")

try:
    with open("inference_with_metadata.json", "r") as f:
        data = json.load(f)
    
    print(f"  Loaded {len(data['data'])} sequences with metadata")
    
    # Run inference and preserve metadata
    results_with_metadata = []
    
    for idx, item in enumerate(data['data']):
        sequence = item['sequence']
        result = model.inference(sequence)
        
        # Extract prediction and confidence
        prediction = result['predictions']
        confidence = float(result['confidence']) if 'confidence' in result else None
        
        # Convert logits to list for JSON serialization
        logits_list = None
        if 'logits' in result:
            logits_list = result['logits'].detach().cpu().tolist()
        
        # Combine prediction with metadata
        results_with_metadata.append({
            "sequence": sequence,
            "metadata": {
                "gene_id": item['gene_id'],
                "description": item['description'],
                "organism": item['organism']
            },
            "predictions": prediction,
            "confidence": confidence,
            "logits": logits_list,
        })
        
        # Progress indicator
        if (idx + 1) % 2 == 0 or (idx + 1) == len(data['data']):
            print(f"  Progress: {idx + 1}/{len(data['data'])} sequences processed")
    
    print("\n[SUCCESS] File-based inference completed!")
    print(f"  Processed {len(results_with_metadata)} sequences")
    
    # Display sample result
    if results_with_metadata:
        print(f"\n[INFO] Sample result (first sequence):")
        sample = results_with_metadata[0]
        print(f"  Gene ID: {sample['metadata']['gene_id']}")
        print(f"  Prediction: {sample['predictions']}")
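        # A conditional cannot live inside the format spec, so format the value first
        conf_text = f"{sample['confidence']:.4f}" if sample['confidence'] is not None else "N/A"
        print(f"  Confidence: {conf_text}")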
        
except FileNotFoundError:
    print("\n[ERROR] File 'inference_with_metadata.json' not found!")
    print("[INFO] Please run the data preparation cell (1.4) first")
    raise
except Exception as e:
    print(f"\n[ERROR] File-based inference failed: {e}")
    raise
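
Because the logits were converted to plain lists, a structure like `results_with_metadata` can be written to disk with the standard `json` module. This is a minimal sketch using a stand-in record whose values are placeholders, not model output. `save_results` is a hypothetical helper, and the file name `inference_results.json` is an assumption borrowed from the cleanup list at the end of this tutorial.

```python
import json


def save_results(results, path):
    """Write a list of JSON-serializable result dicts to `path`; return the record count."""
    with open(path, "w") as f:
        json.dump(results, f, indent=2)
    return len(results)


# Stand-in record with placeholder values (not real model output)
example_results = [
    {
        "sequence": "ATCGATCG",
        "metadata": {"gene_id": "GENE_000"},
        "predictions": "placeholder_label",
        "confidence": None,
        "logits": [0.0, 0.0],
    }
]

n_saved = save_results(example_results, "inference_results.json")
print(f"Saved {n_saved} record(s)")
```

Tensors are not JSON-serializable, which is why the cell above calls `.tolist()` on the logits and `float()` on the confidence before storing them.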

### 3.4: CSV File Inference

Process sequences from CSV files and create a results DataFrame.

In [None]:
# Load sequences from CSV
print("[INFO] Loading sequences from CSV file...")

try:
    df = pd.read_csv("inference_sequences.csv")
    print(f"  Loaded {len(df)} sequences from CSV")
    print(f"  Columns: {list(df.columns)}")
    
    # Run inference on all sequences
    predictions = []
    confidences = []
    
    for idx, sequence in enumerate(df['sequence']):
        result = model.inference(sequence)
        
        # Extract prediction and confidence
        prediction = result['predictions']
        confidence = float(result['confidence']) if 'confidence' in result else None
        
        predictions.append(prediction)
        confidences.append(confidence)
        
        # Progress indicator for larger datasets
        if (idx + 1) % 2 == 0 or (idx + 1) == len(df):
            print(f"  Progress: {idx + 1}/{len(df)} sequences processed")
    
    # Add predictions to DataFrame
    df['prediction'] = predictions
    df['confidence'] = confidences
    
    # Save results
    output_file = "inference_results.csv"
    df.to_csv(output_file, index=False)
    
    print("\n[SUCCESS] CSV inference completed!")
    print(f"\n[INFO] Results DataFrame:")
    print(df.head())
    
    # Display statistics
    if df['confidence'].notna().any():
        print(f"\n[INFO] Prediction Statistics:")
        print(f"  Total sequences: {len(df)}")
        print(f"  Mean confidence: {df['confidence'].mean():.4f}")
        print(f"  Prediction distribution:")
        print(df['prediction'].value_counts().to_string(header=False))
    
    print(f"\n[INFO] Results saved to: {output_file}")
    
except FileNotFoundError:
    print("\n[ERROR] File 'inference_sequences.csv' not found!")
    print("[INFO] Please run the data preparation cell (1.4) first")
    raise
except KeyError as e:
    print(f"\n[ERROR] Required column missing: {e}")
    print("[INFO] CSV file must contain a 'sequence' column")
    raise
except Exception as e:
    print(f"\n[ERROR] CSV inference failed: {e}")
    raise
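
A single mean confidence over all rows can hide differences between classes. The sketch below builds a per-class summary with the standard library only; `summarize_by_class` is a hypothetical helper, and the labels and scores in the usage lines are placeholders rather than model output.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_class(predictions, confidences):
    """Group confidences by predicted label and report count and mean per label.

    Missing confidences (None) count toward the total but not toward the mean.
    """
    grouped = defaultdict(list)
    for label, conf in zip(predictions, confidences):
        grouped[label].append(conf)

    summary = {}
    for label, confs in grouped.items():
        known = [c for c in confs if c is not None]
        summary[label] = {
            "count": len(confs),
            "mean_confidence": mean(known) if known else None,
        }
    return summary


# Placeholder labels and scores for illustration
demo_summary = summarize_by_class(["a", "b", "a"], [0.5, None, 1.0])
print(demo_summary)
```

With the DataFrame from the cell above, `df.groupby('prediction')['confidence'].agg(['count', 'mean'])` gives a similar table, except that pandas `count` excludes missing values.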

## 📊 Step 4: Visualization and Analysis

Let's visualize the inference results to better understand model predictions and confidence distributions.

In [None]:
# Visualize prediction results
import matplotlib.pyplot as plt
import seaborn as sns

# Create figure with subplots
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# 1. Confidence Distribution
if 'confidence' in df.columns and df['confidence'].notna().any():
    ax1 = axes[0]
    df['confidence'].hist(bins=20, ax=ax1, color='skyblue', edgecolor='black')
    ax1.set_xlabel('Confidence Score', fontsize=12)
    ax1.set_ylabel('Frequency', fontsize=12)
    ax1.set_title('Distribution of Prediction Confidence', fontsize=14, fontweight='bold')
    # Use double quotes for the f-string so the inner ['confidence'] parses on Python < 3.12
    ax1.axvline(df['confidence'].mean(), color='red', linestyle='--',
                label=f"Mean: {df['confidence'].mean():.3f}")
    ax1.legend()
    ax1.grid(axis='y', alpha=0.3)

# 2. Prediction Distribution
ax2 = axes[1]
prediction_counts = df['prediction'].value_counts()
prediction_counts.plot(kind='bar', ax=ax2, color='coral', edgecolor='black')
ax2.set_xlabel('Prediction Class', fontsize=12)
ax2.set_ylabel('Count', fontsize=12)
ax2.set_title('Distribution of Predictions', fontsize=14, fontweight='bold')
ax2.tick_params(axis='x', rotation=45)
ax2.grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.savefig('inference_analysis.png', dpi=300, bbox_inches='tight')
print("[SUCCESS] Visualization saved to 'inference_analysis.png'")
plt.show()

# Display summary table
print("\n[INFO] Inference Summary:")
print(f"{'Metric':<25} {'Value':<15}")
print("-" * 40)
print(f"{'Total sequences':<25} {len(df):<15}")
print(f"{'Unique predictions':<25} {df['prediction'].nunique():<15}")
# Only report confidence statistics when at least one score is present
if 'confidence' in df.columns and df['confidence'].notna().any():
    print(f"{'Mean confidence':<25} {df['confidence'].mean():<15.4f}")
    print(f"{'Std confidence':<25} {df['confidence'].std():<15.4f}")
    print(f"{'High confidence (>0.9)':<25} {(df['confidence'] > 0.9).sum():<15}")

## 🎯 Step 5: CLI Integration

The same inference can be performed using the command-line interface.

### CLI Examples

Here are equivalent CLI commands for the operations we performed above:

In [None]:
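# Display CLI usage examples
# Raw string: otherwise each trailing backslash + newline is treated as a line
# continuation and the multi-line commands are printed joined on one line
cli_examples = r"""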
Command-Line Interface Examples:
================================

1. Single sequence inference:
   ogb autoinfer \
     --model yangheng/ogb_tfb_finetuned \
     --sequence "ATCGATCGATCGATCGATCGATCG" \
     --output-file predictions.json

2. Multiple sequences (comma-separated):
   ogb autoinfer \
     --model yangheng/ogb_tfb_finetuned \
     --sequence "ATCGATCG,GCGCGCGC,TATATAT" \
     --output-file predictions.json

3. Batch inference from JSON:
   ogb autoinfer \
     --model yangheng/ogb_tfb_finetuned \
     --input-file inference_sequences.json \
     --batch-size 16 \
     --output-file results.json

4. Inference from CSV:
   ogb autoinfer \
     --model yangheng/ogb_tfb_finetuned \
     --input-file inference_sequences.csv \
     --output-file predictions.json \
     --device cuda:0

5. Inference from text file:
   ogb autoinfer \
     --model yangheng/ogb_tfb_finetuned \
     --sequence inference_sequences.txt \
     --output-file predictions.json
"""

print(cli_examples)
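
When the CLI is driven from a script, building the argument list in Python avoids shell-quoting mistakes. The sketch below only assembles and prints the command. It uses just the flags shown in the examples above (`--model`, `--input-file`, `--output-file`, `--batch-size`, `--device`), and `build_autoinfer_command` is a hypothetical helper, not part of OmniGenBench.

```python
import shlex


def build_autoinfer_command(model, input_file, output_file,
                            batch_size=None, device=None):
    """Assemble an `ogb autoinfer` argument list from the flags used in this tutorial."""
    cmd = [
        "ogb", "autoinfer",
        "--model", model,
        "--input-file", input_file,
        "--output-file", output_file,
    ]
    # Optional flags are appended only when a value is given
    if batch_size is not None:
        cmd += ["--batch-size", str(batch_size)]
    if device is not None:
        cmd += ["--device", device]
    return cmd


cmd = build_autoinfer_command(
    "yangheng/ogb_tfb_finetuned",
    "inference_sequences.json",
    "results.json",
    batch_size=16,
)
print(shlex.join(cmd))
```

Once `ogb` is installed, the same list can be passed to `subprocess.run(cmd, check=True)`, which raises an error if the command exits with a non-zero status.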

## 🎓 Step 6: Summary and Best Practices

### Key Takeaways

1. **Model Loading**:
   - Use `ModelHub.load()` for simplified loading
   - Always set model to evaluation mode: `model.eval()`
   - Use auto-detected device for cross-platform compatibility
   - Wrap loading in try-except for error handling

2. **Inference API Return Format** (CRITICAL):
   ```python
   result = model.inference(sequence)
   # Returns dict with keys:
   # - 'predictions': str (single) or list (batch) of class labels
   # - 'logits': Tensor[num_labels] - raw model outputs
   # - 'confidence': Tensor(scalar) - max probability (use float() to convert)
   # - 'last_hidden_state': Tensor - for downstream analysis
   
   # NO 'probabilities' key! Use logits or confidence instead
   ```

3. **Device Management**:
   - Auto-detect device: `torch.cuda.is_available()`
   - Provide CPU fallback for portability
   - Print device information for transparency

4. **Error Handling**:
   - Wrap critical operations in try-except blocks
   - Provide actionable error messages
   - Validate inputs before processing

5. **Reproducibility**:
   - Set random seeds: `torch.manual_seed(SEED)`
   - Fix CUDA determinism: `torch.backends.cudnn.deterministic = True`
   - Document environment: PyTorch version, CUDA version
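
Item 2 notes that the result has no `'probabilities'` key. If per-class probabilities are needed, they can be derived from the logits with a softmax. The sketch below uses plain Python so it runs without a GPU stack. It assumes single-label classification, where, per the description in item 2, `confidence` corresponds to the largest probability. The logit values are placeholders, not model output.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


example_logits = [2.0, 0.0]  # placeholder values
probs = softmax(example_logits)
print(f"probabilities: {probs}")
print(f"max probability: {max(probs):.4f}")
```

For a logits tensor, `torch.softmax(result['logits'], dim=-1)` performs the same computation.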

### Common Use Cases

| Task | Model | Application |
|------|-------|-------------|
| TF Binding | `ogb_tfb_finetuned` | Predict transcription factor binding sites |
| Translation Efficiency | `ogb_te_finetuned` | Estimate mRNA translation rates |
| Promoter Detection | Custom fine-tuned | Identify promoter regions |
| RNA Structure | Structure-specific | Predict secondary structure elements |

### Performance Tips

- **Batch Processing**: Process sequences in batches for efficiency
- **GPU Acceleration**: Use CUDA when available (10-50x speedup)
- **Memory Management**: Clear cache between large batches
  ```python
  if torch.cuda.is_available():
      torch.cuda.empty_cache()
  ```

### Next Steps

- **Fine-tune your own models**: See `sequence_classification_tutorial.ipynb`
- **Batch processing at scale**: Process thousands of sequences efficiently
- **Pipeline integration**: Incorporate into bioinformatics workflows
- **Advanced analysis**: Combine with embedding extraction and attention visualization

### Common Pitfalls & Solutions

| Problem | Cause | Solution |
|---------|-------|----------|
| `KeyError: 'probabilities'` | Using incorrect API key | Use `'confidence'` or `'logits'` instead |
| `RuntimeError: CUDA out of memory` | Batch too large | Reduce batch size or use CPU |
| `TypeError: can't convert cuda:0 device type tensor to numpy` | Tensor on GPU | Use `.cpu()` before `.numpy()` |
| Results vary between runs | No random seed | Set `torch.manual_seed(SEED)` |

### Resources

- **Documentation**: `docs/GETTING_STARTED.md`
- **CLI Reference**: `docs/cli.rst`
- **Examples**: `examples/` directory
- **API Reference**: https://omnigenbench.readthedocs.io
- **Issues**: https://github.com/yangheng95/OmniGenBench/issues

## 🧹 Cleanup

Remove temporary files created during the tutorial (optional).

In [None]:
# Uncomment to remove temporary files
# import os

# files_to_remove = [
#     "inference_sequences.json",
#     "inference_with_metadata.json",
#     "inference_sequences.csv",
#     "inference_sequences.txt",
#     "inference_results.json",
#     "inference_results.csv",
#     "inference_summary.json",
#     "inference_analysis.png",
#     "complete_inference_results.json",
# ]

# for file in files_to_remove:
#     if os.path.exists(file):
#         os.remove(file)
#         print(f"Removed: {file}")

print("[INFO] To clean up temporary files, uncomment the code above.")