[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/digital-marketing-tum/image-analyzer/blob/main/src/notebooks/pipeline_colab.ipynb)

# Introduction

**Image Analyzer** is a comprehensive image analytics pipeline that extracts visual features from images using computer vision and machine learning techniques.  
It provides both a **web interface** and a **Jupyter/Colab interface** for batch image processing.

---

### Using Image Analyzer in Google Colab

This notebook will guide you through:

1. Setting up the Image Analyzer environment.  
2. Configuring where your images and results are stored.  
3. Running the analysis pipeline and generating summary reports.

Simply run each cell **in order** to complete your analysis.

---

### Using GPU Acceleration

To speed up processing (especially for object detection & caption generation):

1. Go to **Runtime → Change runtime type** in Colab.
2. Select a GPU option, e.g. **T4 GPU**.
3. Save and restart the runtime if prompted.

GPU acceleration can significantly reduce analysis time.
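
After switching the runtime, it can be useful to confirm that a GPU is actually attached. The following is a minimal sketch that uses only the standard library and the `nvidia-smi` tool that ships with the NVIDIA driver; `list_nvidia_gpus` is a hypothetical helper name, not part of Image Analyzer. It only shows that the runtime exposes a GPU, not that a particular library is using it.

```python
import shutil
import subprocess


def list_nvidia_gpus():
    """Return one description line per visible NVIDIA GPU, or [] if none is found."""
    # Without nvidia-smi on the PATH, assume no GPU runtime is attached.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        # `nvidia-smi -L` prints one line per GPU.
        proc = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if proc.returncode != 0:
        return []
    return [line.strip() for line in proc.stdout.splitlines() if line.strip()]


gpus = list_nvidia_gpus()
print(f"GPUs visible: {len(gpus)}")
for gpu in gpus:
    print(" ", gpu)
```

An empty list suggests the runtime is still CPU-only and the runtime type should be checked again.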

## 🛠 Step 1: Install & Initialize Image Analyzer

Before we can run the analysis, we need to:

1. **Download the Image Analyzer code base** from GitHub.
2. **Install all required dependencies** for Google Colab.
3. **Initialize Image Analyzer** with the configuration file.

💡 **Note:** You only need to run this step **once per session**.
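
If the setup cell is run a second time in the same session, `git clone` stops with an error because the target folder already exists. A small guard can avoid that. The sketch below assumes the repository is cloned to a folder of your choice; `ensure_repo` is a hypothetical helper, not part of Image Analyzer.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/digital-marketing-tum/image-analyzer.git"


def ensure_repo(repo_dir, repo_url=REPO_URL):
    """Clone repo_url into repo_dir unless a checkout already exists there.

    Returns True if a clone was performed and False if it was skipped.
    """
    repo_dir = Path(repo_dir)
    # A .git folder is taken as evidence of an earlier clone.
    if (repo_dir / ".git").is_dir():
        return False
    subprocess.run(["git", "clone", repo_url, str(repo_dir)], check=True)
    return True


# In Colab this would be called as: ensure_repo("/content/image-analyzer")
```

The dependency installation still has to run once per session, since Colab starts each session with a fresh environment.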

In [None]:
# Run this cell to load the code base
!git clone https://github.com/digital-marketing-tum/image-analyzer.git
!pip install tensorflow==2.15.0 keras==2.15.0
!pip install -r /content/image-analyzer/requirementsColab_py312.txt

In [None]:
# Initialize Image Analyzer using the configuration
import os
import sys
sys.path.append('/content/image-analyzer/src')
import image_analyzer as IA
ia = IA.IA(config_path = "/content/image-analyzer/config/configuration.yaml")
print("✅ Image Analyzer initialized successfully!")

## 📂 Step 2: Set Your Input & Output Folders

Before running the analysis, we need to tell **Image Analyzer**:

- **Where to find your images** (input directory).  
- **Where to save results** (output directory).

### Option 1: Local Colab Storage  
- Temporary: all files are deleted when the session ends.
- Upload your images to `/content/image-analyzer/data/`.

### Option 2: Google Drive (Recommended)  
- Persistent: files remain after the session ends.  
- Requires mounting Google Drive and specifying the image folder.
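
Whichever option you choose, it can help to confirm that the input folder exists and contains images before starting a long run. The following is a minimal sketch; `find_images` and the extension list are assumptions made for illustration, and the set of formats Image Analyzer actually reads is not stated here.

```python
from pathlib import Path

# Assumption: illustrative extensions only; adjust to the formats you use.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def find_images(input_dir, extensions=IMAGE_EXTENSIONS):
    """Return a sorted list of image paths found directly inside input_dir."""
    folder = Path(input_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"Input directory not found: {folder}")
    # Match on the lower-cased suffix so that .JPG and .jpg are both counted.
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
```

Once the directories are set, `len(find_images(ia.input_dir))` gives a quick count. The sketch does not look into subfolders.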


In [None]:
# === Set Input & Output Directories ===

USE_GOOGLE_DRIVE = False  # Change to True if you want to use Google Drive

if not USE_GOOGLE_DRIVE:
    ia.output_dir = "/content/image-analyzer/output"                               # Local results folder
    ia.input_dir  = "/content/image-analyzer/data/test_human_20"                   # Local images folder
else:
    from google.colab import drive
    drive.mount('/content/drive')

    ia.output_dir = "/content/drive/MyDrive/image-analyzer/output"                 # Google Drive results folder
    ia.input_dir  = "/content/drive/MyDrive/image-analyzer/data/test_human_20"     # Google Drive images folder

# Create output directory if it doesn't exist
os.makedirs(ia.output_dir, exist_ok=True)

print(f"✅ Output directory: {ia.output_dir}")
print(f"📁 Input directory: {ia.input_dir}")

## ▶️ Step 3: Run the Analysis Pipeline

Now that Image Analyzer is set up and your input/output folders are configured, run the pipeline to process all images in the input directory.  

Results will be saved in the output directory in **CSV** and **Excel** formats.  

💡 *This step may take a while depending on dataset size and whether GPU acceleration is enabled.*
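
After the pipeline cell has finished, the CSV output can be inspected directly in the notebook. The sketch below makes several assumptions, because the result file names are not given here: the CSV files sit at the top level of the output directory, are UTF-8 encoded, and start with a header row. `preview_csv_results` is a hypothetical helper.

```python
import csv
from pathlib import Path


def preview_csv_results(output_dir, max_rows=5):
    """Return {file name: first rows as dicts} for every CSV file in output_dir."""
    previews = {}
    for csv_path in sorted(Path(output_dir).glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            rows = []
            # DictReader uses the header row as keys for each data row.
            for row in csv.DictReader(handle):
                if len(rows) >= max_rows:
                    break
                rows.append(row)
        previews[csv_path.name] = rows
    return previews
```

Calling `preview_csv_results(ia.output_dir)` returns an empty dictionary if no CSV file has been written yet.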

In [None]:
# Run the pipeline. When this cell completes, the results are in the output directory set above.
results, logs = ia.process_batch()

## 📄 Step 4 (Optional): Generate Min/Max Summary PDF

Once the analysis is complete, you can create a PDF report showing:

- An example image for each feature's **lowest** value.  
- An example image for each feature's **highest** value.

This is helpful for quickly reviewing feature extremes in your dataset.
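
To make the idea of feature extremes concrete, the selection can be sketched in plain Python. This is not how `create_argmin_argmax_pdf` is implemented; it is an illustration under assumptions. The `filename` column and the `brightness` feature are hypothetical names, while the excluded prefixes mirror the object detection features (`coco_*`, `imagenet_*`, `contains_*`) that are skipped by default.

```python
def feature_extremes(rows, id_key="filename",
                     exclude_prefixes=("coco_", "imagenet_", "contains_")):
    """For each numeric feature, return (image with lowest value, image with highest value)."""
    if not rows:
        return {}
    features = [
        key for key in rows[0]
        if key != id_key and not key.startswith(exclude_prefixes)
    ]
    extremes = {}
    for feature in features:
        valued = []
        for row in rows:
            try:
                valued.append((float(row[feature]), row[id_key]))
            except (TypeError, ValueError, KeyError):
                continue  # skip missing or non-numeric values
        if valued:
            lowest = min(valued, key=lambda pair: pair[0])
            highest = max(valued, key=lambda pair: pair[0])
            extremes[feature] = (lowest[1], highest[1])
    return extremes


# Hypothetical rows, shaped like the output of csv.DictReader.
rows = [
    {"filename": "a.jpg", "brightness": "0.2", "coco_person": "1"},
    {"filename": "b.jpg", "brightness": "0.9", "coco_person": "0"},
    {"filename": "c.jpg", "brightness": "0.5", "coco_person": "1"},
]
print(feature_extremes(rows))
```

For these rows the result is `{'brightness': ('a.jpg', 'b.jpg')}`: `a.jpg` has the lowest brightness, `b.jpg` the highest, and `coco_person` is excluded by its prefix.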

In [None]:
# Create a PDF file that shows an exemplary image for each feature's min and max values
# By default, all object detection features are excluded (i.e., coco_*, imagenet_*, contains_*)
ia.create_argmin_argmax_pdf(exclude_object_detection_features = True)