# 📌 VideoDB F1 Race Search Pipeline (Turn Detection & Metadata Filtering)

<a href="https://colab.research.google.com/github/video-db/videodb-cookbook/blob/main/quickstart/scene_level_metadata_indexing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 🎯 Objective
This notebook demonstrates **scene-level metadata filtering** on an F1 race video: each scene is tagged with structured metadata so that searches can be narrowed to specific camera views and driver actions.

## 🔍 What We’re Doing:
✔ Uploading an **F1 race video**  
✔ **Extracting scenes** every 2 seconds (1 frame per scene)  
✔ **Describing scenes** using AI-generated metadata  
✔ **Indexing scenes** with structured metadata (`camera_view` & `action_type`)  
✔ **Searching scenes** using **semantic search + metadata filtering**  


# 📦 Install VideoDB SDK  
Required for connecting and processing video data.  


In [None]:
!pip install videodb

Collecting videodb
  Downloading videodb-0.2.10.tar.gz (25 kB)
  Preparing metadata (setup.py) ... done
Collecting backoff>=2.2.1 (from videodb)
  Downloading backoff-2.2.1-py3-none-any.whl.metadata (14 kB)
Downloading backoff-2.2.1-py3-none-any.whl (15 kB)
Building wheels for collected packages: videodb
  Building wheel for videodb (setup.py) ... done
  Created wheel for videodb: filename=videodb-0.2.10-py3-none-any.whl size=27143 sha256=17790c3062d5620ef1448b2c684de7485815f6d4e401e0aeb980f4ae41081e68
  Stored in directory: /root/.cache/pip/wheels/ac/43/46/922da11f9ba349968e03820b5e92a4949c78e423f6c8ec37a3
Successfully built videodb
Installing collected packages: backoff, videodb
Successfully installed backoff-2.2.1 videodb-0.2.10


# 🔑 Set Up API Key  
Authenticate with VideoDB to access indexing and search functionalities.  


In [None]:
import os

os.environ["VIDEO_DB_API_KEY"] = ""
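
Hardcoding the key in a notebook cell makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the key is read from the `VIDEO_DB_API_KEY` environment variable as in the cell above; `resolve_api_key` is a hypothetical helper, not part of the VideoDB SDK:

```python
import os
from getpass import getpass


def resolve_api_key(env, prompt=getpass):
    """Return the key from `env`, prompting only when it is missing or blank."""
    key = env.get("VIDEO_DB_API_KEY", "").strip()
    if not key:
        # getpass hides the typed value, so the key never appears in cell output
        key = prompt("Enter your VideoDB API key: ").strip()
    if not key:
        raise ValueError("VIDEO_DB_API_KEY is empty")
    return key


# Usage in the notebook (left commented out so the block runs without prompting):
# os.environ["VIDEO_DB_API_KEY"] = resolve_api_key(os.environ)
```

The `prompt` argument exists only so the helper can be exercised without an interactive terminal.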

# 🌐 Connect to VideoDB  
Establishes connection to manage video storage, indexing, and search.  


In [None]:
from videodb import connect

conn = connect()
coll = conn.get_collection()

print(coll.id)

c-81fc6459-fe30-44ac-8c5b-ea0898c2e152


# 🎥 Upload F1 Race Video  
Adds the video to VideoDB for further processing.  


In [None]:
video = coll.upload(url="https://www.youtube.com/watch?v=2-oslsgSaTI")
print(video.id)

m-z-01954d91-651d-7ef0-a022-18aad60eabdb


## ✂️ Extracting Scenes (Every 2 Seconds)
We split the video into **2-second scenes**, extracting a **single frame per scene** for indexing.

### **Why?**
- This ensures **granular indexing**, making **scene-level filtering more precise**.
- By extracting **one representative frame** per scene (the middle frame of each 2-second window), we can later **assign AI-generated metadata** to describe each scene.


In [None]:
# Extract Scenes Every 2 Seconds (1 Frame per Scene)
from videodb import SceneExtractionType

scene_collection = video.extract_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 2, "select_frames": ["middle"]},
)

print(f"Scene Collection ID: {scene_collection.id}")

scenes = scene_collection.scenes

print(f"Total Scenes Extracted: {len(scenes)}")

Scene Collection ID: tt2smf1
Total Scenes Extracted: 148
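
To see where a scene count like this comes from, the windowing arithmetic can be sketched locally. This is an illustration only, not VideoDB's implementation: `time_based_windows` is a hypothetical helper, the 296-second duration is an assumed value chosen to match 148 two-second windows, and clipping the final partial window is an assumption about edge handling.

```python
def time_based_windows(duration, step=2.0):
    """Split [0, duration) into consecutive windows of `step` seconds.

    Returns (start, end, middle) tuples; the last window is clipped to
    `duration` (an assumption, not confirmed VideoDB behaviour).
    """
    if duration <= 0 or step <= 0:
        raise ValueError("duration and step must be positive")
    windows = []
    start = 0.0
    while start < duration:
        end = min(start + step, duration)
        windows.append((start, end, (start + end) / 2))
        start = end
    return windows


windows = time_based_windows(296.0, step=2.0)  # 296 s is a hypothetical duration
print(len(windows))  # 148
print(windows[0])    # (0.0, 2.0, 1.0)
```

The third value in each tuple is the midpoint of the window, which is roughly where a `"middle"` frame would be sampled.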


## 🔍 Generating Scene Metadata
To **make scenes searchable**, we use AI to **describe & categorize** each scene with the following **structured metadata**:

### **📌 Scene-Level Metadata Fields:**
1️⃣ **`camera_view`** → **Where is the camera placed?**  
   - `"road_ahead"` → Driver’s **POV looking forward**  
   - `"helmet_selfie"` → Close-up of **driver’s helmet**  

2️⃣ **`action_type`** → **What is the driver doing?**  
   - `"clear_road"` → No cars ahead (clean lap)  
   - `"chasing"` → Following another car (intense racing moment)  

### **🚀 Why This Matters**
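- **Metadata filtering** lets us restrict a search to **specific race scenarios**, such as forward-facing shots where the driver is chasing another car.  
- **Combining metadata & semantic search** narrows results twice: the filter drops scenes with the wrong labels, and the semantic query ranks what remains.  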


In [None]:
from videodb.scene import Scene

# List to store described scenes
described_scenes = []

for scene in scenes:
    print(f"Scene from {scene.start}s to {scene.end}s")

    # Generate metadata
    camera_view = scene.describe(
        'Select ONLY one of these camera views (DO NOT describe it, JUST return the category name): ["road_ahead", "helmet_selfie"]. If the view does not match exactly, pick the closest one.'
    )

    action_type = scene.describe(
        'Select ONLY one of these options based on the action being performed by the driver (DO NOT describe it, JUST return the category name): ["clear_road", "chasing"]. If the action does not match exactly, pick the closest one.'
    )

    scene_description = scene.describe(
        "Clearly describe a Formula 1 scene by specifying the scene type, the drivers and teams involved, the specific location on the track, and the key action or significance of the moment. Use concise, yet rich language, targeting Formula 1 enthusiasts seeking precise scene descriptions."
    )

    print(f"Camera View: {camera_view} | Action Type: {action_type}")
    print(f"Scene Description: {scene_description}")

    # Create Scene object with metadata
    described_scene = Scene(
        video_id=video.id,
        start=scene.start,
        end=scene.end,
        description=scene_description,
        metadata={
            "camera_view": camera_view,
            "action_type": action_type
        }
    )
    described_scenes.append(described_scene)

print(f"Total Scenes Described: {len(described_scenes)}")


Scene from 0.0s to 2.0s
Camera View: road_ahead | Action Type: chasing
Scene Description: The scene is a tense overtaking maneuver at the Circuit de Monaco, featuring Max Verstappen in the Red Bull and Charles Leclerc in the Ferrari. Verstappen, having closed the gap significantly, is seen approaching Leclerc on the entry to the iconic Tunnel section, a narrow, winding portion of the track. With the tight confines and limited visibility, Verstappen must navigate the challenging turn with precision to make a successful overtake. The significance of the moment lies in the potential for a dramatic change in race leadership, as Verstappen seeks to regain the top position he lost earlier in the race.
Scene from 2.0s to 4.0s
Camera View: road_ahead | Action Type: chasing
Scene Description: The scene depicts a close-up, in-car perspective of a Formula 1 race. The driver, sporting a Walmart-branded livery, is navigating a tight corner on the track. The camera captures the driver's view as they
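
Because the category labels come back as free-form model text, a stray quote, capital letter, or trailing period could stop a label from matching a metadata filter later (assuming filters compare values exactly, which the source does not state). A minimal sketch of a cleanup step; `normalize_label` and the two constant tuples are hypothetical names introduced here, not part of the VideoDB SDK:

```python
import re

CAMERA_VIEWS = ("road_ahead", "helmet_selfie")
ACTION_TYPES = ("clear_road", "chasing")


def normalize_label(raw, allowed, default=None):
    """Map free-form model output onto one of the `allowed` category names."""
    # Lowercase, treat hyphens as underscores, and drop everything else
    text = re.sub(r"[^a-z_]+", " ", str(raw).lower().replace("-", "_"))
    tokens = text.split()
    for label in allowed:
        if label in tokens:
            return label
    # Fall back to labels written with spaces, e.g. "road ahead"
    joined = " ".join(tokens)
    for label in allowed:
        if label.replace("_", " ") in joined:
            return label
    return default


print(normalize_label(' "Road_Ahead"\n', CAMERA_VIEWS))     # road_ahead
print(normalize_label("Category: chasing.", ACTION_TYPES))  # chasing
```

In the loop above, each `scene.describe(...)` result could be passed through `normalize_label` before it is stored in `metadata`. If the text mentions more than one allowed label, the first one in `allowed` wins.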

## 🗂 Indexing Scenes with Metadata
Now that we have **generated metadata** for each scene, we **index them** to make them **searchable**.

### **🚀 Why This is Powerful**
✔ **Scene-level metadata makes filtering more effective**.  
✔ **Instead of searching the entire video, we only search relevant indexed segments.**  
✔ **Future searches can now filter by camera view & driver action.**  


In [None]:
if described_scenes:
    scene_index_id = video.index_scenes(
        scenes=described_scenes,
        name="F1 Scenes"
    )
    print(f"Scenes Indexed under ID: {scene_index_id}")

Scenes Indexed under ID: 5748b6e4c9b64156
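
A quick tally of the label combinations shows how many scenes each filter can possibly return, which helps explain an empty or tiny result set later. A minimal sketch on plain dictionaries; `tally_metadata` is a hypothetical helper and the sample records are made up for illustration:

```python
from collections import Counter


def tally_metadata(records):
    """Count scenes per (camera_view, action_type) pair."""
    return Counter(
        (r.get("camera_view"), r.get("action_type")) for r in records
    )


# Made-up sample records standing in for the real scene metadata
sample = [
    {"camera_view": "road_ahead", "action_type": "chasing"},
    {"camera_view": "road_ahead", "action_type": "chasing"},
    {"camera_view": "road_ahead", "action_type": "clear_road"},
    {"camera_view": "helmet_selfie", "action_type": "clear_road"},
]
counts = tally_metadata(sample)
for (view, action), n in counts.most_common():
    print(f"{view:>14} | {action:<10} | {n}")
```

In this notebook the input would be the metadata dictionaries built in the description loop, for example `tally_metadata(s.metadata for s in described_scenes)`, assuming each `Scene` exposes its dictionary as `.metadata`.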


## 🔎 Searching Scenes with Metadata & AI  
Now that our scenes are indexed, we can **search using a combination of**:  
✅ **Semantic Search** → AI understands the meaning of the query.  
✅ **Metadata Filters** → Only return relevant scenes based on camera view & action type.  

---

#### 🔍 **Example 1: Finding Intense Chasing Moments**  
Search for **scenes where a driver is chasing another car**, viewed from the **driver's perspective**.  


In [None]:
from videodb import IndexType, SearchType

search_results = video.search(
    query="A skillful chasing scene",
    filter=[{"camera_view": "road_ahead"}, {"action_type": "chasing"}],  # Using metadata filter
    search_type=SearchType.semantic,
    index_type=IndexType.scene,
    result_threshold=100,
    scene_index_id=scene_index_id,  # Our indexed scenes
)
# Play the search results
search_results.play()

#### 🔍 **Example 2: Finding Smooth Solo Driving Moments**  
Search for **scenes with clean, precise turns**, where the driver has an **open road ahead**.  


In [None]:
search_results = video.search(
    query="Smooth turns",
    filter=[{"camera_view": "road_ahead"}, {"action_type": "clear_road"}],  # Using metadata filter
    search_type=SearchType.semantic,
    index_type=IndexType.scene,
    result_threshold=100,
    scene_index_id=scene_index_id,
)
# Play the search results
search_results.play()
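
The `filter` argument in both examples is a list of single-key dictionaries. The example descriptions imply that a scene must satisfy every condition (logical AND), although the source does not state this outright. A local sketch of that assumed behaviour on made-up data; `matches_all` is a hypothetical helper and does not call VideoDB:

```python
def matches_all(metadata, conditions):
    """True when `metadata` satisfies every single-key condition (logical AND)."""
    return all(
        metadata.get(key) == value
        for condition in conditions
        for key, value in condition.items()
    )


# Made-up scene metadata for illustration
scenes_meta = [
    {"start": 0.0, "camera_view": "road_ahead", "action_type": "chasing"},
    {"start": 2.0, "camera_view": "road_ahead", "action_type": "clear_road"},
    {"start": 4.0, "camera_view": "helmet_selfie", "action_type": "chasing"},
]
conditions = [{"camera_view": "road_ahead"}, {"action_type": "chasing"}]
kept = [s["start"] for s in scenes_meta if matches_all(s, conditions)]
print(kept)  # [0.0]
```

Only the first scene passes both conditions; the semantic query would then rank whatever survives the filter.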

# ✅ Conclusion: Precision Search with Scene Metadata  
With **scene-level metadata indexing**, we can:  
✔ **Precisely filter race footage** by camera angles & driver actions.  
✔ **Use AI-powered semantic search** to find **specific race moments.**  
✔ **Enhance video retrieval** for F1 analysis, highlights & research.  

🚀 **This approach enables metadata-driven video search, making race footage searchable at 2-second granularity.**  
