# 📸 Scene Index: Keyframe Extraction

<a href="https://colab.research.google.com/github/video-db/videodb-cookbook/blob/main/guides/video/scene-index/scene-extraction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

> This guide assumes you are already familiar with the concept of Scene Indexing. If you are not, please refer to our [Scene Index: QuickStart Guide]() to get up to speed.

In this guide, we will focus on configuring the Keyframe Extraction pipeline to optimize the Scene Index.

## Setup
---

### 📦  Installing packages   

In [None]:
!pip install videodb

### 🔑 API Keys

In [30]:
import os

os.environ["VIDEO_DB_API_KEY"] = ""

### 🌐 Connect to VideoDB

In [None]:
from videodb import connect

conn = connect()
coll = conn.get_collection()

### 🎥  Upload Video

In [4]:
video = coll.upload(url="https://www.youtube.com/watch?v=LejnTJL173Y")

## 📇 Scene Extraction Configuration
---

During video analysis, each video is first split into segments, each called a `Scene`. From each `Scene`, specific `Frame`s are selected to accurately describe it.

The criteria for determining a new scene and the number of frames to pick from each scene are defined by the pipeline’s configuration.

### Types of Scene Extraction Pipelines

There are two main types of Scene Extraction pipelines:

1. **Scene-Based Pipeline**:
This pipeline identifies scene changes based on the visual content of the video. Key factors include significant visual changes such as transitions, lighting changes, and movement.

2. **Time-Based Pipeline**:
This pipeline segments the video based on predefined time intervals. 


Each of these pipelines has configurable parameters that you can adjust to suit your specific requirements.


## 🎬 Scene-Based Extraction
---

There are two arguments that you can pass to `video.index_scenes()` to configure the pipeline:

### ⚙️ Extraction Type    

To set Scene-Based extraction as your pipeline, set the `extraction_type` argument to `SceneExtractionType.scene_based`.

>The default value for this argument is already `SceneExtractionType.scene_based`, so you can omit it if you prefer.




### ⚙️ Extraction Config

To configure the pipeline further, you can pass a Python Dict to the `extraction_config` argument.

The Python Dict can contain the following keys:
- **`threshold`**: Determines how sensitive the model is to scene changes within the video. The default value is `20`, which works well for detecting camera shot changes.
- **`frame_count`**: The number of frames to pick from each scene. The default value is `1`.

In [None]:
from videodb import SceneExtractionType

video.index_scenes(
    extraction_type=SceneExtractionType.scene_based,
    extraction_config={"threshold": 15, "frame_count": 2},
)
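
To build some intuition for these two keys, here is a minimal sketch of how a threshold could drive scene splitting. It assumes, as many content-based detectors do, that a cut is declared whenever a frame-to-frame change score exceeds the threshold; this guide does not state how VideoDB computes scene changes, so `split_scenes`, `pick_frames`, and the scores below are purely illustrative and not part of the VideoDB SDK.

```python
from typing import List, Tuple


def split_scenes(diff_scores: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Split frames into scenes wherever the change score exceeds `threshold`.

    diff_scores[i] is a hypothetical measure of how much frame i + 1 differs
    from frame i. Returns inclusive (first_frame, last_frame) index pairs.
    """
    scenes = []
    start = 0
    for i, score in enumerate(diff_scores):
        if score > threshold:
            # Frame i + 1 starts a new scene, so the current one ends at frame i.
            scenes.append((start, i))
            start = i + 1
    # Close the final scene; there is one more frame than there are scores.
    scenes.append((start, len(diff_scores)))
    return scenes


def pick_frames(first: int, last: int, frame_count: int) -> List[int]:
    """Pick up to `frame_count` evenly spaced frame indices from a scene."""
    if frame_count < 1:
        raise ValueError("frame_count must be at least 1")
    length = last - first + 1
    count = min(frame_count, length)
    # Centre each pick inside its own equal-width slice of the scene.
    return [first + (2 * k + 1) * length // (2 * count) for k in range(count)]


scores = [3.0, 4.0, 25.0, 2.0, 18.0, 1.0]  # made-up change scores for 7 frames
for threshold in (20, 15):
    scenes = split_scenes(scores, threshold)
    print(threshold, scenes, [pick_frames(a, b, 2) for a, b in scenes])
```

Under this assumption, lowering the threshold produces more, shorter scenes, and raising `frame_count` yields more keyframes per scene.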

## ⏱️ Time-Based Extraction
---

There are two arguments that you can pass to `video.index_scenes()` to configure the pipeline:

### ⚙️ Extraction Type    

To set Time-Based extraction as your pipeline, set the `extraction_type` argument to `SceneExtractionType.time_based`.



### ⚙️ Extraction Config

To configure the Time-Based pipeline, you can pass a Python Dict to the `extraction_config` argument.

The Python Dict can contain the following keys:
- **`time`**: Specifies the interval (in seconds) at which scenes are segmented. Default value is `10`.
- **`select_frames`**: A list of frames to select from each segment. The list can contain any of the strings `"first"`, `"middle"`, or `"last"`, each of which selects the corresponding frame. The default value is `["first"]`.


In [None]:
from videodb import SceneExtractionType

video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 60, "select_frames": ["middle"]},
)
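
The sketch below illustrates what the `time` and `select_frames` keys describe: a video cut into fixed-length segments, with named positions picked from each one. The helpers `time_segments` and `frame_times` are hypothetical and not part of the VideoDB SDK, and treating `"last"` as the segment's end timestamp is an assumption made for illustration.

```python
import math
from typing import Dict, List, Tuple


def time_segments(duration: float, interval: float) -> List[Tuple[float, float]]:
    """Cut [0, duration] into consecutive segments of `interval` seconds."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    count = math.ceil(duration / interval)
    # The last segment may be shorter than `interval`.
    return [(i * interval, min((i + 1) * interval, duration)) for i in range(count)]


def frame_times(start: float, end: float, select_frames: List[str]) -> Dict[str, float]:
    """Map each of "first", "middle", "last" to a timestamp in the segment."""
    positions = {"first": start, "middle": (start + end) / 2, "last": end}
    return {name: positions[name] for name in select_frames}


# A 150-second video with the config used above: {"time": 60, "select_frames": ["middle"]}
for start, end in time_segments(duration=150, interval=60):
    print(start, end, frame_times(start, end, ["middle"]))
```

With a 60-second interval, a 150-second video would give three segments, the last one only 30 seconds long.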

## Extract Scenes without Indexing
---

If you want to extract scenes from the video without indexing them, you can use the `video.extract_scenes()` function.

This function accepts the same `extraction_type` and `extraction_config` arguments as `video.index_scenes()`.

In [None]:
video.extract_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 30, "select_frames": ["middle"]},
)
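
Since the config is a plain dict, a typo in a key name is easy to miss. One option is to build the dict through a small helper that only allows the keys and values documented in this guide. `build_time_based_config` is a hypothetical helper, not part of the VideoDB SDK; a minimal sketch:

```python
from typing import Any, Dict

# Keys and defaults as documented in this guide.
TIME_BASED_DEFAULTS: Dict[str, Any] = {"time": 10, "select_frames": ["first"]}
ALLOWED_FRAME_POSITIONS = {"first", "middle", "last"}


def build_time_based_config(**overrides: Any) -> Dict[str, Any]:
    """Return a time-based `extraction_config`, rejecting undocumented values."""
    unknown = set(overrides) - set(TIME_BASED_DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown config keys: {sorted(unknown)}")
    config = {**TIME_BASED_DEFAULTS, **overrides}
    # Copy the list so callers never share the module-level default.
    config["select_frames"] = list(config["select_frames"])
    if config["time"] <= 0:
        raise ValueError("`time` must be a positive number of seconds")
    invalid = set(config["select_frames"]) - ALLOWED_FRAME_POSITIONS
    if invalid:
        raise ValueError(f"Unknown frame positions: {sorted(invalid)}")
    return config


config = build_time_based_config(time=30, select_frames=["middle"])
print(config)
```

The resulting dict can then be passed as `extraction_config` to `video.extract_scenes()` or `video.index_scenes()`.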

## Viewing, Inspecting, and Deleting Your Scenes
---

For every scene extraction pipeline that you run on a video, a `SceneCollection` object is created.

You can use the following functions to view, inspect, and delete your `SceneCollection`s.

**Viewing all `SceneCollection`s for a Video**

In [None]:
scene_collections = video.list_scene_collection()
for scene_collection in scene_collections:
    print("Scene Collection Id:", scene_collection["scene_collection_id"])

**Get `SceneCollection` by ID**

In [17]:
scene_collection = video.get_scene_collection("tt60sf")

**Inspecting `SceneCollection`**

In [None]:
print("This is scene collection id", scene_collection.id)
print("This is scene collection config", scene_collection.config)
scenes = scene_collection.scenes
for scene in scenes:
    print(f"Scene Duration {scene.start}-{scene.end}")
    for frame in scene.frames:
        print(f"- Frame at {frame.frame_time} {frame.url}")
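
When comparing configurations, a quick summary is often easier to read than the full listing. The sketch below relies only on the `start`, `end`, and `frames` attributes used above, and assumes `start` and `end` are numeric timestamps in seconds. `summarize_scenes` is a hypothetical helper, and the stand-in objects exist only so the example runs without a VideoDB connection.

```python
from types import SimpleNamespace


def summarize_scenes(scenes):
    """Return scene count, frame count, and mean scene length in seconds."""
    scenes = list(scenes)
    total_frames = sum(len(scene.frames) for scene in scenes)
    total_seconds = sum(scene.end - scene.start for scene in scenes)
    mean_seconds = total_seconds / len(scenes) if scenes else 0.0
    return {"scenes": len(scenes), "frames": total_frames, "mean_seconds": mean_seconds}


# Stand-in objects that mimic the attributes printed in the cell above.
fake_scenes = [
    SimpleNamespace(start=0.0, end=30.0, frames=[SimpleNamespace(frame_time=15.0, url="")]),
    SimpleNamespace(start=30.0, end=50.0, frames=[SimpleNamespace(frame_time=40.0, url="")]),
]
print(summarize_scenes(fake_scenes))
```

In the notebook, you would call `summarize_scenes(scene_collection.scenes)` instead of passing the stand-in list.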

**Delete a `SceneCollection`**

In [None]:
video.delete_scene_collection("tt60sf")

## 👨‍💻 Next Steps
---

Check out the other resources and tutorials on Scene Indexing:
- Guide on [Configuring Vision model]()
- Use-case-oriented tutorials

If you have any questions or feedback, feel free to reach out to us 🙌🏼

* [Discord](https://discord.gg/py9P639jGz)
* [GitHub](https://github.com/video-db)
* [VideoDB](https://videodb.io)
* [Email](mailto:ashu@videodb.io)