# ⚡️ QuickStart: Scene Index

<a href="https://colab.research.google.com/github/video-db/videodb-cookbook/blob/main/guides/video/scene-index/quickstart.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This guide introduces Scene Indexing.

Scene indexing lets you find visual information in videos. Vision models describe the scenes and frames you extract, which gives you fine-grained control over what information is pulled out of a video.

## Setup
---

### 📦  Installing packages   

In [None]:
!pip install videodb

### 🔑 API Keys

In [4]:
import os

os.environ["VIDEO_DB_API_KEY"] = ""

### 🌐 Connect to VideoDB

In [5]:
from videodb import connect

conn = connect()
coll = conn.get_collection()

### 🎥  Upload Video

In [6]:
video = coll.upload(url="https://www.youtube.com/watch?v=LejnTJL173Y")

## 📇 Index Scenes 
---

In just one command, the `index_scenes` function can index visual information in your video.

In [None]:
index_id = video.index_scenes()

`index_scenes()` has several optional parameters. You can use different extraction algorithms to select scenes and frames.  
Additionally, you can use a `prompt` to describe these scenes and frames using a vision model. Read more about the [Scene and Frame objects](https://github.com/video-db/videodb-cookbook/blob/main/guides/video/scene-index/advanced_visual_search.ipynb).

>Currently, this index is best suited for **semantic search**, so try to design your prompts to output **well-written prose** that can be indexed well for semantic search.
>
>😎 Soon we are going to support JSON and SQL data extraction and indexing.

In [None]:
from videodb import SceneExtractionType, IndexType

index_id = video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time":10, "select_frames": ['first']},
    prompt="describe the image in 100 words",
    # callback_url=callback_url,
)

# Wait for indexing to finish
scene_index = video.get_scene_index(index_id)
print(scene_index)

> Note: It might take an additional 5-10 seconds for your index to become available for search.

In [None]:
# Once indexing is done, you can start searching your video library.
# Default: search all indexes
res = video.search(query="drinking", index_type=IndexType.scene, index_id=index_id)
res.play()

## ⚙️ Index Scenes Parameters 
---

>**`index_scenes` parameters**:
>
> - `extraction_type`  - Choose scene extraction algorithm.
>
> - `extraction_config`  - Configuration of scene extraction algorithm.
>
> - `prompt` - Prompt to describe each scene in text.
>
> - `callback_url` - Notification URL to call when the job is done.

Let’s look at each parameter of this function in detail.

### ⚙️ `extraction_type` & `extraction_config`

Visually, a video is a series of images. A 60fps video, for instance, has 60 frames per second and feels higher in quality than a 30fps video. With `extraction_type`, we can experiment with different frame extraction algorithms.

We don’t have to send every frame to the vision model, as that would be inefficient and costly.

*We first present the simplest way to sample frames: choose a sampling frequency lower than the fps of the video 👇*

**⚙️ Time-Based Extraction**

First, set `extraction_type` to `SceneExtractionType.time_based`  
Then, to configure it, pass a Python dict to the `extraction_config` argument with the following keys:

* `time`: Specifies the interval (in seconds) at which scenes are segmented. Default value is `10`, meaning every 10 seconds is a scene.
* `select_frames`: A list of frames to select from each segment. The list can contain strings from the following: `"first"`, `"middle"`, or `"last"`, which selects the respective frames. Default value is `["first"]`
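
To make the two keys above concrete, here is a minimal sketch of how fixed-interval segmentation with frame selection could work. It is an illustration only, not VideoDB's implementation: the function name `time_based_segments` and the returned dict layout are hypothetical, and the `"last"` frame is approximated by the segment's end time.

```python
from typing import Dict, List, Optional


def time_based_segments(
    duration: float,
    time: float = 10,
    select_frames: Optional[List[str]] = None,
) -> List[Dict]:
    """Split a video of `duration` seconds into fixed-length scenes.

    Hypothetical helper: mirrors the `time` and `select_frames` keys
    described above, but is not part of the videodb SDK.
    """
    if time <= 0:
        raise ValueError("time must be positive")
    select_frames = select_frames or ["first"]

    segments = []
    start = 0.0
    while start < duration:
        # The final segment is shorter when duration is not a multiple of `time`.
        end = min(start + time, duration)
        picks = {
            "first": start,
            "middle": (start + end) / 2,
            "last": end,  # approximation: timestamp at the segment boundary
        }
        segments.append(
            {
                "start": start,
                "end": end,
                "frame_times": [picks[name] for name in select_frames],
            }
        )
        start = end
    return segments


# A 25-second video cut every 10 seconds yields three scenes.
segments = time_based_segments(duration=25, time=10, select_frames=["first", "middle"])
for seg in segments:
    print(seg)
```

A smaller `time` produces more scenes and therefore more frames sent to the vision model, which is the cost trade-off mentioned earlier.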

>Note: This algorithm may not perform well with static videos. We can develop more advanced methods to segment videos into a few scenes and frames. One such method is based on shot detection 👇
<br>


**⚙️ Shot-Based Extraction**

Videos share context between timestamps. A scene is a logical segment of a video that completes a concept. There are many ways to define a scene. One way is to identify scene changes based on the visual content of the video. The key factors are <u>significant changes in the visual content</u>, such as **transitions, lighting changes, and movement**.

First, set `extraction_type` to `SceneExtractionType.shot_based`  
Then, to configure it, pass a Python dict to the `extraction_config` argument with the following keys:

* `threshold`: Determines the sensitivity of the model towards scene changes within the video. Default value is `20`, which is known to work well for detecting camera shot changes in a video.
* `frame_count`: Accepts a number that specifies how many frames to pick from each shot. Default value is `1`. Increasing this number selects more frames from each shot, which can provide a more detailed analysis of the scene.  
<br>
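
The idea behind `threshold` and `frame_count` can be sketched with a toy detector. This is an assumption-laden illustration, not the algorithm VideoDB runs: each frame is reduced to a single hypothetical score (for example, mean brightness), a cut is declared when the score jumps by more than `threshold` between consecutive frames, and `pick_frames` spreads `frame_count` picks evenly across a shot. The threshold scale here is illustrative and not comparable to VideoDB's.

```python
from typing import List, Tuple


def detect_shots(frame_scores: List[float], threshold: float = 20) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) frame-index ranges, one per shot.

    Hypothetical helper: a cut is placed wherever the per-frame score
    changes by more than `threshold` from the previous frame.
    """
    if not frame_scores:
        return []
    shots = []
    shot_start = 0
    for i in range(1, len(frame_scores)):
        if abs(frame_scores[i] - frame_scores[i - 1]) > threshold:
            shots.append((shot_start, i - 1))
            shot_start = i
    shots.append((shot_start, len(frame_scores) - 1))
    return shots


def pick_frames(shot: Tuple[int, int], frame_count: int = 1) -> List[int]:
    """Pick up to `frame_count` evenly spaced frame indices from a shot."""
    if frame_count < 1:
        raise ValueError("frame_count must be at least 1")
    start, end = shot
    length = end - start + 1
    count = min(frame_count, length)  # a shot cannot yield more frames than it has
    step = length / count
    # Take the centre of each of the `count` equal slices of the shot.
    return [start + int(step * k + step / 2) for k in range(count)]


# Toy scores: three visually distinct stretches of frames.
scores = [10, 12, 11, 80, 82, 81, 79, 15, 14]
shots = detect_shots(scores, threshold=20)
print(shots)
print([pick_frames(shot, frame_count=1) for shot in shots])
```

A lower threshold splits the video into more shots, while a higher one merges gradual changes into a single shot. With VideoDB itself, these settings are passed as `extraction_config={"threshold": 20, "frame_count": 1}` alongside `extraction_type=SceneExtractionType.shot_based`.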

### ⚙️ `prompt`

VideoDB integrates with vision models to describe scenes. For example, you can use a prompt like the following to describe each scene:

`"Describe clearly what is happening in the video. Add running_detected if you see a person running."`

This prompt is sent to the vision model. If you are interested in experimenting with your own model, check out our guide on [Advanced Visual Search Pipelines](https://github.com/video-db/videodb-cookbook/blob/main/guides/video/scene-index/advanced_visual_search.ipynb).

<br>

### ⚙️ `callback_url`

The URL to which a notification is sent when the scene indexing process is completed.
<br>
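
If you want to try a callback locally, a receiver could look roughly like the sketch below. The guide does not specify the notification format, so this assumes an HTTP `POST` with a JSON body; the names `parse_notification`, `CallbackHandler`, and `serve` are hypothetical, and the payload is simply printed rather than interpreted.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_notification(body: bytes) -> dict:
    """Decode a notification body, returning {} if it is not a JSON object.

    Assumption: the notification arrives as JSON. The actual payload
    schema is not documented in this guide.
    """
    try:
        payload = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return {}
    return payload if isinstance(payload, dict) else {}


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes announced by the client.
        length = int(self.headers.get("Content-Length", 0))
        payload = parse_notification(self.rfile.read(length))
        print("Notification received:", payload)
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Block and listen for callbacks on the given port."""
    HTTPServer(("", port), CallbackHandler).serve_forever()


# serve()  # uncomment to start listening; the URL must be publicly reachable
```

For the callback to reach this server, the `callback_url` you pass has to point at a publicly reachable address for this port, for example via a tunnelling tool.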

>We offer an open pipeline where you can experiment in detail with extraction algorithms and prompts.   
>
>Check out the following:
>
>- [Playground for Scene Extractions](https://github.com/video-db/videodb-cookbook/blob/main/guides/video/scene-index/playground_scene_extraction.ipynb)
>
>- [Advanced Visual Search Pipelines](https://github.com/video-db/videodb-cookbook/blob/main/guides/video/scene-index/advanced_visual_search.ipynb)

## 🗂️ Managing Indexes
---

>  
> 💡 You can create multiple scene indexes for a video and rank the results after a search before presenting them to your user.
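
One simple way to rank results from several indexes is to merge the hits and sort them by score. The sketch below works on plain dicts with hypothetical `start`, `end`, and `score` keys rather than on VideoDB result objects, and it assumes scores from different indexes are comparable, which may not hold in practice.

```python
from typing import Dict, List


def merge_and_rank(results_per_index: Dict[str, List[dict]], top_k: int = 5) -> List[dict]:
    """Merge hits from several indexes and keep the `top_k` highest scores.

    Hypothetical helper: each hit is a dict with at least a "score" key.
    """
    merged = []
    for index_id, hits in results_per_index.items():
        for hit in hits:
            # Remember which index produced the hit.
            merged.append({**hit, "index_id": index_id})
    merged.sort(key=lambda hit: hit["score"], reverse=True)
    return merged[:top_k]


# Illustrative data only; these are not real search results.
results = {
    "index_a": [
        {"start": 0, "end": 10, "score": 0.62},
        {"start": 30, "end": 40, "score": 0.41},
    ],
    "index_b": [{"start": 5, "end": 12, "score": 0.77}],
}
ranked = merge_and_rank(results, top_k=2)
for hit in ranked:
    print(hit)
```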

**List all scene indexes created on a video:**

`Video.list_scene_index()` returns a list of the available scene indexes with their `id` and `status`.

In [None]:
scene_indexes = video.list_scene_index()
print(scene_indexes)

**Get Specific Index:**  

`Video.get_scene_index()` returns the list of indexed scenes with `scene_index_id`, `start`, `end` and `description`.

In [None]:
scene_index = video.get_scene_index(index_id) 
print(scene_index)
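
Once you have the scene records, you can inspect them with ordinary Python. The sketch below assumes each record is a dict carrying the `start`, `end`, and `description` fields listed above; the sample data and the helper names `find_scenes` and `format_span` are hypothetical.

```python
from typing import List


def find_scenes(scenes: List[dict], keyword: str) -> List[dict]:
    """Return the scenes whose description mentions `keyword` (case-insensitive)."""
    keyword = keyword.lower()
    return [s for s in scenes if keyword in s.get("description", "").lower()]


def format_span(scene: dict) -> str:
    """Render a scene's time range, e.g. '10.0s - 20.0s'."""
    return f'{scene["start"]:.1f}s - {scene["end"]:.1f}s'


# Illustrative records shaped like the fields described above.
sample_scenes = [
    {"start": 0, "end": 10, "description": "A man is walking through a park."},
    {"start": 10, "end": 20, "description": "A woman is drinking coffee at a table."},
]
matches = find_scenes(sample_scenes, "Drinking")
for scene in matches:
    print(format_span(scene), scene["description"])
```

This is a plain keyword filter for quick inspection; for semantic matching, use `video.search` as shown earlier.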

**Delete an index:**

In [None]:
video.delete_scene_index(index_id)

## 🧑‍💻 Deep Dive
---


Check out the other resources and tutorials on Scene Indexing:
* If you want to bring your own scene descriptions and annotations, explore the [Custom Annotations Pipeline](https://github.com/video-db/videodb-cookbook/blob/main/guides/video/scene-index/custom_annotations.ipynb)
* Experiment with extraction algorithms, prompts, and search using the [Playground for Scene Extractions](https://github.com/video-db/videodb-cookbook/blob/main/guides/video/scene-index/playground_scene_extraction.ipynb)
* Check out our open and flexible [Advanced Visual Search Pipelines](https://github.com/video-db/videodb-cookbook/blob/main/guides/video/scene-index/advanced_visual_search.ipynb)


If you have any questions or feedback, feel free to reach out to us 🙌🏼

* [Discord](https://discord.gg/py9P639jGz)
* [GitHub](https://github.com/video-db)
* [VideoDB](https://videodb.io)
* [Email](mailto:ashu@videodb.io)