# ⚡️ QuickStart: Scene Index

<a href="https://colab.research.google.com/github/video-db/videodb-cookbook/blob/main/quickstart/Scene%20Index%20QuickStart.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This guide introduces Scene Indexing and how it works.

Scene indexing lets you find visual information in videos. Vision models can now extract useful descriptions from video frames, and VideoDB lets you index those descriptions so that they become searchable.

With a scene index, you can build RAG pipelines for queries like:
![](https://raw.githubusercontent.com/video-db/videodb-cookbook/main/images/scene_index/intro.png)

## Setup
---

### 📦  Installing packages   

In [1]:
!pip install -U videodb



### 🔑 API Keys

In [2]:
import os

# Replace with your key
os.environ["VIDEO_DB_API_KEY"] = "sk-xxxx-yyyyy-zzzz"

### 🌐 Connect to VideoDB

In [3]:
from videodb import connect

conn = connect()
coll = conn.get_collection()

### 🎥  Upload Video

In [4]:
video = coll.upload(url="https://www.youtube.com/watch?v=LejnTJL173Y")

## 📇 Index Scenes 
---

In just one command, the `index_scenes` function can index visual information in your video.

In [None]:
index_id = video.index_scenes()

### Optional Parameters

The `index_scenes()` function accepts a few optional parameters that you can use to tune the index.

* You can use different extraction algorithms to select scenes and frames.
* Additionally, you can use prompts to describe these scenes and frames using a vision model.

Read more about the [Scene and Frame objects](https://github.com/video-db/videodb-cookbook/blob/main/guides/video/scene-index/advanced_visual_search.ipynb).

In [6]:
from videodb import SceneExtractionType, IndexType

index_id = video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 10, "select_frames": ["first"]},
    prompt="describe the image in 100 words",
    # callback_url=callback_url,
)

# Wait for indexing to finish, then fetch the scene index
scene_index = video.get_scene_index(index_id)
scene_index

[{'description': 'The image depicts a man sitting in an office or conference room, characterized by the presence of glass windows with blinds behind him. He is wearing a dark suit, a white dress shirt, and a dark striped tie. The man appears to be contemplative, with his eyes closed or looking down, and a slight smile on his face. The background shows a bright, well-lit room with natural light filtering through the windows. The atmosphere seems professional and formal, suggesting a workplace or corporate environment. The man’s bald head and expression give an impression of a moment of reflection or contentment.',
  'end': 10.01,
  'start': 0.0},
 {'description': 'The image shows a man with a receding hairline, wearing a dark suit, light blue shirt, and dark striped tie. He appears to be seated, with a neutral or slightly contemplative expression on his face. Behind him, there are large office windows with horizontal blinds partially closed, through which an indistinct office environmen

> Note: It might take an additional 5-10 seconds for your index to become available for search.

In [7]:
# Search the video against a specific scene index by passing index_id.
# Default case: if index_id is omitted, all indexes are searched.

res = video.search(query="religious gathering",
                   index_type=IndexType.scene,
                   index_id=index_id)

res.play()

'https://console.videodb.io/player?url=https://stream.videodb.io/v3/published/manifests/eb0cbd2b-558e-4eda-bca4-893c5e9713ec.m3u8'
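
Because a freshly built index can take a few seconds to become searchable, one option is to retry the search a few times before giving up. The helper below is a minimal sketch and not part of the VideoDB SDK: `retry_until_ready` is a hypothetical name, and it assumes that searching an index that is not ready yet raises an exception.

```python
import time


def retry_until_ready(action, attempts=5, delay=3.0, sleep=time.sleep):
    """Call `action()` until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as err:  # assumption: "not ready" surfaces as an exception
            last_error = err
            if attempt < attempts:
                sleep(delay)  # wait before the next try
    raise last_error


# Usage with the objects from this guide (not executed here):
# res = retry_until_ready(
#     lambda: video.search(query="religious gathering",
#                          index_type=IndexType.scene,
#                          index_id=index_id)
# )
```

The `sleep` argument is injectable only so that the helper can be exercised without real waiting.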

## ⚙️ Index Scenes Parameters 
---

>**`index_scenes` parameters**:
>
> - `extraction_type` - The scene extraction algorithm to use.
>
> - `extraction_config` - The configuration for the scene extraction algorithm.
>
> - `prompt` - The prompt used to describe each scene in text.
>
> - `callback_url` - The URL to notify when the job is done.

Let's look at each parameter in detail.

### ⚙️ `extraction_type` & `extraction_config`

Visually, a video is a series of images on a timeline. A 60 fps video, for instance, shows 60 frames per second and feels smoother than a 30 fps video. The `extraction_type` parameter lets you experiment with scene extraction algorithms and, in turn, choose the frames that best describe the details of each scene. Check out [Scene Extraction Algorithms](https://docs.videodb.io/scene-extraction-algorithms-84) for details.

![](https://raw.githubusercontent.com/video-db/videodb-cookbook/main/images/scene_index/VSF.png)
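
To build intuition for time-based extraction with `{"time": 10}`, the sketch below splits a video's duration into consecutive 10-second windows. It is only an illustration of the idea, not VideoDB's implementation: `time_windows` is a hypothetical helper, and the real scene boundaries can differ slightly (the sample output earlier in this guide shows a first scene ending at 10.01).

```python
def time_windows(duration, step):
    """Split [0, duration] into consecutive (start, end) windows of `step` seconds."""
    if step <= 0:
        raise ValueError("step must be positive")
    duration = float(duration)
    windows = []
    start = 0.0
    while start < duration:
        end = min(start + step, duration)  # the last window may be shorter
        windows.append((start, end))
        start = end
    return windows


print(time_windows(25, 10))
# [(0.0, 10.0), (10.0, 20.0), (20.0, 25.0)]
```

With `select_frames` set to `['first']`, as in the earlier cell, each such window would be described using its first frame.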

### ⚙️ `prompt`

The prompt tells the vision model the context and the kind of output you want. For example, to identify running activity, you can use the following prompt to describe each scene:

>  “Describe clearly what is happening in the video. Add running_detected if you see a person running.”

If you are interested in experimenting with your own models and prompts, check out [Advanced Visual Search Pipelines](https://github.com/video-db/videodb-cookbook/blob/main/guides/scene-index/advanced_visual_search.ipynb).
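
A marker such as `running_detected` in the prompt above makes the resulting descriptions easy to filter in plain Python. The following is a minimal sketch: `scenes_with_tag` is a hypothetical helper, the sample scenes are made up for illustration, and it assumes the vision model actually emits the marker text as instructed.

```python
def scenes_with_tag(scene_index, tag):
    """Return the scenes whose description mentions `tag` (case-insensitive)."""
    tag = tag.lower()
    return [s for s in scene_index if tag in s.get("description", "").lower()]


# Made-up scenes in the same shape that video.get_scene_index() returns
sample = [
    {"start": 0.0, "end": 10.0, "description": "A man sits at a desk."},
    {"start": 10.0, "end": 20.0, "description": "A woman jogs in a park. running_detected"},
]

for scene in scenes_with_tag(sample, "running_detected"):
    print(scene["start"], scene["end"])
```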

<br>

### ⚙️ `callback_url`

The URL to notify when the scene indexing job is completed. Check out the callback details [here](https://docs.videodb.io/callback-details-66#_lubHL).
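
For local experiments, a small receiver built on Python's standard library is enough to see a notification arrive. This is a sketch under stated assumptions: it assumes the notification is delivered as an HTTP POST with a JSON body, and it makes no claim about the payload fields, which are described on the callback details page. `CallbackHandler`, `decode_callback_body`, and `serve` are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # callback payloads collected in memory


def decode_callback_body(raw):
    """Parse a callback body as JSON; keep unparseable bodies as raw text."""
    try:
        return json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return {"raw": raw.decode("utf-8", errors="replace")}


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):  # assumption: the notification is sent as a POST request
        length = int(self.headers.get("Content-Length", 0))
        received.append(decode_callback_body(self.rfile.read(length)))
        self.send_response(200)
        self.end_headers()


def serve(port=8000):
    """Block and listen for callbacks on the given port."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

The server must be reachable from the internet for the notification to arrive, so the value passed as `callback_url` would be the public address that forwards to this port.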
<br>

<div style="height:40px;"></div>

## 🗂️ Managing Indexes
---

>  
> 💡 You can create multiple scene indexes for a video and rank the results after a search before presenting them to your user.
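
One simple way to rank results from several scene indexes is to merge the hits and sort them by score. The sketch below works on plain dictionaries rather than SDK objects: the `index`, `start`, `end`, and `score` keys are a hypothetical shape chosen for illustration, and it assumes that scores from different indexes are comparable.

```python
def merge_ranked(hit_lists, top_k=5):
    """Flatten the hits from several indexes and keep the top_k by score."""
    merged = [hit for hits in hit_lists for hit in hits]
    # Highest score first; sorted() is stable, so ties keep their input order.
    return sorted(merged, key=lambda hit: hit["score"], reverse=True)[:top_k]


# Made-up hits from two hypothetical scene indexes
people_hits = [
    {"index": "people", "start": 0.0, "end": 10.0, "score": 0.62},
    {"index": "people", "start": 30.0, "end": 40.0, "score": 0.91},
]
objects_hits = [
    {"index": "objects", "start": 10.0, "end": 20.0, "score": 0.75},
]

for hit in merge_ranked([people_hits, objects_hits], top_k=2):
    print(hit["index"], hit["start"], hit["end"], hit["score"])
```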

**List all scene indexes created for a video:**

`video.list_scene_index()` returns a list of the available scene indexes, each with `scene_index_id`, `name`, and `status`.

In [8]:
scene_indexes = video.list_scene_index()
print(scene_indexes)

[{'name': 'Scene Index 2024-07-22 10:06', 'scene_index_id': 'f4db35c5ce45a709', 'status': 'done'}]
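
Since each entry carries a `status`, you can pick out the indexes that have finished before searching them. This is a minimal sketch based only on the output above: `ready_index_ids` is a hypothetical helper, and `'done'` is the only status value shown in this guide, so other values are simply treated as not ready.

```python
def ready_index_ids(scene_indexes):
    """Return the ids of the scene indexes whose status is 'done'."""
    return [
        entry["scene_index_id"]
        for entry in scene_indexes
        if entry.get("status") == "done"
    ]


# Same shape as the output of video.list_scene_index() shown above
listing = [
    {"name": "Scene Index 2024-07-22 10:06",
     "scene_index_id": "f4db35c5ce45a709",
     "status": "done"},
]

print(ready_index_ids(listing))
# ['f4db35c5ce45a709']
```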


**Get a specific index:**

`video.get_scene_index()` takes a `scene_index_id` and returns the list of indexed scenes, each with `start`, `end`, and `description`.

In [9]:
scene_index = video.get_scene_index(index_id) 
print(scene_index)

[{'description': 'The image depicts a man sitting in an office or conference room, characterized by the presence of glass windows with blinds behind him. He is wearing a dark suit, a white dress shirt, and a dark striped tie. The man appears to be contemplative, with his eyes closed or looking down, and a slight smile on his face. The background shows a bright, well-lit room with natural light filtering through the windows. The atmosphere seems professional and formal, suggesting a workplace or corporate environment. The man’s bald head and expression give an impression of a moment of reflection or contentment.', 'end': 10.01, 'start': 0.0}, {'description': 'The image shows a man with a receding hairline, wearing a dark suit, light blue shirt, and dark striped tie. He appears to be seated, with a neutral or slightly contemplative expression on his face. Behind him, there are large office windows with horizontal blinds partially closed, through which an indistinct office environment is 
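
The returned list is plain Python data, so it is easy to inspect without further API calls. The sketch below shows two hypothetical helpers: `preview` builds a compact line per scene, and `scenes_at` finds the scene covering a given timestamp. The sample scenes are shortened, made-up stand-ins for the real descriptions.

```python
def scenes_at(scene_index, second):
    """Return the scenes whose [start, end) interval contains `second`."""
    return [s for s in scene_index if s["start"] <= second < s["end"]]


def preview(scene_index, width=50):
    """Build one short line per scene: time range plus a clipped description."""
    return [
        f"{s['start']:.2f}-{s['end']:.2f}: {s['description'][:width]}"
        for s in scene_index
    ]


# Made-up sample in the same shape as the scene index
sample = [
    {"description": "A man sitting in an office.", "start": 0.0, "end": 10.01},
    {"description": "A man in a dark suit, seated.", "start": 10.01, "end": 20.02},
]

for line in preview(sample):
    print(line)
print(scenes_at(sample, 12.0))
```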

**Delete an index:**

In [None]:
video.delete_scene_index(index_id)
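
To clean up after experimenting, the two calls above can be combined to remove every scene index on a video. This is a minimal sketch: `delete_all_scene_indexes` is a hypothetical helper, and it relies only on `list_scene_index()` returning entries with a `scene_index_id` key, as in the output shown earlier, and on `delete_scene_index()` accepting that id.

```python
def delete_all_scene_indexes(video):
    """Delete every scene index on `video` and return the deleted ids."""
    deleted = []
    for entry in video.list_scene_index():
        index_id = entry["scene_index_id"]
        video.delete_scene_index(index_id)
        deleted.append(index_id)
    return deleted


# Usage with the video object from this guide (not executed here):
# print(delete_all_scene_indexes(video))
```

Deletion is assumed to be irreversible, so it is worth printing the list first and confirming it before calling the helper.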

## 🧑‍💻 Deep Dive
---


Check out these other resources and tutorials on Scene Indexing:
* If you want to bring your own scene descriptions and annotations, explore the [Custom Annotations Pipeline](https://github.com/video-db/videodb-cookbook/blob/main/guides/scene-index/custom_annotations.ipynb)
* Experiment with extraction algorithms, prompts, and search using the [Playground for Scene Extractions](https://github.com/video-db/videodb-cookbook/blob/main/guides/scene-index/playground_scene_extraction.ipynb)
* Check out our open and flexible [Advanced Visual Search Pipelines](https://github.com/video-db/videodb-cookbook/blob/main/guides/scene-index/advanced_visual_search.ipynb)


If you have any questions or feedback, feel free to reach out to us 🙌🏼

* [Discord](https://discord.gg/py9P639jGz)
* [GitHub](https://github.com/video-db)
* [Website](https://videodb.io)
* [Email](mailto:ashu@videodb.io)