# VideoDB Retriever

**Build RAG pipelines for Videos without ever thinking about the complexity of videos**

Building RAG (Retrieval-Augmented Generation) pipelines for text is fairly straightforward, because text is easy to parse, index, and retrieve. Applying RAG to video content is much more complex. Videos combine visual, auditory, and textual elements, so using them effectively requires more processing power and more sophisticated algorithms. Video files are also large, which makes them resource-intensive to handle.

Enter [VideoDB](https://videodb.io), a database designed to simplify video complexity. It makes building RAG models for videos easier by addressing the challenges of video processing and indexing. Learn more about it at [videodb.io](https://videodb.io).

In this notebook, we introduce `VideoDBRetriever`, a tool designed to streamline the creation of RAG pipelines for video content, without the hassle of dealing with video complexity.
  
  

&nbsp;
## 🛠️️ Setup

###  Requirements

To use this notebook, you'll need API keys for both VideoDB and OpenAI. Follow these steps to set up your environment:

- **VideoDB API Key**: Get your API key from the [VideoDB dashboard](https://console.videodb.io)
- **OpenAI API Key**: Get your API key from the OpenAI platform.

> Set the `OPENAI_API_KEY` and `VIDEO_DB_API_KEY` environment variables to your API keys.
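
As a quick sanity check before running the rest of the notebook, a minimal sketch like the following can report which of the two variables are still unset. The helper `missing_api_keys` and the constant `REQUIRED_KEYS` are illustrative names introduced here, not part of VideoDB or LlamaIndex.

```python
import os

# The two environment variables this notebook expects (names from the note above).
REQUIRED_KEYS = ("OPENAI_API_KEY", "VIDEO_DB_API_KEY")


def missing_api_keys(env=os.environ):
    """Return the names of required keys that are unset or empty in `env`."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]


# To set a key from inside the notebook, assign it before checking, for example:
# os.environ["VIDEO_DB_API_KEY"] = "your-videodb-key"  # placeholder value

missing = missing_api_keys()
if missing:
    print("Set these environment variables before continuing:", ", ".join(missing))
else:
    print("Both API keys are set.")
```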

### Installing Dependencies

To get started, we'll need to install the following packages:

- llama-index
- llama-index-retrievers-videodb
- videodb

In [None]:
%pip install llama-index 

In [None]:
%pip install --pre llama-index-retrievers-videodb videodb

&nbsp;
## 🦙 Simple Video RAG Pipeline

Let's get started by uploading a few video files to [VideoDB](https://videodb.io). Next, we'll use `VideoDBRetriever` to fetch relevant video segments based on our queries.

Afterwards, we'll pass these segments to the LLM as context, so that its responses are grounded in the retrieved video content.

### Data Ingestion

We will upload our videos to VideoDB. 

In [None]:
from videodb import connect

# connect to VideoDB
conn = connect()

# upload videos to default collection in VideoDB 
video1 = conn.upload(url="https://www.youtube.com/watch?v=sAuvP68J6DQ")
video2 = conn.upload(url="https://www.youtube.com/watch?v=H2wBQ6CSPiY")

### Indexing

We will use VideoDB's managed indexes to index our data.

VideoDB offers the following indexes:
- Semantic: an index based on spoken words
- Scene: an index based on scene descriptions and visuals _(Note: This feature is currently available only to beta users)_

In [None]:
video1.index_spoken_words()
video2.index_spoken_words()

### Querying

Next, we'll employ `VideoDBRetriever` to fetch relevant nodes from the VideoDB database. Following that, we'll utilize `llama-index` to construct a straightforward RAG pipeline.

In [None]:
from llama_index.retrievers.videodb import VideoDBRetriever
from llama_index.core import get_response_synthesizer
from llama_index.core.query_engine import RetrieverQueryEngine

In [None]:
# By default, VideoDBRetriever searches the default collection in VideoDB
retriever = VideoDBRetriever()

response_synthesizer = get_response_synthesizer()

query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
)

In [None]:
response = query_engine.query("How did the hostages die")
print(response)

In [None]:
response = query_engine.query("How many hostages were there")
print(response)

**🎉 That's a Simple RAG Pipeline for Video**

&nbsp;
## ✨ More with VideoDB

### Configuring `VideoDBRetriever`
There are some configuration options available to `VideoDBRetriever`   
  

**Retriever for only one Video**:
```python
# pass the link as the `url` keyword argument, as in the ingestion step
video1 = conn.upload(url="https://www.youtube.com/watch?v=sAuvP68J6DQ")
retriever_video1 = VideoDBRetriever(video=video1.id)
```

**Retriever for different types of Index**:
```python
# VideoDBRetriever that uses keyword search
keyword_retriever = VideoDBRetriever(search_type="keyword", video="my_video_id")

# VideoDBRetriever that uses semantic search
semantic_retriever = VideoDBRetriever(search_type="semantic")

# VideoDBRetriever that uses scene search
visual_retriever = VideoDBRetriever(search_type="scene")
```

**Configure Results & Search Threshold**:  
- `result_threshold`: the maximum number of results returned by the retriever; the default value is `5`
- `score_threshold`: only nodes with a score higher than `score_threshold` are returned by the retriever; the default value is `0.2`

```python
custom_retriever = VideoDBRetriever(result_threshold=2, score_threshold=0.5)
```
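
To make the two settings concrete, here is a conceptual sketch in plain Python of what they mean as described above. It is not the retriever's actual implementation (the real filtering happens inside `VideoDBRetriever` and VideoDB's search); the function `apply_thresholds` and the sample scores are made up for illustration.

```python
def apply_thresholds(scored_nodes, result_threshold=5, score_threshold=0.2):
    """Conceptual stand-in for the two settings, not VideoDB's real code.

    Keeps (text, score) pairs whose score is higher than `score_threshold`,
    orders them best first, and caps the list at `result_threshold` items.
    """
    kept = [(text, score) for text, score in scored_nodes if score > score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:result_threshold]


# Made-up candidates with made-up relevance scores
candidates = [("clip A", 0.81), ("clip B", 0.55), ("clip C", 0.30), ("clip D", 0.10)]

# Defaults (5 results, score above 0.2): clip D is dropped
print(apply_thresholds(candidates))

# Stricter settings, matching the custom retriever above: only clips A and B remain
print(apply_thresholds(candidates, result_threshold=2, score_threshold=0.5))
```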

### Viewing Your Nodes

Although the nodes returned by the retriever are of type `TextNode`, they effectively work like a `VideoNode`: you can instantly view each node as a video.
You can also create further nodes from a single node, apply chunking techniques, and so on, without ever worrying about the complexity of dealing with video data.


#### Compilation of all Retrieved Nodes

You can create a compilation of all retrieved nodes using VideoDB.

In [None]:
from videodb import connect, play_stream 
from videodb.timeline import Timeline
from videodb.asset import VideoAsset

conn = connect()
timeline = Timeline(conn)

retriever = VideoDBRetriever()
relevant_nodes = retriever.retrieve("How many hostages were there")

for node_obj in relevant_nodes:
    node = node_obj.node
    # create a video asset for each node
    node_asset = VideoAsset(asset_id=node.metadata["video_id"], start=node.metadata["start"], end=node.metadata["end"])
    # add the asset to timeline
    timeline.add_inline(node_asset)

stream_url = timeline.generate_stream()
play_stream(stream_url)

> You can also get the relevant nodes from the query engine's response:
>```python
>response = query_engine.query("my query")
>relevant_nodes = response.source_nodes
>```

#### View Specific Node

That was a compilation of all retrieved nodes, but you can also view each retrieved node individually.



In [None]:
from videodb import connect, play_stream

retriever = VideoDBRetriever()

relevant_nodes = retriever.retrieve("How many hostages were there")
video_node = relevant_nodes[0].node

conn = connect()
coll = conn.get_collection()

video = coll.get_video(video_node.metadata["video_id"])
start = video_node.metadata["start"]
end = video_node.metadata["end"]

stream_url = video.generate_stream(timeline=[(start, end)])
play_stream(stream_url)
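
When several retrieved nodes come from the same video, it can be handy to collect their time ranges per video first. The sketch below does that in plain Python, assuming only the `video_id`, `start`, and `end` metadata keys used above; the helper `segments_by_video` and the sample values are hypothetical.

```python
from collections import defaultdict


def segments_by_video(nodes_metadata):
    """Group (start, end) pairs by video_id, each list sorted by start time."""
    grouped = defaultdict(list)
    for meta in nodes_metadata:
        grouped[meta["video_id"]].append((meta["start"], meta["end"]))
    return {video_id: sorted(segments) for video_id, segments in grouped.items()}


# Made-up metadata in the same shape as `node.metadata` in the cells above
sample = [
    {"video_id": "m-1", "start": 40.0, "end": 52.5},
    {"video_id": "m-2", "start": 3.0, "end": 9.0},
    {"video_id": "m-1", "start": 10.0, "end": 18.0},
]

print(segments_by_video(sample))
```

With real results, the input would be `(n.node.metadata for n in relevant_nodes)`, and each resulting list could be passed as the `timeline` argument to `video.generate_stream(...)` in the same way as the single `(start, end)` pair above.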

## 🧹 Cleanup

In [None]:
video1.delete()
video2.delete()

## 👨‍👩‍👧‍👦 Support & Community

If you have any questions or feedback, please feel free to reach out to us:

- [Discord](https://discord.gg/py9P639jGz)  
- [Github](https://videodb.io)  
- [VideoDB](https://videodb.io)  
- [Mail](mailto:contact@videodb.io)  