### Overview
This notebook walks you through the process of creating clips with LLM prompts using visual information.

Pick a video, decide your prompt, generate a new clip ⚡️

It's as simple as it sounds.

If you want to go the extra mile, you can add image or audio overlays to these clips.

In [None]:
# But first, let's install the dependencies.
!pip install -r requirements.txt

### Loading The Video

Before proceeding, make sure you have a [VideoDB](https://videodb.io) API key. If you don't have one yet, sign up for API access.

> Get your API key from [VideoDB Console](https://console.videodb.io). (Free for the first 50 uploads, **no credit card required**) 🎉

You can either source a new video from YouTube or select an existing one from your VideoDB collection.
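
The code cell that follows reads its settings from environment variables loaded by `load_dotenv()`. As a minimal sketch, the check below reports which variables are still missing before you connect. It assumes that `videodb.connect()` falls back to the `VIDEO_DB_API_KEY` environment variable when no `api_key` argument is passed; `missing_env_vars` is a hypothetical helper and not part of the VideoDB SDK.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names from `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Assumption: VIDEO_DB_API_KEY is the variable videodb.connect() reads by default.
required = ["VIDEO_DB_API_KEY"]

missing = missing_env_vars(required)
if missing:
    print(f"Add these to your .env file: {', '.join(missing)}")
```

`VISUAL_DEMO_VIDEO_ID` is optional: leave it unset to upload the sample video, or set it to reuse a video that is already in your collection.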


In [None]:
import os
import videodb

from dotenv import load_dotenv


# TODO: setup .env file
load_dotenv()

# Connect to VideoDB
conn = videodb.connect()
coll = conn.get_collection()

# TODO: Add video_id if video already exists in the collection
video_id = os.getenv("VISUAL_DEMO_VIDEO_ID")
video_url = "https://www.youtube.com/watch?v=7Im2I6STbms"

if not video_id:
    video = coll.upload(url=video_url)
else:
    video = coll.get_video(video_id)

print(f"video_id: {video.id}, name: {video.name}")
video.play()

### Indexing The Visual Information

The sample video provided is an episode of Mr. Bean, which relies heavily on visual comedy rather than spoken words. Therefore, visual information is the most effective way to create the clip.

Here, you can either provide the `scene_index_id` of the video, if available, or leave it blank to index the video for visual retrieval.

To learn more about scene indexing, see the [scene index guide](https://docs.videodb.io/scene-index-guide-80).


In [None]:
# Add scene_index_id here if already indexed.
scene_index_id = os.getenv("VISUAL_DEMO_SCENE_INDEX_ID")

if not scene_index_id:
    scene_index_id = video.index_scenes(
        extraction_config={
            "threshold": 20, 
            "frame_count": 3
        },
        prompt="Summarize the essence of the scene in one or two concise sentences without focusing on individual images."
    )
scenes = video.get_scene_index(scene_index_id)
print(f"Video is indexed with scene_index_id {scene_index_id} with {len(scenes)} scenes.")

### Inspecting The Indexed Scenes

In [None]:
for scene in scenes:
    print(f"{scene['start']}-{scene['end']}: {scene['description']}")

### Get Results From Your Prompt

The `scene_prompter` function in `video_prompter.py` takes the indexed scenes, splits them into chunks, and then calls the LLM on the chunks in parallel, passing the user and system prompts, to retrieve the matching scenes.

To create a clip from a video with `scene_prompter`, craft a specific prompt that helps identify the most relevant segments for your use case. The prompt should highlight the themes, activities, or specific visual cues you're interested in.

The `scene_prompter` will return sentences or segments from the video that match your prompt. Review these to ensure they align with your needs. You can then use these segments to create your clip, focusing on the content that's most relevant to your use case.
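
To make the chunk-and-fan-out idea concrete, here is a minimal sketch of that pattern. It is not the code in `video_prompter.py`: `chunk_scenes`, `prompt_chunks_in_parallel`, and the stand-in `fake_llm` are hypothetical names, and a real implementation would call an LLM where `fake_llm` does a plain substring match.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_scenes(scenes, chunk_size):
    """Split the scene list into consecutive chunks of at most `chunk_size`."""
    return [scenes[i:i + chunk_size] for i in range(0, len(scenes), chunk_size)]


def fake_llm(chunk, user_prompt):
    """Stand-in for an LLM call: keep descriptions containing the prompt text."""
    needle = user_prompt.lower()
    return [s["description"] for s in chunk if needle in s["description"].lower()]


def prompt_chunks_in_parallel(scenes, user_prompt, llm=fake_llm,
                              chunk_size=2, max_workers=4):
    """Send each chunk to `llm` concurrently and flatten results in chunk order."""
    chunks = chunk_scenes(scenes, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so matches stay in video order.
        per_chunk = list(pool.map(lambda c: llm(c, user_prompt), chunks))
    return [match for matches in per_chunk for match in matches]


# Hypothetical scenes in the same shape the notebook prints (start, end, description).
demo_scenes = [
    {"start": 0, "end": 5, "description": "A man parks a small car."},
    {"start": 5, "end": 9, "description": "He sits an exam and peeks at a neighbour."},
    {"start": 9, "end": 14, "description": "The car rolls away on its own."},
]
matches = prompt_chunks_in_parallel(demo_scenes, "car")
print(matches)
```

Threads suit this kind of work because each LLM request spends most of its time waiting on the network, so the chunks can be processed concurrently without extra processes.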

In [None]:
from video_prompter import scene_prompter

# This prompt finds the iconic Mr. Bean scene where he copies during an examination
user_prompt = "find the moment where mr.bean is attempting to cheat by peeking over at the answer sheet of man beside him, just find this singular moment with high accuracy."
result = scene_prompter(scenes, user_prompt)

### Generate The Clip
To generate a clip, first call `get_result_timestamps` from `video_prompter.py`, which uses VideoDB's keyword search feature. The LLM (Large Language Model) has already identified the relevant scene descriptions, so the search results can now be used to create a programmable video stream. Here's how the process works:

The matching descriptions are stored in the `result` variable. They are passed as keywords to VideoDB's keyword search feature, which searches through the indexed scenes of your video to find matches.

The search returns a `SearchResult` object, which contains detailed information about the found segments, including their timestamps and the text of the scene description.

**Create a Programmable Video Stream with Timeline**: With the specific segments identified, you can now use `get_clip_timeline` from `video_prompter.py` to get the Timeline to create a new programmable video stream. The Timeline tool allows you to stitch together video segments based on the timestamps provided in the SearchResult object. You can arrange, cut, or combine these segments to craft a fresh video stream that focuses on your topic of interest.
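
As an illustration of how a list of matched timestamps can turn into an ordered clip, the sketch below keeps the first `top_n` `(start, end)` pairs and sums their lengths. This is an assumption about the general shape of the step, not the actual logic of `get_clip_timeline`; `select_segments` is a hypothetical helper, and in the notebook each kept pair would be added to the `Timeline` as a video segment.

```python
def select_segments(timestamps, top_n=None):
    """Keep the first `top_n` (start, end) pairs and return them with total duration.

    Pairs with a non-positive length are skipped; `top_n=None` keeps everything.
    """
    valid = [(start, end) for start, end in timestamps if end > start]
    kept = valid if top_n is None else valid[:top_n]
    duration = sum(end - start for start, end in kept)
    return kept, duration


# Hypothetical timestamps in seconds; the middle pair has zero length and is dropped.
demo_timestamps = [(12.0, 18.5), (40.0, 40.0), (95.0, 101.0)]
print(select_segments(demo_timestamps, top_n=1))  # single best moment
print(select_segments(demo_timestamps))           # full compilation
```

Returning the total duration alongside the segments is convenient because overlays added later need to know how long the stitched clip runs.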

In [None]:
from videodb import play_stream
from videodb.timeline import Timeline
from video_prompter import get_result_timestamps, get_clip_timeline

timeline = Timeline(conn)

result_timestamps = get_result_timestamps(video, result, scene_index_id=scene_index_id)
# Since our requirement is to get only one scene for the meme where Mr. Bean is copying (as specified in the prompt), we set top_n=1.
# If you want to retrieve all scenes where copying occurs, you can remove top_n.
timeline, duration = get_clip_timeline(video, result_timestamps, timeline, top_n=1) # get timeline of 1st result
stream = timeline.generate_stream()
play_stream(stream)

### Stream With Text Overlay

You can add custom text to the meme for further personalization.

For more customization options, refer to the [TextAsset Styling Guide](https://docs.videodb.io/guide-textasset-75).

In [None]:
from videodb.asset import TextStyle
from videodb.timeline import TextAsset

timeline = Timeline(conn)
timeline, duration = get_clip_timeline(video, result_timestamps, timeline, top_n=1)
left = TextAsset(
    text="XXXX",
    duration=duration,
    style=TextStyle(
        x=190,
        y=15,
        font="Inter",
        fontsize=25,
        fontcolor="#002869",
    )
)
right = TextAsset(
    text="YYYY",
    duration=duration,
    style=TextStyle(
        x=420,
        y=15,
        font="Inter",
        fontsize=25,
        fontcolor="#00692c",
    )
)
timeline.add_overlay(0, left)
timeline.add_overlay(0, right)
stream = timeline.generate_stream()
play_stream(stream)

### Stream With Image Overlay 

You can also add image overlays if needed. For more details, refer to the [Dynamic Video Stream Guide](https://docs.videodb.io/dynamic-video-stream-guide-44).

In [None]:
from videodb.asset import ImageAsset
from videodb import MediaType

timeline = Timeline(conn)
timeline, duration = get_clip_timeline(video, result_timestamps, timeline, top_n=1)

image1_id = os.getenv("VISUAL_DEMO_IMAGE_1")
if not image1_id:
    image1_url = "https://upload.wikimedia.org/wikipedia/sco/thumb/d/d1/Ferrari-Logo.svg/344px-Ferrari-Logo.svg.png"
    image1 = coll.upload(url=image1_url, media_type=MediaType.image)
    image1_id = image1.id
    print(f"image1_id: {image1_id}")

image2_id = os.getenv("VISUAL_DEMO_IMAGE_2")
if not image2_id:
    image2_url = "https://upload.wikimedia.org/wikipedia/en/thumb/6/66/McLaren_Racing_logo.svg/512px-McLaren_Racing_logo.svg.png"
    image2 = coll.upload(url=image2_url, media_type=MediaType.image)
    image2_id = image2.id
    print(f"image2_id: {image2_id}")

left = ImageAsset(
    asset_id=image1_id,
    duration=duration,
    width=70,
    height=124,
    x=150,
    y=200,
)
right = ImageAsset(
    asset_id=image2_id,
    duration=duration,
    width=128,
    height=40,
    x=400,
    y=240
)

timeline.add_overlay(0, left)
timeline.add_overlay(0, right)
stream = timeline.generate_stream()
play_stream(stream)

### Another Prompt With Simple Compilation

In [None]:
user_prompt = "find all the car gags with high accuracy"
result = scene_prompter(scenes, user_prompt)

In [None]:
timeline = Timeline(conn)
result_timestamps = get_result_timestamps(video, result, scene_index_id=scene_index_id)
timeline, duration = get_clip_timeline(video, result_timestamps, timeline)
stream = timeline.generate_stream()
play_stream(stream)

If you have any questions or feedback, feel free to reach out to us 🙌🏼

* [Discord](https://discord.gg/py9P639jGz)
* [GitHub](https://github.com/video-db)
* [Website](https://videodb.io)
* [Email](mailto:ashu@videodb.io)