### Overview
This notebook walks you through the process of creating clips with LLM prompts.

Pick a video, decide your prompt, generate a new clip ⚡️

It's as simple as it sounds.

If you want to go the extra mile, you can add image or audio overlays to these clips.

In [1]:
# But first, let's install the dependencies.
!pip install -r requirements.txt




In [60]:
# Here we do all the required imports
import videodb

from dotenv import load_dotenv

from videodb import SearchType, MediaType, play_stream
from videodb.timeline import Timeline, VideoAsset, TextAsset, ImageAsset, AudioAsset
from videodb.asset import TextStyle

from video_prompter import scene_prompter

### Choose the video

You can either use a fresh video from YouTube or another source, or choose an existing one already uploaded to your VideoDB collection.

In [79]:
# TODO: set up a .env file
load_dotenv()

# Connect to VideoDB
conn = videodb.connect()
coll = conn.get_collection()

# TODO: Add video_id if video already exists in the collection
video_id = ""
video_url = "https://www.youtube.com/watch?v=7Im2I6STbms"

if not video_id:
    video = coll.upload(url=video_url)
else:
    video = coll.get_video(video_id)

print(f"video_id: {video.id}, name: {video.name}")

video_id: m-59f3bc33-8895-4116-98bc-33af6a8ad2cf, name: Mr. Bean | Episode 1 | Mr. Bean Official
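
The `.env` step in the cell above is easy to miss, and a missing key only surfaces as a connection error. As a minimal sketch, the check below lists required variables that are unset before `videodb.connect()` is called. It assumes the VideoDB SDK reads its key from `VIDEO_DB_API_KEY`; the helper `missing_env_vars` is hypothetical, and any LLM provider key that `video_prompter` needs would have to be added to the list.

```python
import os


def missing_env_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Assumption: VIDEO_DB_API_KEY is the variable videodb.connect() looks for.
REQUIRED = ["VIDEO_DB_API_KEY"]

missing = missing_env_vars(REQUIRED)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("All required environment variables are set.")
```

Run this after `load_dotenv()` so that values from the `.env` file are already in the environment.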


### Index Scenes

Add the video's `scene_index_id` below if one is available, or leave it blank to index the video for visual retrieval.

In [92]:
# Add scene_index_id here if already indexed.
scene_index_id = ""

if not scene_index_id:
    scene_index_id = video.index_scenes(
        extraction_config={
            "threshold": 20, 
            "frame_count": 3
        },
        prompt="Summarize the essence of the scene in one or two concise sentences without focusing on individual images."
    )
scenes = video.get_scene_index(scene_index_id)
print(f"video is indexed with scene_index_id {scene_index_id} with {len(scenes)} scenes.")

video is indexed with scene_index_id ccbcd0a4a4a58a1f with 137 scenes.


### Run your Prompt
To create a clip using the `scene_prompter` function from a video, it's crucial to craft a specific prompt that will help identify the most relevant segments for your use case. This prompt should highlight the themes, topics, or specific phrases you're interested in. The function then analyzes the video's visual content to find segments that match your criteria.

Before you can use `scene_prompter`, make sure the video's scenes are indexed with the `video.index_scenes()` function. This prepares the video for analysis by making its visual content searchable.

The `scene_prompter` function returns the scene descriptions from the video that match your prompt. Review these to ensure they align with your needs. You can then use these segments to create your clip, focusing on the content that's most relevant to your use case.

In [52]:
user_prompt = "find the moment where mr.bean is attempting to cheat by peeking over at the answer sheet of man beside him, just find this singular moment with high accuracy."
result = scene_prompter(scenes, user_prompt)
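
Before building a clip, it can help to skim what the prompt matched. The sketch below assumes `scene_prompter` returns a list of scene-description strings (the later cells iterate over `result` that way). The helper `preview_matches` and the sample data are hypothetical, included only so the snippet runs on its own.

```python
def preview_matches(descriptions, limit=5, width=80):
    """Return up to `limit` numbered descriptions, each cut to `width` characters."""
    lines = []
    for i, text in enumerate(descriptions[:limit], start=1):
        text = " ".join(str(text).split())  # collapse newlines and repeated spaces
        if len(text) > width:
            text = text[: width - 3] + "..."
        lines.append(f"{i}. {text}")
    return lines


# Hypothetical sample; in the notebook, pass `result` instead.
sample_result = [
    "A man leans sideways during an exam\nto look at his neighbour's paper.",
    "A man sits at a desk in an exam hall.",
]
for line in preview_matches(sample_result):
    print(line)
```

If the list is empty or the descriptions look unrelated, rewording the prompt is usually cheaper than debugging the search step.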

### Generate the Clip
To generate a clip, we'll use VideoDB's keyword search. The LLM (Large Language Model) has already identified the relevant scene descriptions, so we'll use the search results to create a programmable video stream. Here's the approach:

The matched descriptions are in the `result` variable. Each one is passed to VideoDB's keyword search, which looks through the video's indexed scenes to find matches.

The search will return a SearchResult object, which contains detailed information about the found segments, including their timestamps, the text of the scene description, and possibly other metadata.

**Create a Programmable Video Stream with Timeline**: With the specific segments identified, you can now use the Timeline to create a new programmable video stream. The Timeline tool allows you to stitch together video segments based on the timestamps provided in the SearchResult object. You can arrange, cut, or combine these segments to craft a fresh video stream that focuses on your topic of interest.

In [45]:
def get_result_timestamps(result, sort="time"):
    result_timestamps = []

    for description in result:
        # keyword search on each result description
        search_res = video.search(description, index_type=SearchType.scene, search_type=SearchType.keyword, scene_index_id=scene_index_id)
        matched_segments = search_res.get_shots()
        
        # no exact match found.
        if len(matched_segments) == 0:
            continue

        # first matched video shot
        video_shot = matched_segments[0]

        # storing the timestamps and description
        result_timestamps.append((video_shot.start, video_shot.end, video_shot.text))

    # sorting the result by time
    if sort and sort == "time":
        result_timestamps = sorted(set(result_timestamps), key=lambda x:x[0])
    return result_timestamps

# Creating and returning timeline of given result timestamps
def get_clip_timeline(result_timestamps, top_n=None, debug=False):
    timeline = Timeline(conn)
    duration = 0
    if top_n:
        existing_count = len(result_timestamps)
        result_timestamps = result_timestamps[:top_n]
        if debug:
            print(f"Picked top {top_n} from {existing_count}")
    for result_timestamp in result_timestamps:
        start = float(result_timestamp[0])
        end = float(result_timestamp[1])
        description = result_timestamp[2]
        if debug:
            print(start, end, description)
        duration += (end-start)
        timeline.add_inline(VideoAsset(asset_id=video.id, start=start, end=end))
    return timeline
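
When several descriptions match neighbouring or overlapping scenes, adding each one as its own `VideoAsset` can repeat footage or cut where no cut is needed. The sketch below is one possible pre-processing step, not part of VideoDB: a hypothetical `merge_segments` helper that joins `(start, end, text)` tuples whose time ranges overlap or lie within `gap` seconds of each other. It uses plain Python only.

```python
def merge_segments(segments, gap=0.0):
    """Merge (start, end, text) tuples that overlap or are within `gap` seconds."""
    merged = []
    for start, end, text in sorted(segments, key=lambda s: (float(s[0]), float(s[1]))):
        start, end = float(start), float(end)
        if merged and start <= merged[-1][1] + gap:
            # Extend the previous segment and keep both descriptions.
            prev_start, prev_end, prev_text = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), f"{prev_text} | {text}")
        else:
            merged.append((start, end, text))
    return merged


# Hypothetical timestamps in seconds.
segments = [(10, 20, "a"), (15, 25, "b"), (40, 50, "c")]
print(merge_segments(segments))
```

The output has the same tuple shape as `result_timestamps`, so it could be passed to `get_clip_timeline` unchanged.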

### Simple stream of result 

In [None]:
result_timestamps = get_result_timestamps(result)
timeline = get_clip_timeline(result_timestamps, top_n=1) # get timeline of 1st result
stream = timeline.generate_stream()
play_stream(stream)


### Stream with text overlay

In [None]:
timeline = get_clip_timeline(result_timestamps, top_n=1)

# Overlay duration: total length of the segments placed on the timeline above
duration = sum(float(end) - float(start) for start, end, _ in result_timestamps[:1])

left = TextAsset(
    text="XXXX",
    duration=duration,
    style=TextStyle(
        x=180,
        y=15,
        font="Inter",
        fontsize=25,
        fontcolor="#002869",
    )
)
right = TextAsset(
    text="YYYY",
    duration=duration,
    style=TextStyle(
        x=390,
        y=15,
        font="Inter",
        fontsize=25,
        fontcolor="#00692c",
    )
)
timeline.add_overlay(0, left)
timeline.add_overlay(0, right)
stream = timeline.generate_stream()
play_stream(stream)

### Stream with image overlay

In [None]:
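timeline = get_clip_timeline(result_timestamps, top_n=1)

# Overlay duration: total length of the segments placed on the timeline above
duration = sum(float(end) - float(start) for start, end, _ in result_timestamps[:1])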

image1_id = ""
if not image1_id:
    image1_url = "https://upload.wikimedia.org/wikipedia/sco/thumb/d/d1/Ferrari-Logo.svg/344px-Ferrari-Logo.svg.png"
    image1 = coll.upload(url=image1_url, media_type=MediaType.image)
    image1_id = image1.id
    print(f"image1_id: {image1_id}")

image2_id = "" 
if not image2_id:
    image2_url = "https://upload.wikimedia.org/wikipedia/en/thumb/6/66/McLaren_Racing_logo.svg/512px-McLaren_Racing_logo.svg.png"
    image2 = coll.upload(url=image2_url, media_type=MediaType.image)
    image2_id = image2.id
    print(f"image2_id: {image2_id}")

left = ImageAsset(
    asset_id=image1_id,
    duration=duration,
    width=128,
    height=40,
    x=180,
    y=180,
)
right = ImageAsset(
    asset_id=image2_id,
    duration=duration,
    width=86,
    height=140,
    x=390,
    y=200
)
timeline.add_overlay(0, left)
timeline.add_overlay(0, right)
stream = timeline.generate_stream()
play_stream(stream)

### Another prompt with simple compilation

In [None]:
user_prompt = "find all the car gags with high accuracy"
result = scene_prompter(scenes, user_prompt)
result_timestamps = get_result_timestamps(result)
timeline = get_clip_timeline(result_timestamps)
stream = timeline.generate_stream()
play_stream(stream)
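
A broad prompt like the one above can match many scenes, which makes for a long compilation. As a rough sketch, the hypothetical helper below keeps segments in order until a total duration budget would be exceeded. It works on the same `(start, end, text)` tuples that `get_result_timestamps` produces and is plain Python, not a VideoDB feature.

```python
def limit_total_duration(segments, max_seconds):
    """Keep leading segments while their combined length stays within `max_seconds`."""
    selected, total = [], 0.0
    for start, end, text in segments:
        length = float(end) - float(start)
        if total + length > max_seconds:
            break  # the next segment would exceed the budget
        selected.append((start, end, text))
        total += length
    return selected, total


# Hypothetical timestamps in seconds.
segments = [(0, 10, "a"), (20, 35, "b"), (50, 60, "c")]
selected, total = limit_total_duration(segments, max_seconds=30)
print(selected, total)
```

Passing `selected` to `get_clip_timeline` instead of the full list would cap the stream length, and `total` can serve as the overlay duration.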