### Overview
This notebook walks you through creating clips with LLM prompts, using the spoken content of a video.

Pick a video, decide your prompt, generate a new clip ⚡️

It's as simple as it sounds.

If you want to go the extra mile, you can score and rank your results, or add image and audio overlays to the clips.

In [None]:
# But first, let's install the dependencies.
!pip install -r requirements.txt

### Choose the Video

You can either upload a fresh video from YouTube or another source, or choose an existing one already uploaded to your VideoDB collection.

In [None]:
import os
import videodb

from dotenv import load_dotenv


# TODO: setup .env file
load_dotenv()

# Connect to VideoDB
conn = videodb.connect()
coll = conn.get_collection()

# NOTE: Please set video_id or SPOKEN_DEMO_VIDEO_ID in .env if video already exists in the collection
video_id = os.getenv("SPOKEN_DEMO_VIDEO_ID")
video_url = "https://www.youtube.com/watch?v=HpUR7-Oe1ss"

if not video_id:
    video = coll.upload(url=video_url)
else:
    video = coll.get_video(video_id)

print(f"video_id: {video.id}, name: {video.name}")
video.play()

### Indexing Spoken Information

In [2]:
try:
    transcript_text = video.get_transcript_text()
except Exception:
    video.index_spoken_words()
    transcript_text = video.get_transcript_text()

### Run your Prompt

To create a clip using the `text_prompter` function from a video, it's crucial to craft a specific prompt that will help identify the most relevant segments for your use case. This prompt should highlight the themes, topics, or specific phrases you're interested in. The function then analyzes the video's spoken content to find segments that match your criteria. 

Before you can use `text_prompter`, make sure the video's spoken content is indexed with the `video.index_spoken_words()` function. This prepares the video for analysis by making its spoken content searchable.

The `text_prompter` will return sentences or segments from the video that match your prompt. Review these to ensure they align with your needs. You can then use these segments to create your clip, focusing on the content that's most relevant to your use case.
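
One way to review the returned sentences is to check that each one really occurs in the transcript, since a paraphrased sentence is unlikely to match an exact keyword search later. The sketch below is illustrative only: `normalize` and `keep_verbatim` are hypothetical helpers (not part of `video_prompter` or VideoDB), and the sample strings stand in for `transcript_text` and the `text_prompter` output.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so small spacing differences don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def keep_verbatim(sentences, transcript_text):
    """Return only the sentences that occur verbatim (after normalization) in the transcript."""
    haystack = normalize(transcript_text)
    return [s for s in sentences if normalize(s) and normalize(s) in haystack]


# Toy data standing in for `transcript_text` and the `text_prompter` output.
sample_transcript = "We shook hands.  The deal was signed on Monday. Then we had lunch."
sample_result = ["The deal was signed on Monday.", "They negotiated a merger."]

kept = keep_verbatim(sample_result, sample_transcript)
print(kept)
```

In the notebook you could call `keep_verbatim(result, transcript_text)` after the prompt cell to see which sentences were quoted exactly.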

In [None]:
from video_prompter import text_prompter

# Choose a prompt to create the clip.
user_prompt = "find sentences where a deal is discussed"
result = text_prompter(transcript_text, user_prompt)
print(f"Found {len(result)} segments in the video.")

### Generate the Clip

To generate a clip, we'll use **VideoDB**'s `keyword search` feature. We have already used the LLM (Large Language Model) to identify relevant sentences; now we'll use the search results to create a `programmable video stream`. Here's how the process works:

The matching sentences are in the `result` variable. Pass them to VideoDB's keyword search, which looks through the indexed spoken content of your video to find matches.

The search will return a SearchResult object, which contains detailed information about the found segments, including their timestamps, the text of the spoken content, and possibly other metadata.

**Create a Programmable Video Stream with Timeline**: With the specific segments identified, you can now use the Timeline to create a new programmable video stream. The Timeline tool allows you to stitch together video segments based on the timestamps provided in the SearchResult object. You can arrange, cut, or combine these segments to craft a fresh video stream that focuses on your topic of interest.
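
When several matched sentences sit close together, their timestamps can overlap, and stitching them as-is would repeat footage. A minimal sketch of merging such ranges before building a timeline is shown below. It assumes the timestamps are available as plain `(start, end)` pairs in seconds; `merge_ranges` is a hypothetical helper, and the actual shape returned by `get_result_timestamps` may differ.

```python
def merge_ranges(ranges, gap=0.0):
    """Merge (start, end) ranges that overlap or lie within `gap` seconds of each other."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + gap:
            # Extend the previous range instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Toy timestamps (seconds); the first two sorted ranges overlap, the last two touch.
sample = [(12.0, 15.5), (3.0, 7.0), (6.5, 9.0), (15.5, 18.0)]
merged = merge_ranges(sample)
print(merged)
```

Each merged pair could then become one `VideoAsset(asset_id=video.id, start=..., end=...)` added with `timeline.add_inline`, as the ranking section of this notebook does for search shots.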

In [None]:
from videodb import play_stream
from videodb.timeline import Timeline
from video_prompter import get_result_timestamps, build_video_timeline

timeline = Timeline(conn)
result_timestamps = get_result_timestamps(video, result, index_type="spoken_word")
timeline, duration  = build_video_timeline(video, result_timestamps, timeline)
stream = timeline.generate_stream()
print(stream)
play_stream(stream)

### Modify the Timeline

The programmable stream part of VideoDB lets you not only watch the original clip but also modify and personalize the stream. Here we overlay a logo on the stream. You can read more about it here: https://docs.videodb.io/version-0-0-3-timeline-and-assets-44
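
The cells below place the logo once, at offset 0. If you would rather show it at the start of every clip, you need the offset at which each clip begins on the timeline. The sketch below computes those offsets from a list of clip durations; `clip_start_offsets` is a hypothetical helper, and the durations are made-up values standing in for `end - start` of each stitched segment.

```python
from itertools import accumulate


def clip_start_offsets(durations):
    """Return the timeline offset (in seconds) at which each clip begins."""
    if not durations:
        return []
    # Running totals give each clip's end; shifting by one gives each clip's start.
    ends = list(accumulate(durations))
    return [0.0] + ends[:-1]


# Toy durations (seconds) for three stitched clips.
offsets = clip_start_offsets([4.0, 2.5, 6.0])
print(offsets)

# In the notebook, each offset could be used like this (not run here):
# for offset in offsets:
#     timeline.add_overlay(offset, image_asset)
```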

In [None]:
# upload Image
from videodb import MediaType

image_id = os.getenv("SPOKEN_DEMO_IMAGE_ID")
if not image_id:
    image = conn.upload(url="https://www.freepnglogos.com/uploads/logo-ig-png/logo-ig-instagram-new-logo-vector-download-13.png", media_type=MediaType.image)
    image_id = image.id

In [11]:
from videodb.asset import VideoAsset, AudioAsset, ImageAsset

image_asset = ImageAsset(
    asset_id=image_id,
    width=40,
    height=40,
    x=20,
    y=10,
    duration=7
)

In [None]:
timeline.add_overlay(0, image_asset)
stream = timeline.generate_stream()
play_stream(stream)

### Bonus: Ranking using LLM
If you want to keep only the top few results and are wondering how to do it, have the LLM rank your results and produce a score that you can use to decide the order of segments. You can modify the ranking prompt creatively to steer the outcome. We would love to see what you create 🙌🏼
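
LLM replies do not always contain clean JSON: the score may be wrapped in extra text, be missing, or fall outside the 0 to 10 range. The sketch below shows one tolerant way to pull a score out of a raw reply string. `parse_score` is a hypothetical helper, not part of `llm_agent`, and the fallback of `0.0` and the clamping are assumptions chosen so that sorting always has a number to compare.

```python
import json
import re


def parse_score(raw: str, default: float = 0.0) -> float:
    """Extract a 0-10 relevance score from an LLM reply, falling back to `default`."""
    # Grab the outermost {...} span, ignoring any text around it.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if not match:
        return default
    try:
        score = float(json.loads(match.group(0)).get("score", default))
    except (ValueError, TypeError, AttributeError):
        # Invalid JSON, a null or non-numeric score, or a non-object payload.
        return default
    # Clamp to the range the ranking prompt asks for.
    return min(10.0, max(0.0, score))


print(parse_score('Sure! {"score": 8}'))
print(parse_score("I cannot rate this."))
```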

In [14]:
from llm_agent import LLM
import re
import json
from math import floor

def ranking_prompt_llm(text, prompt):
    ranking_prompt = """Given the text provided below and a specific User Prompt, evaluate the relevance of the text
    in relation to the user's prompt. Please assign a relevance score ranging from 0 to 10, where 0 indicates no relevance 
    and 10 signifies perfect alignment with the user's request.
    The score quality also increases when the text is a complete sentence, making it perfect for a video clip result"""

    # pass the data
    ranking_prompt += f"""
    text: {text}
    User Prompt: {prompt}
    """

    # Add instructions to always return JSON at the end of processing.
    ranking_prompt += """
    Ensure the final output strictly adheres to the JSON format specified, without including additional text or explanations. 
    Use the following structure for your response:
    {
      "score": <relevance score>
    }
    """
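    try:
        response = LLM().chat(message=ranking_prompt)
        print(response)
        output = response["choices"][0]["message"]["content"]
        res = json.loads(output)
        # Fall back to 0 when the score is missing or null, so sorting never compares None
        return res.get("score") or 0
    except Exception:
        # Any LLM or JSON parsing failure is treated as "not relevant"
        return 0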

def rank_results(res, prompt, score_percentage=0.40):
    """
    Score each result with the LLM and return the top `score_percentage`
    fraction of results, highest score first.
    """
    res_score = []
    for text in res:
        res_score.append((text, ranking_prompt_llm(text,prompt)))
    
    res_score_sorted = sorted(res_score, key=lambda x: x[1], reverse=True)
    return res_score_sorted[0: floor(len(res_score_sorted)*score_percentage)]

In [None]:
ranked_results = rank_results(result, user_prompt)

In [None]:
ranked_results

In [17]:
# Search for matching video segments and watch the resulting stream
from videodb import SearchType
from videodb.timeline import Timeline, VideoAsset, AudioAsset

timeline = Timeline(conn)
for sentences, score in ranked_results:
    search_res = video.search(sentences, search_type=SearchType.keyword)
    matched_segments = search_res.get_shots()
    
    # No exact match found
    if len(matched_segments) == 0:
        continue

    # Get the first matched video segment
    video_shot = matched_segments[0]

    # Create a new Video Asset and add it to a timeline.
    timeline.add_inline(VideoAsset(asset_id=video.id, start=video_shot.start, end=video_shot.end))

In [None]:
stream = timeline.generate_stream()
play_stream(stream)

### Add some sound effects to it 🎶

That's not all: we can jazz it up with audio overlays and create another stream.
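
The loop further down places the sound effect 2 seconds before the end of each clip and skips clips shorter than that. The offset arithmetic can be sketched on its own as below; `tail_overlay_offsets` is a hypothetical helper, and the durations are made-up values used only for illustration.

```python
def tail_overlay_offsets(durations, tail=2.0):
    """Offsets placing an overlay `tail` seconds before the end of each long-enough clip."""
    offsets = []
    elapsed = 0.0
    for dur in durations:
        # The clip is always added to the timeline, so time advances either way.
        elapsed += dur
        if dur < tail:
            continue  # clip too short to hold the overlay
        offsets.append(elapsed - tail)
    return offsets


# Toy durations (seconds); the 1.5 s clip is too short and gets no overlay.
tail_offsets = tail_overlay_offsets([5.0, 1.5, 4.0])
print(tail_offsets)
```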

In [None]:
# Upload an audio track to overlay; this could be a laughter soundtrack
audio_id = os.getenv("SPOKEN_DEMO_AUDIO_ID")
if not audio_id:
    audio = conn.upload(url="https://www.youtube.com/watch?v=q3VVxbJa61Q", media_type=MediaType.audio)
    audio_id = audio.id
    print(f"Uploaded audio with id {audio_id}")

In [21]:
# 1 sec background audio 
background = AudioAsset(asset_id=audio_id, start=3, end=4, disable_other_tracks=True)

In [None]:
timeline = Timeline(conn)
dur_so_far = 0
for clip_sentences, score in ranked_results:
    try:
        search_res = video.search(clip_sentences, search_type=SearchType.keyword)
        matched_segments = search_res.get_shots()
        
        # No exact match found
        if len(matched_segments) == 0:
            continue
    
        # Get the first matched video segment
        video_shot = matched_segments[0]
    
        # Create a new Video Asset and add it to a timeline.
        timeline.add_inline(VideoAsset(asset_id=video.id, start=video_shot.start, end=video_shot.end))
        chunk_dur = (video_shot.end - video_shot.start)
        dur_so_far += chunk_dur
        if chunk_dur < 2:
            print("Skipping the audio overlay since the chunk is shorter than the 2 sec overlay window.")
            continue
        timeline.add_overlay(dur_so_far-2, background)
    except Exception as e:
        print(f"Error: skipping the segment {str(e)}")

In [None]:
# The music overlay starts 2 sec before the end of each clip in the supercut.
stream = timeline.generate_stream()
play_stream(stream)

If you have any questions or feedback, feel free to reach out to us 🙌🏼

* [Discord](https://discord.gg/py9P639jGz)
* [GitHub](https://github.com/video-db)
* [Website](https://videodb.io)
* [Email](mailto:ashu@videodb.io)

### Playground For You To Experiment

In [None]:
video_url = ""
video_id = ""

if not video_id:
    video = coll.upload(url=video_url)
else:
    video = coll.get_video(video_id)

print(f"video_id: {video.id}, name: {video.name}")

In [None]:
try:
    transcript = video.get_transcript()
    transcript_text = video.get_transcript_text()
except Exception:
    video.index_spoken_words()
    transcript = video.get_transcript()
    transcript_text = video.get_transcript_text()

In [None]:
from video_prompter import text_prompter

user_prompt = ""
result = text_prompter(transcript_text, user_prompt)
print(f"Found {len(result)} segments in the video.")

In [None]:
from videodb import play_stream
from videodb.timeline import Timeline
from video_prompter import get_result_timestamps, build_video_timeline

timeline = Timeline(conn)
result_timestamps = get_result_timestamps(video, result, index_type="spoken_word")
timeline, duration  = build_video_timeline(video, result_timestamps, timeline)
stream = timeline.generate_stream()
print(stream)
play_stream(stream)