<a href="https://colab.research.google.com/github/video-db/videodb-cookbook/blob/nb/censor/examples/Scene_index_toolkit.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup

In [None]:
!pip install videodb openai Pillow

In [50]:
import os

os.environ["OPENAI_API_KEY"] = ""
os.environ["VIDEO_DB_API_KEY"] = ""

## Upload Your Video

In [37]:
from videodb import connect 
conn = connect()

#### **🚨 ATTENTION: avoid re-uploading the video; reuse an already uploaded video via its video_id**

In [38]:
coll = conn.get_collection()

# Find an already uploaded video whose name contains the query
video = None
query = "Arya"
for v in coll.get_videos():
    if query in v.name:
        video = v

# Or find it by ID
# video = coll.get_video("SOMEID")

In [None]:
coll = conn.get_collection()

video = conn.upload(url="https://www.youtube.com/watch?v=6xOiVAW54CQ")

In [None]:
video.index_scenes()

In [39]:
scenes = video.get_scenes()
print(f"{scenes[0]['start']} - {scenes[0]['end']}")
print(scenes[0]["response"])

0 - 6.439766666666666
The image features a close-up of a man with a rugged appearance, set against a dark, indistinct background. His expression is complex, with furrowed brows and eyes that gaze directly ahead, conveying intensity or concern. The lighting accentuates his features, highlighting creases on his forehead and subtle stubble across his face. Shadows envelop the periphery of the image, focusing attention on the central figure's visage. The color palette is warm yet muted, dominated by browns and ambers that suggest an intimate or somber setting. There is an air of raw emotion, suggesting the subject is in a moment of contemplation or inner turmoil.


In [40]:
print("Scenes detected from video:", len(scenes))

Scenes detected from video: 56


## Analyzing Scene Index Prompt

- `overlay_text_image_width` : Width of the overlay image that displays the scene description
- `overlay_text_image_height` : Height of the overlay image that displays the scene description
- `overlay_text_image_size` : Font size of the text in the overlay image
- `images_dir` : The directory where the generated images are saved


#### **🚨 ATTENTION: the string returned by the `get_image_name` function determines the names of your overlay images**
Pick a distinct name for each run so that the images can be found again in VideoDB later.

In [42]:
overlay_text_image_width = 800
overlay_text_image_height = 600
overlay_text_image_size = 20
text_color = (0, 0, 0, 255)
images_dir = "images"


def get_image_path(index):
    return f"{images_dir}/{get_image_name(index)}.png"

def get_image_name(index):
    return f"myimage-2-{index}"

In [44]:
from PIL import Image, ImageDraw, ImageFont
import math


def create_image_with_text(text, file_path):
    padding = 50

    width, height = overlay_text_image_width, overlay_text_image_height
    image = Image.new("RGBA", (width, height), (255, 255, 255, 0))
    font = ImageFont.load_default(size=overlay_text_image_size)
    draw = ImageDraw.Draw(image)

    text_color = (0, 0, 0, 255)

    text_length = draw.textlength(text, font=font)
    avg_len_per_char = text_length / len(text)
    character_per_line = math.floor((width - 2 * padding) / avg_len_per_char)
    newline_text = "\n".join(
        text[i : i + character_per_line]
        for i in range(0, len(text), character_per_line)
    )

    # Draw the wrapped text starting at the top-left corner of the image
    draw.multiline_text(text=newline_text, font=font, xy=[0, 0], fill=text_color)

    # Save the image
    image.save(file_path)
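
The function above breaks lines at a fixed character count, which can split a word in the middle. Below is a minimal sketch of a word-aware alternative built on the standard library's `textwrap`. The helper name `wrap_text_for_width` and its parameters are hypothetical and not part of the original notebook. It assumes you pass in an average character width, such as the `avg_len_per_char` value computed above, so the fit is only approximate for proportional fonts.

```python
import textwrap


def wrap_text_for_width(text, max_width_px, avg_char_width_px):
    """Wrap text on word boundaries so each line fits roughly max_width_px.

    avg_char_width_px is an estimate (for example text_length / len(text)),
    so the result is approximate for proportional fonts.
    """
    if not text or avg_char_width_px <= 0:
        return ""
    # Never go below one character per line, even for very narrow images.
    chars_per_line = max(1, int(max_width_px // avg_char_width_px))
    return "\n".join(textwrap.wrap(text, width=chars_per_line))


sample_text = "The image features a man with a rugged appearance"
wrapped = wrap_text_for_width(sample_text, max_width_px=100, avg_char_width_px=10)
print(wrapped)
```

The returned string could be passed to `draw.multiline_text` in place of `newline_text`.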

When testing, start with a small number of scenes so that you don't upload more overlay images to VideoDB than necessary.

Process all scenes only once you're confident in your approach. This keeps resource use under control during the testing phase.

In [45]:
# Process only the first 5 scenes
# Comment out the line below if you want to process all scenes
scenes = scenes[:5]

In [46]:
from videodb.timeline import Timeline
from videodb.asset import VideoAsset, TextAsset, ImageAsset
from videodb import play_stream, TextStyle, MediaType

timeline = Timeline(conn)

video_asset = VideoAsset(asset_id=video.id)
timeline.add_inline(video_asset)

if not os.path.exists(images_dir):
    os.makedirs(images_dir)

for index, scene in enumerate(scenes):
    scene_start = float(scene["start"])
    scene_end = float(scene["end"])
    scene_desc = scene["response"]
    duration = scene_end - scene_start
    image_path = get_image_path(index)

    # Create an image with the scene description text
    create_image_with_text(scene_desc, image_path)

    # Upload the image to VideoDB
    image = conn.upload(file_path=image_path, media_type=MediaType.image)
    print(f"Uploaded Image - {index}")

    # Overlay the image on the video
    timeline.add_overlay(
        start=scene_start,
        asset=ImageAsset(
            asset_id=image.id,
            duration=duration,
            width=overlay_text_image_width,
            height=overlay_text_image_height,
        ),
    )

stream_url = timeline.generate_stream()
play_stream(stream_url)

Uploaded Image - 0
Uploaded Image - 1
Uploaded Image - 2
Uploaded Image - 3
Uploaded Image - 4


'https://console.videodb.io/player?url=https://dseetlpshk2tb.cloudfront.net/v3/published/manifests/c404b58a-a306-427f-850b-a24a410868c2.m3u8'

If you want to generate the timeline again, it's better to reuse the images already uploaded to VideoDB instead of uploading them again.
You can use the following cell to do that.

In [64]:
from videodb.timeline import Timeline
from videodb.asset import VideoAsset, TextAsset, ImageAsset
from videodb import play_stream, TextStyle, MediaType

all_images = coll.get_images()

timeline = Timeline(conn)

video_asset = VideoAsset(asset_id=video.id)
timeline.add_inline(video_asset)

if not os.path.exists(images_dir):
    os.makedirs(images_dir)

for index, scene in enumerate(scenes):
    scene_start = float(scene["start"])
    scene_end = float(scene["end"])
    scene_desc = scene["response"]
    duration = scene_end - scene_start
    image_name = get_image_name(index)

    target_image = None

    for image in all_images:
        if image_name in image.name:
            target_image = image
            break
    
    if target_image:
        # Overlay the image on the video
        timeline.add_overlay(
            start=scene_start,
            asset=ImageAsset(
                asset_id=target_image.id,
                duration=duration,
                width=overlay_text_image_width,
                height=overlay_text_image_height,
            ),
        )
    else:
        print(f"Image not found in VideoDB - {index}")

stream_url = timeline.generate_stream()
play_stream(stream_url)

'https://console.videodb.io/player?url=https://dseetlpshk2tb.cloudfront.net/v3/published/manifests/d947d087-bd49-4980-8eb8-8fff15f51e1b.m3u8'
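
The cell above matches images with a substring test (`image_name in image.name`), so `myimage-2-1` also matches `myimage-2-10` once more than ten scenes are processed. A minimal sketch of an exact-name lookup follows. The helper `build_image_lookup` is hypothetical, and the sketch assumes that `image.name` holds the uploaded file name, with or without its extension. The objects below are stand-ins for what `coll.get_images()` returns.

```python
import os
from types import SimpleNamespace


def build_image_lookup(images):
    """Map each image's name without its file extension to the image object."""
    return {os.path.splitext(img.name)[0]: img for img in images}


# Stand-in objects (assumption); real ones would come from coll.get_images()
sample_images = [
    SimpleNamespace(id="img-a", name="myimage-2-1.png"),
    SimpleNamespace(id="img-b", name="myimage-2-10.png"),
]

lookup = build_image_lookup(sample_images)
print(lookup["myimage-2-1"].id)
print(lookup.get("myimage-2-3"))
```

Building the dictionary once also avoids scanning the full image list for every scene.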

## Analyzing LLM Decider Prompt

In [51]:
llm_prompt = """Here’s the data from a vision (scene) index run on video footage. The structure of the data is { - index: scene description}. 
Based on this information, identify the scenes that are safe for mature audiences and do not include nudity.
As a response, return a JSON object that contains one field, safe, whose value is a list of all indexes (numbers) that are safe. \n \n"""

for index, scene in enumerate(scenes):
    llm_prompt += f"{index}: {scene['response']}\n"

In [52]:
import openai
import json

client = openai.OpenAI()

openai_res = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "system", "content": llm_prompt}],
    response_format={"type": "json_object"}
)
openai_res = json.loads(openai_res.choices[0].message.content)
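
The model's reply is not guaranteed to follow the requested shape: indexes may come back as strings, fall outside the scene range, or be missing altogether. Below is a minimal sketch of a defensive parser. The helper `parse_safe_indexes` is hypothetical and not part of the original notebook. It assumes you pass it the raw message content (the string handed to `json.loads` above) together with `len(scenes)`.

```python
import json


def parse_safe_indexes(raw_content, scene_count):
    """Return sorted, de-duplicated scene indexes from the model's JSON reply."""
    try:
        payload = json.loads(raw_content)
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, dict):
        return []
    values = payload.get("safe", [])
    if not isinstance(values, list):
        return []
    indexes = set()
    for value in values:
        try:
            index = int(value)
        except (TypeError, ValueError):
            continue  # Skip entries that are not numbers
        # Keep only indexes that refer to an existing scene
        if 0 <= index < scene_count:
            indexes.add(index)
    return sorted(indexes)


parsed = parse_safe_indexes('{"safe": [1, "2", 4, 99, "x"]}', 5)
print(parsed)
```

Returning an empty list on malformed input treats every scene as unsafe, which is the more cautious default for a censoring workflow.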

In [53]:
safe_indexes = [int(item) for item in openai_res['safe']]
# Compare against the integer indexes, in case the model returned them as strings
unsafe_indexes = [item for item in range(len(scenes)) if item not in safe_indexes]

safe_shots = []
unsafe_shots = []

for index in safe_indexes: 
    safe_shots.append([float(scenes[index]['start']), float(scenes[index]['end'])])

for index in unsafe_indexes: 
    unsafe_shots.append([float(scenes[index]['start']), float(scenes[index]['end'])])


print(safe_shots)
print(unsafe_shots)

[[6.439766666666666, 7.841166666666667], [7.841166666666667, 9.542866666666667], [22.989633333333334, 27.2272]]
[[0.0, 6.439766666666666], [9.542866666666667, 22.989633333333334]]


In [66]:
for index, scene in enumerate(scenes):
    scene_start = float(scene["start"])
    scene_end = float(scene["end"])
    duration = scene_end - scene_start

    if index in safe_indexes:
        text = "safe"
        color = 'green'
    else:
        text = "unsafe"
        color = 'red'
    text_asset = TextAsset(text=text, style=TextStyle(fontcolor=color, fontsize=30), duration=duration)
    timeline.add_overlay(start=scene_start, asset=text_asset)


stream_url = timeline.generate_stream()
play_stream(stream_url)

'https://console.videodb.io/player?url=https://dseetlpshk2tb.cloudfront.net/v3/published/manifests/2a16e42b-e34d-469a-a752-9983bf8a3529.m3u8'

## Generating Safe Stream and Unsafe Streams

In [67]:
def merge_timestamps(shots, threshold):
    # Nothing to merge when no shots were selected.
    if not shots:
        return []

    # Copy the first interval so the input list is not modified.
    merged_shots = [list(shots[0])]

    for current_start, current_end in shots[1:]:
        last_end = merged_shots[-1][1]

        # If the current start time is within the threshold of the last end time, merge them.
        if current_start - last_end <= threshold:
            merged_shots[-1][1] = max(last_end, current_end)  # Update the end time of the last interval.
        else:
            # Otherwise, add the current interval as a new entry.
            merged_shots.append([current_start, current_end])

    return merged_shots


merged_safe_shots = merge_timestamps(safe_shots, 1)
merged_unsafe_shots = merge_timestamps(unsafe_shots, 1)
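
`merge_timestamps` relies on the shots arriving in chronological order. Below is a small self-contained sketch of the same merging idea that sorts its input first. The name `merge_intervals` and the sample values (rounded from the shots printed earlier) are only for illustration.

```python
def merge_intervals(shots, threshold):
    """Merge [start, end] pairs whose gap is at most `threshold` seconds."""
    merged = []
    for start, end in sorted(shots):
        if merged and start - merged[-1][1] <= threshold:
            # Close enough to the previous interval: extend it
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # Too far apart: start a new interval
            merged.append([start, end])
    return merged


sample_shots = [[22.99, 27.23], [6.44, 7.84], [7.84, 9.54]]
merged_sample = merge_intervals(sample_shots, threshold=1)
print(merged_sample)
```

Here the two adjacent shots collapse into one interval, while the shot starting at 22.99 stays separate because the gap before it exceeds the 1 second threshold.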

#### Stream of Shots Detected as Safe

In [68]:
from videodb import play_stream

stream_url = video.generate_stream(timeline=merged_safe_shots)
play_stream(stream_url)

'https://console.videodb.io/player?url=https://dseetlpshk2tb.cloudfront.net/v3/published/manifests/eea2dbfe-dc30-4adc-8db7-537ac627a51c.m3u8'

#### Stream of Shots Detected as Unsafe

In [52]:
from videodb import play_stream

stream_url = video.generate_stream(timeline=merged_unsafe_shots)
play_stream(stream_url)

'https://console.videodb.io/player?url=https://stream.videodb.io/v3/published/manifests/fe772d36-9d3a-4639-9100-2291341bb8b5.m3u8'

## Clean up

Delete Uploaded Images

In [None]:
from videodb.timeline import Timeline
from videodb.asset import VideoAsset, TextAsset, ImageAsset
from videodb import play_stream, TextStyle, MediaType

all_images = coll.get_images()

for index, scene in enumerate(scenes):
    image_name = get_image_name(index)
    target_image = None
    for image in all_images:
        if image_name in image.name:
            target_image = image
            break
    if target_image:
        # Delete the matched image, then report it
        target_image.delete()
        print(f"Image Deleted - {index}")
    else:
        print(f"Image not found in VideoDB - {index}")