# 🔊 Lovo x VideoDB: Adding AI-Generated Voiceovers to Silent Footage

<a href="https://colab.research.google.com/github/video-db/videodb-cookbook/blob/nb/lovo-1/examples/Lovo_Voiceover_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Overview

![](https://raw.githubusercontent.com/video-db/videodb-cookbook-assets/main/images/examples/Lovo_Voiceover_1.png)

---

## Setup

### 📦  Installing packages 

In [None]:
%pip install openai
%pip install videodb

### 🔑 API Keys

In [74]:
import os

os.environ["OPENAI_API_KEY"] = ""
os.environ["LOVO_API_KEY"] = ""
os.environ["VIDEO_DB_API_KEY"] = ""
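
The three keys above are left blank for you to fill in, and a forgotten key usually only surfaces later as an authentication error. As a minimal sketch, the hypothetical helper `missing_keys` below (not part of any of the SDKs used here) reports which of the variable names from the cell above are still empty:

```python
import os

# Variable names taken from the cell above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "LOVO_API_KEY", "VIDEO_DB_API_KEY")


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the names in `required` that are absent or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_keys(os.environ)
if missing:
    print("Still unset:", ", ".join(missing))
```

Running this right after setting the keys gives a quick check before any API call is made.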

### 🎙️  Lovo's Speaker ID  

In [75]:
voiceover_artist_id = "640f477d2babeb0024be422b"

---

## Implementation


### 🌐 Step 1: Connect to VideoDB
Connect to VideoDB using your API key to establish a session for uploading and manipulating video files.

In [76]:
from videodb import connect

# connect() picks up the key from the VIDEO_DB_API_KEY environment variable set above
conn = connect()
coll = conn.get_collection()

### 🎥 Step 2: Upload Video

In [78]:
# Upload a video by URL (replace the url with your video)
video = conn.upload(url='https://youtu.be/RcRjY5kzia8')

### 🔍 Step 3: Analyze Scenes and Generate Scene Descriptions

Start by analyzing the scenes in your video with VideoDB's scene indexing. The resulting scene descriptions provide the context for the script prompt.

In [None]:
video.index_scenes()

Let's view the time range and description of the first scene in the video.

In [81]:
scenes = video.get_scenes()
print(f"{scenes[0]['start']} - {scenes[0]['end']}")
print(scenes[0]["response"])

0 - 9.7
The image is a highly pixelated or low-resolution photo predominated by various shades of blue. There is no distinguishable subject due to heavy pixelation, causing the content to appear abstract. The pattern seems chaotic, with irregularly shaped and sized blocks giving an impression of a digital mosaic. There are subtle variations in color, with some pixels exhibiting a teal or turquoise hue, while others are a deeper navy blue, together creating a textured appearance. The image's focus is unclear, causing the pixels to blur together, further obscuring any possible details and making the overall content of the photograph indiscernible.


### 📝 Step 4: Generate Voiceover Script with LLM
Combine the scene descriptions with the script prompt, instructing the LLM to create a voiceover script in David Attenborough's style.

This script prompt can be refined and tweaked to generate the most suitable output. Check out [these examples](https://www.youtube.com/playlist?list=PLhxAMFLSSK03rsPTjRv1LbAXHQpNN6BS0) to explore more use cases.

In [None]:
import openai

client = openai.OpenAI()

script_prompt = "Here's the data from a scene index for a video about the underwater world. Study this and then generate a synced script based on the description below. Make sure the script is in the language, voice and style of Sir David Attenborough"

full_prompt = script_prompt + "\n\n"
for scene in scenes:
    full_prompt += f"- {scene}\n"

openai_res = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "system", "content": full_prompt}],
)
voiceover_script = openai_res.choices[0].message.content
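
The loop above appends each scene dictionary to the prompt as-is. If you would rather send the model only the timing and description, the sketch below shows one way to flatten the scenes first. The function name `format_scenes` is hypothetical, and it assumes every scene carries the `start`, `end` and `response` keys seen in Step 3:

```python
def format_scenes(scenes):
    """Render each scene as '- [start-end s] description' on its own line."""
    lines = []
    for scene in scenes:
        # Collapse newlines and repeated spaces so each scene stays on one line.
        description = " ".join(str(scene["response"]).split())
        lines.append(f"- [{scene['start']}-{scene['end']} s] {description}")
    return "\n".join(lines)


# Stand-in data with the same keys as the scene printed in Step 3.
sample_scenes = [
    {"start": 0, "end": 9.7, "response": "A blue,  pixelated frame."},
    {"start": 9.7, "end": 15, "response": "A school of fish\nswims past."},
]
print(format_scenes(sample_scenes))
```

With the real data, `script_prompt + "\n\n" + format_scenes(scenes)` would take the place of the loop that builds `full_prompt`.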

In [None]:
# Split the script into pieces of at most 500 characters; each piece becomes one TTS request
chunk_size = 500
chunks = [voiceover_script[i:i+chunk_size] for i in range(0, len(voiceover_script), chunk_size)]
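
Slicing at a fixed character count can cut a word or sentence in half, which may be audible where two audio clips meet. As a rough alternative, the sketch below groups whole sentences into chunks of at most `max_chars` characters. The helper name `chunk_by_sentence` and the simple punctuation-based sentence split are assumptions for illustration, not part of the Lovo or VideoDB APIs:

```python
import re


def chunk_by_sentence(text, max_chars=500):
    """Group whole sentences into chunks of at most `max_chars` characters.

    A single sentence longer than `max_chars` is kept whole as its own chunk.
    """
    # Naive split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


demo = chunk_by_sentence("One two. Three four! Five six? Seven.", max_chars=20)
print(demo)
```

To try it in this notebook, `chunks = chunk_by_sentence(voiceover_script, chunk_size)` would replace the list comprehension above.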

### 🎤 Step 5: Generate Voiceover Audio with LOVO 

In [None]:
import requests
import time 


# Call Lovo API to generate voiceover
url = "https://api.genny.lovo.ai/api/v1/tts"
headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "X-API-KEY": os.environ.get("LOVO_API_KEY")
}

outputs = []
for chunk in chunks:
    payload = {
        "text": chunk,
        "speaker": voiceover_artist_id
    }
    lovo_res = requests.post(url, json=payload, headers=headers)
    lovo_res.raise_for_status()  # fail early on an HTTP error instead of a KeyError below
    job_id = lovo_res.json()["id"]
    outputs.append({"job_id": job_id})

# Poll each job once per second until Lovo reports that it has succeeded
poll_time = 1
for output in outputs:
    while True:
        lovo_res = requests.get(f"{url}/{output['job_id']}", headers=headers)
        job = lovo_res.json()["data"][0]
        if job["status"] == "succeeded":
            output["audio_url"] = job["urls"][0]
            break
        time.sleep(poll_time)
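
The polling loop above keeps waiting until a job reports `succeeded`, so a job that never reaches that status would block the notebook forever. One possible guard is a generic polling helper with a deadline. The sketch below is not tied to Lovo: `poll_until`, its parameters, and the `pending` status in the demo are all hypothetical stand-ins.

```python
import time


def poll_until(fetch, is_done, timeout=120.0, interval=1.0):
    """Call `fetch()` until `is_done(result)` is true.

    Raises TimeoutError if the condition is still false after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job not finished after {timeout} seconds")
        time.sleep(interval)


# Demo with canned responses instead of real HTTP calls.
responses = iter([
    {"status": "pending"},  # placeholder status, not taken from the Lovo docs
    {"status": "succeeded", "urls": ["audio.wav"]},
])
job = poll_until(lambda: next(responses), lambda r: r["status"] == "succeeded", interval=0)
print(job["urls"][0])
```

In this notebook, `fetch` would wrap the GET request for one job and `is_done` would check `job["status"] == "succeeded"`, exactly as the loop above does.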

### 🎬 Step 6: Add Voiceover to Video with VideoDB

To use the voiceover generated above, first upload each audio file to VideoDB.

In [89]:

for output in outputs:
    audio = coll.upload(url=output["audio_url"])
    output["audio_id"] = audio.id
    output["audio_length"] = float(audio.length)
    print("Audio Uploaded with id", audio.id)


Audio Uploaded with id a-7f071983-4eac-4b90-ab64-fdbe82e6d3cc
Audio Uploaded with id a-401d6309-45c6-4355-85d1-eae8712e1c04
Audio Uploaded with id a-ecadee95-edb6-476d-8e89-82b65c5ed157
Audio Uploaded with id a-92a01c89-1cf6-4c2c-ba75-3ecae238e616
Audio Uploaded with id a-601f01e4-0f01-4d41-a4ad-a2aaa66704f4
Audio Uploaded with id a-37c76474-dc4c-4be2-a1a6-55d8144f678e


Finally, add the AI-generated voiceover to the original footage using VideoDB's [timeline feature](https://docs.videodb.io/version-0-0-3-timeline-and-assets-44).

In [90]:
from videodb.timeline import Timeline
from videodb.asset import VideoAsset, AudioAsset

# Create a timeline object
timeline = Timeline(conn)

# Add the video asset to the timeline for playback
video_asset = VideoAsset(asset_id=video.id)
timeline.add_inline(asset=video_asset)

# Overlay the audio clips back to back: each clip starts where the previous one ended
seek = 0
for output in outputs:
    audio_asset = AudioAsset(asset_id=output["audio_id"])
    timeline.add_overlay(start=seek, asset=audio_asset)
    seek += output['audio_length']
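
The `seek` arithmetic above places each clip at the running total of the lengths before it. The sketch below isolates that calculation and adds a check for narration that runs longer than the footage. Both helper names, `overlay_starts` and `overrun`, are hypothetical, and the lengths are made-up sample values in seconds:

```python
def overlay_starts(lengths):
    """Start time of each clip when clips are placed back to back from 0."""
    starts, seek = [], 0.0
    for length in lengths:
        starts.append(seek)
        seek += length
    return starts


def overrun(lengths, video_length):
    """Seconds by which the combined audio exceeds the video (0.0 if it fits)."""
    return max(0.0, sum(lengths) - video_length)


audio_lengths = [4.0, 2.5, 3.0]  # sample clip lengths in seconds
print(overlay_starts(audio_lengths))
print(overrun(audio_lengths, video_length=9.0))
```

With the real data, `audio_lengths` would be `[o["audio_length"] for o in outputs]`. Comparing against the video's duration assumes the video object exposes a length value the way the uploaded audio objects do.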

### 🪄 Step 7: Review and Share
Preview the video with the integrated voiceover to ensure it functions correctly.   
Once satisfied, generate a stream of the video and share the link for others to view and enjoy this wholesome creation!


In [91]:
from videodb import play_stream

stream_url = timeline.generate_stream()
play_stream(stream_url)

'https://console.videodb.io/player?url=https://dseetlpshk2tb.cloudfront.net/v3/published/manifests/b91a79b5-1ebb-4876-ab5d-d5c2380a099d.m3u8'

---

### 🎉 Conclusion:
Congratulations! You have successfully automated the process of creating custom, personalized voiceovers from a simple prompt and raw video footage using VideoDB, OpenAI, and LOVO.

By leveraging advanced AI technologies, you can enhance the storytelling and immersive experience of your video content. Experiment with different prompts and scene analysis techniques to further improve the quality and accuracy of the voiceovers. Enjoy creating captivating narratives with AI-powered voiceovers using VideoDB! 

For more such explorations, refer to the [documentation of VideoDB](https://docs.videodb.io/) and join the VideoDB community on [GitHub](https://github.com/video-db) or [Discord](https://discord.com/invite/py9P639jGz) for support and collaboration.

We're excited to see what you create, so please share it via [Discord](https://discord.com/invite/py9P639jGz), [LinkedIn](https://www.linkedin.com/company/videodb) or [Twitter](https://twitter.com/videodb_io).