<a href="https://colab.research.google.com/github/KhurramDevOps/Quarter-02/blob/master/06_Assignment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Exploring LLM Models for Creative Video Generation and Script Analysis**

In [1]:
!pip install langchain-google-genai google-genai



In [2]:
from google.colab import userdata
GENAI_KEY = userdata.get('GOOGLE_API_KEY_1')

In [3]:
from langchain_google_genai import ChatGoogleGenerativeAI

In [4]:

llm = ChatGoogleGenerativeAI(
    api_key=GENAI_KEY,
    model="gemini-2.0-flash-exp",
)

# Video Generation Prompt

I used the following prompt to generate the video in VEED Studio.

**Prompt:**
A fast-paced, one-minute video highlighting the journey of innovation and transformation. The video begins with a young inventor sketching ideas on a notepad under the dim glow of a desk lamp. The camera zooms into the sketches, transforming them into a real-world 3D model of a robotic arm, lifting a box in a modern, high-tech factory.

The scene shifts to a montage of breakthroughs in various fields: a scientist holding a glowing test tube in a research lab, engineers assembling a clean energy plant, and a futuristic city filled with electric cars, solar panels, and green spaces. Each clip transitions smoothly, keeping the energy high.

Halfway through, the focus shifts to the collaborative spirit of humanity: teams working together on tech projects, people sharing knowledge through virtual screens, and communities coming together to solve challenges like climate change and food scarcity. A quick shot of diverse individuals planting vertical gardens in an urban setting symbolizes sustainable progress.

The video ends with a sweeping aerial shot of Earth from a satellite, showing green innovations and bustling cities. A simple text overlay appears: **"Innovation powered by unity."**

---

## Style and Details:

- **Resolution:** 4K Ultra HD.
- **Lighting:** Bright, warm, and optimistic with a focus on green, natural tones.
- **Camera Movements:** Quick cuts, dramatic zoom-ins, and smooth pans for an energetic and modern feel.
- **Mood:** Inspiring, optimistic, and forward-looking.
- **Duration:** 60 seconds.
- **Sound Effects:** The hum of machines, soft background conversation, and the swoosh of futuristic tech.
- **Music:** A hopeful, upbeat track with a gradual crescendo, leading to a powerful and uplifting finish.


In [36]:

from google.colab import files

# Upload video file
uploaded = files.upload()

# List uploaded files
for filename in uploaded.keys():
    print(f'Uploaded file: {filename}')

Saving Innovation Unleashed_ Unity in Progress-VEED.mp4 to Innovation Unleashed_ Unity in Progress-VEED (2).mp4
Uploaded file: Innovation Unleashed_ Unity in Progress-VEED (2).mp4
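
The name Colab reports after an upload can differ from the original file name (here a ` (2)` suffix was appended), so a hard-coded path in a later cell can easily point at the wrong file. A minimal sketch of deriving the path from the upload result instead, assuming the dictionary returned by `files.upload()` is keyed by the saved file names and that files land in `/content`; the helper name `first_uploaded_path` and the stand-in dictionary are hypothetical:

```python
def first_uploaded_path(uploaded, base_dir="/content"):
    """Return the full path of the first uploaded file (hypothetical helper)."""
    if not uploaded:
        raise ValueError("No file was uploaded.")
    # The keys are the names Colab actually saved, including any " (2)" suffix.
    filename = next(iter(uploaded))
    return f"{base_dir.rstrip('/')}/{filename}"

# Stand-in for the dictionary returned by files.upload()
example_uploaded = {"Innovation Unleashed_ Unity in Progress-VEED (2).mp4": b""}
video_path = first_uploaded_path(example_uploaded)
print(video_path)
```

The resulting path could then be passed to the upload function defined later in the notebook.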


In [12]:
from google import genai
from google.genai import Client
client = Client(api_key=GENAI_KEY)

In [13]:
model: str = "gemini-2.0-flash-exp"

## Video Upload Function
The function below uploads a video file through the `google-genai` client and polls its processing status every 10 seconds. Once processing completes, it prints the file's URI and returns the file object for further use. If processing fails, it raises a `ValueError`.


In [15]:
import time

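def upload_video(video_file_name):
    """Upload a video file and wait until processing finishes."""
    # Upload the file at the given path.
    # Note: recent google-genai releases may expect `file=` instead of `path=`.
    video_file = client.files.upload(path=video_file_name)

    # Poll every 10 seconds while the service is still processing the file.
    while video_file.state == "PROCESSING":
        print('Waiting for video to be processed.')
        time.sleep(10)
        video_file = client.files.get(name=video_file.name or "")

    if video_file.state == "FAILED":
        raise ValueError(video_file.state)
    print('Video processing complete: ' + (video_file.uri or ""))

    return video_file

# Path of the video uploaded to the Colab runtime
pottery_video = upload_video("/content/Innovation Unleashed_ Unity in Progress-VEED (1).mp4")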

Waiting for video to be processed.
Video processing complete: https://generativelanguage.googleapis.com/v1beta/files/sy0l4w8c48ir
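
The polling loop above waits indefinitely if the file never leaves the `PROCESSING` state. A minimal sketch of the same pattern with a timeout, assuming only that the status object exposes a `state` attribute comparable to a string; `wait_until_processed`, `fetch`, and the simulated status objects are hypothetical names for illustration and are not part of the `google-genai` API:

```python
import time
from types import SimpleNamespace

def wait_until_processed(fetch, poll_seconds=10.0, timeout_seconds=300.0, sleep=time.sleep):
    """Call fetch() until the returned object's state is no longer PROCESSING."""
    waited = 0.0
    item = fetch()
    while item.state == "PROCESSING":
        if waited >= timeout_seconds:
            raise TimeoutError(f"Still processing after {waited:.0f} seconds")
        sleep(poll_seconds)
        waited += poll_seconds
        item = fetch()
    if item.state == "FAILED":
        raise ValueError("Processing failed")
    return item

# Simulated status sequence standing in for repeated status lookups
states = iter(["PROCESSING", "PROCESSING", "ACTIVE"])
result = wait_until_processed(
    lambda: SimpleNamespace(state=next(states)),
    poll_seconds=0.0,
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(result.state)
```

In the notebook, `fetch` could wrap the existing `client.files.get(name=video_file.name)` call, which keeps the waiting logic separate from the upload itself.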


## Video Caption Generator
This function generates time-coded captions for a video with the `google-genai` client and the `gemini-2.0-flash-exp` model. It takes the uploaded file object as input (its `uri` and `mime_type` are passed to the model), sends a prompt asking for a caption and the spoken text of each scene, and wraps the response in a `Markdown` object so it displays readably in the notebook.


In [16]:

from IPython.display import display, Audio, Markdown

In [27]:
from google.genai.types import Part, Content

def generate_video_captions(video):
    """Ask Gemini for time-coded scene captions and return them as Markdown."""
    # Model name and API client used for the request
    model = "gemini-2.0-flash-exp"
    client: Client = genai.Client(api_key=GENAI_KEY)
    prompt = """For each scene in this video,
                generate captions that describe the scene, along with spoken text.
                Place each caption into an object with the timecode of the caption in the video.
             """

    # Send the uploaded video (referenced by URI) together with the prompt
    response = client.models.generate_content(
        model=model,
        contents=[
            Content(
                role="user",
                parts=[
                    Part.from_uri(
                        file_uri=video.uri or "",
                        mime_type=video.mime_type or ""
                    ),
                ]
            ),
            prompt,
        ]
    )

    # Wrap the response text in a Markdown object for display
    scenes = response.text
    return Markdown(scenes)

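`generate_video_captions` returns a `Markdown` object, which is convenient for display but not for further processing. A minimal sketch of turning raw response text into Python objects, assuming the model wraps its answer in a `json` code fence (which is not guaranteed); `parse_caption_json` is a hypothetical helper and the sample string is an abbreviated stand-in for a real response:

```python
import json
import re

FENCE = "`" * 3  # three backticks, built here to avoid nesting fences in this cell

def parse_caption_json(response_text):
    """Extract and parse a JSON payload, with or without a Markdown code fence."""
    # Look for a fenced block; fall back to the whole text if none is found.
    pattern = FENCE + r"(?:json)?\s*(.*?)" + FENCE
    match = re.search(pattern, response_text, re.DOTALL)
    payload = match.group(1) if match else response_text
    return json.loads(payload)

# Abbreviated stand-in for the text of a model response
sample = FENCE + 'json\n[{"timecode": "00:00", "spoken_text": "In a world"}]\n' + FENCE
captions = parse_caption_json(sample)
print(captions[0]["timecode"], captions[0]["spoken_text"])
```

To use this in the notebook, the function would need to return `response.text` (or both the text and the `Markdown` object) rather than only the display wrapper.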

In [28]:
generate_video_captions(pottery_video)

```json
[
  {
    "timecode": "00:00",
    "caption": "A hand draws a building on paper with a pencil. The words “IN A WORLD” appear on the screen along with a globe emoji.",
     "spoken_text": "In a world"
  },
    {
    "timecode":"00:01",
    "caption": "A hand continues drawing on the paper, the words “DRIVEN BY INNOVATION” and a light bulb emoji appear.",
    "spoken_text":"driven by innovation, "
  },
 {
    "timecode":"00:02",
    "caption": "The words “IDEA BEGINS WITH A” and lightbulb emoji appear.",
    "spoken_text":"every great idea begins with a "
 },
  {
    "timecode":"00:04",
     "caption": "The words “SPARK OF CREATIVITY” appear with a picture frame emoji",
     "spoken_text":"spark of creativity. "
   },
  {
   "timecode": "00:05",
   "caption": "The words “PICTURE A YOUNG INVENTOR” appear with an orange ball of yarn emoji.",
   "spoken_text": "Picture a young inventor"
  },
 {
    "timecode": "00:07",
     "caption": "The words “HUNCHED OVER A NOTEPAD” and an orange ball of yarn emoji appear.",
      "spoken_text":"hunched over a notepad,"
 },
  {
  "timecode":"00:09",
  "caption":"The words “DREAMS UNDER THE” and an orange ball of yarn emoji appear.",
  "spoken_text":"sketching dreams under the soft glow of a desk lamp."
 },
 {
    "timecode":"00:13",
     "caption":"A post office emoji appears with the words “THOSE SKETCHES”",
     "spoken_text":"Those sketches"
  },
{
   "timecode":"00:14",
   "caption": "The words “DON’T JUST STAY ON PAPER” appear",
   "spoken_text":"don't just stay on paper."
  },
   {
   "timecode":"00:15",
    "caption":"A robotic arm in a factory is in motion, a mechanical arm emoji is shown, and “THEY LEAD” appears.",
    "spoken_text":"They lead"
  },
 {
    "timecode":"00:16",
      "caption":"The words “TO LIFE AS A ROBOTIC ARM” appear with a mechanical arm emoji.",
      "spoken_text":"to life as a robotic arm"
  },
  {
    "timecode":"00:18",
     "caption": "The words “EFFORTLESSLY LIFTING BOXES” appear with a pair of scissors emoji.",
     "spoken_text":"effortlessly lifting boxes"
  },
  {
    "timecode":"00:20",
    "caption":"The words “IN A CUTTING EDGE FACTORY” and light bulb emoji appear.",
    "spoken_text": "in a cutting edge factory. Innovation doesn't stop there."
  },
{
  "timecode":"00:23",
  "caption":"Footprint emojis appear with the words “ITS A RELENTLESS JOURNEY”.",
   "spoken_text":"It's a relentless journey"
  },
{
    "timecode":"00:25",
     "caption":"A green test tube emoji appears with the words “THROUGH BREAKTHROUGHS”.",
      "spoken_text":"through breakthroughs"
  },
 {
  "timecode":"00:26",
  "caption":"The words “IN SCIENCE AND TECHNOLOGY” appear along with a green test tube emoji.",
   "spoken_text":"in science and technology."
  },
 {
  "timecode":"00:28",
   "caption":"A scientist emoji holding a green test tube appears with the words “IMAGINE A”",
    "spoken_text":"Imagine a"
  },
 {
 "timecode":"00:29",
 "caption":"The words “SCIENTIST IN A LAB” appear with a scientist emoji holding a green test tube",
 "spoken_text":"scientist in a lab"
  },
{
   "timecode":"00:30",
   "caption":"The words “HOLDING A GLOWING TEST TUBE” appear with a yellow moon emoji.",
    "spoken_text":"holding a glowing test tube"
  },
{
   "timecode":"00:33",
   "caption": "A yellow moon emoji appears with the words “A SYMBOL OF DISCOVERY”",
   "spoken_text":"a symbol of discovery."
   },
{
  "timecode":"00:34",
 "caption": "A safety vest emoji appears with the words “ENGINEERS”",
 "spoken_text":"Engineers"
 },
{
  "timecode":"00:36",
  "caption":"A safety vest emoji appears with the words “ARE HARD AT WORK”",
  "spoken_text":"are hard at work"
 },
{
   "timecode":"00:37",
 "caption":"The words “ASSEMBLING CLEAN ENERGY PLANTS” appear with a lightning bolt emoji.",
  "spoken_text":"assembling clean energy plants"
 },
{
   "timecode":"00:38",
 "caption": "The words “THAT PROMISE A SUSTAINABLE FUTURE” and lightning bolt emoji appear.",
 "spoken_text":"that promise a sustainable future."
  },
{
  "timecode":"00:41",
  "caption":"Sunglasses emoji appear with the words “AND THEN THERES” ",
  "spoken_text":"And then there's"
  },
{
  "timecode":"00:42",
 "caption":"Sunglasses appear with the words “THE VISION OF A FUTURISTIC CITY”",
 "spoken_text":"the vision of a futuristic city"
 },
{
   "timecode":"00:45",
 "caption":"The words “ALIVE WITH ELECTRIC CARS” and a lightning bolt emoji appear.",
 "spoken_text":"alive with electric cars,"
  },
 {
  "timecode":"00:46",
  "caption":"The words “SOLAR PANELS AND LUSH GREEN SPACES” appear.",
  "spoken_text":"solar panels and lush green spaces."
   },
{
  "timecode":"00:49",
  "caption":"A heart emoji appears with the words “BUT THE HEART OF INNOVATION LIES”",
  "spoken_text":"But the heart of innovation lies"
 },
{
    "timecode":"00:50",
 "caption":"The words “IN COLLABORATION” appear.",
 "spoken_text":"in collaboration."
  },
{
    "timecode":"00:52",
    "caption":"The words “TEAMS UNITE, SHARING” appear with a stack of books emoji",
     "spoken_text":"Teams unite, sharing"
   },
 {
   "timecode":"00:53",
   "caption":"The words “KNOWLEDGE ACROSS VIRTUAL SCREENS” appear with a stack of books emoji.",
   "spoken_text":"knowledge across virtual screens."
   },
{
   "timecode":"00:56",
   "caption": "The words “TEAM TACKLING THE WORLD’S TOUGHEST CHALLENGES” and a person with person emoji appear.",
     "spoken_text":"team tackling the world's toughest challenges."
   },
 {
    "timecode":"00:57",
  "caption": "A back and forth arrow emoji appears with the words “CLIMATE”",
   "spoken_text":"climate"
 },
 {
  "timecode":"00:58",
 "caption":"A back and forth arrow emoji appears with the words “CHANGE, FOOD”",
  "spoken_text":"change, food"
 },
 {
   "timecode":"01:00",
  "caption":"The words “SCARCITY AND” appear with a back and forth arrow emoji.",
   "spoken_text":"and"
 },
 {
  "timecode":"01:01",
  "caption":"The words “DIVERSE INDIVIDUALS” and a person with person emoji appear.",
    "spoken_text":"Diverse individuals"
 },
{
  "timecode":"01:03",
 "caption":"The words “COME TOGETHER PLANTING” appear with a person with person emoji",
 "spoken_text":"come together, planting"
  },
 {
  "timecode":"01:04",
  "caption":"The words “VERTICAL GARDENS IN URBAN LANDSCAPES” appear with an up and down arrow emoji.",
 "spoken_text":"vertical gardens in urban landscapes,"
  },
{
  "timecode":"01:05",
  "caption":"A pregnant person emoji appears with the words “NURTURING SUSTAINABLE”",
   "spoken_text":"nurturing sustainable"
  },
{
    "timecode":"01:07",
   "caption":"The words “PROGRESS WHERE ITS NEEDED MOST” appear.",
    "spoken_text":"progress right where it's needed most."
  },
{
    "timecode":"01:09",
  "caption":"A magnifying glass emoji appears with the words “AS WE ZOOM OUT”",
  "spoken_text":"As we zoom out,"
   },
{
  "timecode":"01:11",
  "caption":"The words “WE SEE EARTH FROM ABOVE” and magnifying glass emoji appear.",
   "spoken_text":"we see Earth from above,"
   },
{
  "timecode":"01:12",
  "caption":"A rainbow emoji appears with the words “A VIBRANT”",
   "spoken_text":"a vibrant"
   },
{
    "timecode":"01:13",
   "caption":"A rainbow emoji appears with the words “TAPESTRY OF GREEN INNOVATIONS”",
   "spoken_text":"tapestry of green innovations"
   },
{
    "timecode":"01:15",
 "caption":"The words “AND BUSTLING CITIES” appear with a rainbow emoji",
  "spoken_text":"and bustling cities."
   },
{
    "timecode":"01:16",
  "caption":"A flexed biceps emoji appears with the words “THIS”",
    "spoken_text":"This"
 },
{
  "timecode":"01:17",
  "caption":"The words “IS THE POWER OF UNITY” appear with a flexed bicep emoji.",
    "spoken_text":"is the power of unity"
  },
{
   "timecode":"01:18",
  "caption":"The words “IN INNOVATION” appear.",
  "spoken_text":"in innovation."
   },
{
   "timecode":"01:19",
    "caption":"A person with person emoji appears with the words “TOGETHER”",
    "spoken_text":"Together,"
  },
 {
 "timecode":"01:20",
  "caption":"A person with person emoji appears with the words “WE'RE NOT JUST”",
   "spoken_text":"we're not just"
  },
 {
   "timecode":"01:21",
  "caption": "A house emoji appears with the words “IMAGINING THE FUTURE”",
   "spoken_text":"imagining the future,"
 },
  {
  "timecode":"01:23",
  "caption":"The words “WE’RE BUILDING IT” appear.",
   "spoken_text":"we're building it."
 }
]
```
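
Because each caption carries an `MM:SS` timecode and a fragment of the narration, the list can be post-processed into a timeline or a continuous script. A minimal sketch, assuming the parsed output is a list of dictionaries with the `timecode` and `spoken_text` keys shown above; the helper names are hypothetical, and the sample holds only the first three entries of the output:

```python
def timecode_to_seconds(timecode):
    """Convert an 'MM:SS' timecode such as '01:23' to seconds."""
    minutes, seconds = timecode.split(":")
    return int(minutes) * 60 + int(seconds)

def build_transcript(captions):
    """Join the spoken fragments in timecode order into one string."""
    ordered = sorted(captions, key=lambda c: timecode_to_seconds(c["timecode"]))
    return " ".join(c["spoken_text"].strip() for c in ordered)

# First three entries of the caption output, keeping only the fields used here
captions = [
    {"timecode": "00:00", "spoken_text": "In a world"},
    {"timecode": "00:01", "spoken_text": "driven by innovation, "},
    {"timecode": "00:02", "spoken_text": "every great idea begins with a "},
]
print(build_transcript(captions))
print(timecode_to_seconds("01:23"))
```

Applied to the final entry, `01:23` corresponds to 83 seconds, so the generated video runs longer than the 60 seconds requested in the prompt.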

# **Reflection on the Experience**

Throughout this assignment, the main challenge was finding a text-to-video model that could be used for free in Google Colab. I explored several platforms, including Runway, Pictory, Pixel AI, and Synthesia, but most either required a paid plan or did not offer the flexibility I needed. After trying to integrate 10 to 15 different models, I could not find a workable solution within the constraints of free access.

Given these limitations, I took an alternative approach and used the Gemini LLM (configured through LangChain and the `google-genai` client) for video analysis. This let me focus on generating a detailed script and timeline from the video, which still fulfilled the assignment's objectives. While it was not the text-to-video generation I originally envisioned, it gave me valuable insight into what LLMs can do in content creation and analysis.

Overall, this experience taught me the importance of flexibility and creative problem-solving when facing technical constraints. It also showed the need to adapt to the resources available and to find other ways of reaching the desired outcome.
