<a href="https://colab.research.google.com/github/abida229/Python_APIs/blob/main/Assignment_06_Exploring_LLM_Models_for_Creative_Video_Generation_and_Script_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Objective:**
The goal of this assignment is to explore the capabilities of Large Language Models (LLMs) in generating creative content. Students will generate a video from a creative prompt, save the generated video, and analyze it with the Gemini LLM to produce a script with a detailed timeline.

# **Step 1: Generate a Creative Video**

Select a model or platform capable of video generation. Note that a vision model such as OpenAI’s GPT-4 Vision accepts images as input but does not generate video, so a text-to-video tool is needed for this step.
Design a creative and detailed prompt for video generation. For example, “Create a short animated video showing the life of a tree through the seasons.”
Generate the video using the selected LLM.

In [5]:
# Mount Google Drive so the notebook can read the saved video.
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


# **Prompt 1:**
"Create a 60-second cinematic video of a powerful ocean storm. Towering waves rise and crash violently against jagged, dark rocky cliffs, sending sprays of white foam into the air. The sky is a chaotic mix of swirling dark gray clouds, periodically illuminated by blinding flashes of lightning that streak across the horizon. Thunder rumbles deeply, shaking the atmosphere. In the foreground, sea birds struggle to stay aloft, their silhouettes faintly visible against the stormy sky. The camera pans from the turbulent water to the cliffs, showing streams of rainwater cascading down the rocks. As the lightning flashes, it briefly reveals a ship in the distance, battling the waves with its sails torn and mast creaking under the wind’s force. The scene concludes with a dramatic close-up of a wave breaking in slow motion, the spray glistening in the fleeting light of another lightning strike."


In [7]:
#copy path of video from google drive
video_1 = "/content/drive/MyDrive/Colab Notebooks/Agentic AI code/Epic _Ocean_Storm.mp4"
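
Because the Drive path is typed by hand and contains spaces, it may be worth checking it before uploading. The following is a minimal sketch; `check_video_path` is a hypothetical helper introduced here for illustration, not part of the original notebook.

```python
from pathlib import Path


def check_video_path(path_str: str) -> Path:
    """Return the path as a Path object, or raise if it is not an existing .mp4 file.

    Hypothetical helper: the original notebook passes the raw string directly.
    """
    path = Path(path_str)
    if not path.is_file():
        raise FileNotFoundError(f"Video not found: {path}")
    if path.suffix.lower() != ".mp4":
        raise ValueError(f"Expected an .mp4 file, got: {path.suffix!r}")
    return path
```

In this notebook it could be called as `check_video_path(video_1)` right after the path is defined, so a typo surfaces before any upload starts.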

In [4]:
from google.colab import userdata
Google_api = userdata.get('GOOGLE_API_KEY')

In [2]:
!pip install langchain_google_genai

Collecting langchain_google_genai
  Downloading langchain_google_genai-2.0.7-py3-none-any.whl.metadata (3.6 kB)
Collecting filetype<2.0.0,>=1.2.0 (from langchain_google_genai)
  Downloading filetype-1.2.0-py2.py3-none-any.whl.metadata (6.5 kB)
Downloading langchain_google_genai-2.0.7-py3-none-any.whl (41 kB)
Downloading filetype-1.2.0-py2.py3-none-any.whl (19 kB)
Installing collected packages: filetype, langchain_google_genai
Successfully installed filetype-1.2.0 langchain_google_genai-2.0.7


In [3]:
from langchain_google_genai import ChatGoogleGenerativeAI

In [6]:
# LangChain wrapper around the same Gemini model (not used by the cells below).
llm = ChatGoogleGenerativeAI(
    api_key=Google_api,
    model="gemini-2.0-flash-exp",
)

In [11]:
from google import genai
from google.genai import Client


In [12]:
model: str = "gemini-2.0-flash-exp"
client: Client = genai.Client(
    api_key = Google_api,
)

# **Step 2: Generate a Script with Timeline**

In [17]:
from IPython.display import  display, Audio, Markdown

In [22]:
# Upload the video using the File API.
# You can find more details about how to use it in the Get Started notebook.
# This can take a couple of minutes as the video needs to be processed and tokenized.

import time

def upload_video(video_file_name):
  # Upload the file passed in by the caller instead of a hard-coded path.
  video_file = client.files.upload(path=video_file_name)
  # Poll until the File API has finished processing the upload.
  while video_file.state == "PROCESSING":
      print('Waiting for video to be processed.')
      time.sleep(10)
      video_file = client.files.get(name=video_file.name or "")

  if video_file.state == "FAILED":
    raise ValueError(video_file.state)
  print('Video processing complete: ' + (video_file.uri or ""))

  return video_file

# video_1 is the Drive path defined earlier; the variable name is kept because later cells refer to it.
pottery_video = upload_video(video_1)

Waiting for video to be processed.
Video processing complete: https://generativelanguage.googleapis.com/v1beta/files/5q2bu2k5uauk
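
The polling loop above waits indefinitely if processing never finishes. A bounded variant might look like the sketch below. `wait_until_processed` and its parameters are hypothetical names introduced for illustration; the fetch function is passed in, so the helper makes no assumptions about the client beyond the `state` attribute already used in this notebook.

```python
import time


def wait_until_processed(get_file, name, poll_seconds=10.0,
                         timeout_seconds=600.0, sleep=time.sleep):
    """Poll get_file(name) until the state leaves PROCESSING or the timeout elapses.

    Hypothetical helper: get_file is any callable that returns an object
    with a `state` attribute (a string or an enum with a `.name`).
    """
    def state_of(file):
        # Accept either a plain string or an enum member.
        return str(getattr(file.state, "name", file.state))

    waited = 0.0
    file = get_file(name)
    while state_of(file) == "PROCESSING":
        if waited >= timeout_seconds:
            raise TimeoutError(f"{name} still processing after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
        file = get_file(name)

    if state_of(file) == "FAILED":
        raise ValueError(f"Processing failed for {name}")
    return file
```

With the client from this notebook, the call would presumably be `wait_until_processed(lambda n: client.files.get(name=n), video_file.name)`.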


In [23]:
from google.genai.types import Content, Part
prompt = """For each scene in this video,
            generate captions that describe the scene along with any spoken text placed in quotation marks.
            Place each caption into an object with the timecode of the caption in the video.
         """

video = pottery_video

response = client.models.generate_content(
    model=model,
    contents=[
        Content(
            role="user",
            parts=[
                Part.from_uri(
                    file_uri=video.uri or "",
                    mime_type=video.mime_type or ""),
                ]),
        prompt,
    ]
)

Markdown(response.text)

```json
[
    {
        "timecode": "0:00",
        "caption": "An aerial view of a beach with many people relaxing on the sand and swimming in the ocean. A green and yellow lifeguard tower is in the foreground and text overlayed says “THE OCEAN.” The narrator says, “The ocean is a force of nature, wild and untamed.”"
     },
    {
        "timecode":"0:03",
         "caption":"The aerial shot continues, and now text overlay says, “LOOK.” An animated character on a surfboard is added to the image. The narrator says, “Look at those towering waves crashing against the jagged cliffs with a fury that echoes through the air.”"
     },
    {
         "timecode":"0:07",
         "caption":"An aerial view of a rocky shore and the ocean with an eagle flying by. A red angry emoji is in the center of the image, and the narrator says, “Each surge sends sprays of foam soaring, a reminder of the power that lies beneath the surface.”"
     },
     {
         "timecode":"0:16",
        "caption":"Dark rain clouds fill the sky over a rural scene. A white cloud and yellow sun emoji is added to the center of the screen. The narrator says, “The sky churns with dark clouds, illuminated by lightning that splits the horizon.”"
     },
    {
         "timecode":"0:21",
         "caption":"The sky is full of dark, ominous clouds with rain coming down. Sushi is placed on the screen, and the narrator says, “a fleeting glimpse of nature’s raw energy.”"
     },
     {
         "timecode":"0:25",
        "caption":"Dark clouds and rain in the sky over a green, rural area. A raining cloud emoji is overlayed onto the scene. The narrator says, “Thunder rolls like a beast awakening, shaking the very ground beneath us.”"
     },
      {
        "timecode":"0:30",
        "caption":"A grey sky is filled with many birds flying over an urban landscape. A black flag emoji is added to the center of the screen. The narrator says, “In the chaos, sea birds fight to stay airborne.”"
     },
    {
        "timecode":"0:33",
        "caption":"A grey sky with many birds flying. A person riding a wheelchair emoji is added to the center of the screen. The narrator says, “Their silhouettes a testament to resilience.”"
     },
    {
         "timecode":"0:37",
         "caption":"A cruise ship in the ocean with a setting sun. The narrator says, “And there in the distance, a ship battles the storm.”"
     },
    {
         "timecode":"0:40",
         "caption":"A ship is on the horizon of the sea. A white cloud with a lightening bolt emoji is added. The narrator says, “Its sails torn and mast creaking.”"
     },
      {
         "timecode":"0:43",
        "caption":"The ocean is visible with a text and music note emoji overlayed. The narrator says, “A symbol of human determination against the elements.”"
     },
     {
         "timecode":"0:47",
         "caption":"A ship is being tossed around in a rough sea. A purple umbrella emoji with rain is added. The narrator says, “As the rain cascades down the rocks.”"
     },
     {
        "timecode": "0:49",
        "caption":"Water is falling off of rocks. A mountain landscape emoji is overlayed. The narrator says, “The scene intensifies culminating in a breathtaking close up of a wave.”"
     },
      {
         "timecode":"0:53",
         "caption":"A wave is crashing on the beach. An upward arrow emoji is in the middle of the screen and the narrator says, “breaking in slow motion.”"
     },
    {
        "timecode":"0:55",
        "caption":"A wave is crashing on the beach and a raindrop emoji is visible. The narrator says, “Each droplet catching the light.”"
     },
     {
         "timecode":"0:58",
        "caption":"A wave is crashing on the beach. A hand with painted fingernails emoji is in the center of the screen. The narrator says, “a fleeting moment of beauty amid the tempest.”"
     },
    {
        "timecode":"1:01",
         "caption":"A wave is crashing on the beach, and a red, angry emoji is in the center. The narrator says, “This is the ocean's fury.”"
     },
     {
        "timecode":"1:03",
        "caption":"The ocean is viewable with white foam. A red, angry emoji is in the center of the screen. The narrator says, “A reminder of our place in the world.”"
     }
 ]
```
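
The response above arrives as text wrapped in a Markdown code fence, so it has to be unwrapped before it can be used as data. The sketch below shows one way to do that; `parse_timeline` and `timecode_to_seconds` are hypothetical helpers, and the code assumes the model returns a valid JSON array, which is not guaranteed.

```python
import json
import re

# Build the fence marker programmatically so this example stays readable inside Markdown.
FENCE = "`" * 3


def parse_timeline(response_text: str) -> list:
    """Extract and decode the JSON array from a response that may be fenced."""
    pattern = FENCE + r"(?:json)?\s*(.*?)" + FENCE
    match = re.search(pattern, response_text, flags=re.DOTALL)
    payload = match.group(1) if match else response_text
    return json.loads(payload)


def timecode_to_seconds(timecode: str) -> int:
    """Convert 'M:SS', 'MM:SS', or 'H:MM:SS' into a number of seconds."""
    seconds = 0
    for part in timecode.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds
```

For example, `parse_timeline(response.text)` would give a list of dictionaries, and `timecode_to_seconds("1:03")` evaluates to `63`.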

# **Prompt 2:**
"Generate a video showcasing the effects of climate change, with rising sea levels, melting glaciers, and efforts to combat global warming."

In [24]:
#video path from google drive
video = "/content/drive/MyDrive/Colab Notebooks/Agentic AI code/Save _Planet.mp4"


In [25]:
# Upload the video using the File API.
# You can find more details about how to use it in the Get Started notebook.
# This can take a couple of minutes as the video needs to be processed and tokenized.

import time

def upload_video(video_file_name):
  # Upload the file passed in by the caller instead of a hard-coded path.
  video_file = client.files.upload(path=video_file_name)
  # Poll until the File API has finished processing the upload.
  while video_file.state == "PROCESSING":
      print('Waiting for video to be processed.')
      time.sleep(10)
      video_file = client.files.get(name=video_file.name or "")

  if video_file.state == "FAILED":
    raise ValueError(video_file.state)
  print('Video processing complete: ' + (video_file.uri or ""))

  return video_file

# video is the Drive path defined in the previous cell; the variable name is kept because the next cell refers to it.
pottery_video = upload_video(video)

Waiting for video to be processed.
Video processing complete: https://generativelanguage.googleapis.com/v1beta/files/4sixlegvap32


In [26]:
from google.genai.types import Content, Part
prompt = """For each scene in this video,
            generate captions that describe the scene along with any spoken text placed in quotation marks.
            Place each caption into an object with the timecode of the caption in the video.
         """

video = pottery_video

response = client.models.generate_content(
    model=model,
    contents=[
        Content(
            role="user",
            parts=[
                Part.from_uri(
                    file_uri=video.uri or "",
                    mime_type=video.mime_type or ""),
                ]),
        prompt,
    ]
)

Markdown(response.text)

```json
[
  {
    "timecode": "00:00",
    "caption": "An aerial view of a bridge with a road on it, over a body of water that has a lot of turbulent water movement. “Everyday our planet faces a growing crisis: climate change.”"
  },
  {
    "timecode": "00:07",
    "caption": "An aerial view of a barrier with water flowing through it.  “Rising sea levels are swallowing coastlines.”"
  },
    {
    "timecode": "00:08",
    "caption": "An aerial view of large pieces of ice floating in water, with snow-capped mountains in the distance. “Glaciers, once majestic and vast, are now disappearing at an alarming rate.”"
  },
    {
    "timecode": "00:14",
    "caption": "An aerial view of a large body of water, with land and mountains in the distance. “Extreme weather events are becoming our new reality.”"
  },
    {
    "timecode": "00:18",
    "caption": "An aerial view of a large body of water, with a light colored surface. “But it's not all doom and gloom.”"
  },
    {
    "timecode": "00:21",
    "caption": "A girl in a light blue shirt, black pants, and pink shoes planting a tree outside.  “Around the world, people are fighting back.”"
  },
  {
    "timecode": "00:24",
    "caption": "An aerial view of an area with a large amount of solar panels and wind turbines. “Adopting renewable energy, planting billions of trees, and innovating for a sustainable future.”"
  },
  {
    "timecode": "00:29",
     "caption": "Close up of people planting a tree. “Change starts with awareness, and awareness starts with you.”"
  },
   {
    "timecode": "00:33",
    "caption": "An aerial view of several wind turbines in a body of water. “Together, we can make a difference.”"
  },
   {
    "timecode": "00:36",
    "caption":"An aerial view of solar panels on a rooftop. “Act now because our planet doesn't have a pause button.”"
  }
]
```
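
The assignment asks for a script with a detailed timeline, and the caption list can be turned into one by adding scene durations. The sketch below is one possible approach; `to_seconds` and `build_script` are hypothetical helpers, and the sample captions are abbreviated from the output above.

```python
def to_seconds(timecode: str) -> int:
    """Convert an 'MM:SS' timecode into seconds."""
    minutes, seconds = timecode.split(":")
    return int(minutes) * 60 + int(seconds)


def build_script(entries: list) -> list:
    """Return one line per scene: start time, duration until the next scene, caption."""
    lines = []
    for i, entry in enumerate(entries):
        start = to_seconds(entry["timecode"])
        if i + 1 < len(entries):
            # Duration is the gap to the next scene's start time.
            span = f"{to_seconds(entries[i + 1]['timecode']) - start}s"
        else:
            # The last scene's length is unknown without the total video length.
            span = "to end"
        lines.append(f"[{entry['timecode']}] ({span}) {entry['caption']}")
    return lines


# Captions abbreviated from the model output above.
sample = [
    {"timecode": "00:00", "caption": "Aerial view of a bridge over turbulent water."},
    {"timecode": "00:07", "caption": "Aerial view of a barrier with water flowing through it."},
    {"timecode": "00:08", "caption": "Large pieces of ice floating in water."},
]
for line in build_script(sample):
    print(line)
```

The last scene is marked `to end` because the caption list alone does not say how long the video runs.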