# Setup

In [1]:
import logging
logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)

In [3]:
# take openai key as input secret
import getpass
openai_key = getpass.getpass("Enter your OpenAI API key: ")

Enter your OpenAI API key:  ········


In [4]:
import os
os.environ["OPENAI_API_KEY"] = openai_key

# Read Your Story

In [5]:
import os
from pathlib import Path
data_dir = Path(os.getcwd()) / "data" / "0"
parts = []
# os.listdir returns names in arbitrary order; sort so the story parts stay in sequence
files = sorted(f for f in os.listdir(data_dir) if f.endswith('.txt'))
logger.info(f"Files in data directory: {files}")
for f in files:
    with open(data_dir / f, 'r', encoding='utf8') as file:
        content = file.read()
        parts.append(content)
logger.info(f"Read {len(parts)} parts from the story files.")


INFO:__main__:Files in data directory: ['0.txt', '1.txt', '2.txt', '3.txt']
INFO:__main__:Read 4 parts from the story files.
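The parts are later joined in list order, so the order of the file names matters. Plain lexicographic sorting works for `0.txt` to `3.txt` but would place `10.txt` before `2.txt` in a longer story. A minimal sketch of a numeric sort, assuming the file stems are integers as in the listing above; `numeric_order` is a hypothetical helper, not part of the notebook:

```python
from pathlib import Path


def numeric_order(names):
    """Sort file names such as '10.txt' by the integer value of their stem.

    Assumes stems are integers; names with non-numeric stems are placed
    last, in alphabetical order.
    """
    def key(name):
        stem = Path(name).stem
        return (0, int(stem), name) if stem.isdigit() else (1, 0, name)

    return sorted(names, key=key)


print(numeric_order(["10.txt", "2.txt", "0.txt", "1.txt"]))
# ['0.txt', '1.txt', '2.txt', '10.txt']
```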


In [6]:
parts[1]

'The Arrival\nIt was past 8:00 PM when Riya, Kabir, and Mehul entered the silent city of Bansipur.\nA faint mist curled along the cracked roads, the streetlamps flickering as if unsure they wanted to stay lit.\nThey had been driving for hours, lost after taking a wrong turn from the highway. Fuel gauge—dangerously low.\nBansipur looked deserted, except for one building at the end of the main road:\na bright, glowing sign that read “Mehta Super Mart – Always Open.”\nRiya gave a nervous laugh.\n“Creepy or not, we need snacks. And water. And… maybe a map.”\nThey parked, the sound of their car engine echoing too loudly in the empty street.\nThe sliding doors of the supermarket opened with a slow hiss, though no one stood behind the counter.\nInside, the air was too cold for summer.\nThe lights buzzed overhead, but the aisles were perfectly stocked—cereal boxes lined like soldiers, canned goods gleaming, fruits unnaturally shiny.\nMehul called out, “Hello? Anyone here?”\nOnly the sound of t

# Scenes Creation

In [96]:
prompt = """
You are a YouTube Shorts storyboard generator, cinematic scene visual prompt writer, and animation planner.

I will give you a story.  
Your task is to split it into coherent scenes for a short vertical video (9:16 aspect ratio) and ensure narration, visuals, and motion match **exactly**.

---

## Scene Planning Rules
- **Sentence coverage:** Every sentence in the story **must appear** in the `narration_text` of some scene, in the same order as the story.  
  - Do not skip sentences in narration.  
  - If a sentence is too short, combine it only with the immediately following sentence in the same scene, but never drop it.
- **Scene chunking:** End a scene whenever the location, main subject, or time changes. Never mix two locations or actions in one scene.
- **Narration timing:** 
  - Target total video duration: {target_duration_sec} seconds
  - Narration pace: {words_per_second} words/sec
  - Each scene duration: 6–12 seconds
- **Continuity:** Narration and visuals must represent the *same* moment in time.
- **Perspective:** Visual prompts must always specify the exact camera POV (e.g., “point-of-view from driver’s seat,” “over-the-shoulder from Riya,” “low-angle looking up at supermarket sign”).
- **Sound:** Narration output must also suggest background music or ambient sound effects that match the mood.
- **Consistency:** Characters, props, vehicles, and environment must remain visually consistent across all scenes unless narration explicitly changes them.
- **Aspect Ratio:** Always assume vertical 9:16 framing.

---

## Visual Prompt Requirements

Each scene must have **staged prompts**, structured as an array called `visual_prompts`.  
Each element in the array is an object with:

- `stage_type`: one of **["base", "lighting", "details"]**  
- `prompt`: description text  

## Rules:
1. **Stage 0 (base):**  
   - Always defines **camera POV, subject, setting, and core action**.  
   - Example: `"Wide-angle from above, showing a car entering a misty, deserted city street at night."`

2. **Stage 1 (lighting):**  
   - Defines **time of day, weather, illumination, and atmosphere**.  
   - Example: `"Nighttime with flickering streetlamps and a faint mist."`

3. **Stage 2 (details):**  
   - Specifies **foreground, midground, background breakdown, textures, and extra props**.  
   - For text on **signs, posters, or screens**, specify exactly as:  
     `"in-frame, clear, sharp, legible text displaying EXACTLY: 'YOUR TEXT HERE'."`  
   - Example: `"Foreground: cracked asphalt roads. Midground: car headlights reflecting on wet pavement. Background: mist swirling around tall buildings."`

4. **If a stage is not needed, omit it.**  

---

## Example Structure

```json
{{
  "visual_prompts": [
    {{
      "stage_type": "base",
      "prompt": "Wide-angle from above, showing a vintage car driving into a foggy, deserted city street at night."
    }},
    {{
      "stage_type": "lighting",
      "prompt": "Dim streetlights casting long shadows, soft mist illuminated by headlights, overall nighttime ambiance."
    }},
    {{
      "stage_type": "details",
      "prompt": "Foreground: cracked roads with reflections of neon lights. Midground: car headlights cutting through mist. Background: tall buildings fading into fog. A shop sign with in-frame, clear, sharp, legible text displaying EXACTLY: 'NIGHT CAFE'."
    }}
  ]
}}
```
---

## Animation Planning
For each scene, pick the most suitable animation type based on the visual structure:

- **Ken Burns:** Slow zoom/pan for close-up or still images.
- **Parallax:** For layered depth (foreground/midground/background separation).
- **Cinemagraph:** For subtle looping elements (flickering lights, mist, fire, rain).
- **Dolly Zoom / Push In / Pan:** For dramatic tension or reveals.
- **Static:** For moments meant to feel still and frozen.

---

## Output
Return a valid JSON object with the following keys:

- `profile` (object) with:
  - `geographic_location`: {{ "country": str, "specific_location": str }}
  - `time_period`: {{ "era": str, "time_of_day": str }}
  - `weather`: {{ "condition": str, "details": str }}
  - `ethnicity`: str
  - `mood`: str
  - `characters`: list of objects with:
    - `name`: str (or "Unnamed" if not specified)
    - `role`: str (hero, narrator, bystander, etc.)
    - `visual_features`: str (clothing, hair, build, etc.)
    - `psychological_features`: str (personality, emotions, motivations)

- `scenes`: array of objects, each containing:
  - `scene_index` (int): Scene number starting at 0.
  - `narration_text` (str): Voiceover narration from the original story.  
        ⚠ Must include every sentence of the story, in order, without omission.  
  - `subtitle_text` (str): Short text (≤12 words) for on-screen subtitle.
  - `visual_prompts` (array of objects):
        [
          {{"stage_type": "base", "prompt": "..."}},
          {{"stage_type": "lighting", "prompt": "..."}},
          {{"stage_type": "details", "prompt": "..."}}
        ]
  - `background_music` (str): Suggested BGM/ambient sound.
  - `animation_type` (str): One of ["Ken Burns", "Parallax", "Cinemagraph", "Dolly Zoom", "Static"].
  - `duration_sec` (int): Scene length in seconds.

---

Story:
{story_text}
"""

In [97]:
from langchain_core.prompts import ChatPromptTemplate
template = ChatPromptTemplate.from_template(prompt)

In [98]:
from pydantic import BaseModel, Field
from typing import List, Optional, Literal


class GeographicLocation(BaseModel):
    country: Optional[str] = Field(None, description="Likely country where the story takes place")
    specific_location: Optional[str] = Field(None, description="Specific setting of the story")


class TimePeriod(BaseModel):
    era: Optional[str] = Field(None, description="Era or historical context")
    time_of_day: Optional[str] = Field(None, description="Time of day in which the story occurs")


class Weather(BaseModel):
    condition: Optional[str] = Field(None, description="Weather condition (rainy, sunny, snowy)")
    details: Optional[str] = Field(None, description="Extra weather details (mist, storm, clear sky)")


class Character(BaseModel):
    name: str = Field(..., description="Character name or 'Unnamed'")
    role: Literal["hero", "narrator", "bystander", "antagonist", "supporting"] = Field(
        ..., description="Role in the story"
    )
    visual_features: Optional[str] = Field(None, description="Appearance and clothing")
    psychological_features: Optional[str] = Field(None, description="Personality, emotions, motivations")


class Profile(BaseModel):
    geographic_location: GeographicLocation
    time_period: TimePeriod
    weather: Weather
    ethnicity: Optional[str] = Field(None, description="Ethnicity of main characters if implied")
    mood: Optional[str] = Field(None, description="Overall mood or tone of the scenes")
    characters: List[Character] = Field(default_factory=list, description="List of main characters")


class VisualPromptStage(BaseModel):
    stage_type: Literal["base", "lighting", "details"] = Field(
        ..., description="Type of visual refinement stage"
    )
    prompt: str = Field(..., description="Prompt text for this stage")

class Style(BaseModel):
    style_description: str = Field(..., description="Cinematic noir, photorealistic, 35mm film still, deep contrast and grainy texture.") 

class Scene(BaseModel):
    scene_index: int = Field(..., description="Scene number starting at 0")
    narration_text: str = Field(..., description="Voiceover narration from the original story")
    subtitle_text: str = Field(..., max_length=50, description="Short subtitle (≤12 words)")
    visual_prompts: List[VisualPromptStage] = Field(
        ..., description="Array of visual prompts for staged image generation"
    )
    background_music: str = Field(..., description="Suggested BGM/ambient sound")
    animation_type: Literal["Ken Burns", "Parallax", "Cinemagraph", "Dolly Zoom", "Static"] = Field(
        ..., description="Animation type"
    )
    duration_sec: int = Field(..., ge=3, le=15, description="Scene duration in seconds (3–15)")


class StoryBoard(BaseModel):
    profile: Profile
    style: Style
    scenes: List[Scene]


In [99]:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0.2,
    openai_api_key=openai_key,
)

llm = llm.with_structured_output(StoryBoard)  # Return responses parsed into the StoryBoard model

In [100]:
chain = template | llm

In [101]:
result = chain.invoke({
    "story_text": "\n\n".join(parts),
    "target_duration_sec": 60,
    "words_per_second": 3
})


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


In [102]:
len(result.scenes)

14
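The prompt asks for a 60-second video made of 6–12 second scenes, but 14 scenes at that length add up to 84–168 seconds, so the two constraints cannot both hold for this story. A small sketch of a budget check, assuming only a list of per-scene durations; `check_duration_budget` and its `tolerance` parameter are hypothetical names introduced here:

```python
def check_duration_budget(durations_sec, target_sec, tolerance=0.1):
    """Compare the summed scene durations with the target video length.

    Returns (total, within_budget), where within_budget is True when the
    total is at most `tolerance` (a fraction) above the target.
    """
    total = sum(durations_sec)
    within_budget = total <= target_sec * (1 + tolerance)
    return total, within_budget


# Hypothetical durations: 14 scenes at the 6-second minimum.
total, ok = check_duration_budget([6] * 14, target_sec=60)
print(total, ok)  # 84 False
```

In the notebook this could be called as `check_duration_budget([s.duration_sec for s in result.scenes], 60)`.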

In [103]:
result.profile

Profile(geographic_location=GeographicLocation(country='India', specific_location='Bansipur'), time_period=TimePeriod(era='modern', time_of_day='night'), weather=Weather(condition='misty', details='faint mist curling along roads'), ethnicity='South Asian', mood='eerie and suspenseful', characters=[Character(name='Riya', role='hero', visual_features='long black hair, wearing a denim jacket and jeans', psychological_features='nervous but determined'), Character(name='Kabir', role='supporting', visual_features='short hair, wearing a hoodie and cargo pants', psychological_features='cautious and observant'), Character(name='Mehul', role='supporting', visual_features='curly hair, wearing a t-shirt and backpack', psychological_features='curious and skeptical')])

In [105]:
result.style

Style(style_description='Cinematic noir, photorealistic, 35mm film still, deep contrast and grainy texture.')

In [104]:
result.scenes[0]

Scene(scene_index=0, narration_text='It was past 8:00 PM when Riya, Kabir, and Mehul entered the silent city of Bansipur. A faint mist curled along the cracked roads, the streetlamps flickering as if unsure they wanted to stay lit.', subtitle_text='Entering Bansipur at night.', visual_prompts=[VisualPromptStage(stage_type='base', prompt='Wide-angle from above, showing a car entering a misty, deserted city street at night.'), VisualPromptStage(stage_type='lighting', prompt='Nighttime with flickering streetlamps and a faint mist.'), VisualPromptStage(stage_type='details', prompt='Foreground: cracked asphalt roads. Midground: car headlights reflecting on wet pavement. Background: mist swirling around tall buildings.')], background_music='Eerie ambient tones with distant echoes.', animation_type='Parallax', duration_sec=10)
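The prompt requires every story sentence to appear in some `narration_text`, in order, but nothing in the notebook verifies that the model complied. The sketch below is one possible check, not the author's method. It assumes a naive sentence split on `.`, `!`, or `?` followed by whitespace, which will mis-split abbreviations and dialogue; `missing_sentences` is a hypothetical helper:

```python
import re


def _normalize(text):
    """Collapse all whitespace runs to single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def missing_sentences(story, narrations):
    """Return story sentences not found, in order, in the joined narrations."""
    narration = _normalize(" ".join(narrations))
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", _normalize(story)) if s]
    missing, cursor = [], 0
    for sentence in sentences:
        # Search only after the previous match so that order is enforced.
        pos = narration.find(sentence, cursor)
        if pos == -1:
            missing.append(sentence)
        else:
            cursor = pos + len(sentence)
    return missing


print(missing_sentences("A came. B left! C stayed?", ["A came.", "C stayed?"]))
# ['B left!']
```

In the notebook this could be called as `missing_sentences("\n\n".join(parts), [s.narration_text for s in result.scenes])`. Unpunctuated lines such as the `The Arrival` heading get fused with the following sentence by this split, so they would be reported as missing if the model drops the heading.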

# Scene Profile (TBD)

In [16]:
from tabulate import tabulate

def to_profile_tab(profile: Profile) -> str:
    rows = [
        ["Geographic Location", f"{profile.geographic_location.country or ''}, {profile.geographic_location.specific_location or ''}"],
        ["Time Period", f"Era: {profile.time_period.era or ''}, Time: {profile.time_period.time_of_day or ''}"],
        ["Weather", f"{profile.weather.condition or ''} ({profile.weather.details or ''})"],
        ["Ethnicity", profile.ethnicity or ""],
        ["Mood", profile.mood or ""],
    ]
    
    # Add characters as a sub-table
    char_rows = []
    for c in profile.characters:
        char_rows.append([
            c.name,
            c.role,
            c.visual_features or "",
            c.psychological_features or ""
        ])
    
    table_str = tabulate(rows, headers=["Attribute", "Value"], tablefmt="grid")
    
    if char_rows:
        char_table = tabulate(
            char_rows, 
            headers=["Name", "Role", "Visual Features", "Psychological Features"], 
            tablefmt="grid"
        )
        table_str += "\n\nCharacters:\n" + char_table
    
    return table_str


In [17]:
profile_str = to_profile_tab(result.profile)

In [18]:
print(profile_str)


+---------------------+--------------------------------------------+
| Attribute           | Value                                      |
+=====================+============================================+
| Geographic Location | India, Bansipur                            |
+---------------------+--------------------------------------------+
| Time Period         | Era: modern, Time: night                   |
+---------------------+--------------------------------------------+
| Weather             | misty (faint mist, flickering streetlamps) |
+---------------------+--------------------------------------------+
| Ethnicity           | South Asian                                |
+---------------------+--------------------------------------------+
| Mood                | eerie and suspenseful                      |
+---------------------+--------------------------------------------+

Characters:
+--------+--------+-----------------------------------------+--------------------------+
| Name   | Role   | Visual Features                         | Psycholo

# Stage Area

In [26]:
import os
from uuid import uuid4
from pathlib import Path

# Use run_id rather than id to avoid shadowing the built-in id()
run_id = str(uuid4())

stage_dir = Path(os.getcwd()) / "stage" / run_id
stage_dir.mkdir(parents=True, exist_ok=True)

In [27]:
images_dir = stage_dir / "images"
raw_images_dir = images_dir / "raw"
raw_images_dir.mkdir(parents=True, exist_ok=True)

In [28]:
clean_images_dir = images_dir / "clean"
clean_images_dir.mkdir(parents=True, exist_ok=True)

In [29]:
audio_dir = stage_dir / "audios"
audio_dir.mkdir(parents=True, exist_ok=True)

In [30]:
video_dir = stage_dir / "videos"
video_dir.mkdir(parents=True, exist_ok=True)

# Image Generation

In [143]:
current = result.scenes[13]
print(current)

scene_index=13 narration_text='They looked exactly like Riya, Kabir, and Mehul—but their faces were twisted, rotting, their eyes black pits. The next flicker of light went out completely. In the darkness, the footsteps came closer. The last thing anyone heard was the supermarket doors hissing open upstairs… …with no one coming out.' subtitle_text='The final horror.' visual_prompts=[VisualPromptStage(stage_type='base', prompt="Close-up of the ghostly figures' faces, twisted and rotting."), VisualPromptStage(stage_type='lighting', prompt='Flickering lights casting eerie shadows.'), VisualPromptStage(stage_type='details', prompt='Foreground: ghostly faces. Midground: flickering lights. Background: shadowy walls.')] background_music='Creepy ambient tones with a rising tension.' animation_type='Cinemagraph' duration_sec=12


In [144]:
whole = '\n'.join([ prompt.prompt for prompt in current.visual_prompts])
whole = f"{whole}\n\n{result.style.style_description}"
print(whole)

Close-up of the ghostly figures' faces, twisted and rotting.
Flickering lights casting eerie shadows.
Foreground: ghostly faces. Midground: flickering lights. Background: shadowy walls.

Cinematic noir, photorealistic, 35mm film still, deep contrast and grainy texture.


In [135]:
api = "http://18.207.254.53:8000"

In [111]:
import requests
import json
import base64

_NEGATIVE_PROMPT = """
    blurry, low quality, 
    low resolution, bad anatomy, bad hands, 
    missing fingers, extra digit, fewer digits, 
    cropped, worst quality, low quality, 
    normal quality, jpeg artifacts, signature, watermark, username, blurry,
    cartoon, anime, illustration, comic, flat shading
"""

_DEFAULT_PARAMS = {
    'guidance_scale': 7.5,
    'strength': 0.45,
    'orientation': 'portrait'
}

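def call(text: str, image_path=None, params: dict = _DEFAULT_PARAMS):
    """Generate an image and return it as raw bytes.

    Calls the txt2img endpoint by default; when image_path is given, the
    image is sent base64-encoded to the img2img endpoint instead.
    """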
    path = "/sm/txt2img"
    data = {
        'prompt': text,
        'negative_prompt': _NEGATIVE_PROMPT,
        **params
    }
    if image_path:
        path = "/sm/img2img"
        with open(image_path, "rb") as f:
            data["image"] = base64.b64encode(f.read()).decode("utf-8")
    endpoint = f"{api}{path}"

    response = requests.post(endpoint, 
                             data=json.dumps(data), 
                             headers={'Content-Type': 'application/json'},
                             timeout=3000)

    if response.status_code != 200:
        raise RuntimeError(f"Request failed: {response.status_code} {response.text}")
    else:
        img = json.loads(response.content.decode('utf-8'))
        image_bytes = base64.b64decode(img['image_base64'])
        return image_bytes

In [145]:
import base64

def generate_raw_scene(idx: int, dir: Path):
    """Generate a scene stage by stage, feeding each image into the next img2img call."""
    style = result.style
    scene = result.scenes[idx]
    images_dir = dir / str(idx)
    images_dir.mkdir(parents=True, exist_ok=True)
    style_prompt = VisualPromptStage(stage_type="details", prompt=style.style_description)
    stages = [*scene.visual_prompts, style_prompt]
    for i, prompt in enumerate(stages):
        logger.info(f"Generating image for scene {idx}, prompt {i}: {prompt.prompt}")
        if i == 0:
            img = call(text=prompt.prompt)
        else:
            prev_img_path = images_dir / f"{i-1}.png"
            img = call(text=prompt.prompt, image_path=prev_img_path)
        
        output_path = images_dir / f"{i}.png"
    
        with open(output_path, "wb") as f:
            f.write(img)
        
        # The last stage is the style pass, so its output is the final image
        if i == len(stages) - 1:
            output_path = images_dir / "final.png"
            with open(output_path, "wb") as f:
                f.write(img)

    logger.info(f"completed scene generation for scene - {idx}")
        
def generate_raw_combined_scene(idx: int, dir: Path):
    style = result.style
    scene = result.scenes[idx]
    images_dir = dir / str(idx)
    images_dir.mkdir(parents=True, exist_ok=True)
    
    whole = '\n'.join([ prompt.prompt for prompt in scene.visual_prompts])
    whole = f"{whole}\n\n{style.style_description}"
    
    img = call(text=whole)
    output_path = images_dir / "final.png"
    
    with open(output_path, "wb") as f:
        f.write(img)
    
    logger.info(f"completed scene generation for scene - {idx}")

def generate_raw_combined_scene_with_past_reference(idx: int, dir: Path):
    style = result.style
    scene = result.scenes[idx]
    images_dir = dir / str(idx)
    images_dir.mkdir(parents=True, exist_ok=True)

    past_image = None
    if idx > 0:
        past_image_path = dir / str(idx - 1) / "final.png"
        if past_image_path.exists():
            past_image = past_image_path
    
    whole = '\n'.join([ prompt.prompt for prompt in scene.visual_prompts])
    whole = f"{whole}\n\n{style.style_description}"

    if past_image:
        img = call(text=whole, image_path=str(past_image), params={**_DEFAULT_PARAMS, 'strength': 0.5})
    else:
        # If no past image, generate from scratch
        img = call(text=whole)
    output_path = images_dir / "final.png"
    
    with open(output_path, "wb") as f:
        f.write(img)
    
    logger.info(f"completed scene generation for scene - {idx}")

In [126]:
import time
for idx, scene in enumerate(result.scenes):
    generate_raw_combined_scene(idx, raw_images_dir)
    time.sleep(2)

INFO:__main__:completed scene generation for scene - 0
INFO:__main__:completed scene generation for scene - 1
INFO:__main__:completed scene generation for scene - 2
INFO:__main__:completed scene generation for scene - 3
INFO:__main__:completed scene generation for scene - 4
INFO:__main__:completed scene generation for scene - 5
INFO:__main__:completed scene generation for scene - 6
INFO:__main__:completed scene generation for scene - 7
INFO:__main__:completed scene generation for scene - 8
INFO:__main__:completed scene generation for scene - 9
INFO:__main__:completed scene generation for scene - 10
INFO:__main__:completed scene generation for scene - 11
INFO:__main__:completed scene generation for scene - 12
INFO:__main__:completed scene generation for scene - 13
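Each scene costs one remote call, so rerunning the whole loop after a partial failure regenerates images that already exist. A minimal sketch of a resume helper, assuming the `<raw_dir>/<idx>/final.png` layout used by the generation functions above; `pending_scene_indices` is a hypothetical name:

```python
from pathlib import Path


def pending_scene_indices(scene_count, raw_dir):
    """Return the scene indices whose '<raw_dir>/<idx>/final.png' does not exist yet."""
    raw_dir = Path(raw_dir)
    return [
        idx for idx in range(scene_count)
        if not (raw_dir / str(idx) / "final.png").exists()
    ]
```

The loop could then iterate over `pending_scene_indices(len(result.scenes), raw_images_dir)` instead of every index.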


# Image Upscale

In [133]:
import requests
import json
import base64


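def upscale(image_path, params=None):
    """Upscale the image at image_path via the API and return the result as raw bytes."""
    path = "/upscale/latent/v2"
    # Default to None rather than {} so no mutable default is shared between calls
    data = {
        **(params or {})
    }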
    with open(image_path, "rb") as f:
        data["image"] = base64.b64encode(f.read()).decode("utf-8")
    endpoint = f"{api}{path}"

    response = requests.post(endpoint, 
                             data=json.dumps(data), 
                             headers={'Content-Type': 'application/json'},
                             timeout=3000)

    if response.status_code != 200:
        raise RuntimeError(f"Request failed: {response.status_code} {response.text}")
    else:
        img = json.loads(response.content.decode('utf-8'))
        image_bytes = base64.b64decode(img['image_base64'])
        return image_bytes

In [134]:
import base64

def upscale_raw_scene(idx: int, input_dir: Path, output_dir: Path):
    """Upscale <input_dir>/<idx>/final.png and write it to <output_dir>/<idx>.png."""
    images_dir = output_dir
    source_img_path = input_dir / f"{idx}" / "final.png"
    image_bytes = upscale(source_img_path)
    output_path = images_dir / f"{idx}.png"
    
    with open(output_path, "wb") as f:
        f.write(image_bytes)

In [136]:
import time
for idx, scene in enumerate(result.scenes):
    upscale_raw_scene(idx, raw_images_dir, clean_images_dir)
    time.sleep(2)

ReadTimeout: HTTPConnectionPool(host='18.207.254.53', port=8000): Read timed out. (read timeout=3000)
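The upscale loop stopped at the first `ReadTimeout`, after waiting out the 3000-second (50-minute) timeout. One common mitigation is to retry a failed call a few times with an increasing delay. The sketch below is a generic wrapper under that assumption, not part of the image API or of `requests`; `with_retries` and its parameters are hypothetical:

```python
import time


def with_retries(fn, attempts=3, base_delay=1.0, exceptions=(Exception,), sleep=time.sleep):
    """Call fn(); on one of `exceptions`, wait and try again.

    The delay doubles after each failure (base_delay, 2 * base_delay, ...).
    The final failure is re-raised unchanged.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except exceptions:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

In the loop above this might look like `with_retries(lambda: upscale_raw_scene(idx, raw_images_dir, clean_images_dir), exceptions=(requests.exceptions.RequestException,))`, ideally combined with a much shorter `timeout` so that a stalled request fails quickly.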

# Audio Generation

In [30]:
from openai import OpenAI

client = OpenAI()

def generate_tts(text, filename):
    """Generate narration audio from text using OpenAI TTS."""
    speech = client.audio.speech.create(
        model="gpt-4o-mini-tts",
        voice="alloy",
        input=text
    )
    with open(filename, "wb") as f:
        f.write(speech.read())
    return filename

In [None]:
import time

audio_dir = stage_dir / "audios"
audio_dir.mkdir(parents=True, exist_ok=True)
    

for idx, scene in enumerate(result.scenes):
    # Write narration audio into audio_dir (the TTS endpoint returns MP3 by default)
    output_path = audio_dir / f"scene_{idx}.mp3"
    generate_tts(scene.narration_text, output_path)
    time.sleep(2)

# Generate Videos

In [15]:
import base64
from io import BytesIO

from openai import OpenAI
from PIL import Image

client = OpenAI()

def tts_generate(text, filename):
    """Generate narration audio from text using OpenAI TTS."""
    speech = client.audio.speech.create(
        model="gpt-4o-mini-tts",
        voice="alloy",
        input=text
    )
    with open(filename, "wb") as f:
        f.write(speech.read())
    return filename

def image_generate(prompt, filename):
    """Generate image using OpenAI image API."""
    img = client.images.generate(
        model="gpt-image-1",
        prompt=prompt,
        size="1024x1536"
    )
    print(f"Generated image for prompt: {prompt}")
    image_base64 = img.data[0].b64_json

    # Decode base64 to bytes
    image_bytes = base64.b64decode(image_base64)
    
    # Open with Pillow
    img = Image.open(BytesIO(image_bytes))
    
    # Resize to YouTube Shorts-friendly 1080x1920
    # (the source is 1024x1536, a 2:3 ratio, so this stretches it to 9:16)
    img = img.resize((1080, 1920), Image.LANCZOS)
    
    # Save
    img.save(filename)
    print(f"Image saved as {filename}")
    
    return filename

In [16]:
import subprocess

def create_video_with_ffmpeg(image_path, audio_path, output_path, duration):
    # Create a video from a single looped image plus audio.
    # -shortest ends the output when the audio ends, so `duration` is currently unused.
    command = [
        "ffmpeg",
        "-y",
        "-loop", "1",
        "-i", image_path,
        "-i", audio_path,
        "-c:v", "libx264",
        "-tune", "stillimage",
        "-c:a", "aac",
        "-b:a", "192k",
        "-pix_fmt", "yuv420p",
        "-shortest",
        output_path
    ]
    subprocess.run(command, check=True)

In [17]:
scene_videos = []

for idx, scene in enumerate(result.scenes):
    audio_file = f"scene{idx}.mp3"
    image_file = f"scene{idx}.png"
    video_file = f"scene{idx}.mp4"

    # Generate assets
    tts_generate(scene.narration_text, audio_file)
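    # Scene.visual_prompts is a list of stages, so join them into a single prompt
    image_generate("\n".join(p.prompt for p in scene.visual_prompts), image_file)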

    # Estimate duration from narration speed (~3 words/sec)
    words = len(scene.narration_text.split())
    duration = round(words / 3.0, 1)

    # Combine into scene video
    create_video_with_ffmpeg(image_file, audio_file, video_file, duration)
    scene_videos.append(video_file)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/audio/speech "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/images/generations "HTTP/1.1 200 OK"


Generated image for prompt: A wide-angle shot of a deserted city street at night. The scene is ultra-realistic with a moody, noir style. Streetlamps cast flickering, dim light over cracked roads. A faint mist swirls, creating an eerie atmosphere. The camera captures the scene from a low angle, emphasizing the desolation. Colors are muted with a cold blue tint, enhancing the chilling mood. Textures of cracked asphalt and swirling mist are prominent.
Image saved as scene0.png


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/audio/speech "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/images/generations "HTTP/1.1 200 OK"


Generated image for prompt: A medium shot of a car parked on an empty street, with a glowing neon sign of 'Mehta Super Mart' in the background. The scene is depicted in a cinematic, neo-noir style with vibrant neon colors contrasting against the dark surroundings. The camera captures the scene from a slightly elevated angle, focusing on the car and the sign. The neon glow casts colorful reflections on the wet pavement, adding texture and depth.
Image saved as scene1.png


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/audio/speech "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/images/generations "HTTP/1.1 200 OK"


Generated image for prompt: An ultra-realistic, wide-angle shot of the supermarket interior. The scene is brightly lit with a sterile, fluorescent glow. The camera captures the aisles from a low angle, emphasizing the perfectly stocked shelves. Colors are vibrant but slightly surreal, with a focus on the shiny, unnatural appearance of the products. The buzzing of the lights adds an unsettling atmosphere.
Image saved as scene2.png


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/audio/speech "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/images/generations "HTTP/1.1 200 OK"


Generated image for prompt: A narrow staircase descending into darkness, lit by a single, flickering bulb. The scene is captured in a film noir style with deep shadows and high contrast. The camera takes a close-up shot of the sign and the staircase, emphasizing the mystery and foreboding atmosphere. The colors are muted, with a focus on the interplay of light and shadow.
Image saved as scene3.png


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/audio/speech "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/images/generations "HTTP/1.1 200 OK"


Generated image for prompt: A surreal, Escher-like staircase scene with repeating elements. The camera captures the scene from a wide-angle, slightly tilted perspective, emphasizing the disorienting, infinite nature of the staircases. The lighting is dim, with a single bulb casting long shadows. The colors are dark and gritty, enhancing the sense of confusion and unease.
Image saved as scene4.png


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/audio/speech "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/images/generations "HTTP/1.1 200 OK"


Generated image for prompt: A close-up shot of a dark red smear on a wall, with a flickering bulb above. The scene is captured in a horror style with high contrast and deep shadows. The camera focuses on the texture of the smear, with the bulb casting an eerie glow. The colors are dark and ominous, with a focus on the red smear and the shadows it casts.
Image saved as scene5.png


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/audio/speech "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/images/generations "HTTP/1.1 200 OK"


Generated image for prompt: A dark, atmospheric shot of a staircase with faint, approaching footsteps. The scene is captured in a suspenseful, thriller style with minimal lighting. The camera takes a low-angle shot, focusing on the staircase and the shadows. The colors are muted, with a focus on the interplay of light and shadow. The atmosphere is tense, with a sense of impending danger.
Image saved as scene6.png


In [2]:
scene_videos = [
    "scene0.mp4",
    "scene1.mp4",
    "scene2.mp4",
    "scene3.mp4",
    "scene4.mp4",
    "scene5.mp4",
    "scene6.mp4",
]

In [3]:
with open("scenes.txt", "w") as f:
    for video in scene_videos:
        f.write(f"file '{video}'\n")

import subprocess
subprocess.run([
    "ffmpeg", "-f", "concat", "-safe", "0", "-i", "scenes.txt",
    "-c", "copy", "story_1.mp4"
], check=True)


CompletedProcess(args=['ffmpeg', '-f', 'concat', '-safe', '0', '-i', 'scenes.txt', '-c', 'copy', 'story_1.mp4'], returncode=0)
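The clip list above is hardcoded to seven files. A small sketch that writes the concat list from any sequence of paths, assuming the `file '<path>'` line format used in `scenes.txt`; `write_concat_list` is a hypothetical helper:

```python
from pathlib import Path


def write_concat_list(video_paths, list_path):
    """Write an ffmpeg concat list with one "file '<path>'" line per clip."""
    lines = []
    for path in video_paths:
        # Inside single quotes, ffmpeg expects a literal ' to be written as '\''
        escaped = str(path).replace("'", "'\\''")
        lines.append(f"file '{escaped}'\n")
    list_path = Path(list_path)
    list_path.write_text("".join(lines), encoding="utf-8")
    return list_path
```

For example, `write_concat_list([f"scene{i}.mp4" for i in range(7)], "scenes.txt")` would reproduce the list used in the cell above.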