# SORA 2 with Azure AI Foundry

<img src="logo.webp">

## Overview
This notebook demonstrates how to generate videos using the Sora-2 model integrated with Azure AI Foundry. It covers:
- Authentication with Microsoft Entra ID.
- Connecting to Azure OpenAI endpoints.
- Using Python to create and download AI-generated videos.

## References
- https://learn.microsoft.com/en-us/azure/ai-foundry/
- https://azure.microsoft.com/en-us/blog/sora-2-now-available-in-azure-ai-foundry/
- https://openai.com/index/sora-2/
- https://openai.com/index/sora-2-system-card/

In [1]:
import datetime
import os
import shutil
import sys
import time
import zipfile

from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from dotenv import load_dotenv
from IPython.display import FileLink, Video
from openai import OpenAI

In [2]:
sys.version

'3.10.18 (main, Jun  5 2025, 13:14:17) [GCC 11.2.0]'

In [3]:
print(f"Today is {datetime.datetime.today().strftime('%d-%b-%Y %H:%M:%S')}")

Today is 20-Oct-2025 13:34:03


## 1. Settings

In [4]:
load_dotenv("azure.env")

True
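
The `True` above does not by itself confirm that `AZURE_OPENAI_ENDPOINT`, the only variable this notebook reads later, is defined. A minimal sketch of an explicit check follows; `require_env` is a hypothetical helper name, not part of `python-dotenv`.

```python
import os


def require_env(name: str) -> str:
    """Return the stripped value of an environment variable.

    Raises RuntimeError when the variable is unset or empty, so a missing
    setting fails here instead of producing a malformed URL later.
    """
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to azure.env")
    return value
```

Calling `require_env("AZURE_OPENAI_ENDPOINT")` right after `load_dotenv` would surface a missing setting before any request is made.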

In [5]:
# Set up Microsoft Entra ID authentication
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")

In [6]:
client = OpenAI(
    base_url=f"{os.getenv('AZURE_OPENAI_ENDPOINT')}/openai/v1/",
    api_key=token_provider,
)
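
The f-string above produces `None/openai/v1/` when the variable is missing, and a double slash when the endpoint already ends with `/`. One way to guard against both, sketched under the assumption that the endpoint is a plain resource URL; `build_base_url` is a hypothetical helper.

```python
from typing import Optional


def build_base_url(endpoint: Optional[str]) -> str:
    """Join an Azure OpenAI resource endpoint with the /openai/v1/ path.

    Rejects a missing endpoint and tolerates a trailing slash.
    """
    if not endpoint:
        raise ValueError("AZURE_OPENAI_ENDPOINT is empty or not set")
    return endpoint.rstrip("/") + "/openai/v1/"
```

The result can be passed as `base_url` in place of the f-string, for example `build_base_url(os.getenv("AZURE_OPENAI_ENDPOINT"))`.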

In [7]:
VIDEO_DIR = "videos"

if os.path.exists(VIDEO_DIR) and os.path.isdir(VIDEO_DIR):
    shutil.rmtree(VIDEO_DIR)

os.makedirs(VIDEO_DIR, exist_ok=True)

## 2. Helper

In [8]:
def sora2(prompt):
    """
    Generate a video with the Sora-2 model and save it under VIDEO_DIR.

    Args:
        prompt: Text description for video generation

    Returns:
        Path of the saved .mp4 file on success, False otherwise
    """
    start = time.time()

    # Create the video job with a fixed resolution and duration
    print("===== 🎨 Creating SORA-2 video using Azure AI Foundry =====\n")
    print(f"Prompt: {prompt}")

    try:
        video = client.videos.create(
            model="sora-2",  # The name of your Sora-2 deployment in Azure AI Foundry
            prompt=prompt,
            size="1280x720",  # Resolution: 1280x720 or 720x1280
            seconds="8",  # Duration in seconds: "4", "8" or "12"
        )

        print(f"📹 Video ID: {video.id}")
        print(f"⏳ Initial Status: {video.status}\n")

        # Poll for completion
        while video.status not in ["completed", "failed", "cancelled"]:
            now = datetime.datetime.now().strftime('%d-%b-%Y %H:%M:%S')
            print(f"[{now}] ⏱️ Status: {video.status}")
            time.sleep(10)  # Pause
            video = client.videos.retrieve(video.id)

        # Handle final status
        if video.status == "completed":
            print("\n✨ Video generation completed!")
            print("📥 Downloading video")
            content = client.videos.download_content(video.id, variant="video")
            sora2_videofile = os.path.join(
                VIDEO_DIR,
                f"sora2_video_{datetime.datetime.now().strftime('%d%b%Y_%H%M%S')}.mp4"
            )
            content.write_to_file(sora2_videofile)
            minutes, seconds = divmod((time.time() - start), 60)
            print(f"⏱️ Done in {minutes:.0f} minutes and {seconds:.0f} seconds")
            print(f"\n✅ Video saved to: {sora2_videofile}")
            return sora2_videofile

        elif video.status == "failed":
            print("\n❌ Video generation failed!")
            return False

        elif video.status == "cancelled":
            print("\n⚠️ Video generation was cancelled")
            return False

    except Exception as e:
        print(f"\n🚨 Error occurred: {str(e)}")
        return False
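
The polling loop in `sora2` runs until the job reaches a terminal status, with no upper bound on the wait. A possible variant with a deadline is sketched below. `wait_for_video`, `TERMINAL_STATUSES`, and the 600-second default are assumptions made for illustration; only the three status strings and the 10-second pause come from the helper above.

```python
import time

# Statuses that end the polling loop in the helper above
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_video(retrieve, video_id, timeout_s=600.0, poll_s=10.0, sleep=time.sleep):
    """Poll retrieve(video_id) until a terminal status or until timeout_s elapses.

    retrieve is any callable returning an object with a .status attribute.
    sleep is injectable so the loop can be exercised without real waiting.
    """
    deadline = time.monotonic() + timeout_s
    video = retrieve(video_id)
    while video.status not in TERMINAL_STATUSES:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{video_id} still '{video.status}' after {timeout_s} s")
        sleep(poll_s)
        video = retrieve(video_id)
    return video
```

Inside `sora2`, the `while` loop could then be replaced by `video = wait_for_video(client.videos.retrieve, video.id)`; the existing `except Exception` branch would also catch the `TimeoutError`.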

## 3. Text-to-video examples

In [9]:
prompt = """
Advertising luxury perfume with a blond female model holding the perfume with Paris sunset behind her. 
The perfume name is 'Velour Eclipse'.
She is saying 'Velour Eclipse perfume from Paris: Wear the eclipse, Own the night.'.
"""

In [10]:
sora2_videofile = sora2(prompt)

===== 🎨 Creating SORA-2 video using Azure AI Foundry =====

Prompt: 
Advertising luxury perfume with a blond female model holding the perfume with Paris sunset behind her. 
The perfume name is 'Velour Eclipse'.
She is saying 'Velour Eclipse perfume from Paris: Wear the eclipse, Own the night.'.

📹 Video ID: video_68f63a521ddc8190bf57d6877d4b88a2
⏳ Initial Status: queued

[20-Oct-2025 13:34:10] ⏱️ Status: queued
[20-Oct-2025 13:34:20] ⏱️ Status: in_progress
[20-Oct-2025 13:34:30] ⏱️ Status: in_progress
[20-Oct-2025 13:34:41] ⏱️ Status: in_progress
[20-Oct-2025 13:34:51] ⏱️ Status: in_progress
[20-Oct-2025 13:35:01] ⏱️ Status: in_progress
[20-Oct-2025 13:35:12] ⏱️ Status: in_progress
[20-Oct-2025 13:35:22] ⏱️ Status: in_progress
[20-Oct-2025 13:35:32] ⏱️ Status: in_progress
[20-Oct-2025 13:35:43] ⏱️ Status: in_progress

✨ Video generation completed!
📥 Downloading video
⏱️ Done in 1 minutes and 49 seconds

✅ Video saved to: videos/sora2_video_20Oct2025_133557.mp4


In [11]:
Video(sora2_videofile, width=1024)

In [12]:
video_link = FileLink(path=sora2_videofile)
video_link
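
Because `sora2` returns `False` when generation fails, passing its result straight to `Video` or `FileLink` would raise an error in that case. A small guard could look like the sketch below; `playable_path` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional, Union


def playable_path(result: Union[str, bool, None]) -> Optional[Path]:
    """Return a Path when result names an existing, non-empty file, else None."""
    if not isinstance(result, str) or not result:
        return None
    path = Path(result)
    if path.is_file() and path.stat().st_size > 0:
        return path
    return None
```

A display cell might then call `Video(str(path), width=1024)` only when `playable_path(sora2_videofile)` is not `None`.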

### Another example

In [13]:
prompt = """
Clip on a sunlit beach boardwalk. 
Three diverse models walk in slow-motion wearing our summer '25 linen collection. 
Golden hour lighting, light acoustic pop track, quick jump-cuts between outfits. 
Animated title card: 'Enjoy!' with a brand logo 'Florida'.
"""

In [14]:
sora2_videofile = sora2(prompt)

===== 🎨 Creating SORA-2 video using Azure AI Foundry =====

Prompt: 
Clip on a sunlit beach boardwalk. 
Three diverse models walk in slow-motion wearing our summer '25 linen collection. 
Golden hour lighting, light acoustic pop track, quick jump-cuts between outfits. 
Animated title card: 'Enjoy!' with a brand logo 'Florida'.

📹 Video ID: video_68f63af0a2a48190a74c3c0dec1d517a
⏳ Initial Status: queued

[20-Oct-2025 13:36:48] ⏱️ Status: queued
[20-Oct-2025 13:36:58] ⏱️ Status: in_progress
[20-Oct-2025 13:37:09] ⏱️ Status: in_progress
[20-Oct-2025 13:37:19] ⏱️ Status: in_progress
[20-Oct-2025 13:37:29] ⏱️ Status: in_progress
[20-Oct-2025 13:37:39] ⏱️ Status: in_progress
[20-Oct-2025 13:37:50] ⏱️ Status: in_progress
[20-Oct-2025 13:38:00] ⏱️ Status: in_progress
[20-Oct-2025 13:38:11] ⏱️ Status: in_progress
[20-Oct-2025 13:38:22] ⏱️ Status: in_progress
[20-Oct-2025 13:38:32] ⏱️ Status: in_progress
[20-Oct-2025 13:38:43] ⏱️ Status: in_progress

✨ Video generation completed!
📥 Downloading vi

In [15]:
Video(sora2_videofile, width=1024)

In [16]:
video_link = FileLink(path=sora2_videofile)
video_link

### Another example

In [17]:
prompt = """
Style: Hand-painted 2D/3D hybrid animation with soft brush textures, warm tungsten lighting, and a tactile, stop-motion feel. 
The aesthetic evokes mid-2000s storybook animation — cozy, imperfect, full of mechanical charm. 
Subtle watercolor wash and painterly textures; warm–cool balance in grade; filmic motion blur for animated realism.
Inside a cluttered workshop, shelves overflow with gears, bolts, and yellowing blueprints.
At the center, a small round robot sits on a wooden bench, its dented body patched with mismatched plates and old paint layers. 
Its large glowing eyes flicker pale blue as it fiddles nervously with a humming light bulb. 
The air hums with quiet mechanical whirs, rain patters on the window, and the clock ticks steadily in the background.

Cinematography:
Camera: medium close-up, slow push-in with gentle parallax from hanging tools
Lens: 35 mm virtual lens; shallow depth of field to soften background clutter
Lighting: warm key from overhead practical; cool spill from window for contrast
Mood: gentle, whimsical, a touch of suspense

Actions:
- The robot taps the bulb; sparks crackle.
- It flinches, dropping the bulb, eyes widening.
- The bulb tumbles in slow motion; it catches it just in time.
- A puff of steam escapes its chest — relief and pride.
- Robot says quietly: "Almost lost it… but I got it!"

Background Sound:
Rain, ticking clock, soft mechanical hum, faint bulb sizzle.

Add a light mention at the bottom of the video: 'Generated with SORA-2'.
"""

In [18]:
sora2_videofile = sora2(prompt)

===== 🎨 Creating SORA-2 video using Azure AI Foundry =====

Prompt: 
Style: Hand-painted 2D/3D hybrid animation with soft brush textures, warm tungsten lighting, and a tactile, stop-motion feel. 
The aesthetic evokes mid-2000s storybook animation — cozy, imperfect, full of mechanical charm. 
Subtle watercolor wash and painterly textures; warm–cool balance in grade; filmic motion blur for animated realism.
Inside a cluttered workshop, shelves overflow with gears, bolts, and yellowing blueprints.
At the center, a small round robot sits on a wooden bench, its dented body patched with mismatched plates and old paint layers. 
Its large glowing eyes flicker pale blue as it fiddles nervously with a humming light bulb. 
The air hums with quiet mechanical whirs, rain patters on the window, and the clock ticks steadily in the background.

Cinematography:
Camera: medium close-up, slow push-in with gentle parallax from hanging tools
Lens: 35 mm virtual lens; shallow depth of field to soften backgr

In [19]:
Video(sora2_videofile, width=1024)

In [20]:
video_link = FileLink(path=sora2_videofile)
video_link

### Another example

In [21]:
prompt = """
Style
Genre: 1950s gritty police action thriller in black and white.
Look: Shot on 35 mm with high-contrast Kodak stock; cool steel-blue grade with neon accents.
Texture: Heavy grain, slight jitter for tension; lens flares from sirens and streetlights.
Atmosphere: Rain-slick rooftops, steam vents, and flashing red-blue lights cutting through shadows.

Scene Setup
Location: Same brick tenement rooftop, but now it’s a crime stakeout at night.
Props: Laundry lines replaced with tactical ropes and scattered evidence markers; fairy bulbs swapped for harsh floodlights and pulsing sirens.
Backdrop: Helicopter searchlights sweep across the skyline; distant gunfire and sirens echo below.

Characters
Detective: Worn leather jacket, badge glinting; grips a service revolver.
Partner: Tactical vest, scanning the perimeter with a flashlight.
Antagonist: Silhouette darting between sheets, gun drawn.

Cinematography
Camera: Tight tracking shot, low angle for intensity; whip pans during sudden movements.
Lens: 28 mm for dynamic action; deep focus to capture chaos.
Lighting: Harsh tungsten mixed with strobing police lights; steam backlight for drama.
Mood: Tense, urgent, adrenaline-charged.

Actions
Detective crouches, whispers: “He's here. Be careful.”
Partner signals silently, then bursts forward as a shadow moves.
Gunfire erupts; sheets whip violently in the wind, obscuring the chase.
Antagonist leaves by the stairs; detective follows in a desperate sprint.

Soundscape
Sirens wail, helicopter blades thrum overhead.
Gunshots crack, ricocheting off brick.
Rain patters on metal; distant shouting and screeching tires.

Add a light mention at the bottom of the video: 'Generated with SORA-2'.
"""

In [22]:
sora2_videofile = sora2(prompt)

===== 🎨 Creating SORA-2 video using Azure AI Foundry =====

Prompt: 
Style
Genre: 1950s gritty police action thriller in black and white.
Look: Shot on 35 mm with high-contrast Kodak stock; cool steel-blue grade with neon accents.
Texture: Heavy grain, slight jitter for tension; lens flares from sirens and streetlights.
Atmosphere: Rain-slick rooftops, steam vents, and flashing red-blue lights cutting through shadows.

Scene Setup
Location: Same brick tenement rooftop, but now it’s a crime stakeout at night.
Props: Laundry lines replaced with tactical ropes and scattered evidence markers; fairy bulbs swapped for harsh floodlights and pulsing sirens.
Backdrop: Helicopter searchlights sweep across the skyline; distant gunfire and sirens echo below.

Characters
Detective: Worn leather jacket, badge glinting; grips a service revolver.
Partner: Tactical vest, scanning the perimeter with a flashlight.
Antagonist: Silhouette darting between sheets, gun drawn.

Cinematography
Camera: Tight tra

In [23]:
Video(sora2_videofile, width=1024)

In [24]:
video_link = FileLink(path=sora2_videofile)
video_link

### Another example

In [25]:
prompt = """
Style: 1970s romantic drama, shot on 35 mm film with natural flares, soft focus, and warm halation. 
Slight gate weave and handheld micro-shake evoke vintage intimacy. 
Warm Kodak-inspired grade; light halation on bulbs; film grain and soft vignette for period authenticity.
At golden hour, a brick tenement rooftop transforms into a small stage. 
Laundry lines strung with white sheets sway in the wind, catching the last rays of sunlight. 
Strings of mismatched fairy bulbs hum faintly overhead. A young woman in a flowing red silk dress dances barefoot, curls glowing in the fading light. 
Her partner — sleeves rolled, suspenders loose — claps along, his smile wide and unguarded. 
Below, the city hums with car horns, subway tremors, and distant laughter.

Cinematography:
Camera: medium-wide shot, slow dolly-in from eye level.
Lens: 40 mm spherical; shallow focus to isolate the couple from skyline.
Lighting: golden natural key with tungsten bounce; edge from fairy bulbs.
Mood: nostalgic, tender, cinematic.

Actions:
- She spins; her dress flares, catching sunlight.
- Woman (laughing): "See? Even the city dances with us tonight."
- He steps in, catches her hand, and dips her into shadow.
- Man (smiling): "Only because you lead."
- Sheets drift across frame, briefly veiling the skyline before parting again.

Background Sound:
Natural ambience only: faint wind, fabric flutter, street noise, funky music.

Add a light mention at the bottom of the video: 'Generated with SORA-2'.
"""

In [26]:
sora2_videofile = sora2(prompt)

===== 🎨 Creating SORA-2 video using Azure AI Foundry =====

Prompt: 
Style: 1970s romantic drama, shot on 35 mm film with natural flares, soft focus, and warm halation. 
Slight gate weave and handheld micro-shake evoke vintage intimacy. 
Warm Kodak-inspired grade; light halation on bulbs; film grain and soft vignette for period authenticity.
At golden hour, a brick tenement rooftop transforms into a small stage. 
Laundry lines strung with white sheets sway in the wind, catching the last rays of sunlight. 
Strings of mismatched fairy bulbs hum faintly overhead. A young woman in a flowing red silk dress dances barefoot, curls glowing in the fading light. 
Her partner — sleeves rolled, suspenders loose — claps along, his smile wide and unguarded. 
Below, the city hums with car horns, subway tremors, and distant laughter.

Cinematography:
Camera: medium-wide shot, slow dolly-in from eye level.
Lens: 40 mm spherical; shallow focus to isolate the couple from skyline.
Lighting: golden natural

In [27]:
Video(sora2_videofile, width=1024)

In [28]:
video_link = FileLink(path=sora2_videofile)
video_link

### Another example

In [29]:
prompt = """
Scene Concept: Car Chase in San Francisco.
Setting: Manhattan streets, late afternoon transitioning to dusk. Wet asphalt from recent rain, neon reflections from storefronts.

Cinematography
Camera Styles:
Opening Shot: Aerial drone shot sweeping over the Golden Gate Bridge, establishing the cityscape.
Tracking Shots: Low-angle dolly alongside speeding cars, emphasizing tire grip and sparks from metal scraping.
POV Shots: Inside the driver’s seat, shaky handheld for tension.
Crash Cam: Mounted GoPro-style on bumpers for impact moments.

Lighting:
Natural golden-hour light mixed with harsh neon signage.
Flickering streetlights and reflections on wet pavement for cinematic depth.

Lens Choices:
Wide-angle (24mm) for cityscape and chase overview.
Telephoto (85mm) for compressing traffic chaos and close-ups of driver expressions.

Transitions:
Quick whip pans between cars.
Slow-motion during near-collision moments.

Action Details
Cars weaving through congested traffic, clipping mirrors.
Pedestrians scattering, hot dog stand tipping over.
Police sirens in the distance, squad cars joining the pursuit.
Dramatic jump over a construction ramp, sparks flying.
Near-miss with a yellow taxi spinning out.

Sound Design
Engine Roar: Deep growl for muscle cars, high-pitched whine for sports cars.
Tire Screech: Layered with echo for urban canyon effect.
Ambient San Francisco: Honking horns, distant subway rumble, street chatter fading under chase intensity.
Impact Sounds: Metallic crunch for collisions, glass shatter for side hits.
Music Cue: High-tempo percussion with electronic bass drops synced to gear shifts.
Dynamic Silence: Brief mute before a major crash for dramatic tension.

Add a light mention at the bottom of the video: 'Generated with SORA-2'.
"""

In [30]:
sora2_videofile = sora2(prompt)

===== 🎨 Creating SORA-2 video using Azure AI Foundry =====

Prompt: 
Scene Concept: Car Chase in San Francisco.
Setting: Manhattan streets, late afternoon transitioning to dusk. Wet asphalt from recent rain, neon reflections from storefronts.

Cinematography
Camera Styles:
Opening Shot: Aerial drone shot sweeping over the Golden Gate Bridge, establishing the cityscape.
Tracking Shots: Low-angle dolly alongside speeding cars, emphasizing tire grip and sparks from metal scraping.
POV Shots: Inside the driver’s seat, shaky handheld for tension.
Crash Cam: Mounted GoPro-style on bumpers for impact moments.

Lighting:
Natural golden-hour light mixed with harsh neon signage.
Flickering streetlights and reflections on wet pavement for cinematic depth.

Lens Choices:
Wide-angle (24mm) for cityscape and chase overview.
Telephoto (85mm) for compressing traffic chaos and close-ups of driver expressions.

Transitions:
Quick whip pans between cars.
Slow-motion during near-collision moments.

Action 

In [31]:
Video(sora2_videofile, width=1024)

In [32]:
video_link = FileLink(path=sora2_videofile)
video_link

### Another example

In [36]:
prompt = """
A high-energy hard rock band performing live on a massive stage at night, with dramatic pyrotechnics and flames shooting up around the musicians.
The band members are playing electric guitars, bass, and drums with intense passion, headbanging under bright spotlights and colorful strobe lights. 
The crowd is cheering wildly in the background, hands raised in the air. 
The atmosphere is dark and moody with smoke effects, glowing red and orange fire illuminating the scene, creating a raw, powerful rock concert vibe. 
Ultra-realistic cinematic style, dynamic camera angles, 4K resolution.

Add a light mention at the bottom of the video: 'Generated with SORA-2'.
"""

In [37]:
sora2_videofile = sora2(prompt)

===== 🎨 Creating SORA-2 video using Azure AI Foundry =====

Prompt: 
A high-energy hard rock band performing live on a massive stage at night, with dramatic pyrotechnics and flames shooting up around the musicians.
The band members are playing electric guitars, bass, and drums with intense passion, headbanging under bright spotlights and colorful strobe lights. 
The crowd is cheering wildly in the background, hands raised in the air. 
The atmosphere is dark and moody with smoke effects, glowing red and orange fire illuminating the scene, creating a raw, powerful rock concert vibe. 
Ultra-realistic cinematic style, dynamic camera angles, 4K resolution.

📹 Video ID: video_68f63deacb208190992f0aac30180995
⏳ Initial Status: queued

[20-Oct-2025 13:49:30] ⏱️ Status: queued
[20-Oct-2025 13:49:41] ⏱️ Status: in_progress
[20-Oct-2025 13:49:52] ⏱️ Status: in_progress
[20-Oct-2025 13:50:02] ⏱️ Status: in_progress
[20-Oct-2025 13:50:12] ⏱️ Status: in_progress
[20-Oct-2025 13:50:23] ⏱️ Status: in_

In [38]:
Video(sora2_videofile, width=1024)

In [39]:
video_link = FileLink(path=sora2_videofile)
video_link

### Another example

In [50]:
prompt = """
A professional two-person interview setup in a modern tech office environment, focused on discussing Microsoft Azure cloud solutions. 
One person is the interviewer, a 40-year-old woman, seated across from the guest, an Azure expert who is a 25-year-old man. 
Both are dressed in business casual attire, speaking confidently and engagingly. 
The background features subtle branding elements like Azure logos on a digital screen, with soft lighting and a clean, minimalistic design. 
Include close-up shots of the speakers, smooth camera transitions, and clear audio emphasis on their conversation. 
The tone is informative and professional, with a dynamic yet authentic feel. 
Ultra-realistic cinematic style, 4K resolution.

Question from the interviewer: 'Good morning John.'
Answer from the Azure expert: 'Hello, good morning Lita.'

Question from the interviewer: 'Do you know how to do computer vision with Azure?'
Answer from the Azure expert: 'Well you can use AutoML for Images from Azure ML. Or use Azure Custom Vision or Azure AI Vision.'

Add a light mention at the bottom of the video: 'Generated with SORA-2'.

"""

In [None]:
sora2_videofile = sora2(prompt)

===== 🎨 Creating SORA-2 video using Azure AI Foundry =====

Prompt: 
A professional two-person interview setup in a modern tech office environment, focused on discussing Microsoft Azure cloud solutions. 
One person is the interviewer, a 40-year-old woman, seated across from the guest, an Azure expert who is a 25-year-old man. 
Both are dressed in business casual attire, speaking confidently and engagingly. 
The background features subtle branding elements like Azure logos on a digital screen, with soft lighting and a clean, minimalistic design. 
Include close-up shots of the speakers, smooth camera transitions, and clear audio emphasis on their conversation. 
The tone is informative and professional, with a dynamic yet authentic feel. 
Ultra-realistic cinematic style, 4K resolution.

Question from the interviewer: 'Good morning John.'
Answer from the Azure expert: 'Hello, good morning Lita.'

Question from the interviewer: 'Do you know how to do computer vision with Azure?'

In [None]:
Video(sora2_videofile, width=1024)

In [None]:
video_link = FileLink(path=sora2_videofile)
video_link

## 4. All SORA-2 generated videos

In [47]:
!ls $VIDEO_DIR -lh

total 18M
-rwxrwxrwx 1 root root 1.1M Oct 20 13:35 sora2_video_20Oct2025_133557.mp4
-rwxrwxrwx 1 root root 1.7M Oct 20 13:38 sora2_video_20Oct2025_133857.mp4
-rwxrwxrwx 1 root root 1.6M Oct 20 13:40 sora2_video_20Oct2025_134056.mp4
-rwxrwxrwx 1 root root 3.5M Oct 20 13:43 sora2_video_20Oct2025_134308.mp4
-rwxrwxrwx 1 root root 1.4M Oct 20 13:45 sora2_video_20Oct2025_134527.mp4
-rwxrwxrwx 1 root root 2.7M Oct 20 13:47 sora2_video_20Oct2025_134759.mp4
-rwxrwxrwx 1 root root 3.5M Oct 20 13:51 sora2_video_20Oct2025_135142.mp4
-rwxrwxrwx 1 root root 1.1M Oct 20 13:56 sora2_video_20Oct2025_135613.mp4
-rwxrwxrwx 1 root root 1.1M Oct 20 14:02 sora2_video_20Oct2025_140231.mp4
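
The `!ls` shell escape above depends on a Unix-like shell. A portable listing can be sketched with `pathlib`; `list_videos` is a hypothetical helper, and reporting sizes in MiB is a choice made here.

```python
from pathlib import Path


def list_videos(video_dir):
    """Return (file name, size in MiB) for each .mp4 in video_dir, sorted by name."""
    return [
        (p.name, p.stat().st_size / (1024 * 1024))
        for p in sorted(Path(video_dir).glob("*.mp4"))
    ]
```

For example, `for name, size in list_videos(VIDEO_DIR): print(f"{size:6.1f} MiB  {name}")` prints one line per generated video.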


In [48]:
zip_filename = "sora2_videos.zip"

print(f"Zipping video files from '{VIDEO_DIR}' to '{zip_filename}' ...")

with zipfile.ZipFile(zip_filename, 'w', zipfile.ZIP_DEFLATED) as zipf:
    for root, dirs, files in os.walk(VIDEO_DIR):
        for file in files:
            zipf.write(os.path.join(root, file), os.path.relpath(os.path.join(root, file), VIDEO_DIR))

print("Done")
!ls $zip_filename -lh

Zipping video files from 'videos' to 'sora2_videos.zip' ...
Done
-rwxrwxrwx 1 root root 18M Oct 20 14:02 sora2_videos.zip
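
The zipping cell can also be written as a reusable function that reports what it archived, which makes the result easy to verify. This is a sketch using `pathlib` rather than `os.walk`; `zip_directory` is a hypothetical name, and it assumes the archive is written outside the source directory.

```python
import zipfile
from pathlib import Path


def zip_directory(src_dir, zip_path):
    """Archive every file under src_dir into zip_path.

    Entries are stored relative to src_dir. Returns the entry names read
    back from the finished archive.
    """
    src = Path(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(src).as_posix())
    with zipfile.ZipFile(zip_path) as zf:
        return zf.namelist()
```

With the variables defined above, `zip_directory(VIDEO_DIR, zip_filename)` would return the list of video file names stored in the archive.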


In [49]:
zip_link = FileLink(path=zip_filename)
zip_link