This Spring Boot application demonstrates multiple Java approaches to integrate with OpenAI's Sora 2 video generation API. The implementation showcases Java's diverse async patterns, from traditional schedulers to modern virtual threads.
Key Insight: While documentation examples often show simple `sleep()` loops for clarity, production applications benefit from more sophisticated patterns. Java offers multiple approaches including the unique virtual threads feature that combines the simplicity of synchronous code with async performance.
Imagine ordering lunch at a fast-food restaurant:
Synchronous (Blocking) Model:
- You place your order
- You and the cashier stand there, staring at each other, waiting for your food
- You can't leave the counter
- The cashier can't take anyone else's order
- If you have as many customers as cashiers, the entire restaurant is "blocked on I/O"
Asynchronous (Polling) Model:
- You place your order and get a receipt with a number
- You can grab a table, get napkins, or get a drink
- The cashier can immediately take other customers' orders
- Two or three good cashiers can handle the entire lunch rush
- You periodically check the screen for your number
- The system is more complicated, but it scales much better
Webhook Model:
- Like the polling model, but the cashier calls your number when ready
- No need to check the screen - you're notified
- Even more efficient for the customer
This is exactly why OpenAI and Google use async models for video generation: generating a video takes 2-4 minutes regardless of the approach, but with async patterns, the server doesn't tie up threads waiting. It's all about scaling for the server.
Unlike text or image generation (which are synchronous request-response), video generation APIs are inherently asynchronous:
- Initial POST → Returns a `video_id` immediately
- Poll or Wait → Check status periodically or register a webhook
- Download → Retrieve the completed video when ready
Time to generate: 2-4 minutes (both sync and async take the same time)
Difference: With async, threads aren't blocked, allowing servers to handle 1000x more concurrent requests
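The three steps above can be sketched with the JDK's `java.net.http` request builder. This is a minimal illustration, not the project's actual client: the `/videos`, `/videos/{id}`, and `/videos/{id}/content` paths and the JSON field names are assumptions about the API, and `SoraRequests` is a hypothetical class name. The base URL, model, size, and duration come from the configuration shown in this README.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Sketch of the create / poll / download requests (paths and fields are assumed). */
class SoraRequests {
    static final String BASE_URL = "https://api.openai.com/v1";

    /** Step 1: submit the job; the response is expected to carry a video id. */
    static HttpRequest create(String apiKey, String prompt) {
        // Minimal escaping for backslashes and quotes only; real code should use a JSON library.
        String escaped = prompt.replace("\\", "\\\\").replace("\"", "\\\"");
        String json = """
                {"model": "sora-2", "prompt": "%s", "size": "1280x720", "seconds": "8"}
                """.formatted(escaped);
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/videos"))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    /** Step 2: ask for the current status of a job (assumed path). */
    static HttpRequest status(String apiKey, String videoId) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/videos/" + videoId))
                .header("Authorization", "Bearer " + apiKey)
                .GET()
                .build();
    }

    /** Step 3: fetch the finished video bytes (assumed path). */
    static HttpRequest download(String apiKey, String videoId) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/videos/" + videoId + "/content"))
                .header("Authorization", "Bearer " + apiKey)
                .GET()
                .build();
    }
}
```

Each request would be sent with `HttpClient.send(...)` or `sendAsync(...)`; only the middle step is repeated while the job is still running.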
- Multiple Client Implementations:
  - Pure Java `HttpClient` (Java 11+)
  - Spring `RestClient` (Spring 6+)
- Two Video Generation Modes:
  - Text-to-Video: Generate videos from text prompts
  - Image-to-Video: Animate static images (e.g., American Gothic square dancing!)
- Four Polling Strategies:
  - `VirtualThread`: Java 21+ virtual threads ⭐ RECOMMENDED
  - `FixedRate`: `ScheduledExecutorService` with fixed intervals
  - `SelfScheduling`: Dynamic self-rescheduling polling
  - `Reactive`: Spring WebFlux reactive streams
- Real-time Progress Tracking: Visual progress bars with percentage updates
- Webhook Support: Alternative to polling (OpenAI sends notifications)
- REST API Endpoints: Test all strategies via HTTP
- Interactive CLI Demo: Side-by-side comparison tool
- Java 21+ (Java 21 or 25 LTS recommended for virtual threads)
- Gradle (wrapper included)
- OpenAI API Key with Sora 2 access: Set the `OPENAI_API_KEY` environment variable
export OPENAI_API_KEY="sk-proj-..."
./gradlew build
./gradlew run
./gradlew bootRun
The application starts on http://localhost:8080
- POST `/api/video/generate/virtualthread` ⭐
- POST `/api/video/generate/fixedrate`
- POST `/api/video/generate/selfscheduling`
- POST `/api/video/generate/reactive`
- POST `/api/video/generate/restclient`
- POST `/api/video/generate/httpclient`
Request body:
{
  "prompt": "A serene mountain landscape at sunset with golden light"
}
- POST `/api/video/generate/image-to-video` ⭐
Request body:
{
  "prompt": "The couple suddenly smile and begin square dancing together",
  "image_url": "https://upload.wikimedia.org/wikipedia/commons/c/cc/Grant_Wood_-_American_Gothic_-_Google_Art_Project.jpg"
}
Note: Images are automatically resized to match requested video dimensions. The API requires exact dimension matching, not just aspect ratio.
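As a rough illustration of that exact-dimension requirement, the sketch below parses a size string such as `1280x720` and scales an image to exactly those dimensions with the JDK's `BufferedImage` and `Graphics2D`. `ImageResizer` and its methods are hypothetical names. Whether the project stretches, crops, or pads the image is not stated here, so plain stretching is an assumption.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical helper: resize an input image to the exact requested video size. */
class ImageResizer {
    /** Parses "WIDTHxHEIGHT" (for example "1280x720") into {width, height}. */
    static int[] parseSize(String size) {
        String[] parts = size.toLowerCase().split("x");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected WIDTHxHEIGHT, got: " + size);
        }
        return new int[] { Integer.parseInt(parts[0].trim()), Integer.parseInt(parts[1].trim()) };
    }

    /** Stretches the source to exactly width x height; aspect ratio is NOT preserved. */
    static BufferedImage resizeExact(BufferedImage source, int width, int height) {
        BufferedImage target = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = target.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(source, 0, 0, width, height, null);
        } finally {
            g.dispose(); // release native drawing resources
        }
        return target;
    }
}
```

The result could then be written with `ImageIO.write(...)` before being attached to the multipart upload. A portrait painting stretched to a landscape frame will look distorted, so cropping or letterboxing may be preferable in practice.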
- GET `/api/video/strategies`: List available strategies with descriptions
- GET `/api/video/health`: Health check
- POST `/api/webhook/sora`: Receive OpenAI webhook events
Text-to-Video:
curl -X POST http://localhost:8080/api/video/generate/virtualthread \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A cat playing with a ball of yarn in a sunny garden"}'
Image-to-Video:
curl -X POST http://localhost:8080/api/video/generate/image-to-video \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "The couple suddenly smile and begin square dancing together",
    "image_url": "https://upload.wikimedia.org/wikipedia/commons/c/cc/Grant_Wood_-_American_Gothic_-_Google_Art_Project.jpg"
  }'
./gradlew run
The demo presents an interactive menu where you can:
- Choose which polling strategy to test (text-to-video)
- Test image-to-video with American Gothic painting
- Enter custom prompts
- See real-time progress bars during generation
- Compare performance across strategies
Edit `src/main/resources/application.properties`:
# OpenAI API Configuration
openai.api.key=${OPENAI_API_KEY}
sora.api.model=sora-2
sora.api.base-url=https://api.openai.com/v1
# Video Parameters
sora.video.size=1280x720
sora.video.seconds=8
# Polling Configuration
sora.polling.interval-seconds=5
sora.polling.max-timeout-minutes=10
# Output
sora.output.directory=./videos
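To see what the polling settings imply, the following sketch works out how many status checks fit inside the timeout. `PollingBudget` is a hypothetical helper, not part of the project. With a 5-second interval and a 10-minute timeout the budget is 600 / 5 = 120 checks, and a typical 2-4 minute generation would use roughly 24 to 48 of them.

```java
import java.time.Duration;

/** Hypothetical helper: how many polls fit into the configured timeout. */
class PollingBudget {
    static long maxPolls(Duration interval, Duration timeout) {
        if (interval.isZero() || interval.isNegative()) {
            throw new IllegalArgumentException("interval must be positive");
        }
        // Integer division: a partial final interval does not count as a poll.
        return timeout.toMillis() / interval.toMillis();
    }
}
```

For example, `PollingBudget.maxPolls(Duration.ofSeconds(5), Duration.ofMinutes(10))` evaluates to 120.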
Strategy | Java Version | Complexity | Scalability | Best For |
---|---|---|---|---|
VirtualThread ⭐ | 21+ | Low | Excellent | Modern Java apps (recommended) |
Reactive | 8+ | High | Excellent | High-concurrency reactive apps |
FixedRate | 8+ | Medium | Good | Traditional enterprise apps |
SelfScheduling | 8+ | Medium | Good | Dynamic interval adjustments |
RestClient | Spring 6+ | Low | Limited | Simple use cases only |
HttpClient | 11+ | Low | Limited | Simple use cases only |
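The difference between the FixedRate and SelfScheduling rows can be sketched with a plain `ScheduledExecutorService`. This is an illustration under assumptions, not the project's code: `SchedulerPolling` is a hypothetical class, the status check is reduced to a `BooleanSupplier`, the doubling backoff is just one possible "dynamic interval" policy, and timeout handling is omitted.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of two scheduler-based polling styles. */
class SchedulerPolling {
    /** Fixed rate: the scheduler fires the check every intervalMs until it reports done. */
    static CompletableFuture<Void> fixedRate(ScheduledExecutorService scheduler,
                                             BooleanSupplier isDone, long intervalMs) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        ScheduledFuture<?> task = scheduler.scheduleAtFixedRate(() -> {
            try {
                if (isDone.getAsBoolean()) result.complete(null);
            } catch (RuntimeException e) {
                result.completeExceptionally(e);
            }
        }, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
        // Cancel the periodic task as soon as the result is settled.
        result.whenComplete((value, error) -> task.cancel(false));
        return result;
    }

    /** Self-scheduling: each check schedules the next one, so the delay can change. */
    static CompletableFuture<Void> selfScheduling(ScheduledExecutorService scheduler,
                                                  BooleanSupplier isDone,
                                                  long delayMs, long maxDelayMs) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        pollOnce(scheduler, isDone, delayMs, maxDelayMs, result);
        return result;
    }

    private static void pollOnce(ScheduledExecutorService scheduler, BooleanSupplier isDone,
                                 long delayMs, long maxDelayMs, CompletableFuture<Void> result) {
        scheduler.schedule(() -> {
            try {
                if (isDone.getAsBoolean()) {
                    result.complete(null);
                } else {
                    // Assumed policy: double the delay each round, capped at maxDelayMs.
                    long next = Math.min(delayMs * 2, maxDelayMs);
                    pollOnce(scheduler, isDone, next, maxDelayMs, result);
                }
            } catch (RuntimeException e) {
                result.completeExceptionally(e);
            }
        }, delayMs, TimeUnit.MILLISECONDS);
    }
}
```

In both styles no thread sleeps between checks; the scheduler thread is only busy for the duration of each status call, which is why a small pool can serve many pending videos.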
Virtual threads (Java 21+) give you the best of both worlds:
// Looks like simple blocking code with progress tracking
do {
    Thread.sleep(5000);
    status = client.checkVideoStatus(videoId);
    // Display progress bar if available
    if (status.progress() != null && status.progress() > 0) {
        int barLength = 30;
        int filledLength = (int) ((status.progress() / 100.0) * barLength);
        String bar = "=".repeat(filledLength) + "-".repeat(barLength - filledLength);
        System.out.printf("\r%s: [%s] %d%%",
                status.status(), bar, status.progress());
    }
} while (!status.isDone());
But it's running on virtual threads:
- Extremely lightweight (millions possible)
- No thread pool exhaustion
- Simple, readable code
- Excellent performance
- Easy to add features like progress tracking
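To make the "millions possible" point concrete, here is a small sketch that runs many blocking sleep loops at once, one virtual thread per job, using `Executors.newVirtualThreadPerTaskExecutor()` (Java 21+). `VirtualThreadDemo` is a hypothetical class, and the sleep stands in for a real status call.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical demo: many blocking poll loops, each on its own virtual thread. */
class VirtualThreadDemo {
    /** Starts `jobs` loops that each sleep `pollsPerJob` times; returns how many finished. */
    static int runBlockingPolls(int jobs, int pollsPerJob, Duration interval) {
        // try-with-resources: close() waits for all submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (int i = 0; i < jobs; i++) {
                futures.add(executor.submit(() -> {
                    for (int poll = 0; poll < pollsPerJob; poll++) {
                        Thread.sleep(interval); // parks the virtual thread, not an OS thread
                    }
                    return true;
                }));
            }
            int finished = 0;
            for (Future<Boolean> future : futures) {
                if (future.get()) finished++;
            }
            return finished;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for polls", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A poll loop failed", e);
        }
    }
}
```

Calling `VirtualThreadDemo.runBlockingPolls(10_000, 3, Duration.ofMillis(100))` starts ten thousand sleeping loops. The same shape with one platform thread per job would need ten thousand OS threads.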
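Virtual threads scale far better than the same sleep-and-poll loop running on platform (OS) threads, such as a blocking `time.sleep()` loop in Python:
- A blocked platform thread holds an OS thread the whole time (expensive, limited)
- Java virtual threads are cheap and plentiful
- Same simple code, vastly better scalability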
Documentation Examples (Python, JS, etc.):
import time
operation = client.models.generate_videos(...)
# Simple polling loop - clear and educational
while not operation.done:
    print("Waiting...")
    time.sleep(10)
    operation = client.operations.get(operation)
Production Patterns:
All modern languages offer sophisticated async patterns:
- Python: `asyncio` with async/await for non-blocking I/O
- JavaScript: Promises and async/await with event loop
- Java: Multiple options including virtual threads
Java Virtual Threads (unique advantage):
// Looks like simple blocking code
do {
    Thread.sleep(5000);
    status = client.checkVideoStatus(videoId);
} while (!status.isDone());
// But runs on lightweight virtual threads
// Scales to millions of concurrent operations
Why Virtual Threads Stand Out:
- Write familiar synchronous-looking code
- Get async performance automatically
- No need for async/await keywords throughout codebase
- Existing blocking libraries work without modification
For production web applications, this project demonstrates how Java's diverse async toolkit (schedulers, reactive streams, virtual threads) provides flexibility for different architectural needs.
- Sora 2: $0.10/second
- Sora 2 Pro: $0.30/second
- Example: 8-second video @ 720p = $0.80
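The arithmetic behind that example is simply rate times duration. The sketch below uses the per-second rates listed above, which may change over time; `SoraCost` is a hypothetical helper, and `BigDecimal` is used to avoid floating-point rounding in money values.

```java
import java.math.BigDecimal;

/** Hypothetical helper: cost = per-second rate x clip length in seconds. */
class SoraCost {
    // Rates copied from the pricing list in this README.
    static final BigDecimal SORA_2_PER_SECOND = new BigDecimal("0.10");
    static final BigDecimal SORA_2_PRO_PER_SECOND = new BigDecimal("0.30");

    static BigDecimal cost(BigDecimal ratePerSecond, int seconds) {
        if (seconds < 0) {
            throw new IllegalArgumentException("seconds must not be negative");
        }
        return ratePerSecond.multiply(BigDecimal.valueOf(seconds));
    }
}
```

An 8-second clip therefore costs 8 × $0.10 = $0.80 on Sora 2 and 8 × $0.30 = $2.40 on Sora 2 Pro.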
src/main/java/com/kousenit/sora2java/
├── client/ # HTTP client implementations
│ ├── SoraVideoClient.java # Text-to-video interface
│ ├── SoraImageVideoClient.java # Image-to-video interface
│ ├── HttpClientSoraVideoClient.java
│ ├── RestClientSoraVideoClient.java
│ └── RestClientSoraImageVideoClient.java # Image-to-video with multipart upload
├── controller/ # REST endpoints
│ ├── VideoGenerationController.java # Both text & image-to-video endpoints
│ └── WebhookController.java
├── model/ # Data models (records)
│ └── SoraRecords.java # VideoRequest, ImageVideoRequest, VideoStatus, etc.
├── service/ # Business logic and polling strategies
│ ├── PollingStrategy.java (sealed interface)
│ ├── VirtualThreadPollingStrategy.java ⭐ # Text-to-video with progress bars
│ ├── FixedRatePollingStrategy.java
│ ├── SelfSchedulingPollingStrategy.java
│ ├── ReactivePollingStrategy.java
│ ├── ImageVideoGenerationService.java # Image-to-video with progress bars
│ └── VideoGenerationService.java
├── SoraVideoDemo.java # Interactive CLI demo (text & image-to-video)
└── Sora2JavaApplication.java
This project showcases cutting-edge Java features:
- Virtual Threads (Java 21+) — Lightweight concurrency
- Records — Immutable data models
- Sealed Interfaces — Compile-time exhaustiveness checking
- Pattern Matching — Enhanced switch expressions
- Text Blocks — Multi-line strings
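Several of these features combine naturally. The sketch below models a job's lifecycle with a sealed interface and records, then uses a pattern-matching `switch` that the compiler checks for exhaustiveness (Java 21+). The `VideoState` hierarchy and its field names are hypothetical and are not taken from the project's `SoraRecords` or `PollingStrategy` types.

```java
/** Hypothetical example combining sealed interfaces, records, and pattern matching. */
class VideoStates {
    sealed interface VideoState permits Queued, InProgress, Completed, Failed {}

    record Queued(String videoId) implements VideoState {}
    record InProgress(String videoId, int progress) implements VideoState {}
    record Completed(String videoId) implements VideoState {}
    record Failed(String videoId, String error) implements VideoState {}

    /** No default branch needed: adding a new state makes this switch fail to compile. */
    static String describe(VideoState state) {
        return switch (state) {
            case Queued(String id) -> "%s is queued".formatted(id);
            case InProgress(String id, int progress) -> "%s is %d%% done".formatted(id, progress);
            case Completed(String id) -> "%s is ready to download".formatted(id);
            case Failed(String id, String error) -> "%s failed: %s".formatted(id, error);
        };
    }
}
```

Because the interface is sealed, forgetting to handle a state becomes a compile-time error rather than a runtime surprise.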
This project demonstrates:
- Production-Ready Integration — Real OpenAI Sora 2 API with proper error handling
- Async Pattern Comparison — 4 different concurrency approaches
- Multiple Input Modalities — Both text-to-video and image-to-video generation
- Modern Java Showcase — Records, sealed interfaces, virtual threads, progress tracking
- Fast-Food Analogy — Clear explanation of async benefits
- Multipart File Uploads — Automatic image download, resize, and upload
Perfect for:
- Java developers learning AI API integration
- Understanding when and why to use async patterns
- Comparing different Java concurrency approaches
- Creating educational YouTube content about async programming
- Learning how to handle multipart form-data with Spring RestClient
- API_IMPLEMENTATION_NOTES.md — Technical details about Sora 2 API
- polling-strategy-comparison.md — Detailed performance analysis
MIT License - Feel free to use this for learning, teaching, or production projects.
This is an educational project demonstrating async patterns in Java. Feel free to:
- Use it in your training courses
- Create YouTube videos about it
- Adapt the patterns for your own projects
- Submit PRs with improvements
Built with ☕ Java 21, 🍃 Spring Boot 3.5, and the belief that Java's async patterns deserve more recognition in the AI/ML integration space.