- Next.js: A React framework for building the user interface.
- Clerk: User authentication and management.
- NestJS: A robust Node.js framework for building APIs.
- MongoDB: NoSQL database for storing data.
- BullMQ: Queue processing for handling video processing tasks.
- Gemini & ChatGPT: Generate video scripts from text input.
- AWS Polly: Convert text to speech (TTS).
- ViettelAI: Generate Vietnamese voiceovers for better localization.
- Kling & Gemini: Generate and modify character outfits in videos.
- Amazon S3: Store generated videos and voice files.
- Proxy Rotation: Avoid API request limits when sending high-volume requests.
This project lets users generate videos from text input, using AI models for script generation, voice synthesis, and character rendering. The generated videos are stored in Amazon S3 and made available through shareable URLs.
- User Input: Users enter a text description or script.
- Script Generation: AI (Gemini/ChatGPT) processes the input and generates a script.
- Voice Synthesis: AWS Polly (or ViettelAI for Vietnamese voiceovers) converts the script into natural-sounding speech.
- Character & Outfit Customization: Kling & Gemini generate AI-powered characters with custom outfits.
- Video Processing: The generated voice and characters are combined into a video.
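
The workflow above can be pictured as an ordered job plan. The following is a minimal sketch, not the project's actual code: the type names, the `planVideoJob` and `pickVoiceProvider` functions, the `"en"`/`"vi"` language codes, and the dependency order between stages are all assumptions made for illustration. It only describes which provider would handle each stage and calls no external API.

```typescript
// Hypothetical types: none of these names come from the project itself.
type Stage = "script" | "voice" | "character" | "compose" | "upload";

interface VideoRequest {
  text: string;          // the user's description or script
  language: "en" | "vi"; // assumed language codes
}

interface PlannedStep {
  stage: Stage;
  provider: string;   // which service from the stack would handle the stage
  dependsOn: Stage[]; // stages that must finish first (assumed ordering)
}

// Assumed rule: Vietnamese text goes to ViettelAI, everything else to AWS Polly.
function pickVoiceProvider(language: VideoRequest["language"]): string {
  return language === "vi" ? "ViettelAI" : "AWS Polly";
}

// Build the ordered list of steps for one video request.
function planVideoJob(request: VideoRequest): PlannedStep[] {
  if (request.text.trim().length === 0) {
    throw new Error("text input must not be empty");
  }
  return [
    { stage: "script", provider: "Gemini/ChatGPT", dependsOn: [] },
    { stage: "voice", provider: pickVoiceProvider(request.language), dependsOn: ["script"] },
    { stage: "character", provider: "Kling & Gemini", dependsOn: ["script"] },
    { stage: "compose", provider: "video worker", dependsOn: ["voice", "character"] },
    { stage: "upload", provider: "Amazon S3", dependsOn: ["compose"] },
  ];
}

// Usage: print the plan for a Vietnamese-language request.
const plan = planVideoJob({ text: "A chef explains pho in 30 seconds", language: "vi" });
console.log(plan.map((step) => `${step.stage} -> ${step.provider}`).join("\n"));
```

In a real deployment each planned step would likely become a BullMQ job handled by a NestJS worker, with the `dependsOn` list deciding when a step may start.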