The project is designed to handle multiple large video/audio files efficiently by breaking them into segments and processing them concurrently with Java.
- Put all videos in a single folder.
- Replace the `FOLDER` value in `Util.java` with the path of the video folder (from step 1).
- Run `SubtitleGenerator.Main`.
- An SRT file is generated in the same folder (from step 1) with the same file name as its corresponding video.
- Check the log for a line containing `Complete with (0) ...`. The number in parentheses indicates how many segments Whisper AI failed to transcribe. A video has `video_length_in_seconds/MAX_SEGMENT_DURATION_SECONDS` segments.
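The segment count mentioned in the last step can be worked out by hand. The sketch below is only an illustration: the class and method names are hypothetical, the 60-second value is an assumed example, and it assumes a trailing partial segment counts as one segment (rounding up), which the project may or may not do.

```java
// Hypothetical helper for illustration; not taken from the project's source.
final class SegmentCountSketch {
    // Assumed example value; the project's real constant may differ.
    static final int MAX_SEGMENT_DURATION_SECONDS = 60;

    /** Number of segments for a video, counting a trailing partial segment as one. */
    static long segmentCount(long videoLengthInSeconds) {
        if (videoLengthInSeconds <= 0) {
            return 0;
        }
        // Integer ceiling division: (a + b - 1) / b
        return (videoLengthInSeconds + MAX_SEGMENT_DURATION_SECONDS - 1)
                / MAX_SEGMENT_DURATION_SECONDS;
    }

    public static void main(String[] args) {
        // 605 seconds = 10 full minutes plus a 5-second remainder
        System.out.println(segmentCount(605)); // prints 11
    }
}
```

With these assumptions, a log line of `Complete with (2) ...` for a 605-second video would mean 2 of 11 segments failed.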
- Producer-Consumer Design:
- (Deprecated) Optimized the thread pool and blocking queue to implement a thread-safe producer-consumer design for an efficient and stable workflow.
- Multithreaded Implementation:
- Uses multithreading to process audio segments and send multiple transcription requests in parallel, which likely yields at least a 15x speedup.
- Semaphore Design:
- Uses a semaphore to coordinate the workflow in the multithreaded environment and keep the success rate high.
- Audio Extraction:
- Extracts audio from video files and prepares them for transcription.
- Handling Multiple Files:
- Processes all videos in one folder simultaneously instead of serially.
- Handling Large Files:
- Automatically splits audio files into smaller segments (e.g., 1-minute chunks) to avoid API size limitations.
- Subtitle Generation:
- Combines transcriptions of each audio segment into a single `.srt` subtitle file.
- Cloudflare Workers Integration:
- Leverages Cloudflare Workers AI for secure and scalable serverless processing of API requests.
- OpenAI Whisper (Speech-to-Text) model Integration:
- Adds AI-powered speech-to-text transcription through the Whisper model.
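The subtitle generation feature above combines per-segment transcriptions into one `.srt` file. A minimal sketch of that idea follows, assuming each segment's cues carry timestamps relative to the start of that segment and that all segments have the same fixed length. The `Cue` type, the class name, and the method names are hypothetical and not taken from the project.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; names and data shapes are assumptions, not the project's API.
final class SrtMergeSketch {
    /** One subtitle cue, with times in milliseconds relative to its own segment. */
    static final class Cue {
        final long startMs;
        final long endMs;
        final String text;

        Cue(long startMs, long endMs, String text) {
            this.startMs = startMs;
            this.endMs = endMs;
            this.text = text;
        }
    }

    /** Formats milliseconds as an SRT timestamp: HH:MM:SS,mmm */
    static String timestamp(long ms) {
        long hours = ms / 3_600_000;
        long minutes = (ms / 60_000) % 60;
        long seconds = (ms / 1_000) % 60;
        long millis = ms % 1_000;
        return String.format(Locale.ROOT, "%02d:%02d:%02d,%03d", hours, minutes, seconds, millis);
    }

    /** Shifts each segment's cues by its offset and renumbers them into one SRT document. */
    static String merge(List<List<Cue>> segments, long segmentMs) {
        StringBuilder srt = new StringBuilder();
        int index = 1;
        for (int i = 0; i < segments.size(); i++) {
            long offset = i * segmentMs; // segment i starts at i * segment length
            for (Cue cue : segments.get(i)) {
                srt.append(index++).append('\n')
                   .append(timestamp(cue.startMs + offset)).append(" --> ")
                   .append(timestamp(cue.endMs + offset)).append('\n')
                   .append(cue.text).append("\n\n");
            }
        }
        return srt.toString();
    }
}
```

Because segments are processed in parallel, merging by segment index rather than by completion order keeps the subtitles in the right sequence.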
- Multi-Large Videos:
- Capable of handling multiple, large video/audio files simultaneously.
- Tuned Performance:
- Tuned the auto-retry mechanism, semaphore, and thread count to ensure high efficiency, accuracy, and stability.
- Efficiency:
- Uses multithreading and concurrency features in Java to improve the efficiency of API requests and overall processing speed.
- API Integration:
- Integrates with Cloudflare's OpenAI Whisper API for high-quality speech-to-text transcription.
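The combination of threads, a semaphore, and automatic retries described above could look roughly like the sketch below. It is not the project's implementation: the class name, method names, and parameters are hypothetical, and the transcription request is represented by a generic `Callable` rather than a real Cloudflare call. A task whose attempts all fail yields `null`, loosely mirroring the failed-segment count in the log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

// Hypothetical sketch; not the project's actual classes.
final class BoundedRetrySketch {
    /** Calls the task up to maxAttempts times; returns null if every attempt fails. */
    static <T> T withRetry(Callable<T> task, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status and stop
                return null;
            } catch (Exception e) {
                // Treated as transient in this sketch; loop to try again.
            }
        }
        return null;
    }

    /** Runs all tasks on a fixed pool, allowing at most maxInFlight to execute at once. */
    static <T> List<T> runAll(List<Callable<T>> tasks, int threads, int maxInFlight, int maxAttempts) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Semaphore permits = new Semaphore(maxInFlight);
        try {
            List<Future<T>> futures = new ArrayList<>();
            for (Callable<T> task : tasks) {
                futures.add(pool.submit(() -> {
                    permits.acquire(); // block until a slot is free
                    try {
                        return withRetry(task, maxAttempts);
                    } finally {
                        permits.release();
                    }
                }));
            }
            List<T> results = new ArrayList<>();
            for (Future<T> future : futures) {
                results.add(future.get()); // results stay in submission order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Task failed unexpectedly", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Keeping the semaphore limit separate from the thread count is one possible design: the pool size bounds local work, while the permit count bounds how many requests reach the API at once.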
- Java 11 or higher
- Maven (for dependency management)
- Cloudflare account with access to OpenAI Whisper API
- FFmpeg (for audio extraction)
- Clone this repository: `git clone https://github.com/yourusername/yourproject.git`, then `cd yourproject`
- Install dependencies: `mvn install`
- Make sure `ffmpeg` is installed and available in your system's PATH.
- Replace the placeholders for `CF_ACCOUNT_ID` and `CF_API_TOKEN` in `Util.java` with your Cloudflare credentials.
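To confirm the FFmpeg step before running the project, one option is a small standalone check that scans the PATH directories for an `ffmpeg` file. This is only a sketch and is not part of the project: the class and method names are hypothetical, and it checks only that a file with that name exists (also trying an `.exe` suffix), not that it is a working FFmpeg build.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical standalone check; not part of the project.
final class FfmpegPathCheckSketch {
    /** Returns true if a regular file named executable (or executable.exe) is in any PATH entry. */
    static boolean isOnPath(String executable, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return false;
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            try {
                Path base = Paths.get(dir);
                if (Files.isRegularFile(base.resolve(executable))
                        || Files.isRegularFile(base.resolve(executable + ".exe"))) {
                    return true;
                }
            } catch (InvalidPathException e) {
                // Skip malformed PATH entries.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        boolean found = isOnPath("ffmpeg", System.getenv("PATH"));
        System.out.println(found ? "ffmpeg found on PATH" : "ffmpeg not found on PATH");
    }
}
```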