# Creating a project from scratch

Below is a description of a Python script that generates SRT (SubRip) subtitle files for videos in Tamil, Telugu, and Kannada. The script takes a video file and a corresponding transcript text file as input and writes one SRT file per language. The following step-by-step guide explains how the script works and what you need to run it:

1. Purpose: The script is designed to generate SRT files with subtitles for Tamil, Telugu, and Kannada videos. The output file will have the same name as the input video file, with the language appended at the end (for example, `example_video_Tamil.srt`).

2. How it Works:
   - The script first loads the video content and the text content from the provided files.
   - Depending on the language chosen (Tamil, Telugu, or Kannada), it aligns and segments the text to synchronize with the video content.
   - The alignment and segmentation are language-specific, with separate methods for Tamil, Telugu, and Kannada.
   - Finally, it creates SRT files with the generated subtitles.

3. What's Needed:
   - Video File: You need a video file in a format that can be processed. Make sure to provide the correct path to the video file (for example, `example_video.mp4` in this script).
   - Text File: You need a text file containing the transcript or dialogue of the video. This text should correspond to the language of the video. Provide the path to this file (for example, `example_text.txt` in this script).
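   - Dependencies: As written, the script uses only Python's standard library (`os`) and needs no additional packages. Real implementations of the placeholder methods will likely require extra libraries for video loading and alignment.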

4. Running the Script:
   - Save the script into a Python file (for example, `subtitle_generator.py`).
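   - Place your video file and text file in the same directory as the script, and set `video_file` and `text_file` in the `__main__` block to their names.
   - Run the script in a Python environment (for example, `python subtitle_generator.py`).
   - The script generates one SRT file per language listed in `self.languages` (Tamil, Telugu, and Kannada), using the same text file for each. Every output file is named after the video file with the language appended (for example, `example_video_Tamil.srt`).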
   - Check the console output for confirmation of successful SRT generation.
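
For reference, each cue in an SRT file consists of a sequence number, a time range in `HH:MM:SS,mmm --> HH:MM:SS,mmm` form, the subtitle text, and a blank line. The sketch below shows one way to format such timestamps and to sanity-check a generated file by counting its cues. `format_timestamp` and `count_cues` are hypothetical helper names introduced for illustration; they are not part of the script.

```python
import re


def format_timestamp(seconds: float) -> str:
    """Format a non-negative number of seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


# Matches a full SRT time-range line such as "00:00:00,000 --> 00:00:05,000".
TIME_RANGE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")


def count_cues(srt_text: str) -> int:
    """Count the lines that look like SRT time ranges (one per cue)."""
    return sum(1 for line in srt_text.splitlines() if TIME_RANGE.match(line.strip()))
```

After running the script, you could read `example_video_Tamil.srt` with `encoding="utf-8"` and pass its contents to `count_cues` to confirm that the expected number of subtitles was written.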

The placeholder methods (`load_video`, `align_and_segment_tamil`, `align_and_segment_telugu`, `align_and_segment_kannada`) must be replaced with real implementations tailored to your requirements, in particular for loading the video and for aligning and segmenting the text in each language. The timestamps written by `create_srt` are placeholders as well and should come from the alignment step in a real implementation.
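
As a starting point for replacing the alignment placeholders, here is a minimal sketch of a naive strategy: split the transcript into non-empty lines and give each line an equal share of the video's duration. This is only a stand-in, not real alignment, which would typically rely on the audio track. The function name `align_evenly`, the `duration_seconds` parameter, and the assumption that the transcript has one subtitle per line are all hypothetical. Note also that it returns timed tuples rather than the plain strings the script's placeholders return, so `create_srt` would need adapting to use the timings.

```python
def align_evenly(text_content: str, duration_seconds: float):
    """Split the transcript into non-empty lines and give each an equal time slot.

    Returns a list of (start_seconds, end_seconds, text) tuples.
    Assumes one subtitle per transcript line (an illustrative assumption).
    """
    lines = [line.strip() for line in text_content.splitlines() if line.strip()]
    if not lines or duration_seconds <= 0:
        return []
    slot = duration_seconds / len(lines)  # equal share of the duration per line
    return [(i * slot, (i + 1) * slot, text) for i, text in enumerate(lines)]
```

Because this approach is purely line-based, it behaves the same way for Tamil, Telugu, and Kannada; language-specific segmentation rules would have to be added on top of it.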

import os

class SubtitleGenerator:
    def __init__(self):
        self.languages = ["Tamil", "Telugu", "Kannada"]
    
    def generate_srt(self, video_file, text_file, language):
        # Load video and text content
        video_content = self.load_video(video_file)
        text_content = self.load_text(text_file)
        
        # Perform alignment and segmentation based on language
        if language == "Tamil":
            subtitle_units = self.align_and_segment_tamil(video_content, text_content)
        elif language == "Telugu":
            subtitle_units = self.align_and_segment_telugu(video_content, text_content)
        elif language == "Kannada":
            subtitle_units = self.align_and_segment_kannada(video_content, text_content)
        else:
            print("Language not supported.")
            return
        
        # Generate SRT file
        srt_content = self.create_srt(subtitle_units)
        srt_filename = os.path.splitext(video_file)[0] + "_" + language + ".srt"
        with open(srt_filename, "w", encoding="utf-8") as srt_file:
            srt_file.write(srt_content)
        
        print(f"SRT file for {language} generated successfully.")

    def load_video(self, video_file):
        # Load video file
        # Placeholder for actual video loading process
        print(f"Video file '{video_file}' loaded.")
        return "Video Content"

    def load_text(self, text_file):
        # Load text file
        with open(text_file, "r", encoding="utf-8") as file:
            text_content = file.read()
        print(f"Text file '{text_file}' loaded.")
        return text_content

    def align_and_segment_tamil(self, video_content, text_content):
        # Placeholder for Tamil alignment and segmentation
        # Implement alignment and segmentation specific to Tamil
        print("Aligning and segmenting Tamil text...")
        subtitle_units = ["Subtitle 1", "Subtitle 2", "Subtitle 3"]  # Placeholder
        return subtitle_units

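    def align_and_segment_telugu(self, video_content, text_content):
        # Placeholder for Telugu alignment and segmentation
        # Implement alignment and segmentation specific to Telugu
        print("Aligning and segmenting Telugu text...")
        subtitle_units = ["Subtitle 1", "Subtitle 2", "Subtitle 3"]  # Placeholder
        return subtitle_units

    def align_and_segment_kannada(self, video_content, text_content):
        # Placeholder for Kannada alignment and segmentation
        # Implement alignment and segmentation specific to Kannada
        print("Aligning and segmenting Kannada text...")
        subtitle_units = ["Subtitle 1", "Subtitle 2", "Subtitle 3"]  # Placeholder
        return subtitle_units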

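    def create_srt(self, subtitle_units, cue_seconds=5):
        # Generate SRT text: cue number, time range, subtitle text, blank line.
        # Timing is a placeholder: each cue gets a consecutive fixed-length slot
        # (cue_seconds) so cues do not overlap. Replace with real timestamps.
        def fmt(seconds):
            hours, rest = divmod(seconds, 3600)
            minutes, secs = divmod(rest, 60)
            return f"{hours:02d}:{minutes:02d}:{secs:02d},000"
        srt_content = ""
        for i, subtitle in enumerate(subtitle_units, start=1):
            start, end = (i - 1) * cue_seconds, i * cue_seconds
            srt_content += f"{i}\n"
            srt_content += f"{fmt(start)} --> {fmt(end)}\n"
            srt_content += f"{subtitle}\n\n"
        return srt_content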


if __name__ == "__main__":
    generator = SubtitleGenerator()
    video_file = "example_video.mp4"  # replace with the path to your video file
    text_file = "example_text.txt"  # replace with the path to your transcript file
    for language in generator.languages:
        generator.generate_srt(video_file, text_file, language)
