Go Hardcoded Subtitle Translator

Detects hardcoded subtitles and other text in video and outputs them as a subtitle file (currently WebVTT is supported). It can also output a translated version instead.
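As a rough illustration of the WebVTT output mentioned above, the sketch below renders a list of cues into a WebVTT document. The `Cue` type and the `vttTimestamp` and `renderVTT` helpers are hypothetical names chosen for this example; they are not taken from the project's source.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Cue is a hypothetical in-memory subtitle entry; the name and fields are
// illustrative assumptions, not the project's actual types.
type Cue struct {
	Start, End time.Duration
	Text       string
}

// vttTimestamp formats a duration as a WebVTT timestamp (hh:mm:ss.mmm).
func vttTimestamp(d time.Duration) string {
	ms := d.Milliseconds()
	return fmt.Sprintf("%02d:%02d:%02d.%03d", ms/3600000, ms/60000%60, ms/1000%60, ms%1000)
}

// renderVTT builds a WebVTT document: the "WEBVTT" header, a blank line,
// then one timing line plus text per cue, each followed by a blank line.
func renderVTT(cues []Cue) string {
	var b strings.Builder
	b.WriteString("WEBVTT\n\n")
	for _, c := range cues {
		fmt.Fprintf(&b, "%s --> %s\n%s\n\n", vttTimestamp(c.Start), vttTimestamp(c.End), c.Text)
	}
	return b.String()
}

func main() {
	fmt.Print(renderVTT([]Cue{
		{Start: 1500 * time.Millisecond, End: 3500 * time.Millisecond, Text: "Hello"},
	}))
}
```

Writing the returned string to the configured output file would then yield a file that WebVTT-aware players can load.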

OCR is done using Google's Video Intelligence API; translation uses Google Translate or Naver's Papago.

Mostly tested with detecting Korean (using Hangul as the script filter) and translating to English, but all languages should be supported, as long as the chosen translation engine supports the language and Google's OCR can read it.
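To make the idea of a script filter concrete, here is a minimal sketch that checks whether a detected string contains any rune from a named Unicode script, using the standard library's `unicode.Scripts` table. The `containsScript` helper is a hypothetical name, and the "at least one rune" rule is an assumption for illustration; the project may apply a different criterion.

```go
package main

import (
	"fmt"
	"unicode"
)

// containsScript reports whether s has at least one rune belonging to the
// named Unicode script (for example "Hangul" or "Latin"). Unknown script
// names yield false. The matching rule is an assumption for this sketch.
func containsScript(s, script string) bool {
	table, ok := unicode.Scripts[script]
	if !ok {
		return false
	}
	for _, r := range s {
		if unicode.Is(table, r) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(containsScript("안녕하세요", "Hangul")) // Hangul text passes the filter
	fmt.Println(containsScript("Hello", "Hangul"))  // Latin-only text is filtered out
}
```

Because the script name is looked up at run time, the same helper would work for other scripts listed in `unicode.Scripts` without code changes.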

To use, copy config.json.example to config.json and fill in the details. A Google API key is required; Naver API credentials are required only if Naver is used for translation.

Config

The config looks like this. Comments are added here to explain each value; to use the config, simply copy the example config provided.

{
    "naver": {
        "clientId": "id",
        "clientSecret": "secret",
        "endpoint": "https://openapi.naver.com/v1/papago/n2mt"
    },
    "google": {
        "apiKey": "path"
    },
    "settings": {
        "detection": {
            "language": {
                // Determines if text is filtered based on script
                "filterScript": true,
                "script": "Hangul",
                // Determines if text is filtered based on language
                "filterLanguage": true,
                "language": "Korean",
                // Determines if we give the OCR language hints
                "detectLanguage": false,
                // BCP-47 format for language hints
                "languageHints": ["ko-KR"]
            },
            "subtitleLocation": {
                // Determines if we filter out text not within a given box
                "restrictLocation": true,
                // Box size in %. Box size is from top to bottom, left to right, 0 to 100%
                "top": 80,
                "bottom": 100,
                "left": 0,
                "right": 100
            },
            // Confidence threshold, filter out any text with a confidence below this
            "confidence": 75
        },
        "translation": {
            // Determines if we translate the text or not
            "translate": true,
            // What translation service to use, currently supported: Google, Naver (Naver = Papago)
            "engine": "Google",
            // Languages in ISO-639-1 format
            "sourceLanguage": "ko",
            "targetLanguage": "en"
        },
        "fixSubtitles": {
            // Determines if we attempt to do some fixes on the subtitles (highly recommended)
            "fix": true,
            // Determines if whitespace should be taken into consideration when matching duplicate text (ignore = recommended)
            "ignoreWhitespace": true,
            // Determines minimum subtitle duration, although duration could be shorter if next 
            // subtitle starts before this minimum amount (in milliseconds)
            "minimumDuration": 2000,
            // Determines if duplicate text can be matched based on x% of words matching
            "partialMatch": true,
            "partialMatchPercentage": 50
        },        
        "inputFile": "video/input.mp4",
        "outputFile": "output/output.vtt"
    }
}
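
To show how part of this config might be consumed, the sketch below decodes a subset of the fields with `encoding/json` and applies the `subtitleLocation` box to a point given in percent. The struct layout, `parseConfig`, and `Contains` are hypothetical and written for this example only; the sketch also assumes the real config.json contains no comments, since `encoding/json` rejects the `//` comments shown in the listing above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Location mirrors settings.detection.subtitleLocation; values are percentages
// measured from the top and left edges of the frame (0 to 100).
type Location struct {
	RestrictLocation bool    `json:"restrictLocation"`
	Top              float64 `json:"top"`
	Bottom           float64 `json:"bottom"`
	Left             float64 `json:"left"`
	Right            float64 `json:"right"`
}

// Config covers only a subset of the documented fields (illustrative layout).
type Config struct {
	Settings struct {
		Detection struct {
			SubtitleLocation Location `json:"subtitleLocation"`
			Confidence       float64  `json:"confidence"`
		} `json:"detection"`
		InputFile  string `json:"inputFile"`
		OutputFile string `json:"outputFile"`
	} `json:"settings"`
}

// parseConfig decodes comment-free JSON into a Config.
func parseConfig(data []byte) (Config, error) {
	var c Config
	err := json.Unmarshal(data, &c)
	return c, err
}

// Contains reports whether the point (xPct, yPct) lies inside the box.
// When restrictLocation is false, every point is accepted.
func (l Location) Contains(xPct, yPct float64) bool {
	if !l.RestrictLocation {
		return true
	}
	return xPct >= l.Left && xPct <= l.Right && yPct >= l.Top && yPct <= l.Bottom
}

// sample repeats the relevant values from the config above, without comments.
const sample = `{"settings":{"detection":{"subtitleLocation":{"restrictLocation":true,
"top":80,"bottom":100,"left":0,"right":100},"confidence":75},
"inputFile":"video/input.mp4","outputFile":"output/output.vtt"}}`

func main() {
	cfg, err := parseConfig([]byte(sample))
	if err != nil {
		panic(err)
	}
	loc := cfg.Settings.Detection.SubtitleLocation
	fmt.Println(loc.Contains(50, 90)) // inside the bottom 20% of the frame
	fmt.Println(loc.Contains(50, 40)) // middle of the frame, outside the box
}
```

With the example values, only text in the bottom fifth of the frame would pass the location filter, which matches where hardcoded subtitles usually appear.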

Run

To build, simply run:

$ go build

Then to run:

$ ./go-video-intel
