# Generating subtitles with `autosub`

Since `autosub` does not have a Python interface, we will run it directly from the command line.

Here we run it via `poetry` because we are in a `poetry` virtual environment; otherwise you would call `autosub` directly.

## Parameters

* `-S`: language code/tag for speech-to-text. The language codes from the Google Cloud Speech (default) reference are recommended.
* `-i`: input file

IMPORTANT: languages are given as `{LANG}-{COUNTRY}`, e.g. `it-it` (Italian), `uk-ua` (Ukrainian), etc.

See also `autosub --help`.
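If you would rather not rely on the notebook `!` shell escape, the same command can be launched from Python with `subprocess`. This is a minimal sketch: the helper names `build_autosub_command` and `run_autosub` are made up for illustration, and only the `-S` and `-i` flags described above are used.

```python
import shutil
import subprocess


def build_autosub_command(input_file, language, use_poetry=True):
    """Return the argument list for the autosub call used in this notebook."""
    command = ["autosub", "-S", language, "-i", str(input_file)]
    if use_poetry:
        # Prefix with `poetry run` when working inside a poetry environment.
        command = ["poetry", "run"] + command
    return command


def run_autosub(input_file, language, use_poetry=True):
    """Run autosub; raises CalledProcessError if it exits with a non-zero code."""
    command = build_autosub_command(input_file, language, use_poetry)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]!r} was not found on PATH")
    return subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.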

In [1]:
from pathlib import Path

In [2]:
# I did not commit this file for copyright reasons; feel free to test with your own files
FILENAME = 'example.mkv'
LANGUAGE = 'it-it'  # make sure to use localized language codes (includes the country code)

In [3]:
!poetry run autosub -S {LANGUAGE} -i {FILENAME}

Translation destination language not provided. Only performing speech recognition.
Override "-of"/"--output-files" due to your args too few.
Output source subtitles file only.

Convert source file to "/tmp/tmps4jar9qf.wav" to detect audio regions.
/usr/bin/ffmpeg -hide_banner -y -i "example.mkv" -vn -ac 1 -ar 48000 -loglevel error "/tmp/tmps4jar9qf.wav"

Use ffprobe to check conversion result.
/usr/bin/ffprobe "/tmp/tmps4jar9qf.wav" -show_format -pretty -loglevel quiet
[FORMAT]
filename=/tmp/tmps4jar9qf.wav
nb_streams=1
nb_programs=0
format_name=wav
format_long_name=WAV / WAVE (Waveform Audio)
start_time=N/A
duration=0:00:30.267229
size=2.771122 Mibyte
bit_rate=768.020000 Kbit/s
probe_score=99
TAG:encoder=Lavf58.76.100
[/FORMAT]

Conversion completed.
Use Auditok to detect speech regions.
Auditok detection completed.
"/tmp/tmps4jar9qf.wav" has been deleted.

Converting speech regions to short-term fragments.
Converting: 100% |########################################

In [4]:
# Derive the output file name. Note: the `-o` option for customizing the output
# file name still appends a language suffix, so it does not really help here.
FILENAME_OBJ = Path(FILENAME)
OUTPUT_FILENAME = FILENAME_OBJ.with_stem(FILENAME_OBJ.stem + '.' + LANGUAGE).with_suffix('.srt')
OUTPUT_FILENAME

PosixPath('example.it-it.srt')
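The same computation can be wrapped in a small reusable function. The sketch below assumes the `{stem}.{language}.srt` naming seen in this run holds in general, which has not been checked against other `autosub` options; `expected_srt_path` is a hypothetical helper name. It uses `Path.with_name` rather than `Path.with_stem`, since the latter requires Python 3.9 or newer.

```python
from pathlib import Path


def expected_srt_path(input_file, language):
    """Guess the subtitle path autosub writes for a given input and language.

    Assumes the `{stem}.{language}.srt` pattern observed in this notebook.
    """
    path = Path(input_file)
    # Replace the whole file name, keeping the parent directory unchanged.
    return path.with_name(f"{path.stem}.{language}.srt")
```

For example, `expected_srt_path('example.mkv', 'it-it')` gives the same `example.it-it.srt` path as the cell above.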

In [5]:
!cat {OUTPUT_FILENAME}

1
00:00:00,880 --> 00:00:03,190


2
00:00:03,820 --> 00:00:07,170


3
00:00:10,710 --> 00:00:11,320


4
00:00:12,490 --> 00:00:16,420
Qui agente Sora Lella chiedo immediatamente i rinforzi

5
00:00:16,510 --> 00:00:26,500


6
00:00:26,510 --> 00:00:28,550



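Several cues in the output above have a time range but no text, presumably because speech recognition returned nothing for those regions. A rough way to inspect the result from Python is to split the file into cues and keep only those with text. This is a sketch, not a complete SRT parser: `parse_srt` and `non_empty_cues` are hypothetical helpers, and the code assumes the simple index / timing / text layout shown above.

```python
import re


def parse_srt(text):
    """Split SRT text into a list of cue dicts (index, start, end, text)."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 2 or "-->" not in lines[1]:
            continue  # skip anything that does not look like a cue
        start, end = (part.strip() for part in lines[1].split("-->"))
        cues.append({
            "index": lines[0].strip(),
            "start": start,
            "end": end,
            "text": "\n".join(lines[2:]).strip(),
        })
    return cues


def non_empty_cues(cues):
    """Keep only the cues that actually contain recognized text."""
    return [cue for cue in cues if cue["text"]]
```

With the notebook variables, this could be used as `non_empty_cues(parse_srt(OUTPUT_FILENAME.read_text(encoding='utf-8')))`, assuming the file is UTF-8 encoded.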