
Video Transcription for the SubtitleSwitcher Plugin

Daniel Neto edited this page May 14, 2024 · 6 revisions

SubtitleSwitcher is a plugin for the AVideo platform that lets users upload or manually create subtitles in .srt and .vtt format. It also integrates with Vosk, an offline speech recognition toolkit, to transcribe video and audio automatically. This tutorial walks through installing and configuring the SubtitleSwitcher plugin and the Vosk transcriber.

SubtitleSwitcher Plugin Options

Upon installation of the SubtitleSwitcher plugin, several options are available for customization:

subtitleDivStyle: Defines the appearance of the subtitle section.

disableSubtitleSettings: If enabled, the user won't be able to modify subtitle settings.

disableUTF8Encode: Turns off UTF-8 encoding for subtitles.

addClosedCaptionIcons: Adds icons to the closed captions.

defaultSubtitleIsOff: Defines if subtitles are turned off by default.

defaultSubtitleLanguage: Sets the default subtitle language.

defaultSubtitleLanguageUseUserLocation: If enabled, the default subtitle language is based on the user's location.

automaticTranscript: If enabled, the plugin uses the Vosk transcriber to generate transcriptions automatically.

automaticTranscriptModelPath: Sets the file path for the Vosk language model.
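
The two transcription options depend on each other: enabling automaticTranscript is only useful when automaticTranscriptModelPath points at a model that exists. The sketch below illustrates that check in Python. It is not part of the plugin; the function name `check_transcript_options`, the dictionary layout, and the example values are hypothetical, and it assumes the model path refers to an extracted model directory.

```python
import os


def check_transcript_options(options):
    """Return a list of problems found in the transcription-related options."""
    problems = []
    if options.get("automaticTranscript"):
        model_path = options.get("automaticTranscriptModelPath", "")
        if not model_path:
            problems.append("automaticTranscriptModelPath is empty")
        elif not os.path.isdir(model_path):
            # Assumption: the option points at an extracted model directory.
            problems.append("model directory not found: " + model_path)
    return problems


# Hypothetical values, for illustration only.
example_options = {
    "defaultSubtitleIsOff": False,
    "defaultSubtitleLanguage": "en",
    "automaticTranscript": True,
    "automaticTranscriptModelPath": "/opt/vosk/vosk-model-small-en-us-0.15",
}

if __name__ == "__main__":
    for problem in check_transcript_options(example_options):
        print(problem)
```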

Once the plugin is enabled, two new options appear in the video manager: 'Create Subtitle' and 'Upload Subtitle'. 'Upload Subtitle' lets you upload a .srt or .vtt file. 'Create Subtitle' opens an editor where you can set the start and end times for each subtitle segment and edit its text.
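
For reference, each segment in a .srt or .vtt file is a start time, an end time, and the text shown between them. The Python sketch below shows how one segment is written in each format. The helper names are hypothetical and this is not the plugin's own code. The only difference in the timing line is the decimal mark: a comma in .srt and a period in .vtt. A .vtt file also starts with a `WEBVTT` header line.

```python
def format_timestamp(seconds, decimal_mark):
    """Format a time in seconds as HH:MM:SS<mark>mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3600000)
    minutes, rest = divmod(rest, 60000)
    secs, ms = divmod(rest, 1000)
    return "%02d:%02d:%02d%s%03d" % (hours, minutes, secs, decimal_mark, ms)


def srt_cue(index, start, end, text):
    """Build one numbered .srt segment."""
    timing = "%s --> %s" % (format_timestamp(start, ","), format_timestamp(end, ","))
    return "%d\n%s\n%s\n" % (index, timing, text)


def vtt_cue(start, end, text):
    """Build one .vtt segment (the file itself must begin with 'WEBVTT')."""
    timing = "%s --> %s" % (format_timestamp(start, "."), format_timestamp(end, "."))
    return "%s\n%s\n" % (timing, text)


if __name__ == "__main__":
    print(srt_cue(1, 0.0, 2.5, "Hello"))
    print("WEBVTT\n\n" + vtt_cue(0.0, 2.5, "Hello"))
```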


The transcriber

Vosk is an offline speech recognition toolkit that supports multiple languages. Vosk-transcriber is a Python package that uses the Vosk library to transcribe audio files. Installing it on Ubuntu lets the SubtitleSwitcher plugin transcribe video and audio automatically, so you can generate subtitles in various languages without an internet connection. The steps below install Vosk-transcriber on Ubuntu.
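
To give a feel for how recognized speech becomes subtitles, the sketch below groups timed words into subtitle segments. It assumes word entries with `word`, `start`, and `end` keys, which is the shape Vosk reports when word timing is enabled. The function name `group_words` and the `max_words` and `max_gap` thresholds are hypothetical choices for illustration, not the settings used by Vosk-transcriber or the plugin.

```python
def _to_cue(chunk):
    """Collapse a run of word entries into (start, end, text)."""
    text = " ".join(entry["word"] for entry in chunk)
    return (chunk[0]["start"], chunk[-1]["end"], text)


def group_words(words, max_words=7, max_gap=1.0):
    """Group timed words into cues, splitting on long pauses or long lines."""
    cues = []
    current = []
    for entry in words:
        if current:
            pause = entry["start"] - current[-1]["end"]
            if len(current) >= max_words or pause > max_gap:
                cues.append(_to_cue(current))
                current = []
        current.append(entry)
    if current:
        cues.append(_to_cue(current))
    return cues


if __name__ == "__main__":
    sample = [
        {"word": "hello", "start": 0.0, "end": 0.4},
        {"word": "world", "start": 0.5, "end": 0.9},
        {"word": "again", "start": 3.0, "end": 3.4},
    ]
    for cue in group_words(sample):
        print(cue)
```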

Update your system

Before installing new packages, it's a good practice to update your system. Open a terminal and run the following command:

sudo apt update && sudo apt upgrade

Python installation from PyPI

The easiest way to install the Vosk API is with pip. You do not have to compile anything.

Make sure you have up-to-date pip and python3 versions:

  • Python version: 3.5-3.9
  • pip version: 20.3 and newer.
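
The requirements above can be checked with a short script. This is a minimal sketch: the constants simply mirror the versions listed on this page, and the function names are hypothetical. Run it with the same `python3` interpreter that `pip3` installs into.

```python
import sys

# Version bounds copied from the requirements listed above.
MIN_PYTHON = (3, 5)
MAX_PYTHON = (3, 9)
MIN_PIP = (20, 3)


def python_in_range(version_info):
    """True if the major.minor Python version is within the listed range."""
    return MIN_PYTHON <= tuple(version_info[:2]) <= MAX_PYTHON


def pip_new_enough(version_string):
    """True if a pip version string such as '23.3.1' is at least 20.3."""
    parts = version_string.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 and parts[1].isdigit() else 0
    return (major, minor) >= MIN_PIP


if __name__ == "__main__":
    print("Python in listed range:", python_in_range(sys.version_info))
    try:
        import pip
        print("pip new enough:", pip_new_enough(pip.__version__))
    except ImportError:
        print("pip is not installed for this interpreter")
```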

Upgrade Python and pip if needed. Then install vosk on Linux/Mac from pip:

pip3 install vosk

For more information, see https://alphacephei.com/vosk/install

Download a Vosk language model

Vosk requires a language model to perform speech recognition. Download a pre-built language model from the Vosk website (https://alphacephei.com/vosk/models) or the GitHub repository (https://github.com/alphacep/vosk-api/blob/master/doc/models.md). For example, to download the English language model, run:

wget https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip

Extract the language model

Unzip the downloaded language model:

unzip vosk-model-small-en-us-0.15.zip
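
If you prefer to script this step, the sketch below extracts the archive with Python's standard library and returns the absolute path of the extracted model folder, which is the kind of value the automaticTranscriptModelPath option expects. It assumes the archive contains a single top-level folder; the function name `extract_model` and the destination directory are hypothetical.

```python
import os
import zipfile


def extract_model(zip_path, dest_dir):
    """Extract a model archive and return the absolute path of its top folder."""
    with zipfile.ZipFile(zip_path) as archive:
        # Collect the first path component of every entry in the archive.
        top_levels = {name.split("/")[0] for name in archive.namelist() if name.strip("/")}
        if len(top_levels) != 1:
            raise ValueError("expected a single top-level folder in the archive")
        archive.extractall(dest_dir)
    return os.path.abspath(os.path.join(dest_dir, top_levels.pop()))


if __name__ == "__main__":
    archive_name = "vosk-model-small-en-us-0.15.zip"
    if os.path.exists(archive_name):
        print(extract_model(archive_name, "."))
    else:
        print("Archive not found; download it first.")
```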