Whisper Speech-To-Text

A Speech-To-Text provider for Nextcloud that runs OpenAI Whisper locally on the CPU.

The model runs completely on your machine. No private data leaves your servers.

Requirements

Architecture: x86-64 with AVX support
OS: Linux
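
To check the AVX requirement before installing, a minimal sketch like the following may help. It assumes a Linux host where CPU flags are listed in /proc/cpuinfo; the helper name `has_avx` is hypothetical and not part of this app.

```shell
#!/bin/sh
# Report whether a cpuinfo-style file lists the "avx" CPU flag.
# Assumption: the default path /proc/cpuinfo exists (Linux only).
has_avx() {
    cpuinfo_file="${1:-/proc/cpuinfo}"
    # -w matches "avx" as a whole word, so "avx2" alone does not count.
    grep -qw 'avx' "$cpuinfo_file" 2>/dev/null
}

if has_avx; then
    echo "AVX supported"
else
    echo "AVX not detected (or /proc/cpuinfo is unavailable)"
fi
```

The architecture itself can be confirmed separately with `uname -m`, which prints `x86_64` on matching systems.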

Model sizes

Small: 500MB
Medium: 1.5GB
Large: 3.1GB

Ethical AI Rating

Rating: 🟡

Positive:

the software for training and inference of this model is open source
the trained model is freely available, and thus can be run on-premises

Negative:

the training data is not freely available, limiting the ability of external parties to check and correct for bias or optimise the model’s performance and CO2 usage.

Learn more about the Nextcloud Ethical AI Rating in our blog.

Install

Manual install
- Place this app in nextcloud/apps/

One click install
- Install from the Nextcloud Appstore

Download models

After installing this app, you will need to run:

occ stt_whisper:download-models [model-name]

where [model-name] is one of:

small
medium (default)
large
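
As a sketch of how the model name could be validated in a script before calling occ, the helper below accepts only the three names listed above and falls back to medium when none is given. The function name `resolve_model` is hypothetical and not part of this app.

```shell
#!/bin/sh
# Print a valid model name, defaulting to "medium" when no argument is given.
# Returns non-zero and prints an error to stderr for any other name.
resolve_model() {
    model="${1:-medium}"
    case "$model" in
        small|medium|large) echo "$model" ;;
        *) echo "unknown model: $model" >&2; return 1 ;;
    esac
}
```

On a typical installation this might be used as `model=$(resolve_model small) && sudo -u www-data php occ stt_whisper:download-models "$model"`, run from the Nextcloud root directory. The `www-data` user and the `php occ` invocation are assumptions about the setup and may differ on your server.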

Building the app

Build the app with the provided Makefile by running:

make

This requires the following tools to be installed:

make
which
tar: for building the archive
curl: used if phpunit and composer are not installed to fetch them from the web
npm: for building and testing everything JS, only required if a package.json is placed inside the js/ folder
gcc: for building whisper.cpp
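
Before running make, the tools listed above can be checked with a short script. This is a minimal sketch; the helper name `missing_tools` is hypothetical, and it only tests whether each command is on the PATH, not which version is installed.

```shell
#!/bin/sh
# Print each named tool that is not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools make which tar curl npm gcc)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing build tools:" $missing
else
    echo "All build tools found"
fi
```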

NOTE

A few things to keep in mind.

Transcriptions need to be enabled in the Talk app if you want calls to be transcribed by any Speech-To-Text provider (including this app). Enable them with this occ command:

occ config:app:set spreed call_recording_transcription --value yes

This app tends to be heavy on CPU. If that becomes an issue in your normal workflow, you can limit the number of threads used by Whisper in the "Whisper Speech-To-Text" section of the admin settings.
The accuracy of the generated transcriptions may vary with the spoken language.
Per-participant transcription in calls is currently not available, but PRs are welcome!
