stlr-core

Overview

stlrcore is a toolkit that wraps whisper-timestamped and stable-whisper, aiming to provide a more convenient interface to them. It serves as the foundation for stlrapps, a suite of tools for tasks such as automatic subtitle generation.

Installation

Usage

from pathlib import Path
from typing import Iterator

from stlrcore import Transcription
from stlrcore.transcribe import WordTiming, Segment

# Create a Transcription from an audio file
transcription = Transcription.from_audio("path/to/audio.ext")

# Transcription objects are iterable, and they iterate over WordTimings
word_timings: Iterator[WordTiming] = iter(transcription)

# For just the actual words themselves:
words: list[str] = transcription.words
text: str = str(transcription)

# You can also create Segments, which are consecutive words without pauses.
segments: Iterator[Segment] = transcription.get_segments(tolerance=0.0)

# Find the timing for a particular substring of words
segment: Segment = transcription.get_fragment(fragment="...")

# Transcriptions can be exported as JSON, an Audacity cue file, or an Audition cue file
transcription.export(filestem="transcription", mode="json")  # -> transcription.json
transcription.export(filestem="transcription", mode="audacity")  # -> transcription.txt
transcription.export(filestem="transcription", mode="audition")  # -> transcription.csv

# Similarly, Transcriptions can be created from these exported files:
transcription = Transcription.from_json(Path("transcription.json"))
transcription = Transcription.from_audacity_cue(Path("transcription.txt"))
transcription = Transcription.from_audition_cue(Path("transcription.csv"))

# A convenience wrapper around these is also provided:
# `mode` should be one of "audio", "json", "audacity", "audition"
transcription = Transcription.load(Path(...), mode=...)

# Because Transcriptions are built from individual words rather than segments, direct
# export to SRT is not supported. It may be preferable to determine the proper segment
# split points yourself (manually or otherwise), but the following can be used:
Transcription.write_srt(segments=transcription.get_segments(), dest="transcription.srt")
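
To make the idea of segments and SRT output more concrete, here is a minimal standalone sketch that does not use stlrcore at all. It assumes that a segment is a run of words whose gaps do not exceed a tolerance in seconds, which may differ from how get_segments actually interprets tolerance. The names Word, group_words, srt_timestamp, and to_srt are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical stand-in for a word timing; not stlrcore's WordTiming."""
    text: str
    start: float  # seconds
    end: float  # seconds


def group_words(words: list[Word], tolerance: float = 0.0) -> list[list[Word]]:
    """Start a new group whenever the gap to the previous word exceeds `tolerance`."""
    groups: list[list[Word]] = []
    for word in words:
        if groups and word.start - groups[-1][-1].end <= tolerance:
            groups[-1].append(word)
        else:
            groups.append([word])
    return groups


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(groups: list[list[Word]]) -> str:
    """Render each group as one numbered SRT cue, separated by blank lines."""
    cues = []
    for index, group in enumerate(groups, start=1):
        start = srt_timestamp(group[0].start)
        end = srt_timestamp(group[-1].end)
        text = " ".join(word.text for word in group)
        cues.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


# Illustrative data only: two words back to back, then a pause before the third.
words = [
    Word("hello", 0.0, 0.5),
    Word("there", 0.5, 1.0),
    Word("friend", 2.25, 2.75),
]
print(to_srt(group_words(words, tolerance=0.25)))
```

With a tolerance of 0.25 seconds, the first two words share one cue and the third gets its own, because the 1.25 second pause before it exceeds the tolerance. Raising the tolerance merges more words into each cue, which is one way to experiment with split points before settling on them.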
