🌐 Official Website | 🖥️ GitHub | 🤗 Model | 📑 Blog
Advanced forced alignment and subtitle generation powered by the Lattice-1-Alpha model.
```bash
pip install install-k2
# The installation will automatically detect and use your already installed PyTorch version.
install-k2  # Install k2
pip install lattifai
```

⚠️ Important: You must run `install-k2` before using the lattifai library.
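To confirm the setup, check that k2 imports cleanly before using lattifai (a minimal sanity check; an ImportError here means `install-k2` has not been run yet):

```python
# Sanity check: k2 must be importable before lattifai can run alignment
import k2

print("k2 loaded from:", k2.__file__)
```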
```bash
# Align audio with subtitle
lattifai align audio.wav subtitle.srt output.srt

# Convert subtitle format
lattifai subtitle convert input.srt output.vtt
```
```
> lattifai align --help
Usage: lattifai align [OPTIONS] INPUT_AUDIO_PATH INPUT_SUBTITLE_PATH OUTPUT_SUBTITLE_PATH

  Command used to align audio with subtitles.

Options:
  -F, --input_format [srt|vtt|ass|txt|auto]  Input subtitle format.
  -D, --device [cpu|cuda|mps]                Device to use for inference.
  --split_sentence                           Smart sentence splitting based on
                                             punctuation semantics.
  --help                                     Show this message and exit.
```
The `--split_sentence` option performs intelligent sentence re-splitting based on punctuation and semantic boundaries. This is especially useful when processing subtitles that combine multiple semantic units in a single segment, such as:

- Mixed content: non-speech elements (e.g., `[APPLAUSE]`, `[MUSIC]`) followed by actual dialogue
- Natural punctuation boundaries: colons, periods, and other punctuation marks that indicate semantic breaks
- Concatenated phrases: multiple distinct utterances joined together without proper separation
Example transformations:

```
Input:  "[APPLAUSE] >> MIRA MURATI: Thank you all"
Output: ["[APPLAUSE]", ">> MIRA MURATI: Thank you all"]

Input:  "[MUSIC] Welcome back. Today we discuss AI."
Output: ["[MUSIC]", "Welcome back.", "Today we discuss AI."]
```
This feature helps improve alignment accuracy by:
- Respecting punctuation-based semantic boundaries
- Separating distinct utterances for more precise timing
- Maintaining semantic context for each independent phrase
Usage:

```bash
lattifai align --split_sentence audio.wav subtitle.srt output.srt
```
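The same re-splitting is available from the Python API via the `split_sentence` parameter (see the API reference below):

```python
from lattifai import LattifAI

client = LattifAI()

# Re-split combined segments such as "[APPLAUSE] >> MIRA MURATI: Thank you all"
# into separately timed units before aligning
client.alignment(
    audio="audio.wav",
    subtitle="subtitle.srt",
    split_sentence=True,
    output_subtitle_path="output.srt",
)
```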
```python
from lattifai import LattifAI

# Initialize client
client = LattifAI(
    api_key=None,  # optional; falls back to the LATTIFAI_API_KEY environment variable
    model_name_or_path='Lattifai/Lattice-1-Alpha',
    device='cpu',  # 'cpu', 'cuda', or 'mps'
)

# Perform alignment
result = client.alignment(
    audio="audio.wav",
    subtitle="subtitle.srt",
    split_sentence=False,
    output_subtitle_path="output.srt"
)
```
- Audio: WAV, MP3, FLAC, M4A, OGG
- Subtitle: SRT, VTT, ASS, TXT (plain text)
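Formats can be mixed freely; for example, MP3 audio aligned against a VTT subtitle (file names here are illustrative):

```python
from lattifai import LattifAI

client = LattifAI()
client.alignment(
    audio="podcast.mp3",      # any supported audio format
    subtitle="captions.vtt",  # subtitle format auto-detected when format is None
    output_subtitle_path="captions_aligned.vtt",
)
```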
```python
LattifAI(
    api_key: Optional[str] = None,
    model_name_or_path: str = 'Lattifai/Lattice-1-Alpha',
    device: str = 'cpu'  # 'cpu', 'cuda', or 'mps'
)
```

```python
client.alignment(
    audio: str,                                 # Path to audio file
    subtitle: str,                              # Path to subtitle/text file
    format: Optional[str] = None,               # 'srt', 'vtt', 'ass', 'txt' (auto-detect if None)
    split_sentence: bool = False,               # Smart sentence splitting based on punctuation semantics
    output_subtitle_path: Optional[str] = None
) -> str
```
Parameters:

- `audio`: Path to the audio file to be aligned
- `subtitle`: Path to the subtitle or text file
- `format`: Subtitle format ('srt', 'vtt', 'ass', 'txt'). Auto-detected if None
- `split_sentence`: Enable intelligent sentence re-splitting (default: False). Set to True when subtitles combine multiple semantic units (non-speech elements plus dialogue, or multiple sentences) that would benefit from separate timing alignment
- `output_subtitle_path`: Output path for the aligned subtitle (optional)
```python
client = LattifAI()
client.alignment(
    audio="speech.wav",
    subtitle="transcript.txt",
    format="txt",
    split_sentence=False,
    output_subtitle_path="output.srt"
)
```
```python
from pathlib import Path

client = LattifAI()

audio_dir = Path("audio_files")
subtitle_dir = Path("subtitles")
output_dir = Path("aligned")
output_dir.mkdir(parents=True, exist_ok=True)  # ensure the output directory exists

for audio in audio_dir.glob("*.wav"):
    subtitle = subtitle_dir / f"{audio.stem}.srt"
    if subtitle.exists():
        client.alignment(
            audio=audio,
            subtitle=subtitle,
            output_subtitle_path=output_dir / f"{audio.stem}_aligned.srt"
        )
```
```python
# NVIDIA GPU
client = LattifAI(device='cuda')

# Apple Silicon
client = LattifAI(device='mps')
```

```bash
# CLI
lattifai align --device mps audio.wav subtitle.srt output.srt
```
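To pick the best available device at runtime, standard PyTorch capability checks can be used; a minimal sketch (the `best_device` helper is illustrative, not part of lattifai):

```python
import torch

from lattifai import LattifAI

def best_device() -> str:
    # Prefer CUDA, then Apple Silicon MPS, then fall back to CPU
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"

client = LattifAI(device=best_device())
```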
First, create your API key at https://lattifai.com/dashboard/api-keys

Recommended: Using a `.env` file

Create a `.env` file in your project root:

```
LATTIFAI_API_KEY=your-api-key
```

The library automatically loads the `.env` file (python-dotenv is included as a dependency).

Alternative: Environment variable

```bash
export LATTIFAI_API_KEY="your-api-key"
```
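You can also pass the key explicitly when constructing the client, for example after reading it yourself:

```python
import os

from lattifai import LattifAI

# An explicit key takes the place of .env/environment auto-loading
client = LattifAI(api_key=os.environ["LATTIFAI_API_KEY"])
```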
Lattice-1-Alpha features:
- State-of-the-art alignment precision
- Language Support: Currently supports English only. The upcoming Lattice-1 release will support English, Chinese, and mixed English-Chinese content.
- Handles noisy audio and imperfect transcripts
- Optimized for CPU and GPU (CUDA/MPS)
Requirements:
- Python 3.9+
- 4GB RAM recommended
- ~2GB storage for model files
```bash
git clone https://github.com/lattifai/lattifai-python.git
cd lattifai-python
pip install -e ".[test]"
./scripts/install-hooks.sh  # Optional: install pre-commit hooks
```

```bash
pytest                       # Run all tests
pytest --cov=src             # With coverage
pytest tests/test_basic.py   # Specific test
ruff check src/ tests/       # Lint
ruff format src/ tests/      # Format
isort src/ tests/            # Sort imports
```
- Fork the repository
- Create a feature branch
- Make changes and add tests
- Run `pytest` and `ruff check`
- Submit a pull request
Apache License 2.0
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Discord: Join our community