GitHub - seanghay/kfa: A fast Khmer Forced Aligner powered by Wav2Vec2CTC and Phonetisaurus

KFA

A fast Khmer Forced Aligner powered by Wav2Vec2CTC and Phonetisaurus.

Built-in Speech Enhancement
Word-level Alignment

pip install kfa

CLI

Note

The input audio (audio.wav) must have a 16 kHz sample rate. Use ffmpeg or another tool to resample it before processing.

ffmpeg -i audio_orig.wav -ac 1 -ar 16000 audio.wav
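To confirm that the resampled file matches what the ffmpeg command above requests, a quick check with Python's standard `wave` module can help. This is a minimal sketch, not part of kfa: `check_wav_format` is a hypothetical helper name, it only reads uncompressed PCM WAV files, and the mono check simply mirrors the `-ac 1` flag (the note above only states the 16 kHz requirement).

```python
import wave


def check_wav_format(path, expected_rate=16000, expected_channels=1):
    """Return (ok, sample_rate, channels) for a PCM WAV file.

    Hypothetical helper: `ok` is True only when both the sample rate
    and the channel count match the expected values.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    ok = rate == expected_rate and channels == expected_channels
    return ok, rate, channels


# Usage (assumes audio.wav exists in the working directory):
# ok, rate, channels = check_wav_format("audio.wav")
# if not ok:
#     print(f"Resample first: got {rate} Hz, {channels} channel(s)")
```

If the check fails, rerun the ffmpeg command on the original file before passing it to kfa.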

kfa -a audio.wav -t text.txt -o alignments.jsonl

# Output in Whisper-style JSON format
kfa -a audio.wav -t text.txt --format whisper -o alignments.json
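The `.jsonl` output of the first command can be loaded back into Python for inspection. The sketch below assumes the file follows the usual JSON Lines convention (one JSON object per line), as the extension suggests. The field names inside each record are not documented here, so the hypothetical helper `read_jsonl` returns the raw records without interpreting them.

```python
import json


def read_jsonl(path):
    """Parse a JSON Lines file into a list, one object per non-empty line."""
    records = []
    # Khmer text needs an explicit UTF-8 decode on platforms whose
    # default encoding is something else.
    with open(path, encoding="utf-8") as infile:
        for line in infile:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Usage (after running the first kfa command above):
# for record in read_jsonl("alignments.jsonl"):
#     print(record)
```

Printing a few records is a simple way to learn the actual schema before writing code that depends on particular keys.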

Python

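from kfa import align, create_session
import librosa

# Read the transcript to align (Khmer text, so decode as UTF-8).
with open("text.txt", encoding="utf-8") as infile:
    text = infile.read()

# Load the audio as 16 kHz mono, matching the required sample rate.
y, sr = librosa.load("audio.wav", sr=16000, mono=True)
session = create_session()

# Print each alignment returned for the transcript.
for alignment in align(y, sr, text, session=session):
    print(alignment)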

References

License

Apache-2.0

