We support importing Kaldi data directories that contain at least the wav.scp file, which is required to create the lhotse.audio.RecordingSet. Other files, such as segments, utt2spk, etc., are used to create the lhotse.supervision.SupervisionSet. We also support converting feats.scp to lhotse.features.base.FeatureSet, and reading features directly from Kaldi's scp/ark files via the kaldiio library (an optional Lhotse dependency).
We also support exporting a pair of lhotse.audio.RecordingSet and lhotse.supervision.SupervisionSet to a Kaldi data directory.
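To make the roles of these files concrete, here is a minimal sketch, using only the Python standard library, of how the text tables in a Kaldi data directory can be read. It is not Lhotse's importer: the helper names read_kaldi_table and summarize_kaldi_dir are hypothetical, and the sketch assumes the usual Kaldi layout (wav.scp maps a recording ID to a path or command, segments maps an utterance ID to a recording ID with start and end times in seconds, and utt2spk maps an utterance ID to a speaker ID).

```python
import tempfile
from pathlib import Path


def read_kaldi_table(path):
    """Read a Kaldi text table where each line is '<key> <value...>'."""
    table = {}
    for line in Path(path).read_text().splitlines():
        parts = line.strip().split(maxsplit=1)
        if not parts:
            continue  # skip empty lines
        table[parts[0]] = parts[1] if len(parts) > 1 else ""
    return table


def summarize_kaldi_dir(data_dir):
    """Hypothetical helper: collect recordings, segments, and speakers."""
    data_dir = Path(data_dir)
    # wav.scp is the only file assumed to be always present.
    recordings = read_kaldi_table(data_dir / "wav.scp")
    segments = {}
    if (data_dir / "segments").is_file():
        for utt_id, rest in read_kaldi_table(data_dir / "segments").items():
            rec_id, start, end = rest.split()
            segments[utt_id] = (rec_id, float(start), float(end))
    utt2spk = {}
    if (data_dir / "utt2spk").is_file():
        utt2spk = read_kaldi_table(data_dir / "utt2spk")
    return recordings, segments, utt2spk


# Build a tiny, made-up data directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    (d / "wav.scp").write_text("rec1 /audio/rec1.wav\n")
    (d / "segments").write_text("utt1 rec1 0.00 2.50\nutt2 rec1 2.50 4.00\n")
    (d / "utt2spk").write_text("utt1 spkA\nutt2 spkA\n")
    recordings, segments, utt2spk = summarize_kaldi_dir(d)
```

In this picture, the wav.scp entries correspond to what ends up in the recording manifest, while segments and utt2spk supply the per-utterance information that ends up in the supervision manifest.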
We currently do not support the following (but may start doing so in the future):
- Exporting Lhotse-extracted features to Kaldi's feats.scp
- Exporting Lhotse's multi-channel recording sets to Kaldi
We support Kaldi-compatible log-mel filter energies ("fbank") and MFCCs. We provide a PyTorch implementation that is GPU-compatible and supports batching and backpropagation. To learn more about feature extraction in Lhotse, see features.
Python methods related to Kaldi support:
lhotse.kaldi
Converting a Kaldi data directory called data/train, with recordings sampled at 16 kHz, to a directory with Lhotse manifests called train_manifests, and exporting the manifests back to a Kaldi data directory:
# Convert data/train to train_manifests/{recordings,supervisions}.json
lhotse kaldi import \
    data/train \
    16000 \
    train_manifests

# Convert train_manifests/{recordings,supervisions}.json to data/train
lhotse kaldi export \
    train_manifests/recordings.json \
    train_manifests/supervisions.json \
    data/train
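The same round trip can be sketched from Python using the lhotse.kaldi module. This is a hedged sketch, not a definitive recipe: it assumes that lhotse.kaldi exposes load_kaldi_data_dir and export_to_kaldi, that the loader returns a tuple whose first two elements are the recording set and the (possibly missing) supervision set, and that the manifests have a to_file method; check these against your installed Lhotse version. The wrapper name kaldi_round_trip and its parameters are hypothetical.

```python
from pathlib import Path


def kaldi_round_trip(kaldi_dir, manifest_dir, export_dir, sampling_rate=16000):
    """Hypothetical wrapper: import a Kaldi dir, save manifests, export back."""
    # Imported lazily so that defining this helper does not require Lhotse.
    from lhotse.kaldi import export_to_kaldi, load_kaldi_data_dir

    manifest_dir = Path(manifest_dir)
    manifest_dir.mkdir(parents=True, exist_ok=True)
    export_dir = Path(export_dir)
    export_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: the result is a tuple starting with (recordings, supervisions);
    # indexing avoids depending on whether a feature set is also returned.
    loaded = load_kaldi_data_dir(kaldi_dir, sampling_rate)
    recordings, supervisions = loaded[0], loaded[1]

    recordings.to_file(manifest_dir / "recordings.json")
    if supervisions is not None:
        supervisions.to_file(manifest_dir / "supervisions.json")
        # Exporting needs both manifests, mirroring the CLI example.
        export_to_kaldi(recordings, supervisions, export_dir)
    return recordings, supervisions
```

A call such as kaldi_round_trip("data/train", "train_manifests", "data/train_exported") would mirror the two commands above, writing the export to a separate directory so the original data is not overwritten.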