Caution
Lhotse datasets are still under active development and are subject to breaking changes.
We supply subclasses of torch.utils.data.Dataset
for various audio/speech tasks. These datasets are created from CutSet
objects and load the features from disk into memory on the fly. Each dataset accepts an optional root_dir
argument, which is used as a prefix for the paths to features and audio.
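To illustrate the pattern described above, here is a minimal sketch of a map-style dataset that loads items lazily and applies an optional root_dir prefix. It is not Lhotse code: the class name LazyFeatureDataset, the feature_paths argument, and the plain-text feature format are assumptions made for illustration, and the actual Lhotse datasets are built from CutSet objects instead of a list of paths.

```python
from pathlib import Path
from typing import List, Optional, Sequence

try:
    # The real base class, used when PyTorch is installed.
    from torch.utils.data import Dataset
except ImportError:
    # Stand-in so the sketch still runs without PyTorch (assumption for illustration).
    class Dataset:
        pass


class LazyFeatureDataset(Dataset):
    """Hypothetical map-style dataset; not part of Lhotse."""

    def __init__(self, feature_paths: Sequence[str], root_dir: Optional[str] = None):
        # Only the paths are stored; no features are read at construction time.
        self.feature_paths = list(feature_paths)
        self.root_dir = Path(root_dir) if root_dir is not None else None

    def __len__(self) -> int:
        return len(self.feature_paths)

    def _resolve(self, relative: str) -> Path:
        # Prefix the stored path with root_dir when one was given.
        if self.root_dir is not None:
            return self.root_dir / relative
        return Path(relative)

    def __getitem__(self, idx: int) -> List[float]:
        # Features are read from disk only when the item is requested.
        # The whitespace-separated text format is an assumption of this sketch.
        text = self._resolve(self.feature_paths[idx]).read_text()
        return [float(value) for value in text.split()]
```

Because the stored paths are relative, the same list of paths can be reused on another machine by changing only root_dir.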
Currently, we provide the following:
lhotse.dataset.diarization
lhotse.dataset.unsupervised
lhotse.dataset.speech_recognition
lhotse.dataset.speech_synthesis
lhotse.dataset.source_separation.DynamicallyMixedSourceSeparationDataset
lhotse.dataset.source_separation.PreMixedSourceSeparationDataset
lhotse.dataset.vad