Cuts

Overview

Audio cuts are one of the main Lhotse features. A cut is a part of a recording, but it can be longer than a supervision segment, or even span multiple segments. The regions without a supervision are just audio that we don't assume we know anything about: they may contain silence, noise, non-transcribed speech, etc. Task-specific datasets can leverage this information to generate masks for such regions.
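To make the idea of masking unsupervised regions concrete, here is a minimal sketch in plain Python. It does not use the Lhotse API: the ``Segment`` class, the ``supervision_mask`` function, and the ``frame_shift`` parameter are hypothetical names introduced only for this illustration, and times are assumed to be in seconds relative to the start of the cut.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for a supervision segment (seconds, relative to the cut)."""
    start: float
    duration: float

    @property
    def end(self) -> float:
        return self.start + self.duration


def supervision_mask(
    cut_duration: float, supervisions: List[Segment], frame_shift: float = 0.01
) -> List[int]:
    """Return one flag per frame: 1 where a supervision covers the frame, 0 elsewhere."""
    num_frames = round(cut_duration / frame_shift)
    mask = [0] * num_frames
    for sup in supervisions:
        # Clamp to the cut boundaries, since a supervision may only partially overlap the cut.
        first = max(0, round(sup.start / frame_shift))
        last = min(num_frames, round(sup.end / frame_shift))
        for i in range(first, last):
            mask[i] = 1
    return mask


# A 10-second cut with two supervised regions and unsupervised audio around them.
mask = supervision_mask(
    cut_duration=10.0,
    supervisions=[Segment(start=1.0, duration=2.0), Segment(start=6.0, duration=3.0)],
    frame_shift=1.0,  # one frame per second, to keep the example readable
)
print(mask)
```

Frames flagged with 0 correspond to audio about which nothing is assumed, so a task-specific dataset could, for example, exclude them from a loss computation.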

lhotse.cut.Cut

lhotse.cut.CutSet

Types of cuts

There are three cut classes: lhotse.cut.MonoCut, lhotse.cut.MixedCut, and lhotse.cut.PaddingCut. Each is described below in more detail:

lhotse.cut.MonoCut

lhotse.cut.MixedCut

lhotse.cut.PaddingCut

CLI

We provide a limited CLI for manipulating Lhotse manifests. Here are some examples of manipulations performed in the terminal:

# Reject segments shorter than 3 seconds.
lhotse filter 'duration>=3.0' cuts.jsonl cuts-3s.jsonl
# Pad short segments to 5 seconds.
lhotse cut pad --duration 5.0 cuts-3s.jsonl cuts-5s-pad.jsonl
# Truncate longer segments to 5 seconds.
lhotse cut truncate --max-duration 5.0 --offset-type random cuts-5s-pad.jsonl cuts-5s.jsonl
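
The effect of the three commands above on individual cuts can be sketched in plain Python. This is an illustration of the intended semantics under stated assumptions, not the Lhotse implementation: ``Span``, ``keep_long``, ``pad_to``, and ``truncate_to`` are hypothetical names, padding is assumed to be appended at the end of a cut, and the ``random`` offset type is assumed to pick a random starting point for the retained window.

```python
import random
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for a cut: a region of a recording, in seconds."""
    start: float
    duration: float       # seconds of real audio
    padding: float = 0.0  # seconds of padding appended after the audio

    @property
    def total(self) -> float:
        return self.duration + self.padding


def keep_long(spans: List[Span], min_duration: float) -> List[Span]:
    """Step 1: reject spans shorter than min_duration."""
    return [s for s in spans if s.duration >= min_duration]


def pad_to(span: Span, duration: float) -> Span:
    """Step 2: extend a short span with padding; longer spans are unchanged."""
    if span.total >= duration:
        return span
    return replace(span, padding=duration - span.duration)


def truncate_to(span: Span, max_duration: float, rng: random.Random) -> Span:
    """Step 3: keep a window of max_duration starting at a random offset."""
    if span.total <= max_duration:
        return span
    offset = rng.uniform(0.0, span.total - max_duration)
    # Audio remaining inside the window; the rest of the window is padding.
    audio = max(0.0, min(span.duration - offset, max_duration))
    return Span(
        start=span.start + min(offset, span.duration),
        duration=audio,
        padding=max_duration - audio,
    )


rng = random.Random(0)
spans = [Span(0.0, 2.0), Span(0.0, 4.0), Span(0.0, 8.0)]
result = [truncate_to(pad_to(s, 5.0), 5.0, rng) for s in keep_long(spans, 3.0)]
for s in result:
    print(s, "total:", s.total)
```

In this sketch the 2-second span is rejected, the 4-second span is padded with one second, and the 8-second span is reduced to a 5-second window, so every surviving span ends up 5 seconds long.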