This repository has been archived by the owner on Nov 28, 2022. It is now read-only.
The current audio corpora prep seems to only work on SPH files. In addition, the current description says this:
Note that filenames with hyphens will be sanitized to underscores and that audio files will be forced to single channel, 16 kHz, signed PCM format. If two channels are present, only the first will be used.
Many corpora come in WAV files instead of SPH files, and many also have two unmixed channels that need to be mixed to properly account for all audio.
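For reference, the filename sanitization mentioned in the quoted note could look roughly like the sketch below. The function name `sanitize_filename` is hypothetical and not taken from the repository; the sketch assumes it receives a bare file name (no directory part) and that only the stem, not the extension, should be rewritten.

```python
from pathlib import PurePath


def sanitize_filename(filename: str) -> str:
    """Replace hyphens with underscores in the stem of a bare file name.

    Hypothetical helper for illustration; the extension is left untouched.
    """
    path = PurePath(filename)
    return path.stem.replace("-", "_") + path.suffix


print(sanitize_filename("sw-4390-a.sph"))
```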
This PR enables audio corpus objects to accept SPH, WAV, and MP3 files from directories. It still expects file names to match between audio files and transcript STM files.
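As a rough illustration of the name-matching expectation, the sketch below pairs audio files with STM transcripts that share the same stem. It is not the PR's code: `pair_audio_with_transcripts` and `AUDIO_EXTENSIONS` are hypothetical names, and the sketch assumes audio and transcript files sit in the same flat directory.

```python
from pathlib import Path

# Extensions the PR says are accepted; the constant name is an assumption.
AUDIO_EXTENSIONS = {".sph", ".wav", ".mp3"}


def pair_audio_with_transcripts(directory):
    """Map each shared file stem to an (audio_path, stm_path) tuple.

    Audio files without a matching .stm file, and .stm files without
    matching audio, are skipped.
    """
    directory = Path(directory)
    transcripts = {
        p.stem: p for p in directory.iterdir() if p.suffix.lower() == ".stm"
    }
    pairs = {}
    for path in sorted(directory.iterdir()):
        if path.suffix.lower() in AUDIO_EXTENSIONS and path.stem in transcripts:
            pairs[path.stem] = (path, transcripts[path.stem])
    return pairs
```

With a directory containing `a.wav`, `a.stm`, and `b.sph`, this would return a single entry for `a`, since `b.sph` has no transcript with a matching name.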
Further, this PR mixes stereo channels down to mono instead of discarding extra channels.
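To make the difference concrete, here is a minimal pure-Python sketch contrasting the old behavior (keep only the first channel) with a downmix. It assumes interleaved signed integer samples and averages the channels, which is one common way to mix down; the PR itself may rely on an external tool instead. Both function names are hypothetical.

```python
def first_channel_only(samples, n_channels):
    """Old behavior: keep channel 1 and discard the rest."""
    return list(samples[::n_channels])


def mix_to_mono(samples, n_channels):
    """Average each interleaved frame into a single mono sample.

    Averaging (rather than summing) keeps the result within the range
    of the input samples. Fractions are truncated toward zero.
    """
    if n_channels < 1 or len(samples) % n_channels != 0:
        raise ValueError("sample count must be a multiple of n_channels")
    mono = []
    for i in range(0, len(samples), n_channels):
        frame = samples[i:i + n_channels]
        mono.append(int(sum(frame) / n_channels))
    return mono


# Interleaved stereo: (left, right) pairs.
stereo = [100, 200, -100, 100, 0, -7]
print(first_channel_only(stereo, 2))  # [100, -100, 0]
print(mix_to_mono(stereo, 2))         # [150, 0, -3]
```

In a two-speaker recording with one speaker per channel, `first_channel_only` drops the second speaker entirely, which is the problem described in the issue.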
Fixes #13