[Not for merge] Add Bengaliai speech #1202

yfyeung · 2023-08-07T06:15:26Z

Kaggle Competition: https://www.kaggle.com/competitions/bengaliai-speech/overview
The goal of this competition is to recognize Bengali speech from out-of-distribution audio recordings. You will build a model trained on the first Massively Crowdsourced (MaCro) Bengali speech dataset with 1,200 hours of data from ~24,000 people from India and Bangladesh. The test set contains samples from 17 different domains that are not present in training.

npovey · 2023-08-15T09:41:37Z

Hi,
I tried this code, but the duration in the manifests is zero for all audio files.

...."duration": 0.0,...

I'm not sure whether the output below is the cause:

2023-08-14 23:47:41,483 INFO [audio.py:137] The user overrided the global setting for whether to use ffmpeg-torchaudio to compute the duration of audio files. Old setting: True. New setting: False.

PS: I added the line below to stage 0:

  unzip bengaliai-speech.zip -d download/bengaliai_speech

yfyeung · 2023-08-15T10:13:30Z

Using ffmpeg-torchaudio to compute the duration of .mp3 files pushes %CPU above 100% (see issue lhotse-speech/lhotse#1026),
so we disable ffmpeg-torchaudio when the audio files are mp3.
I don't think this is why the durations in the manifests are all zero.
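
To narrow down where the zero durations come from, it can help to scan a manifest directly. The sketch below is not part of this PR; it assumes the manifest is a JSON Lines file (optionally gzip-compressed) whose entries carry top-level `id` and `duration` fields, as in the `"duration": 0.0` snippet quoted above. The helper names `open_manifest` and `find_zero_durations` are hypothetical.

```python
import gzip
import json
from pathlib import Path


def open_manifest(path):
    """Open a JSON Lines manifest as text, handling .gz compression."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def find_zero_durations(path):
    """Return (total_entries, ids of entries with a missing or non-positive duration)."""
    total = 0
    bad_ids = []
    with open_manifest(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            entry = json.loads(line)
            total += 1
            # A missing or null duration is treated the same as 0.0.
            duration = entry.get("duration") or 0.0
            if float(duration) <= 0.0:
                bad_ids.append(entry.get("id"))
    return total, bad_ids
```

Calling `find_zero_durations` on a recordings manifest returns the entry count and the offending ids. If every entry is affected, the cause is more likely the environment (for example the audio backend) than individual files.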

danpovey · 2023-08-16T06:16:33Z

@yfyeung so do you have a theory why the durations might be zero?

npovey · 2023-08-18T22:22:18Z

The problem is fixed.
I was actually having a similar problem in my text_search project: the m4a duration was calculated incorrectly, and @pzelasko advised using PyTorch 2.0 or above (see lhotse-speech/lhotse#1121).
After I ran

pip3 install torch torchvision torchaudio

the zero-duration problem here was fixed as well.

I had:
torch: 1.12.1+cu113
torchaudio: 0.12.1+cu113
lhotse: 1.16.0

Now I have:
python -c "import torch; print(torch.__version__)"
2.0.1+cu117
python -c "import torchaudio; print(torchaudio.__version__)"
2.0.2+cu117
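
For anyone who wants to guard against this in a script, a minimal sketch of a version check is shown below. It assumes version strings shaped like the ones above (dotted numbers with an optional `+cuXXX` local tag) and takes the 2.0 threshold from the advice in lhotse-speech/lhotse#1121. The helpers `parse_version` and `meets_minimum` are hypothetical and not part of torch or lhotse.

```python
import re


def parse_version(version):
    """Turn a string like '2.0.1+cu117' into a tuple of ints, e.g. (2, 0, 1)."""
    # Drop the local build tag after '+', then read the leading digits of each field.
    public = version.split("+", 1)[0]
    parts = []
    for field in public.split("."):
        match = re.match(r"\d+", field)
        if match is None:
            break  # stop at the first non-numeric field
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum=(2, 0)):
    """Return True if the parsed version is at least `minimum`."""
    return parse_version(version) >= minimum


# The two torch versions reported in this thread.
for label, version in [("before", "1.12.1+cu113"), ("after", "2.0.1+cu117")]:
    print(label, version, meets_minimum(version))
```

In a real script, the string passed in would be `torch.__version__`.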

yfyeung changed the title ~~Add Bengaliai speech~~ [Not for merge] Add Bengaliai speech Aug 7, 2023

update

1398e07

yfyeung force-pushed the bengaliai_speech branch from 3124f6e to 1398e07 Compare August 7, 2023 06:21

yifanyang added 3 commits September 1, 2023 17:16

Fix

d44a434

Fix

a887c01

Fix

53f5892

yfyeung closed this Jan 23, 2024

yfyeung deleted the bengaliai_speech branch January 23, 2024 05:14
