Experiment on creating a new dataset audio+text #107

Mte90 · 2020-11-30T11:42:36Z

In #90 we are discussing the creation of a new dataset, but first we need to experiment with how to automate it (including the review step).

The first experiment we can do is:

Try https://github.com/srinivr/kaldi-long-audio-alignment with an Italian model
Take an audio recording with a transcription from the previous ticket and try splitting the recording into segments, each paired with its part of the transcription
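
To make the second step concrete, here is a minimal sketch of the splitting part only. It assumes an aligner has already produced a list of `(start_seconds, end_seconds, text)` tuples; that format, the function name `split_by_alignment`, and the output file naming are hypothetical and are not the output format of kaldi-long-audio-alignment. It uses only the Python standard library and expects uncompressed PCM WAV input.

```python
import wave
from pathlib import Path
from typing import List, Tuple

# Hypothetical alignment format: (start_seconds, end_seconds, text)
Alignment = Tuple[float, float, str]


def split_by_alignment(source: Path, alignments: List[Alignment], out_dir: Path) -> List[Path]:
    """Write one WAV clip and one .txt transcript per aligned segment."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    with wave.open(str(source), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        for index, (start, end, text) in enumerate(alignments):
            # Convert seconds to frame indices and clamp them to the file length.
            first = min(max(int(start * rate), 0), total)
            last = min(max(int(end * rate), first), total)
            src.setpos(first)
            frames = src.readframes(last - first)
            clip = out_dir / f"{source.stem}_{index:04d}.wav"
            with wave.open(str(clip), "wb") as dst:
                # Same channels, sample width and rate as the source;
                # the frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            # Store the transcript next to the clip, with the same base name.
            clip.with_suffix(".txt").write_text(text + "\n", encoding="utf-8")
            written.append(clip)
    return written
```

Keeping each clip and its transcript under the same base name should make the later review step easier, since a reviewer can listen to a clip and read its text side by side.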

eziolotta · 2021-01-22T21:06:48Z

I'm starting some tests of long audio segmentation based on the speaker's voice activity.
On this fork: https://github.com/eziolotta/rVADfast

But I have some problems with the quality of the audio output...

eziolotta · 2021-01-31T13:43:45Z

First experiment on segmenting short audio, using rVADfast plus an algorithm that analyzes the segments found by rVAD to generate a new sequence of speech segments.
rVAD (like some other tools) tends to cut off the last bit of signal in a speech segment.
The code and other tests are yet to be published.
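
Since the code is not published yet, the following is only an illustrative sketch of one way such a post-processing step could work, not the algorithm actually used here. It assumes the VAD output has already been converted to `(start_seconds, end_seconds)` pairs; the function name `refine_segments` and the default values for `tail_padding` and `max_gap` are hypothetical. Each segment is extended slightly at the end to compensate for the clipped tail, and segments separated by a very short gap are merged.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def refine_segments(
    segments: List[Segment],
    audio_duration: float,
    tail_padding: float = 0.2,  # hypothetical default, tune on real data
    max_gap: float = 0.3,       # hypothetical default, tune on real data
) -> List[Segment]:
    """Pad the end of each VAD segment and merge segments split by short gaps."""
    refined: List[Segment] = []
    for start, end in sorted(segments):
        # Extend the end so the trailing part of the speech is not cut off,
        # without running past the end of the recording.
        end = min(end + tail_padding, audio_duration)
        if refined and start - refined[-1][1] <= max_gap:
            # The pause since the previous segment is short: merge the two.
            prev_start, prev_end = refined[-1]
            refined[-1] = (prev_start, max(prev_end, end))
        else:
            refined.append((start, end))
    return refined
```

For example, calling `refine_segments([(0.0, 1.0), (1.1, 2.0), (5.0, 6.0)], audio_duration=6.1)` merges the first two segments into one and keeps the third separate, with its end clamped to the length of the recording.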

Input clip: 644_2532_000000.wav, 15 seconds (MLS dataset)
Output: 5 speech segments (WAV files)

test_segmentation_short_audio.zip

Next I will try to extend the algorithm to long audio (possibly an hour long; I will try public podcasts).

eziolotta · 2021-02-13T20:30:29Z

Continuing the experiments with rVADfast, I was able to segment one randomly chosen podcast from the Emilia-Romagna Region:

https://ambiente.regione.emilia-romagna.it/it/gallery/video/i-video-di-ermesambiente/convegno-inspire/stefano-olivucci-regione-emilia-romagna

This produced 143 segments, with durations ranging from a minimum of 2 seconds to a maximum of 2 minutes.
The execution time for this process was approximately 1.5 hours.
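
Figures like the segment count and the minimum and maximum duration can be checked with a small script. This is a minimal sketch, assuming the segments are stored as PCM `.wav` files in a single folder; the function names and the folder layout are hypothetical. It uses only the Python standard library.

```python
import wave
from pathlib import Path
from typing import Dict


def wav_duration(path: Path) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def summarize_durations(folder: Path) -> Dict[str, float]:
    """Count the .wav files in a folder and report min, max and total duration."""
    durations = [wav_duration(p) for p in sorted(folder.glob("*.wav"))]
    if not durations:
        raise ValueError(f"no .wav files found in {folder}")
    return {
        "count": float(len(durations)),
        "min_seconds": min(durations),
        "max_seconds": max(durations),
        "total_seconds": sum(durations),
    }
```

A summary like this can also help decide on duration limits before the segments are sent to transcription, for instance to flag very long segments for further splitting.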

The audio files have no transcription, so in this case automatic transcription followed by human validation must be applied.

Unfortunately, other speakers are also involved in the podcasts and some words are occasionally unclear, so a check is required during validation. There is no background noise in the podcasts and the audio is clean.

Other podcasts here
License: Creative Commons Attribution 4.0

The output dataset of my experiment can be downloaded here:
http://t.ly/xHHL

Mte90 added the "enhancement", "help wanted", and "good first issue" labels on Nov 30, 2020

This was referenced Dec 3, 2020

Lablita Importer #108

Closed

Lablita importer - improvement #109

Merged

Archive audio+text to download #90

Closed

eziolotta mentioned this issue Jan 31, 2021

VAD on short audio packet zhenghuatan/rVADfast#1

Open
