This repository contains the speech dataset and some files concerned with the speech recognition class project.

csikasote/ammi-asr-class-project

Speech Recognition - Bemba language dataset

This repository contains a 2-hour speech recognition dataset recorded in Bemba, the native language spoken by the majority of people in the northern part of Zambia. The dataset is split into a 1-hour dev set and a 1-hour test set, as directed by Prof. Laurent Besacier.

Contents

  • *.wav audio files - produced by the Lig-Aikuma application.
  • *.json metadata files - produced by the Lig-Aikuma application, one associated with every audio file.
  • *linker.text files - produced by the Lig-Aikuma application after a recording session.
  • Session.text files - contain the raw texts used for elicitation.

File organization

  • [Bemba_speech_dataset] --> ['dev', 'train']
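
As a rough illustration of how this layout could be consumed, the sketch below walks the split folders and pairs each audio file with its metadata file. It is not part of the original project: the function name `pair_recordings` is hypothetical, and the code assumes that each *.json file shares the base name of its *.wav file, which this README does not state. The split folder names are taken from the listing above.

```python
from pathlib import Path


def pair_recordings(root, splits=("dev", "train")):
    """Return {split: [(wav_path, json_path_or_None), ...]} for each split folder."""
    root = Path(root)
    pairs = {}
    for split in splits:
        split_dir = root / split
        found = []
        if split_dir.is_dir():
            # rglob also picks up recordings nested in per-session subfolders
            for wav_path in sorted(split_dir.rglob("*.wav")):
                # Assumption: the metadata file has the same base name as the audio file
                json_path = wav_path.with_suffix(".json")
                found.append((wav_path, json_path if json_path.exists() else None))
        pairs[split] = found
    return pairs
```

Calling `pair_recordings("Bemba_speech_dataset")` from the repository root would then give, for each split, a list of audio paths with either the matching metadata path or `None` where no metadata file was found under that naming assumption.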

Problems encountered during recording

Several challenges were encountered during the dataset building process:

  • Bemba literature is scarce on the internet, but I was fortunate to find an online Bible in the Bemba language.
  • Some Bemba words were difficult to read and pronounce, which may have negatively affected the quality of the recordings.
  • The Lig-Aikuma application kept crashing during recording sessions, which slowed the recording process and was frustrating at times.

Other comments

  • The speech recognition course was very interesting, especially because it was coupled with a project that gave me practical experience in gathering data and building a speech dataset for ASR. A huge shout-out to the amazing lecturers for sharing their knowledge.

Future works

  • Having noticed that no speech dataset exists for any Zambian language, I wish to take up the challenge of building speech datasets for some of the local languages in my country, Zambia, especially after AMMI. This would allow me and other researchers interested in speech recognition to carry out research, and it would also help preserve some local languages.

-- Signed
