mcagri/DatasetLabeler


Audio Dataset Labelling Automation

This project streamlines dataset creation and labelling for Automatic Speech Recognition (ASR) systems. It consists of three parts:

  • AudioFile Ingestion
  • Automatic Labelling with ASR (Whisper Model)
  • Manual Labelling for improved dataset quality
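
The three parts above can be pictured as a small data flow. The following is a minimal sketch only, not the repository's actual code: the names `LabelledClip`, `ingest`, `auto_label`, `apply_manual_correction`, and the `AUDIO_EXTENSIONS` set are all hypothetical, and the ASR step is passed in as a plain callable so the sketch does not depend on any particular model.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Iterable, List, Optional

# Assumption: which file extensions count as audio is not stated in the README.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


@dataclass
class LabelledClip:
    """One audio file with its automatic and (optional) manual transcript."""
    path: str
    auto_label: str = ""
    manual_label: Optional[str] = None

    @property
    def final_label(self) -> str:
        # A manual correction, when present, overrides the ASR output.
        return self.manual_label if self.manual_label is not None else self.auto_label


def ingest(paths: Iterable[str]) -> List[LabelledClip]:
    """Part 1: keep only audio files, in a stable (sorted) order."""
    return [
        LabelledClip(path=p)
        for p in sorted(paths)
        if Path(p).suffix.lower() in AUDIO_EXTENSIONS
    ]


def auto_label(clips: List[LabelledClip], transcribe: Callable[[str], str]) -> None:
    """Part 2: fill in auto_label using any path -> text ASR function."""
    for clip in clips:
        clip.auto_label = transcribe(clip.path).strip()


def apply_manual_correction(clip: LabelledClip, corrected_text: str) -> None:
    """Part 3: record a human-reviewed transcript for one clip."""
    clip.manual_label = corrected_text.strip()
```

With the `openai-whisper` package installed, one possible `transcribe` callable would be `lambda p: model.transcribe(p)["text"]`, where `model = whisper.load_model("base")`; the choice of model size here is an assumption.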

The WebRTC-VAD implementation is taken from the py-webrtcvad example script: https://github.com/wiseman/py-webrtcvad/blob/master/example.py
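
To illustrate what voice activity detection (VAD) contributes, here is a simplified sketch of frame-based speech segmentation. It is not the linked example's code: that script uses a padded sliding window to smooth decisions, while this version simply groups consecutive speech frames. The function names and the `is_speech` callable are hypothetical; the input is assumed to be 16-bit mono PCM.

```python
from typing import Callable, Iterator, List, Tuple


def frame_generator(pcm: bytes, sample_rate: int, frame_ms: int = 30) -> Iterator[bytes]:
    """Yield fixed-length frames from 16-bit mono PCM, dropping any partial tail."""
    bytes_per_frame = int(sample_rate * frame_ms / 1000) * 2  # 2 bytes per sample
    for start in range(0, len(pcm) - bytes_per_frame + 1, bytes_per_frame):
        yield pcm[start:start + bytes_per_frame]


def speech_segments(
    pcm: bytes,
    sample_rate: int,
    is_speech: Callable[[bytes, int], bool],
    frame_ms: int = 30,
) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) spans covering runs of consecutive speech frames."""
    segments: List[Tuple[int, int]] = []
    seg_start = None  # index of the first frame of the current speech run
    count = 0         # number of frames seen so far
    for index, frame in enumerate(frame_generator(pcm, sample_rate, frame_ms)):
        count = index + 1
        if is_speech(frame, sample_rate):
            if seg_start is None:
                seg_start = index
        elif seg_start is not None:
            segments.append((seg_start * frame_ms, index * frame_ms))
            seg_start = None
    if seg_start is not None:
        # Audio ended while still inside a speech run.
        segments.append((seg_start * frame_ms, count * frame_ms))
    return segments
```

With the `webrtcvad` package installed, `webrtcvad.Vad(2).is_speech` can be passed as the `is_speech` argument; it accepts 10, 20, or 30 ms frames at 8000, 16000, 32000, or 48000 Hz. The aggressiveness mode `2` is an arbitrary choice for illustration.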
