swagshaw/TorchKWS

TorchKWS

AI research into spoken keyword spotting: a collection of PyTorch implementations of spoken keyword spotting models presented in research papers. The model architectures do not always mirror the ones proposed in the papers; I have focused on covering the core ideas rather than getting every layer configuration right.

Table of Contents

Implementations

About the Dataset

The Speech Commands dataset is a set of one-second .wav audio files, each containing a single spoken English word. The words come from a small set of commands and are spoken by a variety of speakers. The audio files are organized into folders by the word they contain, and the dataset is designed to help train simple machine learning models.
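The folder-per-word layout described above can be indexed with a few lines of standard-library Python. This is a minimal sketch, not code from this repository: the function name `index_by_word` is hypothetical, and it assumes the extracted data sits directly under a root directory as `<root>/<word>/<clip>.wav`. Folders whose names start with an underscore (such as `_background_noise_` in the public release) are skipped because they hold noise recordings rather than spoken words.

```python
from collections import defaultdict
from pathlib import Path


def index_by_word(data_root):
    """Map each word (folder name) to the sorted list of .wav files under it.

    Hypothetical helper: assumes the layout <data_root>/<word>/<clip>.wav.
    """
    index = defaultdict(list)
    for wav_path in sorted(Path(data_root).glob("*/*.wav")):
        word = wav_path.parent.name
        # Folders starting with "_" hold background noise, not spoken words.
        if word.startswith("_"):
            continue
        index[word].append(wav_path)
    return dict(index)
```

Calling `index_by_word("/path/to/speech_commands")` returns a dictionary whose keys are the words and whose values are the clip paths, which is enough to count examples per class or build a label list.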

Installation

We use the Google Speech Commands Dataset (GSC) as the training data. Run the following script to download it:

cd <ROOT>/dataset
python process_speech_commands_data.py \
    --data_root=<absolute path to where the data should be stored> \
    --data_version=<either 1 or 2, indicating version of the dataset> \
    --class_split=<either "all" or "sub", indicates whether all 30/35 classes should be used, or the 10+2 split should be used> \
    --rebalance \
    --log
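For readers unfamiliar with the "10+2" split named in `--class_split`, the sketch below illustrates the usual convention for this dataset: ten command words are kept as their own classes, every other word is folded into an "unknown" class, and a "silence" class is built from background noise. The names `CORE_WORDS`, `SUB_CLASSES`, and `to_sub_label` are hypothetical, and the exact label strings used by `process_speech_commands_data.py` may differ; treat this as an illustration of the idea rather than the script's implementation.

```python
# Ten command words commonly used for the 12-class Speech Commands task.
CORE_WORDS = ("yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go")

# Assumed names for the two extra classes; the script may label them differently.
UNKNOWN = "unknown"
SILENCE = "silence"

SUB_CLASSES = CORE_WORDS + (UNKNOWN, SILENCE)


def to_sub_label(word):
    """Collapse a spoken word into one of the 10 core classes or 'unknown'.

    The 'silence' class is not derived from a word; it is typically
    sampled from background noise clips, so it never appears here.
    """
    return word if word in CORE_WORDS else UNKNOWN
```

With this mapping, a clip from the `yes` folder keeps its label, while a clip from a folder such as `bed` is assigned to the unknown class.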

Temporal Convolution ResNet

Temporal Convolution for Real-time Keyword Spotting on Mobile Devices [Paper] [Code]

Broadcasting Residual Network

Broadcasted Residual Learning for Efficient Keyword Spotting [Paper] [Code]

MatchboxNet

MatchboxNet: 1D Time-Channel Separable Convolutional Neural Network Architecture for Speech Commands Recognition [Paper] [Code]

ConvMixer

ConvMixer: Feature Interactive Convolution with Curriculum Learning for Small Footprint and Noisy Far-field Keyword Spotting [Paper] [Code]

KWT

Keyword Transformer: A Self-Attention Model for Keyword Spotting [Paper] [Code]

Reference

  1. https://github.com/hyperconnect/TC-ResNet
  2. https://github.com/huangyz0918/kws-continual-learning
  3. https://github.com/eriklindernoren/PyTorch-GAN
  4. https://github.com/roman-vygon/BCResNet
  5. https://github.com/dominickrei/MatchboxNet
  6. https://github.com/dianwen-ng/Keyword-Spotting-ConvMixer
