
Distillation training from a state-of-the-art model to a very small model for the Speech Commands dataset. The code is built on PyTorch Lightning, with Optuna for optimization.


egochao/speech_commands_distillation_torch_lightling


1. What we are doing

1.1. The problem:

  • Keyword spotting in audio.
  • Dataset: Speech Commands

1.2. What we care about

  • High accuracy on the test set
  • Super small model size for edge deployment

1.3. What we are doing here

  • We will use model distillation to pass knowledge from a big model to a small one (see the loss sketch just below this list)
  • We will use Optuna for hyperparameter search
  • We will use PyTorch Lightning as the boilerplate for this project
  • We will use Weights & Biases as the monitoring tool
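
For reference, here is a minimal sketch of the kind of distillation loss used in this sort of setup: the student is trained against the teacher's temperature-softened outputs (KL divergence), blended with the usual cross-entropy on hard labels. The temperature T and weight alpha are illustrative assumptions, not values taken from this repository.

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
        """Blend the teacher's soft targets (KL divergence) with hard-label cross-entropy.

        T (temperature) and alpha (blend weight) are hypothetical defaults,
        not the values used in this repository.
        """
        # Soften both distributions with temperature T; the T*T factor keeps
        # gradient magnitudes comparable to the hard-label term.
        soft_loss = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)
        # Standard cross-entropy on the ground-truth keyword labels.
        hard_loss = F.cross_entropy(student_logits, labels)
        return alpha * soft_loss + (1.0 - alpha) * hard_loss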

2. Shared development environment via VS Code Remote Containers

Please read through some of the concepts here first.

This will spin up the development environment with minimal setup.

  1. Install and configure a Git credential manager - this shares your Git configuration with the container

  2. Run the "Remote-Containers: Reopen in Container" command

3. Steps to train the model

  1. Train the simple convolution model
    python train.py
  2. Train the BC ResNet model
    python train.py --model bc_resnet
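
The contents of train.py are not reproduced here, but a PyTorch Lightning training entry point for this project plausibly looks like the sketch below. The class names SimpleConvModel, BcResNetModel and SpeechCommandsDataModule, and the W&B project name, are assumptions made for illustration.

    import argparse

    import pytorch_lightning as pl
    from pytorch_lightning.loggers import WandbLogger

    # Hypothetical module/class names - the real ones in this repo may differ.
    from models import BcResNetModel, SimpleConvModel
    from data import SpeechCommandsDataModule


    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("--model", default="simple_conv",
                            choices=["simple_conv", "bc_resnet"])
        args = parser.parse_args()

        model = BcResNetModel() if args.model == "bc_resnet" else SimpleConvModel()
        datamodule = SpeechCommandsDataModule()

        # Weights & Biases logger for monitoring (project name is illustrative).
        logger = WandbLogger(project="speech-commands-distillation")
        trainer = pl.Trainer(max_epochs=50, logger=logger)
        trainer.fit(model, datamodule=datamodule)


    if __name__ == "__main__":
        main()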

4. Steps to test the model

  1. Test the simple convolution model
    python test.py --pretrain path_to_pretrain
  2. Test the BC ResNet model
    python test.py --model bc_resnet --pretrain path_to_pretrain
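
Likewise, a hedged sketch of what test.py presumably does: restore the pretrained checkpoint and run the Lightning test loop. Class and data-module names are the same illustrative assumptions as above.

    import argparse

    import pytorch_lightning as pl

    # Same hypothetical names as in the training sketch above.
    from models import BcResNetModel, SimpleConvModel
    from data import SpeechCommandsDataModule


    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("--model", default="simple_conv",
                            choices=["simple_conv", "bc_resnet"])
        parser.add_argument("--pretrain", required=True,
                            help="Path to the trained checkpoint")
        args = parser.parse_args()

        model_cls = BcResNetModel if args.model == "bc_resnet" else SimpleConvModel
        # Restores weights and any hyperparameters saved in the checkpoint.
        model = model_cls.load_from_checkpoint(args.pretrain)

        trainer = pl.Trainer()
        trainer.test(model, datamodule=SpeechCommandsDataModule())


    if __name__ == "__main__":
        main()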

5. Results

a. Models included in this work - no parameter search

Model                Description                        Params   Model accuracy
Simple Convolution   A straightforward 1D convolution   26900    94.2%
BC ResNet            Experiment logging                 10600    95.6%

b. Models optimized with Optuna

Model                Description                        Params   Model accuracy
Simple Convolution   A straightforward 1D convolution   35000    95.1%
BC ResNet            Experiment logging                 22000    98.3% (best)
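
The numbers above come from an Optuna parameter search. A minimal sketch of how such a search is typically wired around a Lightning training run is shown below; the search space (n_channels, lr) and the val_acc metric name are illustrative assumptions, not this repository's actual configuration.

    import optuna
    import pytorch_lightning as pl

    # Hypothetical names, as in the sketches above.
    from models import BcResNetModel
    from data import SpeechCommandsDataModule


    def objective(trial: optuna.Trial) -> float:
        # Illustrative search space: model width and learning rate.
        n_channels = trial.suggest_int("n_channels", 8, 32)
        lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)

        model = BcResNetModel(n_channels=n_channels, lr=lr)
        trainer = pl.Trainer(max_epochs=20, enable_progress_bar=False)
        trainer.fit(model, datamodule=SpeechCommandsDataModule())

        # Assumes the LightningModule logs a "val_acc" metric during validation.
        return trainer.callback_metrics["val_acc"].item()


    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=50)
    print(study.best_params, study.best_value)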

c. Models trained with distillation loss

Model                Description                        Params   Model accuracy
Simple Convolution   A straightforward 1D convolution   28600    90.3%
BC ResNet            Experiment logging

d. Highlights

  • My best model has 22k parameters and reaches 98.3% accuracy on the test set (Optuna-optimized)
  • This is close to the state of the art (98.5%)
  • The model is smaller than the other state-of-the-art models by orders of magnitude
  • The distillation process was not successful; it made the model perform worse than training without distillation

6. Other development setup

  1. Install dependencies
    pip install poetry
    poetry install
  2. Add a new dependency
    poetry add package_name

7. Links to trained models and resources
