😭🔮TensorFlow-based cry detection
app
data
.gitignore
README.md
batch-archive-mp3.sh
batch-convert-freq.sh
batch-process-recordings.sh
conv_labels.txt
freeze.sh
graph.pb
label-wav.sh
move-to-buckets.sh
process-raw-labeled-wav.sh
requirements.txt
train.sh

README.md

babblefish

File structure

app/       - TF training scripts
crying/    - raw crying sample WAVs from source device
white_noise/   - raw white_noise sample WAVs from source device
data/
  crying/  - normalized WAVs for crying label
  white_noise/  - normalized WAVs for white_noise label
  room_empty/  - normalized WAVs for room_empty label

Getting data

Record audio on the Raspberry Pi with arecord, then rsync the recordings to the raw/ directory here.

Batchify

Find the valid recordings and use sox to batch them up.

Cleaning up data

The process-raw-labeled-wav.sh script will:

  1. Take the raw samples in the crying/ and white_noise/ folders
  2. Batch them up and split them evenly into 5-second segments
  3. Amplify by 45 dB and downsample to 22050 Hz
  4. Store the results in the data/<LABEL> directory
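
As an illustration of step 2 only, here is a minimal Python sketch that cuts one WAV file into consecutive fixed-length segments using the standard library. It is not the contents of process-raw-labeled-wav.sh. The function name split_wav and the output naming scheme (source stem plus a five-digit index, loosely modeled on data/crying/crying00021.wav) are assumptions, and amplification and resampling are left out.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, segment_seconds=5):
    """Split src into consecutive segment_seconds-long WAV files.

    Any trailing audio shorter than one full segment is dropped, so every
    output file has exactly the same number of frames.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        frames_per_segment = reader.getframerate() * segment_seconds
        total_segments = reader.getnframes() // frames_per_segment
        for index in range(total_segments):
            frames = reader.readframes(frames_per_segment)
            # Hypothetical naming scheme: <stem><5-digit index>.wav
            target = out_dir / f"{Path(src).stem}{index:05d}.wav"
            with wave.open(str(target), "wb") as writer:
                writer.setparams(params)
                # The wave module rewrites the header with the real frame
                # count when the file is closed.
                writer.writeframes(frames)
            written.append(target)
    return written
```

Dropping the remainder rather than padding it is a design choice made here so that every clip has the same length.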

Learning

Trying to run the training script pointed directly at this data.

Getting:

InvalidArgumentError (see above for traceback): Data too short when trying to read value
         [[Node: DecodeWav = DecodeWav[desired_channels=1, desired_samples=192000, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ReadFile)]]

Seems like the WAV file partitions might be too short? Each WAV file needs exactly SAMPLE_RATE * DURATION_MS / 1000 samples!
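
To test that hypothesis, a small check along the following lines could list the files that hold fewer frames than the training settings ask for. This is a sketch, not part of the repo: the helper names are made up, and it assumes the required length is sample_rate * clip_duration_ms / 1000. With the values used in the training command (48000 Hz, 5000 ms) that comes to 240000 samples, whereas a 5-second clip stored at 22050 Hz holds only 110250 frames.

```python
import wave


def expected_samples(sample_rate, clip_duration_ms):
    """Number of samples a clip needs at the given rate and duration."""
    return int(sample_rate * clip_duration_ms / 1000)


def find_short_wavs(paths, sample_rate, clip_duration_ms):
    """Return (path, frame_count) for every WAV shorter than required."""
    needed = expected_samples(sample_rate, clip_duration_ms)
    short = []
    for path in paths:
        with wave.open(str(path), "rb") as reader:
            frame_count = reader.getnframes()
        if frame_count < needed:
            short.append((str(path), frame_count))
    return short
```

Calling find_short_wavs on the files under data/ with the same --sample_rate and --clip_duration_ms values passed to training would show whether any clip falls short.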

Training

python app/train.py --data_url= --data_dir=/Users/andrewhao/workspace/babblefish/data --wanted_words=white_noise,crying --sample_rate=48000 --clip_duration_ms=5000 --start_checkpoint=/tmp/speech_commands_train/conv.ckpt-1300 --how_many_training_steps=1400,300

Freeze

python app/freeze.py --start_checkpoint=/tmp/speech_commands_train/conv.ckpt-1700 --output_file=./graph.pb --clip_duration_ms=5000 --sample_rate=48000 --how_many_training_steps=1400,300 --wanted_words=white_noise,crying --data_dir=/Users/andrewhao/workspace/babblefish/data
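
The checkpoint passed to freeze.py above ends in 1700, which matches the sum of the two stages in --how_many_training_steps=1400,300. The sketch below spells out that arithmetic. It assumes checkpoints are named <architecture>.ckpt-<global step> and that the last one is written at the total step count. The function name is hypothetical.

```python
def final_checkpoint(train_dir, architecture, training_steps):
    """Build the path of the last checkpoint for a staged training run.

    training_steps is the comma-separated value given to
    --how_many_training_steps, e.g. "1400,300".
    """
    total_steps = sum(int(stage) for stage in training_steps.split(","))
    return f"{train_dir}/{architecture}.ckpt-{total_steps}"


print(final_checkpoint("/tmp/speech_commands_train", "conv", "1400,300"))
# prints /tmp/speech_commands_train/conv.ckpt-1700
```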

Label

python app/label_wav.py --graph=./graph.pb --labels=./conv_labels.txt --wav=data/crying/crying00021.wav

MP3<>WAV utilities

MP3 -> WAV

ffmpeg -i crying/2_years/2019-09-11\ 04:53:03.wav.mp3 -acodec pcm_s16le -ar 22050 test.wav
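
For converting many files, the same call could be wrapped in Python roughly as follows. This is a sketch that assumes ffmpeg is on the PATH. The function names are made up, and -ar is ffmpeg's option for the audio sample rate. Passing the arguments as a list avoids having to escape the spaces and colons in the recording filenames.

```python
import subprocess


def build_ffmpeg_args(src, dst, sample_rate=22050):
    """Argument list for decoding src to 16-bit PCM WAV at sample_rate Hz."""
    return [
        "ffmpeg",
        "-i", str(src),
        "-acodec", "pcm_s16le",
        "-ar", str(sample_rate),
        str(dst),
    ]


def convert_to_wav(src, dst, sample_rate=22050):
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_args(src, dst, sample_rate), check=True)
```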
