Audio-Based Event Detection

Experiments

📖 Notion page

Packages

Easy way

conda env create --file environment.yml
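
Then activate it. Assuming environment.yml names the environment eye-audio, matching the manual steps below:

conda activate eye-audio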

Manual way

⚠️ Note: This is not the recommended way. Use the environment.yml file instead to create the environment.

conda create -n eye-audio python=3.10 
conda activate eye-audio
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
pip install easydict 
pip install opencv-python
pip install tqdm
pip install matplotlib
pip install scikit-learn
pip install PyYAML
pip install wandb
pip install hydra-core
pip install pandas
pip install seaborn
pip install librosa
pip install moviepy
pip install tabulate
pip install git+https://github.com/yermandy/pyrootutils.git
pip install rich
pip install torch_audiomentations
pip install audiomentations
pip install qpsolvers[open_source_solvers]

Project structure

Create and populate the data/video and data/csv folders:

mkdir -p data/video
mkdir -p data/csv

# cvut dataset
mkdir -p data/video/cvut
mkdir -p data/csv/cvut
ln -s ~/data/MultiDo/CVUTFD/copy/*.{MP4,MOV,mov,mp4,mts,MTS} data/video/cvut/
ln -s ~/data/MultiDo/CVUTFD/result/*.csv data/csv/cvut/

# eyedea dataset
mkdir -p data/video/eyedea
mkdir -p data/csv/eyedea
ln -s ~/data/MultiDo/videa_prujezdy/*.{MP4,MOV,mov,mp4,mts,MTS} data/video/eyedea/
ln -s ~/data/MultiDo/videa_prujezdy/*.csv data/csv/eyedea/
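
To sanity-check that the symlinks resolve, list a few of them (any of the linked folders works; -L dereferences the links):

ls -lL data/video/cvut | head
ls -lL data/csv/cvut | head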

Preprocess files

Use preprocess_data.py to generate files in data/audio, data/audio_tensors, data/labels and data/intervals

Example:

python preprocess_data.py config/dataset/000_debug.yaml

where config/dataset/000_debug.yaml is the path to a YAML list of files to be preprocessed.
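
The exact schema of this YAML file is defined by preprocess_data.py; as a purely hypothetical sketch, creating and running a new dataset config could look like this (the my_dataset.yaml name and the list-of-file-stems layout are assumptions, check an existing config such as config/dataset/000_debug.yaml for the real format):

# hypothetical config contents; the real schema may differ
cat > config/dataset/my_dataset.yaml <<'EOF'
- cvut/71_Samsung
EOF
python preprocess_data.py config/dataset/my_dataset.yaml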

Converting videos with ffmpeg:

ffmpeg -i input_video.mts -c:v copy -c:a aac -b:a 256k output_video.mp4
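
To convert a whole folder at once, a simple shell loop over the same command works (paths are illustrative; the video stream is copied as-is and only the audio is re-encoded to AAC):

for f in data/video/cvut/*.mts; do
    ffmpeg -i "$f" -c:v copy -c:a aac -b:a 256k "${f%.mts}.mp4"
done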

Wandb account

To visualize training curves, create a wandb account and add a new project, then add your wandb project name and account name to config/wandb/wandb.yaml.
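
A minimal sketch of what that file might contain, assuming it uses the standard wandb project/entity keys (the key names are an assumption, verify them against the existing config/wandb/wandb.yaml):

# hypothetical keys; check the repository's config/wandb/wandb.yaml before overwriting it
cat > config/wandb/wandb.yaml <<'EOF'
project: my-project-name
entity: my-wandb-username
EOF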

Neural Network Training

Debug training

The following command runs training for a few epochs and saves the results to the outputs/000_debug folder:

python cross_validation.py experiment=000_debug cuda=1

Best Model

Change the training configuration in config/model/default.yaml.

To override the run configuration, use:

python cross_validation.py experiment=047_october

where 047_october is the name of the experiment defined in the config/experiment/047_october.yaml file.
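
Hydra-style overrides can be combined on one command line; for example, selecting this experiment together with the cuda flag used in the debug run above:

python cross_validation.py experiment=047_october cuda=1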

Weights

Download the pretrained model here and unzip it in the outputs folder.
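
Assuming the downloaded archive is called weights.zip (the name is illustrative):

unzip weights.zip -d outputs/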

Demos

Demo 1

Prediction.

It takes audio extracted from a video, applies the multi-head audio predictor, and outputs predictions for individual time windows together with a summary.

Input:

  1. videos
  2. model

Output:

  1. predictions for each time window
  2. counts for each head

Usage:

python demo_1.py -v 71_Samsung -m 047_october/0

Note that the 71_Samsung video file should be somewhere in the subdirectories of data/video/**. The full model path is "outputs/047_october/0/rvce.pth".
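
To check where the video actually resides (illustrative):

find data/video -name "71_Samsung*"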

Demo 2

Prediction and evaluation. The same as demo_1, but it uses ground-truth labels to evaluate prediction accuracy.

Input:

  1. videos
  2. model
  3. csv files with annotations

Output:

  1. rvce for each head
  2. fault detection visualization

Usage:

python demo_2.py -v 71_Samsung -m 047_october/0

Note that the 71_Samsung video file should be somewhere in the subdirectories of data/video/** and its annotations in data/csv/**. The full model path is "outputs/047_october/0/rvce.pth".

Demo 3

It splits the input (long) video into two parts: the beginning part is used for fine-tuning the prediction model, and the trailing part is used for prediction and evaluation.

Input:

  1. videos
  2. model
  3. csv files with annotations
  4. fine-tuning length (training part)

Output:

  1. rvce for each head on the test part

Usage:

python demo_3.py -v 71_Samsung -m 047_october/0 --device cpu --training_hours 0.15

Note that the 71_Samsung video file should be somewhere in the subdirectories of data/video/** and its annotations in data/csv/**. The full model path is "outputs/047_october/0/rvce.pth". The first 0.15 hours of the video are used for training and the rest for evaluation; the --device cpu flag makes the demo run on the CPU only.