- Related YouTube series: PyTorch for Audio + Music Processing
- Code provided by the author: GitHub Repo
Files / Directories Info (listed in the order explained in the tutorial above):
Directories:
- `MNIST`: A dataset, auto-downloaded in `train.py`.
- `UrbanSound8K`: A dataset, downloaded from the URBANSOUND8K DATASET website.
Files:
- `train_feed_forward_network.py`: Contains a class `FeedForwardNet` and the functions `download_mnist_datasets`, `train_one_epoch`, and `train`, which download the MNIST dataset and train a `FeedForwardNet` model on it.
- `feedforwardnet.pth`: Model saved from `train.py`.
- `predict_feed_forward_network.py`: Contains a function `predict` for validating the model `feedforwardnet.pth`.
- `urban_sound_dataset.py`: Contains a class `UrbanSoundDataset` for loading the `.wav` files in the UrbanSound8K dataset and getting the waveform signal, sample rate, and mel-spectrogram of each audio clip. Several steps are performed in the method `__getitem__`:
  - Load the `.wav` audio file and get its waveform signal and sample rate.
  - Resample the signal if the original sample rate does not equal the target sample rate.
  - Mix down multiple channels to mono.
  - If the number of samples is more than expected, cut the signal.
  - If the number of samples is less than expected, right-pad the signal.
  - Apply the transform (here `mel_spectrogram`) to the signal.
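The mix-down, cut, and pad steps above can be sketched with plain torch tensor operations (a minimal illustration; the target length `NUM_SAMPLES` and the tensor shapes are assumptions, and in the real dataset class `torchaudio.transforms.Resample` and `torchaudio.transforms.MelSpectrogram` would handle the resampling and transform steps):

```python
import torch
import torch.nn.functional as F

NUM_SAMPLES = 22050  # assumed target length (1 second at 22050 Hz)

def preprocess(signal: torch.Tensor) -> torch.Tensor:
    """Mix down, cut, and right-pad a waveform of shape (channels, samples)."""
    # Mix down multiple channels to mono.
    if signal.shape[0] > 1:
        signal = torch.mean(signal, dim=0, keepdim=True)
    # Cut if the signal has more samples than expected.
    if signal.shape[1] > NUM_SAMPLES:
        signal = signal[:, :NUM_SAMPLES]
    # Right-pad if the signal has fewer samples than expected.
    if signal.shape[1] < NUM_SAMPLES:
        missing = NUM_SAMPLES - signal.shape[1]
        signal = F.pad(signal, (0, missing))  # pad zeros on the right only
    return signal

# A stereo signal that is too short: becomes mono, padded to NUM_SAMPLES.
out = preprocess(torch.rand(2, 16000))
print(out.shape)  # torch.Size([1, 22050])
```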
- `cnn.py`: A simple CNN model.
- `train_cnn.py`: Uses the model in `cnn.py` to train on the UrbanSound8K dataset.
- `predict_cnn.py`: Makes predictions with the model, in the same way as `predict_feed_forward_network.py`.
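The notes call `cnn.py` "a simple CNN model" without showing it; a sketch of the kind of architecture the tutorial builds (four Conv2d/ReLU/MaxPool blocks feeding a linear classifier over UrbanSound8K's 10 classes) might look like this — the layer sizes and the 1×64×44 mel-spectrogram input shape are assumptions:

```python
import torch
from torch import nn

class CNNNetwork(nn.Module):
    """A simple CNN classifier for mel-spectrogram inputs (assumed 1x64x44)."""
    def __init__(self):
        super().__init__()
        def block(in_ch, out_ch):
            # One conv block: convolution, non-linearity, downsampling.
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=2),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=2),
            )
        self.conv = nn.Sequential(block(1, 16), block(16, 32),
                                  block(32, 64), block(64, 128))
        self.flatten = nn.Flatten()
        # 128 channels x 5 x 4 spatial size after four blocks on a 64x44 input.
        self.linear = nn.Linear(128 * 5 * 4, 10)  # 10 UrbanSound8K classes

    def forward(self, x):
        x = self.conv(x)
        x = self.flatten(x)
        return self.linear(x)  # raw logits; apply softmax for probabilities

# One fake mel-spectrogram: batch of 1, 1 channel, 64 mel bands, 44 frames.
logits = CNNNetwork()(torch.rand(1, 1, 64, 44))
print(logits.shape)  # torch.Size([1, 10])
```

The `Linear` input size is tied to the assumed input shape; with a different number of mel bands or frames, the flattened size (and hence the layer) must change accordingly.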
- Got a `RuntimeError: No audio I/O backend is available.` message while running `torchaudio.load(audio_sample_path)` in `urban_sound_dataset.py`:

  ```shell
  # try one of the commands below
  pip install SoundFile
  # or
  pip install sox
  ```
- Got the error message below when plotting a mel-spectrogram using matplotlib:

  ```
  manager_pyplot_show = vars(manager_class).get("pyplot_show")
  TypeError: vars() argument must have __dict__ attribute
  ```

  Solution (from Stack Overflow):

  ```python
  import matplotlib as mpl
  mpl.use('TkAgg')  # Add this line before importing pyplot
  ```