<a href="https://colab.research.google.com/github/beamscource/organic/blob/main/beatorganic_project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **General goals of a ML project**



1. Find an interesting and unique question.
2. Identify which data to use (audio, pictures).
3. Scrape and prepare the data.
4. Train a NN model (using SageMaker).
5. Evaluate the model's performance.
6. Deploy the model in a Streamlit app.
7. Make predictions.
8. Build a pipeline to add new features and retrain.

# **Beatorganic**

### Scraping beatbox videos from YT



https://github.com/marcosschroh/youtube-audio-downloader/blob/master/sound_downloader/downloader.py

https://www.geeksforgeeks.org/youtube-mediaaudio-download-using-python-pafy/

https://github.com/carlthome/audio-scraper

**Result**: None of the Python libraries tried works reliably with YT, because YT's site changes frequently. It is easier to use an online downloader such as www.loader.to.

### Prepare the dataset



- sound classes in beatboxing: 50 to 120

  see https://www.reddit.com/r/beatbox/comments/bfrfrl/here_it_is_a_complete_list_of_all_sounds_in/

- sound libraries and labelling tools in Python

  - https://www.alixaprodev.com/2021/09/python-libraries-for-audio-manipulation.html
  - https://github.com/Parisson (audio data labelling, not stable)
  - https://github.com/heartexlabs/label-studio

In [None]:
# edit_files.py (https://github.com/beamscource/organic) uses librosa
# to split a WAV file into short clips

!python edit_files.py -i /mnt/c/Users/kleine1/Downloads/dlow/ -c 6
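
For readers without the repository at hand, the idea of cutting a recording into fixed-length clips can be sketched with the standard library alone. This is not the code in `edit_files.py`, which uses librosa; the `wave` module is swapped in to keep the example dependency-free. The function name `split_wav`, the output file naming, and the 6-second default (only a guess at what `-c 6` selects) are assumptions made for illustration.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, clip_seconds=6.0):
    """Cut a PCM WAV file into consecutive clips of clip_seconds each.

    The last clip may be shorter. Returns the list of written paths.
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        frames_per_clip = max(1, int(clip_seconds * reader.getframerate()))
        index = 0
        while True:
            frames = reader.readframes(frames_per_clip)
            if not frames:
                break
            clip_path = out_dir / f"{src.stem}_{index:04d}.wav"
            with wave.open(str(clip_path), "wb") as writer:
                # Copy channels, sample width and rate; the frame count
                # in the header is corrected when the writer closes.
                writer.setparams(params)
                writer.writeframes(frames)
            written.append(clip_path)
            index += 1
    return written
```

A call such as `split_wav("routine.wav", "clips")` would write `clips/routine_0000.wav`, `clips/routine_0001.wav`, and so on. Only uncompressed PCM WAV input is handled; other formats would need librosa or a similar decoder.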

### PyTorch models

- audio classification with CNN+LSTM models

https://hackernoon.com/intro-to-audio-analysis-recognizing-sounds-using-machine-learning-qy2r3ufl

https://www.nature.com/articles/s41598-021-96446-w
https://www.nature.com/articles/s41598-019-48909-4

https://towardsdatascience.com/audio-deep-learning-made-simple-sound-classification-step-by-step-cebc936bbe5

https://towardsdatascience.com/real-time-sound-event-classification-83e892cf187e


- use case 1: beatbox analytics for different beatboxers (e.g., Are there more instances of X in routines of beatboxer A vs beatboxer B?)
- use case 2: semi-supervised training for new data
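
Use case 1 comes down to comparing how often each sound class occurs per beatboxer once a classifier has labelled every clip. A minimal sketch, assuming the classifier output is already available as plain lists of label strings; the labels (`kick`, `snare`, `hihat`), the example lists, and the names `class_shares` and `routines` are hypothetical placeholders, not project data:

```python
from collections import Counter


def class_shares(labels):
    """Return the fraction of clips assigned to each sound class."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


# Hypothetical per-clip predictions for two routines.
routines = {
    "beatboxer_a": ["kick", "snare", "kick", "hihat"],
    "beatboxer_b": ["kick", "hihat", "hihat", "hihat"],
}
shares = {name: class_shares(labels) for name, labels in routines.items()}

# Positive values mean the class is relatively more frequent for beatboxer A.
difference = {
    sound: shares["beatboxer_a"].get(sound, 0.0) - shares["beatboxer_b"].get(sound, 0.0)
    for sound in sorted(set().union(*shares.values()))
}
print(difference)
```

Shares are compared instead of raw counts so that routines of different length remain comparable.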

- generative autoencoder/diffusion/GAN models

- WaveGAN

https://github.com/chrisdonahue/wavegan

- Wavenet

https://medium.com/@rachelchen_49210/generating-ambient-noise-from-wavenet-95aa7f0a8f77

https://towardsdatascience.com/neuralfunk-combining-deep-learning-with-sound-design-91935759d628

https://magenta.tensorflow.org/nsynth

https://www.deepmind.com/blog/wavenet-a-generative-model-for-raw-audio

https://github.com/vincentherrmann/pytorch-wavenet

- DiffWave

https://github.com/lmnt-com/diffwave

In [None]:
!pip install diffwave
!pip install tensorboard
!pip install Pillow

In [None]:
!python -m diffwave.preprocess /mnt/c/Users/kleine1/Downloads/dlow/output_clean
!python -m diffwave /mnt/c/Users/kleine1/Downloads/dlow/model /mnt/c/Users/kleine1/Downloads/dlow/output_clean

In [None]:
!tensorboard --logdir /mnt/c/Users/kleine1/Downloads/dlow/model --host localhost --port 8088

- https://diffwave-demo.github.io/

- generating sounds from autoencoder models

https://github.com/musikalkemist/generating-sound-with-neural-networks

- use case: style transfer

https://github.com/inzva/Audio-Style-Transfer

https://arxiv.org/abs/1710.11385


https://github.com/BenAAndrew/Voice-Cloning-App

**TTS with Hugging Face**

https://github.com/Vaibhavs10/ml-with-audio

https://huggingface.co/tasks/text-to-speech

https://colab.research.google.com/github/espnet/notebook/blob/master/espnet2_tts_realtime_demo.ipynb#scrollTo=vrRM57hhgtHy

**Audio playback**

- https://realpython.com/playing-and-recording-sound-python/
- https://python-sounddevice.readthedocs.io/en/0.4.5/
- https://stackoverflow.com/questions/8707967/playing-a-sound-from-a-wave-form-stored-in-an-array
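
When a waveform only exists as an array of floats, one simple route is to write it to a WAV file that any player can open. A minimal sketch using only the standard library; the name `write_wav`, the 22050 Hz rate, and the sine tone standing in for model output are assumptions for illustration:

```python
import math
import os
import struct
import tempfile
import wave


def write_wav(path, samples, sample_rate=22050):
    """Write floats in [-1.0, 1.0] as a mono 16-bit PCM WAV file."""
    pcm = bytearray()
    for value in samples:
        clipped = max(-1.0, min(1.0, value))  # avoid integer overflow
        pcm += struct.pack("<h", round(clipped * 32767))
    with wave.open(path, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 2 bytes per sample = 16 bit
        writer.setframerate(sample_rate)
        writer.writeframes(bytes(pcm))


# One second of a 440 Hz sine tone as a stand-in for generated audio.
rate = 22050
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
tone_path = os.path.join(tempfile.gettempdir(), "tone.wav")
write_wav(tone_path, tone, rate)
```

For direct playback without a file, the same samples could be passed as a NumPy array to `sounddevice.play` from the python-sounddevice package linked above.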

**Training on a M1 chip**

- make sure macOS 12.3+ and PyTorch 1.13+ are installed

In [None]:
import torch

print(torch.__version__)

# True only if macOS is 12.3+ and an MPS-capable device is present
print(torch.backends.mps.is_available())
# True if the current PyTorch installation was built with MPS support
print(torch.backends.mps.is_built())
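
Building on these checks, a training script usually wants a single device string with a fallback. A small sketch under stated assumptions: `pick_device` is a hypothetical helper, the preference order (MPS, then CUDA, then CPU) is a choice made here, and the function returns `"cpu"` when PyTorch is missing or too old to expose `torch.backends.mps`.

```python
def pick_device():
    """Return "mps", "cuda" or "cpu", in that order of preference."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed at all
    mps = getattr(torch.backends, "mps", None)  # absent in old releases
    if mps is not None and mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
print(device)
# A model or tensor would then be moved with .to(device), e.g. model.to(device).
```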

- https://github.com/pytorch/pytorch/releases

- https://towardsdatascience.com/installing-pytorch-on-apple-m1-chip-with-gpu-acceleration-3351dc44d67c
- https://stackoverflow.com/questions/68820453/how-to-run-pytorch-on-macbook-pro-m1-gpu
- https://stackoverflow.com/questions/72416726/how-to-move-pytorch-model-to-gpu-on-apple-m1-chips

- slow MPS device: https://github.com/pytorch/pytorch/issues/77799

**Training on Colab**

https://neptune.ai/blog/how-to-use-google-colab-for-deep-learning-complete-tutorial

https://blog.roboflow.com/how-to-save-and-load-weights-in-google-colab/

In [None]:
!pip install diffwave
!pip install tensorboard
!pip install Pillow

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
# done already
!python -m diffwave.preprocess /content/drive/MyDrive/dlow/output_clean/
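
Since preprocessing is skipped on later runs, it can help to confirm that every clip actually has its spectrogram before training starts. A sketch under assumptions: the `<name>.wav.spec.npy` naming is inferred from the `white_noise.wav.spec.npy` file used in the inference cell further down, the recursive search is a guess at how the data directory is laid out, and `missing_spectrograms` is a hypothetical helper name.

```python
from pathlib import Path


def missing_spectrograms(data_dir):
    """List .wav files under data_dir without a <name>.wav.spec.npy beside them."""
    return sorted(
        wav
        for wav in Path(data_dir).rglob("*.wav")
        if not Path(f"{wav}.spec.npy").exists()
    )
```

Calling `missing_spectrograms("/content/drive/MyDrive/dlow/output_clean/")` would return an empty list when preprocessing is complete for that folder.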

In [None]:
!python -m diffwave /content/drive/MyDrive/dlow/output_clean/model/ /content/drive/MyDrive/dlow/output_clean/

Epoch 0: 100% 42/42 [01:23<00:00,  1.99s/it]
Epoch 1: 100% 42/42 [01:00<00:00,  1.43s/it]
Epoch 2: 100% 42/42 [01:01<00:00,  1.48s/it]
Epoch 3: 100% 42/42 [01:03<00:00,  1.50s/it]
Epoch 4: 100% 42/42 [01:03<00:00,  1.52s/it]
Epoch 5: 100% 42/42 [01:04<00:00,  1.53s/it]
Epoch 6: 100% 42/42 [01:04<00:00,  1.53s/it]
Epoch 7: 100% 42/42 [01:04<00:00,  1.53s/it]
Epoch 8: 100% 42/42 [01:04<00:00,  1.53s/it]
Epoch 9: 100% 42/42 [01:04<00:00,  1.53s/it]
Epoch 10: 100% 42/42 [01:04<00:00,  1.53s/it]
Epoch 11: 100% 42/42 [01:04<00:00,  1.53s/it]
Epoch 12: 100% 42/42 [01:04<00:00,  1.53s/it]
Epoch 13: 100% 42/42 [01:04<00:00,  1.54s/it]
Epoch 14: 100% 42/42 [01:04<00:00,  1.54s/it]
Epoch 15: 100% 42/42 [01:04<00:00,  1.54s/it]
Epoch 16: 100% 42/42 [01:04<00:00,  1.54s/it]
Epoch 17: 100% 42/42 [01:04<00:00,  1.54s/it]
Epoch 18: 100% 42/42 [01:04<00:00,  1.53s/it]
Epoch 19: 100% 42/42 [01:04<00:00,  1.53s/it]
Epoch 20: 100% 42/42 [01:04<00:00,  1.53s/it]
Epoch 21: 100% 42/42 [01:04<00:00,  1.54s/it
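
The progress lines above carry enough information to estimate training cost per step. A small sketch that parses them; `LINE` and `epoch_seconds` are illustrative names, the sample text is the first three lines of the log above, and the pattern assumes the elapsed time is printed as `mm:ss` (runs longer than an hour per epoch would need an extra field).

```python
import re

# Matches lines such as "Epoch 3: 100% 42/42 [01:03<00:00,  1.50s/it]".
LINE = re.compile(r"Epoch (\d+): 100% (\d+)/\d+ \[(\d+):(\d+)<")


def epoch_seconds(log_text):
    """Map epoch number -> (iterations, elapsed seconds) for finished epochs."""
    result = {}
    for match in LINE.finditer(log_text):
        epoch, iterations, minutes, seconds = map(int, match.groups())
        result[epoch] = (iterations, 60 * minutes + seconds)
    return result


log = """Epoch 0: 100% 42/42 [01:23<00:00,  1.99s/it]
Epoch 1: 100% 42/42 [01:00<00:00,  1.43s/it]
Epoch 2: 100% 42/42 [01:01<00:00,  1.48s/it]"""
timings = epoch_seconds(log)
steps = sum(iterations for iterations, _ in timings.values())
total = sum(seconds for _, seconds in timings.values())
print(f"{steps} steps in {total} s, {total / steps:.2f} s per step")
# prints "126 steps in 204 s, 1.62 s per step"
```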

In [None]:
# patch the installed diffwave sources (dataset.py and inference.py)
!vi /usr/local/lib/python3.8/dist-packages/diffwave/dataset.py /usr/local/lib/python3.8/dist-packages/diffwave/inference.py

In [None]:
%load_ext tensorboard
%tensorboard --logdir /content/drive/MyDrive/dlow/model/

In [None]:
!python -m diffwave.inference --fast /content/drive/MyDrive/dlow/weights-13776.pt /content/drive/MyDrive/dlow/white_noise.wav.spec.npy -o output.wav

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.8/dist-packages/diffwave/inference.py", line 106, in <module>
    main(parser.parse_args())
  File "/usr/local/lib/python3.8/dist-packages/diffwave/inference.py", line 92, in main
    audio, sr = predict(spectrogram, model_dir=args.model_dir, fast_sampling=args.fast)
  File "/usr/local/lib/python3.8/dist-packages/diffwave/inference.py", line 35, in predict
    checkpoint = torch.load(model_dir)
  File "/usr/local/lib/python3.8/dist-packages/torch/serialization.py", line 789, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.8/dist-packages/torch/serialization.py", line 1131, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.8/dis

In [None]:
!python -m diffwave.inference --fast /content/drive/MyDrive/dlow/weights-13776.pt /content/drive/MyDrive/dlow/white_noise.wav.spec.npy -o output.wav