Sascha Spors,
Professorship Signal Theory and Digital Signal Processing,
Institute of Communications Engineering (INT),
Faculty of Computer Science and Electrical Engineering (IEF),
University of Rostock,
Germany

# Data Driven Audio Signal Processing - A Tutorial with Computational Examples

Master Course #24512

- lecture: https://github.com/spatialaudio/data-driven-audio-signal-processing-lecture
- tutorial: https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise

Feel free to contact the lecturer at frank.schultz@uni-rostock.de

This tutorial is still evolving, so major parts may be rearranged from year to year to fit the students' needs. We will prepare more audio examples in the future.

# Homework Task Template

- [Homework Task Template](homework/homework.ipynb)

# Planned Syllabus Winter Semester 2024/25

- [Numerical Examples from the Slides](slides/ddasp_exercise_slides.ipynb)


## Exercise: Motivation / Introducing a Toy Example for SVD / Regression
- [Introduction to the Course](exercise01.ipynb)


## Exercise: Singular Value Decomposition (SVD) / 4 Subspaces of a Matrix

- [SVD and 4 Subspaces](exercise04_svd.ipynb)
- [SVD and 4 Subspaces, above example as Matlab script](svd_four_subspaces.m) 


## Exercise: Left Inverse / Linear Regression with Ordinary Least Squares (OLS)
- [SVD and Left Inverse](exercise04_leftinv.ipynb)
- [SVD and Right Inverse](exercise04_rightinv.ipynb)
- [Linear Regression with OLS](ols.ipynb)


## Exercise: SVD Factorization for Multitrack Audio Matrix
- [exercise05.ipynb](exercise05.ipynb)


## Exercise: Linear and Ridge Regression on Multitrack Audio Matrix
- [L-curve to find optimum regularization parameter](lcurve.ipynb)
- [exercise08_ridge_regression.ipynb](exercise08_ridge_regression.ipynb)
- [exercise07_left_inverse_SVD_QR.ipynb](exercise07_left_inverse_SVD_QR.ipynb)

## Exercise: Audio Features
- [Audio Features I](exercise02.ipynb) (Segmentation, STFT, Spectrogram, Periodogram)
- [Audio Features II](exercise03.ipynb) (Segmentation, RMS/(True)Peak/Crest Factor, R128 loudness)

## Exercise: PCA
- [pca_2D.ipynb](pca_2D.ipynb)
- [pca_3D.ipynb](pca_3D.ipynb)
- [pca_audio_features.ipynb](pca_audio_features.ipynb)

## Exercise: Bias Variance Trade-Off vs. Model Complexity
- [Bias-Variance Trade-Off vs. Model Complexity](bias_variance_linear_regression.ipynb)


## Exercise: Bias Variance Trade-Off vs. Hyper Parameter Tuning
- [Bias-Variance Trade-Off vs. Regularization](bias_variance_ridge_regression.ipynb)


## Exercise: Gradient Descent along a 2D Surface
- [Gradient Descent 1](gradient_descent.ipynb) with one saddle point
- [Gradient Descent 2](gradient_descent2.ipynb) with saddle points, local maximum, local minima and a global minimum
- [Gradient Descent with Momentum](gradient_descent_momentum.ipynb)
- [Stochastic Gradient Descent for Least Squares Error](gradient_descent_on_least_squares.ipynb)
- [Stochastic Gradient Descent for Complex Data Least Squares Error Using PyTorch](gradient_descent_on_complex_data_least_squares.ipynb)



## Exercise: XOR as Non-Linear, Two-Layer Model
- The XOR mapping is a popular example to motivate non-linearities in models, as linear regression in [exercise10_xor_example.m](exercise10_xor_example.m) cannot solve this simple problem (coding this in Python on our own is good practice for linear algebra handling and OLS)
- The XOR mapping as a simple **0/1 classification** using a **non-linear hidden layer** and a **linear output layer** is coded in [regression_xor_twolayers.ipynb](regression_xor_twolayers.ipynb)
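
As a minimal sketch of both statements above (not taken from the linked files), the following NumPy code fits a linear model with bias to the XOR table via OLS and then evaluates a two-layer model with a ReLU hidden layer and a linear output layer. The weights `W1`, `b1`, `w2` are hand-picked assumptions for illustration; they are not trained and need not match the notebook's model.

```python
import numpy as np

# XOR truth table: four input pairs and their 0/1 targets
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0, 1.0, 0.0])

# linear regression with bias term, solved by ordinary least squares
A = np.column_stack([np.ones(4), X])
theta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_linear = A @ theta  # best linear fit cannot separate the two classes

# two-layer model: ReLU hidden layer followed by a linear output layer
# (hand-picked weights, an assumption for this sketch)
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
w2 = np.array([1.0, -2.0])
hidden = np.maximum(0.0, X @ W1 + b1)
y_twolayer = hidden @ w2

print("OLS prediction:      ", y_linear)
print("two-layer prediction:", y_twolayer)
```

The OLS fit returns the same value for all four inputs, whereas the non-linear hidden layer lets the linear output layer reproduce the XOR targets exactly.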

We should not miss these brilliant resources for getting started with neural networks:
- [https://pythonalgos.com/create-a-neural-network-from-scratch-in-python-3/](https://pythonalgos.com/create-a-neural-network-from-scratch-in-python-3/)
- [https://playground.tensorflow.org](https://playground.tensorflow.org)
- https://www.tensorflow.org/tutorials/keras/overfit_and_underfit (and the other tutorials found there)



## Exercise: Binary Logistic Regression with Only One Sigmoid Output Layer
- [exercise10_binary_logistic_regression.py](exercise10_binary_logistic_regression.py)
- In [Binary logistic regression, our implementation vs. Tensorflow](binary_logistic_regression_tf.ipynb) we compare **our own implementation against a TF model**


## Exercise: Binary Logistic Regression with Hidden Layers and Sigmoid Output Layer
- Next, we create more complex models in [binary_logistic_regression_tf_with_hidden_layers.ipynb](binary_logistic_regression_tf_with_hidden_layers.ipynb) using **hidden layers**, but still with **manually tuned hyper parameters**
- It might be worth spending some time with the brilliant application https://playground.tensorflow.org/ to get a feeling for how models are trained on rather simple data sets


## Exercise: Multi-Class Classification with Hidden Layers and Softmax Output Layer

- With [exercise12_MulticlassClassification_CategoricalCrossentropy.ipynb](exercise12_MulticlassClassification_CategoricalCrossentropy.ipynb) we expand the binary classification example [binary_logistic_regression_tf_with_hidden_layers.ipynb](binary_logistic_regression_tf_with_hidden_layers.ipynb)
towards more classes (note that classes are exclusive)


## Exercise: Hyper Parameter Tuning

- With [exercise12_HyperParameterTuning.ipynb](exercise12_HyperParameterTuning.ipynb) we introduce
    - data split into train, validate, test data sets
    - hyper parameter tuning  
    - one hot encoding
    - training of the best model with reset weights using the train / val data sets
    - final prediction on unseen test data set compared to predictions on train / val data sets
    - confusion matrix and visualization of predictions
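
The data handling steps listed above can be sketched in plain NumPy. This is only an illustration under stated assumptions: the random labels, the 60/20/20 split ratio, the helper names `one_hot` and `confusion_matrix`, and the artificial predictions are all hypothetical and not taken from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_classes = 100, 3
labels = rng.integers(0, n_classes, size=n_samples)  # hypothetical integer class labels

# shuffle once, then split the indices into train / validate / test (60/20/20, an assumption)
idx = rng.permutation(n_samples)
n_train, n_val = 60, 20
idx_train = idx[:n_train]
idx_val = idx[n_train:n_train + n_val]
idx_test = idx[n_train + n_val:]


def one_hot(labels, n_classes):
    """Map integer labels to one-hot rows by indexing the identity matrix."""
    return np.eye(n_classes)[labels]


def confusion_matrix(true_labels, pred_labels, n_classes):
    """Count occurrences: rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (true_labels, pred_labels), 1)
    return cm


Y = one_hot(labels, n_classes)

# artificial test predictions: all correct except one deliberately wrong sample
pred_test = labels[idx_test].copy()
pred_test[0] = (pred_test[0] + 1) % n_classes
cm = confusion_matrix(labels[idx_test], pred_test, n_classes)
print(cm)
```

Correct predictions accumulate on the main diagonal of `cm`, so the single wrong prediction shows up as one off-diagonal count.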
    
## Exercise: Simple Music Genre Classification Application
   
- Finally, we apply all our knowledge so far to build a music genre classification application in [exercise12_MusicGenreClassification.ipynb](exercise12_MusicGenreClassification.ipynb)
    - feature design (loudness, crest, peak, rms, spectral weight)
    - feature inspection / avoiding NaNs
    - feature normalization
    - balancing the data set with respect to class occurrence
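
A possible sketch of the feature inspection, normalization and balancing steps, assuming a hypothetical feature matrix with random values, z-score normalization, and balancing by undersampling to the rarest class. The notebook may well use other choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical data: 90 segments with 4 features each, imbalanced over 3 classes
features = rng.normal(loc=5.0, scale=3.0, size=(90, 4))
labels = np.repeat([0, 1, 2], [50, 30, 10])
features[3, 1] = np.nan  # simulate one corrupted feature value

# feature inspection: keep only rows where all features are finite (no NaN / inf)
valid = np.all(np.isfinite(features), axis=1)
features, labels = features[valid], labels[valid]

# feature normalization: zero mean and unit standard deviation per feature column
mean = features.mean(axis=0)
std = features.std(axis=0)
features_norm = (features - mean) / std

# balancing: randomly undersample every class to the size of the rarest class
classes, counts = np.unique(labels, return_counts=True)
n_min = counts.min()
keep = np.concatenate(
    [rng.choice(np.flatnonzero(labels == c), size=n_min, replace=False) for c in classes]
)
features_bal, labels_bal = features_norm[keep], labels[keep]
print(np.bincount(labels_bal))
```

In a train / validate / test workflow, `mean` and `std` would typically be estimated from the training data only and then reused for the other data sets.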
    
We could move on with dropout layers, regularization...

We could also consider CNNs to work on STFT maps to solve this task...or combine CNN and DNN...nice homework project...

And of course we might check the research literature, e.g. on [IEEE Xplore](https://ieeexplore.ieee.org/search/searchresult.jsp?newsearch=true&queryText=music%20genre%20classification), to figure out the current state of research.
We might then realize that recent models are a little more complicated and probably involve some tools unknown to us, but the fundamentals remain the same.
Hence, we should be able to work with the literature after attending our courses and comprehending all the provided material.
Please, do nice and useful things with ML :-)


## Important Things We Might Not Cover in the Tutorials 
- [Audio Signal Fundamentals](audio_introduction.ipynb)
- [QR factorization.ipynb](exercise07_QR.ipynb)
- [linear_regression LS_vs_SVD](exercise07_linear_regression_LS_vs_SVD.ipynb)
- [Tensorflow convolution vs correlation](tf_conv1D_vs_corr_conv.ipynb)
- [Tensorflow 2D convolution](exercise13_CNN.py)

## Textbook Recommendations
Machine Learning (ML) using linear / non-linear models is a vivid topic and dozens of textbooks are released each year.
The following textbooks are very often referenced in the field and brilliant to learn from.  
- Sebastian **Raschka**, Yuxi Liu, Vahid Mirjalili: *Machine Learning with PyTorch and Scikit-Learn*, Packt, 2022, 1st ed.
- Gilbert **Strang**: *Linear Algebra and Learning from Data*, Wellesley, 2019, consider buying your own copy of this brilliant book
- Gareth **James**, Daniela Witten, Trevor Hastie, Rob Tibshirani: *An Introduction to Statistical Learning* with Applications in R, Springer, 2nd ed., 2021, [free pdf e-book](https://www.statlearning.com/)
- Trevor **Hastie**, Robert Tibshirani, Jerome Friedman: *The Elements of  Statistical Learning: Data Mining, Inference, and Prediction*, Springer, 2nd ed., 2009, [free pdf e-book](https://hastie.su.domains/ElemStatLearn/)
- Sergios **Theodoridis**: *Machine Learning*, Academic Press, 2nd ed., 2020, check your university library service for free pdf e-book
- Kevin P. **Murphy**: *Probabilistic Machine Learning: An Introduction*, MIT Press, 1st ed., [open source book and current draft as free pdf](https://probml.github.io/pml-book/book1.html)
- Ian **Goodfellow**, Yoshua Bengio, Aaron Courville: *Deep Learning*, MIT Press, 2016
- Marc Peter **Deisenroth**, A. Aldo Faisal, Cheng Soon Ong: *Mathematics for Machine Learning*, Cambridge University Press, 2020, [free pdf e-book](https://mml-book.github.io/)
- Steven L. **Brunton**, J. Nathan Kutz: *Data Driven Science & Engineering - Machine Learning, Dynamical Systems, and Control*, Cambridge University Press, 2020, [free pdf of draft](http://www.databookuw.com/databook.pdf), see also the [video lectures](http://www.databookuw.com/) and [Python tutorials](https://github.com/dylewsky/Data_Driven_Science_Python_Demos)
- Aurélien **Géron**: *Hands-on machine learning with Scikit-Learn, Keras and TensorFlow*. O’Reilly, 2nd ed., 2019, [Python tutorials](https://github.com/ageron/handson-ml2)

ML deals with concepts that have actually been known for decades (at least the linear modeling part of it), so if we are really serious about learning ML deeply, we should think over concepts of statistical signal processing, maximum likelihood, Bayesian vs. frequentist statistics, generalized linear models, hierarchical models... For these topics we could check these respected textbooks:
- L. **Fahrmeir**, A. Hamerle, and G. Tutz, Multivariate statistische Verfahren, 2nd ed. de Gruyter, 1996.
- L. **Fahrmeir**, T. Kneib, S. Lang, and B. D. Marx, Regression, 2nd ed. Springer, 2021.
- A. J. **Dobson** and A. G. Barnett, An Introduction to Generalized Linear Models, 4th ed. CRC Press, 2018.
- H. **Madsen**, P. Thyregod, Introduction to General and Generalized Linear Models, CRC Press, 2011.
- A. **Agresti**, Foundations of Linear and Generalized Models, Wiley, 2015

## Open Course Ware Recommendations

- Online Course by Andrew **Ng** et al. at https://www.coursera.org/ and https://www.deeplearning.ai/
- Online Course by Gilbert **Strang** et al. at https://ocw.mit.edu/courses/mathematics/18-065-matrix-methods-in-data-analysis-signal-processing-and-machine-learning-spring-2018/
- Online Course/Material by Aurélien **Géron** https://github.com/ageron
- Online Course by Meinard **Müller** https://www.audiolabs-erlangen.de/resources/MIR/FMP/B/B_GetStarted.html (focus on music information retrieval)

## Authorship
- current main authors
    - University of Rostock
        - [Frank Schultz](https://orcid.org/0000-0002-3010-0294)
        - [Sascha Spors](https://orcid.org/0000-0001-7225-9992)


## Copyright

- the notebooks are provided as [Open Educational Resources](https://en.wikipedia.org/wiki/Open_educational_resources)
- the text is licensed under [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/)
- the code of the IPython examples is licensed under the [MIT license](https://opensource.org/licenses/MIT)
- feel free to use the notebooks for your own purposes considering above licenses

## Referencing
- please cite this open educational resource (OER) project as *Frank Schultz, Data Driven Audio Signal Processing - A Tutorial Featuring Computational Examples, University of Rostock*, ideally with relevant file(s), GitHub URL https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise, commit number and/or version tag, year.