Sascha Spors,
Professorship Signal Theory and Digital Signal Processing,
Institute of Communications Engineering (INT),
Faculty of Computer Science and Electrical Engineering (IEF),
University of Rostock,
Germany

# Data Driven Audio Signal Processing - A Tutorial with Computational Examples

Winter Semester 2022/23 (Master Course #24512)

- lecture: https://github.com/spatialaudio/data-driven-audio-signal-processing-lecture
- tutorial: https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise

Feel free to contact the lecturer at frank.schultz@uni-rostock.de

# Syllabus 2022/23

TBD: We might change the sequence of the 2021/22 material (see below). We will certainly add new and improved material, improve the didactics, and so on.


## Exercise: Motivation / Introducing a Toy Example for SVD/Regression
- [Introduction to the Course](exercise01.ipynb)


## Exercise: Singular Value Decomposition (SVD) / 4 Subspaces

- [SVD and 4 Subspaces](exercise04_svd.ipynb)
- [SVD and 4 Subspaces, above example as Matlab script](svd_four_subspaces.m) 


## Exercise: Left Inverse / Linear Regression with Ordinary Least Squares (OLS)
- [SVD and Left Inverse](exercise04_leftinv.ipynb)
- [SVD and Right Inverse](exercise04_rightinv.ipynb)
- [Linear Regression with OLS](ols.ipynb)
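
As a minimal sketch of the left inverse idea, assuming a tall design matrix $\mathbf{A}$ with full column rank, the OLS solution solves the normal equations $\mathbf{A}^\mathrm{T}\mathbf{A}\,\mathbf{w} = \mathbf{A}^\mathrm{T}\mathbf{y}$. The function name `fit_line_ols` and the toy data are hypothetical and not taken from the notebooks. The example fits a line with intercept and uses only the Python standard library.

```python
def fit_line_ols(x, y):
    """Fit y ~ w0 + w1 * x by ordinary least squares.

    Solves the 2x2 normal equations (A^T A) w = A^T y for the
    design matrix A = [1, x], i.e. applies the left inverse of A.
    """
    n = len(x)
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    sy = sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx  # determinant of A^T A
    if det == 0:
        raise ValueError("A^T A is singular: x values must not all be equal")
    # explicit inverse of the 2x2 matrix A^T A applied to A^T y
    w0 = (sxx * sy - sx * sxy) / det
    w1 = (n * sxy - sx * sy) / det
    return w0, w1


# hypothetical toy data: points scattered around y = 1 + 2 x
x = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.9, 5.2, 6.8]
w0, w1 = fit_line_ols(x, y)
residuals = [yi - (w0 + w1 * xi) for xi, yi in zip(x, y)]
print(f"intercept {w0:.3f}, slope {w1:.3f}")
```

The residual vector is orthogonal to both columns of $\mathbf{A}$, so `sum(residuals)` and the inner product of `residuals` with `x` vanish up to rounding.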


## TBD
- [Audio Signal Fundamentals](audio_introduction.ipynb)
- [Bias-Variance Trade-Off vs. Model Complexity](bias_variance_linear_regression.ipynb)
- [Bias-Variance Trade-Off vs. Regularization](bias_variance_ridge_regression.ipynb)

## Textbook Recommendations

- Gilbert **Strang**: *Linear Algebra and Learning from Data*, Wellesley, 2019, consider buying your own copy of this brilliant book
- Gareth **James**, Daniela Witten, Trevor Hastie, Rob Tibshirani: *An Introduction to Statistical Learning with Applications in R*, Springer, 2nd ed., 2021, [free pdf e-book](https://www.statlearning.com/)
- Trevor **Hastie**, Robert Tibshirani, Jerome Friedman: *The Elements of Statistical Learning: Data Mining, Inference, and Prediction*, Springer, 2nd ed., 2009, [free pdf e-book](https://hastie.su.domains/ElemStatLearn/)
- Sergios **Theodoridis**: *Machine Learning*, Academic Press, 2nd ed., 2020, check your university library service for free pdf e-book
- Kevin P. **Murphy**: *Probabilistic Machine Learning: An Introduction*, MIT Press, 1st ed., [open source book and current draft as free pdf](https://probml.github.io/pml-book/book1.html)
- Marc Peter **Deisenroth**, A. Aldo Faisal, Cheng Soon Ong: *Mathematics for Machine Learning*, Cambridge University Press, 2020, [free pdf e-book](https://mml-book.github.io/)
- Steven L. **Brunton**, J. Nathan Kutz: *Data Driven Science & Engineering - Machine Learning, Dynamical Systems, and Control*, Cambridge University Press, 2020, [free pdf of draft](http://www.databookuw.com/databook.pdf), see also the [video lectures](http://www.databookuw.com/) and [Python tutorials](https://github.com/dylewsky/Data_Driven_Science_Python_Demos)
- Aurélien **Géron**: *Hands-on machine learning with Scikit-Learn, Keras and TensorFlow*. O’Reilly, 2nd ed., 2019, [Python tutorials](https://github.com/ageron/handson-ml2)

ML builds on concepts that have been known for decades (at least the linear modeling part of it). If we are serious about learning ML in depth, we should revisit concepts from statistical signal processing, maximum likelihood, Bayesian vs. frequentist statistics, generalized linear models, hierarchical models, and so on. For that we could check these books:
- L. Fahrmeir, A. Hamerle, and G. Tutz, Multivariate statistische Verfahren, 2nd ed. de Gruyter, 1996.
- L. Fahrmeir, T. Kneib, S. Lang, and B. D. Marx, Regression, 2nd ed. Springer, 2021.
- A. J. Dobson and A. G. Barnett, An Introduction to Generalized Linear Models, 4th ed. CRC Press, 2018.
- H. Madsen, P. Thyregod, Introduction to General and Generalized Linear Models, CRC Press, 2011.
- A. Agresti, Foundations of Linear and Generalized Models, Wiley, 2015.

## Open Course Ware Recommendations

- Online Course by Andrew **Ng** et al. at https://www.coursera.org/ and https://www.deeplearning.ai/
- Online Course by Gilbert **Strang** et al. at https://ocw.mit.edu/courses/mathematics/18-065-matrix-methods-in-data-analysis-signal-processing-and-machine-learning-spring-2018/
- Online Course/Material by Aurélien **Géron** https://github.com/ageron
- Online Course by Meinard **Müller** https://www.audiolabs-erlangen.de/resources/MIR/FMP/B/B_GetStarted.html (focus on music information retrieval)

# Syllabus 2021/22

## Exercise 1: Introduction
[Introduction](exercise01.ipynb)

## Exercise 2: Audio Features I (Segmentation, STFT, Spectrogram, Periodogram)

[Audio Features I](exercise02.ipynb)

## Exercise 3: Audio Features II (Segmentation, RMS/(True)Peak/Crest Factor, R128 loudness)

[Audio Features II](exercise03.ipynb)
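
As a rough sketch of the level features named in this exercise, the following computes RMS, sample peak and crest factor for a sine. The helper names and the test signal are hypothetical and not taken from the notebook. Note that this uses the plain sample peak. A true peak estimate additionally needs oversampling, and R128 loudness needs a weighting filter and gating, neither of which is covered here.

```python
import math


def rms(x):
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def sample_peak(x):
    """Largest absolute sample value (sample peak, not true peak)."""
    return max(abs(v) for v in x)


def crest_factor_db(x):
    """Crest factor, i.e. the peak-to-RMS ratio, in dB."""
    return 20 * math.log10(sample_peak(x) / rms(x))


# hypothetical test signal: one second of a 100 Hz sine at fs = 8 kHz
fs = 8000
f0 = 100
x = [math.sin(2 * math.pi * f0 * n / fs) for n in range(fs)]
print(f"RMS {rms(x):.4f}, peak {sample_peak(x):.4f}, crest {crest_factor_db(x):.2f} dB")
```

For a full-scale sine the RMS is $1/\sqrt{2}$, so the crest factor is about 3.01 dB, which is a handy sanity check for such feature code.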

## Exercise 4: SVD / 4 Subspaces / Left Inverse

- [SVD and 4 Subspaces](exercise04_svd.ipynb)
- [SVD and Left Inverse](exercise04_leftinv.ipynb)
- [SVD and Right Inverse](exercise04_rightinv.ipynb)

## Exercise 5: Column Space Singular Vectors of a Multitrack Audio Matrix 

- [exercise05.ipynb](exercise05.ipynb)

## Exercise 6: Principal Component Analysis (PCA)

- Matlab code:
    - [exercise06_pca_2D.m](exercise06_pca_2D.m)
    - [exercise06_pca_3D.m](exercise06_pca_3D.m)
- Python notebooks TBD
    


## Exercise 7: QR, SVD, Linear Regression vs. SVD Regression

- [exercise07_QR.ipynb](exercise07_QR.ipynb)
- [exercise07_left_inverse_SVD_QR.ipynb](exercise07_left_inverse_SVD_QR.ipynb)
- [exercise07_linear_regression_LS_vs_SVD.ipynb](exercise07_linear_regression_LS_vs_SVD.ipynb)

## Exercise 8: Ridge Regression / Bias vs. Variance

- [exercise08_ridge_regression.ipynb](exercise08_ridge_regression.ipynb)
- [exercise08_bias_variance.ipynb](exercise08_bias_variance.ipynb)
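
To illustrate the shrinkage effect of ridge regression, here is a minimal sketch for the simplest case of a single feature without intercept, where minimizing $\sum_n (y_n - w x_n)^2 + \lambda w^2$ yields $w = \frac{\sum_n x_n y_n}{\sum_n x_n^2 + \lambda}$. The function name and toy data are hypothetical and not taken from the notebooks.

```python
def ridge_slope(x, y, lam):
    """Ridge estimate for the model y ~ w * x (single feature, no intercept).

    Minimizes sum((y - w*x)^2) + lam * w^2, which gives
    w = sum(x*y) / (sum(x^2) + lam). lam = 0 is plain least squares.
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


# hypothetical toy data, roughly y = 2 x
x = [1.0, 2.0, 3.0]
y = [2.1, 3.9, 6.2]
for lam in (0.0, 1.0, 10.0, 100.0):
    print(f"lambda = {lam:6.1f} -> w = {ridge_slope(x, y, lam):.4f}")
```

Increasing $\lambda$ pulls the weight towards zero: the estimate becomes more biased but less sensitive to noise in the data, which is the trade-off this exercise deals with.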


## Exercise 9: Gradient Descent (Steepest Descent)

- [exercise09_gradient_descent.m](exercise09_gradient_descent.m)
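
The basic update rule $\mathbf{w} \leftarrow \mathbf{w} - \mu \nabla J(\mathbf{w})$ can be sketched in a few lines of Python. This is not a port of the Matlab script above; the quadratic cost function, the step size and all names are assumptions chosen for illustration.

```python
def gradient_descent(grad, w0, step, n_iter):
    """Plain gradient descent: w <- w - step * grad(w)."""
    w = list(w0)
    for _ in range(n_iter):
        g = grad(w)
        w = [wi - step * gi for wi, gi in zip(w, g)]
    return w


def grad_J(w):
    """Gradient of the hypothetical cost J(w) = (w1 - 3)^2 + 10 * (w2 + 1)^2."""
    return [2 * (w[0] - 3), 20 * (w[1] + 1)]


# the minimum of J is at (3, -1)
w_opt = gradient_descent(grad_J, w0=[0.0, 0.0], step=0.05, n_iter=200)
print(w_opt)
```

For this cost the iteration is only stable for step sizes below 0.1, set by the direction of largest curvature, while the flat direction along `w[0]` then converges comparatively slowly. This is the typical difficulty of steepest descent on badly conditioned problems.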


## Exercise 10: Perceptron / Neural Networks

- The XOR mapping is a popular example to motivate non-linearities in models, as linear regression cannot solve this simple problem, see [exercise10_xor_example.m](exercise10_xor_example.m)
- Our own implementation of simple **0/1 classification** using only **one layer** with **sigmoid activation** function [exercise10_binary_logistic_regression.py](exercise10_binary_logistic_regression.py)
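
Both points can be illustrated with a small stand-alone sketch. It is not the content of the two files above, and all names, the step size and the number of iterations are assumptions. A single sigmoid unit is trained by batch gradient descent on the binary cross-entropy, once for the linearly separable OR mapping and once for XOR.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, step=1.0, n_iter=5000):
    """One-layer model p = sigmoid(w1*x1 + w2*x2 + b), trained by batch
    gradient descent on the mean binary cross-entropy, starting from zeros."""
    w1 = w2 = b = 0.0
    n = len(X)
    for _ in range(n_iter):
        # for sigmoid + cross-entropy the gradient w.r.t. z is (p - y)
        err = [sigmoid(w1 * x1 + w2 * x2 + b) - yi for (x1, x2), yi in zip(X, y)]
        w1 -= step * sum(e * x1 for e, (x1, _) in zip(err, X)) / n
        w2 -= step * sum(e * x2 for e, (_, x2) in zip(err, X)) / n
        b -= step * sum(err) / n
    return w1, w2, b


def predict(X, params):
    w1, w2, b = params
    return [sigmoid(w1 * x1 + w2 * x2 + b) for x1, x2 in X]


X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y_or = [0, 1, 1, 1]
y_xor = [0, 1, 1, 0]

p_or = predict(X, train_logistic(X, y_or))
p_xor = predict(X, train_logistic(X, y_xor))
print("OR :", [round(p, 2) for p in p_or])
print("XOR:", [round(p, 2) for p in p_xor])
```

The OR outputs end up on the correct side of 0.5. For XOR with zero initialization the gradient contributions of the four points cancel, so all outputs stay at 0.5. No single linear decision boundary separates the XOR classes, which is why hidden layers with non-linear activations are needed.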

We should not miss these brilliant resources for getting started with neural networks:
- [https://pythonalgos.com/create-a-neural-network-from-scratch-in-python-3/](https://pythonalgos.com/create-a-neural-network-from-scratch-in-python-3/)
- [https://playground.tensorflow.org](https://playground.tensorflow.org)
- https://www.tensorflow.org/tutorials/keras/overfit_and_underfit (and the other tutorials found there)

## Exercise 11: Binary Classification

- With [exercise11_binary_logistic_regression_tf.py](exercise11_binary_logistic_regression_tf.py) we **compare our** above implementation [exercise10_binary_logistic_regression.py](exercise10_binary_logistic_regression.py) **against a TF model**
- Next, we create more complex models in [exercise11_binary_logistic_regression_tf_with_hidden_layers.py](exercise11_binary_logistic_regression_tf_with_hidden_layers.py) using **hidden layers**, but still with **manually tuned hyper parameters**



## Exercise 12: Multiclass Classification

- With [exercise12_MulticlassClassification_CategoricalCrossentropy.ipynb](exercise12_MulticlassClassification_CategoricalCrossentropy.ipynb) we expand the example [exercise11_binary_logistic_regression_tf_with_hidden_layers.py](exercise11_binary_logistic_regression_tf_with_hidden_layers.py) towards **classification of more than two classes** using **softmax activation** function in the output layer

- With [exercise12_HyperParameterTuning.ipynb](exercise12_HyperParameterTuning.ipynb) we introduce
    - data split into train, validate, test data sets
    - hyper parameter tuning  
    - one hot encoding
    - training of best model with re-set weights using train / val data set
    - final prediction on unseen test data set compared to predictions on train / val data sets
    - confusion matrix and visualization of predictions
    
- Finally we apply all this to a music genre classification application in [exercise12_MusicGenreClassification.ipynb](exercise12_MusicGenreClassification.ipynb)
    - feature design (loudness, crest, peak, rms, spectral weight)
    - feature inspection / avoiding NaNs
    - feature normalization
    - balancing the data set with respect to class occurrence
    
We could move on with dropout layers, regularization...
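
Several building blocks of this exercise (one hot encoding, softmax activation, categorical cross-entropy, confusion matrix) can be sketched without any framework. The notebooks rely on TensorFlow for this; the plain Python version below, including all function names and the made-up output-layer values, is only an assumption-laden illustration of what happens under the hood.

```python
import math


def one_hot(label, n_classes):
    """Encode an integer class label as a one-hot vector."""
    return [1.0 if k == label else 0.0 for k in range(n_classes)]


def softmax(z):
    """Numerically stable softmax: subtracting max(z) avoids overflow."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def categorical_crossentropy(y_true, y_prob):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_prob) if t > 0)


def confusion_matrix(true_labels, pred_labels, n_classes):
    """Rows: true class, columns: predicted class."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, pred_labels):
        cm[t][p] += 1
    return cm


# hypothetical output-layer values (logits) for 4 samples and 3 classes
logits = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3], [-0.5, 0.2, 2.2], [1.0, 1.2, 0.1]]
true_labels = [0, 1, 2, 0]

probs = [softmax(z) for z in logits]
pred_labels = [p.index(max(p)) for p in probs]  # class with largest probability
loss = sum(
    categorical_crossentropy(one_hot(t, 3), p) for t, p in zip(true_labels, probs)
) / len(true_labels)

print("predicted:", pred_labels)
print("mean loss:", round(loss, 4))
print("confusion matrix:", confusion_matrix(true_labels, pred_labels, 3))
```

The last sample is of class 0 but gets predicted as class 1, which shows up as an off-diagonal entry in the first row of the confusion matrix.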

## Exercise 13: CNN

TBD

[exercise13_CNN.py](exercise13_CNN.py)

## Authorship
- the notebooks are provided as [Open Educational Resources](https://en.wikipedia.org/wiki/Open_educational_resources)
- current main authors
    - University of Rostock
        - [Frank Schultz](https://orcid.org/0000-0002-3010-0294)
        - [Sascha Spors](https://orcid.org/0000-0001-7225-9992)


## Referencing
- please cite this open educational resource (OER) project as *Frank Schultz, Data Driven Audio Signal Processing - A Tutorial Featuring Computational Examples, University of Rostock* ideally with relevant ``file(s), github URL, commit number and/or version tag, year``.

## License
- Creative Commons Attribution 4.0 International License ([CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)) for text / graphics
- [MIT license](https://opensource.org/licenses/MIT) for software / code