Statistical Computing

Lecture Notes

Michael Mayer

Organization

The lecture has six chapters:

  1. R in Action
  2. Statistical Inference
  3. Linear Models
  4. Model Selection and Validation
  5. Trees
  6. Neural Nets

Chapters 3 to 6 can be summarized as "Statistical ML in Action".

Each chapter will keep us busy for two weeks (3 hours of lectures + 1 hour of exercises).

Prerequisites

Lecture material

Fetch everything by running

git clone https://github.com/mayer79/statistical_computing_material.git

in your Git console, or by downloading everything as a ZIP file.

Large data

Download the large dataset "January 2022 - Yellow Taxi Trip Records" from this page or use the direct download link.

Place it in the project subfolder "taxi/".
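
To confirm that the download landed in the right place, you can list the folder from R. This is a minimal sketch under two assumptions: the working directory is the project root, and the downloaded file is in Parquet format. No specific file name is assumed.

```r
# List Parquet files in the "taxi/" subfolder.
# Assumptions: working directory = project root, download is a Parquet file.
taxi_files <- list.files("taxi", pattern = "\\.parquet$", full.names = TRUE)

if (length(taxi_files) == 0) {
  message("No Parquet file found in 'taxi/'. Check the download location.")
} else {
  # Show each file together with its size in megabytes
  print(data.frame(
    file = taxi_files,
    size_mb = round(file.size(taxi_files) / 1e6, 1)
  ))
}
```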

Software

We will work with R version >= 4.1 and RStudio.
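
To check the version requirement from the R console, something like the following should do. The variable name `r_ok` is only illustrative.

```r
# Compare the running R version with the minimum required for the lecture
r_ok <- getRversion() >= "4.1.0"

if (r_ok) {
  message("R ", getRversion(), " is recent enough.")
} else {
  message("R ", getRversion(), " is too old. Please update to R >= 4.1.")
}
```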

In the first two chapters, we will need these contributed R packages:

  • tidyverse
  • plotly
  • insuranceData
  • microbenchmark
  • withr
  • boot
  • coin

For the remaining chapters, we further need:

  • h2o (large package)
  • arrow
  • data.table
  • FNN
  • duckdb
  • sparklyr (large package)
  • rpart.plot
  • ranger
  • xgboost
  • lightgbm
  • hstats
  • MetricsWeighted
  • keras (large, see below)

For the last chapter, we additionally need Python with TensorFlow >= 2.15. You can install it by running the R command keras::install_keras(version = "release-cpu"). If the following code works, you are all set. (Some red start-up messages/warnings are okay.)

library(tensorflow)
tf$constant("Hello Tensorflow!")

Further Material

Books

  • James, G., Witten, D., Hastie, T., Tibshirani, R. (2013). An Introduction to Statistical Learning - with Applications in R. New York: Springer.
  • Hastie, T., Tibshirani, R., Friedman, J. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer.
  • Wickham, H., Grolemund, G. (2017). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O'Reilly Media.
  • Chollet, F., Allaire, J. J. (2018). Deep Learning with R. Manning Publications Co.

Video by Trevor Hastie

Copyright

This lecture is distributed under the Creative Commons license.

How to cite?

Michael Mayer (2023), Statistical Computing, lecture notes, Institute of Mathematical Statistics and Actuarial Science, University of Bern. URL: https://github.com/mayer79/statistical_computing_material