author | title |
---|---|
Neil Ernst | Statistical Modeling and Bayesian inference |
One approach to building inferential analyses is the frequentist, hypothesis-testing approach, where you examine the long-run behaviour of the data-generating mechanism to assess how likely the observed results are under a null hypothesis.
The alternative is to set some limits on what you feel is likely to be true a priori (the prior), model the data-generating process statistically, i.e., with a probability distribution (the likelihood), and then apply Bayes's theorem to update the prior into a posterior distribution given the data.
We will start with some motivation from McElreath: https://speakerdeck.com/rmcelreath/l01-statistical-rethinking-winter-2019 and associated videos
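
As a rough illustration of the prior, likelihood, and Bayes's theorem steps described above, here is a minimal Python sketch using grid approximation (the technique McElreath introduces in chapter 2 of Statistical Rethinking). Everything specific in it is an assumption made for illustration: the data (3 failing runs out of 20 CI runs) are hypothetical, the prior is flat, and `grid_posterior` is an illustrative name rather than part of any course material.

```python
from math import comb


def grid_posterior(failures, runs, grid_size=101):
    """Grid-approximate the posterior over a failure probability p."""
    # Candidate values of p, evenly spaced on [0, 1].
    p_grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Flat prior: every candidate value is equally plausible a priori (assumption).
    prior = [1.0] * grid_size
    # Binomial likelihood of the observed data at each candidate p.
    likelihood = [
        comb(runs, failures) * p**failures * (1 - p) ** (runs - failures)
        for p in p_grid
    ]
    # Bayes's theorem: posterior is proportional to likelihood times prior.
    unnormalized = [lik * pri for lik, pri in zip(likelihood, prior)]
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]
    return p_grid, posterior


# Hypothetical data: 3 failing CI runs out of 20.
p_grid, posterior = grid_posterior(failures=3, runs=20)
map_index = max(range(len(posterior)), key=posterior.__getitem__)
p_map = p_grid[map_index]
p_mean = sum(p * w for p, w in zip(p_grid, posterior))
print(f"most plausible p: {p_map:.2f}, posterior mean: {p_mean:.3f}")
```

With a flat prior the most plausible value coincides with the observed proportion. Swapping in an informative prior (for example, one that downweights large failure rates) would pull the posterior towards the prior, which is the "limits on what you feel is likely to be true" part of the workflow.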
- Apply Bayesian inference to software problems.
- Relate statistical sampling problems to numerical analysis problems (e.g., as discussed in detail in CSC 349a).
- Apply statistical probability distributions to model software problems.
- Appreciate the rationale for causal graphs and causal language.
# | Topic | Readings | Exercises |
---|---|---|---|
3-1 | Basic Statistical Inference from a Bayesian Perspective • video (Echo360) | ||
3-2 | Statistical Modeling | ||
3-4 | Causal Modeling | ||
3-5 | Probability Distributions and Priors | ||
extra | Sampling | https://chi-feng.github.io/mcmc-demo/app.html - Hamiltonian MC visualization | |
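
To give a flavour of the "Sampling" topic listed in the table above, here is a minimal sketch of a random-walk Metropolis sampler in Python. Note that this is plain Metropolis, not the Hamiltonian Monte Carlo shown in the linked visualization; it is the simpler algorithm that HMC improves on. The data (3 failures in 20 runs), the flat prior, the step size, the burn-in length, and the function names are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(p, failures=3, runs=20):
    """Unnormalized log posterior for a binomial model with a flat prior."""
    if p <= 0.0 or p >= 1.0:
        return float("-inf")  # zero posterior density outside (0, 1)
    return failures * math.log(p) + (runs - failures) * math.log(1.0 - p)


def metropolis(n_samples=20000, step=0.1, start=0.5, seed=1):
    """Draw correlated samples from the posterior with random-walk Metropolis."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current)
    samples = []
    for _ in range(n_samples):
        # Propose a nearby value with a symmetric Gaussian step.
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


samples = metropolis()
kept = samples[2000:]  # discard an (arbitrary) burn-in period
posterior_mean = sum(kept) / len(kept)
print(f"estimated posterior mean: {posterior_mean:.3f}")
```

With a flat prior the exact posterior here is Beta(4, 18), whose mean is 4/22, so the sampled mean can be checked against a known answer. In realistic models no such closed form exists, which is why sampling is needed.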
- Furia, Torkar, Feldt, Applying Bayesian Analysis Guidelines to Empirical Software Engineering Data: The Case of Programming Languages and Code Quality
- Ernst, Thresholds
- McElreath, Statistical Rethinking ch 2 (netlink id required) and/or watch his lecture video
- (opt) Ray, Devanbu, Filkov, "Rebuttal to Berger et al 2019" - a rebuttal to a replication study on code quality and language choice on GitHub.
- (opt) Dorn, Apel, Mastering Uncertainty in Performance Estimations of Configurable Software Systems
- (opt) McElreath, Statistical Rethinking (a super approachable, gentle introduction with R examples, but also translated into Julia and Python)
- (opt) Gelman, Bayesian Data Analysis (book)
- https://www.bayesrulesbook.com
- A Conceptual introduction to HMC
- Get familiar with RStudio notebooks, as that is what we will use for Assignment 1.
- Use Docker to install locally
- VS Code users may want to use the VS Code Remote Containers extension to start a command-line session for R: `Remote-Containers: Attach to Running Container`. Ask the TAs for technical help with Docker and the image.
- You can also use the CS department JupyterHub machine.
- Make sure you can get the sample tutorial notebook to run in its entirety.