Master's Course, SS2015 Faculty of Physics and Astronomy, University of Heidelberg
Computational Statistics and Data Analysis (MVComp2)
Course LSF entry
Lecturer: PD Dr. Coryn Bailer-Jones
Assistants: Dr. Morgan Fouesneau, Dr. Dae-Won Kim
Summer semester 2015
This page provides the homework assignments in the form of Python notebooks. You can click on the links in the table below to read them online or download them. (You are not required to use Python to solve the exercises.)
This course and exercises take a pragmatic approach to using statistics and computational methods to analyse data. The focus will be on concepts, understanding problems, and the application of techniques to solving problems, rather than reproducing proofs or teaching you recipes to memorize.
The course website is available here
This repository provides the homework assignments and their related datasets (see the table below for links).
The repository will be updated after each class with the new assignments. All datasets and code gists will also be included. Example solutions (which are by no means unique) will be added eventually.
Some homework guidelines
The notebooks are not meant to impose a format for handing in your homework assignments. They are simply a convenient way for me to keep text and code in the same place.
Each week, we will mark your homework on a scale of 100 points in total (details are given with the exercises).
You are allowed to work in groups of at most 3 persons and return 1 document per group.
Homework documents must be returned each Tuesday.
We do not mark your coding skills.
This means we do not read the code or look for comments in it, but we will not guess what a plot means. Be explicit and describe, even in one sentence, what you did.
Feel free to use the notebooks (they may not be the most efficient option), but be careful when printing (check out nbconvert to produce a PDF or even a LaTeX document).
We do not impose a language. Feel free to use any language that you find efficient. Obviously, we can neither provide full support nor give full tutorials.
If you use R, many code examples will be included in the lecture notes. If you use Python, all the exercises will use Python (when coding is required).
Examples in R from the course are available here: link (will be updated throughout the course)
If you cannot or do not want to install libraries or software on your computer, some free online services exist, such as:
Sage Cloud: Python, R, and other languages
Wakari: Python only
Some libraries that you may find useful later, depending on your language.
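As a rough illustration of the nbconvert hint in the guidelines above, the sketch below builds and optionally runs the conversion command from Python. It assumes a current Jupyter installation in which the command is spelled `jupyter nbconvert` (PDF export additionally needs a working LaTeX installation). The helper names and the file name `homework1.ipynb` are hypothetical placeholders, not part of the course material.

```python
import shutil
import subprocess

# Output formats handled by this sketch; nbconvert itself supports more.
SUPPORTED_FORMATS = ("pdf", "latex", "html")


def nbconvert_command(notebook, output_format="pdf"):
    """Return the nbconvert command line for one notebook (hypothetical helper)."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError("unsupported format: %r" % output_format)
    return ["jupyter", "nbconvert", "--to", output_format, notebook]


def convert_notebook(notebook, output_format="pdf"):
    """Run the conversion; assumes 'jupyter' is installed and on the PATH."""
    if shutil.which("jupyter") is None:
        raise RuntimeError("jupyter was not found on the PATH")
    subprocess.run(nbconvert_command(notebook, output_format), check=True)


if __name__ == "__main__":
    # Only print the command here; call convert_notebook() to actually run it.
    print(" ".join(nbconvert_command("homework1.ipynb", "pdf")))
```

Typing the printed command directly in a terminal works just as well; the Python wrapper is only convenient if you want to convert several notebooks in a loop.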
There will be 12 lectures on the following dates (the exercise session is on the following day). The topics allocated to the dates may well change!
As GitHub now integrates nbviewer, if a notebook is not accessible through the links in the table, you can click on the files instead.
|Lecture date|Topic|Exercises|Datasets & snippets|
|---|---|---|---|
|14 April|Introduction and probability basics|notebook|rvs.dat|
|21 April|Estimation and error: describing data and distributions|notebook|star.csv|
|28 April|Statistical models and inference|notebook|hipparcos.dat|
|5 May|Linear models and regression|notebook|rmr_ISwR.dat, sdss_sspp_sub.csv|
|12 May|(Bayesian) Model fitting I|notebook|lighthouse.dat|
|19 May|(Bayesian) Model fitting II|notebook|coinflip.dat|
|26 May|MCMC|notebook|2Dline.dat, metropolis.py|
|2 June|No lecture| | |
|9 June|Hypothesis testing|notebook|iswr_vitcap.dat|
|16 June|Model comparison|notebook|2Dline_modelcomparison.dat|
|23 June|Cross validation, regularization, and basis functions|notebook|ratdiet_fields.dat, cars93sel_MASS.dat|
|30 June|Kernels and mixture models|notebook|ratdiet_fields.dat, geyser2_MASS.dat, line_outlier.dat|
|7 July|Classification|No assignment| |
|14 July|Study week| | |