This is some of the R code I wrote for coursework and projects during my MSc in Applied Statistics at Birkbeck between 2008 and 2010. Much of it was written with Sweave for compilation into LaTeX and ultimately PDF.
Statistical Data Analysis: Two pieces of coursework from this first-year course. The first covers multivariate analysis: multivariate hypothesis testing, PCA and the like. The second is all about generalized linear models.
Modern Statistical Methods: This second-year course covered Monte Carlo simulation, kernel density estimation, exact tests, bootstrap and jackknife resampling, and Markov chain Monte Carlo methods.
Statistical Data Mining: A second-year course on the statistical approach to machine learning. The task was to fit various supervised learning models to the Pima Indian diabetes dataset (contained in the MASS package in R):
- Linear/quadratic discriminant analysis
- Logistic regression
- Generalized additive model
- Classification tree
- k-nearest neighbours
- Neural network
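
As a rough illustration of the kind of fit listed above, here is a minimal sketch of the logistic regression case. It is not the coursework code itself. It assumes the `Pima.tr` (training) and `Pima.te` (test) data frames that ship with MASS, uses all predictors, and classifies at an arbitrary 0.5 probability threshold.

```r
library(MASS)  # provides the Pima.tr and Pima.te data frames

# Fit a logistic regression of diabetes status on all available predictors
fit <- glm(type ~ ., data = Pima.tr, family = binomial)

# Predicted probability of "Yes" for each row of the held-out test set
prob <- predict(fit, newdata = Pima.te, type = "response")

# Classify at a 0.5 threshold (an assumption made for this sketch)
pred <- factor(ifelse(prob > 0.5, "Yes", "No"), levels = levels(Pima.te$type))

# Confusion matrix and misclassification rate on the test set
conf_mat <- table(predicted = pred, actual = Pima.te$type)
test_error <- mean(pred != Pima.te$type)
print(conf_mat)
print(test_error)
```

The other models in the list can be compared in the same way by swapping the fitting call and keeping the test-set evaluation unchanged.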
Final research project: Final-year research project in which I investigated the empirical properties of range-based (OHLC) volatility estimators compared with return-based and realized volatility estimators, using a data set of 1-minute EURUSD OHLC prices. The paper I wrote from this research won the first-place departmental prize for research sponsored by Winton Capital.
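
To give a flavour of what range-based and return-based estimators look like, here is a small sketch. It is not the project code. It assumes the standard Parkinson and Garman-Klass per-bar variance formulas as the range-based examples, and a zero-mean squared log return as the return-based one. The function names and the three bars of prices are hypothetical, made up purely for illustration.

```r
# Parkinson estimator: per-bar variance from the high-low range only
parkinson_var <- function(high, low) {
  mean(log(high / low)^2) / (4 * log(2))
}

# Garman-Klass estimator: per-bar variance from the range and the open-close move
garman_klass_var <- function(open, high, low, close) {
  mean(0.5 * log(high / low)^2 - (2 * log(2) - 1) * log(close / open)^2)
}

# Return-based estimator: mean squared close-to-close log return (zero mean assumed)
close_to_close_var <- function(close) {
  r <- diff(log(close))
  mean(r^2)
}

# Hypothetical 1-minute bars (illustrative values, not real EURUSD data)
bars <- data.frame(
  open  = c(1.3000, 1.3004, 1.3001),
  high  = c(1.3006, 1.3007, 1.3005),
  low   = c(1.2998, 1.3000, 1.2997),
  close = c(1.3004, 1.3001, 1.3003)
)

pk <- parkinson_var(bars$high, bars$low)
gk <- garman_klass_var(bars$open, bars$high, bars$low, bars$close)
cc <- close_to_close_var(bars$close)
print(c(parkinson = pk, garman_klass = gk, close_to_close = cc))
```

All three return a variance per bar; taking the square root and scaling by the number of bars per period would give a volatility on whatever horizon is wanted.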