Scripting for data analysis
Scripting for data analysis with R is a PhD course set to be given in 2017. This repository contains the developing exercises and homework assignments.
1. A crash course in R
Before the seminar I'll send out instructions on how to set up R and RStudio so everyone can work along on their own laptop.
Why do data analysis with a scripting language?
The RStudio interface
Using R as a calculator
Working interactively and writing code
Reading and looking at data
Installing useful packages
A first graph with ggplot2
Homework for next time: The Unicorn Dataset, exercises in reading data, descriptive statistics, linear models and a few statistical graphs.
2. Programming for data analysis
Programming languages one may encounter in science
Common concepts and code examples
Data structures in R
Homework for next time: The Unicorn Expression Dataset, exercises in data wrangling and more interesting graphs.
3. Working with moderately large data
More about functions
Functional and imperative programming
Doing things many times: loops and plyr
Working on a cluster
Final homework: Design analysis by simulation: pick a data analysis project that you care about; simulate data based on a model and a reasonable effect size; implement the data analysis; and apply it to simulated data with and without effects to estimate power and other design characteristics. This ties together skills from all seminars.
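To give a feel for what the final homework involves, here is a minimal sketch of a design analysis by simulation. Everything in it is an assumption made for illustration and not part of the actual assignment: the two-group design, the t-test, the effect size of one standard deviation, the sample size, and the function names `simulate_data`, `analyse` and `rejection_rate`. Your own project will need its own model and analysis.

```r
# Hypothetical design: two groups of equal size, compared with a t-test.
set.seed(1)  # make the simulation reproducible

# Simulate one dataset: a control group with mean 0 and a treatment group
# whose mean is shifted by `effect` (in units of the standard deviation `sd`).
simulate_data <- function(n_per_group, effect, sd = 1) {
  data.frame(
    group = rep(c("control", "treatment"), each = n_per_group),
    value = c(rnorm(n_per_group, mean = 0, sd = sd),
              rnorm(n_per_group, mean = effect, sd = sd))
  )
}

# The data analysis: here simply the p-value from a two-sample t-test.
analyse <- function(data) {
  t.test(value ~ group, data = data)$p.value
}

# Repeat simulation and analysis many times and return the fraction of
# replicates where the test is significant at level `alpha`.
rejection_rate <- function(n_sim, n_per_group, effect, alpha = 0.05) {
  p_values <- replicate(n_sim, analyse(simulate_data(n_per_group, effect)))
  mean(p_values < alpha)
}

# Without an effect, the rejection rate estimates the false positive rate.
false_positive_rate <- rejection_rate(n_sim = 1000, n_per_group = 20, effect = 0)

# With an effect, the rejection rate estimates the power.
power <- rejection_rate(n_sim = 1000, n_per_group = 20, effect = 1)

print(false_positive_rate)
print(power)
```

Changing `n_per_group` or `effect` and rerunning shows how the estimated power responds to the design, which is the kind of question the homework asks you to explore for your own project.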