-
Notifications
You must be signed in to change notification settings - Fork 0
/
index.Rmd
256 lines (174 loc) · 12.8 KB
/
index.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
---
title: "Statistical Programming with R"
output:
flexdashboard::flex_dashboard:
orientation: columns
vertical_layout: scroll
---
# Intro {.sidebar}
This dashboard covers the materials for the first folow-up course in R programming. Course held in Banja Luka, December 10-13, 2019. The course website of the introductory course can be found [here](http://peterstoltze.dk/RfOS/).
---
Instructors:
- Anne Vinkel Hansen (aih@dst.dk)
- Jolien Cremers (jcre@sund.ku.dk)
---
Part of the material for this course is adapted from [Gerko Vink](https://github.com/gerkovink/R)
---
# Quick Overview
## Column 1
### Outline
This follow-up course deals with the following topics:
1. Statistical inference
2. Linear models
3. Generalised linear models
4. Data validation and editing
5. Imputation
Previous experience with R is required.
## Column 2
### Course schedule
| Time | Topic |
|:------------|:------------------------------------------------|
| *Tuesday* | |
| 09.00-12.00 | Repetition (A) |
| | *Break* |
| 13.00-15.30 | Statistical Inference (B) |
| *Wednesday* | |
| 09.00-12.00 | Linear Models (C) |
| | *Break* |
| 13.00-15.30 | Generalized Linear Models (D) |
| *Thursday* | |
| 09.00-12.00 | Data Validation and Editing (E) |
| | *Break* |
| 13.00-15.30 | Imputation (F) |
| *Friday* | |
| 09.00-12.00 | Evaluation, agreement on summary mission report |
# How to prepare
## Column 1
### Preparing your machine for the course
Dear all,
The below steps guide you through installing both `R` as well as the necessary additions.
We look forward to see you all in Banja Luka,
*Anne and Jolien*
### **System requirements**
Bring a laptop computer to the course and make sure that you have full write access and administrator rights to the machine. We will explore programming and compiling in this course. This means that you need full access to your machine. Some corporate laptops come with limited access for their users, we therefore advice you to bring a personal laptop computer, if you have one.
### **1. Install `R`**
`R` can be obtained [here](https://cran.r-project.org). We won't use `R` directly in the course, but rather call `R` through `RStudio`. Therefore it needs to be installed.
### **2. Install `RStudio` Desktop**
Rstudio is an Integrated Development Environment (IDE). It can be obtained as stand-alone software [here](https://www.rstudio.com/products/rstudio/download/#download). The free and open source `RStudio Desktop` version is sufficient.
### **3. Start RStudio and install the following packages. **
Execute the following lines of code in the console window:
```{r eval=FALSE, echo = TRUE}
install.packages(c("tidyverse", "micemd", "jomo", "pan",
"lme4", "knitr", "rmarkdown", "plotly",
"devtools", "boot", "class", "car", "MASS",
"ISLR", "DAAG", "mice", "mitml", "miceadds",
"Ecdat", "Ecfun", "MEMSS", "VIM", "simputation",
"naniar","visdat", "UpSetR", "DAAG", "magrittr",
"haven", "Matrix", "lattice", "data.table", "grid",
"colorspace", "stringi", "stringdist", "editrules",
"deducorrect", "rex"),
dependencies = TRUE)
```
If you are not sure where to execute code, use the following figure to identify the console:
<center>
<img src="console.png" alt="HTML5 Icon" width = 50%>
</center>
Just copy and paste the installation command and press the return key. When asked
```{r eval = FALSE, echo = TRUE}
Do you want to install from sources the package which needs
compilation? (Yes/no/cancel)
```
type `Yes` in the console and press the return key.
# Tuesday
## Column 1
### Tuesday's materials
We adapt the course as we go. To ensure that you work with the latest iteration of the course materials, we advice all course participants to access the materials online.
- Part A: Repetition
- [Lecture A](Contents/Material/Part A - Repetition/Lecture_A.html){target="_blank"}
- [Lecture A Handout](Contents/Material/Part A - Repetition/Lecture_A_handout.html){target="_blank"}
- [Practical A](Contents/Material/Part A - Repetition/Practical_A_walkthrough.html){target="_blank"}
- [Impractical A](Contents/Material/Part A - Repetition/Practical_A.html){target="_blank"}
- Part B: Statistical Inference
- [Lecture B](Contents/Material/Part B - Statistical inference/Lecture_B.html){target="_blank"}
- [Lecture B Handout](Contents/Material/Part B - Statistical inference/Lecture_B_handout.html){target="_blank"}
- [Practical B](Contents/Material/Part B - Statistical inference/Practical_B_walkthrough.html){target="_blank"}
- [Impractical B](Contents/Material/Part B - Statistical inference/Practical_B.html){target="_blank"}
All lectures are in `html` format. Practicals are walkthrough files that guide you through the exercises. `Impractical` files contain the exercises, without walkthrough, explanations and solutions.
## Column 2
### Useful references
- [The R project homepage](https://www.r-project.org/)
- [CRAN task view for official statistics](https://cran.r-project.org/web/views/OfficialStatistics.html)
- [CRAN area for contributed documentation](https://cran.r-project.org/other-docs.html) No longer actively maintained, but a lot of good resources
- Some places to find the `R` community:
- [#rstats hashtag](https://twitter.com/search?q=%23rstats)
- [Rweekly](https://rweekly.org/)
- [R-bloggers](https://www.r-bloggers.com/)
- [Stack Overflow](https://stackoverflow.com/) A good place to ask questions. Check if your question has been answered before, and make sure to phrase it carefully and correctly.
- [Pipes in R](https://r4ds.had.co.nz/pipes.html)
# Wednesday
## Column 1
### Wednesday's materials
We adapt the course as we go. To ensure that you work with the latest iteration of the course materials, we advice all course participants to access the materials online.
- Part C: `R` Linear Models
- [Lecture C](Contents/Material/Part C - Linear models/Lecture_C.html){target="_blank"}
- [Lecture C Handout](Contents/Material/Part C - Linear models/Lecture_C_handout.html){target="_blank"}
- [Practical C](Contents/Material/Part C - Linear models/Practical_C_walkthrough.html){target="_blank"}
- [Impractical C](Contents/Material/Part C - Linear models/Practical_C.html){target="_blank"}
- [Extra Lecture M Handout](Contents/Material/Part C - Linear models/Lecture_M_handout.html){target="_blank"}
- [Extra Practical M](Contents/Material/Part C - Linear models/Practical_M_walkthrough.html){target="_blank"}
- [Extra Impractical M](Contents/Material/Part C - Linear models/Practical_M.html){target="_blank"}
- [Nurse dataset](Contents/Material/Part C - Linear models/nurses.sav){target="_blank"}
- Part D: Generalized Linear Models
- [Lecture D](Contents/Material/Part D - GLMs/Lecture_D.html){target="_blank"}
- [Lecture D Handout](Contents/Material/Part D - GLMs/Lecture_D_handout.html){target="_blank"}
- [Practical D](Contents/Material/Part D - GLMs/Practical_D_walkthrough.html){target="_blank"}
- [Impractical D](Contents/Material/Part D - GLMs/Practical_D.html){target="_blank"}
- [Extra Practical](Contents/Material/Part D - GLMs/Data_exploration.html){target="_blank"}
- Data: [cabinet](Contents/Material/Part D - GLMs/danish_cabinet.xls),
[benefits](Contents/Material/Part D - GLMs/Benefits.xlsx)
All lectures are in `html` format. Practicals are are provided both as naked questions but also with ample explanations and solutions - choose according to your taste!
## Column 2
### Useful references
- [Practical Regression and Anova using R](https://cran.r-project.org/doc/contrib/Faraway-PRA.pdf) by Julian Faraway
# Thursday
## Column 1
### Thursday's materials
We adapt the course as we go. To ensure that you work with the latest iteration of the course materials, we advice all course participants to access the materials online.
- Part E: Data Validation and Editing
- [Lecture E](Contents/Material/Part E - Data validation/Lecture_E.html){target="_blank"}
- [Practical E](Contents/Material/Part E - Data validation/Practical_E_walkthrough.html){target="_blank"}
- [Impractical E](Contents/Material/Part E - Data validation/Practical_E.html){target="_blank"}
- Data for the practical: [bands.txt](Contents/Material/Part E - Data validation/data/bands.txt){target="_blank"},
[unnamed.xls](Contents/Material/Part E - Data validation/data/unnamed.xls){target="_blank"},
[people.txt](Contents/Material/Part E - Data validation/data/people.txt),
[peoplerules.txt](Contents/Material/Part E - Data validation/data/peoplerules.txt),
[dirty_iris.txt](Contents/Material/Part E - Data validation/data/dirty_iris.txt)
- Part F: Imputation
- [Lecture F](Contents/Material/Part F - Imputation/Lecture_F.html){target="_blank"}
- [Lecture F Handout](Contents/Material/Part F - Imputation/Lecture_F_handout.html){target="_blank"}
- [Practical F](Contents/Material/Part F - Imputation/Practical_F_walkthrough.html){target="_blank"}
- [Impractical F](Contents/Material/Part F - Imputation/Practical_F.html){target="_blank"}
All lectures are in `html` format. Practicals are walkthrough files that guide you through the exercises. `Impractical` files contain the exercises, without walkthrough, explanations and solutions.
## Column 2
### Useful references
- Data validation
- [An introduction to data cleaning with R](https://cran.r-project.org/doc/contrib/de_Jonge+van_der_Loo-Introduction_to_data_cleaning_with_R.pdf)
- [Mark van der Loo](http://www.markvanderloo.eu/) has given several talks and workshops on this, with slides available from his [GitHub](https://github.com/markvanderloo)
- Imputation
- [Flexible Imputation of Missing Data](https://stefvanbuuren.name/fimd/): written by Stef van Buuren, uses the `mice` package.
- [Applied Missing Data Analysis](https://bookdown.org/mwheymans/bookmi/): book on missing data and imputation in R and SPSS.
- [Visualization of Missingness Patterns](https://cran.r-project.org/web/packages/naniar/vignettes/naniar-visualisation.html): Tutorial for the naniar package that does nice visualizations of missingness patterns.
- [Tutorial](https://www.analyticsvidhya.com/blog/2016/03/tutorial-powerful-packages-imputing-missing-values/): Tutorial describing the use of 5 different R-packages for missing data imputation.
- [Tutorial](http://www.di.fc.ul.pt/~jpn/r/missing/index.html): Another tutorial on missing data in R.
- [Tutorial](https://github.com/gerkovink/miceVignettes): a set of tutorials for the `mice` package.
# Further studies
## Column 1
### What to do after the course
The following references are currently available for free, either as pdfs or as extensive webpages (written with [RMarkdown](https://rmarkdown.rstudio.com/) and [bookdown](https://bookdown.org/)). They are all very useful and we highly recommend them.
- [R for Data Science](https://r4ds.had.co.nz): written by Hadley Wickham and Garrett Grolemund this book relies almost exclusively on the [tidyverse](https://www.tidyverse.org/) approach to data analysis. Many highly effective tools will be right at your fingertips after reading this book and working through the many exercises.
- [Hands-On Programming with R](https://rstudio-education.github.io/hopr/): a great read by Garrett Grolemund emphasizing programming techniques with R.
- [Advanced R](https://adv-r.hadley.nz/): You want to gain deeper knowledge of R and you want to learn from one of the most influential R contributors. This one is for you!
- [Introduction to Statistical Learning](http://faculty.marshall.usc.edu/gareth-james/ISL/): an introductory book on statistical learning, with applications in R. The R code is somewhat old-style and you might be able to find newer packages for the tasks, but ths is a solid read well worth the effort.
- [Data Analysis and Graphics Using R](http://www.pindex.com/uploads/post_docs/Maindonald%20Data%20Analysis%20and%20Graphics(PINDEX-DOC-6953).pdf): a detailed book that covers a lot about categorical data analysis and fitting `glm`s in `R`.
- [Happy Git and GitHub for the useR ](https://happygitwithr.com/index.html): a great introduction to version control using Git and GitHub together with RStudio. Written by Jenny Bryan in a very concise style. Highly recommended!
<!-- ## Column 2 -->