This repository contains the material for learning the R programming language. The topics are divided into four parts:
- Basics: the starting point if you have little or no experience with R. The basic concepts are explained here.
  - Introduction about me, R and this course
  - Exploring data using the `ggplot2` and `dplyr` packages
  - Presenting results using the `rmarkdown` package
  - More exercises about exploring data
  - Transforming data to a tidy format using the `tidyr` package
  - Making calculations with times and dates using the `lubridate` package
  - Reading data from a file into R using the `readr` package, or connecting to a database using the `DBI` package
  - An introduction to statistical models
  - Making a simple linear model
  - Making a simple logistic regression model
- ML: more advanced techniques for modeling a dataset are explained here.
  - Explanation of resampling techniques like cross-validation and bootstrapping
  - Automatic feature selection
  - Decision trees and random forests
  - More examples for modeling, and Simpson's paradox
- Big Data: an explanation of what big data, Hadoop and Spark are, and how to use them in combination with R.
  - Introduction to Big Data
  - Introduction to Hadoop, MapReduce and Spark
  - Introduction to the `sparklyr` package
  - Introduction to Cloudera
- Advanced: more advanced R topics are explained here.
  - Functional programming using the `purrr` package
  - Making interactive web pages using `shiny`
  - Using asynchronous programming in a `shiny` application
  - Scraping information from an internet page using the `rvest` package

Every chapter has the following parts:
- A `{chapter_name}.md` file that contains the presentation used for the chapter
- An `*.R` or `*.Rmd` file that contains the exercises for the chapter
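
As a rough illustration of this layout, the base R sketch below picks out the presentation and exercise files of one chapter. The folder name `example_chapter` and the file names in it are hypothetical and are created in a temporary directory only so the snippet runs on its own. They are not actual paths in this repository.

```r
# Hypothetical chapter folder, created in a temporary directory for illustration.
chapter_dir <- file.path(tempdir(), "example_chapter")
dir.create(chapter_dir, showWarnings = FALSE)

# Made-up files that mimic the parts described above, plus one unrelated file.
file.create(file.path(
  chapter_dir,
  c("example_chapter.md", "exercises.R", "extra_exercises.Rmd", "notes.txt")
))

# The presentation: a Markdown file (the dot is escaped, so *.Rmd does not match).
presentation <- list.files(chapter_dir, pattern = "\\.md$")

# The exercises: any file ending in .R or .Rmd.
exercises <- list.files(chapter_dir, pattern = "\\.(R|Rmd)$")

print(presentation)
print(exercises)
```

To try this on a real chapter, point `chapter_dir` at that chapter's folder in your local copy of the repository and skip the two steps that create the made-up folder and files.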

All the answers to the exercises can be found in the `answers` folder. The `extra` folder contains examples that are not closely related to one of the chapters, or new chapters that are still being written. The `datasets` folder contains the external datasets that are used during this course. Because I first gave this course in Colombia, there is also a `presentaciones_en_español` folder with the presentations in Spanish for most of the material. Finally, there is a `references.md` file, which lists all the sources I used to create this material.