Machine Learning in R
This is the repository for D-Lab’s Introduction to Machine Learning in R workshop. View the associated slides here.
- Background on machine learning
  - Classification vs regression
  - Performance metrics
- Data preprocessing
  - Missing data
  - Train/test splits
- Algorithm walkthroughs
  - Decision trees
  - Random forests
  - Gradient boosted machines
  - SuperLearner ensembling
Please follow the notes in participant-instructions.md.
Assumed participant background
We assume that participants have familiarity with:
- basic R syntax
- statistical concepts such as mean and standard deviation
Please bring a laptop with the following:
- R version 3.5 or greater
- The RStudio integrated development environment (IDE), which is highly recommended but not required
Browse resources listed on the D-Lab Machine Learning Working Group repository. Scroll down to see code examples in R and Python, books, courses at UC Berkeley, online classes, and other resources and groups to help you along your machine learning journey!
The slides were made using xaringan, an R package that wraps remark.js. Check out Chapter 7 of R Markdown: The Definitive Guide if you are interested in making your own! The theme borrows from Brad Boehmke's presentation on Decision Trees, Bagging, and Random Forests - with an example implementation in R.