title: "Touring tidyverse"
subtitle: "tidymodels"
author: "Misha Balyasin"
date: "2019/02/28"
lib_dir: libs
css: xaringan-themer.css
highlightStyle: solarized-dark
highlightLines: true
countIncrementalSlides: false
```{r setup, include=FALSE}
options(htmltools.dir.version = FALSE)
```
```{r xaringan-themer, include = FALSE}
header_font_google = google_font("Josefin Slab", "600"),
text_font_google = google_font("Work Sans", "300", "300i"),
code_font_google = google_font("IBM Plex Mono")
```
# Plan for today
1. Brief intro to `tidymodels`.
1. Part 1 of the demo: `dials` + `parsnip`.
1. Break.
1. Part 2 of the demo: `recipes`.
1. Extracurricular activities.
# Job plug
Slides, markdown and code -
```{r, echo=FALSE, fig.width=15}
```
# Plan of talks
## Initial
1. Wrangle: tidyr
2. Wrangle: dplyr
3. Wrangle: lubridate, hms, blob, forcats, stringr
4. Program: purrr + glue
5. Program: tidyeval
6. Model: rsample, tidyposterior, recipes, broom
7. Beyond tidyverse (tibbletime, tidytext, janitor)
## Current
1. Wrangle: tidyr -- `r emo::ji("check")`
2. Wrangle: dplyr -- `r emo::ji("check")`
3. ~~Wrangle: lubridate, hms, blob, forcats, stringr~~
4. Program: purrr and glue -- `r emo::ji("check")`
5. Program: tidyeval -- `r emo::ji("check")`
6. **Model: tidymodels**
7. Beyond tidyverse (tibbletime, tidytext, janitor) -- `r emo::ji("construction")`
background-image: url(
background-size: 100px
background-position: 90% 6%
# tidymodels
tidymodels is a "meta-package" for modeling and statistical analysis that shares the underlying design philosophy, grammar, and data structures of the tidyverse.
1. `broom` takes the messy output of built-in functions in R, such as `lm`, `nls`, or `t.test`, and turns it into tidy data frames.
1. `infer` is a modern approach to statistical inference.
1. `recipes` is a general data preprocessor with a modern interface.
1. `rsample` has infrastructure for resampling data so that models can be assessed and empirically validated.
1. `yardstick` contains tools for evaluating models (e.g. accuracy, RMSE).
1. `tidypredict` translates some model prediction equations to SQL for high-performance computing.
1. `tidyposterior` can be used to compare models using resampling and Bayesian analysis.
1. `tidytext` contains tidy tools for quantitative text analysis, including basic text summarization, sentiment analysis, and text modeling.
1. `dials` contains tools to create and manage values of tuning parameters and is designed to integrate well with the `parsnip` package.
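
To make the list above a bit more concrete, here is a minimal sketch of how a few of these packages fit together. It is not taken from the demo: the `mtcars` data set and the formula `mpg ~ wt` are assumptions chosen only for illustration, and `parsnip` is used for the model fit.

```r
# Illustrative only: mtcars and mpg ~ wt are assumptions,
# not part of the original demo.
library(rsample)    # initial_split(), training(), testing()
library(parsnip)    # linear_reg(), set_engine(), fit()
library(yardstick)  # rmse()
library(broom)      # tidy()

set.seed(42)

# rsample: hold out a quarter of the rows for assessment
split <- initial_split(mtcars, prop = 0.75)
train <- training(split)
test  <- testing(split)

# parsnip: declare the model, choose an engine, then fit
spec   <- set_engine(linear_reg(), "lm")
lm_fit <- fit(spec, mpg ~ wt, data = train)

# predict() on a parsnip fit returns a tibble with a .pred column
preds <- predict(lm_fit, new_data = test)
preds$mpg <- test$mpg

# yardstick: one row per metric
metric <- rmse(preds, truth = mpg, estimate = .pred)
print(metric)

# broom: coefficients of the underlying lm object as a tidy data frame
coefs <- tidy(lm_fit$fit)
print(coefs)
```

Each step returns a data frame or tibble, so the results can be passed straight to `dplyr` or `ggplot2` without reshaping.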
# Current state
1. Current version - `r packageVersion("tidymodels")`.
1. Developed by __Max Kuhn__, Hadley Wickham.
1. Very early stage.
```{r, echo = FALSE}
library(magrittr)  # provides the %>% pipe used below

# Installed version of each tidymodels component package
tibble::tibble(package = c("broom", "infer", "recipes", "rsample", "yardstick", "tidypredict", "tidyposterior", "tidytext", "parsnip", "dials")) %>%
  dplyr::mutate(version = purrr::map_chr(package, ~ as.character(packageVersion(.x))))
```
# Prior art
1. [`h2o`](
1. [`caret`](
1. [`mlr`](
1. [`scikit-learn`](
1. Apache Spark with, e.g., [`sparklyr`](
1. [MLflow](
1. [CRAN Machine Learning taskview](
1. More?
# Extended demo
We will go through a (very) condensed version of the 2-day workshop "Applied Machine Learning" that Max Kuhn gave at `rstudio::conf(2019L)`.
This should give you an idea of what `tidymodels` is planning to be and what is already possible.
This is **NOT** a Machine Learning tutorial!
# Resources
1. Applied Machine Learning workshop -
# Contacts
Slides, markdown and code -