Skip to content
Permalink
Branch: master
Find file Copy path
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
138 lines (111 sloc) 3.29 KB
---
title: "Touring the tidyverse"
subtitle: "Beyond tidyverse"
author: "Misha Balyasin"
date: "2019/04/16"
output:
xaringan::moon_reader:
lib_dir: libs
css: xaringan-themer.css
nature:
highlightStyle: solarized-dark
highlightLines: true
countIncrementalSlides: false
---
background-image: url(https://raw.githubusercontent.com/romatik/touring_the_tidyverse/master/img/github_project.png)
background-size: 100%
```{r setup, include=FALSE}
options(htmltools.dir.version = FALSE)
library(magrittr)
```
```{r xaringan-themer, include = FALSE}
library(xaringanthemer)
solarized_dark(
header_font_google = google_font("Josefin Slab", "600"),
text_font_google = google_font("Work Sans", "300", "300i"),
code_font_google = google_font("IBM Plex Mono")
)
```
Slides, markdown and code - https://github.com/romatik/touring_the_tidyverse
---
background-image: url(https://raw.githubusercontent.com/romatik/touring_the_tidyverse/master/img/tidyverse.png)
background-size: 100%
---
# Plan of talks
.pull-left[
## Initial
1. Wrangle: tidyr
2. Wrangle: dplyr
3. Wrangle: lubridate, hms, blob, forcats, stringr
4. Program: purrr + glue
5. Program: tidyeval
6. Model: rsample, tidyposterior, recipes, broom
7. Beyond tidyverse (tibbletime, tidytext, janitor)
]
--
.pull-right[
## Current
1. Wrangle: tidyr -- `r emo::ji("check")`
2. Wrangle: dplyr -- `r emo::ji("check")`
3. ~~Wrangle: lubridate, hms, blob, forcats, stringr~~
4. Program: purrr and glue -- `r emo::ji("check")`
5. Program: tidyeval --`r emo::ji("check")`
6. Model: tidymodels --`r emo::ji("check")`
7. **Beyond tidyverse**
]
---
# Tidy manifesto<sup>1</sup>
1. Reuse existing data structures.
2. Compose simple functions with the pipe.
- Strive to keep functions as simple as possible.
- Avoid mixing side-effects with transformations.
- Function names should be verbs.
3. Embrace R typical functional programming.
- Immutable objects and copy-on-modify semantics.
- The generic functions provided by S3 and S4.
- Tools that abstract over for-loops.
4. Design for humans.
- Invest time in naming your functions.
- Favour explicit, lengthy names, over short, implicit, names.
- Think about how autocomplete can also make an API that’s easy to write.
.footnote[
[1] https://cran.r-project.org/web/packages/tidyverse/vignettes/manifesto.html
]
---
# Tidy data
1. Each variable forms a column.
2. Each observation forms a row.
3. Each type of observational unit forms a table.
This is Codd's 3rd normal form.
---
# tidy projects on CRAN
```{r}
library(magrittr)
res <- tools::CRAN_package_db() %>%
tibble::as_tibble(.name_repair = "unique") %>%
dplyr::filter(stringr::str_detect(Package, "tidy") |
stringr::str_detect(Description, "tidy")) %>%
dplyr::select(Package, Description)
nrow(res)
```
Cleaned up number:
```{r echo = FALSE, message = FALSE}
cleaned <- readr::read_csv("tidyverse/tidy_packages.csv")
nrow(cleaned)
```
---
# Very scientific classification
```{r echo = FALSE}
dplyr::count(cleaned, area, sort = TRUE) %>%
dplyr::filter(n > 2) %>%
gt::gt()
```
---
# Spotlight
1. `janitor` + `skimr` for data exploration.
1. `tsiblle` for time-series.
1. `tidytext` for NLP.
---
# Contacts
http://mishabalyasin.com/
Slides, markdown and code - https://github.com/romatik/touring_the_tidyverse
You can’t perform that action at this time.