@@ -0,0 +1,21 @@
{
"hash": "3ea97871f74836d15d22db5ec0939940",
"result": {
"markdown": "---\ntitle: \"1 - Introduction\"\nsubtitle: \"Machine learning with tidymodels\"\nformat:\n revealjs: \n slide-number: true\n footer: <https://workshops.tidymodels.org>\n include-before-body: header.html\n include-after-body: footer-annotations.html\n theme: [default, tidymodels.scss]\n width: 1280\n height: 720\nknitr:\n opts_chunk: \n echo: true\n collapse: true\n comment: \"#>\"\n---\n\n\n\n\n::: r-fit-text\nWelcome!\n:::\n\n## Who are you?\n\n- You can use the magrittr `%>%` or base R `|>` pipe\n\n- You are familiar with functions from dplyr, tidyr, ggplot2\n\n- You have exposure to basic statistical concepts\n\n- You do **not** need intermediate or expert familiarity with modeling or ML\n\n## Who are tidymodels?\n\n- Simon Couch\n- Hannah Frick\n- Emil Hvitfeldt\n- Max Kuhn\n\n. . .\n\nMany thanks to Davis Vaughan, Julia Silge, David Robinson, Julie Jung, Alison Hill, and Desirée De Leon for their role in creating these materials!\n\n## Asking for help\n\n. . .\n\n🟪 \"I'm stuck and need help!\"\n\n. . 
.\n\n🟩 \"I finished the exercise\"\n\n\n## 👀 {.annotation}\n\n![](images/pointing.svg){.absolute top=\"0\" right=\"0\"}\n\n## Tentative plan for this workshop\n\n::: columns\n::: {.column width=\"50%\"}\n- *Today:* \n\n - Your data budget\n - What makes a model\n - Evaluating models\n:::\n::: {.column width=\"50%\"}\n- *Tomorrow:*\n \n - Feature engineering\n - Tuning hyperparameters\n - Racing methods\n - Iterative search methods\n:::\n:::\n\n## {.center}\n\n### Introduce yourself to your neighbors 👋\n\n<br></br>\n\nCheck Slack (`#ml-ws-2023`) for an RStudio Cloud link.\n\n## What is machine learning?\n\n![](https://imgs.xkcd.com/comics/machine_learning.png){fig-align=\"center\"}\n\n::: footer\n<https://xkcd.com/1838/>\n:::\n\n## What is machine learning?\n\n![](images/what_is_ml.jpg){fig-align=\"center\"}\n\n::: footer\nIllustration credit: <https://vas3k.com/blog/machine_learning/>\n:::\n\n## What is machine learning?\n\n![](images/ml_illustration.jpg){fig-align=\"center\"}\n\n::: footer\nIllustration credit: <https://vas3k.com/blog/machine_learning/>\n:::\n\n## Your turn {transition=\"slide-in\"}\n\n![](images/parsnip-flagger.jpg){.absolute top=\"0\" right=\"0\" width=\"150\" height=\"150\"}\n\n. . .\n\n*How are statistics and machine learning related?*\n\n*How are they similar? Different?*\n\n\n::: {.cell}\n::: {.cell-output-display}\n```{=html}\n<div class=\"countdown\" id=\"statistics-vs-ml\" data-update-every=\"1\" tabindex=\"0\" style=\"right:0;bottom:0;\">\n<div class=\"countdown-controls\"><button class=\"countdown-bump-down\">&minus;</button><button class=\"countdown-bump-up\">&plus;</button></div>\n<code class=\"countdown-time\"><span class=\"countdown-digits minutes\">03</span><span class=\"countdown-digits colon\">:</span><span class=\"countdown-digits seconds\">00</span></code>\n</div>\n```\n:::\n:::\n\n\n::: notes\nthe \"two cultures\"\n\nmodel first vs. data first\n\ninference vs. prediction\n:::\n\n## What is tidymodels? 
![](hexes/tidymodels.png){.absolute top=-20 right=0 width=\"64\" height=\"74.24\"}\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(tidymodels)\n#> ── Attaching packages ──────────────────────────── tidymodels 1.1.0 ──\n#> ✔ broom 1.0.5 ✔ rsample 1.1.1.9000\n#> ✔ dials 1.2.0 ✔ tibble 3.2.1 \n#> ✔ dplyr 1.1.2 ✔ tidyr 1.3.0 \n#> ✔ infer 1.0.4 ✔ tune 1.1.1.9001\n#> ✔ modeldata 1.1.0 ✔ workflows 1.1.3 \n#> ✔ parsnip 1.1.0.9003 ✔ workflowsets 1.0.1 \n#> ✔ purrr 1.0.1 ✔ yardstick 1.2.0.9001\n#> ✔ recipes 1.0.6\n#> ── Conflicts ─────────────────────────────── tidymodels_conflicts() ──\n#> ✖ purrr::discard() masks scales::discard()\n#> ✖ dplyr::filter() masks stats::filter()\n#> ✖ dplyr::lag() masks stats::lag()\n#> ✖ recipes::step() masks stats::step()\n#> • Use tidymodels_prefer() to resolve common conflicts.\n```\n:::\n\n\n## {background-image=\"images/tm-org.png\" background-size=\"contain\"}\n\n## The whole game\n\nPart of any modelling process is\n\n* Splitting your data into training and test sets\n* Using a resampling scheme\n* Fitting models\n* Assessing performance\n* Choosing a model\n* Fitting and assessing the final model\n\n\n## The whole game\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](images/whole-game-split.jpg){fig-align='center' width=3543}\n:::\n:::\n\n\n## The whole game\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](images/whole-game-model-1.jpg){fig-align='center' width=3543}\n:::\n:::\n\n\n:::notes\nStress that we are **not** fitting a model on the entire training set other than for illustrative purposes in deck 2.\n:::\n\n## The whole game\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](images/whole-game-model-n.jpg){fig-align='center' width=3543}\n:::\n:::\n\n\n## The whole game\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](images/whole-game-resamples.jpg){fig-align='center' width=3543}\n:::\n:::\n\n\n## The whole game\n\n\n::: {.cell 
layout-align=\"center\"}\n::: {.cell-output-display}\n![](images/whole-game-select.jpg){fig-align='center' width=3543}\n:::\n:::\n\n\n## The whole game\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](images/whole-game-final-fit.jpg){fig-align='center' width=3543}\n:::\n:::\n\n\n## The whole game\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](images/whole-game-final-performance.jpg){fig-align='center' width=3543}\n:::\n:::\n\n\n\n## Let's install some packages\n\nIf you are using your own laptop instead of RStudio Cloud:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ninstall.packages(\"pak\")\n\npkgs <- c(\"bonsai\", \"doParallel\", \"embed\", \"finetune\", \"lightgbm\", \"lme4\", \n \"parallelly\", \"plumber\", \"probably\", \"ranger\", \"rpart\", \"rpart.plot\", \n \"stacks\", \"textrecipes\", \"tidymodels\", \"tidymodels/modeldatatoo\", \n \"vetiver\")\npak::pak(pkgs)\n```\n:::\n\n\n. . .\n\nCheck Slack (`#ml-ws-2023`) for an RStudio Cloud link.\n\n\n## Our versions\n\n\n::: {.cell}\n\n:::\n\n\nbonsai (0.2.1.9000, Github (tidymodels/bonsai@aab79), broom (1.0.5, local), dials (1.2.0, CRAN), doParallel (1.0.17, CRAN), dplyr (1.1.2, CRAN), embed (1.0.0, CRAN), finetune (1.1.0.9000, Github (tidymodels/finetune@52d), ggplot2 (3.4.2, CRAN), lightgbm (3.3.5, CRAN), lme4 (1.1-33, CRAN), modeldata (1.1.0, CRAN), modeldatatoo (0.1.0.9000, Github (tidymodels/modeldatatoo), parallelly (1.36.0, CRAN), parsnip (1.1.0.9003, Github (tidymodels/parsnip@e627), plumber (1.2.1, CRAN), probably (1.0.2, CRAN), purrr (1.0.1, CRAN), ranger (0.15.1, CRAN), recipes (1.0.6, CRAN), rpart (4.1.19, CRAN), rpart.plot (3.1.1, CRAN), rsample (1.1.1.9000, Github (tidymodels/rsample@afc4), scales (1.2.1, CRAN), stacks (1.0.2.9000, local), textrecipes (1.0.2, CRAN), tibble (3.2.1, CRAN), tidymodels (1.1.0, CRAN), tidyr (1.3.0, CRAN), tune (1.1.1.9001, Github (tidymodels/tune@fea8b02), vetiver (0.2.0, CRAN), workflows (1.1.3, CRAN), workflowsets (1.0.1, 
CRAN), yardstick (1.2.0.9001, Github (tidymodels/yardstick@6c), and Quarto (1.3.433)\n",
"supporting": [],
"filters": [
"rmarkdown/pagebreak.lua"
],
"includes": {
"include-in-header": [
"<link href=\"../../site_libs/countdown-0.4.0/countdown.css\" rel=\"stylesheet\" />\n<script src=\"../../site_libs/countdown-0.4.0/countdown.js\"></script>\n"
],
"include-after-body": [
"\n<script>\n // htmlwidgets need to know to resize themselves when slides are shown/hidden.\n // Fire the \"slideenter\" event (handled by htmlwidgets.js) when the current\n // slide changes (different for each slide format).\n (function () {\n // dispatch for htmlwidgets\n function fireSlideEnter() {\n const event = window.document.createEvent(\"Event\");\n event.initEvent(\"slideenter\", true, true);\n window.document.dispatchEvent(event);\n }\n\n function fireSlideChanged(previousSlide, currentSlide) {\n fireSlideEnter();\n\n // dispatch for shiny\n if (window.jQuery) {\n if (previousSlide) {\n window.jQuery(previousSlide).trigger(\"hidden\");\n }\n if (currentSlide) {\n window.jQuery(currentSlide).trigger(\"shown\");\n }\n }\n }\n\n // hookup for slidy\n if (window.w3c_slidy) {\n window.w3c_slidy.add_observer(function (slide_num) {\n // slide_num starts at position 1\n fireSlideChanged(null, w3c_slidy.slides[slide_num - 1]);\n });\n }\n\n })();\n</script>\n\n"
]
},
"engineDependencies": {},
"preserve": {},
"postProcess": true
}
}
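
The added file appears to be a Quarto freeze record: the rendered slide source is stored as one escaped string under `result.markdown`, next to `hash`, `filters`, and `includes`. When reviewing a diff like this it can help to pull that string back out and list the slide headings. The following is a minimal sketch, assuming only the key layout visible in the diff above; the helper name `slide_titles` and the shortened sample record are illustrative and not part of the repository.

```python
import json

# Shortened stand-in for the freeze record in the diff. Only the key layout
# ("hash", "result" -> "markdown") is taken from the file; the markdown
# string is abbreviated for illustration.
SAMPLE = r'''
{
  "hash": "3ea97871f74836d15d22db5ec0939940",
  "result": {
    "markdown": "---\ntitle: \"1 - Introduction\"\n---\n\n## Who are you?\n\n## The whole game\n\n## The whole game\n",
    "supporting": [],
    "postProcess": true
  }
}
'''


def slide_titles(freeze_text):
    """Return the distinct level-2 headings, in order of first appearance."""
    markdown = json.loads(freeze_text)["result"]["markdown"]
    titles = []
    for line in markdown.splitlines():
        if line.startswith("## "):
            title = line[3:].strip()
            # Repeated slide titles (e.g. a multi-slide build) are listed once.
            if title and title not in titles:
                titles.append(title)
    return titles


print(slide_titles(SAMPLE))
```

To run it against the real file, the same function could be given the file's text, e.g. `slide_titles(open(path, encoding="utf-8").read())`, where `path` points at the freeze JSON in a local checkout.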