Reproducible workflows at scale with drake
Ambitious workflows in R, such as machine learning analyses, can be
difficult to manage. A single round of computation can take several
hours to complete, and routine updates to the code and data tend to
invalidate hard-earned results. You can enhance the maintainability,
hygiene, speed, scale, and reproducibility of such projects with the
drake R package.
drake resolves the dependency
structure of your analysis pipeline, skips tasks that are already up to
date, executes the rest with optional distributed
computing, and organizes the
output so you rarely have to think about data files. This workshop will
teach you how to create and maintain machine learning projects with
drake.
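The behavior described above (dependency resolution, skipping up-to-date targets, and managed storage) can be illustrated with a minimal sketch. It assumes the drake package is installed. The helper functions `simulate_data()` and `fit_model()` and the target names are hypothetical and exist only for this illustration, and the cache is placed in a temporary directory so the example leaves no files in your project.

```r
library(drake)

# Hypothetical helpers, for illustration only.
simulate_data <- function(n) {
  data.frame(x = seq_len(n), y = 2 * seq_len(n) + 1)
}
fit_model <- function(dataset) {
  lm(y ~ x, data = dataset)
}

# The plan lists targets; drake infers from the commands that
# model depends on dataset and slope depends on model.
plan <- drake_plan(
  dataset = simulate_data(100),
  model = fit_model(dataset),
  slope = coef(model)[["x"]]
)

# Store results in a throwaway cache instead of the default .drake/ folder.
cache <- new_cache(path = tempfile())

make(plan, cache = cache) # First run: builds all three targets in order.
make(plan, cache = cache) # Second run: nothing changed, so nothing is rebuilt.

# Retrieve a stored result from the cache instead of reading a data file.
slope_value <- readd(slope, cache = cache)
```

If you later edit `fit_model()`, the next `make()` should rebuild `model` and `slope` but reuse `dataset`, because only the downstream targets depend on the changed function.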
    install.packages("remotes")
    remotes::install_github("wlandau/learndrake")
    keras::install_keras()
    # Check if the installation succeeded.
    tensorflow::tf_config()
If you are using RStudio version 1.2.5003 and encounter this fatal
error, consider downgrading TensorFlow to version 1.13.1. Note:
keras::install_keras() silently tries to upgrade TensorFlow to
version >= 2, so you will need to run it with
tensorflow = "1.13.1".
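As a sketch of the downgrade described above, the pinned installation could look like the following. It assumes the function in question is `keras::install_keras()`, the installer used in the setup commands, which accepts a `tensorflow` argument. Whether version 1.13.1 resolves the crash on your machine is not guaranteed.

```r
# Assumption: keras::install_keras() is the installer that upgrades TensorFlow.
# Pin TensorFlow to 1.13.1 instead of accepting the default version.
keras::install_keras(tensorflow = "1.13.1")

# Check which TensorFlow installation is now in use.
tensorflow::tf_config()
```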
RStudio Cloud (pre-built)
- Sign up for RStudio Cloud.
- Navigate to https://rstudio.cloud/project/627076 to open a new copy of the workshop.
- Optional: save a permanent copy so you can come back to it later. Look for the red “temporary copy” text at the top and click the “save a permanent copy” option next to it.
RStudio Cloud (custom)
This approach takes a bit longer to set up than the pre-built project.
- Sign up for RStudio Cloud.
- Create a fresh new project.
- Run this setup script to install the dependencies and download the materials.
- Advantage: no need to sign up for RStudio Cloud.
- Disadvantage: long load times and quick timeouts.
The functions in
learndrake help navigate and deploy the workshop
materials. If you installed the package and dependencies as above, you
can take the workshop locally without an internet connection. Start with
the introductory slides, then move on to the notebooks, launching the
apps along the way.
| Function | Description |
|---|---|
| launch_app() | Launch a Shiny app that accompanies a tutorial. |
| | Save the app files so you can deploy to shinyapps.io or Shiny Server. |
| save_notebooks() | Save the tutorials to your computer: R notebooks and supporting files. |
| | Save the introductory slides to your computer. |
| | Open the introductory slides in a web browser. |
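A local session might start along these lines. This is only a sketch: it uses the two functions this document names, `save_notebooks()` and `launch_app()`, called with their defaults. Where the files are written by default, and which app launches when none is specified, are assumptions here, so check `?save_notebooks` and `?launch_app` before relying on them.

```r
library(learndrake)

# Save the R notebooks and supporting files to your computer.
# Assumption: the default destination is suitable; see ?save_notebooks.
save_notebooks()

# Launch a Shiny app that accompanies a tutorial.
# Assumption: calling with no arguments selects a default app; see ?launch_app.
# The app occupies the R session until you close it.
launch_app()
```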
The workshop begins with an introductory presentation on
drake. A video of the presentation is available.
Alternatively, you can view the slides at
https://wlandau.github.io/learndrake/index.html or open them yourself
in a browser with the learndrake function for viewing the slides.
After the introductory presentation, students work through a sequence of
R notebooks in order. Use
save_notebooks() to save the notebooks and
supporting files to your computer.
|Files and R Markdown||
4-static.Rmd come with supporting Shiny
apps to conduct the learning exercises. Use
launch_app() to run any of them.
| Person | Contribution |
|---|---|
| Matt Dancho | Publishing the original blog post with the workshop’s underlying case study. |
| Eric Nantz | Reviewing and providing feedback on this workshop. |