This repository is part of a community effort to collect, curate, and share publicly available examples of data analysis projects powered by the
drake R package. Each folder is its own example with a self-contained set of code and data files.
Run locally
You can download the example files and run them locally with
# Install and load drake.
devtools::install_github("ropensci/drake")
library(drake)
# List the available examples.
drake_examples()
# Get an example.
drake_example("main")
list.files() # See the new 'main' folder.
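Once an example is downloaded, a typical next step is to run it from inside its own folder. The sketch below is a hedged illustration: it assumes the example ships a make.R entry script, which is common but not guaranteed, so check each folder's README first.

```r
# A sketch of running a downloaded example, assuming it provides make.R.
setwd("main")    # enter the folder created by drake_example("main")
source("make.R") # build the targets defined by the example's plan
```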
customer-churn-simple: based on an RStudio Solutions Engineering example of how to use Keras with R. The motivation comes from a blog post by Matt Dancho, and the code is based on a notebook by Edgar Ruiz. If it takes too long to store your deep learning models in this kind of workflow, try customer-churn-fast.
customer-churn-fast: similar to customer-churn-simple, but with more nuance to mitigate a potential serialization bottleneck.
main: drake's bare-bones introductory example, written by Kirill Müller.
gsp: a concrete example using real econometrics data. It explores the relationships between gross state product and other quantities, and it shows off drake's ability to generate lots of reproducibly-tracked tasks with ease.
packages: a concrete example using data on R package downloads. It demonstrates how drake can refresh a project based on new incoming data without restarting everything from scratch.
mtcars: an old legacy example. Use load_mtcars_example() to set up the project in your workspace.
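The refresh-without-restarting behavior mentioned above can be seen in miniature with the mtcars example. This is a hedged sketch: it assumes load_mtcars_example() defines a plan object named my_plan in the current session, so verify the object names it creates in your own workspace.

```r
library(drake)
load_mtcars_example() # sets up the project; my_plan is the assumed plan name
make(my_plan)         # first run: builds every target
make(my_plan)         # second run: skips targets that are still up to date
clean()               # optional: remove built targets to start over
```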
High-performance computing examples
mlr-slurm: an example machine learning workflow rigged to deploy to a SLURM cluster.
Docker-psock: demonstrates how to deploy targets to a Docker container using a specialized PSOCK cluster.
sge: uses drake's high-performance computing functionality to send work to a Grid Engine cluster.
slurm: similar to sge, but for SLURM.
torque: similar to sge, but for TORQUE.
Examples for developing drake
hpc-profiling: an example with a small number of medium-sized datasets. The goal is to assess how long it takes (relatively speaking) to shuffle data among HPC workers.
overhead: an example explicitly designed to maximize strain on drake's internals. The purpose is to support profiling studies to speed up drake.
Demonstrations of specific features
script-based-workflows: demonstrates how to adapt drake to an imperative script-based project.
code_to_plan: demonstrates drake's code_to_plan() function, which converts a script of top-level R statements into a drake plan.
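As a rough illustration, code_to_plan() reads a script of top-level assignments and turns each one into a target. In the hedged sketch below, the script name and its contents are hypothetical, not part of the example folder:

```r
library(drake)
# Suppose script.R (hypothetical) contains:
#   data <- read.csv("data.csv")
#   model <- lm(y ~ x, data = data)
plan <- code_to_plan("script.R") # one target per top-level assignment
make(plan)                       # build the resulting plan as usual
```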