Merge pull request #160 from datmo/r-notebook
R notebook example
asampat3090 committed May 28, 2018
2 parents 68475d0 + 7f78034 commit e6a0aad
Showing 3 changed files with 108 additions and 2 deletions.
9 changes: 7 additions & 2 deletions examples/R/README.md
@@ -14,14 +14,18 @@ $ pip install datmo
$ mkdir MY_DATMO_PROJECT
$ cd MY_DATMO_PROJECT
2. Initialize the datmo project
2. Initialize the datmo project (skip this step if using an example with .Rmd)

$ datmo init

3. Copy/save example files within project folder (if directory, copy the contents of the directory)

$ cp /path/to/SCRIPT_NAME.R .
If it is an Rmd notebook, you can run the following

$ cp /path/to/NOTEBOOK_NAME.Rmd .
If the filename for the example is a directory then you can run the following

$ cp /path/to/DIRECTORY/* .
@@ -44,4 +48,5 @@ in the example.

| feature | filename(s) | Instructions |
| ------------- |:-------------:| -----|
| Create Snapshot | `snapshot_create_iris_caret.R`| (1) Run `snapshot_create_iris_caret.R` <br> (2) See snapshot created with `$ datmo snapshot ls` |
| Create Snapshot | `snapshot_create_iris_caret.R`| (1) Open and run `snapshot_create_iris_caret.R` in RStudio <br> (2) See snapshot created with `$ datmo snapshot ls` |
| Create Snapshot Notebook | `snapshot_create_notebook.Rmd`| (1) Open and run `snapshot_create_notebook.Rmd` in RStudio <br> (2) See snapshot created with `$ datmo snapshot ls` |
File renamed without changes.
101 changes: 101 additions & 0 deletions examples/R/snapshot_create_notebook.Rmd
@@ -0,0 +1,101 @@
---
title: "Log your experiments in R with Datmo"
author: "Nick Walsh"
output:
  html_document:
    df_print: paged
    toc: yes
  rmarkdown::html_vignette:
    number_sections: yes
    toc: yes
---

Setup
=====

First, we'll need to install a few packages for use today. They'll contain everything we'll need to model our data and create visualizations.

```{r installPackages}
# The datasets package (which provides iris) ships with base R, so it needs no installation
install.packages("caret", dependencies = TRUE) # Model selection/tuning package
install.packages("rpart.plot") # Visualization package
```

Next, we'll install Datmo, a Python package that lets us log and track our experiments as *snapshots*.
If you don't already have pip, you can [find it here](https://pip.pypa.io/en/stable/installing/).

```{bash}
pip install datmo
```
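
To confirm that the install put the `datmo` command on your `PATH`, a quick check along these lines can help. This is a minimal sketch: `Sys.which()` is base R, while the chunk name `checkDatmo` and the messages are only illustrative.

```{r checkDatmo}
# Look up the datmo executable on the PATH; Sys.which() returns "" when it is missing
datmo_path <- Sys.which("datmo")
if (nzchar(datmo_path)) {
  message("Found datmo at ", datmo_path)
} else {
  message("datmo was not found on the PATH; check the pip install step above")
}
```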

Next, make sure the working directory is set to the root of your project, because Datmo tracks the directory it is run in.
You can set it through the RStudio file finder on the right, or with the following command.

```{r "setup", include=FALSE}
library(knitr) # Fails loudly if knitr is missing, unlike require()
# Replace the path below with the root directory of your own project
opts_knit$set(root.dir = "~/Dev/datmo-R-example")
```

Now we're going to initialize a Datmo repository. This will enable us to create snapshots for logging our experiments.

```{r initializeDatmo}
system("datmo init", input=c("my new project","test description"), timeout=15)
```

Example
======

OK, let's start by loading Fisher's Iris dataset.

```{r loadData}
library(datasets)
df <- iris # Create dataframe from the Iris dataset
head(df) # View first few rows of dataset
```

Now that our dataframe is loaded in, we can import the *caret* package to perform training.

```{r fitModel}
library(caret)
modFit <- train(Species ~ ., method = "rpart", data = df) # Fit a decision tree with rpart
print(modFit$finalModel) # Print the final fitted tree
```

Our model is built, but the printed text output is hard to read on its own. Let's create a visualization that shows the
splits in our decision tree.

```{r visualizeModel}
library(rpart.plot)
rpart.plot(modFit$finalModel) #create decision tree visualization
```

Awesome! Since we're happy with our model results, we'll save our model and log its configuration and stats in a snapshot.
To do this, we build character strings of the form `--PROPERTY key:value`, where `PROPERTY` is either `config` or `stats`.
These strings are then passed to the snapshot create code block.

```{r defineSnapshot}
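# Define the configuration to save from the model
config <- paste(sep = "",
                " --config method:", modFit$method,
                " --config modelType:", modFit$modelType)
# Define the metrics to save from the model.
# Note: row 1 of modFit$results is the first tuning-grid row,
# which is not necessarily the row for the selected model.
stats <- paste(sep = "",
               " --stats Accuracy:", modFit$results$Accuracy[1],
               " --stats Kappa:", modFit$results$Kappa[1])
config # Inspect the resulting strings
stats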
```
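
If you log more than a couple of properties, the same `--PROPERTY key:value` strings can be generated from a named list. The sketch below assumes a hypothetical helper called `make_flags`, which is not part of Datmo or caret, and uses illustrative values rather than the fitted model.

```{r flagHelper}
# Hypothetical helper (not part of Datmo or caret): turn a named list into
# repeated " --<flag> key:value" arguments, matching the strings built by hand above.
make_flags <- function(flag, values) {
  stopifnot(!is.null(names(values)), all(nzchar(names(values))))
  paste0(" --", flag, " ", names(values), ":", unlist(values), collapse = "")
}

# Illustrative values only; in the notebook these would come from modFit
example_config <- make_flags("config", list(method = "rpart", modelType = "Classification"))
example_config
```

Keeping the leading space on every flag means the result can be pasted after other arguments without extra separators.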

```{r snapshotCreate}
system2("datmo", args = paste("snapshot create", "-m 'Whoa, my first snapshot!'", config, stats))
```
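
A slightly more defensive variant quotes the message with `shQuote()` and checks the exit status that `system2()` returns. This is only a sketch: `snapshot_args` is a hypothetical helper, the flag values are placeholders, and the chunk is marked `eval=FALSE` so that knitting does not create a second snapshot.

```{r snapshotChecked, eval=FALSE}
# Hypothetical helper (not part of Datmo): assemble the CLI arguments as a vector
snapshot_args <- function(message, flags = character()) {
  c("snapshot", "create", "-m", shQuote(message), trimws(flags))
}

# Placeholder flags; in the notebook you would pass config and stats instead
snap_args <- snapshot_args("My first snapshot",
                           c(" --config method:rpart", " --stats Accuracy:0.9"))

# Only call the CLI when datmo is actually on the PATH
if (nzchar(Sys.which("datmo"))) {
  status <- system2("datmo", args = snap_args) # Exit status; 0 means success
  if (status != 0) warning("datmo snapshot create exited with status ", status)
}
```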

Finally, list the snapshots to confirm that the new one was created.

```{bash}
datmo snapshot ls
```
