# Installation

Neptune is available on CRAN, so you can simply use the `install.packages` function.

This tutorial also needs a few more packages, which are listed below.



In [1]:
# install neptune
install.packages('neptune', dependencies = TRUE)

# install other packages for this tutorial
install.packages(c('digest', 'mlbench', 'randomForest'), dependencies = TRUE)

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

Installing packages into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)




For simplicity, install Miniconda and copy the Python path of your Miniconda environment; you will pass it to the `init_neptune` function later.

In [2]:
reticulate::install_miniconda()
reticulate::py_config()

python:         /root/.local/share/r-miniconda/envs/r-reticulate/bin/python
libpython:      /root/.local/share/r-miniconda/envs/r-reticulate/lib/libpython3.6m.so
pythonhome:     /root/.local/share/r-miniconda/envs/r-reticulate:/root/.local/share/r-miniconda/envs/r-reticulate
version:        3.6.10 |Anaconda, Inc.| (default, Mar 25 2020, 23:51:54)  [GCC 7.3.0]
numpy:          /root/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/numpy
numpy_version:  1.18.1

## Load packages and data

Nothing fancy here.

In [3]:
# load libraries
library(neptune)
library(digest)
library(mlbench)
library(randomForest)

SEED=1234
set.seed(SEED)

# load dataset 
data(Sonar)
dataset <- Sonar
x <- dataset[,1:60]   # predictors
y <- dataset[,61]     # labels

randomForest 4.6-14

Type rfNews() to see new features/changes/bug fixes.



# Initialize Neptune

To start logging things to Neptune, you need to connect your script to the Neptune service.

Neptune works on top of `reticulate` and Python, so you may need to specify your Python environment with the `python` and `python_path` arguments.

In [4]:
init_neptune(project_name = 'shared/r-integration',
             api_token = 'ANONYMOUS',
             python_path='/root/.local/share/r-miniconda/envs/r-reticulate/bin/python'
             )

Project(shared/r-integration)

# Create Experiment

To start tracking your work you need to create a Neptune experiment. 

You can name your experiment, add tags to organize things and track hyperparameters of your ML models.

In [5]:
params = list(ntree=100,
              mtry=10,
              maxnodes=20
              )

create_experiment(name='training on Sonar', 
                  tags=c('random-forest','sonar'),
                  params = params
)

Experiment(RIN-164)

# Set properties

You can also use the `set_property` function to save `key:value` pairs.

For example, I'll keep track of the `data version` and random `seed`.

In [0]:
set_property(property = 'data-version', value = digest(dataset))
set_property(property = 'seed', value = SEED)
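Why a hash works as a data version can be shown with a small sketch. It assumes only the `digest` package already loaded above; the toy data frame `df` is a stand-in and not part of the tutorial data. `digest` hashes the serialized R object, so identical objects give identical hashes and any changed value gives a different one.

```r
library(digest)

# Toy data frame used only for illustration.
df <- data.frame(id = 1:3, label = c("M", "R", "M"))

hash_original <- digest(df)   # default algorithm is md5
hash_again    <- digest(df)   # same object, so the same hash

df$label[2] <- "M"            # change a single value
hash_changed <- digest(df)

print(identical(hash_original, hash_again))    # TRUE
print(identical(hash_original, hash_changed))  # FALSE
```

Two experiments that report the same `data-version` property were therefore run on the same in-memory dataset.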

# Train your model

I defined hyperparameters in the `params` list and I will pass them directly into `randomForest`.

In [0]:
model <- randomForest(x = x, y = y,
  ntree=params$ntree, mtry = params$mtry, maxnodes = params$maxnodes,
  importance = TRUE
  )
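If the `params` list grows, spelling out each `params$...` argument gets repetitive. One possible alternative, shown here as a sketch rather than the tutorial's method, is to splice the whole list into the call with base R's `do.call`. This assumes every name in `params` is a valid `randomForest` argument.

```r
library(mlbench)
library(randomForest)

set.seed(1234)
data(Sonar)
x <- Sonar[, 1:60]   # predictors
y <- Sonar[, 61]     # labels

params <- list(ntree = 100, mtry = 10, maxnodes = 20)

# Fixed arguments first, then every entry of `params` as a named argument.
args <- c(list(x = x, y = y, importance = TRUE), params)
model <- do.call(randomForest, args)

print(model$ntree)
print(model$mtry)
```

The list logged to Neptune and the arguments given to the model then come from one object. A likely side effect is that `model$call` stores the evaluated arguments, so printing the model can be verbose.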

# Log metrics 

Logging metrics to Neptune is trivial.

For example, I will log the mean out-of-bag (OOB) error, averaged over all forest sizes from 1 to `ntree` trees, and the per-class errors taken from the confusion matrix.


In [0]:
log_metric('mean OOB error', mean(model$err.rate[,1]))
log_metric('error class M', model$confusion[1,3])
log_metric('error class R', model$confusion[2,3])
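The numeric indices above rely on the layout of the fitted model. As a sketch, the same values can be read by name, which makes it clearer what is logged. The block retrains a model so that it runs on its own; the variable names `mean_oob`, `final_oob`, `err_M` and `err_R` are only illustrative.

```r
library(mlbench)
library(randomForest)

set.seed(1234)
data(Sonar)
model <- randomForest(x = Sonar[, 1:60], y = Sonar[, 61],
                      ntree = 100, mtry = 10, maxnodes = 20)

# err.rate has one row per forest size and one column for OOB plus one per class.
print(colnames(model$err.rate))
# confusion has one column per class plus the per-class error rate.
print(colnames(model$confusion))

mean_oob  <- mean(model$err.rate[, "OOB"])        # average over forest sizes 1..ntree
final_oob <- model$err.rate[model$ntree, "OOB"]   # error of the complete forest
err_M     <- model$confusion["M", "class.error"]  # same cell as model$confusion[1, 3]
err_R     <- model$confusion["R", "class.error"]  # same cell as model$confusion[2, 3]
```

Note that the mean over all rows differs from the OOB error of the finished forest (`final_oob`), so it may be worth logging both under separate names.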

You can also log multiple values to the same channel. 

If you do that, Neptune will automatically create charts for you.

In [0]:
for (err in (model$err.rate[,1])) {
  log_metric('OOB errors', err)
}

# Log artifacts

You can log any file to Neptune. Just use the `log_artifact` function.

For example, we can log our `model.Rdata` file.

In [0]:
save(model, file="model.Rdata")
log_artifact('model.Rdata')
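Before uploading, it can be worth confirming that the saved file loads back correctly. The following sketch uses only base R; the small list standing in for the fitted model and the temporary path are assumptions made so that the block runs without training anything.

```r
# Stand-in for the fitted model (illustration only).
model <- list(ntree = 100, mtry = 10)

path <- file.path(tempdir(), "model.Rdata")
save(model, file = path)

# Load into a fresh environment so the check cannot be satisfied by the
# `model` variable that already exists in the workspace.
restored <- new.env()
loaded_names <- load(path, envir = restored)

stopifnot(identical(loaded_names, "model"))
stopifnot(identical(restored$model, model))
```

If both checks pass, the same `path` can be handed to `log_artifact`.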

# Log charts

You can log performance charts like ROC AUC, Confusion Matrix or anything else you think is important.

To do that, save the chart to an image file, then pass the name of the log channel and the path of the file to `log_image`.

For example, I'll log two versions of feature importance visualizations to Neptune.

In [0]:
for (t in c(1,2)){
  jpeg('temp_plot.jpeg')
  varImpPlot(model,type=t)
  dev.off()
  log_image('feature_importance', 'temp_plot.jpeg')
}

# Stop the experiment

After everything is done you need to stop the experiment.

Thanks to the `create_experiment` and `stop_experiment` functions, you can create multiple experiments in a single script.

In [0]:
stop_experiment()
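When a script runs several experiments in a row, each one should be stopped even if training fails halfway. A minimal sketch of that pattern follows. `with_experiment` is a hypothetical helper, not part of the `neptune` package, and the recording functions are stand-ins so that the block runs offline; with Neptune you would pass wrappers around `create_experiment` and `stop_experiment` instead.

```r
# Hypothetical helper: run `body(params)` between a start hook and a stop hook.
# `finally` guarantees the stop hook runs even when `body` throws an error.
with_experiment <- function(params, body, on_start, on_stop) {
  on_start(params)
  tryCatch(body(params), finally = on_stop())
}

# Stand-in hooks that only record what happened. With Neptune, something like
#   on_start = function(p) create_experiment(name = 'training on Sonar', params = p)
#   on_stop  = stop_experiment
# would take their place.
events <- character(0)
record_start <- function(p) events <<- c(events, paste0("start ntree=", p$ntree))
record_stop  <- function() events <<- c(events, "stop")

# One experiment per hyperparameter setting.
grid <- expand.grid(ntree = c(50, 100), mtry = 10)
for (i in seq_len(nrow(grid))) {
  p <- as.list(grid[i, ])
  # The body is a placeholder for training and logging.
  with_experiment(p, function(p) p$ntree * 2, record_start, record_stop)
}

print(events)
```

Each setting produces one start and one stop event, in order, so no experiment is left open when the next one begins.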

# Explore experiments in Neptune

Now you can explore everything you logged in Neptune.

You can use your own link or check out [this experiment dashboard](https://ui.neptune.ai/o/shared/org/r-integration/experiments?viewId=fa3b57a5-77fb-4edb-83fc-505014d3649d):

![image](https://neptune.ai/wp-content/uploads/r-integration-tour.gif)

## Create your free account

The best part is that Neptune is completely free for individuals and research teams, so you can go ahead and [create your free account](https://neptune.ai?utm_source=colab&utm_medium=notebook&utm_campaign=integration-r) and check it out for yourself.