Contents

This tutorial folder contains a set of exercises to help you get started with Azure Data Lake Analytics (ADLA) and R for data science work. Each Exercise folder contains U-SQL and R scripts, along with a description of the exercise. The Advanced folder contains some more complex examples.

Note: Before starting the exercises, provision an Azure Data Lake Analytics account and a Data Lake Store account in your Azure subscription by following the instructions here. Then upload myiris.csv and myiris_wheader.csv to the /TutorialMaterial folder in your Data Lake Store. Most of the exercises use these CSV files.

To work through this tutorial in a Jupyter notebook, use the notebook here. It runs az CLI commands from within the notebook.
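The upload step from the note above can also be scripted with the Azure CLI. The following is a minimal sketch, not part of the tutorial material: the function name `upload_tutorial_files` and the account name passed to it are hypothetical, and it assumes you have already run `az login` and that both CSV files sit in the source directory.

```shell
# Sketch only: upload the two tutorial CSV files to /TutorialMaterial
# in a Data Lake Store account, then list the folder to confirm.
# Usage: upload_tutorial_files <adls-account-name> [local-source-dir]
upload_tutorial_files() {
  account="$1"          # name of your Data Lake Store account (assumption: you supply it)
  src_dir="${2:-.}"     # local folder holding the CSV files; defaults to the current directory

  for f in myiris.csv myiris_wheader.csv; do
    az dls fs upload \
      --account "$account" \
      --source-path "$src_dir/$f" \
      --destination-path "/TutorialMaterial/$f" || return 1
  done

  # Show what is now in the target folder.
  az dls fs list --account "$account" --path /TutorialMaterial
}
```

Call it as `upload_tutorial_files <your-account-name>`. Inside a Jupyter notebook, the same commands can be run from a `%%bash` cell or individually with a leading `!`.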

Exercise 1 - Inline R code.

Exercise 2 - Deploy R script as resource.

Exercise 3 - Check the R environment in ADLA and install the magrittr package.

Exercise 4 - Use checkpoint to create a zip file containing the required packages and all their dependencies, then deploy and use the dplyr package.

Exercise 5 - Understand how partitioning works in ADLA, specifically the REDUCE operation.

Exercise 6 - Use “charactermatrix” as the rReturnType.

Exercise 7 - Save a trained model and deploy it for scoring.

Exercise 8 - Use R scripts as combiners (using COMBINE and Extension.R.Combiner).

Exercise 9 - Use rxDTree() from the RevoScaleR package.

Advanced - More complex examples.