Course Materials for Three Day Workshop on Scalable Data Science with Microsoft R Server and Spark with HDInsight


This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact with any additional questions or comments.

Course Materials

This workshop consists of two days of training on Microsoft R for Data Science followed by one day of Spark training, and is meant to be presented in a three-day format.

Over the first two days, you'll learn to use open source R and Microsoft R for data science workflows. On the third day, you'll take what you learned about open source R and scalable Microsoft R and deploy it to a production-ready HDInsight Spark cluster.

Cloning the Materials

Note that this is the delivery repository: it holds no content of its own, and all of the material comes from submodules hosted in other git repositories. To clone this repository together with all of its submodules, run the following from any terminal with git installed:

git clone --recursive

Alternatively, clone the repository without its submodules, then fetch them from inside the clone:

git submodule update --init --recursive
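After either approach, it can be worth confirming that the submodules were actually populated. The following is a minimal sketch, not part of the course materials: the helper name `missing_submodules` is hypothetical, and it relies only on the fact that `git submodule status` prefixes each uninitialized submodule with `-`.

```shell
# Hypothetical helper (not part of this repository): print the path of every
# submodule that has not been initialized yet.
# `git submodule status` marks uninitialized submodules with a leading "-".
missing_submodules() {
    # $1 is the path to the clone; defaults to the current directory.
    git -C "${1:-.}" submodule status --recursive | grep '^-' | awk '{print $2}'
}
```

Calling `missing_submodules path/to/clone` prints nothing when every submodule has been checked out. If it lists any paths, rerunning `git submodule update --init --recursive` inside the clone should fetch them.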

Submit issues and pull requests to the repository of the underlying submodule rather than to this one. For example, if you find an error in the Microsoft R for Data Science module, submit the issue to that module's own repository.