This workshop consists of two days of training on Microsoft R for Data Science and one day of Spark training, so it is meant to be delivered in a three-day format.
In the first two days, you'll learn how to use open source R and Microsoft R for data science workflows. On the third day, you'll take what you learned about open source R and scalable Microsoft R and deploy it to a production-ready HDInsight Spark cluster.
Cloning the Materials
Note that this is the delivery repository: it contains no content of its own, and all the content comes from submodules that live in other git repositories. To clone this repository together with all of its submodules, run the following from any terminal:
git clone --recursive https://github.com/Azure/learnAnalytics-mr4ds-spark.git
Alternatively, clone the repository without its submodules, then run the following from inside the cloned directory to initialize and update them:
git submodule update --init --recursive
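To check that the submodules were actually fetched, one option is to inspect the output of `git submodule status`, which prefixes each uninitialized submodule with `-`. The following is a minimal sketch, not part of this repository; `count_uninitialized` is a hypothetical helper name chosen for illustration.

```shell
# Hypothetical helper: reads `git submodule status` output on stdin and
# prints how many submodules are still uninitialized (lines starting with "-").
count_uninitialized() {
  # grep -c exits non-zero when the count is 0, so `|| true` keeps the
  # function usable under `set -e`.
  grep -c '^-' || true
}

# Usage, from inside the cloned repository:
#   git submodule status --recursive | count_uninitialized
```

A result of `0` suggests every submodule has been initialized; any other number means `git submodule update --init --recursive` still needs to be run.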
Submit issues and pull requests to the repository of the underlying submodule rather than to this one. For example, if you find an error in the Microsoft R for Data Science module, submit an issue in that module's repository.