What better way to learn the R interfaces to Apache Spark than to write a book about them?
What this book will cover:
- Part I
  - Introduction to Spark (what is Spark, installation, etc.)
  - Interfacing R with Spark via the sparklyr package (connecting to Spark, data wrangling with dplyr, etc.)
  - Essentials of machine learning (cross-validation, target leakage, etc.)
- Part II
  - Machine learning in Spark via MLlib
  - Machine learning in Spark via the rsparkling package