Apache Mahout 0.13.0 is built against Scala 2.10, which is no longer supported by Apache Spark 2.x, so we need to rebuild Mahout 0.13.0 with Scala 2.11 in order to run it with Spark 2.x.
This project is an example of the Correlated Cross-Occurrence (CCO) algorithm, running Apache Mahout 0.13.0 with the latest Apache Spark. I will update this project when Apache Mahout 0.14.0 is released.
$ git clone http://github.com/apache/mahout
$ cd mahout
$ mvn clean install -Pscala-2.11,spark-2.1 -DskipTests
Or just download a prebuilt Mahout for Scala 2.11 from
https://github.com/heroku/predictionio-buildpack/tree/master/repo/org/apache/mahout
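If you go with the prebuilt jars, Maven still needs to find them. A minimal sketch of installing one downloaded jar into your local repository; the file name, groupId, artifactId, and version shown here are assumptions, so match them to the jars you actually fetched:

```shell
# Install a downloaded Mahout jar into the local Maven repository.
# The coordinates below are assumptions -- adjust them to each
# artifact you downloaded from the link above.
mvn install:install-file \
  -Dfile=mahout-math-0.13.0.jar \
  -DgroupId=org.apache.mahout \
  -DartifactId=mahout-math \
  -Dversion=0.13.0 \
  -Dpackaging=jar
```

Repeat this for each Mahout artifact the build depends on.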
$ mvn clean scala:compile package
Copy all data sources from Kaggle to /opt/nfs.
When Spark runs in cluster mode, you need to set up an NFS server and export the shared directory as /opt/nfs, so that every executor sees the data under the same path.
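The NFS setup itself is outside this project. As a rough sketch (the subnet and mount options are assumptions; adjust them to your network and distribution), the server exports /opt/nfs and each Spark worker mounts it at the identical path:

```
# /etc/exports on the NFS server (subnet is an assumption)
/opt/nfs 192.168.0.0/24(rw,sync,no_subtree_check)

# on the server, re-export after editing /etc/exports
#   sudo exportfs -ra
# on each Spark worker, mount at the same path
#   sudo mount -t nfs master:/opt/nfs /opt/nfs
```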
$ mvn exec:exec@run-local
$ mvn exec:exec@run-cluster
Confirm that your master node's hostname is 'master', or change the master URL specified in pom.xml.
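As a quick sanity check (the spark:// URL form below uses Spark's default standalone master port 7077; the exact URL in this project's pom.xml may differ):

```shell
# Verify the hostname the workers will use to reach the master
hostname
# Check that 'master' resolves on each node
getent hosts master
# A standalone master URL in pom.xml typically looks like:
#   spark://master:7077
# If your master has a different hostname, either edit that URL
# or add a 'master' alias for its IP in /etc/hosts on every node.
```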