Example Spark ML project using R.A. Fisher's famous "iris" dataset

JakeGreene/iris-ml

Iris-ML

A sample machine learning project using Apache Spark.

Data

This project uses R.A. Fisher's famous "iris" dataset, which contains 150 entries split evenly across 3 classes. A description of the data can be found here
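Before training on the bundled file or a replacement, it can help to confirm the row and class counts. The sketch below assumes the file follows the common UCI layout (four numeric measurements followed by a class label, comma-separated, possibly with trailing blank lines). The function name `iris_class_counts` is hypothetical and not part of this project.

```shell
# Print the number of well-formed rows per class label in an iris-style file.
# Assumption: a well-formed row has exactly 5 comma-separated fields,
# with the class label in the last field. Blank lines are ignored.
iris_class_counts() {
  awk -F',' 'NF == 5 { count[$5]++ } END { for (label in count) print label, count[label] }' "$1" |
    LC_ALL=C sort
}
```

Calling `iris_class_counts src/main/resources/iris.data` should list each class with its row count. For the standard dataset that would be three classes of 50 rows each.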

Usage

This project uses Spark 1.6.0 and Scala 2.11. Spark does not currently provide a pre-built Scala 2.11 distribution, so you will need to spend ~15 minutes downloading the Spark source and compiling it yourself.
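A minimal sketch of that build step, assuming the Spark 1.6.0 source has already been downloaded and unpacked. The function name `build_spark_scala_211` is hypothetical. The Maven profiles follow the Spark 1.6 build documentation and may need adjusting for your Hadoop version.

```shell
# Build Spark 1.6.0 for Scala 2.11 from an already-unpacked source tree.
# Assumption: the argument is the root directory of the Spark 1.6.0 sources.
build_spark_scala_211() {
  src_dir="$1"
  if [ ! -f "${src_dir}/dev/change-scala-version.sh" ]; then
    echo "not a Spark source tree: ${src_dir}" >&2
    return 1
  fi
  (
    cd "${src_dir}" || exit 1
    # Rewrite the POM files so the build targets Scala 2.11.
    bash ./dev/change-scala-version.sh 2.11 || exit 1
    # Compile Spark without running its test suite.
    ./build/mvn -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean package
  )
}
```

After a successful build, the source directory itself can serve as the Spark 1.6.0 directory that SPARK_1_6_HOME refers to.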

To use this project, point SPARK_1_6_HOME at your Spark 1.6.0 directory (or substitute the path directly), optionally replace src/main/resources/iris.data with whichever data you want to use, then run the following commands:

sbt clean assembly
# The classification task
${SPARK_1_6_HOME}/bin/spark-submit --class ca.jakegreene.iris.IrisClassification --master spark://127.0.0.1:7077 target/scala-2.11/iris.jar src/main/resources/iris.data
# The clustering task
${SPARK_1_6_HOME}/bin/spark-submit --class ca.jakegreene.iris.IrisClustering --master spark://127.0.0.1:7077 target/scala-2.11/iris.jar src/main/resources/iris.data
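
The two submit commands above differ only in the class name, so they can be generated by a small helper. This is a sketch only: the function name `iris_submit_cmd` and its argument order are hypothetical, and it prints the command rather than running it. The master URL defaults to the one used above. Passing `local[*]`, a standard Spark master URL that runs everything in a single local process, is an alternative not mentioned in this README for trying the tasks without a standalone cluster.

```shell
# Print the spark-submit command for one of the two tasks without running it.
# Arguments: task ("classification" or "clustering"),
#            master URL (optional), data path (optional).
iris_submit_cmd() {
  task="$1"
  master="${2:-spark://127.0.0.1:7077}"
  data="${3:-src/main/resources/iris.data}"
  case "${task}" in
    classification) class="ca.jakegreene.iris.IrisClassification" ;;
    clustering)     class="ca.jakegreene.iris.IrisClustering" ;;
    *) echo "unknown task: ${task}" >&2; return 1 ;;
  esac
  echo "${SPARK_1_6_HOME}/bin/spark-submit --class ${class} --master ${master} target/scala-2.11/iris.jar ${data}"
}
```

For example, `iris_submit_cmd clustering 'local[*]'` prints the clustering command with the local master, which you can inspect and then run by hand.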
