Iris-ML

A sample machine learning project using Apache Spark.

Data

This project uses R.A. Fisher's famous "iris" dataset, which contains 150 entries split evenly across 3 classes. A description of the data can be found here.
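
As a quick sanity check on the data file (or on a replacement file), the following sketch counts entries per class. It assumes the common CSV layout for this dataset: four comma-separated measurements followed by a class label on each line. The function name count_classes is hypothetical and not part of this project.

```shell
#!/bin/sh
# Count rows per class label in an iris-style CSV file.
# Assumption (not confirmed by this README): each data line has exactly five
# comma-separated fields, with the class label in the last field.
# Lines with any other field count (for example trailing blank lines) are skipped.
count_classes() {
  awk -F, 'NF == 5 { n[$5]++ } END { for (c in n) print c, n[c] }' "$1" | sort
}

# Example: count_classes src/main/resources/iris.data
```

If the file matches the description above, this should list 3 classes whose counts sum to 150.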

Usage

This project uses Spark 1.6.0 and Scala 2.11. Spark 1.6.0 does not ship a prebuilt distribution for Scala 2.11, so you will need to download the Spark source and compile it for Scala 2.11 yourself, which takes roughly 15 minutes.
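
The build steps are not spelled out here. The sketch below follows the Scala 2.11 instructions in Spark's own build documentation for the 1.6 line. The download URL, the Hadoop profile (-Phadoop-2.4), and the function name build_spark_scala_211 are assumptions; adjust them to your environment.

```shell
#!/bin/sh
# Sketch: build Spark 1.6.0 against Scala 2.11 from source.
# Assumptions (not from this README): the Apache archive URL, the Hadoop
# profile, and the use of the build/mvn wrapper bundled with the Spark source.
# The body runs in a subshell so the caller's working directory is unchanged.
build_spark_scala_211() (
  curl -LO https://archive.apache.org/dist/spark/spark-1.6.0/spark-1.6.0.tgz &&
  tar -xzf spark-1.6.0.tgz &&
  cd spark-1.6.0 &&
  # Rewrite the POMs to target Scala 2.11 before compiling.
  ./dev/change-scala-version.sh 2.11 &&
  build/mvn -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean package
)

# Example: build_spark_scala_211 && export SPARK_1_6_HOME="$PWD/spark-1.6.0"
```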

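To use this project, set SPARK_1_6_HOME to your Spark 1.6.0 directory (or substitute the path directly into the commands), optionally replace src/main/resources/iris.data with the data you want to use, and run the following commands. They expect a Spark standalone master at spark://127.0.0.1:7077; change --master if yours runs elsewhere.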

sbt clean assembly
# The classification task
${SPARK_1_6_HOME}/bin/spark-submit --class ca.jakegreene.iris.IrisClassification --master spark://127.0.0.1:7077 target/scala-2.11/iris.jar src/main/resources/iris.data
# The clustering task
${SPARK_1_6_HOME}/bin/spark-submit --class ca.jakegreene.iris.IrisClustering --master spark://127.0.0.1:7077 target/scala-2.11/iris.jar src/main/resources/iris.data
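
Because the commands above depend on SPARK_1_6_HOME, a small pre-flight check can catch an unset or wrong path before submitting. This is a minimal sketch; check_spark_home is a hypothetical helper and not part of this project.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of the project).
# Succeeds only if SPARK_1_6_HOME is set and contains an executable
# bin/spark-submit; otherwise prints a message to stderr and returns 1.
check_spark_home() {
  if [ -z "${SPARK_1_6_HOME:-}" ]; then
    echo "SPARK_1_6_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "${SPARK_1_6_HOME}/bin/spark-submit" ]; then
    echo "No executable spark-submit under ${SPARK_1_6_HOME}/bin" >&2
    return 1
  fi
  return 0
}

# Example: check_spark_home && sbt clean assembly
```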
