Skip to content

Decision tree algorithm using Spark machine learning libraries

Notifications You must be signed in to change notification settings

rpeddacama/SparkMachineLearning

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SparkMachineLearning

This is an implementation of Decisiontree model using Spark MLLib libraries.

It uses NewYork city motor vehicle collisions data from public website:

https://data.cityofnewyork.us/Public-Safety/NYPD-Motor-Vehicle-Collisions/h9gi-nx95

Source data is cleansed with R script to remove missing values.

Feature vectors used for building the model are:

Time of collision Street name where collision happened Borough name where collision happened

Input data is split for training and testing the decisiontree model generated.

About

Decision tree algorithm using Spark machine learning libraries

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published