Second attempt at the Kaggle competition "San Francisco Crime Classification"

San Francisco Crime Classification (Kaggle) using Spark and logistic regression


The "San Francisco Crime Classification" challenge is a Kaggle competition that aims to predict the category of crimes that occurred in the city, given the time and location of each incident.

In this post, I outline and explain my second solution to this challenge, this time using Spark and Python.

Link to the competition: San Francisco Crime Classification

Learning method

The algorithm chosen for this solution is multinomial logistic regression, a classification model based on regression in which the dependent variable (what we want to predict) is categorical rather than continuous, and can take more than two values.
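
As a rough illustration of how such a model turns features into class probabilities, here is a minimal sketch in plain Python. It is not the code used in this solution, which runs on Spark. The weights, biases, feature values, and the three-category subset are all made up for the example, and the names `softmax` and `predict_proba` are hypothetical helpers.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, biases, features):
    """One linear score per class, then softmax over the scores."""
    scores = [
        sum(w * x for w, x in zip(class_weights, features)) + b
        for class_weights, b in zip(weights, biases)
    ]
    return softmax(scores)


# Assumed toy model: 3 categories, 2 features (say, a scaled hour and a
# scaled longitude). A trained model would learn these numbers from data.
CATEGORIES = ["LARCENY/THEFT", "ASSAULT", "VANDALISM"]
WEIGHTS = [[1.0, 0.5], [-0.5, 1.0], [0.0, -1.0]]
BIASES = [0.1, 0.0, -0.1]

probs = predict_proba(WEIGHTS, BIASES, [0.8, 0.2])
predicted = CATEGORIES[probs.index(max(probs))]
```

The predicted category is simply the one with the highest probability. Training consists of choosing the weights and biases so that the probabilities assigned to the true categories are as high as possible.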


A written report is available at: San Francisco Crime Classification - a Kaggle Competition