Second attempt at the Kaggle competition "San Francisco Crime Classification"
The "San Francisco Crime Classification" challenge is a Kaggle competition whose goal is to predict the category of crimes that occurred in the city, given the time and location of each incident.

In this post, I outline my second solution to this challenge, this time using Spark and Python.

Link to the competition: San Francisco Crime Classification

Learning method

The algorithm chosen for the implemented solution is multinomial logistic regression, a classification model based on regression in which the dependent variable (what we want to predict) is categorical rather than continuous.
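
To make the idea concrete, here is a minimal plain-Python sketch of how a multinomial logistic regression turns features into a predicted category: it computes one linear score per class and passes the scores through a softmax. This is not the Spark code used in the solution. The function names, the two features, the three category labels, and the weights are all made up for illustration, and the weights are not fitted to any data.

```python
import math


def softmax(scores):
    """Turn one raw score per class into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(features, weights, biases):
    """Compute one linear score per class, then apply softmax."""
    scores = [
        sum(w * x for w, x in zip(class_weights, features)) + b
        for class_weights, b in zip(weights, biases)
    ]
    return softmax(scores)


# Hypothetical toy setup: 2 numeric features and 3 crime categories.
CATEGORIES = ["LARCENY/THEFT", "ASSAULT", "VANDALISM"]
WEIGHTS = [[0.5, -0.2], [-0.3, 0.8], [0.1, 0.1]]  # illustrative, not fitted
BIASES = [0.1, 0.0, -0.1]

probs = predict_proba([1.0, 2.0], WEIGHTS, BIASES)
predicted = CATEGORIES[probs.index(max(probs))]
print(predicted, probs)
```

Training consists of finding the weights and biases that make the observed categories as probable as possible; the sketch only shows the prediction step.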


A written report is available at: San Francisco Crime Classification - a Kaggle Competition