Analytics using SFO crime data from Kaggle
CrimeCategory_by_DayOfWeek.jpg
Crimecount_byCategory&Yr.jpg
README.md
corr_categ_vars.jpg
corr_matrx_quant.jpg
cov_variables.jpg
heatmap_SFO.png
hmap.R
multinom.zip
multinom_pgm.R
relations.R




Graphical analysis & Data Exploration

(1) HEATMAPS:
The code for the heatmap and charts is in the program "hmap.R".
To view the images/charts or run the code online, use the Kaggle link below:
https://www.kaggle.com/anu2analytics/sf-crime/heatmap-for-crime-categories-and-area
One heatmap image is included in this folder as heatmap_SFO.png.
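
For readers without access to the Kaggle link, here is a minimal sketch of one way to build a category-by-district heatmap in base R. It is not the contents of hmap.R. The data frame `crimes` and its values are synthetic stand-ins, and only the column names `Category` and `PdDistrict` come from the formulas later in this README.

```r
# Synthetic stand-in for the Kaggle training data (illustrative values only).
set.seed(1)
crimes <- data.frame(
  Category   = sample(c("ASSAULT", "LARCENY/THEFT", "VANDALISM"), 300, replace = TRUE),
  PdDistrict = sample(c("MISSION", "NORTHERN", "SOUTHERN", "TARAVAL"), 300, replace = TRUE)
)

# Count incidents for every category/district pair.
counts <- table(crimes$Category, crimes$PdDistrict)

# heatmap() expects a plain numeric matrix, so drop the "table" class.
count_matrix <- unclass(counts)

# pdf(NULL) draws to a null device so the sketch runs without opening a window;
# replace it with png("my_heatmap.png") to write an image file instead.
pdf(NULL)
heatmap(count_matrix, Rowv = NA, Colv = NA, scale = "none",
        xlab = "PdDistrict", ylab = "Category")
dev.off()
```

Setting `Rowv = NA` and `Colv = NA` turns off the default clustering and reordering, so rows and columns stay in alphabetical order.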

(2) CHI-SQ TEST & CORRELATION VISUALIZATION:
The program "relations.R" performs the following tasks:

    (a) Chi-squared test to check whether a relationship exists between crime category (a categorical variable) and other factors such as district, X/Y coordinates, month and year.
    (b) corrgram function to visualize the correlation matrix and dependencies.
Related images: corr_categ_vars.jpg, corr_matrx_quant.jpg and cov_variables.jpg.
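
As an illustration of task (a), here is a minimal sketch of a chi-squared test of independence between crime category and district using base R's `chisq.test`. It is not taken from relations.R. The data are synthetic, built so that the category deliberately depends on the district, and the data frame name `crimes` is an assumption.

```r
# Synthetic data (illustrative only): Category is made to depend on PdDistrict.
set.seed(42)
n <- 500
crimes <- data.frame(
  PdDistrict = sample(c("MISSION", "NORTHERN", "SOUTHERN"), n, replace = TRUE)
)
crimes$Category <- ifelse(
  crimes$PdDistrict == "SOUTHERN",
  sample(c("LARCENY/THEFT", "ASSAULT"), n, replace = TRUE, prob = c(0.8, 0.2)),
  sample(c("LARCENY/THEFT", "ASSAULT"), n, replace = TRUE, prob = c(0.4, 0.6))
)

# Contingency table of counts: rows are categories, columns are districts.
tab <- table(crimes$Category, crimes$PdDistrict)

# Pearson's chi-squared test of independence on the table.
result <- chisq.test(tab)
print(result)
```

A small p-value suggests that category and district are not independent. Continuous fields such as the X/Y coordinates would need to be binned first (for example with `cut()`) before they can go into a contingency table.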

Prediction Algorithm:

Multinomial regression.

Model 1: simple formula with default options.

Formula: Category ~ DayOfWeek + PdDistrict
Score: 2.65
Output file: multinom.zip
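
A minimal sketch of fitting this kind of model with `multinom` from the nnet package, assuming the two predictors are joined with `+`. The training data below are synthetic and purely illustrative. The name `mytrain` mirrors the data frame used in the Model 2 call, and `trace = FALSE` is added only to keep the output quiet.

```r
library(nnet)  # provides multinom()

# Synthetic stand-in for the training data (illustrative values only).
set.seed(7)
n <- 400
mytrain <- data.frame(
  DayOfWeek  = factor(sample(c("Monday", "Friday", "Saturday"), n, replace = TRUE)),
  PdDistrict = factor(sample(c("MISSION", "NORTHERN", "SOUTHERN"), n, replace = TRUE)),
  Category   = factor(sample(c("ASSAULT", "LARCENY/THEFT", "VANDALISM"), n, replace = TRUE))
)

# Fit the multinomial model with default options.
fit <- multinom(Category ~ DayOfWeek + PdDistrict, data = mytrain, trace = FALSE)

# Predicted class probabilities: one row per incident, one column per category.
probs <- predict(fit, newdata = mytrain, type = "probs")
head(probs)
```

Each row of `probs` sums to 1, which is the per-category probability layout a Kaggle submission file for this competition would typically be built from.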

Model 2: more complex formula with a higher iteration limit (maxit = 500).

Formula: multinom(Category ~ DayOfWeek + year + mth + PdDistrict, data = mytrain, maxit = 500)
Score: 2.60
Output file: multinom_dates.zip
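
The README does not say what "Score" measures. Assuming it is the Kaggle leaderboard metric, multi-class logarithmic loss (lower is better), the sketch below shows how such a score could be computed in base R. The function name `multiclass_log_loss` and the example values are hypothetical.

```r
# Multi-class log loss: the mean negative log of the probability
# assigned to the true class of each observation.
multiclass_log_loss <- function(actual, probs, eps = 1e-15) {
  # Clip probabilities away from 0 and 1 so log() stays finite.
  probs <- pmin(pmax(probs, eps), 1 - eps)
  # Pick, for each row, the column that matches the true class label.
  idx <- cbind(seq_along(actual), match(actual, colnames(probs)))
  -mean(log(probs[idx]))
}

# Tiny hand-made example (illustrative values only).
probs <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.8, 0.1,
    0.3, 0.3, 0.4),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("ASSAULT", "LARCENY/THEFT", "VANDALISM"))
)
actual <- c("ASSAULT", "LARCENY/THEFT", "VANDALISM")

multiclass_log_loss(actual, probs)
```

For reference, predicting a uniform probability over K classes gives a log loss of log(K), which is a useful baseline when reading scores such as 2.65 and 2.60.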