# Safety Recommenders Project:

# Project Description

The objective of the project is to create a data product that helps people traveling to Washington DC determine how safe the city's most popular places are.

## Dataset Description

This project uses one dataset: a CSV file that contains all the crimes recorded in and around Washington DC over the past 8 years.
To improve the model, new features will be added to the dataset, mainly weather data.

Finally, after the first analyses, some features of the crime dataset will be removed to make the data product simpler to use. Features like date, time, location and address will be kept, while features like offense type and the location codes used by the police department will be removed.


### Importing Main Libraries:

In [80]:
%matplotlib notebook
import IPython
from IPython.display import display
from sqlalchemy import create_engine
import psycopg2
import psycopg2.extras
import pandas as pd
import csv
from numpy import nan as NA
from datetime import datetime
import re
import sys
import numpy as np
import matplotlib.pyplot as plt 
import scipy as sp
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import Normalizer
from sklearn import model_selection
from sklearn.model_selection import cross_val_score
from sklearn.metrics import classification_report
from pandas import *
import pickle
import requests
import os
from sklearn.preprocessing import LabelEncoder

# ------------------------DATA INGESTION------------------------

First, we change the pandas display settings so that all features (columns) are shown:

In [81]:
# Pandas settings
# Without these, pandas truncates wide or long DataFrames when displaying them (e.g. with head())
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

Here we create the ingestion module. One class reads CSV files, a second reads from a PostgreSQL database, and a third downloads a file from a URL:

In [82]:
class Ingestion(object):
    """This is the ingestion class to deal with csv directly from the same directory where the module is"""

    def __init__(self, file, sep = ",", header = 0 ):
        self.file = file
        self.delimiter = sep
        self.df = pd.read_csv(file, sep= sep, header=header, engine='python', 
                              parse_dates = False)

    def file_csv(self):
        return self.df

class IngestionDatabase(object):

    """Ingestion class for a PostgreSQL database."""
    def __init__(self, database, query):
        self.engine = create_engine(database)
        self.table_names = self.engine.table_names()
        self.con = self.engine.connect()
        self.rs = self.con.execute(query)
        self.df = pd.DataFrame(self.rs.fetchmany(size=15))

    def cols(self):
        self.df.columns = self.rs.keys()
        return self.df

class OnlineFetch(object):
    
    def __init__(self, URL, fname="dc-crimes-search-results.csv"):
        self.URL = URL
        self.fname = fname 
        
  
    def fetch_data(self):
        """
        Helper method to retrieve the ML Repository dataset.
        """
        self.response = requests.get(self.URL)
        self.outpath  = os.path.abspath(self.fname)
        
        with open(self.outpath, 'wb') as f:
            f.write(self.response.content)
            return self.outpath

## Creating the Ingestion instances for the crime dataset and weather dataset:

In [83]:
ingest = Ingestion('DC_Crime_Official2.csv')
data = ingest.file_csv()

In [84]:
ingest = Ingestion('dc_weather_2010-2018.csv')
weather_data = ingest.file_csv()

In [85]:
data.head(5)

Unnamed: 0,NEIGHBORHOOD_CLUSTER,CENSUS_TRACT,offensegroup,LONGITUDE,END_DATE,offense-text,SHIFT,YBLOCK,DISTRICT,WARD,YEAR,offensekey,BID,sector,PSA,ucr-rank,BLOCK_GROUP,VOTING_PRECINCT,XBLOCK,BLOCK,START_DATE,CCN,OFFENSE,ANC,REPORT_DAT,METHOD,location,LATITUDE
0,cluster 15,600.0,property,-77.069766,2015-05-05T17:45:00.000,theft f/auto,evening,140519.0,2.0,3.0,2015,property|theft f/auto,,2D2,204.0,7,000600 3,precinct 27,393951.0,3500 - 3599 block of lowell street nw,2015-05-05T15:00:00.000,15064669,theft f/auto,3C,2015-05-05T22:06:00.000Z,others,"38.932540365033717,-77.069768169731375",38.932533
1,cluster 2,3000.0,violent,-77.031208,2015-05-05T20:55:00.000,robbery,evening,140084.0,3.0,1.0,2015,violent|robbery,,3D1,302.0,4,003000 2,precinct 39,397294.0,1300 - 1399 block of irving street nw,2015-05-05T20:50:00.000,15064781,robbery,1A,2015-05-06T01:05:00.000Z,gun,"38.928638407177509,-77.031210110921407",38.928631
2,cluster 3,4400.0,property,-77.02846,2015-05-05T22:14:00.000,motor vehicle theft,evening,139201.0,3.0,1.0,2015,property|motor vehicle theft,,3D2,305.0,8,004400 1,precinct 22,397532.0,1200 - 1247 block of florida avenue nw,2015-05-05T18:18:00.000,15064796,motor vehicle theft,1B,2015-05-06T02:58:00.000Z,others,"38.920684762113531,-77.028462123394547",38.920677
3,cluster 23,8804.0,violent,-76.985496,2015-06-23T08:00:00.000,homicide,midnight,137689.0,5.0,5.0,2015,violent|homicide,,5D3,506.0,1,008804 1,precinct 78,401258.0,1200 - 1299 block of holbrook terrace ne,2015-06-23T05:23:00.000,15094190,homicide,5D,2015-06-24T04:00:00.000Z,gun,"38.907066722563066,-76.985498377563218",38.907059
4,cluster 25,8100.0,property,-76.990245,2015-06-23T06:30:00.000,theft f/auto,day,136398.650012,1.0,6.0,2015,property|theft f/auto,,1D2,107.0,7,008100 1,precinct 81,400846.210015,duncan place ne and 12th street ne,2015-06-22T22:00:00.000,15094194,theft f/auto,6A,2015-06-23T12:48:00.000Z,others,"38.895443280295112,-76.990247632314322",38.895435


In [86]:
weather_data.head(5)

Unnamed: 0,Year,Month,Day,Hour,Temperature,Precipitation,Snowfall
0,2010,1,1,0,44.92,0.0,0.0
1,2010,1,1,1,43.83,0.0,0.0
2,2010,1,1,2,42.22,0.0,0.0
3,2010,1,1,3,40.23,0.0,0.0
4,2010,1,1,4,39.38,0.0,0.0


# -------------------------DATA WRANGLING--------------------

1. Wrangling crime data:

In [87]:
class Wrangling(object):

    def __init__(self, data = data):
        self.df = data
    def dropNA(self):
        # Drop empty rows: only rows in which every value is NA are removed
        self.df = self.df.dropna(how='all')
        return self.df

    def __offense_column(self, text1 ='theft/other', text2 ='theft f/auto', text3 = 'assault w/dangerous weapon',
                       repl1 = 'theft', repl2 = 'auto theft', repl3 = 'assault with weapon' ):

        """There are 9 categories of offenses here.
        This function rewrites some categories as more readable text,
        for example: assault w/dangerous weapon -> assault with weapon"""

        # The crime CSV names this column 'offense-text' (with a hyphen).
        # TODO: add the column name to the arguments.
        self.df['offense-text'] = self.df['offense-text'].replace(
            [text1, text2, text3], [repl1, repl2, repl3])
        return self.df

    def date_time_transformer(self, time = 'START_DATE', second_date = 'REPORT_DAT', third_date = 'END_DATE'):
        '''Convert the report date column to datetime64 and drop the start and end date columns'''
        self.df[second_date] = pd.to_datetime(self.df[second_date])
        self.df.drop([time, third_date], axis = 1, inplace = True)
        return self.df

    def __latlong_cutter(self):
        """ Reduce the precision of the lat/long data by truncating the values."""
        self.newlat = []
        self.newlon = []
        for item in self.df['LATITUDE']:
            item = str(item)
            item = float(item[0:6])
            self.newlat.append(item)

        self.df['LATITUDE'] = self.newlat

        for item in self.df['LONGITUDE']:
            item = str(item)
            item = float(item[0:7])
            self.newlon.append(item)
        self.df['LONGITUDE'] = self.newlon

        return self.df

    def lat_long_rounder(self, decimals = 3):
        """ Reduce the precision of the lat/long data by rounding decimals"""
        self.df['LATITUDE'] = self.df['LATITUDE'].round(decimals = decimals)
        self.df['LONGITUDE'] = self.df['LONGITUDE'].round(decimals = decimals)
        return self.df

    def adress_format_modifier(self):
        """Rewrite parts of the BLOCK column so that it is easier to parse"""

        self.splitted = []

        # Create the 'splitted' column.
        # This works, but it cannot be turned into pandas' .replace because it also uses the split method.
        # Note that the built-in .replace does not work properly with integers or with large numbers of
        # substitutions, so this approach works but is not very wise to use.
        for row in self.df['BLOCK']:
            row = row.replace("block of ", "")
            row = row.replace("street", "St")
            row = row.replace("-", "")
            row = row.split(' ', 1)
            self.splitted.append(row)
        self.df['splitted'] = self.splitted
        return self.df

    def block_parser(self):
        """ Block parser that separates the block into start and end blocks"""

        self.startblock = []
        self.endblock_1 = []
        self.endblock = []
        #  create column 'startblock'
        for row in self.df['splitted']:
            row = row[0]
            self.startblock.append(row)
        self.df['startblock'] = self.startblock
        # create column  'endblock_1'
        for row in self.df['splitted']:
            row = row[-1].lstrip() # enblock_1
            row = row.split(' ',1)
            self.endblock_1.append(row)
        self.df['endblock_1'] = self.endblock_1
        # create column  'endblock'
        for row in self.df['endblock_1']:
            row = row[0]
            self.endblock.append(row)
        self.df['endblock'] = self.endblock
        return self.df

    def street_parser(self):
        self.street = []
        #creating column 'street'
        for row in self.df['endblock_1']:
            row = row[1]
            self.street.append(row)
        self.df['street'] = self.street
        return self.df

In [88]:
Wrangled = Wrangling()
Wrangled.dropNA()
Wrangled.date_time_transformer()
Wrangled.lat_long_rounder()
Wrangled.adress_format_modifier()
Wrangled.block_parser()
crime_df = Wrangled.street_parser()

In [89]:
crime_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 282475 entries, 0 to 282474
Data columns (total 31 columns):
NEIGHBORHOOD_CLUSTER    278891 non-null object
CENSUS_TRACT            281632 non-null float64
offensegroup            282475 non-null object
LONGITUDE               282475 non-null float64
offense-text            282475 non-null object
SHIFT                   282475 non-null object
YBLOCK                  282475 non-null float64
DISTRICT                282313 non-null float64
WARD                    282463 non-null float64
YEAR                    282475 non-null int64
offensekey              282475 non-null object
BID                     46124 non-null object
sector                  282296 non-null object
PSA                     282296 non-null float64
ucr-rank                282475 non-null int64
BLOCK_GROUP             281632 non-null object
VOTING_PRECINCT         282414 non-null object
XBLOCK                  282475 non-null float64
BLOCK                   282475 non-null

In [90]:
crime_df["REPORT_DAT"] = pd.to_datetime(crime_df["REPORT_DAT"])

    1.1 Rename the REPORT_DAT column of the crime dataset to datetime so it can be used as an index:

In [91]:
crime_df = crime_df.rename(columns = {'REPORT_DAT': 'datetime'})

In [92]:

crime_df["year"] =crime_df["datetime"].dt.year
crime_df["month"] =crime_df["datetime"].dt.month
crime_df["day"] =crime_df["datetime"].dt.day
crime_df["hour"] =crime_df["datetime"].dt.hour
crime_df["minute"] =crime_df["datetime"].dt.minute
crime_df["second"] =crime_df["datetime"].dt.second

        1.1.1 The weather dataset has at most hourly precision, so we truncate the crime timestamps to the hour:

In [93]:
crime_df['datetime'] = crime_df['datetime'].apply(lambda x: x.replace(minute = 0, second = 0))
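
The same truncation can also be written with the vectorized `dt.floor` accessor instead of a per-row `apply`. The snippet below is a minimal sketch on a toy Series (the values are made up for illustration), not a change to the pipeline above:

```python
import pandas as pd

# Toy timestamps standing in for crime_df['datetime'] (illustrative values only)
stamps = pd.Series(pd.to_datetime([
    "2015-05-05 22:06:00",
    "2015-05-06 01:05:00",
]))

# Per-row approach used in the notebook
by_apply = stamps.apply(lambda x: x.replace(minute=0, second=0))

# Vectorized alternative: floor every timestamp to the start of its hour
by_floor = stamps.dt.floor("h")

print(by_floor.tolist())
```

Both approaches should give the same hourly timestamps; the vectorized form avoids a Python-level call per row, which tends to matter on a frame with a few hundred thousand rows.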

        1.1.2: Adding index to crime dataset:

In [94]:
crime_df = crime_df.set_index(pd.DatetimeIndex(crime_df['datetime']), drop = False)

In [95]:
crime_df.head()

datetime,NEIGHBORHOOD_CLUSTER,CENSUS_TRACT,offensegroup,LONGITUDE,offense-text,SHIFT,YBLOCK,DISTRICT,WARD,YEAR,offensekey,BID,sector,PSA,ucr-rank,BLOCK_GROUP,VOTING_PRECINCT,XBLOCK,BLOCK,CCN,OFFENSE,ANC,datetime,METHOD,location,LATITUDE,splitted,startblock,endblock_1,endblock,street,year,month,day,hour,minute,second
2015-05-05 22:00:00,cluster 15,600.0,property,-77.07,theft f/auto,evening,140519.0,2.0,3.0,2015,property|theft f/auto,,2D2,204.0,7,000600 3,precinct 27,393951.0,3500 - 3599 block of lowell street nw,15064669,theft f/auto,3C,2015-05-05 22:00:00,others,"38.932540365033717,-77.069768169731375",38.933,"[3500, 3599 lowell St nw]",3500,"[3599, lowell St nw]",3599,lowell St nw,2015,5,5,22,6,0
2015-05-06 01:00:00,cluster 2,3000.0,violent,-77.031,robbery,evening,140084.0,3.0,1.0,2015,violent|robbery,,3D1,302.0,4,003000 2,precinct 39,397294.0,1300 - 1399 block of irving street nw,15064781,robbery,1A,2015-05-06 01:00:00,gun,"38.928638407177509,-77.031210110921407",38.929,"[1300, 1399 irving St nw]",1300,"[1399, irving St nw]",1399,irving St nw,2015,5,6,1,5,0
2015-05-06 02:00:00,cluster 3,4400.0,property,-77.028,motor vehicle theft,evening,139201.0,3.0,1.0,2015,property|motor vehicle theft,,3D2,305.0,8,004400 1,precinct 22,397532.0,1200 - 1247 block of florida avenue nw,15064796,motor vehicle theft,1B,2015-05-06 02:00:00,others,"38.920684762113531,-77.028462123394547",38.921,"[1200, 1247 florida avenue nw]",1200,"[1247, florida avenue nw]",1247,florida avenue nw,2015,5,6,2,58,0
2015-06-24 04:00:00,cluster 23,8804.0,violent,-76.985,homicide,midnight,137689.0,5.0,5.0,2015,violent|homicide,,5D3,506.0,1,008804 1,precinct 78,401258.0,1200 - 1299 block of holbrook terrace ne,15094190,homicide,5D,2015-06-24 04:00:00,gun,"38.907066722563066,-76.985498377563218",38.907,"[1200, 1299 holbrook terrace ne]",1200,"[1299, holbrook terrace ne]",1299,holbrook terrace ne,2015,6,24,4,0,0
2015-06-23 12:00:00,cluster 25,8100.0,property,-76.99,theft f/auto,day,136398.650012,1.0,6.0,2015,property|theft f/auto,,1D2,107.0,7,008100 1,precinct 81,400846.210015,duncan place ne and 12th street ne,15094194,theft f/auto,6A,2015-06-23 12:00:00,others,"38.895443280295112,-76.990247632314322",38.895,"[duncan, place ne and 12th St ne]",duncan,"[place, ne and 12th St ne]",place,ne and 12th St ne,2015,6,23,12,48,0


2. Wrangling weather data:

In [96]:
weather_data['datetime'] =  pd.to_datetime(weather_data[['Year', 'Month', 'Day', 'Hour']])

In [97]:
weather_data.head(5)

Unnamed: 0,Year,Month,Day,Hour,Temperature,Precipitation,Snowfall,datetime
0,2010,1,1,0,44.92,0.0,0.0,2010-01-01 00:00:00
1,2010,1,1,1,43.83,0.0,0.0,2010-01-01 01:00:00
2,2010,1,1,2,42.22,0.0,0.0,2010-01-01 02:00:00
3,2010,1,1,3,40.23,0.0,0.0,2010-01-01 03:00:00
4,2010,1,1,4,39.38,0.0,0.0,2010-01-01 04:00:00


    2.1 Setting index for weather data:

In [98]:
weather_data = weather_data.set_index(pd.DatetimeIndex(weather_data['datetime']), drop = False)

In [99]:
weather_data.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 74064 entries, 2010-01-01 00:00:00 to 2018-06-13 23:00:00
Data columns (total 8 columns):
Year             74064 non-null int64
Month            74064 non-null int64
Day              74064 non-null int64
Hour             74064 non-null int64
Temperature      74064 non-null float64
Precipitation    74064 non-null float64
Snowfall         74064 non-null float64
datetime         74064 non-null datetime64[ns]
dtypes: datetime64[ns](1), float64(3), int64(4)
memory usage: 5.1 MB


In [100]:
weather_data.head(5)

datetime,Year,Month,Day,Hour,Temperature,Precipitation,Snowfall,datetime
2010-01-01 00:00:00,2010,1,1,0,44.92,0.0,0.0,2010-01-01 00:00:00
2010-01-01 01:00:00,2010,1,1,1,43.83,0.0,0.0,2010-01-01 01:00:00
2010-01-01 02:00:00,2010,1,1,2,42.22,0.0,0.0,2010-01-01 02:00:00
2010-01-01 03:00:00,2010,1,1,3,40.23,0.0,0.0,2010-01-01 03:00:00
2010-01-01 04:00:00,2010,1,1,4,39.38,0.0,0.0,2010-01-01 04:00:00


3. Merging the datasets:

First, we need to rule out duplicate timestamps in the weather data and select the time range of the second dataset. This prevents the left join from generating extra rows.

In [101]:
a= weather_data.index.duplicated(keep = False)
a = DataFrame(a)
b = a[a[0]== True]
b.count()

0    0
dtype: int64
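
The duplicate check matters because duplicated keys on the right side of a left join multiply the matching rows on the left side. A minimal sketch of this effect, using two made-up toy frames rather than the project data:

```python
import pandas as pd

# Toy crime table: two events in the same hour
crimes = pd.DataFrame({
    "datetime": pd.to_datetime(["2015-05-05 22:00", "2015-05-05 22:00"]),
    "offense": ["robbery", "theft"],
})

# Toy weather table with a duplicated hour
weather_dup = pd.DataFrame({
    "datetime": pd.to_datetime(["2015-05-05 22:00", "2015-05-05 22:00"]),
    "Temperature": [60.1, 60.3],
})

# Each crime row matches both weather rows, so 2 rows become 4
inflated = crimes.merge(weather_dup, how="left", on="datetime")

# Dropping duplicated keys first keeps one weather row per hour
weather_unique = weather_dup.drop_duplicates(subset="datetime", keep="first")
clean = crimes.merge(weather_unique, how="left", on="datetime")

print(len(crimes), len(inflated), len(clean))
```

Since the check on the real weather data reports zero duplicated timestamps, no such clean-up is needed here.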

3.1 Setting up the time range:

In [102]:
crime_df['datetime'].min()

Timestamp('2010-06-11 04:00:00')

In [103]:
crime_df['datetime'].max()

Timestamp('2018-06-11 03:00:00')

In [104]:
crime_df = crime_df.sort_values('datetime')
weather_data = weather_data.sort_values('datetime')

Defaulting to column, but this will raise an ambiguity error in a future version
  """Entry point for launching an IPython kernel.
Defaulting to column, but this will raise an ambiguity error in a future version
  


3.1.1 Setting up the time range so it matches the crime data (the weather dataset contains dates that do not exist in the crime dataset):

In [105]:
idx = pd.date_range('2010-06-11 04:00:00', '2018-06-11 03:00:00', freq = "H")
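
The `idx` range does not appear to be applied in the cells shown here. One way such a range could be used, given as a hedged sketch on a toy weather frame (the frame, its values and the `toy_` names are assumptions for illustration), is to keep only the weather rows whose timestamps fall inside it:

```python
import pandas as pd

# Toy hourly weather frame indexed by timestamp (illustrative values)
toy_index = pd.date_range("2010-06-11 00:00:00", periods=8, freq="h")
toy_weather = pd.DataFrame({"Temperature": range(8)}, index=toy_index)

# Hourly range that starts later than the weather data, as the crime data does
toy_idx = pd.date_range("2010-06-11 04:00:00", "2010-06-11 06:00:00", freq="h")

# Keep only the weather rows whose timestamp is in the target range
trimmed = toy_weather[toy_weather.index.isin(toy_idx)]

print(trimmed)
```

With a left join from the crime data this trimming is optional, because weather rows without a matching crime timestamp are discarded by the join anyway.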

In [106]:
crime_df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 282475 entries, 2010-06-11 04:00:00 to 2018-06-11 03:00:00
Data columns (total 37 columns):
NEIGHBORHOOD_CLUSTER    278891 non-null object
CENSUS_TRACT            281632 non-null float64
offensegroup            282475 non-null object
LONGITUDE               282475 non-null float64
offense-text            282475 non-null object
SHIFT                   282475 non-null object
YBLOCK                  282475 non-null float64
DISTRICT                282313 non-null float64
WARD                    282463 non-null float64
YEAR                    282475 non-null int64
offensekey              282475 non-null object
BID                     46124 non-null object
sector                  282296 non-null object
PSA                     282296 non-null float64
ucr-rank                282475 non-null int64
BLOCK_GROUP             281632 non-null object
VOTING_PRECINCT         282414 non-null object
XBLOCK                  282475 non-null float64
BLOCK

In [107]:
df = pd.merge(crime_df, weather_data, how='left', on='datetime', left_index= True)

Defaulting to column, but this will raise an ambiguity error in a future version
  exec(code_obj, self.user_global_ns, self.user_ns)
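
The warning arises because `datetime` is both the index name and a column label in each frame. A minimal sketch of one way to avoid the ambiguity, assuming toy frames with the same layout (the helper `make_frame` and all values are made up for illustration): clear the index name before merging on the column, and use `validate` to assert that every weather hour is unique.

```python
import pandas as pd

def make_frame(times, **cols):
    """Build a toy frame whose index and 'datetime' column share a name."""
    df = pd.DataFrame({"datetime": pd.to_datetime(times), **cols})
    return df.set_index(pd.DatetimeIndex(df["datetime"]), drop=False)

toy_crime = make_frame(
    ["2015-05-05 22:00", "2015-05-05 22:00", "2015-05-06 01:00"],
    offensegroup=["property", "violent", "property"],
)
toy_weather = make_frame(
    ["2015-05-05 22:00", "2015-05-06 01:00"],
    Temperature=[60.1, 55.4],
)

# Remove the index name so that 'datetime' refers only to the column
left = toy_crime.rename_axis(None)
right = toy_weather.rename_axis(None)

# validate="many_to_one" raises if the weather keys are not unique
merged = pd.merge(left, right, how="left", on="datetime", validate="many_to_one")

print(merged[["datetime", "offensegroup", "Temperature"]])
```

Note that the merged frame in this sketch has a plain integer index; if a `DatetimeIndex` is wanted afterwards, it would have to be set again from the `datetime` column.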


In [108]:
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 282475 entries, 2010-06-11 04:00:00 to 2018-06-11 03:00:00
Data columns (total 44 columns):
NEIGHBORHOOD_CLUSTER    278891 non-null object
CENSUS_TRACT            281632 non-null float64
offensegroup            282475 non-null object
LONGITUDE               282475 non-null float64
offense-text            282475 non-null object
SHIFT                   282475 non-null object
YBLOCK                  282475 non-null float64
DISTRICT                282313 non-null float64
WARD                    282463 non-null float64
YEAR                    282475 non-null int64
offensekey              282475 non-null object
BID                     46124 non-null object
sector                  282296 non-null object
PSA                     282296 non-null float64
ucr-rank                282475 non-null int64
BLOCK_GROUP             281632 non-null object
VOTING_PRECINCT         282414 non-null object
XBLOCK                  282475 non-null float64
BLOCK

# --------------------Exploration of the dataset and preparation of the dataset for Predictions--------------------

In [109]:
df['offensegroup'].value_counts()

property    234352
violent      48123
Name: offensegroup, dtype: int64

1. Elimination of extra possible target variables:

In [110]:
df = df.drop(columns = ['METHOD', 'CCN', 'offensekey',
                       'offense-text','OFFENSE', 'BID', 'ucr-rank'])

2. Elimination of NaN values. We found that dropping NaNs yields better results than imputing missing values:

In [111]:
df = df.dropna()

# Note: dropping NaNs yielded better results than imputing the missing values.

3. Selecting X and y and encoding categorical values:

In [112]:
X = df.drop(columns = ['offensegroup'])

y = df['offensegroup']

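# Reduced list of feature names, kept for reference only. It is stored under its own
# (illustrative) name so that it does not overwrite the feature DataFrame X defined above.
candidate_features = ['NEIGHBORHOOD_CLUSTER', 'CENSUS_TRACT','LONGITUDE',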
       'SHIFT', 'YBLOCK', 'DISTRICT', 'WARD', 'YEAR', 'sector', 'PSA',
       'BLOCK_GROUP', 'VOTING_PRECINCT', 'XBLOCK', 'BLOCK', 'ANC', 'datetime',
       'location', 'LATITUDE', 'endblock', 'street', 'year', 'month', 'day', 'hour', 'minute',
       'second', 'Year', 'Month', 'Day', 'Hour', 'Temperature',]

In [113]:
def encoder(data):
    """This function will encode multiple features using LabelEncoder"""
    # LabelEncoder only supports one-dimensional input, so each column is encoded separately.
    for colname, col in data.items(): # adapted from Stack Overflow
        # Every column is cast to a string first, so from here on all X features
        # are encoded in a way that cannot be interpreted numerically.
        data[colname] = LabelEncoder().fit_transform(col.astype(str))
    
    return (data)
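
One consequence of casting every column to a string before encoding is that numeric columns are ordered lexicographically rather than numerically. The toy example below (the values are made up) shows the effect with `LabelEncoder`:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy numeric column, e.g. an hour-of-day feature
hours = pd.Series([2, 9, 10])

# Same transformation as in encoder(): cast to str, then label-encode
le = LabelEncoder()
codes = le.fit_transform(hours.astype(str))

# Classes are sorted as strings, so '10' comes before '2' and '9'
print(list(le.classes_))
print(list(codes))
```

This does not matter much for tree-based models, but distance-based and linear models may treat the resulting codes as if their order were meaningful.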

In [114]:
X = encoder(X)
y = LabelEncoder().fit_transform(y)

In [115]:
X['datetime'].nunique()

65006

### Using yellowbrick to study the most important features of this dataset:

Our target variable will be the offense group, which classifies each offense as violent or property (non-violent).

In [116]:
df.columns

Index(['NEIGHBORHOOD_CLUSTER', 'CENSUS_TRACT', 'offensegroup', 'LONGITUDE',
       'SHIFT', 'YBLOCK', 'DISTRICT', 'WARD', 'YEAR', 'sector', 'PSA',
       'BLOCK_GROUP', 'VOTING_PRECINCT', 'XBLOCK', 'BLOCK', 'ANC', 'datetime',
       'location', 'LATITUDE', 'splitted', 'startblock', 'endblock_1',
       'endblock', 'street', 'year', 'month', 'day', 'hour', 'minute',
       'second', 'Year', 'Month', 'Day', 'Hour', 'Temperature',
       'Precipitation', 'Snowfall'],
      dtype='object')

In [117]:
features = [
    'NEIGHBORHOOD_CLUSTER', 'CENSUS_TRACT','LONGITUDE',
       'SHIFT', 'YBLOCK', 'DISTRICT', 'WARD', 'YEAR', 'sector', 'PSA',
       'BLOCK_GROUP', 'VOTING_PRECINCT', 'XBLOCK', 'BLOCK', 'ANC', 'datetime',
       'location', 'LATITUDE', 'splitted', 'startblock', 'endblock_1',
       'endblock', 'street', 'year', 'month', 'day', 'hour', 'minute',
       'second', 'Year', 'Month', 'Day', 'Hour', 'Temperature',
       'Precipitation', 'Snowfall'
]

Xi = df[features]
Xi = encoder(Xi)
yi = df['offensegroup']

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  


In [222]:
from sklearn.ensemble import GradientBoostingClassifier
from yellowbrick.features.importances import FeatureImportances
fig = plt.figure()
ax = fig.add_subplot()

viz = FeatureImportances(GradientBoostingClassifier(), ax=ax)
viz.fit(Xi, yi)
viz.poof()

In [223]:
with open('Feature_importance.pickle', 'wb') as figure1:
    
    pickle.dump([fig,ax, viz] , figure1)

In [None]:
with open('Feature_importance.pickle', 'rb') as fig1:
    
    object_list = pickle.load(fig1)

In [240]:
y_ = y.reshape(-1, 1)

In [241]:
from sklearn.cluster import KMeans
wcss = []

for i in range(1,11):
    kmeans = KMeans(n_clusters = i, random_state=0, max_iter=300, init='k-means++', n_init=10)
    kmeans.fit(y_)
    wcss.append(kmeans.inertia_)
plt.plot(range(1,11), wcss)
plt.title('Elbow')
plt.xlabel('Number of Clusters')
plt.ylabel('wcss')
plt.show()

In [242]:
from sklearn.cluster import MiniBatchKMeans

from yellowbrick.cluster import KElbowVisualizer

visualizer = KElbowVisualizer(MiniBatchKMeans(), k=(1,10))

visualizer.fit(y_) # Fit the training data to the visualizer
visualizer.poof() # Draw/show/poof the data

#    --------------------MODEL SELECTION---------------------------------



In [118]:
from sklearn.pipeline import Pipeline
from yellowbrick.classifier import ClassificationReport
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC, NuSVC, SVC
from sklearn.linear_model import LogisticRegressionCV, LogisticRegression, SGDClassifier
from sklearn.ensemble import BaggingClassifier, ExtraTreesClassifier, RandomForestClassifier

In [119]:
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.2,random_state= 0)

In [120]:
scaler = StandardScaler()
#scaler =  MinMaxScaler()
#scaler = Normalizer()
X_train = scaler.fit(X_train).transform(X_train)
# Reuse the statistics learned from the training set; do not refit on the test set.
X_test = scaler.transform(X_test)
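
One detail worth checking in the scaling step is that a scaler is normally fitted on the training data only and then reused on the test data. The sketch below illustrates the difference on toy numbers (the arrays are made up for illustration):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature column (illustrative values only)
train = np.array([[0.0], [2.0], [4.0]])
test = np.array([[10.0], [12.0]])

# Fit on the training data only, then reuse the same mean and std for the test data
shared_scaler = StandardScaler().fit(train)
test_shared = shared_scaler.transform(test)

# Refitting on the test data re-centres it around its own mean
test_refit = StandardScaler().fit(test).transform(test)

print(test_shared.ravel())  # far from zero: the test values lie above the training range
print(test_refit.ravel())   # centred on zero, so the shift relative to the training data is lost
```

When the scaler is refitted on the test set, the model sees test features on a different scale from the one it was trained on, which can distort the reported scores.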

In [121]:
def classifier_graph(classifier):
    classes = ['Nonviolent','Violent'] # LabelEncoder maps 'property' to 0 (nonviolent) and 'violent' to 1
    model = classifier
    visualizer = ClassificationReport(model, classes=classes)



    visualizer.fit(X_train, y_train)  # Fit the visualizer and the model
    visualizer.score(X_test, y_test)  # Evaluate the model on the test data
    g = visualizer.poof()             # Draw/show/poof the data

In [122]:
y__ = pd.Series(y)

In [123]:
y__.value_counts()

0    230621
1     47182
dtype: int64
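
The class counts are imbalanced, which gives a useful reference point for the classification reports that follow: a model that always predicts the majority class already reaches a high accuracy. A short calculation from the counts above:

```python
# Class counts taken from the value_counts() output above
counts = {0: 230621, 1: 47182}

total = sum(counts.values())
majority_share = max(counts.values()) / total

# Accuracy of a model that always predicts the majority class (about 0.83)
print(round(majority_share, 3))
```

For this reason, per-class precision and recall for the violent class are more informative here than overall accuracy.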

In [None]:
classifier_graph(KNeighborsClassifier(n_neighbors = 10, metric = 'manhattan', weights = 'distance', algorithm = 'auto'))

In [44]:
classifier_graph(GaussianNB())

In [265]:
classifier_graph(LinearSVC())

In [None]:
classifier_graph(SVC(C= 100, gamma = 0.01, kernel = 'rbf'))

In [249]:
classifier_graph(LogisticRegressionCV())

In [250]:
classifier_graph(LogisticRegression())

In [251]:
classifier_graph(SGDClassifier())

In [254]:
classifier_graph(BaggingClassifier())

In [255]:
classifier_graph(ExtraTreesClassifier())

In [264]:
classifier_graph(RandomForestClassifier(n_estimators=300))



# ________________USING ANN__________
 

In [46]:
import keras
import tensorflow
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.utils import to_categorical
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from keras.utils import np_utils

  from ._conv import register_converters as _register_converters
Using TensorFlow backend.


In [48]:
y = y.reshape(-1, 1)
onehotencoder = OneHotEncoder()
y = onehotencoder.fit_transform(y).toarray()
y = pd.DataFrame(y)


In [49]:
y.columns = ['1', '2']

In [50]:
y.head()

Unnamed: 0,1,2
0,0.0,1.0
1,0.0,1.0
2,1.0,0.0
3,1.0,0.0
4,1.0,0.0


In [51]:
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.2,random_state= 0)

In [52]:
X_train.shape

(222242, 36)

In [53]:
scaler = StandardScaler()
#scaler =  MinMaxScaler()
#scaler = Normalizer()
X_train = scaler.fit(X_train).transform(X_train)
# Reuse the statistics learned from the training set; do not refit on the test set.
X_test = scaler.transform(X_test)

Finding the best learning rates:

In [101]:
from keras.optimizers import SGD

def ann_model(input_dim = 36):
    model = Sequential()
    model.add(Dense(units=100, input_dim=input_dim))
    model.add(Activation('relu'))
    model.add(Dense(units=100))
    model.add(Activation('relu'))
    model.add(Dense(units=100))
    model.add(Activation('relu'))
    model.add(Dense(units=100))
    model.add(Activation('relu'))
    model.add(Dense(units=2))
    model.add(Activation('softmax'))
    return(model)
learning_rate = [0.000001,0.01,1]

for lr in learning_rate:
    
    print('Testing for learning rate {}'.format(lr))
    model = ann_model()
    my_optimizer = SGD(lr = lr)
    model.compile(optimizer = my_optimizer, loss = 'categorical_crossentropy', metrics = ['accuracy'] )
    model.fit(X_train, y_train, batch_size = 500, epochs = 100)

Testing for learning rate 1e-06
Epoch 1/100
...
Epoch 75/100
[output truncated]
Epoch 60/100
...
Epoch 100/100
Testing for learning rate 1
Epoch 1/100
...
Epoch 34/100
[output truncated]

def sequential_experiments(n_layers = 1, n_nodes = 100, input_dim = 36):
    """Build, compile and fit a dense network with n_layers hidden layers of n_nodes units each."""

    model = Sequential()

    # Only the first layer needs input_dim; later layers infer their input shape.
    model.add(Dense(units=n_nodes, input_dim=input_dim))
    model.add(Activation('relu'))

    # Remaining hidden layers
    for _ in range(n_layers - 1):
        model.add(Dense(units=n_nodes))
        model.add(Activation('relu'))

    # Output layer: one unit per class
    model.add(Dense(units=2))
    model.add(Activation('softmax'))

    model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
    model.fit(X_train, y_train, batch_size = 500, epochs = 100)
    return model
    
    
This function will be able to run many different combinations of nodes and layers. The idea is that when it finds the model with the best score, it will save that model to disk.
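The search described above could be organized roughly as follows. This is only a sketch: `search_architectures`, `train_and_score` and `save_model` are hypothetical names that do not appear in the notebook, and the toy scoring function exists only so the loop can run without Keras.

```python
from itertools import product


def search_architectures(train_and_score, layer_options, node_options, save_model=None):
    """Try every (n_layers, n_nodes) pair and keep the best-scoring model.

    train_and_score is a hypothetical callable returning (model, score);
    save_model is an optional hypothetical callable that persists a model.
    """
    best = {"score": float("-inf"), "params": None, "model": None}
    for n_layers, n_nodes in product(layer_options, node_options):
        model, score = train_and_score(n_layers, n_nodes)
        if score > best["score"]:
            best = {"score": score, "params": (n_layers, n_nodes), "model": model}
            if save_model is not None:
                save_model(model)  # overwrite the saved copy with the new best
    return best


# Toy stand-in so the sketch runs without Keras: the "score" is made up.
def fake_train_and_score(n_layers, n_nodes):
    score = -abs(n_layers - 3) - abs(n_nodes - 250) / 100
    return "model-%d-%d" % (n_layers, n_nodes), score


best = search_architectures(fake_train_and_score, [1, 3, 5], [100, 250, 350])
print(best["params"])
```

In the notebook, `train_and_score` would presumably wrap `sequential_experiments` plus an evaluation on held-out data, and `save_model` would call `model.save(...)`.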

Try with adam optimizer:

In [54]:
from keras.callbacks import EarlyStopping
# Note: early_stopping is defined here but never passed to fit(), so all epochs run.
early_stopping = EarlyStopping(patience = 3)

model = Sequential()

# Seven hidden layers of 350 ReLU units; only the first layer needs input_dim.
model.add(Dense(units=350, input_dim=36))
model.add(Activation('relu'))
model.add(Dense(units=350))
model.add(Activation('relu'))
model.add(Dense(units=350))
model.add(Activation('relu'))
model.add(Dense(units=350))
model.add(Activation('relu'))
model.add(Dense(units=350))
model.add(Activation('relu'))
model.add(Dense(units=350))
model.add(Activation('relu'))
model.add(Dense(units=350))
model.add(Activation('relu'))

# Two-class softmax output.
model.add(Dense(units=2))
model.add(Activation('softmax'))

model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
model.fit(X_train, y_train, batch_size = 500, epochs = 1000)


Epoch 1/1000
...
Epoch 1000/1000


<keras.callbacks.History at 0x7feb1df91860>

In [55]:
y_predict = model.predict(X_test)
y_test = y_test.astype(float)
# Binarize the softmax outputs at 0.5.
y_predict = y_predict >= 0.5

In [56]:
y_traindf = pd.DataFrame(y_train)

Creating confusion matrix:

In [57]:
from sklearn.metrics import confusion_matrix, accuracy_score
# Convert the one-hot rows to class indices before building the matrix.
cf = confusion_matrix(y_test.values.argmax(axis=1), y_predict.argmax(axis=1))
#accuracy = accuracy_score(y_test_white, y_predict)
cf

array([[41462,  4845],
       [ 7048,  2206]])

In [58]:
accuracy = classification_report(y_test, y_predict)

In [59]:
print(accuracy) # 0 = violent, 1 = non-violent; the report relabels my columns 1 and 2 as 0 and 1

             precision    recall  f1-score   support

          0       0.85      0.90      0.87     46307
          1       0.31      0.24      0.27      9254

avg / total       0.76      0.79      0.77     55561
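As a cross-check, the figures in the report can be recomputed by hand from the confusion matrix printed earlier. The sketch below uses plain Python; `per_class_metrics` is a hypothetical helper name, and rows are taken as the true class and columns as the predicted class, following scikit-learn's convention.

```python
# Confusion matrix copied from the output above:
# rows = true class, columns = predicted class.
cf = [[41462, 4845],
      [7048, 2206]]


def per_class_metrics(cf, k):
    """Return (precision, recall, f1) for class k."""
    tp = cf[k][k]
    predicted_k = sum(row[k] for row in cf)  # column sum
    actual_k = sum(cf[k])                    # row sum
    precision = tp / predicted_k
    recall = tp / actual_k
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


for k in (0, 1):
    p, r, f = per_class_metrics(cf, k)
    print(k, round(p, 2), round(r, 2), round(f, 2))

total = sum(sum(row) for row in cf)
accuracy = (cf[0][0] + cf[1][1]) / total
print(round(accuracy, 3))
```

For class 1 this gives a precision of 2206 / (4845 + 2206) and a recall of 2206 / (7048 + 2206), which round to the 0.31 and 0.24 shown in the report. The overall accuracy is close to 0.79, but most of it comes from class 0.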



Saving the model to disk:

In [60]:
from keras.models import load_model
model.save('ANN_trained_model_1.h5')

In [None]:
ann_model = load_model('ANN_trained_model_1.h5')
y_predict = ann_model.predict(X_test)
y_test = y_test.astype(float)
# Lower cutoff of 0.15; note that it is applied to both softmax columns.
y_predict = y_predict > 0.15
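One caveat with the lower cutoff: because it is applied to both softmax columns, a later `argmax(axis=1)` sees rows such as `[True, True]` and returns the first column, so class 1 is predicted less often rather than more. A minimal sketch of the difference, using made-up probabilities and hypothetical helper names, and assuming the intent is to flag class 1 more readily:

```python
# Hypothetical softmax outputs: each row is [p(class 0), p(class 1)].
probs = [[0.90, 0.10],
         [0.80, 0.20],
         [0.60, 0.40],
         [0.30, 0.70]]


def argmax(row):
    # Index of the first maximum, like numpy's argmax.
    return max(range(len(row)), key=lambda i: row[i])


def threshold_both_then_argmax(probs, threshold):
    # Mirrors (y_predict > threshold).argmax(axis=1): ties go to column 0.
    return [argmax([p > threshold for p in row]) for row in probs]


def predict_with_threshold(probs, threshold):
    # Predict class 1 whenever its own probability exceeds the threshold.
    return [1 if row[1] > threshold else 0 for row in probs]


print(threshold_both_then_argmax(probs, 0.15))
print(predict_with_threshold(probs, 0.15))
print(predict_with_threshold(probs, 0.50))
```

Thresholding only the class 1 column trades precision for recall on that class. Whether 0.15 is a sensible cutoff for this model would need to be checked on validation data.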

In [61]:
from keras.callbacks import EarlyStopping
# Note: early_stopping is defined here but never passed to fit(), so all epochs run.
early_stopping = EarlyStopping(patience = 3)

model = Sequential()

# Three hidden layers of 350 ReLU units; only the first layer needs input_dim.
model.add(Dense(units=350, input_dim=36))
model.add(Activation('relu'))
model.add(Dense(units=350))
model.add(Activation('relu'))
model.add(Dense(units=350))
model.add(Activation('relu'))

# Two-class softmax output.
model.add(Dense(units=2))
model.add(Activation('softmax'))

model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
model.fit(X_train, y_train, batch_size = 50, epochs = 700)


Epoch 1/700
...
Epoch 700/700


<keras.callbacks.History at 0x7feb1abc4438>

In [65]:
from keras.models import load_model
model.save('ANN_trained_model_2.h5')

In [75]:
ann_model = load_model('ANN_trained_model_2.h5')
y_predict = ann_model.predict(X_test)
y_test = y_test.astype(float)
# Binarize the softmax outputs at 0.5.
y_predict = y_predict > 0.50

In [76]:
from sklearn.metrics import confusion_matrix, accuracy_score
# Convert the one-hot rows to class indices before building the matrix.
cf = confusion_matrix(y_test.values.argmax(axis=1), y_predict.argmax(axis=1))
#accuracy = accuracy_score(y_test_white, y_predict)
cf

array([[43659,  2648],
       [ 7645,  1609]])

In [77]:
accuracy = classification_report(y_test, y_predict)

In [78]:
print(accuracy) # 0 = violent, 1 = non-violent; the report relabels my columns 1 and 2 as 0 and 1

             precision    recall  f1-score   support

          0       0.85      0.94      0.89     46307
          1       0.38      0.17      0.24      9254

avg / total       0.77      0.81      0.79     55561



Next Steps:
1. Choose the best models and use k-fold cross-validation for the train and test sets.
   Also check for features that may introduce leakage. (Sunday)
2. Hyperparameter tuning and dimensionality reduction. (Sunday)
3. Try with the top 50 restaurant locations.
4. Create an API using Flask and a self-updating database (psycopg2) with the same columns as our wrangled CSV file. (Monday-Friday)
5. Create more visualizations. (next week)
6. Write the paper. (next week)
7. Heat map with clusters from restaurant addresses.
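For the k-fold step, the splitting itself can be sketched in plain Python as below. `kfold_indices` is a hypothetical helper written for illustration; in practice `sklearn.model_selection.KFold` or `cross_val_score` (already imported at the top of the notebook) would normally be used instead. This sketch does not shuffle the rows.

```python
def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


folds = list(kfold_indices(10, 3))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Each row ends up in the test fold exactly once, so averaging the per-fold scores gives a less optimistic estimate than a single train/test split.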

In [None]:
# Choose and tune the model.
# How do we give more weight to the poorly performing class so that it performs better?
# We expect the model to improve with more user and police data entries, since most crimes are not reported.
# Add a model-update step.

# DBSCAN clustering.
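One common answer to the class-weight question above is the "balanced" heuristic, which weights each class inversely to its frequency. The sketch below is illustrative only: `balanced_class_weights` is a hypothetical helper, and the counts are taken from the support column of the reports above (the test split), since the training-set counts are not shown.

```python
# Class counts from the "support" column of the classification reports (test split);
# the training split would be counted the same way.
counts = {0: 46307, 1: 9254}


def balanced_class_weights(counts):
    # weight_c = n_samples / (n_classes * count_c)
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}


class_weight = balanced_class_weights(counts)
print(class_weight)
```

Keras' `fit` accepts a `class_weight` dictionary of this shape, so a mapping like this could be passed as `model.fit(..., class_weight=class_weight)`. Whether it actually improves recall on the minority class here is untested.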

In [None]:
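# Placeholder values; a real request must be encoded into the same 36 numeric features used for training.
new_data = {'block': 'xxxx', 'location': 'xxxxxx'}
new_data = pd.DataFrame([new_data])  # wrap in a list: a dict of scalars needs an index
new_data = new_data.values  # .values is an attribute, not a method

model.predict(new_data)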