# Problem

## Vancouver crime information

Select all crimes committed after 14:00 h.

Using the given sqlite3 connection:
- Store all the crimes committed after 18:00 h in a `late_crimes` variable.
- Store the number of crimes committed in the month with the most crimes in a `dangerous_month_crimes` variable.


# Solution

In [56]:
import numpy as np
import pandas as pd
import sqlite3

import os

In [57]:
# create a fresh on-disk db (any stale copy is deleted first) and connect to it
if os.path.exists('files/van_crime.db'):
    os.remove('files/van_crime.db')

conn = sqlite3.connect('files/van_crime.db')

# create a cursor
c = conn.cursor()

# restore the given van_crime_2003.sql dump
c.executescript(open('files/van_crime_2003.sql', 'r').read())

<sqlite3.Cursor at 0x183a72e5a40>
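The restore step above can be tried in isolation. The following is a minimal sketch that assumes a tiny, made-up dump string in place of `files/van_crime_2003.sql` (whose contents are not reproduced here); `DUMP_SQL`, `restore_dump`, and `demo_conn` are hypothetical names used only for illustration.

```python
import sqlite3

# Hypothetical miniature dump; the real van_crime_2003.sql is NOT reproduced here.
DUMP_SQL = """
CREATE TABLE van_crimes (TYPE TEXT, MONTH INTEGER, HOUR REAL);
INSERT INTO van_crimes VALUES ('Theft from Vehicle', 11, 16.0);
INSERT INTO van_crimes VALUES ('Theft of Vehicle', 9, 21.0);
INSERT INTO van_crimes VALUES ('Theft from Vehicle', 6, 9.0);
"""


def restore_dump(sql_text, path=":memory:"):
    """Open a SQLite database at `path` and run every statement in `sql_text`."""
    conn = sqlite3.connect(path)
    # executescript runs several ';'-separated statements in one call
    conn.executescript(sql_text)
    return conn


demo_conn = restore_dump(DUMP_SQL)
row_count = demo_conn.execute("SELECT COUNT(*) FROM van_crimes").fetchone()[0]
demo_conn.close()
```

Using `":memory:"` keeps the sketch free of files on disk; passing a file path instead would mirror the notebook's on-disk database.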

In [58]:
van_crimes_df = pd.read_sql('SELECT * FROM van_crimes WHERE hour > 14', conn)
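Because `HOUR` stores only the hour, `hour > 14` keeps rows whose hour is 15 or later (SQLite matches column names case-insensitively, so `hour` resolves to `HOUR`). As a hedged sketch of the same filter with the threshold passed as a bound parameter rather than written into the SQL string, assuming a small in-memory table with made-up rows (`conn_demo`, `min_hour`, and `afternoon_df` are illustrative names):

```python
import sqlite3

import pandas as pd

# Toy in-memory table; the rows are invented for illustration only.
conn_demo = sqlite3.connect(":memory:")
conn_demo.execute("CREATE TABLE van_crimes (TYPE TEXT, HOUR REAL)")
conn_demo.executemany(
    "INSERT INTO van_crimes VALUES (?, ?)",
    [("A", 9.0), ("B", 14.0), ("C", 15.0), ("D", 21.0)],
)

min_hour = 14
# The '?' placeholder is filled from `params` by the sqlite3 driver
afternoon_df = pd.read_sql(
    "SELECT * FROM van_crimes WHERE HOUR > ?", conn_demo, params=(min_hour,)
)
conn_demo.close()
```

Row "B" (hour 14) is excluded by the strict comparison, which matches the behavior of the notebook's query.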

In [59]:
van_crimes_df.head()

Unnamed: 0,TYPE,YEAR,MONTH,DAY,HOUR,MINUTE,HUNDRED_BLOCK,NEIGHBOURHOOD,X,Y
0,Theft from Vehicle,2003,11,17.0,16.0,0.0,56XX OAK ST,South Cambie,490682.32,5453536.96
1,Theft from Vehicle,2003,12,28.0,16.0,45.0,85XX STANLEY PARK DR,Stanley Park,489104.19,5460347.36
2,Theft from Vehicle,2003,12,12.0,15.0,30.0,85XX STANLEY PARK DR,Stanley Park,489104.19,5460347.36
3,Theft from Vehicle,2003,11,5.0,16.0,30.0,85XX STANLEY PARK DR,Stanley Park,489104.19,5460347.36
4,Theft of Vehicle,2003,9,2.0,21.0,0.0,20XX E 28TH AVE,Kensington-Cedar Cottage,495267.03,5454779.05


In [60]:
van_crimes_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 126 entries, 0 to 125
Data columns (total 10 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   TYPE           126 non-null    object 
 1   YEAR           126 non-null    int64  
 2   MONTH          126 non-null    int64  
 3   DAY            126 non-null    float64
 4   HOUR           126 non-null    float64
 5   MINUTE         126 non-null    float64
 6   HUNDRED_BLOCK  126 non-null    object 
 7   NEIGHBOURHOOD  126 non-null    object 
 8   X              126 non-null    float64
 9   Y              126 non-null    float64
dtypes: float64(5), int64(2), object(3)
memory usage: 10.0+ KB


Store all the crimes committed after 18:00 h in a `late_crimes` variable.

In [61]:
late_crimes = van_crimes_df.loc[van_crimes_df['HOUR'] > 18]
late_crimes.head()

Unnamed: 0,TYPE,YEAR,MONTH,DAY,HOUR,MINUTE,HUNDRED_BLOCK,NEIGHBOURHOOD,X,Y
4,Theft of Vehicle,2003,9,2.0,21.0,0.0,20XX E 28TH AVE,Kensington-Cedar Cottage,495267.03,5454779.05
5,Theft from Vehicle,2003,9,27.0,22.0,30.0,85XX STANLEY PARK DR,Stanley Park,489104.19,5460347.36
6,Theft from Vehicle,2003,12,17.0,21.0,0.0,31XX WILLOW ST,Fairview,491115.72,5456039.96
7,Theft from Vehicle,2003,9,1.0,20.0,10.0,85XX STANLEY PARK DR,Stanley Park,489104.19,5460347.36
8,Theft from Vehicle,2003,8,17.0,19.0,0.0,85XX STANLEY PARK DR,Stanley Park,489104.19,5460347.36


In [62]:
late_crimes.shape

(57, 10)
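Filtering with a boolean mask keeps the original index labels, which is why the late crimes shown above start at label 4 rather than 0. A minimal sketch on a made-up five-row frame (`toy_df` and the other names are hypothetical):

```python
import pandas as pd

# Invented hours, only to illustrate the indexing behavior
toy_df = pd.DataFrame({"HOUR": [16.0, 21.0, 15.0, 20.0, 19.0]})

late_mask = toy_df["HOUR"] > 18          # boolean Series, one entry per row
toy_late = toy_df.loc[late_mask]         # keeps rows where the mask is True

kept_labels = toy_late.index.tolist()    # original labels survive the filter
renumbered = toy_late.reset_index(drop=True)  # optional: relabel from 0
```

Label-based lookups such as `toy_late.loc[3, "HOUR"]` therefore refer to the row's label in the unfiltered frame, not to its position in the filtered one.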

Store the number of crimes committed in the month with the most crimes in a `dangerous_month_crimes` variable.

In [63]:
van_crimes_df['MONTH'].value_counts()

MONTH
6     17
5     16
12    14
9     11
8     11
1     10
7      9
10     9
3      9
4      9
2      8
11     3
Name: count, dtype: int64

In [64]:
# The month with the most crimes is 6

dangerous_month_crimes = van_crimes_df['MONTH'].value_counts().head(1)

In [65]:
dangerous_month_crimes.values

array([17], dtype=int64)
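`value_counts()` sorts counts in descending order, so `head(1)` returns a one-element Series whose index is the month and whose value is the count. As an alternative sketch, `idxmax()` and `max()` give the two pieces as plain scalars; the data below is made up and the variable names are hypothetical.

```python
import pandas as pd

# Invented months: 6 appears three times, 5 twice, 12 once
toy_months = pd.Series([6, 6, 6, 5, 5, 12], name="MONTH")

counts = toy_months.value_counts()   # month -> number of rows, descending
top = counts.head(1)                 # one-element Series, as in the notebook

busiest_month = counts.idxmax()      # index label with the largest count
busiest_count = int(counts.max())    # the largest count as a plain int
```

Keeping the one-element Series, as the notebook does, preserves both the month and the count in a single object; the scalar form is simpler when only the number is needed.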

In [66]:
c.close()
conn.close()

# Tests

In [67]:
def test_crimes_1():
    return van_crimes_df.shape == (126, 10)

def test_crimes_2():
    return van_crimes_df.loc[14, 'HOUR'] == 23

def test_dangerous_month_1():
    return dangerous_month_crimes.values[0] == 17

def test_dangerous_month_2():
    return dangerous_month_crimes.index[0] == 6

def test_late_crimes_1():
    return late_crimes.shape == (57, 10)

def test_late_crimes_2():
    return late_crimes.loc[7, 'HOUR'] == 20
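The test functions above return booleans instead of raising, so each result has to be inspected by eye. One possible way to collect them in a single step is sketched below; `run_checks` and the two stand-in checks are hypothetical helpers, not part of the exercise.

```python
def run_checks(checks):
    """Call each zero-argument check and map its name to a bool result."""
    return {check.__name__: bool(check()) for check in checks}


# Hypothetical stand-ins for the notebook's test_* functions
def check_passes():
    return 1 + 1 == 2


def check_fails():
    return 1 + 1 == 3


results = run_checks([check_passes, check_fails])
failed = [name for name, ok in results.items() if not ok]
```

In the notebook, the same helper could be called as `run_checks([test_crimes_1, test_crimes_2, ...])`, with an empty `failed` list meaning every check returned `True`.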

In [68]:
test_crimes_1()

True

In [69]:
test_crimes_2()

True

In [70]:
test_dangerous_month_1()

True

In [71]:
test_dangerous_month_2()

True

In [72]:
test_late_crimes_1()

True

In [73]:
test_late_crimes_2()

True