# Project 3: Data and Maps!

Thanks to John P. Dickersion for the project idea!

**Posted:** Nov 7th, 2019.

**Due:** Nov. 26th, 2019.

In this project we are going to work with a fairly clean set of data from Baltimore crime data covering the years 2011 and 2012.  This is a fairly open ended project, you will need to work with the data a bit and come up with your own things to show.

In [30]:
# Includes and Standard Magic...
### Standard Magic and startup initializers.

# Load Numpy
import numpy as np
# Load MatPlotLib
import matplotlib
import matplotlib.pyplot as plt
# Load Pandas
import pandas as pd
# Load Stats
from scipy import stats
# import folium TODO: add folium to path
import re

import datetime as dt

# This lets us show plots inline and also save PDF plots if we want them
%matplotlib inline
matplotlib.style.use('fivethirtyeight')

## Part 1: Data Wrangling.

The data is a bit messy to start out with.  Perform the following tasks to make it clean and tidy.

1. Split the `Location 1` column into a `lat` and `long` columns.  Ensure that the columns are of float type and you drop any record that is missing a location.
2. You can drop the `arrest`, `post`, `charge`, and the `Location 1` column.
3. Merge the date and time column and make sure they are the proper type.  Drop any row that does not have a date and time.
4. Set the index so that we can sort and slice based on the date/time.
5. Drop any records that have NA values.
6. Go through the remaining columns and ensure you have set the dtype properly.
7. Display the head of the table and the dtypes in your notebook.





### Code Description:
First I dropped the na values using dropna
To create latitude and longituted columns, I defined functions to split and clean up the `Location 1` entries. Then I used .apply() to create new lat and long columns.

To merge datetimes into one column, I created a for loop to split date and time strings then create datetime objects with them. The for loop created entries of a new column in the dataframe.

I then set the index as the datetime columns and dropped unneccessary columns.

In [27]:
#Read in dataframe
raw_df = pd.read_csv("./BPD_Arrests.csv")

#drop NaN location values
raw_df.dropna(axis = 0, inplace = True)

#define function to return latitude (first value)
def get_lat(lat_long):
    temp_array = re.split(" ", str(lat_long))
    return float(re.sub(r"\(|\,", "", temp_array[0]))

#define function to get longitute (second value)
def get_long(lat_long):
    temp_array = re.split(" ", str(lat_long))
    return float(re.sub(r"\)", "", temp_array[1]))

# make colums for latitude and longitude
raw_df["Latitude"] = raw_df["Location 1"].apply(get_lat)
raw_df["Longitude"] = raw_df["Location 1"].apply(get_long)

# Drop the arrest, post, charge, and the Location 1 column
raw_df.drop(axis = 1, columns = ["arrest", "post", "charge", "Location 1"], inplace = True)

Unnamed: 0,age,race,sex,arrestDate,arrestTime,arrestLocation,incidentOffense,incidentLocation,chargeDescription,district,neighborhood,Latitude,Longitude
1,37,B,M,01/01/2011,00:01:00,2000 Wilkens Ave,79-Other,Wilkens Av & S Payson St,Reckless Endangerment || Hand Gun Violation,SOUTHERN,Carrollton Ridge,39.281403,-76.648364
3,50,B,M,01/01/2011,00:04:00,2100 Ashburton St,79-Other,2100 Ashburton St,Reg Firearm:Illegal Possession || Hgv,WESTERN,Panway/Braddish Avenue,39.31172,-76.662355
5,41,B,M,01/01/2011,00:05:00,2900 Spellman Rd,81-Recovered Property,2900 Spelman Rd,Reckless Endangerment || Handgun Violation,SOUTHERN,Cherry Hill,39.244989,-76.627358
6,29,B,M,01/01/2011,00:05:00,800 N Monroe St,79-Other,800 N Monroe St,Handgun On Person || Handgun Violation,WESTERN,Midtown-Edmondson,39.297982,-76.647511
9,53,B,M,01/01/2011,00:15:00,3300 Woodland Ave,54-Armed Person,3300 Woodland Av,Reckless Endangerment || Hgv,NORTHWESTERN,Central Park Heights,39.343677,-76.67273


In [37]:
# create a new columns using the arrays to initialize datetime objects
for index, row in raw_df.iterrows():
    #split date and time into arrays
    date_array = re.split("/", row["arrestDate"])
    time_array = re.split(":", row["arrestTime"])
    #make a column with datetime object
    raw_df['arrestDateTime'] = dt.datetime(
        int(date_array[2]), 
        int(date_array[0]), 
        int(date_array[1]), 
        int(time_array[0]), 
        int(time_array[1]), 
        int(time_array[2])
    )

# drop arrestTime and arrestDate columns
raw_df.drop(axis = 1, columns = ["arrestTime", "arrestDate"], inplace = True)


Unnamed: 0,age,race,sex,arrestLocation,incidentOffense,incidentLocation,chargeDescription,district,neighborhood,Latitude,Longitude,arrestDateTime
1,37,B,M,2000 Wilkens Ave,79-Other,Wilkens Av & S Payson St,Reckless Endangerment || Hand Gun Violation,SOUTHERN,Carrollton Ridge,39.281403,-76.648364,2012-12-31 23:55:00
3,50,B,M,2100 Ashburton St,79-Other,2100 Ashburton St,Reg Firearm:Illegal Possession || Hgv,WESTERN,Panway/Braddish Avenue,39.31172,-76.662355,2012-12-31 23:55:00
5,41,B,M,2900 Spellman Rd,81-Recovered Property,2900 Spelman Rd,Reckless Endangerment || Handgun Violation,SOUTHERN,Cherry Hill,39.244989,-76.627358,2012-12-31 23:55:00
6,29,B,M,800 N Monroe St,79-Other,800 N Monroe St,Handgun On Person || Handgun Violation,WESTERN,Midtown-Edmondson,39.297982,-76.647511,2012-12-31 23:55:00
9,53,B,M,3300 Woodland Ave,54-Armed Person,3300 Woodland Av,Reckless Endangerment || Hgv,NORTHWESTERN,Central Park Heights,39.343677,-76.67273,2012-12-31 23:55:00


In [41]:
#set index as datetimes
raw_df.set_index(keys = "arrestDateTime")

# check that all columns have the correct dtype
raw_df.dtypes

age                           int64
race                         object
sex                          object
arrestLocation               object
incidentOffense              object
incidentLocation             object
chargeDescription            object
district                     object
neighborhood                 object
Latitude                    float64
Longitude                   float64
arrestDateTime       datetime64[ns]
dtype: object

In [47]:
# display dataframe
raw_df.head()

Unnamed: 0,age,race,sex,arrestLocation,incidentOffense,incidentLocation,chargeDescription,district,neighborhood,Latitude,Longitude,arrestDateTime
1,37,B,M,2000 Wilkens Ave,79-Other,Wilkens Av & S Payson St,Reckless Endangerment || Hand Gun Violation,SOUTHERN,Carrollton Ridge,39.281403,-76.648364,2012-12-31 23:55:00
3,50,B,M,2100 Ashburton St,79-Other,2100 Ashburton St,Reg Firearm:Illegal Possession || Hgv,WESTERN,Panway/Braddish Avenue,39.31172,-76.662355,2012-12-31 23:55:00
5,41,B,M,2900 Spellman Rd,81-Recovered Property,2900 Spelman Rd,Reckless Endangerment || Handgun Violation,SOUTHERN,Cherry Hill,39.244989,-76.627358,2012-12-31 23:55:00
6,29,B,M,800 N Monroe St,79-Other,800 N Monroe St,Handgun On Person || Handgun Violation,WESTERN,Midtown-Edmondson,39.297982,-76.647511,2012-12-31 23:55:00
9,53,B,M,3300 Woodland Ave,54-Armed Person,3300 Woodland Av,Reckless Endangerment || Hgv,NORTHWESTERN,Central Park Heights,39.343677,-76.67273,2012-12-31 23:55:00


### Question 1:
How many records did we drop using our processing above?  Do you think this will affect our data later?  What type of missingness do you think these values have? 

In [60]:
# read in the data again and check the columns
initial_df = pd.read_csv("./BPD_Arrests.csv")

initial_df.dtypes
# initial_df[initial_df["chargeDescription"] == "Violation Of Probation || Violation Of Probation"]["Location 1"].value_counts()

arrest               float64
age                    int64
race                  object
sex                   object
arrestDate            object
arrestTime            object
arrestLocation        object
incidentOffense       object
incidentLocation      object
charge                object
chargeDescription     object
district              object
post                 float64
neighborhood          object
Location 1            object
dtype: object

### Answer 1:
The initial data contained 104528 rows. After we dropped the rows with na values, the dataframe contained 54040 rows. We dropped a little less than half our data. This can lead to our sample having a **larger variance** and/or having a **biased sample.**

### Question 2:
Thinking about the kinds of missing-ness in our data.  What is one imputation method that we could have used to fill in some gaps?  Implement one such method that is not just `dropna`.

### Answer 2:
I infer that the data is *Missing at Random*. In order to account for the missing data. I will use other information to predict missing values. When the lat/long of a row is missing, I will check if the neighboorhood, arrest location, and incident location are blank. If they are, I will discard the entry, because it tells us very little about the arrest. If any of these values are known, I will fill our missing data with a random sample with the same values. This is a combination of bootstrapping and value prediction.

In [65]:
# define a method to add a new row to the dataframe with desired values.
# def add_row(dataframe):
#     row = dataframe.iloc[0]
#     row
for j, row in initial_df.iterrows():
    if (pd.isna(row["Location 1"])):
        # initialize empty strings
        d = ""
        n = ""
        a = ""
        i = ""
        # get information we will use to predict the data
        if (!pd.isna(row["district"])):
            d = row["district"]
        if (!pd.isna(row["neighborhood"])):
            n = row["neighborhood"]
        if (!pd.isna(row["arrestLocation"])):
            a = row["arrestLocation"]
        if (!pd.isna(row["incidentLocation"])):
            i = row["incidentLocation"]
        
        # bootstrap the missing data using known data from rows with like attributes.
        if (d or n or a or i):
            matching_sample = raw_df[raw_df["district" == d] &&
                                    raw_df["neighborhood" == n] &&
                                    raw_df["arrestLocation" == a] &&
                                    raw_df["incidentLocation" == i]]
# initial_df.iloc[0]

0
8
22
34
35
43
44
47
48
49
51
54
58
61
62
64
69
71
74
77
84
85
86
90
93
94
95
97
99
100
102
103
105
108
109
113
114
115
117
119
122
125
126
130
138
140
142
145
150
151
152
153
154
156
157
158
159
160
161
162
163
164
165
166
167
168
170
171
172
173
174
175
176
177
178
179
180
181
182
184
186
187
189
195
196
197
198
199
203
207
209
211
224
226
229
235
239
244
245
246
248
251
253
254
256
257
258
259
260
266
267
268
269
270
271
273
274
275
276
277
278
279
280
281
284
285
291
296
298
305
306
314
315
316
324
327
328
331
332
333
334
335
338
339
340
341
342
343
347
348
353
354
356
357
358
359
370
371
372
373
374
375
377
380
381
382
383
384
385
386
387
390
395
396
405
408
410
411
412
416
422
431
432
433
434
435
436
437
438
439
440
441
443
444
446
447
451
452
453
457
458
459
461
462
463
464
469
474
475
476
478
480
481
483
484
490
491
492
494
500
501
502
503
508
509
518
521
525
526
527
528
529
530
531
537
541
545
546
548
554
556
559
564
567
569
570
572
573
574
575
576
577
578
579
581
582
583
584

4959
4963
4964
4965
4966
4969
4971
4972
4973
4974
4975
4976
4977
4978
4980
4981
4986
4988
4989
4990
4993
4994
4995
4996
4997
4998
4999
5008
5010
5014
5020
5021
5024
5025
5026
5027
5031
5032
5033
5034
5035
5036
5038
5039
5041
5043
5046
5047
5052
5053
5054
5055
5056
5057
5058
5059
5060
5061
5062
5064
5065
5066
5068
5069
5071
5074
5075
5077
5079
5086
5088
5089
5093
5096
5099
5102
5104
5105
5106
5111
5115
5118
5119
5123
5124
5125
5126
5129
5131
5134
5135
5137
5140
5142
5143
5144
5145
5151
5160
5161
5162
5164
5169
5170
5171
5172
5174
5177
5178
5181
5191
5192
5197
5200
5201
5206
5207
5214
5215
5219
5220
5222
5223
5226
5227
5231
5232
5234
5237
5238
5241
5242
5243
5247
5248
5249
5254
5255
5257
5258
5260
5267
5268
5270
5273
5278
5279
5282
5283
5285
5286
5287
5289
5290
5291
5292
5293
5294
5295
5296
5298
5300
5301
5302
5303
5304
5308
5309
5310
5311
5313
5316
5322
5325
5326
5330
5331
5332
5335
5339
5344
5348
5352
5353
5359
5360
5361
5367
5370
5375
5376
5381
5383
5388
5391
5392
5393
5396
5397
5398


9571
9572
9573
9575
9577
9581
9582
9584
9585
9588
9590
9591
9592
9593
9594
9595
9598
9599
9600
9601
9605
9606
9607
9610
9611
9612
9613
9614
9615
9617
9618
9620
9621
9622
9623
9624
9625
9626
9627
9629
9631
9634
9635
9637
9642
9646
9648
9651
9652
9655
9656
9657
9658
9661
9662
9663
9664
9666
9667
9670
9677
9678
9682
9685
9688
9693
9697
9700
9701
9703
9704
9705
9707
9708
9709
9710
9711
9713
9714
9715
9717
9718
9719
9720
9721
9722
9723
9725
9726
9727
9728
9729
9730
9731
9734
9735
9738
9739
9740
9741
9742
9749
9750
9752
9760
9767
9768
9769
9771
9779
9780
9781
9782
9783
9784
9787
9788
9796
9797
9798
9799
9800
9801
9806
9807
9810
9813
9814
9816
9818
9819
9822
9826
9827
9828
9830
9832
9835
9837
9838
9844
9847
9848
9852
9856
9858
9861
9867
9870
9872
9877
9881
9884
9888
9889
9893
9894
9897
9898
9900
9901
9902
9903
9904
9905
9906
9909
9910
9914
9915
9916
9917
9918
9919
9920
9922
9923
9924
9926
9927
9928
9929
9930
9933
9936
9937
9939
9940
9945
9947
9948
9951
9960
9961
9972
9973
9974
9984
9986
9989


12969
12975
12976
12977
12978
12979
12980
12981
12982
12984
12986
12989
12994
12995
12996
12997
12999
13000
13001
13003
13004
13005
13009
13010
13011
13015
13016
13017
13018
13019
13022
13023
13030
13031
13032
13033
13034
13036
13037
13041
13042
13045
13046
13055
13058
13059
13061
13065
13066
13072
13080
13081
13082
13084
13087
13091
13094
13101
13108
13112
13113
13117
13119
13120
13121
13129
13130
13132
13134
13135
13137
13139
13140
13142
13143
13145
13146
13147
13150
13151
13152
13156
13157
13158
13159
13160
13163
13172
13173
13174
13176
13178
13181
13183
13184
13189
13190
13192
13194
13198
13199
13200
13201
13207
13208
13210
13211
13225
13226
13227
13231
13238
13239
13240
13244
13245
13249
13250
13251
13256
13261
13264
13265
13266
13268
13271
13272
13273
13274
13275
13276
13277
13279
13280
13282
13283
13285
13286
13287
13288
13289
13290
13291
13292
13293
13295
13296
13297
13299
13302
13308
13310
13312
13314
13315
13316
13318
13324
13325
13329
13330
13331
13333
13335
13342
13343
1334

16430
16431
16432
16433
16434
16435
16436
16437
16438
16441
16442
16443
16444
16445
16447
16448
16451
16455
16456
16459
16462
16464
16465
16469
16476
16477
16480
16481
16484
16485
16486
16487
16488
16491
16492
16495
16496
16497
16498
16499
16500
16502
16504
16505
16506
16507
16509
16512
16513
16514
16515
16517
16518
16521
16522
16523
16524
16529
16531
16532
16537
16538
16539
16540
16541
16542
16547
16549
16550
16551
16552
16555
16557
16559
16562
16563
16565
16576
16577
16578
16583
16584
16587
16593
16594
16596
16602
16605
16607
16615
16616
16619
16627
16637
16638
16639
16640
16641
16645
16647
16649
16651
16652
16657
16659
16664
16666
16671
16672
16675
16680
16681
16682
16704
16708
16709
16710
16713
16714
16719
16721
16723
16725
16727
16728
16729
16730
16737
16740
16743
16744
16745
16748
16751
16752
16753
16755
16756
16762
16763
16764
16765
16767
16768
16769
16770
16771
16772
16773
16774
16775
16776
16777
16781
16782
16783
16784
16786
16787
16788
16789
16794
16795
16796
16797
16798
1679

21333
21334
21336
21338
21342
21344
21345
21346
21352
21353
21354
21355
21357
21359
21360
21361
21362
21366
21376
21377
21381
21383
21386
21388
21390
21393
21395
21397
21400
21401
21403
21406
21409
21410
21412
21414
21421
21425
21427
21429
21432
21437
21439
21450
21456
21459
21460
21467
21472
21474
21476
21480
21483
21486
21487
21489
21491
21492
21493
21495
21496
21498
21500
21502
21503
21504
21505
21506
21510
21511
21512
21526
21527
21528
21529
21533
21535
21536
21538
21539
21542
21543
21544
21546
21549
21550
21551
21552
21554
21560
21561
21563
21565
21566
21567
21568
21569
21570
21571
21572
21577
21578
21581
21582
21586
21587
21588
21590
21592
21593
21594
21601
21602
21605
21608
21611
21612
21616
21618
21623
21624
21628
21629
21630
21631
21632
21633
21634
21635
21639
21640
21641
21644
21646
21647
21648
21650
21651
21653
21654
21656
21657
21658
21659
21660
21661
21662
21664
21670
21672
21675
21676
21677
21678
21679
21680
21681
21682
21683
21686
21687
21691
21692
21693
21694
21695
2169

26055
26056
26057
26058
26059
26060
26063
26064
26065
26070
26071
26075
26077
26078
26079
26080
26081
26083
26087
26088
26093
26097
26098
26099
26110
26111
26113
26117
26122
26123
26124
26125
26132
26135
26137
26141
26142
26143
26145
26147
26149
26150
26151
26152
26153
26154
26155
26156
26157
26158
26159
26161
26162
26164
26167
26168
26169
26170
26171
26173
26174
26175
26179
26185
26186
26187
26188
26189
26190
26191
26192
26193
26194
26195
26196
26198
26199
26200
26201
26203
26205
26206
26209
26214
26215
26220
26223
26225
26226
26227
26230
26231
26232
26233
26234
26236
26237
26238
26242
26243
26245
26248
26251
26252
26256
26257
26258
26259
26267
26268
26272
26274
26276
26279
26281
26282
26287
26288
26292
26294
26296
26303
26311
26312
26313
26314
26315
26316
26321
26322
26326
26330
26331
26332
26333
26334
26335
26336
26338
26339
26342
26343
26347
26348
26351
26352
26353
26354
26355
26359
26362
26363
26364
26366
26372
26373
26374
26376
26377
26379
26382
26383
26384
26386
26389
26395
2639

29817
29823
29824
29825
29829
29830
29833
29838
29842
29851
29852
29853
29854
29861
29863
29865
29872
29873
29875
29880
29890
29891
29893
29896
29897
29899
29900
29901
29911
29920
29922
29929
29940
29942
29943
29945
29946
29947
29949
29950
29951
29952
29953
29954
29956
29963
29975
29984
29987
29988
29989
29993
29995
30000
30001
30002
30004
30005
30006
30007
30008
30010
30011
30013
30014
30016
30017
30018
30019
30020
30021
30025
30030
30031
30032
30036
30037
30041
30043
30044
30045
30047
30048
30051
30052
30053
30056
30060
30061
30062
30063
30064
30065
30068
30075
30078
30079
30080
30090
30093
30098
30099
30101
30103
30108
30109
30110
30111
30112
30125
30128
30129
30133
30135
30136
30144
30152
30154
30156
30157
30158
30159
30160
30161
30162
30163
30164
30165
30166
30167
30168
30169
30170
30171
30173
30174
30177
30180
30181
30184
30194
30195
30201
30202
30204
30206
30209
30212
30215
30219
30220
30221
30222
30223
30226
30230
30235
30236
30237
30238
30240
30241
30242
30243
30244
30245
3024

34530
34531
34534
34535
34536
34537
34538
34541
34544
34545
34548
34550
34551
34554
34555
34556
34557
34560
34561
34566
34567
34568
34569
34570
34572
34573
34574
34575
34578
34579
34580
34582
34583
34588
34589
34590
34599
34600
34601
34602
34603
34605
34610
34612
34618
34625
34641
34645
34649
34650
34651
34658
34662
34665
34666
34667
34670
34681
34683
34686
34689
34692
34694
34695
34699
34701
34708
34709
34710
34714
34716
34717
34720
34721
34722
34725
34726
34729
34731
34732
34733
34734
34739
34740
34743
34745
34746
34747
34748
34752
34756
34757
34758
34759
34760
34763
34764
34767
34768
34772
34773
34774
34775
34776
34777
34778
34783
34784
34785
34787
34788
34789
34790
34791
34796
34797
34798
34800
34812
34813
34818
34819
34822
34823
34824
34825
34838
34845
34846
34854
34864
34865
34868
34870
34872
34876
34877
34878
34880
34883
34886
34889
34890
34891
34892
34893
34901
34903
34904
34907
34908
34909
34911
34914
34915
34916
34917
34918
34921
34922
34923
34924
34929
34930
34931
34932
3493

39011
39014
39016
39018
39019
39022
39023
39024
39029
39034
39041
39042
39044
39046
39047
39052
39055
39058
39060
39064
39065
39066
39067
39068
39069
39071
39072
39073
39076
39081
39082
39085
39086
39087
39089
39090
39091
39092
39093
39094
39112
39114
39115
39116
39123
39132
39134
39135
39138
39148
39149
39152
39153
39154
39157
39159
39160
39161
39162
39163
39165
39167
39171
39182
39183
39185
39190
39196
39199
39216
39237
39240
39246
39251
39252
39254
39256
39257
39260
39267
39268
39269
39274
39285
39287
39289
39291
39292
39295
39299
39304
39309
39313
39314
39315
39320
39321
39325
39326
39331
39332
39334
39343
39345
39346
39347
39348
39350
39351
39353
39354
39357
39358
39361
39362
39368
39369
39379
39381
39382
39385
39387
39388
39389
39390
39391
39393
39394
39395
39396
39398
39399
39400
39404
39405
39406
39409
39416
39417
39418
39419
39420
39421
39422
39423
39426
39427
39428
39432
39437
39438
39440
39443
39446
39447
39448
39453
39458
39462
39466
39467
39468
39469
39470
39475
39476
3947

42647
42651
42652
42656
42658
42663
42664
42667
42668
42669
42671
42672
42674
42675
42676
42678
42680
42683
42684
42685
42689
42699
42702
42706
42707
42708
42713
42714
42720
42724
42728
42731
42732
42734
42739
42740
42741
42744
42745
42746
42747
42748
42753
42758
42759
42760
42769
42770
42771
42772
42776
42777
42784
42787
42795
42796
42802
42803
42804
42807
42813
42817
42826
42830
42835
42841
42842
42844
42845
42847
42850
42851
42852
42853
42854
42855
42856
42858
42859
42864
42865
42867
42868
42869
42870
42871
42875
42877
42881
42882
42884
42885
42887
42888
42889
42890
42891
42892
42893
42895
42896
42897
42901
42902
42905
42906
42910
42911
42912
42915
42916
42917
42918
42920
42927
42929
42930
42933
42936
42937
42945
42947
42953
42965
42966
42969
42972
42977
42980
42981
42983
42988
42989
42994
42997
43001
43002
43003
43004
43005
43006
43007
43008
43009
43010
43011
43012
43013
43014
43016
43017
43019
43021
43027
43028
43029
43030
43031
43033
43035
43036
43040
43041
43044
43045
43052
4305

46511
46512
46513
46514
46515
46517
46518
46520
46521
46522
46523
46524
46531
46532
46540
46543
46544
46559
46560
46561
46562
46563
46564
46565
46567
46568
46569
46570
46571
46573
46574
46582
46583
46588
46589
46590
46592
46595
46596
46597
46598
46601
46610
46611
46618
46620
46622
46626
46632
46634
46638
46640
46648
46650
46652
46653
46655
46657
46658
46661
46663
46664
46665
46666
46668
46669
46670
46673
46674
46675
46676
46677
46678
46683
46684
46686
46688
46689
46690
46691
46692
46693
46694
46695
46696
46697
46700
46701
46702
46703
46704
46707
46708
46709
46710
46711
46712
46713
46717
46718
46720
46722
46723
46726
46727
46731
46733
46739
46741
46743
46744
46749
46751
46752
46753
46756
46761
46766
46768
46769
46772
46773
46774
46775
46777
46789
46790
46791
46792
46794
46796
46797
46806
46807
46808
46818
46820
46821
46822
46825
46827
46829
46830
46831
46832
46837
46838
46839
46840
46841
46842
46843
46844
46845
46850
46853
46858
46859
46860
46861
46862
46863
46864
46865
46866
46867
4686

50657
50660
50661
50664
50668
50670
50675
50678
50679
50680
50681
50682
50683
50684
50685
50686
50687
50688
50689
50690
50692
50696
50697
50698
50700
50701
50704
50707
50708
50709
50713
50715
50717
50719
50725
50726
50732
50733
50740
50747
50749
50750
50751
50753
50754
50755
50756
50757
50759
50760
50764
50767
50768
50769
50775
50776
50779
50780
50787
50788
50796
50797
50805
50815
50819
50826
50827
50830
50832
50833
50834
50837
50838
50839
50840
50841
50842
50843
50844
50845
50846
50849
50850
50852
50856
50858
50861
50863
50864
50865
50867
50868
50873
50878
50879
50881
50883
50885
50886
50887
50897
50898
50899
50900
50905
50910
50914
50915
50916
50917
50920
50937
50938
50939
50940
50941
50943
50946
50948
50949
50954
50957
50960
50961
50964
50967
50968
50974
50978
50981
50984
50985
50989
50990
50991
50993
50994
50996
50998
50999
51002
51006
51007
51009
51010
51011
51012
51013
51014
51015
51016
51018
51019
51021
51022
51024
51025
51027
51033
51040
51043
51045
51046
51048
51050
51057
5105

55239
55240
55248
55251
55252
55264
55274
55279
55284
55288
55290
55294
55297
55298
55304
55305
55309
55310
55312
55313
55321
55322
55326
55327
55329
55330
55332
55333
55334
55336
55338
55339
55340
55342
55343
55350
55355
55356
55360
55361
55363
55365
55368
55369
55371
55376
55377
55378
55379
55380
55381
55388
55390
55392
55398
55399
55400
55403
55405
55406
55407
55408
55409
55410
55411
55412
55413
55415
55420
55421
55422
55423
55424
55425
55431
55434
55435
55436
55448
55449
55450
55463
55468
55469
55470
55471
55472
55477
55480
55483
55484
55485
55486
55487
55488
55489
55494
55495
55496
55499
55500
55501
55502
55503
55504
55505
55506
55507
55513
55523
55525
55526
55527
55528
55529
55530
55537
55538
55541
55543
55544
55547
55556
55557
55560
55563
55566
55567
55568
55570
55573
55574
55577
55579
55589
55592
55594
55600
55601
55607
55621
55622
55629
55637
55639
55640
55645
55647
55656
55657
55658
55665
55667
55676
55680
55684
55695
55696
55701
55702
55711
55713
55717
55718
55719
55725
5572

59393
59394
59397
59399
59414
59419
59423
59426
59427
59429
59430
59436
59442
59451
59455
59462
59466
59467
59468
59470
59475
59479
59481
59484
59488
59489
59491
59493
59495
59500
59502
59503
59504
59505
59507
59508
59509
59510
59513
59514
59521
59526
59528
59529
59531
59532
59533
59535
59538
59541
59542
59543
59550
59551
59553
59554
59555
59556
59559
59563
59565
59566
59567
59569
59570
59571
59572
59573
59575
59576
59577
59578
59579
59580
59581
59583
59590
59591
59595
59598
59606
59616
59621
59622
59629
59632
59645
59649
59650
59651
59653
59654
59659
59660
59662
59664
59665
59666
59667
59668
59669
59671
59672
59674
59677
59678
59679
59680
59681
59683
59690
59691
59693
59694
59696
59697
59698
59701
59703
59705
59706
59722
59724
59729
59734
59735
59742
59748
59750
59752
59753
59756
59759
59760
59762
59763
59765
59768
59769
59781
59782
59784
59785
59803
59806
59807
59809
59812
59813
59816
59819
59820
59822
59830
59831
59837
59841
59842
59843
59848
59849
59852
59856
59857
59859
59862
5986

63318
63319
63320
63324
63325
63326
63327
63329
63330
63332
63335
63337
63339
63340
63341
63342
63344
63346
63354
63355
63356
63362
63367
63373
63379
63380
63389
63390
63391
63394
63397
63403
63404
63410
63411
63419
63421
63423
63428
63432
63437
63438
63439
63442
63443
63447
63450
63455
63466
63467
63472
63476
63478
63488
63493
63495
63497
63498
63501
63502
63503
63506
63509
63518
63520
63521
63523
63524
63525
63526
63528
63529
63530
63531
63532
63534
63536
63539
63543
63544
63545
63559
63564
63566
63571
63572
63573
63574
63579
63580
63581
63582
63586
63588
63589
63593
63595
63596
63600
63601
63606
63609
63611
63613
63614
63616
63626
63630
63638
63645
63651
63652
63653
63654
63655
63659
63660
63664
63666
63672
63673
63677
63678
63685
63688
63690
63692
63693
63694
63700
63702
63705
63706
63707
63710
63711
63714
63716
63717
63720
63723
63724
63725
63726
63727
63728
63733
63738
63739
63741
63745
63748
63749
63750
63751
63752
63753
63754
63756
63758
63760
63762
63771
63772
63775
63776
6378

67732
67736
67744
67756
67762
67766
67767
67768
67769
67771
67777
67780
67781
67782
67783
67784
67785
67787
67788
67789
67790
67792
67793
67794
67795
67800
67801
67803
67810
67811
67812
67813
67814
67815
67816
67817
67820
67824
67825
67827
67829
67841
67843
67845
67847
67849
67850
67851
67852
67859
67860
67866
67867
67872
67873
67875
67879
67880
67883
67884
67886
67895
67903
67904
67905
67909
67910
67914
67915
67928
67932
67936
67939
67941
67942
67947
67948
67949
67950
67957
67961
67962
67967
67968
67974
67975
67984
67985
67986
67987
67990
67991
67992
67993
67995
67996
67997
68001
68003
68005
68006
68008
68010
68011
68012
68013
68014
68015
68016
68019
68021
68023
68025
68028
68029
68030
68032
68033
68037
68040
68043
68044
68045
68059
68060
68061
68067
68071
68075
68080
68084
68087
68089
68090
68095
68099
68100
68102
68103
68104
68107
68114
68116
68117
68122
68126
68128
68130
68140
68143
68147
68150
68151
68152
68153
68157
68159
68162
68166
68167
68168
68177
68183
68184
68191
68194
6819

71713
71715
71718
71725
71730
71731
71740
71741
71743
71744
71745
71747
71753
71754
71756
71757
71760
71761
71766
71767
71768
71769
71771
71772
71773
71776
71777
71778
71779
71780
71782
71784
71785
71786
71787
71788
71789
71790
71792
71793
71794
71795
71796
71797
71798
71799
71802
71803
71804
71807
71811
71812
71813
71814
71815
71816
71817
71823
71825
71826
71827
71829
71832
71833
71834
71836
71839
71848
71849
71850
71853
71854
71859
71864
71866
71867
71868
71869
71871
71872
71873
71874
71877
71881
71888
71889
71893
71900
71902
71903
71904
71905
71906
71908
71910
71914
71918
71919
71920
71921
71924
71929
71932
71933
71934
71936
71939
71946
71947
71948
71949
71951
71952
71953
71955
71956
71957
71958
71959
71960
71963
71964
71965
71967
71970
71974
71975
71976
71977
71982
71983
71984
71985
71989
71990
71995
71996
71997
72001
72002
72004
72005
72007
72008
72012
72017
72020
72021
72024
72032
72040
72043
72044
72046
72047
72048
72050
72052
72055
72065
72069
72070
72071
72075
72076
72080
7208

75528
75529
75533
75534
75536
75537
75538
75542
75549
75550
75551
75552
75553
75554
75556
75557
75558
75559
75561
75565
75568
75569
75572
75574
75575
75576
75577
75578
75582
75583
75584
75585
75592
75593
75596
75601
75602
75603
75605
75609
75614
75618
75619
75623
75624
75625
75626
75633
75649
75650
75655
75656
75657
75658
75660
75674
75675
75676
75682
75686
75688
75690
75693
75705
75718
75720
75724
75727
75730
75734
75735
75736
75738
75741
75742
75744
75745
75746
75747
75748
75749
75750
75751
75752
75753
75756
75762
75763
75765
75766
75767
75768
75769
75770
75771
75772
75775
75776
75779
75781
75782
75783
75787
75790
75792
75793
75796
75798
75802
75807
75808
75811
75812
75813
75817
75820
75822
75823
75824
75826
75827
75831
75838
75841
75842
75845
75851
75856
75865
75870
75873
75879
75882
75883
75884
75890
75892
75893
75894
75896
75897
75898
75899
75912
75916
75917
75919
75921
75922
75924
75926
75927
75929
75930
75936
75937
75939
75940
75941
75942
75944
75952
75953
75960
75961
75962
7596

79337
79338
79339
79350
79352
79355
79365
79368
79371
79377
79378
79380
79381
79387
79388
79390
79391
79409
79410
79413
79415
79416
79421
79423
79424
79425
79426
79427
79428
79429
79430
79431
79436
79437
79438
79439
79441
79442
79445
79448
79452
79457
79469
79487
79493
79497
79502
79516
79520
79522
79524
79526
79530
79531
79532
79536
79542
79543
79546
79549
79557
79564
79565
79567
79571
79574
79577
79578
79579
79584
79585
79586
79587
79588
79591
79592
79593
79594
79596
79597
79599
79600
79602
79603
79604
79605
79606
79610
79611
79612
79613
79617
79618
79619
79620
79621
79623
79624
79625
79628
79629
79631
79632
79633
79635
79636
79637
79638
79641
79642
79644
79646
79647
79649
79655
79656
79660
79662
79668
79669
79672
79675
79676
79677
79679
79681
79691
79693
79694
79701
79706
79707
79708
79709
79710
79711
79712
79715
79718
79719
79721
79722
79724
79726
79727
79728
79729
79730
79738
79743
79744
79745
79749
79752
79753
79755
79761
79765
79766
79775
79776
79777
79778
79783
79792
79793
7979

83721
83722
83723
83728
83731
83732
83733
83738
83744
83745
83746
83747
83750
83751
83755
83757
83761
83762
83765
83768
83769
83771
83774
83775
83779
83785
83788
83790
83791
83792
83793
83795
83796
83801
83803
83804
83810
83813
83817
83818
83819
83820
83821
83822
83823
83824
83825
83827
83831
83836
83841
83842
83845
83847
83848
83850
83853
83854
83857
83866
83868
83869
83873
83874
83876
83877
83878
83879
83880
83883
83884
83885
83886
83890
83896
83899
83901
83902
83903
83904
83909
83910
83912
83917
83925
83934
83936
83937
83941
83942
83945
83947
83952
83954
83955
83964
83968
83970
83972
83973
83974
83975
83976
83977
83979
83980
83981
83984
83985
83986
83988
83989
83990
83991
83992
84002
84003
84004
84008
84010
84011
84012
84014
84017
84018
84024
84025
84034
84038
84039
84041
84043
84047
84050
84052
84060
84061
84062
84063
84068
84082
84083
84084
84085
84089
84090
84091
84097
84104
84116
84118
84123
84128
84132
84142
84147
84151
84154
84156
84157
84160
84169
84172
84175
84176
84177
8418

88042
88044
88048
88049
88050
88051
88052
88053
88058
88060
88061
88062
88063
88066
88067
88068
88069
88071
88074
88078
88079
88082
88083
88084
88088
88094
88099
88106
88108
88110
88113
88116
88117
88120
88124
88132
88138
88140
88142
88148
88149
88150
88153
88154
88156
88160
88172
88173
88174
88179
88181
88191
88192
88194
88195
88196
88197
88199
88200
88202
88203
88205
88206
88209
88210
88211
88213
88215
88217
88218
88220
88221
88222
88223
88224
88225
88228
88229
88230
88232
88235
88237
88240
88242
88243
88246
88247
88248
88249
88251
88252
88253
88255
88261
88262
88266
88267
88275
88279
88280
88281
88284
88287
88288
88291
88295
88307
88309
88314
88331
88333
88340
88356
88357
88358
88368
88369
88370
88371
88372
88373
88374
88375
88378
88379
88380
88381
88382
88383
88384
88385
88386
88387
88388
88390
88391
88396
88397
88398
88399
88403
88404
88405
88408
88412
88413
88414
88415
88416
88419
88420
88421
88429
88438
88439
88441
88444
88445
88447
88448
88450
88452
88454
88456
88457
88458
8846

93039
93040
93041
93045
93053
93054
93055
93058
93061
93071
93076
93086
93094
93096
93106
93107
93108
93110
93111
93115
93116
93120
93122
93125
93127
93129
93132
93133
93135
93136
93138
93145
93147
93148
93149
93151
93152
93154
93155
93164
93177
93180
93191
93192
93193
93196
93202
93213
93214
93215
93222
93230
93233
93234
93237
93238
93248
93254
93257
93260
93262
93264
93266
93267
93268
93271
93275
93278
93282
93286
93294
93297
93299
93303
93304
93308
93313
93316
93317
93319
93320
93325
93326
93329
93331
93332
93333
93334
93335
93336
93337
93342
93343
93344
93345
93346
93347
93348
93351
93353
93358
93359
93360
93361
93362
93363
93364
93365
93369
93372
93373
93374
93376
93377
93379
93380
93381
93382
93386
93387
93391
93393
93394
93403
93406
93412
93414
93416
93424
93425
93432
93433
93434
93437
93443
93448
93450
93453
93454
93455
93456
93457
93460
93462
93464
93465
93466
93469
93470
93472
93474
93475
93476
93479
93482
93483
93484
93487
93488
93491
93492
93493
93496
93497
93499
93502
9350

97095
97096
97099
97101
97104
97107
97108
97109
97113
97116
97123
97124
97127
97128
97134
97136
97137
97140
97142
97149
97150
97151
97152
97153
97154
97163
97178
97179
97180
97182
97185
97187
97190
97200
97201
97202
97207
97208
97209
97211
97212
97221
97224
97226
97227
97228
97230
97236
97239
97240
97244
97248
97249
97250
97255
97258
97270
97271
97273
97275
97276
97283
97285
97289
97291
97292
97293
97294
97295
97296
97297
97298
97302
97303
97304
97305
97306
97307
97308
97309
97310
97311
97312
97314
97315
97316
97317
97318
97319
97320
97322
97328
97329
97337
97338
97339
97341
97344
97347
97348
97349
97352
97355
97359
97363
97364
97365
97368
97369
97370
97371
97372
97377
97380
97382
97388
97390
97395
97396
97398
97399
97401
97402
97406
97423
97426
97427
97429
97433
97435
97447
97460
97466
97467
97476
97483
97484
97485
97486
97488
97489
97491
97492
97494
97495
97496
97499
97500
97502
97503
97507
97508
97510
97512
97513
97516
97517
97519
97527
97529
97530
97532
97533
97535
97537
97539
9754

101630
101631
101632
101633
101634
101635
101636
101637
101638
101641
101642
101643
101644
101645
101647
101650
101651
101654
101657
101660
101661
101662
101668
101669
101670
101672
101673
101675
101678
101679
101683
101686
101688
101690
101693
101694
101702
101705
101706
101718
101722
101726
101730
101733
101734
101745
101746
101748
101762
101763
101764
101765
101766
101767
101771
101772
101775
101780
101781
101782
101789
101790
101791
101792
101793
101795
101796
101797
101798
101799
101800
101801
101802
101803
101804
101805
101808
101814
101817
101820
101823
101824
101825
101831
101834
101835
101837
101840
101841
101842
101843
101845
101848
101852
101863
101867
101868
101870
101875
101879
101882
101889
101893
101899
101901
101905
101906
101910
101913
101917
101919
101922
101923
101924
101925
101926
101927
101928
101939
101948
101951
101952
101954
101956
101957
101958
101959
101960
101962
101963
101965
101966
101968
101971
101974
101977
101980
101982
101987
101991
102001
102010
102014

## Part 2: Exploratory Data Analysis

We can use the Pandas time and date slicing functions to group our data by either day, quarter, or time.  Have a look at [pd.between_time()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.between_time.html).  I want you to explore this data in some interesting ways.

### Problem 1.
Use cut and other Pandas functions to display the joint distribution of Age and Race.  This table should not have every age in it but break the age down into a reasonable number of sub groups.

Pick another pair of variables.  Display a joint or conditional distribution and explain **why** you chose it and what the take away message is.

### Problem 2.

Pick (at least) three nieghborhoods from the data, show the crime in 2011 versus 2012 for each of these neighborhoods on one plot.  Make sure that you use visaul features to distinguish the two years.

**Hint:** You may want to look back at the lab where we worked with baby names... and maybe the [unstack](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.unstack.html) function.

### Problem 3.

Show me one other interesting thing about the data.  It can be anything you find interesting but I'd encourage you to use an advanced method from class (regression, classification, hypothesis testing etc.).  If you can, maybe look at something like [the demographics of Balitmore](https://en.wikipedia.org/wiki/Baltimore) and compare those to what is in our data.



## Part 3: Interactive Maps.

Using the following code stub to start up an interactive map. You can find more information about folium here: https://github.com/python-visualization/folium/ and https://folium.readthedocs.org//


### Problem 5.

Add graphical elements to display the data. For instance, add circles, with colors indicating sex. Or circles with colors indicating race. Or anything else that strikes your fancy.  Plot some colors over the map to illustrate some joint or conditional distribution of the data.

**Explain using Markdown Cells** *what* you have shown in your map, *why* you have shown it in your map, and *how* a user should interpret this information.

In [3]:
map_osm = folium.Map(location=[39.29, -76.61], zoom_start=11)
map_osm

## Submission

Prepare a Jupyter notebook that includes for each Problem: (a) code to carry out the step discussed, (b) output showing the result of your code, and (c) a short prose description of how your code works. Remember, the writeup you are preparing is intended to communicate your data analysis effectively. Thoughtlessly showing large amounts of output in your writeup defeats that purpose.

All axes in plots should be labeled in an informative manner. Your answers to any question that refers to a plot should include both (a) a text description of your plot, and (b) a sentence or two of interpretation as it relates to the question asked.

Submit this completed notebook which contains your answers as markdown cells to [Canvas](https://tulane.instructure.com/)

## Grading Rubric

Note that code that does not work will not be graded and you will receive a 0 for that section.  We reserve the right to deduct points for things like general sloppiness of the notebook, poor labels, unlabeled axes, etc.  You should include markdown cells to break up your notebook and **clearly label** the problems and questions below.

* Part 0 Professionalism (10 points).
  * You have used both code comments and markdown cells to professionally and clearly document your work including having a clear and clean notebook; linking to resources and documents; and doing so with code that is reasonable and efficient.

* Part 1 Wrangling (20 Points).
  * (10 Points)  Data is loaded correctly and directions are followed for munging the data appropatly.
  * (10 Points) Questions are answered in a reasonable manner.  A suggested way to impute data is present along with code.
* Part 2 Exploratory Data Analysis (40 Points).
  * (20 Points) Problem 1: Distributions are compute correctly, tables are shown, explination is coherent and clear.
  * (10 Points) Problem 2: Graph is present, visual features are present to distingush the required elements.
  * (10 Points) Problem 3: Code is present to compute an interesting feature of the data.  The feature is interpreted in a written markdown cell.
* Part 3 Interactive Maps (30 Points).
  * (20 Points) Map is displayed of Baltimore, one or more interactive elements are present.  Displayed information is non-trivial and reveals something interesting about the data.
  * (10 Points) Explination of the above map is reasonable and clear.  Addresses all points.


* Total Score:

### Credits

Thanks to [John P. Dickerson](http://jpdickerson.com/) for the project idea!