# Exploring Data from 311 Service Calls in Chicago

On the [Chicago Data Portal](https://data.cityofchicago.org/), the city of Chicago publishes a wealth of data about the city and its governance. One of the many datasets available catalogs 311 service calls (311 is the telephone number through which the city provides non-emergency services). Among the collections within that broader set is a list of [service requests regarding abandoned vehicles](https://data.cityofchicago.org/Service-Requests/311-Service-Requests-Abandoned-Vehicles-No-Duplica/atid-bgws) collected since 2011.

In this exercise, we will work together as a class to explore this data and, if there is time, answer such pressing questions as "What is the most popular color for abandoned cars in Chicago?" and "Ford or Chevy: which is abandoned more?"

***

The data is provided in a file that has been copied into this directory: `311_Service_Requests_-_Abandoned_Vehicles_-_No_Duplicates.csv`.  First, we have a bit of code to read the contents of the file into a single string:

In [47]:
with open("311_Service_Requests_-_Abandoned_Vehicles_-_No_Duplicates.csv") as f:
    file_contents = f.read()

Now it's up to us to dig into this data using the tools we're studying for manipulating strings and lists...  We'll create cells below to explore the data.

First, look at the first small piece of the string to get a sense of it.

In [48]:
print(file_contents[:1000])

Creation Date,Status,Completion Date,Service Request Number,Type of Service Request,License Plate,Vehicle Make/Model,Vehicle Color,Current Activity,Most Recent Action,How Many Days Has the Vehicle Been Reported as Parked?,Street Address,ZIP Code,Ward,Police District,Community Area,SSA
01/01/2011,Completed,01/05/2011,11-00001976,Abandoned Vehicle Complaint,H924236,Ford,White,,,60,6059 S KOMENSKY AVE,60629,13,8,65,3
01/01/2011,Completed,01/05/2011,11-00002291,Abandoned Vehicle Complaint,810 LYB    WISCONSIN PLATES,Mercury,Green,,,,4651 S WASHTENAW AVE,60632,12,9,58,
01/01/2011,Completed,01/05/2011,11-00002696,Abandoned Vehicle Complaint,368M783,Buick,Gold,,,10,6200 S MASSASOIT AVE,60638,13,8,64,
01/01/2011,Completed,01/05/2011,11-00003094,Abandoned Vehicle Complaint,000000000,Dodge,White,,,30,5816 S ALBANY AVE,60629,14,8,63,59
01/01/2011,Completed,01/05/2011,11-00003456,Abandoned Vehicle Complaint,TEXAS PLATE  -  SMALL FLATBED HITCH TRAILER  -  MISSING TIRES,,Black,,,,4559 S KEELER AVE,6

Split the giant string into a list of strings, with one per line/row.

In [49]:
lines = file_contents.split('\n')
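
One detail worth checking when splitting this way: if the text ends with a newline, `split('\n')` leaves an empty string as the last entry of the list, while `splitlines()` does not. The sketch below uses a made-up two-row string (`sample` is an assumption for illustration, not the real file contents) to show the difference.

```python
# A tiny stand-in for file_contents (made-up data that ends with a newline)
sample = "Creation Date,Status\n01/01/2011,Completed\n"

by_split = sample.split('\n')
by_splitlines = sample.splitlines()

print(by_split)       # the last entry is an empty string
print(by_splitlines)  # no trailing empty entry
```

Either approach works for this exercise, as long as later code tolerates an empty final line.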

Let's look at a few entries in this list: the header row, the first record, and one record further in.

In [50]:
print(lines[0])
print()
print(lines[1])
print()
print(lines[100])

Creation Date,Status,Completion Date,Service Request Number,Type of Service Request,License Plate,Vehicle Make/Model,Vehicle Color,Current Activity,Most Recent Action,How Many Days Has the Vehicle Been Reported as Parked?,Street Address,ZIP Code,Ward,Police District,Community Area,SSA

01/01/2011,Completed,01/05/2011,11-00001976,Abandoned Vehicle Complaint,H924236,Ford,White,,,60,6059 S KOMENSKY AVE,60629,13,8,65,3

01/02/2012,Completed,01/23/2012,12-00003795,Abandoned Vehicle Complaint,L701065,Chevrolet,Gray,FVI - Outcome,Vehicle was moved from original address requested,14,10240 S EBERHART AVE,60628,9,5,49,41


We can get the color of any individual record from the file.

In [51]:
test_line = lines[54321]
print(test_line)
entries = test_line.split(',')
print(entries)
# The color is the 8th entry in this list
print(entries[7])

01/09/2014,Completed,01/15/2014,14-00034140,Abandoned Vehicle Complaint,P348315,Chevrolet,Red,FVI - Outcome,Return to Owner - Vehicle,10,3700 W 60TH PL,60629,13,8,65,
['01/09/2014', 'Completed', '01/15/2014', '14-00034140', 'Abandoned Vehicle Complaint', 'P348315', 'Chevrolet', 'Red', 'FVI - Outcome', 'Return to Owner - Vehicle', '10', '3700 W 60TH PL', '60629', '13', '8', '65', '']
Red
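
Splitting on commas works as long as no field contains a comma itself. CSV files can wrap such fields in double quotes, and a plain `split(',')` would then cut them in two. Whether this file has any such rows is not something the output above tells us, so the record below is made up for illustration. The sketch compares the naive split with Python's standard `csv` module.

```python
import csv

# Made-up record: the license plate field is quoted and contains a comma
sample_line = ('01/01/2011,Completed,01/05/2011,11-0000,'
               'Abandoned Vehicle Complaint,"NONE, NO PLATES",Ford,White')

naive = sample_line.split(',')
parsed = next(csv.reader([sample_line]))

print(naive[7])   # Ford: the quoted comma shifted every later field by one
print(parsed[7])  # White
```

If the counts later look suspicious, switching to `csv.reader` is one thing to try.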


We can make a function to pull the color out of any line.

In [52]:
def find_color(line):
    entries = line.split(',')
    
    # Check for a bad line (that doesn't have a color value)
    if len(entries) < 8:
        return None
    
    color = entries[7]
    
    # Reject anything longer than ten characters (probably bad data);
    # this also rejects the header value 'Vehicle Color'
    if len(color) > 10:
        return None
    
    return color

In [53]:
# Test the function, make sure it does what we want...
print(find_color(lines[45678]))

Black


Use that function to get the color from every line of the file. Lines without a usable color, including the header, show up as `None`.

In [54]:
# build up a list of colors, starting from an empty list
color_list = []
for line in lines:
    color = find_color(line)
    color_list.append(color)

# check the list by printing the first ten entries
print(color_list[:10])
print(len(color_list))

[None, 'White', 'Green', 'Gold', 'White', 'Black', 'Purple', 'Blue', 'Black', 'White']
161435
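
Note that the length printed above counts every line, including those for which `find_color` returned `None`. A minimal sketch of filtering those out with a list comprehension, using the ten entries printed above as a stand-in for the full list (`sample_colors` and `valid_colors` are names introduced here for illustration):

```python
# Stand-in for color_list: the first ten entries printed above
sample_colors = [None, 'White', 'Green', 'Gold', 'White',
                 'Black', 'Purple', 'Blue', 'Black', 'White']

# Keep only the entries where a color was actually found
valid_colors = [c for c in sample_colors if c is not None]

print(len(sample_colors), len(valid_colors))
```

Records with an empty color field would still appear as empty strings; using `if c` instead of `if c is not None` would drop those as well.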


Count how many times a particular color shows up in the list.

In [55]:
color_list.count("Black")

25294
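
Counting one color at a time works, but to find the most popular color we would want every count at once. One possible approach, sketched here on a small sample rather than the full list, is `collections.Counter` from the standard library (`sample_colors` and `color_counts` are names introduced here for illustration):

```python
from collections import Counter

# Small sample of colors, taken from the first entries printed earlier
sample_colors = ['White', 'Green', 'Gold', 'White', 'Black',
                 'Purple', 'Blue', 'Black', 'White']

# Counter tallies how many times each distinct value appears
color_counts = Counter(sample_colors)

# most_common(n) returns the n most frequent values with their counts
print(color_counts.most_common(2))  # [('White', 3), ('Black', 2)]
```

The same idea would apply to the make column (entry 6 of each record) for the Ford versus Chevy question.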