# get-it-done-data-prep

Use this notebook to prepare the latest version of the Get It Done dataset that we use in lecture.

We need two CSVs.

- `raw_data/get-it-done-open.csv`: download the table [here](https://data.sandiego.gov/datasets/get-it-done-311/) as a CSV. This comes from their data file called "Open Get It Done Requests."
- `raw_data/get-it-done-closed.csv`: download the table [here](https://data.sandiego.gov/datasets/get-it-done-311/) as a CSV. This comes from their data file called "Get It Done Requests closed in 2023."

Update these CSVs if you want to update the dataset. Below, we're keeping all the requests opened in 2023 (some of which have been closed, some of which are still open).
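
Before running the cells below, it can help to confirm that both downloads are in place and still have the columns this notebook reads. The following is a minimal sketch, not part of the original workflow: `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names introduced here, and the column list is taken from the `keep_columns` list used later (minus `status`, which this notebook assigns itself).

```python
from pathlib import Path

import pandas as pd

# Columns the later cells read from each raw file (assumed from keep_columns;
# 'status' is left out because the notebook assigns it rather than reading it).
REQUIRED_COLUMNS = ['service_request_id', 'date_requested', 'comm_plan_name',
                    'service_name', 'street_address', 'public_description']

def missing_columns(csv_path, required=REQUIRED_COLUMNS):
    """Return the required columns absent from the CSV header (hypothetical helper)."""
    header = pd.read_csv(csv_path, nrows=0).columns  # nrows=0 reads only the header row
    return [col for col in required if col not in header]

for name in ['raw_data/get-it-done-open.csv', 'raw_data/get-it-done-closed.csv']:
    path = Path(name)
    if not path.exists():
        print(f'{name}: not found, download it first')
    else:
        print(f'{name}: missing columns {missing_columns(path)}')
```

An empty list for both files suggests the schema has not changed since the last update; a non-empty list points at columns that were renamed or dropped upstream.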

In [1]:
import numpy as np
import pandas as pd

gid_open = pd.read_csv('raw_data/get-it-done-open.csv')
gid_open = gid_open[gid_open.get('date_requested').str.contains("2023-")] #all open requests made in 2023
#gid_open = gid_open.drop(columns='file_year_split') #this line was needed in Fall 2022 but not in Winter 2023
gid_open = gid_open.assign(status=['open']*gid_open.shape[0]) #set the status of every open request to 'open'

gid_closed = pd.read_csv('raw_data/get-it-done-closed.csv')
gid_closed = gid_closed[gid_closed.get('date_requested').str.contains("2023-")] #all closed requests made in 2023
gid_closed = gid_closed.assign(status=['closed']*gid_closed.shape[0]) #set the status of every closed request to 'closed'

In [2]:
gid_open.shape[0] + gid_closed.shape[0]

118379

In [5]:
keep_columns = ['service_request_id', 'date_requested', 'comm_plan_name', 'service_name', 'status', 'street_address', 'public_description']
gid_requests = pd.concat([gid_open, gid_closed]).get(keep_columns).reset_index(drop=True) #all open and closed requests in one DataFrame (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
gid_requests = gid_requests.assign(neighborhood=gid_requests.get('comm_plan_name')).assign(service=gid_requests.get('service_name')).drop(columns=['comm_plan_name', 'service_name'])
keep_columns_2 = ['service_request_id', 'date_requested', 'neighborhood', 'service', 'status', 'street_address', 'public_description']
gid_requests = gid_requests.get(keep_columns_2)

today = gid_requests[gid_requests.get('date_requested').str.contains('2023-04-08')] #get April 8 requests
today


,service_request_id,date_requested,neighborhood,service,status,street_address,public_description
22772,4183116,2023-04-08T00:32:00,Downtown,Traffic Signal Issue,open,2ND AVE & G ST,"Signal, 2nd & G"
22773,4183117,2023-04-08T00:44:00,Mid-City:Eastern Area,Missed Collection,open,"4791 Seminole Dr, San Diego, CA 92115, USA",My Blue Recycle Bin was not collected. I'm th...
22774,4183118,2023-04-08T00:49:00,Navajo,Parking,open,"4728 Allied Rd, San Diego, CA 92120, USA",White van parked across my driveway. I will n...
22775,4183119,2023-04-08T01:04:00,Encanto Neighborhoods,Missed Collection,open,6066 Tempas Ct,Green container
22776,4183120,2023-04-08T01:10:00,Navajo,Pothole,open,7961?7979 Topaz Lake Ave,Potholes || LOCATION: Topaz Lake between Pear...
...,...,...,...,...,...,...,...
118374,4184220,2023-04-08T19:53:00,Barrio Logan,Other,closed,3718 Dalbergia St,Prostitution
118375,4184221,2023-04-08T19:53:00,Barrio Logan,Other,closed,3743 Dalbergia St,Prostitution
118376,4184223,2023-04-08T19:54:00,Barrio Logan,Other,closed,2120 Woden St,Prostitution
118377,4184225,2023-04-08T19:54:00,Barrio Logan,Other,closed,2005 Vesta St,Prostitution


In [6]:
crosstab = pd.crosstab([gid_requests.neighborhood, gid_requests.service], gid_requests.status).reset_index()
crosstab

status,neighborhood,service,closed,open
0,Balboa Park,Dead Animal,11,0
1,Balboa Park,Development Services - Code Enforcement,2,0
2,Balboa Park,Encampment,215,20
3,Balboa Park,Environmental Services Code Compliance,8,1
4,Balboa Park,Graffiti - Code Enforcement,2,1
...,...,...,...,...
1416,Via De La Valle,Encampment,0,1
1417,Via De La Valle,Pavement Maintenance,0,3
1418,Via De La Valle,Pothole,11,7
1419,Via De La Valle,Sidewalk Repair Issue,0,1
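
To see how the crosstab above is built, here is a small sketch on made-up data. The rows in `toy` are invented for illustration and are not taken from the Get It Done dataset; only the `pd.crosstab(...).reset_index()` call mirrors the cell above.

```python
import pandas as pd

# Invented rows, for illustration only (not real Get It Done requests).
toy = pd.DataFrame({
    'neighborhood': ['Downtown', 'Downtown', 'Downtown', 'Navajo'],
    'service': ['Pothole', 'Pothole', 'Parking', 'Pothole'],
    'status': ['open', 'closed', 'closed', 'open'],
})

# Same call as above: one row per observed (neighborhood, service) pair,
# one count column per status value.
toy_crosstab = pd.crosstab([toy.neighborhood, toy.service], toy.status).reset_index()
print(toy_crosstab)
```

Only pairs that actually occur get a row. Within a row, a status with no matching requests shows up as 0, which is why the real table has entries like `11,0`.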


In [8]:
today.to_csv('../data/get-it-done-apr-08.csv', index=False)
crosstab.to_csv('../data/get-it-done-requests.csv', index=False)