# Request Type Analysis

Look at the request type values from 311.  Questions to consider:

  - Counts
  - Spatial (NCs) distribution
  - Time to complete
  - Time to complete by service provider
  - Spatial (service region) distribution
  - Repeated addresses

Steps in this notebook:

1.  Setup
2.  Create geodataframe/dataframe from cleaned data and [census](https://data.lacity.org/Community-Economic-Development/Census-Data-by-Neighborhood-Council/nwj3-ufba)
3.  Examine the data
4.  Compute the measure
5.  Show measure as choropleth
6.  So what (next steps)

# 1 - Setup

In [1]:
%run start.py
from utils import read_new311_shape, dt_to_object

2021-12-01 16:07:39 Configured OSMnx 1.1.1
2021-12-01 16:07:39 HTTP response caching is on


# 2 - Get Data Files

Two data sets:

  1. extended311 for point features
  2. cleaned, certified NCs for polygons

In [2]:
%%time
extended311_gdf = read_new311_shape('../data/311/extended311-geo.zip/')

CPU times: user 2min 3s, sys: 6.74 s, total: 2min 9s
Wall time: 3min 15s


In [None]:
extended311_gdf.info()

Loading the certified, cleaned neighborhoods is a common idiom at this stage, so ...

In [3]:
neighborhoods_gdf = gpd.read_file('../data/neighborhoods/Neighborhood_Councils_(Certified)_cleaned.zip/')

neighborhoods_gdf.rename(columns={'NAME': 'name',
                                  'NC_ID': 'nc_id',
                                  'SERVICE_RE': 'service_region'},
                         inplace=True)

In [None]:
neighborhoods_gdf.info()

# 3 - Some Data Massaging

Well, there's a discrepancy here.  The census data has 97 NCs and the certified dataset has 99 (I think the right number is 99).

I'm not going to agonize over this at this stage, but I do want to understand it.  Adjusting for what matches at this stage should be good enough for now.
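
One way to see which NCs are behind a mismatch like this is an outer merge with `indicator=True`. This is only a sketch on toy data: the two frames and their ids are made up, and `census_ncs` / `certified_ncs` are hypothetical names rather than anything loaded in this notebook.

```python
import pandas as pd

# Toy stand-ins for the two NC tables; ids and names are invented for illustration.
census_ncs = pd.DataFrame({"nc_id": [4, 5, 6],
                           "Total Population": [100, 200, 300]})
certified_ncs = pd.DataFrame({"nc_id": [4, 5, 6, 7, 8],
                              "name": ["A", "B", "C", "D", "E"]})

# An outer merge with indicator=True adds a `_merge` column recording whether
# each row came from the left table only, the right table only, or both.
compared = pd.merge(certified_ncs, census_ncs,
                    on="nc_id", how="outer", indicator=True)

# Rows that are not present in both tables are the ones worth investigating.
unmatched = compared[compared["_merge"] != "both"]
print(unmatched[["nc_id", "name", "_merge"]])
```

On real data the `left_only` and `right_only` rows would name the NCs that explain the 97 versus 99 difference.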

In [None]:
extended311_gdf.iloc[27]

In [None]:
extended311_gdf.iloc[27]['created_dt'].day_of_week

In [None]:
extended311_gdf.iloc[27]['created_dt'].date()

In [None]:
extended311_gdf['day_of_week'] = extended311_gdf['created_dt'].apply(lambda dt: dt.day_of_week)

In [None]:
extended311_gdf.day_of_week.value_counts()

In [None]:
extended311_gdf['date'] = extended311_gdf['created_dt'].apply(lambda dt: dt.date())

In [None]:
extended311_gdf['date'].value_counts(sort=False)

In [None]:
extended311_gdf['month'] = extended311_gdf['created_dt'].apply(lambda dt: dt.month)

In [None]:
extended311_gdf['month'].value_counts(sort=False)

In [None]:
extended311_gdf['quarter'] = extended311_gdf['created_dt'].apply(lambda dt: dt.quarter)

In [None]:
extended311_gdf['quarter'].value_counts(sort=False)
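
The four derived columns above (`day_of_week`, `date`, `month`, `quarter`) can probably also be built without a row-by-row `apply`, using the pandas `.dt` accessor. This sketch assumes the timestamp column has a `datetime64` dtype, which I have not verified for `extended311_gdf`; the toy `requests` frame is hypothetical.

```python
import pandas as pd

# Toy frame standing in for the 311 data; only the column name matches the notebook.
requests = pd.DataFrame({
    "created_dt": pd.to_datetime(["2021-01-04 08:15",
                                  "2021-04-17 13:30",
                                  "2021-11-30 23:59"])
})

# The .dt accessor applies each datetime property to the whole column at once.
requests["day_of_week"] = requests["created_dt"].dt.dayofweek  # Monday is 0
requests["date"] = requests["created_dt"].dt.date
requests["month"] = requests["created_dt"].dt.month
requests["quarter"] = requests["created_dt"].dt.quarter

print(requests)
```

If the column turns out to hold Python objects instead, the `.dt` accessor raises an error and the `apply` version is the fallback.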

In [None]:
still_open_gdf = extended311_gdf[extended311_gdf['closed_dt'].isnull()].reset_index()
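
The rows that do have a `closed_dt` are the ones where the "time to complete" questions from the introduction can be answered. A minimal sketch on toy data follows; the `owner` column and its values are hypothetical stand-ins for whatever column identifies the service provider.

```python
import pandas as pd

# Toy requests; one of them is still open (no closed_dt).
requests = pd.DataFrame({
    "created_dt": pd.to_datetime(["2021-03-01 09:00", "2021-03-01 10:00",
                                  "2021-03-02 08:00", "2021-03-03 12:00"]),
    "closed_dt": pd.to_datetime(["2021-03-02 09:00", "2021-03-04 10:00",
                                 None, "2021-03-03 18:00"]),
    "owner": ["provider_a", "provider_b", "provider_a", "provider_a"],  # hypothetical
})

# Keep only closed requests, then measure elapsed time in days.
closed = requests.dropna(subset=["closed_dt"]).copy()
elapsed = closed["closed_dt"] - closed["created_dt"]
closed["days_to_complete"] = elapsed.dt.total_seconds() / 86400

# Median time to complete per service provider.
median_by_owner = closed.groupby("owner")["days_to_complete"].median()
print(median_by_owner)
```

I use the median rather than the mean here because a few very old requests could otherwise dominate the result; that is a judgment call, not something the data has confirmed.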

In [None]:
pd.options.display.max_rows

In [None]:
pd.set_option("display.max_rows", 200)
pd.set_option("display.min_rows", 20)
still_open_gdf['date'].value_counts(sort=False, dropna=False).to_frame().reset_index()
#pd.reset_option("display.max_rows")

In [None]:
extended311_gdf_info = Output(layout={'border': '1px solid black',
                            'width': '50%'})

still_open_gdf_info = Output(layout={'border': '1px solid black',
                            'width': '50%'})

with extended311_gdf_info:
    display(HTML('<center><b>created count</b></center>'))
    display(extended311_gdf['date'].value_counts(sort=False))

with still_open_gdf_info:
    display(HTML('<center><b>still open count</b></center>'))
    display(still_open_gdf['date'].value_counts(sort=False))

HBox([extended311_gdf_info, still_open_gdf_info])

In [None]:
f1 = extended311_gdf['date'].value_counts(sort=False).rename_axis('day').reset_index(name='created count')
f2 = still_open_gdf['date'].value_counts(sort=False).rename_axis('day').reset_index(name='open count')

# Inner join: only days that still have at least one open request are kept.
merged_counts = pd.merge(f1, f2, on="day")
# Fraction (0 to 1) of the requests created each day that are still open.
merged_counts['percentage'] = merged_counts['open count'] / merged_counts['created count']

In [None]:
merged_counts
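
Because `pd.merge` defaults to an inner join, days on which every request has been closed drop out of `merged_counts`. If those days should show up with a zero instead, a left join is one option. A sketch on toy data (the dates and counts are invented):

```python
import pandas as pd

# Toy daily counts; only 2021-01-02 still has open requests.
created = pd.DataFrame({"day": ["2021-01-01", "2021-01-02", "2021-01-03"],
                        "created count": [10, 20, 5]})
still_open = pd.DataFrame({"day": ["2021-01-02"],
                           "open count": [4]})

# how="left" keeps every day from `created`; missing open counts become NaN.
counts = pd.merge(created, still_open, on="day", how="left")
counts["open count"] = counts["open count"].fillna(0).astype(int)

# Fraction of each day's requests that are still open.
counts["fraction open"] = counts["open count"] / counts["created count"]
print(counts)
```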

In [4]:
graffiti_gdf = read_new311_shape('../data/311/graffiti.zip/')

In [5]:
graffiti_counts = graffiti_gdf['nc'].value_counts().rename_axis('nc_id').reset_index(name='count')

In [6]:
graffiti_counts

    nc_id  count
0      78  26836
1      50  18197
2      52  15584
3     125  13082
4      86  10799
..    ...    ...
94     63    238
95     64    221
96    126    217
97    114    140


In [11]:
len(graffiti_gdf)

315577

In [7]:
graffiti_merged = pd.merge(neighborhoods_gdf, graffiti_counts, how="left", on=["nc_id"])

In [None]:
graffiti_merged

# 4 - Compute the Measure

The computation is simple: use the geometry of each NC to compute its area in square miles.

For the density I'm simply using total population.  It might be interesting to examine some of the other measures, such as the ethnicity ones, perhaps with a pull-down to select one.  Ah... for another day.

In [None]:
from pyproj import Geod

geod = Geod(ellps="WGS84")

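def square_miles(geo):
    # geometry_area_perimeter returns (area in square meters, perimeter in meters)
    # for lon/lat geometries; the area's sign depends on ring orientation, hence abs().
    square_meters = abs(geod.geometry_area_perimeter(geo)[0])
    # 1 mile = 1609.344 m, so 1 square mile = 2,589,988.110336 square meters.
    return square_meters / 2_589_988.110336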
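
The square-meters-to-square-miles conversion can be checked by hand from two exact definitions: 1 mile is 5280 feet and 1 foot is 0.3048 meters. This standalone sketch only does the unit arithmetic; it does not touch `pyproj`, and the constant and function names are mine.

```python
FEET_PER_MILE = 5280
METERS_PER_FOOT = 0.3048  # exact by definition

# Square feet in a square mile, and square feet in a square meter.
SQ_FEET_PER_SQ_MILE = FEET_PER_MILE ** 2         # 27,878,400
SQ_FEET_PER_SQ_METER = 1 / METERS_PER_FOOT ** 2  # about 10.7639

# Combining the two gives square meters per square mile.
SQ_METERS_PER_SQ_MILE = SQ_FEET_PER_SQ_MILE / SQ_FEET_PER_SQ_METER


def sq_meters_to_sq_miles(square_meters):
    """Convert an area in square meters to square miles."""
    return square_meters / SQ_METERS_PER_SQ_MILE


# One square kilometer expressed in square miles.
print(round(sq_meters_to_sq_miles(1_000_000), 4))
```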

In [None]:
neighborhood_merged['sq_miles'] = neighborhood_merged.geometry.apply(square_miles)

In [None]:
neighborhood_merged['density'] = neighborhood_merged['Total Population'] / neighborhood_merged['sq_miles']

As usual, I like to look at a single row.

In [None]:
neighborhood_merged.iloc[27]

Some sanity checking on the data before we generate the display.

In the real world we'll have to do some more work on this data!

In [None]:
neighborhood_merged.density.max()

In [None]:
neighborhood_merged.density.min()

In [None]:
len(neighborhood_merged)

# 5 - Display the Choropleth

In [23]:
graffiti_gdf['address'].value_counts()

2500 S HOOPER AVE, 90011        389
12843 W FOOTHILL BLVD, 91342    317
3600 S MAIN ST, 90007           211
3400 S MAIN ST, 90007           200
3500 S MAIN ST, 90007           176
                               ... 
1346 E 22ND ST, 90011             1
4232 S FIGUEROA ST, 90037         1
1301 E 46TH ST, 90011             1
950 S MARIPOSA AVE, 90006         1
6911 N BEN AVE, 91605             1
Name: address, Length: 100503, dtype: int64

In [29]:
graffiti_gdf[graffiti_gdf['nc_name'].str.contains('South Central', na=False)]['address'].value_counts()

2500 S HOOPER AVE, 90011        389
3600 S MAIN ST, 90007           211
3400 S MAIN ST, 90007           200
3500 S MAIN ST, 90007           176
3700 S MAIN ST, 90007           157
                               ... 
1924 S LOS ANGELES ST, 90011      1
251 3/4 E 29TH ST, 90011          1
103 W 39TH ST, 90037              1
3708 S MAPLE AVE, 90011           1
123 E 32ND ST, 90011              1
Name: address, Length: 3396, dtype: int64
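
For the "repeated addresses" question, the `value_counts` output can be reduced to just the addresses that appear more than once, plus the share of all reports they account for. A sketch on invented addresses:

```python
import pandas as pd

# Toy reports; the addresses are made up.
reports = pd.DataFrame({"address": ["1 A ST", "1 A ST", "1 A ST",
                                    "2 B AVE",
                                    "3 C BLVD", "3 C BLVD"]})

# Number of reports per address, most frequent first.
per_address = reports["address"].value_counts()

# Addresses reported more than once.
repeated = per_address[per_address > 1]

# Share of all reports that come from a repeated address.
share_repeated = repeated.sum() / len(reports)

print(repeated)
print(round(share_repeated, 3))
```

The threshold of "more than once" is arbitrary; with counts like the ones above, a higher cutoff may be more informative.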

In [12]:
from ipyleaflet import FullScreenControl

In [20]:
imagery = basemap_to_tiles(basemaps.Esri.WorldImagery)
imagery.base = True
osm = basemap_to_tiles(basemaps.OpenStreetMap.Mapnik)
osm.base = True


map_display = Map(center=(34.05, -118.25), zoom=11,
                  layers=[imagery, osm],
                  layout=Layout(height="900px"),
                  scroll_wheel_zoom=True)

#map_display.add_control(LayersControl())
#map_display += nc_layer

map_display.add_control(FullScreenControl())
map_display

Map(center=[34.05, -118.25], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zoom…

Refer to: https://www.youtube.com/watch?v=wjzAy_yLrdA

In [21]:
from ipyleaflet import Choropleth, Map
from branca.colormap import linear
a_geojson = json.loads(graffiti_merged.to_json())

graffiti_density = dict(zip(graffiti_merged['name'].tolist(), graffiti_merged['count'].tolist()))
for i in a_geojson['features']:
    i['id'] = i['properties']['name']

layer = Choropleth(
    geo_data=a_geojson,
    choro_data=graffiti_density,
    colormap=linear.YlOrRd_09,  # alternative: linear.Blues_05
    style={'fillOpacity': 1.0, 'color': 'black'},
    # key_on="name"
)

map_display.add_layer(layer)
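
As I understand it, `Choropleth` looks up each feature's value in `choro_data` by the feature's `id` (the default `key_on`), which is why the loop above overwrites `id` with the NC name. Since the counts came from a left merge, an NC with no reports would end up with a null count. The sketch below shows one way to build the lookup straight from the GeoJSON and default missing counts to 0. The tiny feature collection is invented, and whether 0 is the right fill value is an assumption.

```python
import json
import math

# Tiny stand-in for the output of GeoDataFrame.to_json(); values are invented.
geojson_text = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "id": "0", "geometry": null,
   "properties": {"name": "NC One", "count": 12.0}},
  {"type": "Feature", "id": "1", "geometry": null,
   "properties": {"name": "NC Two", "count": null}}
]}
"""
collection = json.loads(geojson_text)


def is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


choro_data = {}
for feature in collection["features"]:
    name = feature["properties"]["name"]
    count = feature["properties"]["count"]
    feature["id"] = name  # key the feature by NC name
    choro_data[name] = 0 if is_missing(count) else count

print(choro_data)
```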

I need to revisit a tooltip-style popup.  For now this will work.

In [22]:
geo_json = GeoJSON(
    data=a_geojson,
    style={
        'opacity': 1, 'dashArray': '9', 'fillOpacity': 0.6, 'weight': 1
    },
    hover_style={
        'color': 'white', 'dashArray': '0', 'fillOpacity': 0.5
    },
    name='NCs'
)

html = HTML('''Hover over a district''')
html.layout.margin = '0px 20px 20px 20px'
control = WidgetControl(widget=html, position='bottomright')

def update_html(feature, **kwargs):
    html.value = '''<h3><b>NC: {}</b></h3>
                    <h4>Count: {}</h4>'''.format(feature['properties']['name'],
                                                 feature['properties']['count'])
    
map_display.add_control(control)  # does += work for this?

layer.on_hover(update_html)
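
The hover text itself can be built and tested without a map. This sketch keeps the same `(feature, **kwargs)` callback shape as `update_html` but returns the string instead of setting a widget; `hover_text` is a hypothetical helper, and it assumes a missing count arrives as `None` (as it would after `json.loads` on a null).

```python
from html import escape


def hover_text(feature, **kwargs):
    """Build the HTML shown when hovering over one NC feature."""
    props = feature.get("properties", {})
    # Escape the name so characters like '&' render correctly in the widget.
    name = escape(str(props.get("name", "unknown")))
    count = props.get("count")
    count_text = "no data" if count is None else f"{int(count):,}"
    return f"<h3><b>NC: {name}</b></h3><h4>Count: {count_text}</h4>"


print(hover_text({"properties": {"name": "A & B", "count": 26836.0}}))
print(hover_text({"properties": {"name": "C", "count": None}}))
```

Inside the notebook, the callback would then reduce to assigning `hover_text(feature)` to `html.value`.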

# 6 - So What?

I say this tongue in cheek.  Things to think about:

  1. Should we examine measures besides total population?
  2. Does it make sense to extend the 311 data as we did with the service regions?
  3. Do we just use this to select an NC then query 311 (or ...)?
  
