In [2]:
from notebook.services.config import ConfigManager
cm = ConfigManager()
cm.update('livereveal', {
        'scroll': True,
})

{'scroll': True}

In [2]:

# Import libraries
import pandas as pd
import numpy as np
from datetime import datetime
import time

from IPython.display import HTML
import warnings
warnings.filterwarnings('ignore')

from arcgis.gis import GIS
import arcgis
import arcgis.network as network
from arcgis.features import Feature, FeatureSet
from arcgis.geoenrichment import enrich
from arcgis.features.manage_data import dissolve_boundaries
from arcgis.geometry import distance
from arcgis.geocoding import Geocoder, get_geocoders, geocode

gis = GIS(username="pgaddiso_UCSDOnline8")

events_fl = gis.content.get("c84fe2e023b54ecf82de782b1e765c68").layers[0]
events_sdf = events_fl.query(out_sr='3857').sdf

service_area_url = gis.properties.helperServices.serviceArea.url
sa_layer = network.ServiceAreaLayer(service_area_url, gis=gis)
travel_modes = sa_layer.retrieve_travel_modes()
car_mode = [t for t in travel_modes['supportedTravelModes'] if t['name'] == 'Driving Time'][0]

sref_3857 = {'latestWkid': 3857, 'wkid': 102100}

freeways = gis.content.get("5cdc4f0e9c47499aa67be8d6e0bf6091")
freeways_fl = freeways.layers[0]
freeways_sdf = freeways_fl.query().sdf

features_sdf = pd.read_csv("events-features1.csv")



# On the Impact of Events on Traffic Flow

**Presented by**

- Enrique Sanchez
- Parker Addison

## What's the project?

*How does the context in which a major event is held affect the traffic conditions surrounding the event?*

**Features:**
- Time and date
- Location
- Estimated # of attendees
- Event type

**Study:**
- Traffic impact ← (yes, that's pretty vague)

## Why's it important?

- Planning the optimal time/location of a future event
- Predicting the impact of an established event with greater attendance


- City can plan ahead for traffic
- City can establish ordinances for maximum traffic impact of events
- Event planners can minimize traffic for their attendees
- Event planners can weigh other factors, such as minimizing traffic through high-crime areas

## Original Plan

- Wanted to look at historical traffic on a street-segment basis
- Represent intersections as nodes, segments as edges


- Create 'baseline' traffic conditions (e.g. avg flow rate) conditional on day of week, hour of day
- Create 'traffic deltas' during events → these are our y-values


- Fix a graph structure (physically unchanging), then feed in node properties to predict edge properties
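
The baseline and delta steps in this plan could be sketched as follows, had segment-level traffic data been available. This is a minimal illustration using only the standard library; the record layout `(segment_id, datetime, flow)` and the names `build_baseline` and `traffic_delta` are assumptions made for the example, not part of the project.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def build_baseline(observations):
    """Average flow per (segment, day of week, hour of day).

    `observations` is an iterable of (segment_id, datetime, flow) tuples;
    this layout is hypothetical.
    """
    buckets = defaultdict(list)
    for segment_id, when, flow in observations:
        buckets[(segment_id, when.weekday(), when.hour)].append(flow)
    return {key: mean(flows) for key, flows in buckets.items()}


def traffic_delta(baseline, segment_id, when, observed_flow):
    """Observed flow minus the baseline for the same segment, weekday, and hour."""
    return observed_flow - baseline[(segment_id, when.weekday(), when.hour)]


# Made-up numbers purely for illustration: two ordinary Wednesdays at 11:00.
history = [
    ("seg-1", datetime(2018, 7, 25, 11), 100.0),
    ("seg-1", datetime(2018, 8, 1, 11), 120.0),
]
baseline = build_baseline(history)

# Flow observed during an event on the following Wednesday at 11:00.
delta = traffic_delta(baseline, "seg-1", datetime(2018, 8, 8, 11), 150.0)
print(delta)
```

Keying the baseline on weekday and hour keeps the comparison like-for-like, so a delta reflects the event rather than ordinary daily or weekly rhythm.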

## Some issues...

- Couldn't get access to historical traffic data
- Couldn't find an ML model suited to work on a fixed graph structure

## Some solutions...

- Look at historical *Service Areas*
    - Less granular, but still takes into account each street behind the scenes
- Compare 'baseline areas' with 'impacted areas'

## More problems...

- Service area calculations only return baselines (typical traffic) for times more than 12 hours away

# A Plan Emerges

Set up the framework to allow for analysis if we can someday get historical service areas

**Features:** Provided
- Location
- Expected number of attendees
- Event type

**Features:** Engineered/Enriched
- Baseline service area; travel to or from event; conditional on date, day, and time
- Number of vehicles in baseline service area
- Attendance-to-Vehicle ratio
- Distance from event to highway

**Study:**

- Proportion change to service area, $\frac{\text{Impacted Area} - \text{Baseline Area}}{\text{Baseline Area}}$


- ...we're not able to get impacted area :'(

## Why these features?

These features were chosen based on previous research from various sources.

- Events closer to highways will have greater traffic flow through a single route/exit
    - This seems contrary to expectations; may depend on event size; would have loved to examine this
- The attendance-vehicle ratio gives a sense of how many cars may be entering the area vs. how many the area can handle
- The larger the baseline service area, the easier it is for people to navigate despite increased traffic

## Moving forward

We got an events dataset from DataSD (official) and a highways layer from the community; service areas will be calculated using ArcGIS.

In [3]:
m1 = gis.map("San Diego", zoomlevel=11)
m1.add_layer(events_fl)
m1

MapView(layout=Layout(height='400px', width='100%'), zoom=11.0)

## A coded example of feature generation

In [4]:
def calc_service_area(event, baseline=True):
    """
    Helper Function.
    
    Calculates the 5 minute service area for an event
    
    """

    date = event['date'].split('-')
    
    if baseline:
        # Predicting far into the past/future will generate a service area based on typical traffic speeds.
        # Note that we want these baselines to be conditional on day and time.  We're not sure if Esri
        # takes the date into account as well as the day of week, so we use the fact that calendars
        # exactly repeat day-date combinations every 28 years, leap year or not (this holds as long as
        # the span doesn't cross a non-leap century year such as 2100).
        #
        # Source:
        # https://www.answers.com/Q/How_often_in_years_do_calendars_repeat_with_the_same_day-date_combinations
        
        date[0] = str(int(date[0]) + 28)
        
    # Make sure that the time is still the same as the event!
    # Note: only the hour of the start time is used; minutes are dropped.
    start_time = event['start'][:2]

    time = datetime(int(date[0]), int(date[1]), int(date[2]), int(start_time)).timestamp() * 1000
    location = str(event['longitude']) + ', ' + str(event['latitude'])
    
    service_area = sa_layer.solve_service_area(facilities=location, default_breaks=[5], travel_mode=car_mode,
                                               travel_direction='esriNATravelDirectionToFacility',
                                               time_of_day = time, time_of_day_is_utc=False,
                                               out_sr={'latestWkid': 3857, 'wkid': 102100})
    
    # This can be easily changed to work with end_time and TravelDirectionFromFacility to measure
    # traffic impact from people leaving the event!
    
    return service_area


def area_service_area(service_area):
    """
    Helper Function.
    
    Calculates the area of a service area in square meters.
    
    """
    
    return service_area['saPolygons']['features'][0]['attributes']['Shape_Area']
    
    
def num_vehicles(service_area):
    """
    Helper Function.
    
    Calculates the number of vehicles in a service area
    
    """
    
    study_area = arcgis.geometry.Geometry(
        service_area['saPolygons']['features'][0]['geometry'], spatialReference=sref_3857
    )
    vehicles = enrich(study_areas=[study_area], data_collections=['AutomobilesAutomotiveProducts']) 
    num_vehicles = (vehicles.MP01002h_B + 2*vehicles.MP01003h_B + 3*vehicles.MP01004h_B)[0]
    
    return num_vehicles


def dist_to_highway(event):
    """
    Helper Function.
    
    Calculates the distance of an event to the highway.
    
    """
    
    distance_to_highway = distance(geometry1= event['SHAPE'], 
                               geometry2=freeways_sdf['SHAPE'].loc[0], 
                               spatial_ref={'latestWkid': 3857, 'wkid': 102100}, 
                               geodesic=True)['distance']
    
    return distance_to_highway


def generate_features(event):
    """
    Generates the following features for a particular event:
    
    1. The area of the 5 minute baseline service area
    2. The number of vehicles in the service area
    3. The attendance-to-vehicle ratio
    4. The distance of the event to the nearest highway
    
    Outputs a one-row dataframe of four columns containing these values.
    
    """

    service_area = calc_service_area(event, baseline=True)
    area = area_service_area(service_area)
    vehicles = num_vehicles(service_area)
    distance = dist_to_highway(event)
    av_ratio = event["total_atte"] / vehicles
    
    features = pd.DataFrame({'service_area':[area], 'num_vehicles':[vehicles], 'av_ratio':[av_ratio], 'dist_to_highway':[distance]})
    
    return features
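
The 28-year shift used in `calc_service_area` can be sanity-checked with the standard library alone. The sketch below is not part of the original notebook; `same_weekday_after_28_years` is a hypothetical helper name. It confirms that the weekday is preserved for the first event's date and shows the one case where the rule fails.

```python
from datetime import date


def same_weekday_after_28_years(year, month, day):
    """True if the date 28 years later falls on the same day of the week."""
    return date(year, month, day).weekday() == date(year + 28, month, day).weekday()


# The first event in the dataset is dated 2018-08-08.
print(same_weekday_after_28_years(2018, 8, 8))

# The rule breaks when the 28-year span crosses 2100, which is not a leap year.
print(same_weekday_after_28_years(2090, 1, 1))
```

A 28-year span normally contains exactly seven leap days, giving 28 × 365 + 7 = 10,227 days, which is a whole number of weeks. Crossing 2100 drops one leap day and breaks the alignment.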

In [5]:
event0 = events_sdf.loc[0]
event0

FID                                                           1
SHAPE         {'x': -13042312.189404054, 'y': 3857633.435881...
date                                                 2018-08-08
end_                                                   14:00:00
id                                                        49813
latitude                                                32.7157
longitude                                              -117.161
start                                                  11:00:00
title         curbside bites food truck markets - downtown l...
total_atte                                                  330
type                                                    farmers
Name: 0, dtype: object
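
The `date` and `start` fields in this row are what `calc_service_area` turns into an epoch-millisecond timestamp. That function keeps only the hour (`event['start'][:2]`), so a start time such as 18:30 would be treated as 18:00. The following is a minimal sketch of a parse that keeps the minutes; `event_start` and `to_epoch_ms` are hypothetical names, and treating the wall-clock time as UTC is an assumption made only to keep the example deterministic.

```python
from datetime import datetime, timezone


def event_start(event, years_ahead=0):
    """Parse an event's `date` and `start` fields into a naive datetime.

    `years_ahead` shifts the year (e.g. 28 for a baseline). The shift is a
    plain year replacement, so it would fail for Feb 29 if the target year
    is not a leap year.
    """
    start = datetime.strptime(f"{event['date']} {event['start']}", "%Y-%m-%d %H:%M:%S")
    return start.replace(year=start.year + years_ahead)


def to_epoch_ms(moment):
    """Epoch milliseconds, treating the naive wall-clock time as UTC.

    Assumption: UTC is used here for determinism; the notebook's own code
    relies on the machine's local time zone instead.
    """
    return int(moment.replace(tzinfo=timezone.utc).timestamp() * 1000)


event = {"date": "2018-08-08", "start": "11:00:00"}  # fields from the row above
baseline_start = event_start(event, years_ahead=28)
print(baseline_start, to_epoch_ms(baseline_start))
```

Because the function only indexes by field name, it should work the same way on a plain dict or on a pandas row such as `event0`.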

In [6]:
event0_sa = arcgis.geometry.Geometry(
    calc_service_area(event0, baseline=True)['saPolygons']['features'][0]['geometry'],
    spatialReference=sref_3857
)
m = gis.map("San Diego", zoomlevel=10)
m.draw(event0_sa)
m.add_layer(freeways)
events_sdf.iloc[[0]].spatial.plot(map_widget=m)
m

MapView(layout=Layout(height='400px', width='100%'), zoom=10.0)

In [7]:
generate_features(event0)

| | service_area | num_vehicles | av_ratio | dist_to_highway |
|---|---|---|---|---|
| 0 | 4516798.0 | 15931 | 0.020714 | 684.266175 |
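
`dist_to_highway` measures against `freeways_sdf['SHAPE'].loc[0]`, the first feature in the freeways layer. If that layer held several separate polylines (an assumption; it may well be a single dissolved geometry), the nearest-highway distance would be the minimum over all of them. The sketch below illustrates that idea with plain planar geometry in projected coordinates. It is not the geodesic calculation that ArcGIS performs, and `point_to_segment` and `nearest_polyline_distance` are hypothetical names.

```python
import math


def point_to_segment(p, a, b):
    """Planar distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_polyline_distance(point, polylines):
    """Smallest distance from `point` to any segment of any polyline."""
    return min(
        point_to_segment(point, a, b)
        for line in polylines
        for a, b in zip(line, line[1:])
    )


# Made-up coordinates in a projected (meter-based) system.
highways = [
    [(0.0, 0.0), (1000.0, 0.0)],       # east-west road along y = 0
    [(500.0, 300.0), (500.0, 900.0)],  # north-south road along x = 500
]
print(nearest_polyline_distance((450.0, 400.0), highways))
```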


We've generated features for over a third of our dataset... but managed to burn through 2000+ credits while doing so.

The results are already becoming clear though!

All we need to build a model now is historical service areas!

```
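# Sketch only: `model` stands for any regressor, and baseline=False would need
# historical service areas, which we don't have access to yet.
X = pd.concat([generate_features(e) for _, e in events_sdf.iterrows()], ignore_index=True)

impacted_areas = pd.Series([
    area_service_area(calc_service_area(e, baseline=False))
    for _, e in events_sdf.iterrows()
])
y = (impacted_areas - X.service_area) / X.service_area

model.fit(X, y)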
```

## A "real" example

There's an event tonight: the *DSC 170 Final Project Presentation Party*. It is located at *32.877651, -117.237256*, is of type *'exhibit'*, starts at *6:30 PM*, and has an expected attendance of *4,000* (!!!).

Calculating the service area at the start of this event, we find an impacted service area of *15,000,000* square meters.

In [8]:
dsc170 = pd.DataFrame(columns=["latitude", "longitude", "date", "start", "title", "total_atte", "type"],
                     data=[[32.877651, -117.237256, "2019-06-06", "18:30:00", "dsc 170 final presentation party", 4000, "exhibit"]]
)
dsc170_sdf = pd.DataFrame.spatial.from_xy(dsc170, "longitude", "latitude", sr=4326)
dsc170_sdf["SHAPE"] = arcgis.geometry.project([dsc170_sdf.SHAPE[0]], in_sr=4326, out_sr=3857)
dsc170_sdf.loc[0]

latitude                                                32.8777
longitude                                              -117.237
date                                                 2019-06-06
start                                                  18:30:00
title                          dsc 170 final presentation party
total_atte                                                 4000
type                                                    exhibit
SHAPE         {'x': -13050791.639920656, 'y': 3879075.417441...
Name: 0, dtype: object

In [9]:
X = (
    pd.concat([dsc170_sdf, generate_features(dsc170_sdf.loc[0])], axis=1)
    [["total_atte", "type", "service_area", "num_vehicles", "av_ratio", "dist_to_highway"]]
)
X

| | total_atte | type | service_area | num_vehicles | av_ratio | dist_to_highway |
|---|---|---|---|---|---|---|
| 0 | 4000 | exhibit | 17135530.0 | 15473 | 0.258515 | 788.020208 |


In [10]:
y = (15_000_000 - X.service_area[0]) / X.service_area[0]
y

-0.12462563567923099

## We're done

Although we don't have access to historical traffic data, our project built the framework on which analysis could be conducted.

- We underwent the cleaning process for events and service-area data
- We researched and selected features that are strong potential predictors of traffic impact
- We generated features, using geo-analysis and geo-enrichment, which could be used to train a machine learning model
- We had a great time!

# Thank you