Predicting Trailhead Parking Fullness at JeffCo Open Space Trailheads

Introduction

I decided to build a model to predict parking lot percent of capacity (i.e., the number of spots taken divided by the total number of spots) at Jefferson County Open Space trailheads in Colorado. This would be useful to:

Hikers: When is the best time to go for a hike? Will there be parking available?
Open Space managers: How should we plan/allocate resources among the parks?

Data

LotSpot Parking Data

The data was shared by Lot Spot, which JeffCo Open Space has contracted with since August 2019 to monitor parking at seven of their popular trailheads:

East Mount Falcon
West Mount Falcon
East Three Sisters
West Three Sisters
East White Ranch
Lair o' the Bear
Mount Galbraith

You can see real-time parking availability for these parks with LotSpot's mobile app. A camera located at the entrance to the parking lot detects when a vehicle enters or exits the lot. The raw data is not evenly spaced (a data point is recorded whenever a car enters or exits a lot), so it was resampled to a regularly spaced time series (1-hour intervals) for analysis and modeling. The data spans 2019-08-30 to 2020-05-06.
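The resampling step could look roughly like the pandas sketch below. The column names (`timestamp`, `spots_taken`), the sample values, and the lot capacity are hypothetical, and averaging within each hour and then forward-filling empty hours is only one reasonable choice, not necessarily the one used in this project.

```python
import pandas as pd

# Hypothetical raw events: one row each time a car enters or exits the lot.
raw = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2019-08-30 08:05",
                "2019-08-30 08:40",
                "2019-08-30 09:15",
                "2019-08-30 11:50",
            ]
        ),
        "spots_taken": [10, 12, 20, 15],
    }
).set_index("timestamp")

TOTAL_SPOTS = 50  # assumed lot capacity, for illustration only

# Average the occupancy within each 1-hour bin, then carry the last known
# value forward through hours that had no enter/exit events.
hourly = raw["spots_taken"].resample("1h").mean().ffill()

# Convert to percent of capacity (0-100).
percent_capacity = 100 * hourly / TOTAL_SPOTS
print(percent_capacity)
```

The forward-fill reflects the idea that the number of parked cars does not change while no vehicle passes the camera.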

Given the time constraints, I chose to first focus on a single park: East Mount Falcon. It is one of my personal favorites, has very few data gaps, and, as I know from experience, can reach capacity.

Weather Data

Powered by Dark Sky

Historical weather data was obtained from the Dark Sky API for the time period of the LotSpot observations. The API takes a location (lat/lon) and returns both daily and hourly observations for the date requested. The data contains many fields; for the purpose of this analysis I was interested in the following:

Temperature
UV Index
Cloud Cover
Precipitation Intensity
Wind Gust

EDA

Timeseries of total visitors (cars) per day

There is a slight seasonal pattern, but less than I expected.
Note the general increase after March 2020. This is likely related to Covid-19, though I can't be sure.

Hourly Pattern

Daily Pattern

Significant difference between weekdays and weekends

Weather Timeseries

Percent Capacity Vs. Weather

Modeling

Target

The target I am trying to predict is the hourly percent of capacity of the parking lot (i.e., 0-100%).

Features

Day of week: Converted into Is-Weekend binary category.
Temperature
UV Index
Cloud Cover
Precipitation Intensity
Hour of day: Converted into dummy variables.
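A minimal sketch of how these features could be assembled with pandas. The DataFrame, its column names, and the constant weather values are hypothetical placeholders, not the project's actual data.

```python
import pandas as pd

# Hypothetical hourly frame: 48 hours starting on a Friday morning.
idx = pd.date_range("2019-09-06 06:00", periods=48, freq="1h")
df = pd.DataFrame(
    {
        "temperature": 60.0,
        "uv_index": 3,
        "cloud_cover": 0.2,
        "precip_intensity": 0.0,
    },
    index=idx,
)

# Day of week -> binary is-weekend flag (Monday=0, ..., Sunday=6).
df["is_weekend"] = (df.index.dayofweek >= 5).astype(int)

# Hour of day -> one 0/1 dummy column per hour present in the data.
df["hour"] = df.index.hour
hour_dummies = pd.get_dummies(df["hour"], prefix="hour", dtype=int)

# Final feature matrix: weather columns, weekend flag, and hour dummies.
X = pd.concat([df.drop(columns="hour"), hour_dummies], axis=1)
print(X.shape)
```

Dummy variables let the model treat each hour as its own category instead of assuming that fullness changes smoothly with the hour number.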

Train/test split

Use only pre-Covid-19 data (before March 1, 2020)
Use only hours from 6am to 8pm
80/20 train/test split
Train and tune the model on the training data, then evaluate on the test set.
Measure performance by R^2 and RMSE
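The filtering and splitting steps above might be sketched as follows. The data here is synthetic, the column names are hypothetical, and two details are assumptions: that the 8pm hour is included, and that the 80/20 split is random (via scikit-learn's `train_test_split`) rather than chronological.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic hourly data covering February and March 2020.
rng = np.random.default_rng(0)
idx = pd.date_range("2020-02-01", "2020-03-31 23:00", freq="1h")
data = pd.DataFrame(
    {
        "temperature": rng.normal(45, 10, len(idx)),
        "percent_capacity": rng.uniform(0, 100, len(idx)),
    },
    index=idx,
)

# Keep only pre-Covid-19 observations (before March 1, 2020).
pre_covid = data[data.index < "2020-03-01"]

# Keep only hours 6am to 8pm (treating the 8pm hour as inclusive).
hours = pre_covid.index.hour
daytime = pre_covid[(hours >= 6) & (hours <= 20)]

X = daytime[["temperature"]]
y = daytime["percent_capacity"]

# Random 80/20 split; random_state fixes the shuffle for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(len(X_train), len(X_test))
```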

Baseline Model: Predict the mean

Test-set RMSE : 28.9
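For reference, a predict-the-mean baseline and the two metrics can be computed as in this small sketch. The target values are made up for illustration and are not the project's data.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up percent-of-capacity values, for illustration only.
y_train = np.array([10.0, 30.0, 50.0, 70.0])
y_test = np.array([20.0, 60.0])

# Baseline: predict the training-set mean for every test observation.
baseline_pred = np.full_like(y_test, y_train.mean())

# RMSE is in the same units as the target (percentage points of capacity).
rmse = np.sqrt(mean_squared_error(y_test, baseline_pred))
r2 = r2_score(y_test, baseline_pred)
print(rmse, r2)
```

A model is only useful if its test-set RMSE is clearly below this baseline value.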

Random Forest

Default Parameters

The default model appears to be overfitting the training data: the train-set R^2 is much higher than the test-set R^2.

Performance:

Train-set R^2 : 0.95
Test-set R^2 : 0.54
Test-set RMSE : 19.5
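The train/test gap can be checked with a few lines of scikit-learn. This sketch uses synthetic, deliberately noisy data (not the parking data) to show how a random forest with default parameters tends to score far better on the data it was trained on.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: one informative feature, two irrelevant ones, heavy noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 3))
y = 50 * X[:, 0] + rng.normal(0, 15, 300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Default parameters: trees are grown until leaves are (nearly) pure.
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

train_r2 = r2_score(y_train, model.predict(X_train))
test_r2 = r2_score(y_test, model.predict(X_test))
print(f"train R^2: {train_r2:.2f}, test R^2: {test_r2:.2f}")
```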

Tuned w/ GridSearchCV

Performance:

Train-set R^2 : 0.75
Test-set R^2 : 0.64
Test-set RMSE : 17.2

Best Parameters:

n_estimators : 100
max_depth : 10
max_features : 'log2'
min_samples_split : 10

Feature Importance

Weather variables and day of week are the most important features.
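Importances like these can be read from a fitted scikit-learn random forest through its `feature_importances_` attribute. The sketch below uses synthetic data with hypothetical feature names, including a pure-noise column that should rank last.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: two informative features and one irrelevant column.
rng = np.random.default_rng(0)
X = pd.DataFrame(
    {
        "temperature": rng.uniform(20, 80, 300),
        "is_weekend": rng.integers(0, 2, 300),
        "random_noise": rng.normal(size=300),
    }
)
y = 0.8 * X["temperature"] + 20 * X["is_weekend"] + rng.normal(0, 5, 300)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances; they are normalized to sum to 1.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)
```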

Partial Dependence Plot

Temperature shows a generally positive dependence. Note the breaks around freezing and around 55 degrees. What happens at higher temps?
Cloud cover shows a weaker negative trend, and there appears to be a larger negative shift at values > 0.5.
Precipitation Intensity shows a slight negative trend, but a lot weaker than I expected.
UV Index: Big break at a value of ~2.
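To make the idea behind these plots concrete, here is a minimal hand-rolled sketch of one-dimensional partial dependence: fix one feature at a grid value for every row, predict, and average. The data is synthetic, and `partial_dependence_1d` is a hypothetical helper written for this example; scikit-learn's `sklearn.inspection` module provides ready-made tools for the same purpose.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: column 0 plays the role of temperature, column 1 cloud cover.
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(20, 80, 300), rng.uniform(0, 1, 300)])
y = 0.8 * X[:, 0] - 10 * X[:, 1] + rng.normal(0, 3, 300)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)


def partial_dependence_1d(model, X, feature, grid):
    """Average prediction when column `feature` is forced to each grid value."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite the feature for every row
        averages.append(model.predict(X_mod).mean())
    return np.array(averages)


temp_grid = np.linspace(25, 75, 6)
pd_temp = partial_dependence_1d(model, X, feature=0, grid=temp_grid)
print(np.round(pd_temp, 1))
```

Because a random forest is piecewise constant, curves computed this way can show steps or breaks like the ones noted above.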

Results/Conclusions

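A random forest model predicts hourly parking lot % capacity with a test-set R^2 of 0.64 and RMSE of 17.2, compared with an RMSE of 28.9 for the predict-the-mean baseline.
Weather is really important: weather variables and day of week were the most important features.
More data is needed to observe all seasons and weather conditions, and to isolate Covid-19 effects.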

Next Steps

Test different models and add/engineer more features (snow storms, holidays, etc.).
Apply to different parks.
Also predict the number of visitors, the probability of the lot being full, or waiting times.
Compare observed vs. forecasted weather?

Credits/Acknowledgments

Thanks to Hunter Berge and Connor McCormick at Lot Spot for sharing their data.
Thanks to the Galvanize team (Frank, Kayla, Mike, Travis) and capstone group for feedback and support.

andypicke/JeffCo-OpenSpace-LotSpot-Analysis
