Urban Data Science & Smart Cities <br>
URSP688Y <br>
Instructor: Chester Harvey <br>
Urban Studies & Planning <br>
National Center for Smart Growth <br>
University of Maryland

[<img src="https://colab.research.google.com/assets/colab-badge.svg">](https://colab.research.google.com/github/ncsg/ursp688y_sp2024/blob/main/exercises/exercise07/exercise07.ipynb)

# Exercise 7

## Problem

In week 7, you learned how to extend tabular data with geospatial information: points, linestrings, and polygons.

For this next exercise, please ask a planning-related question with a spatial component, then find data and apply any data science methods you have learned so far (or can Google!) to answer that question.

## Data

You are welcome to use any data you would like, including data used in previous demos and exercises.

## A Few Pointers
- Choose a straightforward question that requires a reasonable amount of data! Don't shoot for the moon. This exercise is intended to give you a chance to practice finding and analyzing spatial data, not to address the world's greatest challenges.
- Consider using this exercise to get a head start on your final project or explore options for it. Your project doesn't need to focus on spatial analysis for it to play a role. Are there datasets you might join together based on spatial locations?
- Don't go overboard. If you're hitting a wall with coding, write pseudocode and turn that in. Don't let the perfect be the enemy of the done. But if you're energized and having fun chasing down a thorny solution to a coding problem, by all means feel free to keep at it!



In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [17]:
import pandas as pd
import os
import geopandas as gpd
import numpy as np

In [18]:
os.chdir('/content/drive/MyDrive/Colab Notebooks/Exercise07_Bardsley')

In [None]:
#This is a test of a distance decay function that I can eventually use
#on my final project. Since I haven't yet cleaned that data,
#I created a test dataset just to see if the function works.

In [19]:
#Test points standing in for the origin polygons (strip malls)
#X is longitude and Y is latitude
data_origin = {
    'X':[-77.019,-76.995,-77.038],
    'Y':[38.932,38.858,38.932]
}

In [20]:
#Test points standing in for the destination polygons (residences)
#X is longitude and Y is latitude
data_dest = {
    'X':[-76.988,-77.016,-76.972,-77.034,-76.975],
    'Y':[38.895,38.974,38.851,38.924,38.87]
}

In [21]:
#change data into dataframe
df_origin = pd.DataFrame(data_origin)

In [22]:
df_dest = pd.DataFrame(data_dest)

In [23]:
#add the name column to the origin data
df_origin["Name"] = ['A','B','C']

In [24]:
#setting up the geometry for a geodataframe
geoms_origin = gpd.points_from_xy(df_origin['X'],df_origin['Y'], crs=4326)

In [25]:
geoms_dest = gpd.points_from_xy(df_dest['X'],df_dest['Y'], crs=4326)

In [26]:
#creating geodataframes
gdf_origin = gpd.GeoDataFrame(df_origin, geometry=geoms_origin, crs=4326)

In [27]:
gdf_dest = gpd.GeoDataFrame(df_dest, geometry=geoms_dest, crs=4326)

In [None]:
gdf_dest.head()

In [None]:
gdf_origin.head()

In [28]:
#changing to a projected CRS that allows distance measurement
#(EPSG:26918 is NAD83 / UTM zone 18N, with units in meters)
#(probably could have done this in the first place)
gdf_origin = gdf_origin.to_crs(epsg=26918)

In [29]:
gdf_dest = gdf_dest.to_crs(epsg=26918)
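
As a rough cross-check on the projected distances, the great-circle distance between two of the test points can be estimated straight from their longitude and latitude. This is only a sketch and not part of the original analysis: the helper name `haversine_km` is hypothetical, and it assumes a spherical Earth with a mean radius of 6371 km, so its result should be close to, but not identical to, the UTM distance.

```python
import math

def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Approximate great-circle distance in kilometers (spherical Earth assumed)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

# Origin 'A' and the fourth destination point from the test data above
d_km = haversine_km(-77.019, 38.932, -77.034, 38.924)
print(round(d_km, 2))
```

If the projected distance for the same pair differs from this value by more than a small fraction, that would suggest a problem with the CRS setup.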

In [None]:
#A function to measure the distance from each origin to every
#destination and create a new column with the weighted "customer" count.
def dist_decay(origins, dest):
  #a list to contain the weighted sum for each origin
  customers = []
  for origin in origins['geometry']:
    #distances from this origin to every destination, converted
    #from meters (the CRS units) to kilometers
    dist_list = [point.distance(origin) / 1000 for point in dest['geometry']]
    #locations 2 kilometers away or less will have the value 1
    #while others will decay as 1/(distance - 1)
    dist_list_w = [1 if num <= 2 else 1 / (num - 1) for num in dist_list]
    #summing the weighted values for this origin
    customers.append(sum(dist_list_w))
  #creating the new column once, after the loop; the list is in
  #the same order as the rows of the origins table
  origins['Customers'] = customers
  #returns the original table with the new column
  return origins
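
The weighting rule inside the function can also be checked on its own, without any geometry. The sketch below is illustrative only: `decay_weight` and the sample distances are hypothetical and are not taken from the test dataset. It uses the same rule as above, a weight of 1 up to 2 km and 1/(distance - 1) beyond that, which keeps the weight continuous at the 2 km boundary.

```python
def decay_weight(dist_km):
    """Weight of 1 within 2 km, then 1 / (distance - 1) beyond it."""
    if dist_km <= 2:
        return 1.0
    return 1.0 / (dist_km - 1)

# Hypothetical distances in kilometers, chosen to straddle the 2 km cutoff
sample_km = [0.5, 2.0, 3.0, 5.0, 11.0]
weights = [decay_weight(d) for d in sample_km]
total = sum(weights)
print(weights, total)
```

Testing the rule separately like this makes it easier to try a different decay shape later without touching the distance code.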

In [30]:
dist_decay(gdf_origin,gdf_dest)

NameError: name 'dist_decay' is not defined