# Notebook 3: Individual Parks

The next step in our process was to scale our analysis down to the park level and examine the CES services offered by each park. Since our methodology closely follows the Hale (2019) article, we needed to hand-code a select set of the tags into Hale's predefined CES buckets. To do this, we coded the top 200 tags across all parks to get a sense of the types of services offered. The top 100 tags did not yield enough codeable tags, so we expanded the selection for a more robust sample.

Tags themselves are coded into one of the following categories (we did not cross-code, for parsimony, though we recognize that future analysis would yield more comprehensive results if tags could be cross-coded): existence, recreation, social relations, aesthetics, spiritual, knowledge systems, inspiration, cultural heritage, education, sense of place, and cultural diversity.
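Because each tag is meant to sit in exactly one category, a quick consistency check on a category map can be useful. The sketch below is illustrative only: `toy_category_map` and `find_cross_coded` are hypothetical names, and the abbreviated map is made up for the example rather than taken from our coding.

```python
from collections import defaultdict

# Hypothetical, abbreviated category map used only for illustration.
toy_category_map = {
    "Existence": ["lake", "trees", "sky"],
    "Aesthetics": ["sunset", "sky", "landscape"],
    "Recreation": ["hiking", "dog"],
}

def find_cross_coded(category_map):
    """Return {tag: sorted categories} for tags listed under more than one category."""
    seen = defaultdict(set)
    for category, tags in category_map.items():
        for tag in tags:
            seen[tag].add(category)
    return {tag: sorted(cats) for tag, cats in seen.items() if len(cats) > 1}

cross_coded = find_cross_coded(toy_category_map)
print(cross_coded)
```

Any tag returned by a check like this would be counted once per category it appears in, so an empty result is what a strictly single-coded map should produce.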

We coded parks that had at least 50 unique tags associated with that specific park. The coded parks are: MacArthur, Franklin Canyon, Hancock, Rio de Los Angeles, Runyon Canyon, Coldwater Canyon, Cheviot Hills, and Angels Gate. We coded tags that fit into each of the above categories, but did not code tags that did not equate to a CES (i.e., nonsensical tags, vague tags, or tags for unidentifiable built infrastructure).
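The 50-unique-tag threshold can be sketched roughly as follows. This is a minimal illustration, assuming a table with a `parkname` column and a `tags` column holding lists of tags; the toy data, the function name `parks_meeting_threshold`, and the lowered threshold in the usage line are all assumptions made for the example.

```python
import pandas as pd

# Toy data standing in for the photo table; real tag lists are much longer.
toy = pd.DataFrame({
    "parkname": ["parka", "parka", "parkb"],
    "tags": [["lake", "trees"], ["lake", "birds"], ["dog"]],
})

def parks_meeting_threshold(df, min_unique_tags=50):
    """Return the names of parks with at least `min_unique_tags` distinct tags."""
    # One row per (photo, tag), then count distinct tags within each park
    unique_counts = df.explode("tags").groupby("parkname")["tags"].nunique()
    return sorted(unique_counts[unique_counts >= min_unique_tags].index)

# Threshold lowered to 3 so the toy data yields a result
selected = parks_meeting_threshold(toy, min_unique_tags=3)
print(selected)
```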

In [1]:
#importing parks csv
import pandas as pd




In [2]:
#import parks shapefile
import geopandas as gpd
parks_shape = gpd.read_file('/Users/jacquelineadams/Documents/GitHub/LaParks_NLP_5/LaParks_NLP/ces_laparks.shp')


#import shapefile as shp
#/Users/morganrogers/Documents/GitHub/LaParks_NLP/ces_laparks.shp
# opening the vector map
#shp_path = "/Users/jacquelineadams/Documents/GitHub/LaParks_NLP_5/ces_laparks.shp"
#assert os.path.exists(shp_path), "Input file does not exist."
# reading the shape file by using reader function of the shape lib
#sf = shp.Reader(shp_path)

parks_shape.head(10)

Unnamed: 0,ACCESS_TYP,PARK_NAME,ACRES,Area_HA,geometry
0,Open Access,Cheviot Hills Park and Recreation Center,34.169,13.827699,"POLYGON ((146843.465 -439567.433, 147202.288 -..."
1,Open Access,Franklin Canyon Park,594.314,199.616109,"MULTIPOLYGON (((146712.011 -433272.368, 146723..."
2,Open Access,Angels Gate Park,70.474,28.520115,"POLYGON ((158289.475 -476568.553, 158303.385 -..."
3,Open Access,Coldwater Canyon Park,41.807,16.918901,"POLYGON ((147572.303 -430391.873, 147566.046 -..."
4,Open Access,Hancock Park,23.098,9.34742,"POLYGON ((151869.126 -437840.075, 151846.907 -..."
5,Open Access,MacArthur Park,31.716,12.834977,"POLYGON ((159231.294 -438119.410, 159070.409 -..."
6,Open Access,Runyon Canyon Park,133.203,53.905646,"MULTIPOLYGON (((152208.876 -431794.069, 152175..."
7,Open Access,Rio de Los Angeles State Park State Recreation...,54.853,22.198228,"MULTIPOLYGON (((163040.145 -433874.729, 162822..."


In [None]:
import nltk
import re
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.corpus import stopwords

# Strip non-letters from the NLTK stopwords so they match the cleaned tags (e.g. "don't" -> "dont")
swords = [re.sub(r"[^A-Za-z\s]", "", sword) for sword in stopwords.words('english')]
# Add place and park names, which identify the location but carry no CES information
swords += ['losangeles', 'la', 'losangelesca', 'ca', 'macarthur', 'macarthurpark', 'woodley', 'riodelosangeles', 'runyoncanyon', 
           'temescalgateway', 'heidelbergpark', 'hancockpark', 'franklincanyonpark', 'angelsgate', 
           'coldwatercanyon', 'chatsworthparksouth','cheviothills', 'california', 'usa', 'southerncalifornia', 'park', 'parklabrea', 
          'unitedstates', 'america']

def clean_string(text):
    # remove everything except ASCII letters and whitespace
    # (the range A-z would also keep the characters [ \ ] ^ _ and `)
    text = re.sub(r"[^A-Za-z\s]", "", text)
    
    # lowercase, tokenize, and drop stopwords; returns a list of words
    cleaned_list_of_words = [word for word in word_tokenize(text.lower()) if word not in swords]
    
    return cleaned_list_of_words

#calling the function to only apply to the tags column 
parks_data['tags'] = parks_data['tags'].apply(clean_string)
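One detail of the punctuation-stripping step is worth illustrating: in a regular expression, the range `A-z` covers every ASCII character between `A` and `z`, which includes `[`, `\`, `]`, `^`, `_`, and the backtick, whereas `A-Za-z` covers letters only. The short sketch below compares the two on a made-up string; the sample text and variable names are assumptions for illustration.

```python
import re

loose = re.compile(r"[^A-z\s]")      # A-z also covers [ \ ] ^ _ and `
strict = re.compile(r"[^A-Za-z\s]")  # letters and whitespace only

sample = "runyon_canyon [hike] 2019!"
loose_result = loose.sub("", sample)    # keeps the underscore and the brackets
strict_result = strict.sub("", sample)  # keeps letters and spaces only

print(repr(loose_result))
print(repr(strict_result))
```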


In [None]:
parks_data
parks_data.parkname.unique()

In [None]:
category_map = { 'Existence': ['westlake', 'lake','palmtrees','palms','elks','parkplaza','birds','palmtree', 'santamonicamountains','franklincanyonlake','losangelesmountains','mayberrylake','myerslake','grass','lake','trees','ducks','water','evergreens','frog','woods', 'ice', 'fluids','iceblocks','blocksofice','wallofice','harbor', 'sky', 'weather','tree','cloudy','garden','parks','lariver','losangelesriver','grass','mountains','canyons','hills','mountains','hill','horse','tujungawashgreenway','tujungawash','pacificocean','sky','weather','tree','cloudy'], 
                'Recreation': ['music','bikes','lilihaydn','bicycles','violin','ciclovia','loslobos','event', 'rustythedog','canine','chihuahuamix','mutt','dog','pet','weeksfordogs','urbanhiking', 'costumes', 'costume', 'cosplay', 'boomerang', 'lighthouse','westerncup','sports', 'quidditch', 'dog', 'puggle', 'puppy', 'referee', 'boat', 'nikon', 'nikond','gardentour','fish','campout','hiking','hike','sunriserunyon','sunriseinrunyon','sunriseinrunyoncanyonpark','trail','observatory','run','jog','summerolympics','westerncup','sports','quidditch','nikon','nikond','dog','puggle','puppy','referee','boat'], 
                'Social Relations': ['ciclavia','rally','protest','keepfamliestogethor','tamale','asada','alpastor','march','carnitas','eltaurino','burrito','thegreattacohunt','lasantacon','people','tacos','food','harrypotter', 'wand', 'geeks', 'geek','gardenparty','people','picnik','walkathon','earthday','zurbulon','harrypotter','wand'],
                'Aesthetics': ['colorful','green', 'landscape','textures','texturemaps','texturemap','texture','sunset','sky','skyline','clouds','sun','weather','sunrise','pacificocean','panorama','color','overlook','sunset','viewpoint','scenicoverview','mulhollandscenicoverview','brown','barbaraafineoverlook','lasunset','green','landscape'],
                'Spiritual': ['signs','sanity','nature','neature','harborinterfaith','outside','church','littlebrownchurchinthevalley'], 
                'Inspiration':['art','portraitsofhope','publicart', 'artonthewaterfront', 'artist', 'sculpture', 'printmaking', 'portrait', 'contemporaryart', 'painting', 'polaroid', 'draw', 'photographer', 'studioartist', 'prints','mural'],
                'Cultural Heritage': ['landmark','monument', 'curlettandbeelman','fortmacarthur','warreinactment', 'agcc', 'angelsgateculturalcenter', 'openstudios', 'allankaprow', 'gallerya', 'artgallery', 'gallery', 'artexhibition', 'slobodandimitrov', 'culturalcenter', 'exhibition', 'downstairsgallery', 'installation', 'hillarybradfield', 'festival','parlance','sculpture','art','statue','treasuresoflosangelesarchitecture','losangelesstatehistoricpark','midcenturymodernhomes','charliechaplin','parlance'],
                'Sense of Place': ['neighborhood','community','eccideasclub', 'sanpedro', 'neighborhood'], 
                'Cultural Diversity': ['mexican','lengua','march','immigration','czechart','lapride','westhollywoodpride','lagaypride','westhollywoodgaypride','losangelespride','losangelesgaypride','pride','gaypride'],
                'Knowledge Systems': ['historyofsanpedropunk', 'belleepoque','lahistory','californiahistory'],
                'Education': ['portoflosangeles', 'port','portofla','marshallastor','berth','georgecpagemuseum','museums','losangelescountymuseumofart','iceage','pleistocene','skeletons','skulls','pit','tarpits','labreatarpits','labrea','museum','pagemuseum','fossils','bones','paleontology','lacma','animalsmammoths','excavation','sabretooth','tigers','giantgroundsloths','gettyhouse','tar','sabretoothtigers','olympusem','skeleton','fossil','mammoth','mastodon','environmentaljustice','urbanparkmovement','losangelespubliclibrary'
                             ]}

# Import the csv of frequencies for each park under consideration
fns = ['top_tags_angelsgate.csv','top_tags_cheviothills.csv','top_tags_coldwatercanyon.csv',
       'top_tags_franklincanyonpark.csv','top_tags_hancockpark.csv','top_tags_macarthur.csv',
       'top_tags_riodelosangeles.csv','top_tags_runyoncanyon.csv']
parks_frequency = []
for fn in fns: 
    parks_frequency.append(pd.read_csv(fn))

#print(parks_frequency)

# Create a function to loop over the categories and sum the words associated with each category  
def getCategories(frequencyDf):
    
    category_frequencies = dict.fromkeys(category_map,0)
    
    for index, row in frequencyDf.iterrows():
        #print(row['tags'],row['value'])
        for category_name in category_map:
            wordlist = category_map[category_name]
            if row['tags'] in wordlist:
                category_frequencies[category_name]+=row['value']
            
    return category_frequencies
    
# Create a list of dictionaries with the frequencies by category for each park
cat_frequencies = []
for park in parks_frequency:
    cat_frequencies.append(getCategories(park))

print(cat_frequencies)
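A list of per-park dictionaries like `cat_frequencies` can be easier to compare once it is arranged as a table and converted to shares. The sketch below shows one possible way to do that; the park labels and counts are invented for illustration, and `park_labels`, `counts`, and `shares` are hypothetical names.

```python
import pandas as pd

# Hypothetical park labels and counts, for illustration only.
park_labels = ["parka", "parkb"]
toy_cat_frequencies = [
    {"Existence": 6, "Recreation": 3, "Education": 1},
    {"Existence": 0, "Recreation": 4, "Education": 4},
]

# One row per park, one column per CES category
counts = pd.DataFrame(toy_cat_frequencies, index=park_labels)

# Divide each row by its total so parks with different photo volumes are comparable
shares = counts.div(counts.sum(axis=1), axis=0)
print(shares)
```

A park with no coded tags at all would have a row total of zero and produce NaN shares, so such rows would need separate handling.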


### Rio de Los Angeles

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
riodelosangeles = parks_data['parkname']=='riodelosangeles'
riodelosangeles.head()

In [None]:
riodelosangeles_tags = parks_data[riodelosangeles]
print(riodelosangeles_tags)
riodelosangeles_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = riodelosangeles_tags[cols].explode('tags', ignore_index=True)

In [None]:
# give every tag occurrence a count of 1 so the counts can be summed per tag
tag_park['value'] = 1

# return the 100 most used tags, sorted by count (summing only the numeric 'value' column)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# add a geometry column built from each photo's coordinates
# (assumes parks_data carries longitude and latitude columns)
import geopandas as gpd
riodelosangelesgdf = gpd.GeoDataFrame(
    riodelosangeles_tags,
    geometry=gpd.points_from_xy(riodelosangeles_tags.longitude, riodelosangeles_tags.latitude, crs='EPSG:4326'))


In [None]:
# Map the photo locations associated with Rio de Los Angeles
import contextily as ctx
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(5,10))
riodelosangelesgdf.to_crs('EPSG:3857').plot(color='r', ax=ax) # remember, 3857 is Web Mercator
# let's add a basemap using the contextily library
ctx.add_basemap(ax, zoom=12)
# and we really don't need the axis ticks and labels, so we set them to an empty list
ax.set_xticks([])
ax.set_yticks([])

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_riodelosangeles.csv', index=True)
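The filter, explode, count, and top-100 steps above are repeated for every park, so they could be wrapped in a single helper. The following is a sketch under assumptions rather than the code we ran: `top_tags_for_park` is a hypothetical function, it uses `value_counts` in place of the explicit `value` column, and the toy table stands in for `parks_data`.

```python
import pandas as pd

def top_tags_for_park(df, parkname, n=100):
    """Count tag occurrences for one park and return the n most frequent."""
    park_rows = df[df["parkname"] == parkname]
    counts = (
        park_rows["tags"]
        .explode()        # one row per tag occurrence
        .dropna()         # photos with an empty tag list explode to NaN
        .value_counts()   # sorted from most to least frequent
        .head(n)
    )
    # Return a two-column table: 'tags' and 'value'
    return counts.rename_axis("tags").reset_index(name="value")

# Toy table standing in for parks_data
toy = pd.DataFrame({
    "parkname": ["parka", "parka", "parka", "parkb"],
    "tags": [["lake", "trees"], ["lake"], ["lake", "trees", "dog"], ["museum"]],
})

top = top_tags_for_park(toy, "parka", n=2)
print(top)
```

Writing the result with `to_csv(..., index=False)` would give a file with `tags` and `value` columns, which are the two columns `getCategories` reads.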

### Runyon Canyon

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
runyoncanyon = parks_data['parkname']=='runyoncanyon'
runyoncanyon.head()

In [None]:
runyoncanyon_tags = parks_data[runyoncanyon]
print(runyoncanyon_tags)
runyoncanyon_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = runyoncanyon_tags[cols].explode('tags', ignore_index=True)

In [None]:
# give every tag occurrence a count of 1 so the counts can be summed per tag
tag_park['value'] = 1

# return the 100 most used tags, sorted by count (summing only the numeric 'value' column)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_runyoncanyon.csv', index=True)

### Temescal Gateway

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
temescalgateway = parks_data['parkname']=='temescalgateway'
temescalgateway.head()

In [None]:
temescalgateway_tags = parks_data[temescalgateway]
print(temescalgateway_tags)
temescalgateway_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = temescalgateway_tags[cols].explode('tags', ignore_index=True)

In [None]:
# give every tag occurrence a count of 1 so the counts can be summed per tag
tag_park['value'] = 1

# return the 100 most used tags, sorted by count (summing only the numeric 'value' column)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_temescalgateway.csv', index=True)

### Heidelberg Park

### Hancock Park

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
hancockpark = parks_data['parkname']=='hancockpark'
hancockpark.head()

In [None]:
hancockpark_tags = parks_data[hancockpark]
print(hancockpark_tags)
hancockpark_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = hancockpark_tags[cols].explode('tags', ignore_index=True)

In [None]:
# give every tag occurrence a count of 1 so the counts can be summed per tag
tag_park['value'] = 1

# return the 100 most used tags, sorted by count (summing only the numeric 'value' column)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_hancockpark.csv', index=True)

### Franklin Canyon Park

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
franklincanyonpark = parks_data['parkname']=='franklincanyonpark'
franklincanyonpark.head()

In [None]:
franklincanyonpark_tags = parks_data[franklincanyonpark]
print(franklincanyonpark_tags)
franklincanyonpark_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = franklincanyonpark_tags[cols].explode('tags', ignore_index=True)

In [None]:
# give every tag occurrence a count of 1 so the counts can be summed per tag
tag_park['value'] = 1

# return the 100 most used tags, sorted by count (summing only the numeric 'value' column)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_franklincanyonpark.csv', index=True)

### Angels Gate

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
angelsgate = parks_data['parkname']=='angelsgate'
angelsgate.head()

In [None]:
angelsgate_tags = parks_data[angelsgate]
print(angelsgate_tags)
angelsgate_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = angelsgate_tags[cols].explode('tags', ignore_index=True)

In [None]:
# give every tag occurrence a count of 1 so the counts can be summed per tag
tag_park['value'] = 1

# return the 100 most used tags, sorted by count (summing only the numeric 'value' column)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_angelsgate.csv', index=True)

### Coldwater Canyon

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
coldwatercanyon = parks_data['parkname']=='coldwatercanyon'
coldwatercanyon.head()

In [None]:
coldwatercanyon_tags = parks_data[coldwatercanyon]
print(coldwatercanyon_tags)
coldwatercanyon_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = coldwatercanyon_tags[cols].explode('tags', ignore_index=True)

In [None]:
# give every tag occurrence a count of 1 so the counts can be summed per tag
tag_park['value'] = 1

# return the 100 most used tags, sorted by count (summing only the numeric 'value' column)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_coldwatercanyon.csv', index=True)

### Cheviot Hills

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
cheviothills = parks_data['parkname']=='cheviothills'
cheviothills.head()

In [None]:
cheviothills_tags = parks_data[cheviothills]
print(cheviothills_tags)
cheviothills_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = cheviothills_tags[cols].explode('tags', ignore_index=True)

In [None]:
# give every tag occurrence a count of 1 so the counts can be summed per tag
tag_park['value'] = 1

# return the 100 most used tags, sorted by count (summing only the numeric 'value' column)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_cheviothills.csv', index=True)

### Chatsworth Park South