# Notebook 3: Individual Parks

The next step in our process was to scale the analysis down to the park level and examine the CES offered by each park. Since our methodology closely follows the Hale (2019) article, we needed to hand-code a select set of the tags into Hale's predefined CES buckets. To do this, we coded the top 200 tags across all parks to get a sense of the types of services offered. We started with the top 100 tags, but they did not yield enough codeable tags, so we expanded the selection for a more robust sample.

Each tag is coded into one of the following categories (we did not cross-code, for parsimony, though we recognize that future analysis would yield more comprehensive results if tags could be cross-coded): existence, recreation, social relations, aesthetics, spiritual, knowledge systems, inspiration, cultural heritage, education, sense of place, and cultural diversity.

We coded parks that had at least 50 unique tags associated with that specific park. The coded parks are: MacArthur, Franklin Canyon, Hancock, Rio de Los Angeles, Runyon Canyon, Coldwater Canyon, Cheviot Hills, and Angels Gate. We coded tags that fit into each of the above categories, but did not code tags that did not equate to a CES (i.e., nonsensical tags, vague tags, or tags for unidentifiable built infrastructure).
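Since each tag is meant to sit in exactly one category, a hand-built mapping can be checked for tags that ended up listed under more than one bucket. The following is a minimal sketch, assuming the mapping is a dict of category name to tag list; the helper `find_cross_coded` and the small `example_map` are hypothetical and only for illustration.

```python
from collections import defaultdict

def find_cross_coded(category_map):
    """Return {tag: sorted list of categories} for tags listed under 2+ categories."""
    seen = defaultdict(set)
    for category, tags in category_map.items():
        for tag in tags:
            seen[tag].add(category)
    return {tag: sorted(cats) for tag, cats in seen.items() if len(cats) > 1}

# hypothetical toy mapping, not the project's full coding
example_map = {
    'Existence': ['lake', 'sky', 'trees'],
    'Aesthetics': ['sunset', 'sky'],
    'Recreation': ['hiking'],
}

cross_coded = find_cross_coded(example_map)
print(cross_coded)
```

Any tag this returns would be counted toward every category it is listed under when frequencies are summed, so the result is worth reviewing before interpreting the totals.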

# clean version

1. Get CES frequencies by park

In [1]:
import pandas as pd

category_map = { 'Existence': ['westlake', 'lake','palmtrees','palms','elks','parkplaza','birds','palmtree', 'santamonicamountains','franklincanyonlake','losangelesmountains','mayberrylake','myerslake','grass','lake','trees','ducks','water','evergreens','frog','woods', 'ice', 'fluids','iceblocks','blocksofice','wallofice','harbor', 'sky', 'weather','tree','cloudy','garden','parks','lariver','losangelesriver','grass','mountains','canyons','hills','mountains','hill','horse','tujungawashgreenway','tujungawash','pacificocean','sky','weather','tree','cloudy'], 
                'Recreation': ['music','bikes','lilihaydn','bicycles','violin','ciclovia','loslobos','event', 'rustythedog','canine','chihuahuamix','mutt','dog','pet','weeksfordogs','urbanhiking', 'costumes', 'costume', 'cosplay', 'boomerang', 'lighthouse','westerncup','sports', 'quidditch', 'dog', 'puggle', 'puppy', 'referee', 'boat', 'nikon', 'nikond','gardentour','fish','campout','hiking','hike','sunriserunyon','sunriseinrunyon','sunriseinrunyoncanyonpark','trail','observatory','run','jog','summerolympics','westerncup','sports','quidditch','nikon','nikond','dog','puggle','puppy','referee','boat'], 
                'Social Relations': ['ciclavia','rally','protest','keepfamliestogethor','tamale','asada','alpastor','march','carnitas','eltaurino','burrito','thegreattacohunt','lasantacon','people','tacos','food','harrypotter', 'wand', 'geeks', 'geek','gardenparty','people','picnik','walkathon','earthday','zurbulon','harrypotter','wand'],
                'Aesthetics': ['colorful','green', 'landscape','textures','texturemaps','texturemap','texture','sunset','sky','skyline','clouds','sun','weather','sunrise','pacificocean','panorama','color','overlook','sunset','viewpoint','scenicoverview','mulhollandscenicoverview','brown','barbaraafineoverlook','lasunset','green','landscape'],
                'Spiritual': ['signs','sanity','nature','neature','harborinterfaith','outside','church','littlebrownchurchinthevalley'], 
                'Inspiration':['art','portraitsofhope','publicart', 'artonthewaterfront', 'artist', 'sculpture', 'printmaking', 'portrait', 'contemporaryart', 'painting', 'polaroid', 'draw', 'photographer', 'studioartist', 'prints','mural'],
                'Cultural Heritage': ['landmark','monument', 'curlettandbeelman','fortmacarthur','warreinactment', 'agcc', 'angelsgateculturalcenter', 'openstudios', 'allankaprow', 'gallerya', 'artgallery', 'gallery', 'artexhibition', 'slobodandimitrov', 'culturalcenter', 'exhibition', 'downstairsgallery', 'installation', 'hillarybradfield', 'festival','parlance','sculpture','art','statue','treasuresoflosangelesarchitecture','losangelesstatehistoricpark','midcenturymodernhomes','charliechaplin','parlance'],
                'Sense of Place': ['neighborhood','community','eccideasclub', 'sanpedro', 'neighborhood'], 
                'Cultural Diversity': ['mexican','lengua','march','immigration','czechart','lapride','westhollywoodpride','lagaypride','westhollywoodgaypride','losangelespride','losangelesgaypride','pride','gaypride'],
                'Knowledge Systems': ['historyofsanpedropunk', 'belleepoque','lahistory','californiahistory'],
                'Education': ['portoflosangeles', 'port','portofla','marshallastor','berth','georgecpagemuseum','museums','losangelescountymuseumofart','iceage','pleistocene','skeletons','skulls','pit','tarpits','labreatarpits','labrea','museum','pagemuseum','fossils','bones','paleontology','lacma','animalsmammoths','excavation','sabretooth','tigers','giantgroundsloths','gettyhouse','tar','sabretoothtigers','olympusem','skeleton','fossil','mammoth','mastodon','environmentaljustice','urbanparkmovement','losangelespubliclibrary'
                             ]}

# Import the csv of frequencies for each park under consideration
fns = ['top_tags_angelsgate.csv','top_tags_cheviothills.csv','top_tags_coldwatercanyon.csv',
       'top_tags_franklincanyonpark.csv','top_tags_hancockpark.csv','top_tags_macarthur.csv',
       'top_tags_riodelosangeles.csv','top_tags_runyoncanyon.csv']
parks_frequency = []
for fn in fns: 
    parks_frequency.append(pd.read_csv(fn))

#print(parks_frequency)

# Create a function to loop over the categories and sum the words associated with each category  
def getCategories(frequencyDf):
    
    category_frequencies = dict.fromkeys(category_map,0)
    
    for index, row in frequencyDf.iterrows():
        #print(row['tags'],row['value'])
        for category_name in category_map:
            wordlist = category_map[category_name]
            if row['tags'] in wordlist:
                category_frequencies[category_name]+=row['value']
            
    return category_frequencies
    
# Create a list of dictionaries with the frequencies by category for each park
cat_frequencies = []
for park in parks_frequency:
    cat_frequencies.append(getCategories(park))

print(cat_frequencies)

[{'Existence': 344, 'Recreation': 51, 'Social Relations': 0, 'Aesthetics': 0, 'Spiritual': 14, 'Inspiration': 517, 'Cultural Heritage': 1698, 'Sense of Place': 63, 'Cultural Diversity': 13, 'Knowledge Systems': 30, 'Education': 119}, {'Existence': 4, 'Recreation': 73, 'Social Relations': 40, 'Aesthetics': 5, 'Spiritual': 0, 'Inspiration': 0, 'Cultural Heritage': 13, 'Sense of Place': 2, 'Cultural Diversity': 0, 'Knowledge Systems': 0, 'Education': 0}, {'Existence': 37, 'Recreation': 1, 'Social Relations': 2, 'Aesthetics': 9, 'Spiritual': 2, 'Inspiration': 16, 'Cultural Heritage': 2, 'Sense of Place': 0, 'Cultural Diversity': 0, 'Knowledge Systems': 31, 'Education': 1}, {'Existence': 56, 'Recreation': 44, 'Social Relations': 0, 'Aesthetics': 1, 'Spiritual': 11, 'Inspiration': 4, 'Cultural Heritage': 0, 'Sense of Place': 0, 'Cultural Diversity': 0, 'Knowledge Systems': 0, 'Education': 0}, {'Existence': 39, 'Recreation': 30, 'Social Relations': 197, 'Aesthetics': 0, 'Spiritual': 0, 'Inspi
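The row-by-row loop in `getCategories` can also be written as a merge followed by a group-by, which may be easier to audit. This is only a sketch under the assumption that each frequency table has a `tags` column and a `value` column, as in the cell above; `get_categories_vectorized` and the toy inputs are hypothetical names and data.

```python
import pandas as pd

def get_categories_vectorized(frequency_df, category_map):
    """Sum 'value' per category for the tags found in category_map."""
    # long table with one row per unique (category, tag) pair
    pairs = pd.DataFrame(
        [(cat, tag) for cat, tags in category_map.items() for tag in set(tags)],
        columns=['category', 'tags'],
    )
    # keep only the tags that were coded, then total their counts by category
    merged = pairs.merge(frequency_df, on='tags', how='inner')
    totals = merged.groupby('category')['value'].sum()
    # reindex so categories with no matching tags appear with a count of 0
    return totals.reindex(list(category_map), fill_value=0).to_dict()

# hypothetical toy inputs, not the project's data
toy_map = {'Existence': ['lake', 'trees', 'lake'], 'Recreation': ['hike'], 'Spiritual': ['church']}
toy_freq = pd.DataFrame({'tags': ['lake', 'hike', 'trees', 'zzz'], 'value': [5, 3, 2, 9]})

toy_totals = get_categories_vectorized(toy_freq, toy_map)
print(toy_totals)
```

As in the loop version, a tag repeated within one category's list is counted once, and a tag listed under two categories contributes to both.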

2. Convert CES frequency list into a dataframe

In [2]:
parksdf = pd.DataFrame(cat_frequencies)
print(parksdf)

   Existence  Recreation  Social Relations  Aesthetics  Spiritual  \
0        344          51                 0           0         14   
1          4          73                40           5          0   
2         37           1                 2           9          2   
3         56          44                 0           1         11   
4         39          30               197           0          0   
5        468         187               415          25         39   
6         14           6                13           0          1   
7         82         169                 0         178          8   

   Inspiration  Cultural Heritage  Sense of Place  Cultural Diversity  \
0          517               1698              63                  13   
1            0                 13               2                   0   
2           16                  2               0                   0   
3            4                  0               0                   0   
4           7

3. Create a list of the park names as they appear in the shape file in the order of the list
4. Make this into a dataframe

In [3]:
parknames = ['Angels Gate Park','Cheviot Hills Park and Recreation Center','Coldwater Canyon Park','Franklin Canyon Park','Hancock Park','MacArthur Park','Rio de Los Angeles State Park State Recreation Area','Runyon Canyon Park']

parknamesDf = pd.DataFrame(parknames)
parknamesDf['PARK_NAME']=parknames
parknamesDf

Unnamed: 0,0,PARK_NAME
0,Angels Gate Park,Angels Gate Park
1,Cheviot Hills Park and Recreation Center,Cheviot Hills Park and Recreation Center
2,Coldwater Canyon Park,Coldwater Canyon Park
3,Franklin Canyon Park,Franklin Canyon Park
4,Hancock Park,Hancock Park
5,MacArthur Park,MacArthur Park
6,Rio de Los Angeles State Park State Recreation...,Rio de Los Angeles State Park State Recreation...
7,Runyon Canyon Park,Runyon Canyon Park


5. Join the dataframes so the CES frequency dataframe has the associated park name

In [4]:
parksjoinDf = parksdf.join(parknamesDf, how = 'left')
parksjoinDf

Unnamed: 0,Existence,Recreation,Social Relations,Aesthetics,Spiritual,Inspiration,Cultural Heritage,Sense of Place,Cultural Diversity,Knowledge Systems,Education,0,PARK_NAME
0,344,51,0,0,14,517,1698,63,13,30,119,Angels Gate Park,Angels Gate Park
1,4,73,40,5,0,0,13,2,0,0,0,Cheviot Hills Park and Recreation Center,Cheviot Hills Park and Recreation Center
2,37,1,2,9,2,16,2,0,0,31,1,Coldwater Canyon Park,Coldwater Canyon Park
3,56,44,0,1,11,4,0,0,0,0,0,Franklin Canyon Park,Franklin Canyon Park
4,39,30,197,0,0,76,154,0,1168,0,3988,Hancock Park,Hancock Park
5,468,187,415,25,39,76,169,112,85,0,0,MacArthur Park,MacArthur Park
6,14,6,13,0,1,0,6,0,0,0,12,Rio de Los Angeles State Park State Recreation...,Rio de Los Angeles State Park State Recreation...
7,82,169,0,178,8,0,0,0,0,0,0,Runyon Canyon Park,Runyon Canyon Park


6. Create a list with the top CES for each park and add it as a column

In [5]:
Top_CES = ["Recreation","Existence","Cultural_Heritage","Existence","Education","Social Relations","Aesthetics","Existence"]
parksjoinDf['Top_CES']=Top_CES
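A hand-typed list has to be in the same order as the rows of the dataframe it is assigned to. One way to cross-check it is to derive the dominant category from the counts themselves with `DataFrame.idxmax`. The sketch below uses a hypothetical toy table rather than the project's counts.

```python
import pandas as pd

# hypothetical counts for three parks, not the project's data
toy_counts = pd.DataFrame(
    {'Existence': [10, 2, 0], 'Recreation': [4, 7, 1], 'Education': [1, 7, 9]},
    index=['Park A', 'Park B', 'Park C'],
)

# idxmax(axis=1) returns, for each row, the label of the column holding the largest value
top_ces = toy_counts.idxmax(axis=1)
print(top_ces)
```

When two categories tie, `idxmax` reports the first one in column order, so ties deserve a manual look.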

7. Import shape file

In [6]:
# Import necessary modules
import geopandas as gpd

# Set filepath (fix path relative to yours)
#fp = '/Users/jacquelineadams/Documents/GitHub/LaParks_NLP_6/ces_laparks.shp"

shapefile = gpd.read_file('/Users/jacquelineadams/Documents/GitHub/LaParks_NLP_6/ces_laparks.shp')


# Read file using gpd.read_file()
#parks_sf = gpd.read_file(parks_shape)
#type(parks_sf)

8. Join parksjoinDf to the shape file geodataframe (shapefile)

In [15]:
park_shapes = shapefile.merge(parksjoinDf, on='PARK_NAME')

park_shapes.to_csv('park_shapes.csv') 

<class 'geopandas.geodataframe.GeoDataFrame'>


CPLE_AppDefinedError: b'sqlite3_exec(CREATE TABLE gpkg_extensions (table_name TEXT,column_name TEXT,extension_name TEXT NOT NULL,definition TEXT NOT NULL,scope TEXT NOT NULL,CONSTRAINT ge_tce UNIQUE (table_name, column_name, extension_name))) failed: attempt to write a readonly database'

Exception ignored in: 'fiona._shim.gdal_flush_cache'
Traceback (most recent call last):
  File "fiona/_err.pyx", line 201, in fiona._err.GDALErrCtxManager.__exit__
fiona._err.CPLE_AppDefinedError: b'sqlite3_exec(CREATE TABLE gpkg_extensions (table_name TEXT,column_name TEXT,extension_name TEXT NOT NULL,definition TEXT NOT NULL,scope TEXT NOT NULL,CONSTRAINT ge_tce UNIQUE (table_name, column_name, extension_name))) failed: attempt to write a readonly database'
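Because the merge in step 8 matches on the exact text of `PARK_NAME`, a small spelling difference silently drops a park. A quick check is an outer merge with `indicator=True`, which labels each row by where it was found. This sketch uses plain pandas frames as hypothetical stand-ins for the shape file attribute table and the CES table.

```python
import pandas as pd

# hypothetical stand-ins for the shape file attributes and the CES table
shapes = pd.DataFrame({'PARK_NAME': ['Hancock Park', 'MacArthur Park', 'Echo Park']})
ces = pd.DataFrame({'PARK_NAME': ['Hancock Park', 'Macarthur Park'],
                    'Top_CES': ['Education', 'Existence']})

# indicator=True adds a '_merge' column: 'both', 'left_only', or 'right_only'
check = shapes.merge(ces, on='PARK_NAME', how='outer', indicator=True)
unmatched = check[check['_merge'] != 'both']
print(unmatched[['PARK_NAME', '_merge']])
```

Rows marked `right_only` are coded parks whose names did not match any shape, which is the case to fix before mapping.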


In [None]:
import contextily as cx
import matplotlib.pyplot as plt

f, ax = plt.subplots(1,1,figsize=(20,20))
# drop rows without a geometry, then draw the park polygons
park_shapes.dropna(subset=['geometry'], axis=0).plot(ax=ax, cmap='plasma')
ax.set_facecolor('k')

In [None]:
# Tags on map
import contextily as ctx
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(20,20))
park_shapes.to_crs('EPSG:3857').plot(color='r', ax=ax) # remember, 3857 is Web Mercator

# let's add a basemap using the contextily library
ctx.add_basemap(ax, zoom=12)

# and we really don't need the axis ticks and labels, so we set them to an empty list
ax.set_xticks([])
ax.set_yticks([])

In [None]:
def plot_shape(id, s=None):
    """ PLOTS A SINGLE SHAPE """
    plt.figure()
    ax = plt.axes()
    ax.set_aspect('equal')
    shape_ex = sf.shape(id)
    x_lon = np.zeros((len(shape_ex.points),1))
    y_lat = np.zeros((len(shape_ex.points),1))
    for ip in range(len(shape_ex.points)):
        x_lon[ip] = shape_ex.points[ip][0]
        y_lat[ip] = shape_ex.points[ip][1]
    plt.plot(x_lon,y_lat) 
    x0 = np.mean(x_lon)
    y0 = np.mean(y_lat)
    plt.text(x0, y0, s, fontsize=10)
    # use bbox (bounding box) to set plot limits
    plt.xlim(shape_ex.bbox[0],shape_ex.bbox[2])
    return x0, y0

In [None]:
#import contextily as ctx
import matplotlib.pyplot as plt
%matplotlib inline

fig, ax = plt.subplots(1,1, figsize=(25,25))

# basemaps are typically in Web Mercator (projection 3857), so we need to reproject our dataframe to this
# alpha sets the transparency
park_shapes.to_crs('EPSG:3857').plot('Top_CES', cmap='plasma', legend=True, ax = ax, alpha=.75)


# let's add a basemap using the contextily library
# a bell and whistle added here is a different basemap, so that the park colors are not overwhelmed
# see here for examples: https://contextily.readthedocs.io/en/latest/providers_deepdive.html
ctx.add_basemap(ax, zoom=12, source=ctx.providers.Stamen.TonerLite)

# and we really don't need the axis ticks and labels, so we set them to an empty list
ax.set_xticks([])
ax.set_yticks([])

ax.set(title='Cultural Ecosystem Services by Park')

# save to a file
#fig.savefig('med_inc.png')

Looking at the map of dominant CES 

# Past attempts

In [None]:
#importing parks csv
import pandas as pd

# You might need to add a path as well
fn = 'parks_data.csv'
parks_data = pd.read_csv(fn)
parks_data.head()

In [None]:
import nltk
import re
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.corpus import stopwords

# A-Za-z keeps letters only; the range A-z would also keep [ \ ] ^ _ and `
swords = [re.sub(r"[^A-Za-z\s]", "", sword) for sword in stopwords.words('english')]
swords += ['losangeles', 'la', 'losangelesca', 'ca', 'macarthur', 'macarthurpark', 'woodley', 'riodelosangeles', 'runyoncanyon', 
           'temescalgateway', 'heidelbergpark', 'hancockpark', 'franklincanyonpark', 'franklincanyonpark', 'angelsgate', 
           'coldwatercanyon', 'chatsworthparksouth','cheviothills', 'california', 'usa', 'southerncalifornia', 'park', 'parklabrea', 
          'unitedstates', 'america']

def clean_string(text):
    # remove punctuation
    text = re.sub(r"[^A-Za-z\s]", "", text)
    
    cleaned_list_of_words = [word for word in word_tokenize(text.lower()) if word not in swords] #return a string or apply to all tags
    
    return cleaned_list_of_words

#calling the function to only apply to the tags column 
parks_data['tags'] = parks_data['tags'].apply(clean_string)
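
A note on the punctuation filter: in a regular expression the range `A-z` runs through the ASCII codes between `Z` and `a`, so it also matches `[`, `\`, `]`, `^`, `_` and the backtick, whereas `A-Za-z` matches letters only. The short comparison below uses a made-up sample string.

```python
import re

# made-up sample containing characters that fall between 'Z' and 'a' in ASCII
sample = "la_river^2 [sunset]"

loose = re.sub(r"[^A-z\s]", "", sample)      # keeps _, ^, [ and ]
strict = re.sub(r"[^A-Za-z\s]", "", sample)  # keeps letters and whitespace only

print(repr(loose))
print(repr(strict))
```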


In [None]:
parks_data
parks_data.parkname.unique()

# category map codes by park

In [None]:
category_CES = ['Existence','Recreation','Social Relations','Aesthetics', 'Knowledge Systems',
              'Inspiration', 'Cultural Heritage','Education', 'Sense of Place','Cultural Diversity','Spiritual']

In [None]:
# Franklin Canyon Park
category_map_franklincanyon = { 'Existence': ['santamonicamountains','franklincanyonlake','losangelesmountains','mayberrylake','myerslake','grass','lake','trees','ducks','water','evergreens','frog','woods'], 
                               'Recreation': ['rustythedog','canine','chihuahuamix','mutt','dog','pet','weeksfordogs','urbanhiking'],
                               'Social Relations':[],
                               'Aesthetics':['sky'], 
                               'Knowledge Systems':[],
                               'Inspiration':[], 
                               'Cultural Heritage':[],
                               'Education':[], 
                               'Sense of Place':[],
                               'Cultural Diversity':[],
                               'Spiritual':['nature','neature']}

In [None]:
# Hancock park
category_map_hancockpark = { 'Existence': ['garden','parks',], 
                               'Recreation': ['gardentour','fish'],
                               'Social Relations':['gardenparty','people','picnik'],
                               'Aesthetics':[], 
                               'Knowledge Systems':[],
                               'Inspiration':[], 
                               'Cultural Heritage':['sculpture','art','statue'],
                               'Education':['georgecpagemuseum','museums','losangelescountymuseumofart','iceage','pleistocene','skeletons',
                                            'skulls','pit','tarpits','labreatarpits','labrea','museum','pagemuseum','fossils','bones','paleontology',
                                           'lacma','animalsmammoths','excavation','sabretooth','tigers','giantgroundsloths','gettyhouse',
                                           'tar','sabretoothtigers','olympusem','skeleton','fossil','mammoth','mastodon'], 
                               'Sense of Place':[],
                               'Cultural Diversity':['lapride','westhollywoodpride','lagaypride','westhollywoodgaypride','losangelespride',
                                                     'losangelesgaypride','pride','gaypride'],
                               'Spiritual':['treasuresoflosangelesarchitecture']}

In [None]:
# Rio de Los Angeles Park
category_map_riodellosangeles = { 'Existence': ['lariver','losangelesriver','grass'], 
                               'Recreation': ['campout'],
                               'Social Relations':['walkathon','earthday'],
                               'Aesthetics':[], 
                               'Knowledge Systems':[],
                               'Inspiration':[], 
                               'Cultural Heritage':['losangelesstatehistoricpark'],
                               'Education':['environmentaljustice','urbanparkmovement'], 
                               'Sense of Place':[],
                               'Cultural Diversity':[],
                               'Spiritual':['outside']}
#note: I recategorized grass into existence instead of aesthetics....

In [None]:
# Runyon Canyon Park
category_map_runyon = { 'Existence': ['mountains','canyons','hills','mountains','hill','horse'], 
                               'Recreation': ['hiking','hike','sunriserunyon','sunriseinrunyon','sunriseinrunyoncanyonpark',
                                              'trail','observatory','run','jog'],
                               'Social Relations':[],
                               'Aesthetics':['textures','texturemaps','texturemap','texture','sunset','sky','skyline',
                                             'clouds','sun','weather','sunrise','pacificocean','panorama','color'], 
                               'Knowledge Systems':[],
                               'Inspiration':[], 
                               'Cultural Heritage':[],
                               'Education':[], 
                               'Sense of Place':[],
                               'Cultural Diversity':[],
                               'Spiritual':[]}

In [None]:
# Coldwater Canyon Park
# named for Coldwater Canyon so it does not overwrite the Runyon Canyon map above
category_map_coldwatercanyon = { 'Existence': ['tujungawashgreenway','tujungawash','pacificocean'], 
                               'Recreation': ['summerolympics'],
                               'Social Relations':['zurbulon'],
                               'Aesthetics':['overlook','sunset','viewpoint','scenicoverview','mulhollandscenicoverview',
                                             'brown','barbaraafineoverlook','lasunset','midcenturymodernhomes'], 
                               'Knowledge Systems':['lahistory','californiahistory'],
                               'Inspiration':['mural'], 
                               'Cultural Heritage':['charliechaplin'],
                               'Education':['losangelespubliclibrary'], 
                               'Sense of Place':[],
                               'Cultural Diversity':[],
                               'Spiritual':['church','littlebrownchurchinthevalley']}

In [None]:
# Cheviot Hills Park
category_map_cheviothills = { 'Existence': ['sky','weather','tree','cloudy'], 
                               'Recreation': ['westerncup','sports','quidditch','nikon','nikond','dog','puggle',
                                              'puppy','referee','boat'],
                               'Social Relations':['harrypotter','wand'],
                               'Aesthetics':['green','landscape'], 
                               'Knowledge Systems':[],
                               'Inspiration':[], 
                               'Cultural Heritage':['parlance'],
                               'Education':[], 
                               'Sense of Place':['geeks','geek','neighborhood'],
                               'Cultural Diversity':[],
                               'Spiritual':[]}
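
The per-park dictionaries above can be folded into a single mapping by taking, for each category, the union of the tags from every park. This is a minimal sketch of that step; `combine_category_maps` and the two toy park maps are hypothetical.

```python
def combine_category_maps(*park_maps):
    """Merge several {category: [tags]} dicts, keeping each tag once per category."""
    combined = {}
    for park_map in park_maps:
        for category, tags in park_map.items():
            bucket = combined.setdefault(category, [])
            for tag in tags:
                if tag not in bucket:  # skip tags already present in this category
                    bucket.append(tag)
    return combined

# hypothetical toy maps, not the project's coding
park_a = {'Existence': ['lake', 'trees'], 'Recreation': ['dog']}
park_b = {'Existence': ['trees', 'hills'], 'Aesthetics': ['sunset']}

combined_map = combine_category_maps(park_a, park_b)
print(combined_map)
```

Deduplicating within a category keeps the combined lists shorter, although a tag can still appear under different categories if two parks coded it differently.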

# importing shape file
https://www.earthdatascience.org/workshops/gis-open-source-python/intro-vector-data-python/

In [None]:
import os  # needed for the os.path.exists check below
import shapefile as shp
#/Users/morganrogers/Documents/GitHub/LaParks_NLP/ces_laparks.shp

# opening the vector map
shp_path = "/Users/morganrogers/Documents/GitHub/LaParks_NLP/ces_laparks.shp"
assert os.path.exists(shp_path), "Input file does not exist."

# reading the shape file by using reader function of the shape lib
sf = shp.Reader(shp_path)

# CES visualization 

In [None]:
category_map = { 'Existence': ['westlake', 'lake','palmtrees','palms','elks','parkplaza','birds','palmtree', 'santamonicamountains','franklincanyonlake','losangelesmountains','mayberrylake','myerslake','grass','lake','trees','ducks','water','evergreens','frog','woods', 'ice', 'fluids','iceblocks','blocksofice','wallofice','harbor', 'sky', 'weather','tree','cloudy','garden','parks','lariver','losangelesriver','grass','mountains','canyons','hills','mountains','hill','horse','tujungawashgreenway','tujungawash','pacificocean','sky','weather','tree','cloudy'], 
                'Recreation': ['music','bikes','lilihaydn','bicycles','violin','ciclovia','loslobos','event', 'rustythedog','canine','chihuahuamix','mutt','dog','pet','weeksfordogs','urbanhiking', 'costumes', 'costume', 'cosplay', 'boomerang', 'lighthouse','westerncup','sports', 'quidditch', 'dog', 'puggle', 'puppy', 'referee', 'boat', 'nikon', 'nikond','gardentour','fish','campout','hiking','hike','sunriserunyon','sunriseinrunyon','sunriseinrunyoncanyonpark','trail','observatory','run','jog','summerolympics','westerncup','sports','quidditch','nikon','nikond','dog','puggle','puppy','referee','boat'], 
                'Social Relations': ['ciclavia','rally','protest','keepfamliestogethor','tamale','asada','alpastor','march','carnitas','eltaurino','burrito','thegreattacohunt','lasantacon','people','tacos','food','harrypotter', 'wand', 'geeks', 'geek','gardenparty','people','picnik','walkathon','earthday','zurbulon','harrypotter','wand'],
                'Aesthetics': ['colorful','green', 'landscape','textures','texturemaps','texturemap','texture','sunset','sky','skyline','clouds','sun','weather','sunrise','pacificocean','panorama','color','overlook','sunset','viewpoint','scenicoverview','mulhollandscenicoverview','brown','barbaraafineoverlook','lasunset','green','landscape'],
                'Spiritual': ['signs','sanity','nature','neature','harborinterfaith','outside','church','littlebrownchurchinthevalley'], 
                'Inspiration':['art','portraitsofhope','publicart', 'artonthewaterfront', 'artist', 'sculpture', 'printmaking', 'portrait', 'contemporaryart', 'painting', 'polaroid', 'draw', 'photographer', 'studioartist', 'prints','mural'],
                'Cultural Heritage': ['landmark','monument', 'curlettandbeelman','fortmacarthur','warreinactment', 'agcc', 'angelsgateculturalcenter', 'openstudios', 'allankaprow', 'gallerya', 'artgallery', 'gallery', 'artexhibition', 'slobodandimitrov', 'culturalcenter', 'exhibition', 'downstairsgallery', 'installation', 'hillarybradfield', 'festival','parlance','sculpture','art','statue','treasuresoflosangelesarchitecture','losangelesstatehistoricpark','midcenturymodernhomes','charliechaplin','parlance'],
                'Sense of Place': ['neighborhood','community','eccideasclub', 'sanpedro', 'neighborhood'], 
                'Cultural Diversity': ['mexican','lengua','march','immigration','czechart','lapride','westhollywoodpride','lagaypride','westhollywoodgaypride','losangelespride','losangelesgaypride','pride','gaypride'],
                'Knowledge Systems': ['historyofsanpedropunk', 'belleepoque','lahistory','californiahistory'],
                'Education': ['portoflosangeles', 'port','portofla','marshallastor','berth','georgecpagemuseum','museums','losangelescountymuseumofart','iceage','pleistocene','skeletons','skulls','pit','tarpits','labreatarpits','labrea','museum','pagemuseum','fossils','bones','paleontology','lacma','animalsmammoths','excavation','sabretooth','tigers','giantgroundsloths','gettyhouse','tar','sabretoothtigers','olympusem','skeleton','fossil','mammoth','mastodon','environmentaljustice','urbanparkmovement','losangelespubliclibrary'
                             ]}

# Import the csv of frequencies for each park under consideration
fns = ['top_tags_angelsgate.csv','top_tags_cheviothills.csv','top_tags_coldwatercanyon.csv',
       'top_tags_franklincanyonpark.csv','top_tags_hancockpark.csv','top_tags_macarthur.csv',
       'top_tags_riodelosangeles.csv','top_tags_runyoncanyon.csv']
parks_frequency = []
for fn in fns: 
    parks_frequency.append(pd.read_csv(fn))

#print(parks_frequency)

# Create a function to loop over the categories and sum the words associated with each category  
def getCategories(frequencyDf):
    
    category_frequencies = dict.fromkeys(category_map,0)
    
    for index, row in frequencyDf.iterrows():
        #print(row['tags'],row['value'])
        for category_name in category_map:
            wordlist = category_map[category_name]
            if row['tags'] in wordlist:
                category_frequencies[category_name]+=row['value']
            
    return category_frequencies
    
# Create a list of dictionaries with the frequencies by category for each park
cat_frequencies = []
for park in parks_frequency:
    cat_frequencies.append(getCategories(park))

print(cat_frequencies)

In [None]:
import pandas as pd

parksdf = pd.DataFrame(cat_frequencies)
print(parksdf)

In [None]:
# create list of park names as they appear in shp file

parknames = ['Angels Gate Park','Cheviot Hills Park and Recreation Center','Coldwater Canyon Park','Franklin Canyon Park','Hancock Park','MacArthur Park','Rio de Los Angeles State Park State Recreation Area','Runyon Canyon Park']

parknamesDf = pd.DataFrame(parknames)
parknamesDf['PARK_NAME']=parknames
parknamesDf

In [None]:
# add parknamesDf as a column to parksDf
parksjoinDf = parksdf.join(parknamesDf, how = 'left')
parksjoinDf

# join parksjoinDf to parks geodataframe

In [None]:

# https://towardsdatascience.com/mapping-geograph-data-in-python-610a963d2d7f

import os  # needed for the os.path.exists check below
import shapefile as shp
#/Users/morganrogers/Documents/GitHub/LaParks_NLP/ces_laparks.shp

# opening the vector map
shp_path = "/Users/morganrogers/Documents/GitHub/LaParks_NLP/ces_laparks.shp"
assert os.path.exists(shp_path), "Input file does not exist."

# reading the shape file by using reader function of the shape lib
sf = shp.Reader(shp_path)

def read_shapefile(sf):
    """
    Read a shapefile into a Pandas dataframe with a 'coords' 
    column holding the geometry information. This uses the pyshp
    package
    """
    fields = [x[0] for x in sf.fields][1:]
    records = sf.records()
    shps = [s.points for s in sf.shapes()]
    df = pd.DataFrame(columns=fields, data=records)
    df = df.assign(coords=shps)
    return df

df = read_shapefile(sf)
df.shape

In [None]:
df.sample(5)

In [None]:
# merge on the shared PARK_NAME column
# (DataFrame.join with on= matches against the other frame's index, so merge is used instead)

#df.set_index('PARK_NAME', inplace=True)
#parksjoinDf.set_index('PARK_NAME', inplace=True)

# join
parksmap = df.merge(parksjoinDf, on='PARK_NAME', how='left')
parksmap.head()

In [None]:
Top_CES = ["Recreation","Existence","Cultural_Heritage","Existence","Education","Social Relations","Aesthetics","Existence"]
parksmap['Top_CES']=Top_CES
parksmap

In [None]:
#turn parkmaps into a geodataframe
#parksmap['coords'].head()
#parksmap.plot(column='Top_CES')
type(parksmap)

In [None]:
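# build shapely polygons from the pyshp point lists
# (the string form of a point list is not WKT, so GeoSeries.from_wkt cannot parse it)
# note: this treats each shape as a single ring, which is wrong for multi-part shapes
import geopandas as gpd
from shapely.geometry import Polygon

In [None]:
parksmap['coords'] = parksmap['coords'].apply(Polygon)
my_geo_df = gpd.GeoDataFrame(parksmap, geometry='coords')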

In [None]:

#import contextily as ctx
import matplotlib.pyplot as plt
%matplotlib inline

fig, ax = plt.subplots(1,1, figsize=(25,25))

# basemaps are typically in Web Mercator (projection 3857), so we need to reproject our dataframe to this
# alpha sets the transparency
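# `parkmaps` was never defined; the geodataframe built above is `my_geo_df`
# to_crs() needs a source CRS, so set one first (my_geo_df.set_crs) if the frame was built from raw points
my_geo_df.to_crs('EPSG:3857').plot('Top_CES', cmap='plasma', legend=True, ax = ax, alpha=.2)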

In [None]:
# Import necessary modules
import geopandas as gpd

# Set filepath (fix path relative to yours)
fp = "/Users/morganrogers/Documents/GitHub/LaParks_NLP/ces_laparks.shp"

# Read file using gpd.read_file()
data = gpd.read_file(fp)

In [None]:
ces_list = ["Recreation","Existence","Cultural_Heritage","Existence","Education","Social Relations","Aesthetics","Existence"]

# build the frame with named columns: the dominant CES and the park name
cesDf = pd.DataFrame({'Top_CES': ces_list, 'PARK_NAME': parknames})
cesDf

In [None]:
#cesDf.set_index('PARK_NAME', inplace=True)
#parksmap.set_index('PARK_NAME',inplace=True)
# a dataframe is not callable; merge on the shared PARK_NAME column instead
parksmap2 = parksmap.merge(cesDf, on = 'PARK_NAME', how = 'left')

In [None]:
# find the dominant CES for each park and add it as a column

# the CES columns to compare
ces_cols = ['Existence', 'Recreation', 'Social Relations', 'Aesthetics', 'Spiritual',
            'Inspiration', 'Cultural Heritage', 'Sense of Place', 'Cultural Diversity',
            'Knowledge Systems', 'Education']

# idxmax returns the name of the column holding each row's largest value
# (ties go to the first column listed), so no chained comparisons are needed
parksmap['Top_CES'] = parksmap[ces_cols].idxmax(axis=1)
parksmap
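Picking a dominant CES means finding, for each park, the category whose count is at least as large as every other category's. That is an "all" test, which chaining `>` comparisons with `|` does not express. The block below is a minimal plain-Python sketch of that logic, handy for sanity-checking the DataFrame result. The park names, counts, and the `dominant_ces` helper are hypothetical and not part of the notebook's data.

```python
# Hypothetical per-park CES counts; the park names and numbers are made up.
ces_counts = {
    "Park A": {"Existence": 12, "Recreation": 30, "Aesthetics": 7},
    "Park B": {"Existence": 25, "Recreation": 4, "Aesthetics": 25},
    "Park C": {"Existence": 0, "Recreation": 0, "Aesthetics": 0},
}


def dominant_ces(counts):
    """Return the category with the highest count, or None if nothing was coded."""
    if not counts or max(counts.values()) == 0:
        return None
    # max() returns the first key that reaches the maximum, so ties go to
    # whichever category appears first in the dict.
    return max(counts, key=counts.get)


top_ces = {park: dominant_ces(counts) for park, counts in ces_counts.items()}
print(top_ces)
```

Returning `None` for a park with no coded tags is a design choice in this sketch; it keeps such parks from being labelled with an arbitrary category.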

In [None]:
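# change column name to PARK_NAME
# rename() returns a new DataFrame, so assign the result back; the key is given both
# as the integer 0 and the string "0" because the original label type is unknown here

parksjoinDf = parksjoinDf.rename(columns={0: "PARK_NAME", "0": "PARK_NAME"})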

In [None]:
from itertools import chain

# flatten cat_frequencies into a single list (print is a function in Python 3)
print(list(chain.from_iterable(cat_frequencies)))

# visualize shape file

In [None]:
pip install seaborn

In [None]:
import numpy as np
import pandas as pd
import shapefile as shp
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
# https://towardsdatascience.com/mapping-geograph-data-in-python-610a963d2d7f

import os  # needed for the os.path.exists check below

import shapefile as shp
#/Users/morganrogers/Documents/GitHub/LaParks_NLP/ces_laparks.shp

# opening the vector map
shp_path = "/Users/morganrogers/Documents/GitHub/LaParks_NLP/ces_laparks.shp"
assert os.path.exists(shp_path), "Input file does not exist."

# reading the shape file by using reader function of the shape lib
sf = shp.Reader(shp_path)

def read_shapefile(sf):
    """
    Read a shapefile into a Pandas dataframe with a 'coords' 
    column holding the geometry information. This uses the pyshp
    package
    """
    fields = [x[0] for x in sf.fields][1:]
    records = sf.records()
    shps = [s.points for s in sf.shapes()]
    df = pd.DataFrame(columns=fields, data=records)
    df = df.assign(coords=shps)
    return df

df = read_shapefile(sf)
df.shape

In [None]:
df.sample(5)
df.coords

In [None]:
df.PARK_NAME

In [None]:
def plot_shape(id, s=None):
    """ PLOTS A SINGLE SHAPE """
    plt.figure()
    ax = plt.axes()
    ax.set_aspect('equal')
    shape_ex = sf.shape(id)
    x_lon = np.zeros((len(shape_ex.points),1))
    y_lat = np.zeros((len(shape_ex.points),1))
    for ip in range(len(shape_ex.points)):
        x_lon[ip] = shape_ex.points[ip][0]
        y_lat[ip] = shape_ex.points[ip][1]
    plt.plot(x_lon,y_lat) 
    x0 = np.mean(x_lon)
    y0 = np.mean(y_lat)
    plt.text(x0, y0, s, fontsize=10)
    # use bbox (bounding box) to set plot limits
    plt.xlim(shape_ex.bbox[0],shape_ex.bbox[2])
    return x0, y0

In [None]:
def plot_map(sf, x_lim = None, y_lim = None, figsize = (20,40)):
    '''
    Plot map with lim coordinates
    '''
    plt.figure(figsize = figsize)
    id=0
    for shape in sf.shapeRecords():
        x = [i[0] for i in shape.shape.points[:]]
        y = [i[1] for i in shape.shape.points[:]]
        plt.plot(x, y, 'k')
        
        if x_lim is None and y_lim is None:
            x0 = np.mean(x)
            y0 = np.mean(y)
            plt.text(x0, y0, id, fontsize=10)
        id = id+1
    
    if x_lim is not None and y_lim is not None:
        plt.xlim(x_lim)
        plt.ylim(y_lim)

In [None]:
plot_map(sf)
type(plot_map)

# join frequency data to the shapefile dataframe to get the dominant CES

In [None]:
# https://towardsdatascience.com/mapping-geograph-data-in-python-610a963d2d7f

import os  # needed for the os.path.exists check below

import shapefile as shp
#/Users/morganrogers/Documents/GitHub/LaParks_NLP/ces_laparks.shp

# opening the vector map
shp_path = "/Users/morganrogers/Documents/GitHub/LaParks_NLP/ces_laparks.shp"
assert os.path.exists(shp_path), "Input file does not exist."

# reading the shape file by using reader function of the shape lib
sf = shp.Reader(shp_path)

def read_shapefile(sf):
    """
    Read a shapefile into a Pandas dataframe with a 'coords' 
    column holding the geometry information. This uses the pyshp
    package
    """
    fields = [x[0] for x in sf.fields][1:]
    records = sf.records()
    shps = [s.points for s in sf.shapes()]
    df = pd.DataFrame(columns=fields, data=records)
    df = df.assign(coords=shps)
    return df

df = read_shapefile(sf)
df.shape

In [None]:
df.sample(5)

# past mapping attempts

In [None]:
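# Tags on map
import contextily as ctx
import geopandas as gpd
import matplotlib.pyplot as plt

# sf is a pyshp Reader, which has no to_crs(); load the same shapefile as a GeoDataFrame instead
parks_gdf = gpd.read_file(shp_path)

fig, ax = plt.subplots(figsize=(20,20))
parks_gdf.to_crs('EPSG:3857').plot(color='r', ax=ax) # remember, 3857 is Web Mercator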

# let's add a basemap using the contextily library
ctx.add_basemap(ax, zoom=12)

# and we really don't need the axis ticks and labels, so we set them to an empty list
ax.set_xticks([])
ax.set_yticks([])

In [None]:
print(cat_frequencies.keys())

In [None]:
# Source: https://plotly.com/python/radar-chart/
import plotly.graph_objects as go

fig = go.Figure()

for category in cat_frequencies:
    # compute frequency total per park
    park_freq_total = sum(list(category.values()))
    r_values = [x/park_freq_total for x in list(category.values())]
    fig.add_trace(go.Scatterpolar(
          r=r_values,  # each category's share of this park's total frequency
          theta=list(category.keys()),
          fill='toself',
          name=''
    ))

fig.update_layout(
  polar=dict(
    radialaxis=dict(
      visible=True,
      range=[0, 0.8]
    )),
  showlegend=True
)
    
fig.show()

# Failed attempts at importing shape file

In [None]:
import geopandas as gpd
import pandas as pd

#read in csv file

fn = 'ces_park_shapes.csv'
parks_sf = pd.read_csv(fn)
parks_sf.head()

In [None]:
f, ax = plt.subplots(1,1,figsize=(20,20))
parks_sf.dropna(subset=['geometry'], axis=0).plot('geometry', ax=ax, cmap='plasma')
ax.set_facecolor('k')

In [None]:
import fiona
import geopandas as gpd

# the path needs a leading slash to be absolute
sjer_plot_locations = gpd.read_file('/Users/morganrogers/Documents/GitHub/LaParks_NLP/ces_laparks.shp')

In [None]:
import shapefile

sf = shapefile.Reader("shapefiles/ces_laparks.shp")

In [None]:
import geopandas as gpd
sf = gpd.read_file('/Users/morganrogers/Documents/GitHub/LaParks_NLP/ces_laparks.shp')

In [None]:
# import necessary packages
import os
import matplotlib.pyplot as plt
import geopandas as gpd
import earthpy as et

In [None]:
# switch the working directory to the ces_laparks folder under the earthpy home directory
os.chdir(os.path.join(et.io.HOME, 'ces_laparks'))

### MacArthur Park

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
macarthur = parks_data['parkname']=='macarthur'
macarthur.head()

In [None]:
macarthur_tags = parks_data[macarthur]
print(macarthur_tags)
macarthur_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = macarthur_tags[cols].explode('tags', ignore_index=True)

In [None]:
# add a helper column of ones so that summing per tag yields its count
tag_park['value'] = 1

# return the top 100 most used tags, sorted by count
# (select 'value' so the string column parkname is not summed as well)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_macarthur.csv', index=True)
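The filter, explode, count, and export steps above are repeated almost verbatim for every park in the sections that follow. As a hedged illustration of how that repetition could be folded into one helper, here is a minimal standard-library sketch. The `top_tags_for_park` function and the sample records are hypothetical stand-ins for rows of `parks_data`, not the notebook's actual data or method.

```python
from collections import Counter


def top_tags_for_park(photos, parkname, n=100):
    """Count tags over all photos of one park and return the n most common."""
    counts = Counter()
    for photo in photos:
        if photo["parkname"] == parkname:
            # each photo carries a list of tags; update() adds one per tag
            counts.update(photo["tags"])
    return counts.most_common(n)


# Made-up records standing in for rows of parks_data.
photos = [
    {"parkname": "macarthur", "tags": ["lake", "birds", "music"]},
    {"parkname": "macarthur", "tags": ["lake", "palms"]},
    {"parkname": "woodley", "tags": ["birds", "lake"]},
]

for park in ["macarthur", "woodley"]:
    print(park, top_tags_for_park(photos, park, n=2))
```

Looping over a list of park names this way would replace one copy of the cells per park; writing each result to its own CSV for hand coding could then happen inside the same loop.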

# Analysis of spread of CES and strength of each CES - MacArthur Park

In [None]:
import pandas as pd

# import macarthur top tags
fn = 'top_tags_macarthur.csv'
macarthur_tags = pd.read_csv(fn)
macarthur_tags.head()

# import CES code breakdown for macarthur park
fn2 = 'codes_macarthur2.csv'
macarthur_ces = pd.read_csv(fn2)
macarthur_ces.head()

In [None]:
# replace NaNs with 0 (fillna returns a new DataFrame, so assign it back)

macarthur_ces = macarthur_ces.fillna(0)

1. group by CES and count size
2. assign to new df to visualize

In [None]:
macarthur_ces.dtypes

In [None]:
# get list of columns from the original dataframe, excluding the ones that aren't about ces
cols = [col for col in macarthur_ces.columns if col not in ['Words']]

# total the coded tag counts in each CES column (a Series indexed by CES name);
# assigning the sum back to macarthur_ces[col] would overwrite every row with the total
ces = macarthur_ces[cols].sum()
print(ces)

In [None]:
# get list of columns from the original dataframe, excluding the ones that aren't about ces
cols = [col for col in macarthur_ces.columns if col not in ['Words']]

# normalize the data by dividing each CES total by the total tag count (1682) to get a percentage
ces = macarthur_ces[cols].sum() / 1682 * 100
print(ces)

Having trouble getting overall percentage for each group --> trying a different approach

In [None]:
# drop the Words column (drop() returns a new DataFrame, so chain it into the sum)
# and take the sum of each ces
sum_ces = macarthur_ces.drop(columns=['Words']).sum(axis=0)
print(sum_ces)

In [None]:
# convert series to dataframe and keep index
# https://www.geeksforgeeks.org/convert-given-pandas-series-into-a-dataframe-with-its-index-as-another-column-on-the-dataframe/

cesDf = sum_ces.to_frame().reset_index()
print(cesDf)
list(cesDf.columns)

In [None]:
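# every cesDf column except the CES-name column ('index') holds the summed counts
cols = [col for col in cesDf.columns if col not in ['index']]

# express each CES total as a percentage of all 1682 tags, skipping any non-numeric Words row
ces_total = cesDf.loc[cesDf['index'] != 'Words', cols] / 1682 * 100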
print(ces_total)

In [None]:
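# clean up df by dropping the Words row, if present

# after reset_index() the CES names live in the 'index' column rather than the row labels,
# so filter on that column and assign the result back
cesDf = cesDf[cesDf['index'] != 'Words']
print(cesDf)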


In [None]:
#divide each column by total tag counts 1682 and multiply by 100 to normalize the data
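
The normalization described in the comment above is a single division per category. A minimal sketch follows, assuming the per-category totals are already available as a plain mapping. The `ces_percentages` helper and the example totals are hypothetical; only the 1682 denominator comes from the notebook.

```python
def ces_percentages(ces_totals, total_tags):
    """Convert per-category tag totals into percentages of all tags counted."""
    if total_tags <= 0:
        raise ValueError("total_tags must be positive")
    return {ces: 100 * count / total_tags for ces, count in ces_totals.items()}


# Made-up category totals; only the 1682 denominator comes from the notebook.
example_totals = {"Existence": 841, "Recreation": 100, "Aesthetics": 0}
percentages = ces_percentages(example_totals, total_tags=1682)
print(percentages)
```

Because tags that were not coded into any CES still count toward the denominator, the percentages need not sum to 100.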


### Woodley Park


In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
woodley = parks_data['parkname']=='woodley'
woodley.head()

In [None]:
woodley_tags = parks_data[woodley]
print(woodley_tags)
woodley_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = woodley_tags[cols].explode('tags', ignore_index=True)

In [None]:
# add a helper column of ones so that summing per tag yields its count
tag_park['value'] = 1

# return the top 100 most used tags, sorted by count
# (select 'value' so the string column parkname is not summed as well)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_woodley.csv', index=True)

### Rio de Los Angeles

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
riodelosangeles = parks_data['parkname']=='riodelosangeles'
riodelosangeles.head()

In [None]:
riodelosangeles_tags = parks_data[riodelosangeles]
print(riodelosangeles_tags)
riodelosangeles_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = riodelosangeles_tags[cols].explode('tags', ignore_index=True)

In [None]:
# add a helper column of ones so that summing per tag yields its count
tag_park['value'] = 1

# return the top 100 most used tags, sorted by count
# (select 'value' so the string column parkname is not summed as well)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_riodelosangeles.csv', index=True)

### Runyon Canyon

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
runyoncanyon = parks_data['parkname']=='runyoncanyon'
runyoncanyon.head()

In [None]:
runyoncanyon_tags = parks_data[runyoncanyon]
print(runyoncanyon_tags)
runyoncanyon_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = runyoncanyon_tags[cols].explode('tags', ignore_index=True)

In [None]:
# add a helper column of ones so that summing per tag yields its count
tag_park['value'] = 1

# return the top 100 most used tags, sorted by count
# (select 'value' so the string column parkname is not summed as well)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_runyoncanyon.csv', index=True)

### Temescal Gateway

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
temescalgateway = parks_data['parkname']=='temescalgateway'
temescalgateway.head()

In [None]:
temescalgateway_tags = parks_data[temescalgateway]
print(temescalgateway_tags)
temescalgateway_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = temescalgateway_tags[cols].explode('tags', ignore_index=True)

In [None]:
# add a helper column of ones so that summing per tag yields its count
tag_park['value'] = 1

# return the top 100 most used tags, sorted by count
# (select 'value' so the string column parkname is not summed as well)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_temescalgateway.csv', index=True)

### Heidelberg Park

### Hancock Park

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
hancockpark = parks_data['parkname']=='hancockpark'
hancockpark.head()

In [None]:
hancockpark_tags = parks_data[hancockpark]
print(hancockpark_tags)
hancockpark_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = hancockpark_tags[cols].explode('tags', ignore_index=True)

In [None]:
# add a helper column of ones so that summing per tag yields its count
tag_park['value'] = 1

# return the top 100 most used tags, sorted by count
# (select 'value' so the string column parkname is not summed as well)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_hancockpark.csv', index=True)

### Franklin Canyon Park

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
franklincanyonpark = parks_data['parkname']=='franklincanyonpark'
franklincanyonpark.head()

In [None]:
franklincanyonpark_tags = parks_data[franklincanyonpark]
print(franklincanyonpark_tags)
franklincanyonpark_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = franklincanyonpark_tags[cols].explode('tags', ignore_index=True)

In [None]:
# add a helper column of ones so that summing per tag yields its count
tag_park['value'] = 1

# return the top 100 most used tags, sorted by count
# (select 'value' so the string column parkname is not summed as well)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_franklincanyonpark.csv', index=True)

### Angels Gate

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
angelsgate = parks_data['parkname']=='angelsgate'
angelsgate.head()

In [None]:
angelsgate_tags = parks_data[angelsgate]
print(angelsgate_tags)
angelsgate_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = angelsgate_tags[cols].explode('tags', ignore_index=True)

In [None]:
# add a helper column of ones so that summing per tag yields its count
tag_park['value'] = 1

# return the top 100 most used tags, sorted by count
# (select 'value' so the string column parkname is not summed as well)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_angelsgate.csv', index=True)

### Coldwater Canyon

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
coldwatercanyon = parks_data['parkname']=='coldwatercanyon'
coldwatercanyon.head()

In [None]:
coldwatercanyon_tags = parks_data[coldwatercanyon]
print(coldwatercanyon_tags)
coldwatercanyon_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = coldwatercanyon_tags[cols].explode('tags', ignore_index=True)

In [None]:
# add a helper column of ones so that summing per tag yields its count
tag_park['value'] = 1

# return the top 100 most used tags, sorted by count
# (select 'value' so the string column parkname is not summed as well)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_coldwatercanyon.csv', index=True)

### Cheviot Hills

In [None]:
#Need to figure out how to filter by parkname and retain the tags info by photo
cheviothills = parks_data['parkname']=='cheviothills'
cheviothills.head()

In [None]:
cheviothills_tags = parks_data[cheviothills]
print(cheviothills_tags)
cheviothills_tags.parkname.unique()

In [None]:
cols = ['tags', 'parkname']
tag_park = cheviothills_tags[cols].explode('tags', ignore_index=True)

In [None]:
# add a helper column of ones so that summing per tag yields its count
tag_park['value'] = 1

# return the top 100 most used tags, sorted by count
# (select 'value' so the string column parkname is not summed as well)
top_100_tags = tag_park.groupby('tags')[['value']].sum().sort_values('value', ascending=False).head(100)

#so we can view all tags
pd.set_option('display.max_rows', 100)

print(top_100_tags)

In [None]:
# exporting top 100 tags to a csv for hand coding 
top_100_tags.to_csv('top_tags_cheviothills.csv', index=True)

### Chatsworth Park South