<h1>Sacramento's Best Trails</h1>
<h2>Data Cleanup and Exploration</h2>
<em>brought to you by:</em> Joy, Ruben, and Nancy

> #### Data Availability
>
> - Limited number of available trail-specific and **free** APIs.
> - Decided to use REI Hiking Project API.
>
>  *Caveat:* this dataset is small, so it may not be the most representative sample. Proceed with caution.

<h2>Step 1: What kinds of data can we pull from the API?</h2>
(i.e. explore API documentation)

Hiking Project API:
<https://www.hikingproject.com/data>

In [1]:
# Dependencies
import requests
import pandas as pd
from config import api_key

url = "https://www.hikingproject.com/data/get-trails?"

Based on documentation, we then adjust our query parameters for information on trails within 150 miles of Sacramento, CA.

In [2]:
lat = 38.5816
lon = -121.4944
maxDistance = 150
key = api_key

In [3]:
# Query
query_url = f"{url}lat={lat}&lon={lon}&maxDistance={maxDistance}&key={key}"
query_url

'https://www.hikingproject.com/data/get-trails?lat=38.5816&lon=-121.4944&maxDistance=150&key=200684868-72bf127ca9e9ba5924d3b7d37a05d12c'
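> As a side note, the same query string can be assembled from a dictionary with the standard library's `urlencode`, which makes it easier to add or remove parameters later. This is only a sketch: `base_url`, `example_params`, and `example_url` are illustrative names, and `YOUR_API_KEY` is a placeholder rather than a real credential.

```python
from urllib.parse import urlencode

# Endpoint without the trailing "?"; urlencode builds everything after it
base_url = "https://www.hikingproject.com/data/get-trails"

# Same parameters as the cells above; the key is a placeholder (assumption)
example_params = {
    "lat": 38.5816,
    "lon": -121.4944,
    "maxDistance": 150,
    "key": "YOUR_API_KEY",
}

# urlencode joins the pairs with "&" and escapes any special characters
example_url = f"{base_url}?{urlencode(example_params)}"
print(example_url)
```

> `requests.get` also accepts the dictionary directly through its `params` argument, so the URL does not have to be built by hand at all.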

In [4]:
# Request data
data = requests.get(query_url).json()

In [5]:
data

{'trails': [{'id': 7005207,
   'name': 'Half Dome',
   'type': 'Hike',
   'summary': 'THE premier route in Yosemite. Hike to the top of the most iconic granite dome in the USA.',
   'difficulty': 'black',
   'stars': 4.9,
   'starVotes': 187,
   'location': 'Yosemite Valley, California',
   'url': 'https://www.hikingproject.com/trail/7005207/half-dome',
   'imgSqSmall': 'https://cdn-files.apstatic.com/hike/7003987_sqsmall_1554236134.jpg',
   'imgSmall': 'https://cdn-files.apstatic.com/hike/7003987_small_1554236134.jpg',
   'imgSmallMed': 'https://cdn-files.apstatic.com/hike/7003987_smallMed_1554236134.jpg',
   'imgMedium': 'https://cdn-files.apstatic.com/hike/7003987_medium_1554236134.jpg',
   'length': 14.5,
   'ascent': 4457,
   'descent': -4457,
   'high': 8476,
   'low': 4083,
   'longitude': -119.5583,
   'latitude': 37.7325,
   'conditionStatus': 'All Clear',
   'conditionDetails': '',
   'conditionDate': '2019-11-17 23:03:06'},
  {'id': 7004777,
   'name': 'Vernal and Nevada Fal

<h2> Step 2: Modify request to API based on research needs </h2>

In [6]:
len(data["trails"])

10

>Uh oh. Looks like the query defaults to 10 trails. 
>Let's fix that by adding a "maxResults" parameter to the query.
>
>And while we are at it, include sorting based on distance from Sacramento!

In [7]:
lat = 38.5816
lon = -121.4944
maxDistance = 150
maxResults = 500
sort = 'distance'
key = api_key

In [8]:
# Query
query_url = f"{url}lat={lat}&lon={lon}&maxDistance={maxDistance}&maxResults={maxResults}&sort={sort}&key={key}"
query_url

'https://www.hikingproject.com/data/get-trails?lat=38.5816&lon=-121.4944&maxDistance=150&maxResults=500&sort=distance&key=200684868-72bf127ca9e9ba5924d3b7d37a05d12c'

In [9]:
# Request data
data = requests.get(query_url).json()

In [10]:
data

{'trails': [{'id': 7021990,
   'name': 'Seasonal Wetland Loop',
   'type': 'Hike',
   'summary': 'A scenic loop around one of the wetland areas in Yolo Basin on a wide path with wildlife viewing.',
   'difficulty': 'green',
   'stars': 3.5,
   'starVotes': 2,
   'location': 'Davis, California',
   'url': 'https://www.hikingproject.com/trail/7021990/seasonal-wetland-loop',
   'imgSqSmall': 'https://cdn-files.apstatic.com/hike/7063351_sqsmall_1569428182.jpg',
   'imgSmall': 'https://cdn-files.apstatic.com/hike/7063351_small_1569428182.jpg',
   'imgSmallMed': 'https://cdn-files.apstatic.com/hike/7063351_smallMed_1569428182.jpg',
   'imgMedium': 'https://cdn-files.apstatic.com/hike/7063351_medium_1569428182.jpg',
   'length': 3.1,
   'ascent': 10,
   'descent': -12,
   'high': 17,
   'low': 11,
   'longitude': -121.6261,
   'latitude': 38.5505,
   'conditionStatus': 'Unknown',
   'conditionDetails': None,
   'conditionDate': '1970-01-01 00:00:00'},
  {'id': 7089186,
   'name': 'Nature Loop

In [11]:
len(data["trails"])

500

>Ah, much better.
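> One way to sanity-check the distance sort and the 150-mile radius is to compute the straight-line (great-circle) distance ourselves. The sketch below uses the haversine formula with coordinates copied from the outputs above. How the API itself measures distance is not something we verified, so treat this as an approximation; `haversine_miles` is a helper name we made up for illustration.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

sacramento = (38.5816, -121.4944)
wetland_loop = (38.5505, -121.6261)  # first trail in the sorted response
half_dome = (37.7325, -119.5583)     # first trail in the unsorted response

near = haversine_miles(*sacramento, *wetland_loop)
far = haversine_miles(*sacramento, *half_dome)
print(round(near, 1), round(far, 1))
```

> Applying the same function to every row of the response would let us confirm that distances are non-decreasing and stay under `maxDistance`.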

<h2> Step 3: Utilize Pandas to better view and manipulate data </h2>

In [12]:
# Store the returned API data as a pandas DataFrame.

trails_df = pd.DataFrame(data["trails"])

In [13]:
trails_df

Unnamed: 0,id,name,type,summary,difficulty,stars,starVotes,location,url,imgSqSmall,...,length,ascent,descent,high,low,longitude,latitude,conditionStatus,conditionDetails,conditionDate
0,7021990,Seasonal Wetland Loop,Hike,A scenic loop around one of the wetland areas ...,green,3.5,2,"Davis, California",https://www.hikingproject.com/trail/7021990/se...,https://cdn-files.apstatic.com/hike/7063351_sq...,...,3.1,10,-12,17,11,-121.6261,38.5505,Unknown,,1970-01-01 00:00:00
1,7089186,Nature Loop,Hike,A common loop that explores the park.,greenBlue,5.0,1,"Carmichael, California",https://www.hikingproject.com/trail/7089186/na...,https://cdn-files.apstatic.com/hike/7065291_sq...,...,1.6,21,-20,77,69,-121.3127,38.6168,Unknown,,1970-01-01 00:00:00
2,7023706,Cosumnes Nature Loop,Hike,A nature-watcher's delight that follows trails...,green,4.0,9,"Thornton, California",https://www.hikingproject.com/trail/7023706/co...,https://cdn-files.apstatic.com/hike/7057450_sq...,...,4.0,88,-84,51,9,-121.4403,38.2657,Unknown,,1970-01-01 00:00:00
3,7016942,Lake Natoma Loop,Hike,Hike around Lake Natoma on bike paths with an ...,greenBlue,4.2,9,"Folsom, California",https://www.hikingproject.com/trail/7016942/la...,https://cdn-files.apstatic.com/hike/7016940_sq...,...,12.0,293,-294,194,110,-121.1804,38.6766,Unknown,,1970-01-01 00:00:00
4,7027546,Traylor Ranch Bird Sanctuary,Hike,There are lots of birds to see and lots of loo...,green,2.0,3,"Loomis, California",https://www.hikingproject.com/trail/7027546/tr...,https://cdn-files.apstatic.com/hike/7044187_sq...,...,3.2,65,-65,430,396,-121.2058,38.8485,Unknown,,1970-01-01 00:00:00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
495,7086303,Lower Chaparral Trail,Trail,"Beautiful, difficult, and technical. Be cautio...",black,3.0,1,"Clayton, California",https://www.hikingproject.com/trail/7086303/lo...,,...,0.3,0,-270,1042,772,-121.8650,37.9523,Unknown,,1970-01-01 00:00:00
496,7000151,Gray Pine Trail,Trail,Spectacular scenery along a high ridgeline.,blue,4.0,1,"Kenwood, California",https://www.hikingproject.com/trail/7000151/gr...,https://cdn-files.apstatic.com/hike/7000266_sq...,...,2.6,9,-1399,2711,1321,-122.5092,38.4574,Unknown,,1970-01-01 00:00:00
497,7015046,Ridge Trail,Trail,"A roller coaster, ridge-top fire road with gre...",blueBlack,3.7,3,"Clayton, California",https://www.hikingproject.com/trail/7015046/ri...,,...,2.9,213,-802,1253,580,-121.8647,37.9509,Unknown,,1970-01-01 00:00:00
498,7086306,Chaparral Loop Trail,Trail,"This is a beautiful, albeit difficult trail al...",black,3.0,1,"Clayton, California",https://www.hikingproject.com/trail/7086306/ch...,,...,0.9,211,-381,1169,806,-121.8687,37.9535,Unknown,,1970-01-01 00:00:00


>Uh oh. Check out the 'difficulty' column. Looks like it is color coded. 
>Let's replace the values with labels that are more intuitive for our analysis.
>
>We need to convert the color names to their difficulty equivalents based on the Hiking Project API documentation
>
>(green = easy ; blue = intermediate ; black = difficult ; and any combination thereof)
>
>Note: We also ran an initial script and used groupby('difficulty') to identify missed categories.
>The script shown below has been modified accordingly.
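> The check for missed categories can be sketched roughly as follows. The sample rows and the deliberately incomplete `initial_labels` mapping are made up for illustration; the idea is simply to compare the values that `groupby('difficulty')` finds against the keys we have already mapped.

```python
import pandas as pd

# Toy stand-in for the real trails data (assumption: values are illustrative)
sample = pd.DataFrame(
    {"difficulty": ["green", "blue", "dblack", "green", "blueBlack"]}
)

# A first-pass mapping that has not yet accounted for every category
initial_labels = {
    "green": "Easy",
    "greenBlue": "Easy/Intermediate",
    "blue": "Intermediate",
    "blueBlack": "Intermediate/Difficult",
    "black": "Difficult",
}

# Count rows per category, then list the categories the mapping misses
counts = sample.groupby("difficulty").size()
unmapped = sorted(set(counts.index) - set(initial_labels))
print(counts)
print("Unmapped categories:", unmapped)
```

> Any value that shows up in `unmapped` needs its own entry before the replacement is run.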

In [14]:
# Map the API's color codes to readable difficulty labels in a single pass
difficulty_labels = {
    'green': 'Easy',
    'greenBlue': 'Easy/Intermediate',
    'blue': 'Intermediate',
    'blueBlack': 'Intermediate/Difficult',
    'black': 'Difficult',
    'dblack': 'Extremely Difficult',
}
trails_df['difficulty'] = trails_df['difficulty'].replace(difficulty_labels)
trails_df

Unnamed: 0,id,name,type,summary,difficulty,stars,starVotes,location,url,imgSqSmall,...,length,ascent,descent,high,low,longitude,latitude,conditionStatus,conditionDetails,conditionDate
0,7021990,Seasonal Wetland Loop,Hike,A scenic loop around one of the wetland areas ...,Easy,3.5,2,"Davis, California",https://www.hikingproject.com/trail/7021990/se...,https://cdn-files.apstatic.com/hike/7063351_sq...,...,3.1,10,-12,17,11,-121.6261,38.5505,Unknown,,1970-01-01 00:00:00
1,7089186,Nature Loop,Hike,A common loop that explores the park.,Easy/Intermediate,5.0,1,"Carmichael, California",https://www.hikingproject.com/trail/7089186/na...,https://cdn-files.apstatic.com/hike/7065291_sq...,...,1.6,21,-20,77,69,-121.3127,38.6168,Unknown,,1970-01-01 00:00:00
2,7023706,Cosumnes Nature Loop,Hike,A nature-watcher's delight that follows trails...,Easy,4.0,9,"Thornton, California",https://www.hikingproject.com/trail/7023706/co...,https://cdn-files.apstatic.com/hike/7057450_sq...,...,4.0,88,-84,51,9,-121.4403,38.2657,Unknown,,1970-01-01 00:00:00
3,7016942,Lake Natoma Loop,Hike,Hike around Lake Natoma on bike paths with an ...,Easy/Intermediate,4.2,9,"Folsom, California",https://www.hikingproject.com/trail/7016942/la...,https://cdn-files.apstatic.com/hike/7016940_sq...,...,12.0,293,-294,194,110,-121.1804,38.6766,Unknown,,1970-01-01 00:00:00
4,7027546,Traylor Ranch Bird Sanctuary,Hike,There are lots of birds to see and lots of loo...,Easy,2.0,3,"Loomis, California",https://www.hikingproject.com/trail/7027546/tr...,https://cdn-files.apstatic.com/hike/7044187_sq...,...,3.2,65,-65,430,396,-121.2058,38.8485,Unknown,,1970-01-01 00:00:00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
495,7086303,Lower Chaparral Trail,Trail,"Beautiful, difficult, and technical. Be cautio...",Difficult,3.0,1,"Clayton, California",https://www.hikingproject.com/trail/7086303/lo...,,...,0.3,0,-270,1042,772,-121.8650,37.9523,Unknown,,1970-01-01 00:00:00
496,7000151,Gray Pine Trail,Trail,Spectacular scenery along a high ridgeline.,Intermediate,4.0,1,"Kenwood, California",https://www.hikingproject.com/trail/7000151/gr...,https://cdn-files.apstatic.com/hike/7000266_sq...,...,2.6,9,-1399,2711,1321,-122.5092,38.4574,Unknown,,1970-01-01 00:00:00
497,7015046,Ridge Trail,Trail,"A roller coaster, ridge-top fire road with gre...",Intermediate/Difficult,3.7,3,"Clayton, California",https://www.hikingproject.com/trail/7015046/ri...,,...,2.9,213,-802,1253,580,-121.8647,37.9509,Unknown,,1970-01-01 00:00:00
498,7086306,Chaparral Loop Trail,Trail,"This is a beautiful, albeit difficult trail al...",Difficult,3.0,1,"Clayton, California",https://www.hikingproject.com/trail/7086306/ch...,,...,0.9,211,-381,1169,806,-121.8687,37.9535,Unknown,,1970-01-01 00:00:00


In [15]:
trails_df.columns

Index(['id', 'name', 'type', 'summary', 'difficulty', 'stars', 'starVotes',
       'location', 'url', 'imgSqSmall', 'imgSmall', 'imgSmallMed', 'imgMedium',
       'length', 'ascent', 'descent', 'high', 'low', 'longitude', 'latitude',
       'conditionStatus', 'conditionDetails', 'conditionDate'],
      dtype='object')

> We don't need image files or URLs, so let's drop them from the analysis.

In [16]:
# Keep only the analysis columns; .copy() makes trails_dropped independent of trails_df
trails_dropped = trails_df[['id', 'name', 'type', 'summary', 'difficulty', 'stars', 'starVotes',
       'location', 'length', 'ascent', 'descent', 'high', 'low', 'longitude', 'latitude',
       'conditionStatus', 'conditionDetails', 'conditionDate']].copy()
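> An alternative to listing every column we keep is to name only the columns we want to remove. The sketch below does that on a one-row toy frame (the image file names are shortened placeholders), assuming the unwanted columns are `url` plus anything starting with `img`.

```python
import pandas as pd

# One-row toy frame with a few of the real column names
sample = pd.DataFrame(
    {
        "id": [7021990],
        "name": ["Seasonal Wetland Loop"],
        "url": ["https://www.hikingproject.com/trail/7021990/seasonal-wetland-loop"],
        "imgSqSmall": ["placeholder_sqsmall.jpg"],
        "imgSmall": ["placeholder_small.jpg"],
        "stars": [3.5],
    }
)

# Collect the link and image columns, then drop them by name
unwanted = [col for col in sample.columns if col == "url" or col.startswith("img")]
trimmed = sample.drop(columns=unwanted)
print(list(trimmed.columns))
```

> Both approaches give the same result here; dropping by name is shorter, while selecting by name makes the final column order explicit.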

In [17]:
trails_dropped

Unnamed: 0,id,name,type,summary,difficulty,stars,starVotes,location,length,ascent,descent,high,low,longitude,latitude,conditionStatus,conditionDetails,conditionDate
0,7021990,Seasonal Wetland Loop,Hike,A scenic loop around one of the wetland areas ...,Easy,3.5,2,"Davis, California",3.1,10,-12,17,11,-121.6261,38.5505,Unknown,,1970-01-01 00:00:00
1,7089186,Nature Loop,Hike,A common loop that explores the park.,Easy/Intermediate,5.0,1,"Carmichael, California",1.6,21,-20,77,69,-121.3127,38.6168,Unknown,,1970-01-01 00:00:00
2,7023706,Cosumnes Nature Loop,Hike,A nature-watcher's delight that follows trails...,Easy,4.0,9,"Thornton, California",4.0,88,-84,51,9,-121.4403,38.2657,Unknown,,1970-01-01 00:00:00
3,7016942,Lake Natoma Loop,Hike,Hike around Lake Natoma on bike paths with an ...,Easy/Intermediate,4.2,9,"Folsom, California",12.0,293,-294,194,110,-121.1804,38.6766,Unknown,,1970-01-01 00:00:00
4,7027546,Traylor Ranch Bird Sanctuary,Hike,There are lots of birds to see and lots of loo...,Easy,2.0,3,"Loomis, California",3.2,65,-65,430,396,-121.2058,38.8485,Unknown,,1970-01-01 00:00:00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
495,7086303,Lower Chaparral Trail,Trail,"Beautiful, difficult, and technical. Be cautio...",Difficult,3.0,1,"Clayton, California",0.3,0,-270,1042,772,-121.8650,37.9523,Unknown,,1970-01-01 00:00:00
496,7000151,Gray Pine Trail,Trail,Spectacular scenery along a high ridgeline.,Intermediate,4.0,1,"Kenwood, California",2.6,9,-1399,2711,1321,-122.5092,38.4574,Unknown,,1970-01-01 00:00:00
497,7015046,Ridge Trail,Trail,"A roller coaster, ridge-top fire road with gre...",Intermediate/Difficult,3.7,3,"Clayton, California",2.9,213,-802,1253,580,-121.8647,37.9509,Unknown,,1970-01-01 00:00:00
498,7086306,Chaparral Loop Trail,Trail,"This is a beautiful, albeit difficult trail al...",Difficult,3.0,1,"Clayton, California",0.9,211,-381,1169,806,-121.8687,37.9535,Unknown,,1970-01-01 00:00:00


> Adding a column to help us measure the 'best popular' trails: the star rating multiplied by the number of votes.

In [18]:
trails_dropped['popular_vote'] = trails_dropped['stars']*trails_dropped['starVotes']
trails_dropped

Unnamed: 0,id,name,type,summary,difficulty,stars,starVotes,location,length,ascent,descent,high,low,longitude,latitude,conditionStatus,conditionDetails,conditionDate,popular_vote
0,7021990,Seasonal Wetland Loop,Hike,A scenic loop around one of the wetland areas ...,Easy,3.5,2,"Davis, California",3.1,10,-12,17,11,-121.6261,38.5505,Unknown,,1970-01-01 00:00:00,7.0
1,7089186,Nature Loop,Hike,A common loop that explores the park.,Easy/Intermediate,5.0,1,"Carmichael, California",1.6,21,-20,77,69,-121.3127,38.6168,Unknown,,1970-01-01 00:00:00,5.0
2,7023706,Cosumnes Nature Loop,Hike,A nature-watcher's delight that follows trails...,Easy,4.0,9,"Thornton, California",4.0,88,-84,51,9,-121.4403,38.2657,Unknown,,1970-01-01 00:00:00,36.0
3,7016942,Lake Natoma Loop,Hike,Hike around Lake Natoma on bike paths with an ...,Easy/Intermediate,4.2,9,"Folsom, California",12.0,293,-294,194,110,-121.1804,38.6766,Unknown,,1970-01-01 00:00:00,37.8
4,7027546,Traylor Ranch Bird Sanctuary,Hike,There are lots of birds to see and lots of loo...,Easy,2.0,3,"Loomis, California",3.2,65,-65,430,396,-121.2058,38.8485,Unknown,,1970-01-01 00:00:00,6.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
495,7086303,Lower Chaparral Trail,Trail,"Beautiful, difficult, and technical. Be cautio...",Difficult,3.0,1,"Clayton, California",0.3,0,-270,1042,772,-121.8650,37.9523,Unknown,,1970-01-01 00:00:00,3.0
496,7000151,Gray Pine Trail,Trail,Spectacular scenery along a high ridgeline.,Intermediate,4.0,1,"Kenwood, California",2.6,9,-1399,2711,1321,-122.5092,38.4574,Unknown,,1970-01-01 00:00:00,4.0
497,7015046,Ridge Trail,Trail,"A roller coaster, ridge-top fire road with gre...",Intermediate/Difficult,3.7,3,"Clayton, California",2.9,213,-802,1253,580,-121.8647,37.9509,Unknown,,1970-01-01 00:00:00,11.1
498,7086306,Chaparral Loop Trail,Trail,"This is a beautiful, albeit difficult trail al...",Difficult,3.0,1,"Clayton, California",0.9,211,-381,1169,806,-121.8687,37.9535,Unknown,,1970-01-01 00:00:00,3.0
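> To see how the new column behaves, here is a small sketch that ranks a few of the trails above by `popular_vote`. The four rows are copied from the table; `top_trails` is just an illustrative name.

```python
import pandas as pd

# Four trails copied from the table above
sample = pd.DataFrame(
    {
        "name": [
            "Seasonal Wetland Loop",
            "Nature Loop",
            "Cosumnes Nature Loop",
            "Lake Natoma Loop",
        ],
        "stars": [3.5, 5.0, 4.0, 4.2],
        "starVotes": [2, 1, 9, 9],
    }
)

# stars * starVotes is the total number of stars a trail has received
sample["popular_vote"] = sample["stars"] * sample["starVotes"]

# Pick the three trails with the highest score
top_trails = sample.nlargest(3, "popular_vote")
print(top_trails[["name", "stars", "starVotes", "popular_vote"]])
```

> Note the trade-off: Nature Loop has a perfect 5.0 rating but only one vote, so it ranks below trails with lower ratings and more votes. That is intended, since a single vote tells us little about popularity.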


<h2> Step 4: Saving our modified dataframe to CSV. </h2>

In [19]:
trails_dropped.to_csv('trails_dropped.csv')
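> One thing to keep in mind when reading the file back: by default `to_csv` also writes the row index, which pandas then loads as a column called `Unnamed: 0`. The sketch below shows the difference on a two-row toy frame, writing to an in-memory buffer instead of a file so nothing is left on disk.

```python
import io
import pandas as pd

sample = pd.DataFrame(
    {"id": [7021990, 7089186], "name": ["Seasonal Wetland Loop", "Nature Loop"]}
)

# Default behaviour: the index is written as an extra, unnamed first column
with_index = pd.read_csv(io.StringIO(sample.to_csv()))

# index=False leaves the index out, so the columns round-trip unchanged
without_index = pd.read_csv(io.StringIO(sample.to_csv(index=False)))

print(list(with_index.columns))
print(list(without_index.columns))
```

> Whether to pass `index=False` depends on whether the row numbers matter downstream; for this dataset the `id` column already identifies each trail.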

> Additionally, we could modify the data further based on specific questions from our study...