wurambo edited this page Nov 4, 2016 · 24 revisions

#Pillars#

##Parks##
Parks are the backbone of Sweet Outdoors. Behind the scenes, each park has the following attributes: id, name, latitude, longitude, address, phone number, rating, website URL, zipcode, zipregion, and photo URL, as well as a foreign key to its state and a one-to-many relationship to events. The Parks table shows a subset of these: name, address, phone number, website, a nearby upcoming event, and the state the park is in. Each park instance page shows the same details and highlights the state the park is located in with a description, along with a list of nearby upcoming events and a nearby campground if there is one.

Data Source: Google Places

To use Google Places, we scraped a list of national parks from Wikipedia and a list of state parks from www.stateparks.com, then used these lists to query Google Places for national and state parks in the US. When a response comes back from Google Places, we check its status attribute to make sure the query worked, and then check that at least one result was returned. We then parse the JSON to get the data for our database. We use the rating when it is present; if it is missing, we go through the reviews and compute the average rating. We also wrap our data extraction in try/except blocks, because the API data was incomplete and missing attributes would otherwise raise KeyError exceptions. Once all the data is collected, we add it to our database.
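The checks described above can be sketched roughly as follows. This is a minimal illustration rather than our actual scraper: it assumes the response has already been decoded into a dictionary with `status` and `results` keys, and the function names and the choice of fields (`name`, `formatted_address`, `website`) are assumptions made for the example.

```python
def average_review_rating(reviews):
    """Average the per-review ratings, or return None if there are none."""
    ratings = [r["rating"] for r in reviews if "rating" in r]
    return sum(ratings) / len(ratings) if ratings else None


def parse_place(response):
    """Extract park fields from a decoded response; None if it is unusable."""
    # Reject failed queries and queries that returned no results.
    if response.get("status") != "OK" or not response.get("results"):
        return None
    place = response["results"][0]

    park = {}
    # Illustrative field selection; a missing attribute is stored as None.
    for field in ("name", "formatted_address", "website"):
        try:
            park[field] = place[field]
        except KeyError:
            park[field] = None

    # Prefer the top-level rating, else fall back to the review average.
    try:
        park["rating"] = place["rating"]
    except KeyError:
        park["rating"] = average_review_rating(place.get("reviews", []))
    return park
```

Storing `None` for a missing attribute keeps one incomplete record from aborting the whole scrape.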

##Campgrounds##
Campgrounds give people a chance to travel away from home and stay in lodgings where they can find an escape in nature. Each campground has the following attributes in the database: ID, name, description, latitude, longitude, directions, phone, email, and zipcode, as well as foreign keys to the park and state the campground is located in. The Campgrounds table shows a subset of these: name, phone number, email, zipcode, nearby park, and state. Each campground instance page shows the same details, other than latitude and longitude, and links to the park and state the campground belongs to.

Data Sources: Recreation Information Database, Bing

Scraping: The RIDB API documentation was very helpful. We switched API sources for this phase because of difficulties with keys on our original website, which was also hard to work with and poorly formatted. In contrast, the RIDB site lists all the attributes for each type of data it has, and it explains the relationships between things like Campsites and Facilities, which we combine into our Campground entries. We used the requests library for Python to make requests and turn each response into a JSON dictionary that is easy to parse. We got most of our attributes this way, but the only location data we got was latitude and longitude, and we wanted zipcode and state for linking the different pillars together. Originally, we used the Google Geocoding API to get this information, but the free version only allowed 2,500 requests per day and we had 8,000 campgrounds to process, so we switched to also using the Bing Maps API. We did some basic error handling to make sure we had all the data we wanted. We also applied one filter before adding any data to the database: because we wanted every campground to link to a park, we checked whether the campground's zipcode matched a park's zipcode. While scraping the Parks data, we printed out a set of the park zipcodes and stored it as a list in the campground scraping file. If a campground's zipcode was in that list, we inserted the campground into the database.
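The zipcode filter can be illustrated with a short sketch. The function name, the dictionary shape of a campground, and the sample zipcodes are all hypothetical; the real list came from the Parks scrape. The example uses a set rather than a list, which is a swapped-in choice that keeps each lookup fast.

```python
def linked_campgrounds(campgrounds, park_zipcodes):
    """Keep only campgrounds whose zipcode matches a known park zipcode."""
    park_zipcodes = set(park_zipcodes)  # set membership tests are O(1)
    # Campgrounds with no zipcode at all are dropped as well.
    return [c for c in campgrounds if c.get("zipcode") in park_zipcodes]


# Hypothetical sample data for illustration only.
park_zips = ["79834", "82190"]
campgrounds = [
    {"name": "Camp A", "zipcode": "79834"},
    {"name": "Camp B", "zipcode": "10001"},
    {"name": "Camp C"},
]
kept = linked_campgrounds(campgrounds, park_zips)
```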

##States##
States are important for categorization, because users of the website are likely to want to view parks, campgrounds, and events near them. Each state has the following attributes: name, highest elevation point, population, description, and total area, as well as one-to-many relationships to parks and events. The States table shows a name, highest point, population, total area, recommended park, and an upcoming event in the state. Each state instance page shows the same details and highlights the recommended park in the state with a description, along with an upcoming event if there is one.

Data Source: Wikipedia's MediaWiki API

We used wptools and wikipedia, which are Python wrappers for the MediaWiki API. wptools lets you obtain the information from the infobox located on the side of many Wikipedia pages. We extracted our attributes from it with the following code:

    import wptools

    state = wptools.page("Texas").get_parse()  # state.infobox now holds all the infobox content
    landarea = state.infobox["TotalAreaUS"]  # Texas's total area

We used wikipedia to get the description of the state. The code for that was:

    import wikipedia

    wikstate = wikipedia.page("Texas")  # the page matching the title passed to page()
    summary = wikstate.content  # almost all the text of that page, in this case Texas

We then trimmed this text down to just the first paragraph, which provides some basic information about the state. Finally, we bundle all of the attributes into a state object and add it to the database, repeating this for every state.
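One way to do this trimming is sketched below. It is not necessarily the code we used: the helper name `first_paragraph` is hypothetical, and the sketch assumes the page text separates paragraphs with newline characters.

```python
def first_paragraph(content):
    """Return the first non-empty paragraph of a block of plain text."""
    for paragraph in content.split("\n"):
        paragraph = paragraph.strip()
        if paragraph:
            return paragraph
    return ""  # no text at all
```

Under that assumption, `first_paragraph(wikstate.content)` would give the opening paragraph of the page.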

##Events##
Events show users activities, such as charity runs and festivals, coming up in or near state and federal parks. Each event has the following attributes in the database: ID, organization name (which we are using for the event name), latitude, longitude, topic list, start date, end date, picture URL, contact phone number, city, zipcode, and zipregion, as well as foreign keys to the park and state the event is located in. The Events table shows a name, category (from topics), date, organization, city, state, and park. Each event instance page shows the same details and links to the state the event is in and the closest park to the event.

Data Source: Active Access's Activity Search API V2

This data was retrieved using calls to the Activity Search API (V2) provided by Active Access. With this API, we were able to make GET requests with filters that select events relevant to the parks we host on SWEetOutdoors.me. An example of our API GET request is as follows:

http://api.amp.active.com/v2/search?per_page=25&current_page=1&start_date=2016-08-01..2017-12-31&category=trail%20heads&api_key=kww96xbnrt8a3dj6ndfkdyzx

This query specifies the trail heads category, so that all of our events take place outdoors on a trail. It also specifies a date range that extends well into the future; the example above covers August 1, 2016 through December 31, 2017. From the data this request returns, we store the specific fields that our website displays. The key field is the zipcode, because we use it to determine whether an event is in the same region as one of our parks: we compare the first three digits of the park and event zipcodes. This works because the first three digits of a zipcode designate a local region of the US.
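The three-digit comparison can be sketched as follows. The function names are hypothetical, and the validation (requiring five leading digits) is an assumption added for the example rather than a description of our scraper.

```python
def zip_region(zipcode):
    """Return the three-digit prefix of a zipcode, or None if it is malformed."""
    digits = str(zipcode).strip()[:5]  # ignore any ZIP+4 suffix
    if len(digits) < 5 or not digits.isdigit():
        return None
    return digits[:3]


def same_region(park_zip, event_zip):
    """True when both zipcodes are valid and share their first three digits."""
    region = zip_region(park_zip)
    return region is not None and region == zip_region(event_zip)
```

For example, `same_region("78701", "78759")` is true because both zipcodes begin with 787, while `same_region("78701", "77001")` is false.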
