<a href="https://colab.research.google.com/github/usm-cos422-522/Working/blob/main/OOLab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Object Oriented Lab

Creating classes to pull data from a Recreation API


### Exercise (Setting the stage)

*   Sign up for an account at https://ridb.recreation.gov/landing
*   Select your profile in the upper right-hand corner and get an API key (https://ridb.recreation.gov/profile). Copy it somewhere accessible (we will need it).
*   Experiment with a few of the APIs at https://ridb.recreation.gov/docs



In [2]:
# this makes sure files are reloaded (so latest version is used)
%load_ext autoreload
%autoreload 2


In [4]:
!pip install config      # not available by default

Collecting config
  Downloading config-0.5.1-py2.py3-none-any.whl (20 kB)
Installing collected packages: config
Successfully installed config-0.5.1


In [15]:
# Bring in some libraries 
import pandas as pd
import requests
import json
import numpy as np
import config


### Exercise

We do not want to put our API key in the source code of our notebook, because it is private information. Instead, we put the API key in a file and read it in. If you skip this step, you will have to substitute your API key directly into the code.
*   Create a file on your local machine called config.cfg and put one line in the file: API_KEY:'your-key'
*   Upload the file to Google Colab: open the file explorer, click the three dots that appear when you hover over the directory, select the "Upload" option, then choose the file(s) in the "File Upload" dialog window.
*   Run the code below and verify that the correct key is displayed.
*   Once you have verified that it works, you can delete the cell.



In [None]:
cfg = config.Config('config.cfg')
cfg['API_KEY']

In [None]:
# When we start out, we are often in experiment mode ... just get something to work
# Verify that the following code runs : 
ridb_params = {'limit': 50, 'state' : 'ME, NH','apikey': cfg['API_KEY']}
ridb_facilities_endpoint = 'https://ridb.recreation.gov/api/v1/facilities'
response = requests.get(ridb_facilities_endpoint, params=ridb_params)
data = json.loads(response.text)
df = pd.json_normalize(data['RECDATA'])
df.head()
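
To see roughly what happens to the `params` dictionary, the standard library's `urlencode` can build a comparable query string without contacting the server. This is only an illustrative sketch: the key below is a placeholder, and `requests` does its own encoding, which may differ in details.

```python
from urllib.parse import urlencode

# Placeholder key for illustration only; the real key comes from cfg['API_KEY'].
example_params = {'limit': 50, 'state': 'ME, NH', 'apikey': 'your-key'}
endpoint = 'https://ridb.recreation.gov/api/v1/facilities'

# Turn the dictionary into "key=value&key=value" form with special characters escaped.
query_string = urlencode(example_params)
full_url = f'{endpoint}?{query_string}'
print(full_url)
```

Printing the URL makes it easy to spot how each parameter is sent, including the comma and the space inside `'ME, NH'`.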

In [16]:
# let's clean up some of the data
df = df.replace('', np.nan)
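# regex=True is needed in pandas 2.0+, where str.replace treats the pattern as a literal string by default
df.columns = df.columns.str.replace('.*Latitude', 'Latitude', regex=True)
df.columns = df.columns.str.replace('.*Longitude', 'Longitude', regex=True)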
df = df.dropna(subset=['Latitude','Longitude'])
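
The effect of these cleaning steps is easier to see on a tiny hand-made frame. The sketch below assumes column names of the form `FacilityLatitude` and `FacilityLongitude` as stand-ins for what the API returns; the values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up rows: two of them are missing a coordinate (empty string).
toy = pd.DataFrame({
    'FacilityName': ['A', 'B', 'C'],
    'FacilityLatitude': [44.1, '', 43.9],
    'FacilityLongitude': [-70.2, -71.0, ''],
})

toy = toy.replace('', np.nan)                       # empty strings become missing values
toy.columns = toy.columns.str.replace('.*Latitude', 'Latitude', regex=True)
toy.columns = toy.columns.str.replace('.*Longitude', 'Longitude', regex=True)
toy = toy.dropna(subset=['Latitude', 'Longitude'])  # keep rows that have both coordinates

print(list(toy.columns))
print(len(toy))
```

Only row `A` has both coordinates, so it is the only row left after `dropna`.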

In [17]:
# There are a lot of APIs and we may want to access them, so let's have a convenient
# function to load and clean up data
def load_clean_ridb_data(endpoint, url_params):
    response = requests.get(url=endpoint, params=url_params)
    data = json.loads(response.text)
    df = pd.json_normalize(data['RECDATA'])
    df = df.replace('', np.nan)
    # regex=True is needed in pandas 2.0+, where str.replace treats the pattern as a literal string by default
    df.columns = df.columns.str.replace('.*Latitude', 'Latitude', regex=True)
    df.columns = df.columns.str.replace('.*Longitude', 'Longitude', regex=True)
    df = df.dropna(subset=['Latitude', 'Longitude'])

    return df


In [None]:
# we check that it works and is the same as above
ridb_df = load_clean_ridb_data(ridb_facilities_endpoint, ridb_params)
ridb_df.head()

### Exercise 
In Data Science projects, data loading and transformation are often handled by a class. We will use our simple example to practice this. The code in the function above should be organized into a class with the following structure:

RidbData : Name of class 

*   an attribute named df, a dataframe that holds the loaded data
*   an initialization method with parameters for a name for the df, the URL endpoint, and url_params
*   a clean method that does the data transformations
*   a load data method that pulls the data from the URL with the parameters and puts it in the df attribute

Verify that the class works on the facilities endpoint by creating an object instance, loading the data, and printing the shape of the df attribute. Next, display the head rows, clean the data, and display the head again.

Finally, verify that it works on the campsite data.
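
As a reminder of the general shape of such a class, here is a minimal sketch on unrelated toy data. `ToyData`, its `records` parameter, and the `price` column are all hypothetical names chosen for illustration; this is not the RidbData solution, and it reads in-memory records instead of calling an API.

```python
import pandas as pd

class ToyData:
    """Hypothetical loader/cleaner pair used only to show the pattern."""

    def __init__(self, name, records):
        self.name = name          # label for this dataset
        self.records = records    # stand-in for a URL endpoint plus parameters
        self.df = pd.DataFrame()  # empty until load() is called

    def load(self):
        # A real loader would request data from a URL here.
        self.df = pd.json_normalize(self.records)

    def clean(self):
        # Drop rows that are missing a required value.
        self.df = self.df.dropna(subset=['price'])

toy = ToyData('fruit', [{'item': 'apple', 'price': 1.5},
                        {'item': 'pear', 'price': None}])
toy.load()
shape_before = toy.df.shape
toy.clean()
shape_after = toy.df.shape
print(shape_before, shape_after)
```

Keeping `load` and `clean` as separate methods lets you inspect the raw data before any rows are dropped, which is what the verification steps above ask for.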

### Exercise 
We extend the class and override the clean method to accommodate the media API.

The RIDB Media Endpoint is per facility, so we have to provide the facility ID in the endpoint URL:
https://ridb.recreation.gov/api/v1/facilities/{facilityID}/media/

*   Use your class to get the media for facility 10118160. Verify that it works and display the head.
*   Now try to clean the data. What happened, and why?
*   Create a new class RidbMedia that has all the characteristics of the previous class, except that its clean method only selects rows where MediaType = 'Image' (note that you should first check that the dataframe is not empty).

Verify that it works for a few facilities.
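
For reference, overriding a method in a subclass follows the pattern sketched below. The classes `BaseData` and `PhotoData` and the columns `price` and `kind` are hypothetical toy names, not part of the RIDB data; the point is only that the subclass inherits everything and replaces `clean`.

```python
import pandas as pd

class BaseData:
    def __init__(self, records):
        self.df = pd.DataFrame(records)

    def clean(self):
        # Base behaviour: require a value in the 'price' column.
        self.df = self.df.dropna(subset=['price'])

class PhotoData(BaseData):
    # __init__ is inherited unchanged; only clean() is replaced.
    def clean(self):
        # Guard against an empty frame, which has no 'kind' column to filter on.
        if self.df.empty:
            return
        self.df = self.df[self.df['kind'] == 'photo']

photos = PhotoData([{'kind': 'photo', 'title': 'a'},
                    {'kind': 'video', 'title': 'b'}])
photos.clean()
print(len(photos.df))

empty = PhotoData([])
empty.clean()  # returns early instead of raising a KeyError
```

Without the emptiness check, filtering on a column that does not exist would raise a `KeyError` for any dataset that returned no rows.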


### Exercise (Graduate Students or Extra Credit)

Passing in the FacilityID as part of the URL was a bit clunky for the user. Support the ability to pass in a list of facilities and collect all the media entries for those facilities (up to the limit).  
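
Two building blocks may help here: turning a list of IDs into endpoint URLs, and combining several dataframes into one. The sketch below makes no network calls; the helper names `media_endpoints` and `combine_frames`, the ID 12345, and the toy frames are hypothetical, and treating "the limit" as a cap on the total number of rows is an assumption.

```python
import pandas as pd

MEDIA_TEMPLATE = 'https://ridb.recreation.gov/api/v1/facilities/{facility_id}/media/'

def media_endpoints(facility_ids):
    # Build one media endpoint URL per facility ID.
    return [MEDIA_TEMPLATE.format(facility_id=fid) for fid in facility_ids]

def combine_frames(frames, limit):
    # Skip empty frames, stack the rest, and keep at most `limit` rows.
    non_empty = [frame for frame in frames if not frame.empty]
    if not non_empty:
        return pd.DataFrame()
    return pd.concat(non_empty, ignore_index=True).head(limit)

# 12345 is a made-up ID used only to show the pattern.
endpoints = media_endpoints([10118160, 12345])
print(endpoints[0])

# Toy frames standing in for per-facility results.
frames = [pd.DataFrame({'MediaType': ['Image', 'Video']}),
          pd.DataFrame(),
          pd.DataFrame({'MediaType': ['Image']})]
combined = combine_frames(frames, limit=2)
print(len(combined))
```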


### Hand-in
Submit the completed notebook to Brightspace.