## Notebook Objectives

The objective of this notebook is to transform rest area data for the United States and Canada, downloaded from [POI Factory](http://www.poi-factory.com/poifiles), for later use in the RV Nav API. Because the CSV is small, it was partly cleaned in MS Excel before being loaded here. For a file this size, MS Excel is still a great tool for data cleaning!

In [1]:
import pandas as pd

In [2]:
# header=None treats the first line as data, so column names are supplied explicitly
df = pd.read_csv(
    '../Data/RestAreasCombined_USA.csv',
    header=None,
    names=['Longitude', 'Latitude', 'State', 'Roadway', 'Direction',
           'Name', 'Mile Marker', 'Amenities'],
)
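
The effect of `header=None` together with `names` can be checked on a small inline sample. This is only a sketch: the `sample` string and the `sample_df` and `columns` names are illustrative, and the two rows are copied from the output shown in this notebook on the assumption that the file uses the same column order.

```python
import io
import pandas as pd

columns = ['Longitude', 'Latitude', 'State', 'Roadway', 'Direction',
           'Name', 'Mile Marker', 'Amenities']

# Two header-less rows in the assumed layout of the source file (illustrative sample).
sample = (
    '-87.4185,30.57584,AL,I-10,WB,BALDWIN COUNTY WELCOME CENTER,MM65.8,"[RR,PT,VM,Pets,HF]|"\n'
    '-88.393032,30.477238,AL,I-10,EB,GRAND BAY WELCOME CENTER,MM0.45,"[RR,PT,VM,Pets,HF]|RV Dump"\n'
)

# header=None tells pandas the first line is data; names supplies the labels.
# The quoted Amenities field keeps its embedded commas as a single value.
sample_df = pd.read_csv(io.StringIO(sample), header=None, names=columns)
print(sample_df.dtypes)
```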

In [3]:
df.head()

,Longitude,Latitude,State,Roadway,Direction,Name,Mile Marker,Amenities
0,-147.814902,61.800549,AK,AK-1,EB/WB,MATANUSKA GLACIER STATE REC PARK REST AREA,MM101,"[RR,PT,,Pets,HF]|SCENIC VIEW"
1,-87.4185,30.57584,AL,I-10,WB,BALDWIN COUNTY WELCOME CENTER,MM65.8,"[RR,PT,VM,Pets,HF]|"
2,-88.393032,30.477238,AL,I-10,EB,GRAND BAY WELCOME CENTER,MM0.45,"[RR,PT,VM,Pets,HF]|RV Dump"
3,-85.369424,33.667729,AL,I-20,WB,CLEBURNE WELCOME CENTER,MM213,"[RR,PT,VM,Pets,HF]|"
4,-86.309572,33.850692,AL,I-59,NB,ST CLAIR COUNTY REST AREA,MM164,"[RR,PT,VM,Pets,HF]|RV Dump"


In [4]:
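# Transform the Amenities column into one boolean column per feature.
# regex=False matches each code literally, so the parentheses in 'RR(VT)' are not read as a regex group.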
df['Restrooms'] = df['Amenities'].str.contains('RR', regex=False)
df['Restrooms(Vault Toilet)'] = df['Amenities'].str.contains('RR(VT)', regex=False)
df['Restrooms(Portapotties)'] = df['Amenities'].str.contains('RR(PP)', regex=False)
df['Picnic Tables'] = df['Amenities'].str.contains('PT', regex=False)
df['Vending Machines'] = df['Amenities'].str.contains('VM', regex=False)
df['Pet Walking Area'] = df['Amenities'].str.contains('Pets', regex=False)
df['Handicapped Facilities'] = df['Amenities'].str.contains('HF', regex=False)
df['RV Dump'] = df['Amenities'].str.contains('RV Dump', regex=False)
df['Restaurant'] = df['Amenities'].str.contains('Food', regex=False)
df['Gas'] = df['Amenities'].str.contains('Gas', regex=False)
df['Scenic View'] = df['Amenities'].str.contains('SCENIC VIEW', regex=False)

In [5]:
df.head()

,Longitude,Latitude,State,Roadway,Direction,Name,Mile Marker,Amenities,Restrooms,Restrooms(Vault Toilet),Restrooms(Portapotties),Picnic Tables,Vending Machines,Pet Walking Area,Handicapped Facilities,RV Dump,Restaurant,Gas,Scenic View
0,-147.814902,61.800549,AK,AK-1,EB/WB,MATANUSKA GLACIER STATE REC PARK REST AREA,MM101,"[RR,PT,,Pets,HF]|SCENIC VIEW",True,False,False,True,False,True,True,False,False,False,True
1,-87.4185,30.57584,AL,I-10,WB,BALDWIN COUNTY WELCOME CENTER,MM65.8,"[RR,PT,VM,Pets,HF]|",True,False,False,True,True,True,True,False,False,False,False
2,-88.393032,30.477238,AL,I-10,EB,GRAND BAY WELCOME CENTER,MM0.45,"[RR,PT,VM,Pets,HF]|RV Dump",True,False,False,True,True,True,True,True,False,False,False
3,-85.369424,33.667729,AL,I-20,WB,CLEBURNE WELCOME CENTER,MM213,"[RR,PT,VM,Pets,HF]|",True,False,False,True,True,True,True,False,False,False,False
4,-86.309572,33.850692,AL,I-59,NB,ST CLAIR COUNTY REST AREA,MM164,"[RR,PT,VM,Pets,HF]|RV Dump",True,False,False,True,True,True,True,True,False,False,False
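
The same flags could also be built from a lookup table rather than one statement per column. The sketch below is an alternative, not the method used in this notebook; the `AMENITY_CODES` name and the three-row `sample` frame are assumptions for illustration. It also shows that a plain `RR` match is true for `RR(VT)` as well, so `Restrooms` effectively means "any restroom type".

```python
import pandas as pd

# Hypothetical lookup table: output column -> literal code searched for in 'Amenities'.
AMENITY_CODES = {
    'Restrooms': 'RR',
    'Restrooms(Vault Toilet)': 'RR(VT)',
    'Picnic Tables': 'PT',
    'RV Dump': 'RV Dump',
    'Scenic View': 'SCENIC VIEW',
}

# Illustrative rows; the second one uses the vault-toilet code.
sample = pd.DataFrame({'Amenities': [
    '[RR,PT,,Pets,HF]|SCENIC VIEW',
    '[RR(VT),PT,,,]|',
    '[RR,PT,VM,Pets,HF]|RV Dump',
]})

for column, code in AMENITY_CODES.items():
    # regex=False treats the parentheses in 'RR(VT)' as literal characters.
    sample[column] = sample['Amenities'].str.contains(code, regex=False)

print(sample.drop(columns='Amenities'))
```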


In [16]:
# Drop the raw Amenities column now that it has been split into flags
df = df.drop(columns='Amenities')

In [11]:
# Confirm that no Longitude values are missing
df['Longitude'].isnull().value_counts()

False    2882
Name: Longitude, dtype: int64

In [12]:
# Confirm that no Latitude values are missing
df['Latitude'].isnull().value_counts()

False    2882
Name: Latitude, dtype: int64
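
Both coordinate columns can also be checked in one call, and a range check catches values that are present but implausible. A minimal sketch on made-up rows (the `coords` frame and the `missing` and `in_range` names are illustrative, not part of this notebook):

```python
import pandas as pd

# Made-up coordinates; the third row has a missing longitude.
coords = pd.DataFrame({
    'Longitude': [-147.814902, -87.4185, None],
    'Latitude': [61.800549, 30.57584, 33.667729],
})

# Count missing values per column in a single call.
missing = coords[['Longitude', 'Latitude']].isna().sum()
print(missing)

# Valid coordinates lie within [-180, 180] for longitude and [-90, 90] for latitude.
# Comparisons against NaN are False, so rows with missing values fail the check too.
in_range = coords['Longitude'].between(-180, 180) & coords['Latitude'].between(-90, 90)
print(coords[~in_range])
```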

In [17]:
# One last verification
df.head()

,Longitude,Latitude,State,Roadway,Direction,Name,Mile Marker,Restrooms,Restrooms(Vault Toilet),Restrooms(Portapotties),Picnic Tables,Vending Machines,Pet Walking Area,Handicapped Facilities,RV Dump,Restaurant,Gas,Scenic View
0,-147.814902,61.800549,AK,AK-1,EB/WB,MATANUSKA GLACIER STATE REC PARK REST AREA,MM101,True,False,False,True,False,True,True,False,False,False,True
1,-87.4185,30.57584,AL,I-10,WB,BALDWIN COUNTY WELCOME CENTER,MM65.8,True,False,False,True,True,True,True,False,False,False,False
2,-88.393032,30.477238,AL,I-10,EB,GRAND BAY WELCOME CENTER,MM0.45,True,False,False,True,True,True,True,True,False,False,False
3,-85.369424,33.667729,AL,I-20,WB,CLEBURNE WELCOME CENTER,MM213,True,False,False,True,True,True,True,False,False,False,False
4,-86.309572,33.850692,AL,I-59,NB,ST CLAIR COUNTY REST AREA,MM164,True,False,False,True,True,True,True,True,False,False,False


Our dataset is now ready to be used in our API!

In [20]:
# Save the cleaned data; by default the DataFrame index is written as the first column
df.to_csv('Rest_Stop_API.csv')
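
By default `to_csv` writes the DataFrame index as an unnamed first column, which comes back as `Unnamed: 0` when the file is read again. Whether the API expects that column is not stated here, so the following is only a sketch of the difference, using an in-memory buffer and a made-up two-row frame (`small`) instead of the real file:

```python
import io
import pandas as pd

small = pd.DataFrame({'State': ['AK', 'AL'], 'RV Dump': [False, True]})

# Default: the index is written as an extra, unnamed first column.
with_index = io.StringIO()
small.to_csv(with_index)
with_index.seek(0)
cols_with_index = pd.read_csv(with_index).columns.tolist()
print(cols_with_index)

# index=False writes only the data columns.
without_index = io.StringIO()
small.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = pd.read_csv(without_index).columns.tolist()
print(cols_without_index)
```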