# Employee and Bus Stops

## Context

A company XYZ intends to provide a bus shuttle service that would help its employees commute to the office. The company is based in Mountain View and the shuttle would provide transportation for employees based in San Francisco.

The city of San Francisco has given the company a list of potential bus stops that it may use. However, the company may use no more than 10 of these bus stops for its shuttle service.

The company XYZ is asking you to come up with the **10 most efficient bus stops** that would best serve its employees. Generally speaking, these "efficient" stops would result in the least walking distances between the employees' homes and their respective bus stops. To that end, you were given the following data:
- the list of bus stops provided by the city of San Francisco, `Bus_Stops.csv`
- a list of its employees' home addresses, `Employee_Addresses.csv`

Since trying out all possible combinations of 10 bus stops would take a prohibitively long time, the boss of XYZ has told you that you may simplify the problem and come up with 10 reasonable bus stops that are probably efficient.

## Objectives

- Explore and analyze the data. Provide comments on the outputs of your code and document your code well. 
- Feel free to show off your map visualization skills.
- Write an algorithm that produces the 10 best stops in your opinion. Also, please explain the rationale behind the algorithm. 
- Please the calculate the average walking distance per employee to their respective stops and report it at the end of your work.
- You may code the solution for this task in either Python or R. If you are coding in Python, you may enter your solution at the bottom of this notebook. Otherwise, you may create a new R jupyter notebook and copy this problem description over there. Either way, your solution is to be in the form of a jupyter notebook, regardless of the programming language used.
- Submit your work along with the data files used (`Bus_Stops.csv` & `Employee_Addresses.csv`) in a single ZIP file named as follows: `<FirstName>_<LastName>.zip`

## Evaluation

***Your solution will be evaluated on:***
- ***The soundness of the algorithm used to select the bus stops.***
- ***How much your code is neat, clear and well-documented.***
- ***Quality of narrative and commentary with interesting analyses and visuals.***

## Supplementary Notes

Prior to writing the requested algorithm, you will need to *geocode* the employees' home addresses and bus stops. You may use the [HERE REST APIs](https://developer.here.com/develop/rest-apis) for that purpose. Following are some links to help you in your task:
- To generate a free HereMaps account and an API Key to use for geocoding the addresses:  
    - https://developer.here.com/documentation/identity-access-management/dev_guide/topics/plat-using-apikeys.html
- Sections pertaining to *geocoding* in the documentation:  
    - https://developer.here.com/documentation/geocoder/dev_guide/topics/example-geocoding-free-form.html  
    - https://developer.here.com/documentation/geocoder/dev_guide/topics/example-geocoding-intersection.html
- Programmatically perform GET requests
    - Python: https://realpython.com/python-requests/
    - R: https://www.rdocumentation.org/packages/httr/versions/1.4.4

Note that HereMaps allows a maximum of 1000 requests per day, so it will take more than a single day to do all the geocoding. As a start, you may use all bus stops (~120 stops) and a few hundred employee addresses to start developing your algorithm. Save whatever you geocode so you would not need to geocode it again. Once the geocoding is done for all addresses, run the algorithm one last time and finalize your work.

*Side Note:*  
*The use of HereMaps API for geocoding is just one suggestion. If you are more comfortable using GoogleMaps API or OpenStreetMaps API, then you may use that as well. There are no constraints as to what you may use for geocoding of the addresses.*

<BR>
<BR>
<center><b><u>Finally, note that we will re-run your code (without the geocoding part) to make sure that your work is reproducible.</u></b></center>

<BR>
<center>
<H2>*** GOOD LUCK ***</H2>
</center>

--------

## Importing packages

In [2]:
import pandas as pd 

## Read Data 

In [3]:
bus_stops = pd.read_csv("Bus_Stops.csv")
employee_addresses = pd.read_csv("Employee_Addresses.csv")

In [5]:
print("Bus stops shape: ", bus_stops.shape)
bus_stops.head()

Bus stops shape:  (119, 2)


Unnamed: 0,Street_One,Street_Two
0,MISSION ST,ITALY AVE
1,MISSION ST,NEW MONTGOMERY ST
2,MISSION ST,01ST ST
3,MISSION ST,20TH ST
4,MISSION ST,FREMONT ST


In [6]:
print("Employee addresses shape: ", bus_stops.shape)
employee_addresses.head()

Employee addresses shape:  (119, 2)


Unnamed: 0,address,employee_id
0,"98 Edinburgh St, San Francisco, CA 94112, USA",206
1,"237 Accacia St, Daly City, CA 94014, USA",2081
2,"1835 Folsom St, San Francisco, CA 94103, USA",178
3,"170 Cambridge St, San Francisco, CA 94134, USA",50
4,"16 Roanoke St, San Francisco, CA 94131, USA",1863


In [8]:
import requests 

apiKey = "Fw5VNn0p2uS6vbH90MI6q_1ocMhR6iN_X5ZVGxBGQDc"

address = "98 Edinburgh St, San Francisco, CA 94112, USA"

params = {
    'apiKey': apiKey,
    'searchtext': address
}

base_url = "https://geocoder.ls.hereapi.com/6.2/geocode.json?"

response = requests.get(base_url, params=params)
print(response.json)
print(response.text)

<bound method Response.json of <Response [200]>>
{"Response":{"MetaInfo":{"Timestamp":"2022-12-05T23:05:54.267+0000"},"View":[{"_type":"SearchResultsViewType","ViewId":0,"Result":[{"Relevance":1.0,"MatchLevel":"houseNumber","MatchQuality":{"Country":1.0,"State":1.0,"City":1.0,"Street":[1.0],"HouseNumber":1.0,"PostalCode":1.0},"MatchType":"interpolated","Location":{"LocationId":"NT_jItIZxj3c35.gcKXRmsFCD_5gD","LocationType":"point","DisplayPosition":{"Latitude":37.7276421,"Longitude":-122.4273147},"NavigationPosition":[{"Latitude":37.7275504,"Longitude":-122.4271897}],"MapView":{"TopLeft":{"Latitude":37.7287663,"Longitude":-122.428736},"BottomRight":{"Latitude":37.726518,"Longitude":-122.4258934}},"Address":{"Label":"98 Edinburgh St, San Francisco, CA 94112, United States","Country":"USA","State":"CA","County":"San Francisco","City":"San Francisco","District":"Excelsior","Street":"Edinburgh St","HouseNumber":"98","PostalCode":"94112","AdditionalData":[{"value":"United States","key":"Cou

In [26]:
print(response.text)

{"Response":{"MetaInfo":{"Timestamp":"2022-12-05T23:05:54.267+0000"},"View":[{"_type":"SearchResultsViewType","ViewId":0,"Result":[{"Relevance":1.0,"MatchLevel":"houseNumber","MatchQuality":{"Country":1.0,"State":1.0,"City":1.0,"Street":[1.0],"HouseNumber":1.0,"PostalCode":1.0},"MatchType":"interpolated","Location":{"LocationId":"NT_jItIZxj3c35.gcKXRmsFCD_5gD","LocationType":"point","DisplayPosition":{"Latitude":37.7276421,"Longitude":-122.4273147},"NavigationPosition":[{"Latitude":37.7275504,"Longitude":-122.4271897}],"MapView":{"TopLeft":{"Latitude":37.7287663,"Longitude":-122.428736},"BottomRight":{"Latitude":37.726518,"Longitude":-122.4258934}},"Address":{"Label":"98 Edinburgh St, San Francisco, CA 94112, United States","Country":"USA","State":"CA","County":"San Francisco","City":"San Francisco","District":"Excelsior","Street":"Edinburgh St","HouseNumber":"98","PostalCode":"94112","AdditionalData":[{"value":"United States","key":"CountryName"},{"value":"California","key":"StateName

In [27]:
response_txt = {
    "Response":{
        "MetaInfo":{"Timestamp":"2022-12-05T23:05:54.267+0000"},
        
        "View":[{"_type":"SearchResultsViewType","ViewId":0,
                 
                 "Result":[{"Relevance":1.0,"MatchLevel":"houseNumber",
                            
                            "MatchQuality":{"Country":1.0,"State":1.0,"City":1.0,"Street":[1.0],
                                            "HouseNumber":1.0,"PostalCode":1.0},
                            
                            "MatchType":"interpolated",
                            
                            "Location":{"LocationId":"NT_jItIZxj3c35.gcKXRmsFCD_5gD",
                                        "LocationType":"point",
                                        "DisplayPosition":{"Latitude":37.7276421,"Longitude":-122.4273147},
                                        "NavigationPosition":[{"Latitude":37.7275504,"Longitude":-122.4271897}],
                                        "MapView":{"TopLeft":{"Latitude":37.7287663,"Longitude":-122.428736},
                                                   "BottomRight":{"Latitude":37.726518,"Longitude":-122.4258934}},
                                        "Address":{"Label":"98 Edinburgh St, San Francisco, CA 94112, United States",
                                                   "Country":"USA","State":"CA","County":"San Francisco",
                                                   "City":"San Francisco","District":"Excelsior",
                                                   "Street":"Edinburgh St","HouseNumber":"98","PostalCode":"94112",
                                                   "AdditionalData":[{"value":"United States","key":"CountryName"},
                                                                     {"value":"California","key":"StateName"},
                                                                     {"value":"San Francisco","key":"CountyName"},
                                                                     {"value":"N","key":"PostalCodeType"}
                                                                    ]
                                                  }
                                       }
                           }
                          ]
                }
               ]
    }
}

In [54]:
import plotly.graph_objects as go

# import pandas as pd

# df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2011_february_us_airport_traffic.csv')
# df['text'] = df['airport'] + '' + df['city'] + ', ' + df['state'] + '' + 'Arrivals: ' + df['cnt'].astype(str)

fig = go.Figure(data=go.Scattergeo(
        lon = [-122.4273147],
        lat = [37.7276421],
        text = "98 Edinburgh St",
        mode = 'markers',
        
        ))

fig.update_layout(
        title = 'Most trafficked US airports<br>(Hover for airport names)',
        geo=dict(
            scope='usa',
            projection_type='albers usa',
            showland=True,
        )
    )
fig.show()

In [57]:
import plotly.graph_objects as go

mapbox_access_token = open(".mapbox_token").read()

fig = go.Figure(go.Scattermapbox(
        lat=['37.7276421'],
        lon=['-122.4273147'],
        mode='markers',
        marker=go.scattermapbox.Marker(
            size=14
        ),
        text=['98 Edinburgh St'],
    ))

fig.update_layout(
    hovermode='closest',
    mapbox=dict(
        accesstoken=mapbox_access_token,
        bearing=0,
        center=go.layout.mapbox.Center(
            lat=37,
            lon=-122
        ),
        pitch=0,
        zoom=5
    )
)

fig.show()