### Route Visualization using Geopy, Polyline, Folium
Using this code, we can create an interactive route visualization using Geopy, Polyline, and Folium. 

Originally in our project proposal we said we would use the Google Maps API, but through experimentation it proved to be difficult to work with. You can obtain an API key from Google, but a lot of features cannot be accessed unless you provide billing information to Google. If you provide billing information, there is a free version you can use, but you have a limited number of geocode requests you can make. Requests over that threshold will charge you using your billing information. That's not something that I think we want to deal with for a school project, so I tried to find some alternate methods.

### Modules
The modules we'll be using throughout the code are Pandas and Requests.
- **Pandas**: We use Pandas to create the dataframe containing all of the addresses and their corresponding longitude and latitude coordinates. This dataframe will also be used later in selecting which specific locations we want to visualize.
- **requests**: This module allows us to make http requests in Python. We use requests to query from OSRM and get coordinates for each of our addresses.

### Web Scraping

In [69]:
# main modules to use
import pandas as pd # create the locations dataframe
import requests # send http requests
from bs4 import BeautifulSoup as soup
import import_ipynb
import webscraping_code_touristsite_top150 as scraper
import webscraping_code_hotel_all as hotel_scraper

importing Jupyter notebook from webscraping_code_touristsite_top150.ipynb
Which tourist sites do you want to go? (*please input the names of the tourist sites separated by a period)
importing Jupyter notebook from webscraping_code_hotel_all.ipynb
Which hotel do you want to live?


In [103]:
html_main = requests.get("https://www.tripadvisor.com/Attractions-g32655-Activities-a_allAttractions.true-Los_Angeles_California.html")
bsobj_main = soup(html_main.content, "lxml")

In [104]:
rank = []
tourist_site_name = []
tourist_site_link = []
for s in bsobj_main.find_all("section", class_="_2TabEHya _3YhIe-Un"):
    rank.append(s.find("div", class_="_1gpq3zsA _1zP41Z7X").text.split(". ")[0])
    tourist_site_name.append(s.find("div", class_="_1gpq3zsA _1zP41Z7X").text.split(". ")[1])
    tourist_site_link.append(s.find("div", class_="_3W_31Rvp _1nUIPWja _1l7Rsl_O _3ksqqIVm _2b3s5IMB").a["href"])

In [105]:
top_150 = list(range(30, 150, 30))

In [106]:
for num in top_150:
    html_main = requests.get("https://www.tripadvisor.com/Attractions-g32655-Activities-oa" + str(num) + "-Los_Angeles_California.html")
    bsobj_main = soup(html_main.content, "lxml")
    
    for s in bsobj_main.find_all("section", class_="_2TabEHya _3YhIe-Un"):
        rank.append(s.find("div", class_="_1gpq3zsA _1zP41Z7X").text.split(". ")[0])
        tourist_site_name.append(s.find("div", class_="_1gpq3zsA _1zP41Z7X").text.split(". ")[1])
        tourist_site_link.append(s.find("div", class_="_3W_31Rvp _1nUIPWja _1l7Rsl_O _3ksqqIVm _2b3s5IMB").a["href"])

In [107]:
tourist_site_link_update = []
for site in tourist_site_link:
    tourist_site_link_update.append("https://www.tripadvisor.com/" + site)

In [108]:
data = {"Rank": rank, "Tourist Site Name": tourist_site_name, "Site Link": tourist_site_link_update}
df = pd.DataFrame.from_dict(data)
df

Unnamed: 0,Rank,Tourist Site Name,Site Link
0,1,The Getty Center,https://www.tripadvisor.com//Attraction_Review...
1,2,Griffith Observatory,https://www.tripadvisor.com//Attraction_Review...
2,3,Universal Studios Hollywood,https://www.tripadvisor.com//Attraction_Review...
3,4,Petersen Automotive Museum,https://www.tripadvisor.com//Attraction_Review...
4,5,The Wizarding World of Harry Potter,https://www.tripadvisor.com//Attraction_Review...
...,...,...,...
145,146,S.S,https://www.tripadvisor.com//Attraction_Review...
146,147,Kenneth Hahn State Recreation Area,https://www.tripadvisor.com//Attraction_Review...
147,148,Geffen Playhouse,https://www.tripadvisor.com//Attraction_Review...
148,149,Los Angeles Police Museum,https://www.tripadvisor.com//Attraction_Review...


In [109]:
tourist_site_name

['The Getty Center',
 'Griffith Observatory',
 'Universal Studios Hollywood',
 'Petersen Automotive Museum',
 'The Wizarding World of Harry Potter',
 'Battleship USS Iowa Museum',
 'The Broad',
 'Staples Center',
 'Griffith Park',
 'The Grove',
 'La Brea Tar Pits and Museum',
 'Walt Disney Concert Hall',
 'Natural History Museum of Los Angeles County',
 'Runyon Canyon Park',
 'Venice Canals Walkway',
 'The Nethercutt Collection',
 'Dodger Stadium',
 'Union Station',
 'Hollywood Sign',
 'Bradbury Building',
 'Lake Hollywood Park',
 'Universal CityWalk Hollywood',
 'Los Angeles County Museum of Art',
 'Madame Tussauds Hollywood',
 'Abbot Kinney Boulevard',
 'Angels Flight Railway',
 'University of California, Los Angeles (UCLA)',
 'Hollywood Forever Cemetery',
 'Citadel Outlets',
 'Pantages Theatre',
 'The Hollywood Museum',
 'OUE Skyspace LA',
 'Los Angeles Central Library',
 'Olvera Street',
 'Pierce Brothers Westwood Village Memorial Park',
 'Dolby Theatre',
 'The Greek Theatre',
 'Ci

In [110]:
want_to_go_name = str(input("Which tourist sites do you want to go? (*please input the names of the tourist sites separated by a period)"))

Which tourist sites do you want to go? (*please input the names of the tourist sites separated by a period)The Broad.Staples Center.Griffith Park.The Grove


In [111]:
visit_html = scraper.visit_tourist(df, want_to_go_name)

In [112]:
locations = scraper.find_location(visit_html)
locations

['221 S Grand Ave, Los Angeles, CA 90012-3020',
 '1111 S Figueroa St , Los Angeles, CA 90015-1300"',
 '4730 Crystal Springs Dr, Los Angeles, CA 90027-1401',
 '189 The Grove Drive, Los Angeles, CA 90036-6222']

In [70]:
html_main = requests.get("https://www.tripadvisor.com/Hotels-g32655-Los_Angeles_California-Hotels.html")
bsobj_main = soup(html_main.content, "lxml")

In [71]:
rank = []
hotel_name = []
rate_score = []
hotel_link = []
for s1 in bsobj_main.find_all("div", class_ = "ui_column is-8 main_col allowEllipsis"):
    if s1.find("span", class_ = "ui_merchandising_pill sponsored_v2") == None:
        for s2 in s1.find_all("div", class_ = "listing_title"):
            hotel_name.append(s1.a.text[6:])
            hotel_link.append(s1.a["href"])
        for s3 in s1.find_all("div", class_ = "info-col"):
            if s3.a.text == "0 reviews":
                rate_score.append("0 of 5 bubbles")
            else:
                rate_score.append(s3.a["alt"])

for s4 in bsobj_main.find_all("div", class_ = "popindex"):    
    rank.append(s4.text.split(" ")[0][1:])

In [72]:
top_600 = list(range(30, 600, 30))

In [73]:
for num in top_600:
    html_main = requests.get("https://www.tripadvisor.com/Hotels-g32655-oa" + str(num) + "-Los_Angeles_California-Hotels.html")
    bsobj_main = soup(html_main.content, "lxml")
    
    for s1 in bsobj_main.find_all("div", class_ = "ui_column is-8 main_col allowEllipsis"):
        if s1.find("span", class_ = "ui_merchandising_pill sponsored_v2") == None:
            for s2 in s1.find_all("div", class_ = "listing_title"):
                hotel_name.append(s1.a.text[6:])
                hotel_link.append(s1.a["href"])
            for s3 in s1.find_all("div", class_ = "info-col"):
                if s3.a.text == "0 reviews":
                    rate_score.append("0 of 5 bubbles")
                else:
                    rate_score.append(s3.a["alt"])

    for s4 in bsobj_main.find_all("div", class_ = "popindex"):    
        rank.append(s4.text.split(" ")[0][1:])

In [74]:
hotel_link_update = []
for site in hotel_link:
    hotel_link_update.append("https://www.tripadvisor.com/" + site)

In [75]:
data = {"Rank": rank[:450], "Hotel Name": hotel_name[:450], "Rate": rate_score[:450], "Site Link": hotel_link_update[:450]}
hotels = pd.DataFrame.from_dict(data)
hotels

Unnamed: 0,Rank,Hotel Name,Rate,Site Link
0,1,The Hollywood Roosevelt,4.5 of 5 bubbles,https://www.tripadvisor.com//Hotel_Review-g326...
1,2,Hilton Los Angeles/Universal City,4 of 5 bubbles,https://www.tripadvisor.com//Hotel_Review-g326...
2,3,Hotel Erwin,4.5 of 5 bubbles,https://www.tripadvisor.com//Hotel_Review-g326...
3,4,Hotel Figueroa,4.5 of 5 bubbles,https://www.tripadvisor.com//Hotel_Review-g326...
4,5,Hollywood Hotel,4 of 5 bubbles,https://www.tripadvisor.com//Hotel_Review-g326...
...,...,...,...,...
445,305,Studio Inn Van Nuys,3 of 5 bubbles,https://www.tripadvisor.com//Hotel_Review-g326...
446,313,Stars Inn,2.5 of 5 bubbles,https://www.tripadvisor.com//Hotel_Review-g326...
447,317,Lincoln Inn,2.5 of 5 bubbles,https://www.tripadvisor.com//Hotel_Review-g326...
448,321,Vista Motel,1.5 of 5 bubbles,https://www.tripadvisor.com//Hotel_Review-g326...


In [133]:
hotel_want_to_go_name = str(input("Which hotel do you want to live?"))
hotel_visit_html = hotel_scraper.visit_hotel(hotels, hotel_want_to_go_name)

Which hotel do you want to live?Hollywood Hotel


In [134]:
hotel_location = hotel_scraper.find_location(hotel_visit_html)
hotel_location

['1160 North Vermont Avenue District of Hollywood, Los Angeles, CA 90029-1729']

### Geocoding
Another module we will be using is **Geopy**, which is a Python client for geocoding. What is geocoding?

**Geocoding** is the process of taking some text-based description of a location (usually the address) and transforming it into coordinates, using the longitude/latitude pair. We will use geocoding to transform the addresses of our hotel/attraction locations (obtained from webscraping) into longitude/latitude coordinates (as far as I know, TripAdvisor doesn't provide coordinates).

From the Geopy module, we will use **Nominatim**, which is a geocoder for **OpenStreetMap (OSM)** data. OSM is a map of the world that is open data, licensed under the Open Database License. As such, we will credit OSM at the end of the notebook. You can check out OSM here:

https://www.openstreetmap.org/

In [12]:
from geopy.geocoders import Nominatim # Geopy is a Python client for geocoding. We use the Nominatim geocoder for OpenStreetMap (OSM) data.

We create a sample locations dataframe using Nominatim geocoding. I will update this code later once we have a working web scraper. For now, I am using the top 3 attractions in Los Angeles from TripAdvisor: the Getty Center, the Griffith Observatory, and Universal Studios Hollywood. 

Something I noticed about OSM is that not all addresses work. For example, the address for the Getty Center provided by TripAdvisor is *1200 Getty Center Dr N Sepulveda Blvd & Getty Center Dr, Los Angeles, CA*. However, if you take this address and plug it into OSM's website, you will get no results. This causes problems for the geocoder, and you will not get the location. For this example, I simplified the address for the Getty Center as *1200 Getty Center, Los Angeles*, but for the long term we will have to find some workaround that can apply to all addresses. I think the solution might be some combination of regex and try/except blocks.

In [13]:
import re

In [17]:
# Nominatim geocoding and creating the Pandas dataframe

geolocator = Nominatim(user_agent = "m") # use custom user_agent to avoid violating Nominatim's usage policy
df = pd.DataFrame(columns = ["Address", "Longitude", "Latitude"]) # creating the Pandas dataframe

for location in locations:
    location = re.sub(r'\s\d{5}-\d{4}', "", location)
    if location == "1200 Getty Center Dr N Sepulveda Blvd & Getty Center Dr, Los Angeles, CA":
        location = "1200 Getty Center"
    loc = geolocator.geocode(location) # geocoding the location
    print(loc)
    df.loc[len(df.index)] = [loc, loc.longitude, loc.latitude] # adding the location, longitude, and latitude to the dataframe
    
df

Getty Research Institute, 1200, Getty Center Drive, Brentwood, Los Angeles, Los Angeles County, California, 90049, United States
Griffith Observatory, 2800, East Observatory Road, Griffith Park, Los Angeles, Los Angeles County, California, 90027, United States
Universal City Plaza, Lankershim Boulevard, Los Angeles, Los Angeles County, California, 91608, United States
Petersen Automotive Museum‎, 6060, Wilshire Boulevard, Mid-Wilshire, Los Angeles, Los Angeles County, California, 90036, United States


Unnamed: 0,Address,Longitude,Latitude
0,"(Getty Research Institute, 1200, Getty Center ...",-118.475712,34.076951
1,"(Griffith Observatory, 2800, East Observatory ...",-118.300293,34.118219
2,"(Universal City Plaza, Lankershim Boulevard, L...",-118.361879,34.138321
3,"(Petersen Automotive Museum‎, 6060, Wilshire B...",-118.361191,34.062315


In [18]:
# Cleans the addresses by removing the code at the end and then removing the last word of the address with each iteration of
# the loop. OSM sometimes has trouble recognizing addresses that have extra details in the middle of their addresses. 
def location_cleaner(locations):
    locations_copy = []
    for location in locations:
        location = re.sub(r'\s\d{5}-\d{4}', "", location)
        words = location.split()
        location_length = len(words)
        for i in range(location_length, 0, -1):
            if geolocator.geocode(location) != None:
                locations_copy.append(location)
                break
            location = location.rsplit(' ', 1)[0]
    return locations_copy

In [92]:
geolocator = Nominatim(user_agent = "pic16b")

In [113]:
df = pd.DataFrame(columns = ["Address", "Longitude", "Latitude"]) # creating the Pandas dataframe

locations = location_cleaner(locations)
for location in locations:
    loc = geolocator.geocode(location) # geocoding the location
    df.loc[len(df.index)] = [loc, loc.longitude, loc.latitude] # adding the location, longitude, and latitude to the dataframe
    
df

Unnamed: 0,Address,Longitude,Latitude
0,"(The Broad Museum, 221, South Grand Avenue, Bu...",-118.250557,34.054441
1,"(Figueroa Street, Financial District, Downtown...",-118.254802,34.054796
2,"(Griffith Park Boys Camp, 4730, Crystal Spring...",-118.296511,34.140604
3,"(American Girl Place, 189, The Grove Drive, Th...",-118.359047,34.0724


We can see that we have obtained the longitude and latitude for each address and we have successfully added the data to a Pandas dataframe. A quick Google search tells us that our coordinates are pretty accurate, up to about 3 decimal places.

In [135]:
# may or may not need a dataframe depending on if we allow a user to stay at multiple hotels
hotel = pd.DataFrame(columns = ["Address", "Longitude", "Latitude"]) # creating the Pandas dataframe

hotel_locations = location_cleaner(hotel_location)
print(hotel_locations)
for location in hotel_locations:
    loc = geolocator.geocode(location) # geocoding the location
    hotel.loc[len(df.index)] = [loc, loc.longitude, loc.latitude] # adding the location, longitude, and latitude to the dataframe
    
hotel

['1160 North Vermont Avenue District of Hollywood, Los Angeles, CA']


Unnamed: 0,Address,Longitude,Latitude
4,"(North Vermont Avenue, Bicycle District, East ...",-118.291714,34.079147


### Creating the Route Visualization
Now that we have our addresses and coordinates within a Pandas dataframe, we can begin to construct the route visualization. For this section, I mainly referenced this article: https://www.thinkdatascience.com/post/2020-03-03-osrm/osrm/

In the article, Mr. Yan makes use of **Open Source Routing Machine (OSRM)**, which is a C++ routing engine for the shortest paths in road networks. Essentially, you use the requests module to perform an http request, obtaining route data for the routes between your specified locations. You can learn more about OSRM and how to construct requests here: http://project-osrm.org/docs/v5.7.0/api/?language=Python#general-options

For the route visualization, we also import the **Folium** and **Polyline** modules. 
- **Folium**: The folium module takes location and route data and plots it on an interactive map. 
- **Polyline**: The polyline module is the Python implementation of Google's Encoded Polyline Algorithm Format, which allows you to store a series of coordinates as an encoded string. You can read about it here: https://developers.google.com/maps/documentation/utilities/polylinealgorithm

In [20]:
import folium # visualizes data on map; wondering if I can replace this with plotly express
import polyline # Python implementation of Google’s Encoded Polyline Algorithm Format

Let's attempt to get the route between the Griffith Observatory and Universal Studios Hollywood. We specify our request url to be the default OSRM url, followed by the mode of transportation (car) and the coordinates for our locations. The first pair of coordinates corresponds to the Griffith Observatory, and the second pair corresponds to Universal Studios Hollywood.

In [21]:
# The following code will be based on the code in the above article. I don't fully understand what it's doing yet, but I'm 
# hoping to get some more practice with it so I can understand it better.

url = "http://router.project-osrm.org/route/v1/car/-118.475712,34.076951;-118.300293,34.118219;-118.361879,34.138321" # we specify the long/lat coordinates of our starting point and our ending point, as well as mode of transportation; hopefully we can add more waypoints and get the route between multiple locations
r = requests.get(url) # getting the request
res = r.json()
res

{'code': 'Ok',
 'waypoints': [{'hint': 'mcqtiMbKrYgcAAAAAAAAAMUAAAAAAAAAWzWZQQAAAAAjAQpDAAAAABwAAAAAAAAAxQAAAAAAAAAYQwAAsTHw-GXzBwJANPD4F_kHAgwA3wL9RD_Z',
   'distance': 172.65425,
   'location': [-118.476367, 34.075493],
   'name': 'Firth Avenue'},
  {'hint': 'QB2wiP___38FAAAABQAAAJgAAAADAAAASbJlQAAAAACSqNJC1c3gPwUAAAAFAAAAmAAAAAMAAAAYQwAAKeHy-AeeCAJ74fL4S5oIAg4Azwb9RD_Z',
   'distance': 106.310396,
   'location': [-118.300375, 34.119175],
   'name': 'West Observatory Road'},
  {'hint': 'MzKhiP___38AAAAAHQAAAEgAAAArAAAARO68Pa9_oEELyklCJcPxQQAAAAAdAAAASAAAACsAAAAYQwAAivHx-KLqCALp8PH40egIAgIArwb9RD_Z',
   'distance': 53.674346,
   'location': [-118.361718, 34.138786],
   'name': 'Universal Hollywood Drive'}],
 'routes': [{'legs': [{'steps': [],
     'weight': 2209,
     'distance': 27094.1,
     'summary': '',
     'duration': 2209},
    {'steps': [],
     'weight': 924.9,
     'distance': 11467.6,
     'summary': '',
     'duration': 924.9}],
   'weight_name': 'routability',
   'geomet

In [22]:
print(res['waypoints'])

[{'hint': 'mcqtiMbKrYgcAAAAAAAAAMUAAAAAAAAAWzWZQQAAAAAjAQpDAAAAABwAAAAAAAAAxQAAAAAAAAAYQwAAsTHw-GXzBwJANPD4F_kHAgwA3wL9RD_Z', 'distance': 172.65425, 'location': [-118.476367, 34.075493], 'name': 'Firth Avenue'}, {'hint': 'QB2wiP___38FAAAABQAAAJgAAAADAAAASbJlQAAAAACSqNJC1c3gPwUAAAAFAAAAmAAAAAMAAAAYQwAAKeHy-AeeCAJ74fL4S5oIAg4Azwb9RD_Z', 'distance': 106.310396, 'location': [-118.300375, 34.119175], 'name': 'West Observatory Road'}, {'hint': 'MzKhiP___38AAAAAHQAAAEgAAAArAAAARO68Pa9_oEELyklCJcPxQQAAAAAdAAAASAAAACsAAAAYQwAAivHx-KLqCALp8PH40egIAgIArwb9RD_Z', 'distance': 53.674346, 'location': [-118.361718, 34.138786], 'name': 'Universal Hollywood Drive'}]


In [115]:
start_point = [res['waypoints'][0]['location'][1], res['waypoints'][0]['location'][0]]
start_point

[34.075493, -118.476367]

In [122]:
waypoints = []
for i in range(1, len(res['waypoints']) - 1):
    waypoints.append((res['waypoints'][i]['location'][1], res['waypoints'][i]['location'][0]))
waypoints

[(34.119175, -118.300375)]

We have successfully run the request! From the above results, we get what seems to be some interesting data but also a lot of gibberish. I believe those are the routes and locations that are encoded using Google's Polyline Algorithm. Now, let's attempt to break down the more interesting parts of the data (bear with me, this is new to me as well):
- **Waypoints**: A waypoint is an intermediate point or place on a route. In our case, we get a list of all the locations we travel to in our route. For this example, we only have two locations, our starting point and ending point, so only two locations show up in the waypoints list. This leads me to believe that it is highly possible to construct a route between multiple sightseeing locations and a hotel location, which is one of the goals of our project.
- **Routes**: The routes list contains the routes used to travel between the different locations. In our example, we only have two locations, our starting point and ending point, so there is only one route. I believe that if we add more locations, we will get more routes. 
    - **Geometry**: This encoding gives us the line representing the route. 
    - **Distance**: The distance specified per entry in the routes list gives the distance in meters that the route covers.
    - **Duration**: The duration specified per entry in the routes list gives the duration of the trip in seconds.

In [71]:
# We use the polyline module to decode the encoding into coordinates. Not entirely sure what all of the coordinates mean but 
# I'm guessing they represent locations of different streets when you're changing streets? I don't really know; this requires
# further investigation.

polyline.decode('yj~nEh|brUzG~AeAhH~IwBjL}Tx_@eXag@wIkDoGpFyHzkBo{Ahk@yT{@iGmTyhAuwD{iGu_AcjBy@whHacAJU}kBkd@a@aT_kBqZyI{Xf`@{\\hJlGdJZ|VnG_L|MkAqW}AeGyOz\\iJ`Y{^jZlHpSjkBbNXSxgBc_@~s@cz@rWshAfgAsVj{@qNpQcC}D')

[(34.07549, -118.47637),
 (34.07407, -118.47685),
 (34.07442, -118.47834),
 (34.07266, -118.47774),
 (34.07052, -118.47423),
 (34.06527, -118.4702),
 (34.07168, -118.46848),
 (34.07254, -118.46712),
 (34.07133, -118.46555),
 (34.05391, -118.45075),
 (34.04682, -118.44726),
 (34.04712, -118.44593),
 (34.05055, -118.43412),
 (34.0801, -118.39142),
 (34.09045, -118.37428),
 (34.09074, -118.32664),
 (34.10163, -118.3267),
 (34.10174, -118.30927),
 (34.10772, -118.3091),
 (34.11109, -118.29182),
 (34.1155, -118.29009),
 (34.11964, -118.29541),
 (34.12442, -118.29722),
 (34.12307, -118.29901),
 (34.12293, -118.30284),
 (34.12157, -118.30076),
 (34.11918, -118.30038),
 (34.12311, -118.29991),
 (34.12442, -118.29722),
 (34.11964, -118.29541),
 (34.11547, -118.29031),
 (34.11109, -118.29182),
 (34.1078, -118.30916),
 (34.10538, -118.30929),
 (34.10548, -118.32606),
 (34.11062, -118.33454),
 (34.12008, -118.33848),
 (34.13186, -118.35004),
 (34.13564, -118.3597),
 (34.13813, -118.36267),
 (34.13

In [150]:
row = df.sample()
row

Unnamed: 0,Address,Longitude,Latitude
2,"(Universal City Plaza, Lankershim Boulevard, L...",-118.361879,34.138321


In [151]:
long_lat = (row.iloc[0]["Longitude"], row.iloc[0]["Latitude"])
long_lat

(-118.36187858396794, 34.13832135)

In [169]:
df

Unnamed: 0,Address,Longitude,Latitude
0,"(Getty Research Institute, 1200, Getty Center ...",-118.475712,34.076951
1,"(Griffith Observatory, 2800, East Observatory ...",-118.300293,34.118219
2,"(Universal City Plaza, Lankershim Boulevard, L...",-118.361879,34.138321
3,"(Petersen Automotive Museum‎, 6060, Wilshire B...",-118.361191,34.062315


In [114]:
travel_length = input("How many days will your travel be? ")
coordinates = locations_per_day(df, travel_length)
coordinates

How many days will your travel be? 3


[[(-118.2548018, 34.054796)],
 [(-118.25055660000001, 34.0544412)],
 [(-118.359047, 34.0724002), (-118.29651118079025, 34.140603999999996)]]

In [31]:
import random

In [86]:
def locations_per_day(df, travel_length):
    travel_length = int(travel_length)
    loc_per_day = []
    number_of_locs = df.shape[0]
    while travel_length != 0:
        loc_per_day.append(number_of_locs // travel_length)
        number_of_locs -= (number_of_locs // travel_length)
        travel_length -= 1
    random.shuffle(loc_per_day) # obtain a random ordering of locations per day
    
    coordinates = [0] * len(loc_per_day)
    used_coordinates = []
    
    # creates nested lists of tuples within a list that indicate the coordinates of the places you visit per day
    for i in range(len(coordinates)):
        coordinates[i] = []
        for j in range(0, loc_per_day[i]):
            row = df.sample()
            long_lat = (row.iloc[0]["Longitude"], row.iloc[0]["Latitude"])
            while long_lat in used_coordinates:
                row = df.sample()
                long_lat = (row.iloc[0]["Longitude"], row.iloc[0]["Latitude"])
            coordinates[i].append(long_lat)
            used_coordinates.append(long_lat)
            
    
    return coordinates

In [90]:
# Putting what we did before into a function. We get the route between the two locations provided we have the coordinates
# for the locations. Ideally, I will get the coordinates from the Pandas dataframe, though I'm still figuring that out.
def get_route(lon_0, lat_0, lon_1, lat_1, lon_2, lat_2):
    
    loc = "{},{};{},{};{},{}".format(lon_0, lat_0, lon_1, lat_1, lon_2, lat_2)
    url = "http://router.project-osrm.org/route/v1/car/"
    r = requests.get(url + loc) # same thing as what we did before, getting the request
    print(r.status_code)
    if r.status_code!= 200: # I don't know what this means
        return {}
  
    res = r.json()   
    routes = polyline.decode(res['routes'][0]['geometry']) # the geometry specifies the polyline encoding; I'm also guessing that if we have more routes we would do this for each of the routes, with the index changing for each route
    start_point = [res['waypoints'][0]['location'][1], res['waypoints'][0]['location'][0]] # 0th waypoint corresponds to the starting location
    end_point = [res['waypoints'][1]['location'][1], res['waypoints'][1]['location'][0]] # 1st waypoint corresponds to the ending location
    distance = res['routes'][0]['distance']
    
    out = {'route':routes,
           'start_point':start_point,
           'end_point':end_point,
           'distance':distance
          } # returning a dictionary with the routes, starting point, ending point, and distance

    return out

##### New get_route Function
Run this function to add waypoints for a single trip.

In [162]:
def get_route(coordinates, hotel):
    loc = str(hotel.iloc[0]["Longitude"]) + "," + str(hotel.iloc[0]["Latitude"]) + ";"
    for i in range(len(coordinates)):
        if i == len(coordinates) - 1:
            loc += str(coordinates[i][0]) + "," + str(coordinates[i][1])
        else:
            loc += str(coordinates[i][0]) + "," + str(coordinates[i][1]) + ";"
    url = "http://router.project-osrm.org/route/v1/car/"
    r = requests.get(url + loc) # same thing as what we did before, getting the request
    if r.status_code!= 200: # I don't know what this means
        return {}
  
    res = r.json()   
    routes = polyline.decode(res['routes'][0]['geometry']) # the geometry specifies the polyline encoding; I'm also guessing that if we have more routes we would do this for each of the routes, with the index changing for each route
    start_point = [res['waypoints'][0]['location'][1], res['waypoints'][0]['location'][0]] # 0th waypoint corresponds to the starting location
    end_point = [res['waypoints'][len(res['waypoints']) - 1]['location'][1], res['waypoints'][len(res['waypoints']) - 1]['location'][0]] # 1st waypoint corresponds to the ending location
    waypoints = []
    for i in range(1, len(res['waypoints']) - 1):
        waypoints.append((res['waypoints'][i]['location'][1], res['waypoints'][i]['location'][0]))
    distance = res['routes'][0]['distance']

    out = {'route':routes,
           'start_point':start_point,
           'waypoints': waypoints,
           'end_point':end_point,
           'distance':distance
          } # returning a dictionary with the routes, starting point, ending point, and distance

    return out

In [128]:
coordinates

[[(-118.2548018, 34.054796)],
 [(-118.25055660000001, 34.0544412)],
 [(-118.359047, 34.0724002), (-118.29651118079025, 34.140603999999996)]]

In [132]:
hotel.iloc[0]["Latitude"]

34.054796

In [None]:
url = "http://router.project-osrm.org/route/v1/car/"
    r = requests.get(url + loc) # same thing as what we did before, getting the request
    if r.status_code!= 200: # I don't know what this means
        return {}
  
    res = r.json()   

In [120]:
# We specify our starting and ending coordinates using the rows from the Pandas dataframe. The first pair corresponds to the 
# Griffith Observatory and the second pair corresponds to Universal Studios Hollywood.

#pickup_lon, pickup_lat, dropoff_lon, dropoff_lat = -118.300293,34.118219,-118.361879,34.138321
coordinates = []
for i in range(df.shape[0]):
    lon = df.iloc[i]["Longitude"]
    lat = df.iloc[i]["Latitude"]
    coordinates.append((lon, lat))
#pickup_lon, pickup_lat, dropoff_lon, dropoff_lat = df.iloc[1]["Longitude"], df.iloc[1]["Latitude"], df.iloc[2]["Longitude"], df.iloc[2]["Latitude"]
#test_route = get_route(df.iloc[0]["Longitude"], df.iloc[0]["Latitude"], df.iloc[1]["Longitude"], df.iloc[1]["Latitude"], df.iloc[2]["Longitude"], df.iloc[2]["Latitude"])
test_route = get_route(coordinates)
test_route

-118.47571198123117,34.07695125;
-118.47571198123117,34.07695125;-118.30029332196601,34.11821875;
-118.47571198123117,34.07695125;-118.30029332196601,34.11821875;-118.36187858396794,34.13832135;
-118.47571198123117,34.07695125;-118.30029332196601,34.11821875;-118.36187858396794,34.13832135;-118.36119135387796,34.062315
[(34.119175, -118.300375), (34.138786, -118.361718)]


{'route': [(34.07549, -118.47637),
  (34.07407, -118.47685),
  (34.07442, -118.47834),
  (34.07266, -118.47774),
  (34.07052, -118.47423),
  (34.06527, -118.4702),
  (34.07168, -118.46848),
  (34.07254, -118.46712),
  (34.07133, -118.46555),
  (34.05391, -118.45075),
  (34.04682, -118.44726),
  (34.04712, -118.44593),
  (34.05055, -118.43412),
  (34.0801, -118.39142),
  (34.09045, -118.37428),
  (34.09074, -118.32664),
  (34.10163, -118.3267),
  (34.10174, -118.30927),
  (34.10772, -118.3091),
  (34.11109, -118.29182),
  (34.1155, -118.29009),
  (34.11964, -118.29541),
  (34.12442, -118.29722),
  (34.12307, -118.29901),
  (34.12293, -118.30284),
  (34.12157, -118.30076),
  (34.11918, -118.30038),
  (34.12311, -118.29991),
  (34.12442, -118.29722),
  (34.11964, -118.29541),
  (34.11547, -118.29031),
  (34.11109, -118.29182),
  (34.1078, -118.30916),
  (34.10538, -118.30929),
  (34.10548, -118.32606),
  (34.11062, -118.33454),
  (34.12008, -118.33848),
  (34.13186, -118.35004),
  (34.135

In [163]:
route_list = []
for i in range(len(coordinates)):
    test_route = get_route(coordinates[i], hotel)
    route_list.append(test_route)
route_list

[{'route': [(34.07915, -118.29171),
   (34.07634, -118.2917),
   (34.07628, -118.28778),
   (34.07402, -118.28632),
   (34.07279, -118.28439),
   (34.05992, -118.2553),
   (34.05917, -118.25417),
   (34.05557, -118.2576),
   (34.05434, -118.25692),
   (34.05354, -118.25598),
   (34.0548, -118.2548)],
  'start_point': [34.079147, -118.291714],
  'waypoints': [],
  'end_point': [34.054796, -118.254802],
  'distance': 5323.8},
 {'route': [(34.07915, -118.29171),
   (34.07634, -118.2917),
   (34.07628, -118.28778),
   (34.07402, -118.28632),
   (34.07279, -118.28439),
   (34.06105, -118.25782),
   (34.06047, -118.25823),
   (34.05888, -118.25668),
   (34.05472, -118.2503)],
  'start_point': [34.079147, -118.291714],
  'waypoints': [],
  'end_point': [34.054717, -118.250303],
  'distance': 5014.4},
 {'route': [(34.07915, -118.29171),
   (34.07634, -118.2917),
   (34.07623, -118.32645),
   (34.07571, -118.3285),
   (34.0762, -118.33472),
   (34.07612, -118.36145),
   (34.073, -118.36142),
  

In [181]:
distances = [route['distance'] for route in route_list]
distances

[5323.8, 5014.4, 24575]

In [187]:
list_colors = [
    "red",
    "orange",
    "yellow",
    "green",
    "blue",
    "purple"
]

In [None]:
color_dict

In [200]:
# Using folium to draw the route on an interactive map. Since none of us are really familiar with folium, it would be worth
# seeing if we can accomplish the same results but using plotly express instead.
def get_map(route, route_color):
    
    m = folium.Map(location=[(route['start_point'][0] + route['end_point'][0])/2, 
                             (route['start_point'][1] + route['end_point'][1])/2], 
                   zoom_start=13)

    folium.PolyLine(
        route['route'],
        weight=8,
        color=route_color,
        opacity=0.6
    ).add_to(m)

    folium.Marker(
        location=route['start_point'],
        icon=folium.Icon(icon='play', color='green')
    ).add_to(m)

    folium.Marker(
        location=route['end_point'],
        icon=folium.Icon(icon='stop', color='red')
    ).add_to(m)
    
    for i in range(len(route['waypoints'])):
        folium.Marker(
            location=route['waypoints'][i],
            icon=folium.Icon(icon='circle', color='blue')
        ).add_to(m)

    return m

In [204]:
# Getting the map for our test route
maps = []
for i in range(len(route_list)):
    maps.append(get_map(route_list[i], list_colors[i]))

In [206]:
maps[0]

In [207]:
maps[1]

This code was made possible by OSM (© OpenStreetMap contributors), as well as Michael Yan at Think.Data.Science.

Copyright info for OSM: https://www.openstreetmap.org/copyright