# Part 2 - Mapping Yelp Search Results

## Objective

- In this CodeAlong, we will work with the Yelp API results from last class.
- You will load the .csv.gz file of your Yelp results and prepare the data for visualization.
- You will use Plotly Express to create an interactive map of all the results.

## Tools You Will Use
- Part 1:
    - Yelp API:
        - Getting Started: https://www.yelp.com/developers/documentation/v3/get_started
    - `YelpAPI` Python package: https://github.com/gfairchild/yelpapi
- Part 2:
    - Plotly Express: https://plotly.com/python/getting-started/
        - With Mapbox API: https://www.mapbox.com/
        - `px.scatter_mapbox` [Documentation](https://plotly.com/python/scattermapbox/)




### Applying Code From
- [Advanced Transformations with Pandas - Part 1](https://login.codingdojo.com/m/376/12529/88086)
- [Advanced Transformations with Pandas - Part 2](https://login.codingdojo.com/m/376/12529/88088)

### Goal

- We want to create a map with every restaurant plotted as a point, with detailed information that appears when we hover over a business.
- We will use Plotly Express's `px.scatter_mapbox` function to accomplish this.
    - https://plotly.com/python/scattermapbox/
    
    - We will need a Mapbox API token for some of the options:
        - https://studio.mapbox.com/
    

# Loading Data from Part 1

In [1]:
## Plotly is not included in your dojo-env
!pip install plotly



In [2]:
# Standard Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import json

## importing plotly 
import plotly.express as px

In [3]:
## Load in csv.gz
df = pd.read_csv("Data/Minneapolis-sushi.csv.gz")
df.head()

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,coordinates,transactions,price,location,phone,display_phone,distance
0,mTnoCM3BrLttWb7m9P5SQQ,wakame-sushi-and-asian-bistro-minneapolis,Wakame Sushi & Asian Bistro,https://s3-media1.fl.yelpcdn.com/bphoto/_PG51j...,False,https://www.yelp.com/biz/wakame-sushi-and-asia...,819,"[{'alias': 'sushi', 'title': 'Sushi Bars'}, {'...",4.0,"{'latitude': 44.946802, 'longitude': -93.321968}",[],$$,"{'address1': '3070 Excelsior Blvd', 'address2'...",16128860000.0,(612) 886-2484,5431.153227
1,kcf7Bc1KKk-qoGJ2QIQVvw,billy-sushi-minneapolis,Billy Sushi,https://s3-media3.fl.yelpcdn.com/bphoto/ynRfeG...,False,https://www.yelp.com/biz/billy-sushi-minneapol...,214,"[{'alias': 'japanese', 'title': 'Japanese'}, {...",4.0,"{'latitude': 44.984156, 'longitude': -93.26878...",['delivery'],$$$,"{'address1': '116 N First Ave', 'address2': No...",,,1627.114072
2,RFLSUYCsAAJAneScJl51gA,kura-revolving-sushi-bar-bloomington-3,Kura Revolving Sushi Bar,https://s3-media4.fl.yelpcdn.com/bphoto/cjVtPO...,False,https://www.yelp.com/biz/kura-revolving-sushi-...,39,"[{'alias': 'conveyorsushi', 'title': 'Conveyor...",4.0,"{'latitude': 44.851055585227364, 'longitude': ...",[],,"{'address1': '378 N Garden', 'address2': '', '...",,,13388.38405
3,FR9ZFGmwrrxCrolxZDy6NQ,sushi-train-minneapolis,Sushi Train,https://s3-media3.fl.yelpcdn.com/bphoto/wlQO7g...,False,https://www.yelp.com/biz/sushi-train-minneapol...,345,"[{'alias': 'sushi', 'title': 'Sushi Bars'}, {'...",3.5,"{'latitude': 44.971904, 'longitude': -93.276914}","['delivery', 'pickup']",$$,"{'address1': '1200 Nicollet Mall', 'address2':...",16122600000.0,(612) 259-8488,1218.244125
4,ddpjLv0P6iu7p1dRGCPWWw,sushi-takatsu-minneapolis,Sushi Takatsu,https://s3-media1.fl.yelpcdn.com/bphoto/WVD-r_...,False,https://www.yelp.com/biz/sushi-takatsu-minneap...,142,"[{'alias': 'sushi', 'title': 'Sushi Bars'}, {'...",4.5,"{'latitude': 44.9760103, 'longitude': -93.2709...","['delivery', 'pickup']",$,"{'address1': '733 Marquette Ave', 'address2': ...",16123400000.0,(612) 339-5981,946.209834


## Required Preprocessing 

- We need to get the latitude and longitude for each business as separate columns.
- We also want to be able to show each restaurant's:
    - name
    - price range
    - address
    - and whether they offer delivery or takeout.

### Separating Latitude and Longitude

In [4]:
## use .apply pd.Series to convert a dict to columns
df['coordinates'].apply(pd.Series)

Unnamed: 0,0
0,"{'latitude': 44.946802, 'longitude': -93.321968}"
1,"{'latitude': 44.984156, 'longitude': -93.26878..."
2,"{'latitude': 44.851055585227364, 'longitude': ..."
3,"{'latitude': 44.971904, 'longitude': -93.276914}"
4,"{'latitude': 44.9760103, 'longitude': -93.2709..."
...,...
355,"{'latitude': 44.867687, 'longitude': -93.328011}"
356,"{'latitude': 45.091528, 'longitude': -93.435945}"
357,"{'latitude': 45.02836, 'longitude': -93.01963}"
358,"{'latitude': 45.038845, 'longitude': -93.020922}"


- Why didn't that work? We got back a single column, not separate latitude and longitude columns.

In [5]:
## slice out a single test coordinate
test_coord = df.loc[1, 'coordinates']
test_coord

"{'latitude': 44.984156, 'longitude': -93.2687831616949}"

- It's not a dictionary anymore; it's a string!
    - CSV files can't store Python containers (lists, dictionaries), so they get saved as their string representations.
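To see this for yourself, here is a minimal sketch of a CSV round trip. The `demo` DataFrame and its values are made up for illustration; only the behavior (a dict going in, a string coming out) matters.

```python
import io

import pandas as pd

# A tiny stand-in for the Yelp results (values are illustrative only)
demo = pd.DataFrame({
    "name": ["Example Sushi"],
    "coordinates": [{"latitude": 44.98, "longitude": -93.27}],
})

# Write to an in-memory CSV, then read it straight back
buffer = io.StringIO()
demo.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

print(type(demo.loc[0, "coordinates"]))      # a dict before saving
print(type(reloaded.loc[0, "coordinates"]))  # a str after the round trip
print(reloaded.loc[0, "coordinates"])
```

Note that the saved text uses single quotes, because it is Python's own `str()` of the dictionary rather than JSON.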

### Fixing the String-Dictionaries

- The json module has string-based counterparts to `json.load` and `json.dump`, called `json.loads` and `json.dumps`.
    - These are designed to process STRINGS instead of files.

- If we use `json.loads`, we can convert our string dictionary into an actual dictionary.

In [6]:
## Use json.loads on the test coordinate
json.loads(test_coord)

JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

- JSON requires double quotes around keys and strings, but our string uses single quotes!

In [7]:
## replace single ' with " 
test_coord = test_coord.replace("'", '"')
test_coord

'{"latitude": 44.984156, "longitude": -93.2687831616949}'

In [8]:
## Use json.loads on the test coordinate, again
json.loads(test_coord)

{'latitude': 44.984156, 'longitude': -93.2687831616949}

### Now, how can we apply this same process to the entire column?

In [9]:
## replace ' with " (entire column)
df['coordinates'] = df['coordinates'].str.replace("'", '"')
## apply json.loads
df['coordinates'] = df['coordinates'].apply(json.loads)

In [10]:
## slice out a single test coordinate
test_coord = df.loc[5, 'coordinates']
test_coord

{'latitude': 44.91163, 'longitude': -93.32865}

### Using Apply with pd.Series to convert a dictionary column into multiple columns

In [11]:
## use .apply pd.Series to convert a dict to columns
df['coordinates'].apply(pd.Series)

Unnamed: 0,latitude,longitude
0,44.946802,-93.321968
1,44.984156,-93.268783
2,44.851056,-93.239506
3,44.971904,-93.276914
4,44.976010,-93.270922
...,...,...
355,44.867687,-93.328011
356,45.091528,-93.435945
357,45.028360,-93.019630
358,45.038845,-93.020922


In [12]:
## Concatenate the 2 new columns and drop the original.
df = pd.concat([df, df['coordinates'].apply(pd.Series)], axis = 1)
df = df.drop(columns = 'coordinates')
df.head(2)

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,transactions,price,location,phone,display_phone,distance,latitude,longitude
0,mTnoCM3BrLttWb7m9P5SQQ,wakame-sushi-and-asian-bistro-minneapolis,Wakame Sushi & Asian Bistro,https://s3-media1.fl.yelpcdn.com/bphoto/_PG51j...,False,https://www.yelp.com/biz/wakame-sushi-and-asia...,819,"[{'alias': 'sushi', 'title': 'Sushi Bars'}, {'...",4.0,[],$$,"{'address1': '3070 Excelsior Blvd', 'address2'...",16128860000.0,(612) 886-2484,5431.153227,44.946802,-93.321968
1,kcf7Bc1KKk-qoGJ2QIQVvw,billy-sushi-minneapolis,Billy Sushi,https://s3-media3.fl.yelpcdn.com/bphoto/ynRfeG...,False,https://www.yelp.com/biz/billy-sushi-minneapol...,214,"[{'alias': 'japanese', 'title': 'Japanese'}, {...",4.0,['delivery'],$$$,"{'address1': '116 N First Ave', 'address2': No...",,,1627.114072,44.984156,-93.268783


## Creating a Simple Map

### Register for MapBox API

Mapbox API: https://www.mapbox.com/

In [None]:
## Load in mapbox api credentials from .secret


- Use the Plotly Express `set_mapbox_access_token` function
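As a rough sketch of the credential-loading step: the helper below reads a token from a JSON file. The function name `load_mapbox_token`, the file path, and the `"api-key"` key are all assumptions for illustration; use whatever layout your own credentials file actually has.

```python
import json


def load_mapbox_token(path, key="api-key"):
    """Return the token stored under `key` in a JSON credentials file.

    Both the file layout and the default key name are hypothetical.
    """
    with open(path) as f:
        creds = json.load(f)
    return creds[key]


# Typical use in the notebook (the path shown is an assumed example):
# token = load_mapbox_token(".secret/mapbox.json")
# px.set_mapbox_access_token(token)
```

Keeping the token in a file outside the notebook means it never appears in a saved cell or in version control.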

In [None]:
## set mapbox token


In [None]:
## use scatter_mapbox for M.V.P map


### Adding Hover Data

- We want to show each restaurant's:
    - name
    - price range
    - address
    - and whether they offer delivery or takeout.
    
    
- We can use the `hover_name` and `hover_data` arguments for `px.scatter_mapbox` to add this info!

In [None]:
## add hover_name (name) and hover_data for price,rating,location


### Fixing the Location Column

In [None]:
## slice out a test address


> Also a string-dictionary...

In [None]:
## replace ' with "
df['location'] = df['location'].str.replace("'", '"')
df

In [None]:
## apply json.loads


> Ruh roh....

- Hmm, let's slice out a test_address again and write a function to do the conversion instead.
    - We can use try and except in our function so that rows that fail to parse are flagged instead of stopping the whole apply.

### Fixing Addresses - with a custom function


In [18]:
## slice out test address 
test_addr = df.loc[0, 'location']
test_addr

'{"address1": "3070 Excelsior Blvd", "address2": "", "address3": "", "city": "Minneapolis", "zip_code": "55416", "country": "US", "state": "MN", "display_address": ["3070 Excelsior Blvd", "Minneapolis, MN 55416"]}'

In [19]:
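## write a function to run json.loads on a single address string
def fix_address(address):
    try:
        return json.loads(address)
    # invalid JSON (or a non-string value such as NaN) is flagged instead of raising
    except (json.JSONDecodeError, TypeError):
        return 'Error'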

In [20]:
## test applying our function
df['location'].apply(fix_address)

0      {'address1': '3070 Excelsior Blvd', 'address2'...
1                                                  Error
2                                                  Error
3                                                  Error
4      {'address1': '733 Marquette Ave', 'address2': ...
                             ...                        
355    {'address1': '7401 France Ave S', 'address2': ...
356    {'address1': '12201 Elm Creek Blvd N', 'addres...
357                                                Error
358    {'address1': '1920 Buerkle Rd', 'address2': ''...
359                                                Error
Name: location, Length: 360, dtype: object

- It ran without raising an error! Now let's save this as a new column (display_location),
and then let's investigate the businesses that came back as 'Error'.

In [21]:
### save a new display_location column using our function
df['display_location'] = df['location'].apply(fix_address)

In [22]:
## filter for businesses with display_location == 'Error'
errors = df[df['display_location']=='Error']
errors

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,transactions,price,location,phone,display_phone,distance,latitude,longitude,display_location
1,kcf7Bc1KKk-qoGJ2QIQVvw,billy-sushi-minneapolis,Billy Sushi,https://s3-media3.fl.yelpcdn.com/bphoto/ynRfeG...,False,https://www.yelp.com/biz/billy-sushi-minneapol...,214,"[{'alias': 'japanese', 'title': 'Japanese'}, {...",4.0,['delivery'],$$$,"{""address1"": ""116 N First Ave"", ""address2"": No...",,,1627.114072,44.984156,-93.268783,Error
2,RFLSUYCsAAJAneScJl51gA,kura-revolving-sushi-bar-bloomington-3,Kura Revolving Sushi Bar,https://s3-media4.fl.yelpcdn.com/bphoto/cjVtPO...,False,https://www.yelp.com/biz/kura-revolving-sushi-...,39,"[{'alias': 'conveyorsushi', 'title': 'Conveyor...",4.0,[],,"{""address1"": ""378 N Garden"", ""address2"": """", ""...",,,13388.384050,44.851056,-93.239506,Error
3,FR9ZFGmwrrxCrolxZDy6NQ,sushi-train-minneapolis,Sushi Train,https://s3-media3.fl.yelpcdn.com/bphoto/wlQO7g...,False,https://www.yelp.com/biz/sushi-train-minneapol...,345,"[{'alias': 'sushi', 'title': 'Sushi Bars'}, {'...",3.5,"['delivery', 'pickup']",$$,"{""address1"": ""1200 Nicollet Mall"", ""address2"":...",1.612260e+10,(612) 259-8488,1218.244125,44.971904,-93.276914,Error
5,VtsXtKbe4KWYif90LAzACA,ama-sushi-edina,AMA Sushi,https://s3-media4.fl.yelpcdn.com/bphoto/2_f2xM...,False,https://www.yelp.com/biz/ama-sushi-edina?adjus...,62,"[{'alias': 'sushi', 'title': 'Sushi Bars'}, {'...",4.5,[],$$,"{""address1"": ""5033 France Ave S"", ""address2"": ...",1.952920e+10,(952) 920-1547,8419.254042,44.911630,-93.328650,Error
6,st6hKTpxG-0ltDs8gujkBQ,iwa-sushi-inver-grove-heights,Iwa Sushi,https://s3-media3.fl.yelpcdn.com/bphoto/cJs-4k...,False,https://www.yelp.com/biz/iwa-sushi-inver-grove...,146,"[{'alias': 'sushi', 'title': 'Sushi Bars'}, {'...",4.5,"['delivery', 'pickup']",$$,"{""address1"": ""7781 Amana Trl"", ""address2"": ""St...",1.651455e+10,(651) 455-1473,20099.649357,44.836572,-93.089877,Error
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
351,WZhRFyVEO8n_w9aOTiWt3Q,fresh-thyme-market-bloomington-3,Fresh Thyme Market,https://s3-media3.fl.yelpcdn.com/bphoto/YPWF0X...,False,https://www.yelp.com/biz/fresh-thyme-market-bl...,66,"[{'alias': 'farmersmarket', 'title': 'Farmers ...",3.5,[],$$,"{""address1"": ""2100 W 80 1/2 St"", ""address2"": N...",1.763321e+10,(763) 321-3555,12974.955825,44.858204,-93.306629,Error
353,r5th1PZgu3VMeGgKQhyyDQ,hy-vee-eagan,Hy-Vee,https://s3-media3.fl.yelpcdn.com/bphoto/996rfZ...,False,https://www.yelp.com/biz/hy-vee-eagan?adjust_c...,78,"[{'alias': 'grocery', 'title': 'Grocery'}]",3.5,[],$$,"{""address1"": ""1500 Central Park Village Dr"", ""...",1.651405e+10,(651) 405-3660,16705.028526,44.834436,-93.171476,Error
354,VR6PyZVTje8eRw8IYJtmVA,hy-vee-oakdale-3,Hy-Vee,https://s3-media1.fl.yelpcdn.com/bphoto/rxsQ_e...,False,https://www.yelp.com/biz/hy-vee-oakdale-3?adju...,6,"[{'alias': 'servicestations', 'title': 'Gas St...",3.0,"['delivery', 'pickup']",,"{""address1"": ""7180 10th St N"", ""address2"": """",...",,,23691.000896,44.964436,-92.960653,Error
357,3I5O0GUO57uOCcLKqiaIJQ,ihop-maplewood-2,IHOP,https://s3-media4.fl.yelpcdn.com/bphoto/Tsa45R...,False,https://www.yelp.com/biz/ihop-maplewood-2?adju...,55,"[{'alias': 'breakfast_brunch', 'title': 'Break...",2.0,"['delivery', 'pickup']",$$,"{""address1"": ""1935 Beam Ave"", ""address2"": """", ...",1.651748e+10,(651) 748-1700,20083.387362,45.028360,-93.019630,Error


In [23]:
## slice out a new test address and inspect
test_addr = df.loc[359, 'location']
test_addr

'{"address1": "8150 Wedgewood Ln N", "address2": None, "address3": None, "city": "Maple Grove", "zip_code": "55369", "country": "US", "state": "MN", "display_address": ["8150 Wedgewood Ln N", "Maple Grove, MN 55369"]}'

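> After some more investigation, we would find a few issues with these 'Error' rows.
1. They contained None, which is not valid JSON (JSON uses null).
2. They contained an apostrophe in the name, which our quote replacement turned into a stray double quote.
3. ...?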

### Possible Fixes (if we care to/have the time)


- Use Regular Expressions to find and fix the display addresses with "'" in them.
- Use string split to split on the key "display_address".
    - Then use string methods to clean up.
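Another option, not used in this CodeAlong, is to skip the quote replacement entirely and parse the original text with `ast.literal_eval` from the standard library, which reads Python literals (including `None` and strings containing apostrophes). The sketch below uses a made-up address and a hypothetical helper name, `parse_literal`; it would have to be applied to the `location` strings before any `'` to `"` replacement.

```python
import ast
import json

# Illustrative address only; str() mimics what ends up in the CSV
original = {
    "address1": "1 Example St",
    "address2": None,
    "display_address": ["Joe's Plaza, 1 Example St", "Minneapolis, MN 55401"],
}
as_text = str(original)


def parse_literal(text):
    """Parse a Python-literal string; return None if it cannot be parsed."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None


# The quote-swap + json.loads approach fails on this value...
try:
    json.loads(as_text.replace("'", '"'))
    json_ok = True
except json.JSONDecodeError:
    json_ok = False

# ...while literal_eval recovers the dictionary
parsed = parse_literal(as_text)
print(json_ok)
print(parsed["display_address"])
```

`ast.literal_eval` only evaluates literals, so it is safer than `eval`, but it should still only be used on text you produced yourself.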

### Moving Forward without those rows (for now)

In [25]:
## remove any rows where display_location == 'Error'
error_filter = df['display_location']=='Error'
## .copy() so that adding columns to df1 later does not trigger a SettingWithCopyWarning
df1 = df.loc[~error_filter, :].copy()
df1.head()

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,transactions,price,location,phone,display_phone,distance,latitude,longitude,display_location
0,mTnoCM3BrLttWb7m9P5SQQ,wakame-sushi-and-asian-bistro-minneapolis,Wakame Sushi & Asian Bistro,https://s3-media1.fl.yelpcdn.com/bphoto/_PG51j...,False,https://www.yelp.com/biz/wakame-sushi-and-asia...,819,"[{'alias': 'sushi', 'title': 'Sushi Bars'}, {'...",4.0,[],$$,"{""address1"": ""3070 Excelsior Blvd"", ""address2""...",16128860000.0,(612) 886-2484,5431.153227,44.946802,-93.321968,"{'address1': '3070 Excelsior Blvd', 'address2'..."
4,ddpjLv0P6iu7p1dRGCPWWw,sushi-takatsu-minneapolis,Sushi Takatsu,https://s3-media1.fl.yelpcdn.com/bphoto/WVD-r_...,False,https://www.yelp.com/biz/sushi-takatsu-minneap...,142,"[{'alias': 'sushi', 'title': 'Sushi Bars'}, {'...",4.5,"['delivery', 'pickup']",$,"{""address1"": ""733 Marquette Ave"", ""address2"": ...",16123400000.0,(612) 339-5981,946.209834,44.97601,-93.270922,"{'address1': '733 Marquette Ave', 'address2': ..."
9,PJqifWn3xg_x0ItIDYfbZg,nakamori-japanese-bistro-edina,Nakamori Japanese Bistro,https://s3-media3.fl.yelpcdn.com/bphoto/ZWb4ZB...,False,https://www.yelp.com/biz/nakamori-japanese-bis...,264,"[{'alias': 'sushi', 'title': 'Sushi Bars'}, {'...",4.0,"['delivery', 'pickup']",$$,"{""address1"": ""7101 France Ave S"", ""address2"": ...",19529210000.0,(952) 920-9980,11865.991716,44.874506,-93.327567,"{'address1': '7101 France Ave S', 'address2': ..."
11,51SWfc0I300IpUZcv7kCfA,zen-box-izakaya-minneapolis,Zen Box Izakaya,https://s3-media2.fl.yelpcdn.com/bphoto/sCMGcA...,False,https://www.yelp.com/biz/zen-box-izakaya-minne...,691,"[{'alias': 'ramen', 'title': 'Ramen'}, {'alias...",4.0,"['delivery', 'pickup']",$$,"{""address1"": ""602 Washington Ave S"", ""address2...",16123320000.0,(612) 332-3936,901.681643,44.978433,-93.259617,"{'address1': '602 Washington Ave S', 'address2..."
20,fGWdvXWxN9hlvebIjU10pQ,soberfish-minneapolis-4,Soberfish,https://s3-media1.fl.yelpcdn.com/bphoto/zXWplx...,False,https://www.yelp.com/biz/soberfish-minneapolis...,215,"[{'alias': 'sushi', 'title': 'Sushi Bars'}, {'...",3.5,"['pickup', 'delivery']",$$,"{""address1"": ""2627 E Franklin Ave"", ""address2""...",16123540000.0,(612) 354-2544,2409.17727,44.962856,-93.232882,"{'address1': '2627 E Franklin Ave', 'address2'..."


- We want the "display_address" key from the "display_location" dictionaries.
- We could use .apply and a lambda to slice out the desired key.

In [None]:
## use apply and lambda to slice correct key


- Almost done! We want to convert display_address to a single string instead of a list of strings.
- We can use the string method .join to do so!

In [26]:
## slice out a test_address (label 0 survived the error filter; label 2 was dropped, so df1.loc[2] raises a KeyError)
test_addr = df1.loc[0, 'display_location']['display_address']
test_addr

['3070 Excelsior Blvd', 'Minneapolis, MN 55416']

In [16]:
## use apply and a lambda to slice out the display_address key for every remaining row
df1['display_address'] = df1['display_location'].apply(lambda x: x['display_address'])

In [17]:
## test using .join with a "\n"
"\n".join(test_addr)

'3070 Excelsior Blvd\nMinneapolis, MN 55416'

In [None]:
## apply the join to every row with a lambda


### Final Map

In [None]:
## make our final map and save it as a variable


#### HTML Uses `<br>` instead of `\n`
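Plotly renders hover text as HTML, so a `"\n"` in the address will not show up as a line break in the tooltip. A minimal sketch of the `<br>` version follows; the `demo` DataFrame and the `address_html` column name are assumptions for illustration, and the two addresses are the ones shown earlier in this notebook.

```python
import pandas as pd

# Two display_address lists taken from the examples above
demo = pd.DataFrame({
    "display_address": [
        ["3070 Excelsior Blvd", "Minneapolis, MN 55416"],
        ["8150 Wedgewood Ln N", "Maple Grove, MN 55369"],
    ]
})

# Join each list into one string, using the HTML line break as the separator
demo["address_html"] = demo["display_address"].apply(lambda parts: "<br>".join(parts))

print(demo.loc[0, "address_html"])
```

The resulting column can then be passed in `hover_data` so each address appears on two lines in the tooltip.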

In [None]:
## remake the final address column with <br> instead 

## plot the final map

In [None]:
## use fig.write_html to save map
