## 2. A description of the data and how it will be used to solve the problem. 

#### 2.1 Data Description

Data to consider when choosing a pizza restaurant location involves: 
* Demographics information extracted from [Census API tool](https://api.census.gov/data.html); 
* Business vendor information extracted from [Foursquare API Tool](https://developer.foursquare.com/docs/api-reference/venues/search/); 
* and Johnson County zip code data from [Mongabay Zip Data](https://data.mongabay.com/igapo/zip_codes/metropolitan-areas/metro-zip/Kansas%20City%20(MO-KS)1.html).

The census dataset I used in this project comes from the 2017 Economic Annual Surveys. Its ZIP Code Business Patterns are tabulated at the establishment level, where an establishment is a single physical location at which business is conducted. The number of employees and the annual payroll for each zip code area are extracted and added to the project model, so that neighborhood wealth and the number of target customers enter the prediction model as parameters. Johnson County business venue data, including venue location, venue cross street, each venue's distance from the search point (the Foursquare `distance` field, stored here as `Venue Dis to entrance`), and business category, is also added to my analysis. The zip codes themselves are scraped from Mongabay with Beautiful Soup.




#### 2.2 Data Wrangling

##### 2.21 Import Data and handle missing values

Import all the libraries and packages

In [1]:
!conda install bs4 --yes
from bs4 import BeautifulSoup
from urllib.request import urlopen
from sklearn.cluster import KMeans
import numpy as np
import pandas as pd

!conda install -c conda-forge folium=0.5.0 --yes
import folium
import requests
from pandas.io.json import json_normalize
import matplotlib.cm as cm
import matplotlib.colors as colors

!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim 



The first step is to create latitude and longitude coordinates for the centroids of the Johnson County, KS neighborhoods. BeautifulSoup is used to capture the zip codes of the cities in Johnson County from [Mongabay Zip Data](https://data.mongabay.com/igapo/zip_codes/metropolitan-areas/metro-zip/Kansas%20City%20(MO-KS)1.html).

In [2]:
 
f = requests.get('https://data.mongabay.com/igapo/zip_codes/metropolitan-areas/metro-zip/Kansas%20City%20(MO-KS)1.html') 

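# name the parser explicitly so the result does not depend on which parsers are installed
kc_data = BeautifulSoup(f.text, 'html.parser')
match = kc_data.find('table', class_='boldtable')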


In [3]:
rows = match.find_all('tr')

# collect the text of every <td> cell, one list per table row
dt_kc = []
for row in rows:
    dt_kc.append([cell.text for cell in row.find_all('td')])

column = ['postal','phone','county','state','city','district']

df = pd.DataFrame(dt_kc, columns=column)

# the first five characters of 'postal' hold the zip code, the rest is the city name
df['Postal Code'] = df['postal'].str[:5]
df['postal'] = df['postal'].str[5:]

df_kc = df.rename(columns={'postal':'city_name'})
df_kc = df_kc[['city_name','Postal Code','county','state']]

# keep only the Johnson County rows (filter on df_kc itself, not on df)
df_johnson = df_kc.loc[df_kc['county'] == 'Johnson County']

print(df_johnson)
 

            city_name Postal Code          county        state
276           De Soto       66018  Johnson County  Kansas - KS
277    Clearview City       66019  Johnson County  Kansas - KS
278           De Soto       66019  Johnson County  Kansas - KS
280          Edgerton       66021  Johnson County  Kansas - KS
284           Gardner       66030  Johnson County  Kansas - KS
..                ...         ...             ...          ...
482   Shawnee Mission       66285  Johnson County  Kansas - KS
483                Sm       66285  Johnson County  Kansas - KS
484            Lenexa       66286  Johnson County  Kansas - KS
485           Shawnee       66286  Johnson County  Kansas - KS
486   Shawnee Mission       66286  Johnson County  Kansas - KS

[177 rows x 4 columns]
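
The slicing above relies on every `postal` cell starting with a five-digit zip code. The following is a minimal defensive sketch of the same split using only the standard library. The helper name `split_postal_cell` and the sample cell format (`'66018 De Soto'`) are assumptions for illustration; the real cell layout on the Mongabay page may differ.

```python
import re

# Five digits at the start, then whatever remains is treated as the city name.
_POSTAL_CELL = re.compile(r'^\s*(\d{5})\s*(.*?)\s*$')


def split_postal_cell(cell):
    """Return (zip_code, city_name), or None if the cell has no leading zip code."""
    match = _POSTAL_CELL.match(cell)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Hypothetical cell contents, shaped like the scraped table.
print(split_postal_cell('66018 De Soto'))
print(split_postal_cell('not a zip code'))
```

Rows for which the helper returns `None` could be logged or dropped before building the DataFrame, instead of silently producing a malformed `Postal Code`.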


Add the latitude and longitude of each zip code to the table with the `pgeocode.Nominatim` class. 

In [5]:
!pip install pgeocode

Collecting pgeocode
  Downloading https://files.pythonhosted.org/packages/86/44/519e3db3db84acdeb29e24f2e65991960f13464279b61bde5e9e96909c9d/pgeocode-0.2.1-py2.py3-none-any.whl
Installing collected packages: pgeocode
Successfully installed pgeocode-0.2.1


In [6]:
import pgeocode

nomi = pgeocode.Nominatim('us')

df_loc=[]
for i in df_johnson['Postal Code']:
    df_loc.append(nomi.query_postal_code(i))
    
column=['postal_code','country code','place_name','state_name', 'county_name','county_code','community_name','community_code','latitude','longitude','accuracy']
df_j = pd.DataFrame(df_loc, columns = column)
df_p=df_j.reset_index(drop=True) 
df_po=df_p[['postal_code','latitude','longitude']]
df_po=df_po.rename(columns={'postal_code':'Postal Code'})


In [7]:
df_jo = pd.merge(df_johnson,df_po, how='right', on='Postal Code')
#df_jo = df_johnson.append(df_po, sort=False)
df_jo

| | city_name | Postal Code | county | state | latitude | longitude |
|---|---|---|---|---|---|---|
| 0 | De Soto | 66018 | Johnson County | Kansas - KS | 38.9462 | -94.9714 |
| 1 | Clearview City | 66019 | Johnson County | Kansas - KS | NaN | NaN |
| 2 | De Soto | 66019 | Johnson County | Kansas - KS | NaN | NaN |
| 3 | Clearview City | 66019 | Johnson County | Kansas - KS | NaN | NaN |
| 4 | De Soto | 66019 | Johnson County | Kansas - KS | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... |
| 866 | Shawnee | 66286 | Johnson County | Kansas - KS | 39.0417 | -94.7202 |
| 867 | Shawnee Mission | 66286 | Johnson County | Kansas - KS | 39.0417 | -94.7202 |
| 868 | Lenexa | 66286 | Johnson County | Kansas - KS | 39.0417 | -94.7202 |
| 869 | Shawnee | 66286 | Johnson County | Kansas - KS | 39.0417 | -94.7202 |


Clean the data by dropping missing values and duplicate rows. Then concatenate the city names that share a postal code, so that each zip code appears once and can be used to search for business venues within a radius of 1500 meters. 

In [8]:
df_jo=df_jo.drop_duplicates().dropna().reset_index(drop=True)
df_jon = pd.DataFrame(columns = list(df_jo.columns))
zips = set(df_jo['Postal Code'])
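for zipp in zips:
    # all rows that share this zip code
    rows = df_jo[df_jo['Postal Code'] == zipp].reset_index(drop=True)
    # join their city names into one string
    city_names = rows['city_name'].tolist()
    city_name = '/'.join(city_names)
    new_row = {}
    new_row['city_name'] = city_name
    # copy the remaining columns from the first row of the group
    for col in list(rows.columns):
        # compare strings with !=, since "is not" tests object identity
        if col != 'city_name':
            new_row[col] = rows.loc[0][col]
    df_jon.loc[len(df_jon)] = new_row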
df_jon


| | city_name | Postal Code | county | state | latitude | longitude |
|---|---|---|---|---|---|---|
| 0 | Shawnee/ Shawnee Mission/ Sm | 66218 | Johnson County | Kansas - KS | 39.0417 | -94.7202 |
| 1 | Overland/ Overland Park/ Shawnee Mission/ Sm/... | 66223 | Johnson County | Kansas - KS | 38.8619 | -94.661 |
| 2 | Overland/ Overland Park/ Shawnee Mission/ Sm | 66282 | Johnson County | Kansas - KS | 38.8999 | -94.832 |
| 3 | Lenexa/ Olathe/ Overland Park | 66062 | Johnson County | Kansas - KS | 38.8733 | -94.7752 |
| 4 | Lenexa/ Overland/ Overland Park/ Shawnee Miss... | 66215 | Johnson County | Kansas - KS | 38.9536 | -94.7336 |
| 5 | Lenexa/ Shawnee/ Shawnee Mission/ Sm | 66227 | Johnson County | Kansas - KS | 38.9536 | -94.7336 |
| 6 | Lenexa/ Op/ Overland Park/ Shawnee Mission/ S... | 66251 | Johnson County | Kansas - KS | 38.8999 | -94.832 |
| 7 | Merriam/ Overland/ Overland Park/ Prairie Vil... | 66204 | Johnson County | Kansas - KS | 38.9928 | -94.6771 |
| 8 | Edgerton | 66021 | Johnson County | Kansas - KS | 38.7811 | -95.0094 |
| 9 | Spring Hill | 66083 | Johnson County | Kansas - KS | 38.7631 | -94.8246 |
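
The per-zip loop above can also be written as a single `groupby` aggregation. This is a minimal sketch, assuming pandas 0.25 or later; the small `sample` frame is built by hand from a few rows shown in the outputs above and stands in for `df_jo`.

```python
import pandas as pd

# Hand-built stand-in for df_jo after drop_duplicates().dropna()
sample = pd.DataFrame({
    'city_name': ['Lenexa', 'Shawnee', 'Shawnee Mission', 'Edgerton'],
    'Postal Code': ['66286', '66286', '66286', '66021'],
    'latitude': [39.0417, 39.0417, 39.0417, 38.7811],
    'longitude': [-94.7202, -94.7202, -94.7202, -95.0094],
})

# Join the city names per zip code and keep the first coordinate pair,
# which mirrors what the explicit loop does with rows.loc[0].
grouped = (
    sample.groupby('Postal Code', as_index=False)
          .agg({'city_name': '/'.join, 'latitude': 'first', 'longitude': 'first'})
)
print(grouped)
```

Unlike iterating over a `set`, `groupby` returns the zip codes in sorted order, so the resulting row order is reproducible between runs.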


#### Foursquare

Use the Foursquare API to get venue information for each city neighborhood. First, take a look at the data pattern of the venues in the Olathe neighborhood, so that useful fields can be identified and added to the models as key parameters.  

In [9]:
address = 'Olathe, KS'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Olathe are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Olathe are 38.8838856, -94.81887.


In [10]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID (do not publish the real value)
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret (do not publish the real value)
VERSION = '20180605' # Foursquare API version

LIMIT = 100
radius = 1000
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, radius, LIMIT)
results = requests.get(url).json()
results

 

{'meta': {'code': 200, 'requestId': '5ebc40f202a172001b31a55d'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Central Core',
  'headerFullLocation': 'Central Core, Olathe',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 14,
  'suggestedBounds': {'ne': {'lat': 38.89288560900001,
    'lng': -94.80732969008133},
   'sw': {'lat': 38.87488559099999, 'lng': -94.83041030991868}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4bd1ecbc046076b029d17271',
       'name': 'Kansas Coffee Café',
       'location': {'address': '101 E Park St',
        'crossStreet': 'at N Cherry St.',
        'lat': 38.881981099879475,
        'lng': -94.81945893487695,
        'labeledLatLn

Based on the response above, the following factors are considered when choosing the business location: 
* Access: One of the most important factors is how accessible a potential location is. Distance and cross street information need to be pulled from the API response.
* Business type: Business type is the value I need to predict. This item is added to my test/train model.
* Popularity of the business: The venues in the Johnson County area have no ratings in the data pulled from Foursquare, so I skip this factor.
* Tax: Restaurant business income tax rates and the sales tax rates imposed on customers do not vary much within Johnson County. The combined tax rate for Johnson County is 9.48%.
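
Because every explore request carries the client ID and secret, one option is to keep them out of the notebook and read them from the environment. The sketch below assembles the same explore URL with `urllib.parse.urlencode`. The environment variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `build_explore_url` are hypothetical names chosen for this example.

```python
import os
from urllib.parse import urlencode

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'


def build_explore_url(lat, lng, radius=1500, limit=100, version='20180605'):
    """Assemble an explore URL; credentials come from the environment."""
    params = {
        # Hypothetical variable names; empty strings are used if they are unset.
        'client_id': os.environ.get('FOURSQUARE_CLIENT_ID', ''),
        'client_secret': os.environ.get('FOURSQUARE_CLIENT_SECRET', ''),
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    # urlencode percent-encodes the values, including the comma in 'll'
    return EXPLORE_ENDPOINT + '?' + urlencode(params)


# Coordinates of Olathe as printed earlier in this notebook
url = build_explore_url(38.8838856, -94.81887)
print(url)
```

The resulting string can be passed to `requests.get` exactly like the hand-formatted URL.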



In [22]:
def getNearbyVenues(names,postal, latitudes, longitudes):
    
    venues_list=[]
    LIMIT = 100
    radius = 1500
    for cname,pos,lat,lng in zip(names,postal,latitudes,longitudes):
 
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue;
        # crossStreet is optional in the response, so fall back to 'NA' when it is missing
        for v in results:
            location = v['venue']['location']
            venues_list.append([(
                cname,
                pos,
                lat, 
                lng, 
                v['venue']['name'], 
                location['lat'], 
                location['lng'],
                location['distance'],
                v['venue']['categories'][0]['name'],
                location.get('crossStreet', 'NA'))])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])

    nearby_venues.columns = ['City',
                  'Postal Code',
                  'Latitude', 
                  'Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Dis to entrance',
                  'Venue Category',
                  'Venue crosssSreet',]
    
    return(nearby_venues)
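
`getNearbyVenues` indexes `categories[0]` directly, which raises an `IndexError` if a venue comes back with an empty category list. The sketch below shows one way to flatten a single item while tolerating missing optional fields. The helper `venue_record` is hypothetical, and the sample item copies only the fields visible in the truncated Olathe response above; the empty `categories` list is an assumption added to exercise the fallback.

```python
def venue_record(item):
    """Flatten one Foursquare-style item into a tuple, with 'NA' for missing optional fields."""
    venue = item['venue']
    location = venue.get('location', {})
    # Fall back to a single empty dict when the category list is missing or empty
    categories = venue.get('categories') or [{}]
    return (
        venue.get('name'),
        location.get('lat'),
        location.get('lng'),
        location.get('distance'),
        categories[0].get('name', 'NA'),
        location.get('crossStreet', 'NA'),
    )


sample_item = {
    'venue': {
        'name': 'Kansas Coffee Café',
        'location': {
            'address': '101 E Park St',
            'crossStreet': 'at N Cherry St.',
            'lat': 38.881981099879475,
            'lng': -94.81945893487695,
        },
        'categories': [],  # assumption: an empty list, to show the fallback
    }
}

print(venue_record(sample_item))
```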

In [23]:
jo_venues = getNearbyVenues(names=df_jon['city_name'],
                            postal=df_jon['Postal Code'],
                            longitudes=df_jon['longitude'],
                            latitudes=df_jon['latitude']
                            )

jo_venues

| | City | Postal Code | Latitude | Longitude | Venue | Venue Latitude | Venue Longitude | Venue Dis to entrance | Venue Category | Venue crosssSreet |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Shawnee/ Shawnee Mission/ Sm | 66218 | 39.0417 | -94.7202 | West Flanders Park | 39.030392 | -94.713196 | 1396 | Park | Near Johnson Dr. |
| 1 | Shawnee/ Shawnee Mission/ Sm | 66218 | 39.0417 | -94.7202 | West Flanders Park & Walking Trail | 39.029862 | -94.713282 | 1447 | Park | nieman |
| 2 | Shawnee/ Shawnee Mission/ Sm | 66218 | 39.0417 | -94.7202 | Black Swan Lake | 39.041802 | -94.730938 | 928 | Lake | NA |
| 3 | Shawnee/ Shawnee Mission/ Sm | 66218 | 39.0417 | -94.7202 | Bierman's Christmas Tree Farm | 39.049750 | -94.723537 | 941 | Garden Center | NA |
| 4 | Shawnee/ Shawnee Mission/ Sm | 66218 | 39.0417 | -94.7202 | KC's Neighborhood Bar & Grill | 39.043677 | -94.704676 | 1360 | Bar | NA |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1760 | Leawood/ Overland/ Overland Park/ Shawnee Mis... | 66211 | 38.9667 | -94.6169 | State Line Rd & Bannister Rd/ 95th St | 38.956705 | -94.608141 | 1346 | Intersection | NA |
| 1761 | Leawood/ Overland/ Overland Park/ Shawnee Mis... | 66211 | 38.9667 | -94.6169 | Ceramic Café | 38.957999 | -94.629457 | 1455 | Arts & Crafts Store | NA |
| 1762 | Leawood/ Overland/ Overland Park/ Shawnee Mis... | 66211 | 38.9667 | -94.6169 | Hallmark Creations | 38.958188 | -94.629920 | 1472 | Gift Shop | NA |
| 1763 | Leawood/ Overland/ Overland Park/ Shawnee Mis... | 66211 | 38.9667 | -94.6169 | Oz's MAQ Donut House | 38.956166 | -94.627651 | 1497 | Donut Shop | NA |
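
The `Venue Dis to entrance` column holds the Foursquare `distance` field, and its values never exceed the 1500 m search radius, which suggests it is measured from the query point rather than from a building entrance. A quick way to check is to compute the great-circle distance between the zip code centroid and a venue. The sketch below uses the haversine formula with a mean Earth radius of 6,371 km (an assumption of the formula, not something taken from the API) on the first row of the table above.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lng1, lat2, lng2, radius_m=6371000.0):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * radius_m * asin(sqrt(a))


# Row 0: centroid of 66218 and West Flanders Park
d = haversine_m(39.0417, -94.7202, 39.030392, -94.713196)
print(round(d))
```

The result comes out at roughly 1.4 km, within a few meters of the 1396 reported for that row, which is consistent with reading the column as distance from the zip code centroid.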


In [24]:
jo_pc=jo_venues.groupby('Postal Code').count()

Next, I use the Census API to collect demographic data for the Johnson County neighborhoods. The census dataset I used in this project is the 2017 Economic Annual Surveys. Number of employees and annual payroll ($1,000) are the two variables considered, so that restaurant owners are able to judge the neighborhood size and the wealth of prospective customers. 

In [25]:
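CENSUS_API_KEY = 'YOUR_CENSUS_API_KEY' # your Census API key (do not publish the real value)

def getpopinfo(zip_codes):
    zip_list=[]
    column=None
    for zip_code in zip_codes:
        url = 'https://api.census.gov/data/2017/zbp?get=NAICS2017_LABEL,EMP,GEO_ID,PAYANN,EMPSZES_LABEL&for=zip%20code:{}&NAICS2017=00&key={}'.format(
        zip_code, CENSUS_API_KEY)
        
        res = requests.get(url)
        
        # element 0 is the header row, element 1 the data row;
        # skip zip codes whose response cannot be parsed or has no data row
        try:
            data = res.json()
            zip_list.append(data[1])
            # take the header from a response that parsed, not from the last request
            column = data[0]
        except (ValueError, IndexError, KeyError):
            continue
        
    jo_emp = pd.DataFrame(zip_list,columns=column)
 

    return(jo_emp)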

In [26]:
df_jovenue=df_jon['Postal Code'].values.tolist()
 

In [27]:
df_jovenues = getpopinfo(zip_codes=df_jovenue)
df_jovenues

| | NAICS2017_LABEL | EMP | GEO_ID | PAYANN | EMPSZES_LABEL | NAICS2017 | zip code |
|---|---|---|---|---|---|---|---|
| 0 | Total for all sectors | 1280 | 8610000US66218 | 64208 | All establishments | 0 | 66218 |
| 1 | Total for all sectors | 10456 | 8610000US66223 | 332174 | All establishments | 0 | 66223 |
| 2 | Total for all sectors | 18 | 8610000US66282 | 223 | All establishments | 0 | 66282 |
| 3 | Total for all sectors | 29366 | 8610000US66062 | 1224442 | All establishments | 0 | 66062 |
| 4 | Total for all sectors | 29052 | 8610000US66215 | 1436549 | All establishments | 0 | 66215 |
| 5 | Total for all sectors | 4693 | 8610000US66227 | 203755 | All establishments | 0 | 66227 |
| 6 | Total for all sectors | 0 | 8610000US66251 | 0 | All establishments | 0 | 66251 |
| 7 | Total for all sectors | 10259 | 8610000US66204 | 452242 | All establishments | 0 | 66204 |
| 8 | Total for all sectors | 1562 | 8610000US66021 | 66227 | All establishments | 0 | 66021 |
| 9 | Total for all sectors | 1833 | 8610000US66083 | 74597 | All establishments | 0 | 66083 |
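
`getpopinfo` relies on the shape of the Census response: a list whose first element is the header row and whose remaining elements are data rows. The sketch below turns such a payload into a list of dictionaries without any network call. The helper `rows_to_records` is hypothetical, and `sample_payload` is typed in by hand from the first row of the table above. The values are written as strings on the assumption that the API returns text, which matches the `object` dtype these columns show later in the notebook.

```python
def rows_to_records(payload):
    """Turn a [header, row, row, ...] payload into a list of dicts keyed by column name."""
    if not payload or len(payload) < 2:
        return []
    header, *rows = payload
    return [dict(zip(header, row)) for row in rows]


# Hand-typed sample shaped like one response; '00' is the NAICS2017 request parameter
sample_payload = [
    ['NAICS2017_LABEL', 'EMP', 'GEO_ID', 'PAYANN', 'EMPSZES_LABEL', 'NAICS2017', 'zip code'],
    ['Total for all sectors', '1280', '8610000US66218', '64208', 'All establishments', '00', '66218'],
]

records = rows_to_records(sample_payload)
employees = int(records[0]['EMP'])      # convert text to a number before modeling
payroll_k = int(records[0]['PAYANN'])   # annual payroll in $1,000
print(records[0]['zip code'], employees, payroll_k)
```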


In [28]:
df_joks = df_jovenues.drop(['NAICS2017_LABEL','GEO_ID','EMPSZES_LABEL','NAICS2017'], axis=1).rename(columns={'EMP':'Employee No.','PAYANN':'Ann payroll','zip code':'Postal Code'})

df_joks


| | Employee No. | Ann payroll | Postal Code |
|---|---|---|---|
| 0 | 1280 | 64208 | 66218 |
| 1 | 10456 | 332174 | 66223 |
| 2 | 18 | 223 | 66282 |
| 3 | 29366 | 1224442 | 66062 |
| 4 | 29052 | 1436549 | 66215 |
| 5 | 4693 | 203755 | 66227 |
| 6 | 0 | 0 | 66251 |
| 7 | 10259 | 452242 | 66204 |
| 8 | 1562 | 66227 | 66021 |
| 9 | 1833 | 74597 | 66083 |


Combine all the variables in one dataset.

In [29]:
df_kcjo = pd.merge(jo_venues,df_joks, how='left', on='Postal Code')
df_kcjo

| | City | Postal Code | Latitude | Longitude | Venue | Venue Latitude | Venue Longitude | Venue Dis to entrance | Venue Category | Venue crosssSreet | Employee No. | Ann payroll |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Shawnee/ Shawnee Mission/ Sm | 66218 | 39.0417 | -94.7202 | West Flanders Park | 39.030392 | -94.713196 | 1396 | Park | Near Johnson Dr. | 1280 | 64208 |
| 1 | Shawnee/ Shawnee Mission/ Sm | 66218 | 39.0417 | -94.7202 | West Flanders Park & Walking Trail | 39.029862 | -94.713282 | 1447 | Park | nieman | 1280 | 64208 |
| 2 | Shawnee/ Shawnee Mission/ Sm | 66218 | 39.0417 | -94.7202 | Black Swan Lake | 39.041802 | -94.730938 | 928 | Lake | NA | 1280 | 64208 |
| 3 | Shawnee/ Shawnee Mission/ Sm | 66218 | 39.0417 | -94.7202 | Bierman's Christmas Tree Farm | 39.049750 | -94.723537 | 941 | Garden Center | NA | 1280 | 64208 |
| 4 | Shawnee/ Shawnee Mission/ Sm | 66218 | 39.0417 | -94.7202 | KC's Neighborhood Bar & Grill | 39.043677 | -94.704676 | 1360 | Bar | NA | 1280 | 64208 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1760 | Leawood/ Overland/ Overland Park/ Shawnee Mis... | 66211 | 38.9667 | -94.6169 | State Line Rd & Bannister Rd/ 95th St | 38.956705 | -94.608141 | 1346 | Intersection | NA | 30534 | 2447778 |
| 1761 | Leawood/ Overland/ Overland Park/ Shawnee Mis... | 66211 | 38.9667 | -94.6169 | Ceramic Café | 38.957999 | -94.629457 | 1455 | Arts & Crafts Store | NA | 30534 | 2447778 |
| 1762 | Leawood/ Overland/ Overland Park/ Shawnee Mis... | 66211 | 38.9667 | -94.6169 | Hallmark Creations | 38.958188 | -94.629920 | 1472 | Gift Shop | NA | 30534 | 2447778 |
| 1763 | Leawood/ Overland/ Overland Park/ Shawnee Mis... | 66211 | 38.9667 | -94.6169 | Oz's MAQ Donut House | 38.956166 | -94.627651 | 1497 | Donut Shop | NA | 30534 | 2447778 |


##### 2.22 Data Standardization and Data Normalization

Identify pizza places from the venue category and transform this category to a numerical value. This is the target value I will use to build the models. `Venue crosssSreet` also needs to be transformed to a numerical value: venues with no cross street (`NA`) are coded 1 and all other venues are coded 0. 

In [30]:
df_kcjo['Venue Category'].value_counts()

Clothing Store        90
Pizza Place           77
Sandwich Place        60
Mexican Restaurant    57
Coffee Shop           51
                      ..
Recruiting Agency      1
Cheese Shop            1
Taco Place             1
Rock Climbing Spot     1
Airport Terminal       1
Name: Venue Category, Length: 166, dtype: int64

In [32]:
new=df_kcjo.copy()

new.loc[new['Venue Category'] == 'Pizza Place', 'Venue Category'] = 1
new.loc[new['Venue Category'] != 1, 'Venue Category'] = 0

new_jo=pd.to_numeric(new['Venue Category'], downcast='float')
new['Venue Category']=new_jo
new.dtypes

City                      object
Postal Code               object
Latitude                 float64
Longitude                float64
Venue                     object
Venue Latitude           float64
Venue Longitude          float64
Venue Dis to entrance      int64
Venue Category           float32
Venue crosssSreet         object
Employee No.              object
Ann payroll               object
dtype: object

In [33]:
new['Venue Category'].value_counts()

0.0    1688
1.0      77
Name: Venue Category, dtype: int64
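
The two-step `.loc` assignment used for the target column can be condensed into one boolean comparison. This is a minimal sketch, assuming pandas is installed; the short `categories` series is made-up sample data standing in for `df_kcjo['Venue Category']`.

```python
import pandas as pd

# Made-up sample standing in for the 'Venue Category' column
categories = pd.Series(['Park', 'Pizza Place', 'Bar', 'Pizza Place', 'Lake'])

# True where the venue is a pizza place, then cast to the float32 dtype used above
is_pizza = (categories == 'Pizza Place').astype('float32')

print(is_pizza.tolist())
print(is_pizza.dtype)
```

The same pattern applies to the cross street flag, for example `(df['Venue crosssSreet'] == 'NA').astype('float32')`, and avoids writing mixed string and integer values into one column along the way.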

In [34]:
df_kcjonson=new.copy()

df_kcjonson.loc[df_kcjonson['Venue crosssSreet'] == 'NA', 'Venue crosssSreet'] = 1
df_kcjonson.loc[df_kcjonson['Venue crosssSreet'] != 1, 'Venue crosssSreet'] = 0

new_jon=pd.to_numeric(df_kcjonson['Venue crosssSreet'], downcast='float')
df_kcjonson['Venue crosssSreet']=new_jon


new_jonkc=pd.to_numeric(df_kcjonson['Employee No.'], downcast='float')
df_kcjonson['Employee No.']=new_jonkc

new_jonkan=pd.to_numeric(df_kcjonson['Ann payroll'], downcast='float')
df_kcjonson['Ann payroll']=new_jonkan

new_jonkanp=pd.to_numeric(df_kcjonson['Postal Code'], downcast='float')
df_kcjonson['Postal Code']=new_jonkanp

df_kcjonson.dtypes

City                      object
Postal Code              float32
Latitude                 float64
Longitude                float64
Venue                     object
Venue Latitude           float64
Venue Longitude          float64
Venue Dis to entrance      int64
Venue Category           float32
Venue crosssSreet        float32
Employee No.             float32
Ann payroll              float32
dtype: object

In [36]:
df_kcjonson

| | City | Postal Code | Latitude | Longitude | Venue | Venue Latitude | Venue Longitude | Venue Dis to entrance | Venue Category | Venue crosssSreet | Employee No. | Ann payroll |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Shawnee/ Shawnee Mission/ Sm | 66218.0 | 39.0417 | -94.7202 | West Flanders Park | 39.030392 | -94.713196 | 1396 | 0.0 | 0.0 | 1280.0 | 64208.0 |
| 1 | Shawnee/ Shawnee Mission/ Sm | 66218.0 | 39.0417 | -94.7202 | West Flanders Park & Walking Trail | 39.029862 | -94.713282 | 1447 | 0.0 | 0.0 | 1280.0 | 64208.0 |
| 2 | Shawnee/ Shawnee Mission/ Sm | 66218.0 | 39.0417 | -94.7202 | Black Swan Lake | 39.041802 | -94.730938 | 928 | 0.0 | 1.0 | 1280.0 | 64208.0 |
| 3 | Shawnee/ Shawnee Mission/ Sm | 66218.0 | 39.0417 | -94.7202 | Bierman's Christmas Tree Farm | 39.049750 | -94.723537 | 941 | 0.0 | 1.0 | 1280.0 | 64208.0 |
| 4 | Shawnee/ Shawnee Mission/ Sm | 66218.0 | 39.0417 | -94.7202 | KC's Neighborhood Bar & Grill | 39.043677 | -94.704676 | 1360 | 0.0 | 1.0 | 1280.0 | 64208.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1760 | Leawood/ Overland/ Overland Park/ Shawnee Mis... | 66211.0 | 38.9667 | -94.6169 | State Line Rd & Bannister Rd/ 95th St | 38.956705 | -94.608141 | 1346 | 0.0 | 1.0 | 30534.0 | 2447778.0 |
| 1761 | Leawood/ Overland/ Overland Park/ Shawnee Mis... | 66211.0 | 38.9667 | -94.6169 | Ceramic Café | 38.957999 | -94.629457 | 1455 | 0.0 | 1.0 | 30534.0 | 2447778.0 |
| 1762 | Leawood/ Overland/ Overland Park/ Shawnee Mis... | 66211.0 | 38.9667 | -94.6169 | Hallmark Creations | 38.958188 | -94.629920 | 1472 | 0.0 | 1.0 | 30534.0 | 2447778.0 |
| 1763 | Leawood/ Overland/ Overland Park/ Shawnee Mis... | 66211.0 | 38.9667 | -94.6169 | Oz's MAQ Donut House | 38.956166 | -94.627651 | 1497 | 0.0 | 1.0 | 30534.0 | 2447778.0 |


In [38]:
 
map_jo = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, city in zip(df_kcjonson['Venue Latitude'], df_kcjonson['Venue Longitude'], df_kcjonson['City']):
    label = '{}'.format(city)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_jo)  
    
map_jo