# IBM Applied Data Science Capstone 

## Week 4



## Introduction

No visit to Shanghai is complete without the former French Concession, with its garden houses, tall plane trees lining both sides of the road, and cafes and bars tucked into every corner. However, only a handful of its streets are widely known, while other areas with a similar character go unnoticed by most visitors. By clustering similar areas, this project aims to give tourists more information about where else to go.

## Data

To address this problem, we need the following data:

- List of neighbourhoods in Shanghai. This defines the scope of this project.
- Latitude and longitude coordinates of those neighbourhoods. These are required to plot the map and to get the venue data.
- Venue data, particularly data related to shopping malls. We will use this data to perform clustering on the neighbourhoods.

**Sources of data and methods to extract them:**
The Shanghai Civil Affairs Bureau page contains a list of neighbourhoods in Shanghai, with a total of 106 neighbourhoods. We will use web scraping techniques to extract the data from the page, with the help of the Python requests and beautifulsoup packages. Then we will get the latitude and longitude coordinates of the neighbourhoods through a geocoding service (the code below calls the AMap geocoding REST API). __(web scraping)__

After that, we will use the Foursquare API to get the venue data for those neighbourhoods.  __(working with API)__

The Foursquare API provides many categories of venue data, which help us solve the problem put forward above.
__(data cleaning, data wrangling, machine learning and map visualization)__


## Methodology

### 1. Import libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

import json # library to handle JSON files

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from bs4 import BeautifulSoup # library to parse HTML and XML documents

from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import, pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

print("Libraries imported.")

Libraries imported.


### 2. Scrape data from page into a DataFrame

### send the GET request

In [2]:
# send the GET request
data = requests.get('http://www.shlnb.cn/gb/shmzj/node8/node15/node55/node238/u1ai43816.html')
data.encoding = "gbk"
data = data.text

### parse data from the html into a beautifulsoup object

In [3]:
soup = BeautifulSoup(data, 'html.parser')
soup

<!DOCTYPE html>

<html lang="zh-CN">
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>
<meta content="width=device-width, initial-scale=1.0" name="viewport"/>
<meta content="pc" name="applicable-device"/>
<meta content="IE=Edge" http-equiv="X-UA-Compatible"/>
<title>上海民政-上海市行政区划名称表(截至2019年06月30日)</title>
<meta content="上海民政" name="SiteName"/>
<meta content="mzj.sh.gov.cn" name="SiteDomain"/>
<meta content="3100000013" name="SiteIDCode"/>
<meta content="上海民政,上海市民政局,上海,民政局,上海民政局网站" name="keywords">
<meta content="上海市行政区划名称表(截至2019年06月30日)" name="description">
<meta content="上海市行政区划名称表(截至2019年06月30日)" name="ArticleTitle"/>
<meta content="上海民政局" name="ContentSource"/>
<meta content="2019-7-31 10:42:22" name="PubDate"/>
<!-- Custom CSS -->
<link href="/shmzj2016/assets/css/ETUI/ETUI3.min.css" rel="stylesheet"/>
<link href="/shmzj2016/assets/css/ETUI/ETUI3.Utility.css" rel="stylesheet"/>
<!-- Widget CSS -->
<link href="/shmzj2016/assets/plugin/owl/owl.carousel.css" 

### create two lists to store table data

In [4]:
boroughList = []
neighborhoodList = []

### crawl the table for boroughList

In [5]:

for row in soup.find('table').find_all('tr')[2:]:
    data = row.find(attrs={'style':"width:73px;height:1px;"})
    data  = data.get_text().strip()
    boroughList.append(data)
boroughList = boroughList[:-1]
boroughList
    



['黄浦',
 '徐汇',
 '长宁',
 '静安',
 '普陀',
 '虹口',
 '杨浦',
 '闵行',
 '宝山',
 '嘉定',
 '浦东',
 '金山',
 '松江',
 '青浦',
 '奉贤',
 '崇明']

### crawl the table for neighborhoodList _(township-level divisions)_

In [6]:

for row in soup.find('table').find_all('tr')[2:]:
    data = row.find(attrs={'style':"width:312px;height:1px;"})
    data  = data.get_text().strip()
    neighborhoodList.append(data)
neighborhoodList = neighborhoodList[:-1]
neighborhoodList

['外滩街道、南京东路街道、半淞园路街道、小东门街道、老西门街道、豫园街道、打浦桥街道、淮海中路街道、瑞金二路街道、五里桥街道',
 '湖南路街道、天平路街道、枫林路街道、徐家汇街道、斜土路街道、长桥街道、漕河泾街道、康健新村街道、虹梅路街道、田林街道、凌云路街道、龙华街道、华泾镇',
 '华阳路街道、新华路街道、江苏路街道、天山路街道、周家桥街道、虹桥街道、仙霞新村街道、程家桥街道、北新泾街道、新泾镇',
 '江宁路街道、静安寺街道、南京西路街道、曹家渡街道、石门二路街道、天目西路街道、北站街道、宝山路街道、芷江西路街道、共和新路街道、大宁路街道、彭浦新村街道、临汾路街道、彭浦镇',
 '长寿路街道、曹杨新村街道、长风新村街道、宜川路街道、甘泉路街道、石泉路街道、真如镇街道、万里街道、长征镇、桃浦镇',
 '四川北路街道、北外滩街道、欧阳路街道、广中路街道、凉城新村街道、嘉兴路街道、曲阳路街道、江湾镇街道',
 '定海路街道、大桥街道、平凉路街道、江浦路街道、控江路街道、殷行街道、长白新村街道、延吉新村街道、五角场街道、四平路街道、新江湾城街道、长海路街道',
 '江川路街道、古美街道、新虹街道、浦锦街道、莘庄镇、七宝镇、浦江镇、梅陇镇、虹桥镇、马桥镇、吴泾镇、华漕镇、颛桥镇',
 '吴淞街道、张庙街道、友谊路街道、庙行镇、罗店镇、大场镇、顾村镇、罗泾镇、杨行镇、月浦镇、淞南镇、高境镇',
 '嘉定镇街道、新成路街道、真新街道、马陆镇、南翔镇、江桥镇、安亭镇、外冈镇、徐行镇、华亭镇',
 '潍坊新村街道、陆家嘴街道、塘桥街道、周家渡街道、东明路街道、洋泾街道、上钢新村街道、沪东新村街道、金杨新村街道、浦兴路街道、南码头路街道、花木街道、川沙新镇、合庆镇、曹路镇、高东镇、高桥镇、高行镇、金桥镇、张江镇、唐镇、北蔡镇、三林镇、惠南镇、新场镇、大团镇、周浦镇、航头镇、康桥镇、宣桥镇、祝桥镇、泥城镇、书院镇、万祥镇、老港镇、南汇新城镇',
 '石化街道、枫泾镇、朱泾镇、亭林镇、漕泾镇、山阳镇、金山卫镇、张堰镇、廊下镇、吕巷镇',
 '岳阳街道、中山街道、永丰街道、方松街道、九里亭街道、广富林街道、九亭镇、泗泾镇、泖港镇、车墩镇、洞泾镇、叶榭镇、新桥镇、石湖荡镇、新浜镇、佘山镇、小昆山镇',
 '夏阳街道、盈浦街道、香花桥街道、赵巷镇、徐泾镇、华新镇、重固镇、白鹤镇、朱家角
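
The two loops above locate cells by their inline `style` attribute, which ties the scraper to the page's exact layout. As a minimal sketch, the same rows could be read by column position in a single pass. The HTML fragment and the column order (borough first, subdivisions last) are assumptions made for illustration; the real page has header rows and extra columns that would need adjusting.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the table on the real page.
sample_html = """
<table>
  <tr><td>黄浦</td><td>外滩街道、南京东路街道</td></tr>
  <tr><td>徐汇</td><td>湖南路街道、天平路街道</td></tr>
</table>
"""

def extract_rows(html):
    """Return (borough, subdivisions) text pairs, one per table row."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find("table").find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) >= 2:  # skip header or malformed rows
            rows.append((cells[0], cells[-1]))
    return rows

pairs = extract_rows(sample_html)
boroughs = [borough for borough, _ in pairs]
subdivisions = [names for _, names in pairs]
```

Reading both columns from the same row keeps each borough paired with its own subdivisions, so the two lists cannot drift out of step.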

### 3. Create a new DataFrame from the lists

In [7]:
df_shanghai = pd.DataFrame({"Borough": boroughList,
                           "Neighborhood": neighborhoodList})

df_shanghai

Unnamed: 0,Borough,Neighborhood
0,黄浦,外滩街道、南京东路街道、半淞园路街道、小东门街道、老西门街道、豫园街道、打浦桥街道、淮海中路...
1,徐汇,湖南路街道、天平路街道、枫林路街道、徐家汇街道、斜土路街道、长桥街道、漕河泾街道、康健新村街...
2,长宁,华阳路街道、新华路街道、江苏路街道、天山路街道、周家桥街道、虹桥街道、仙霞新村街道、程家桥街...
3,静安,江宁路街道、静安寺街道、南京西路街道、曹家渡街道、石门二路街道、天目西路街道、北站街道、宝山...
4,普陀,长寿路街道、曹杨新村街道、长风新村街道、宜川路街道、甘泉路街道、石泉路街道、真如镇街道、万里...
5,虹口,四川北路街道、北外滩街道、欧阳路街道、广中路街道、凉城新村街道、嘉兴路街道、曲阳路街道、江湾镇街道
6,杨浦,定海路街道、大桥街道、平凉路街道、江浦路街道、控江路街道、殷行街道、长白新村街道、延吉新村街...
7,闵行,江川路街道、古美街道、新虹街道、浦锦街道、莘庄镇、七宝镇、浦江镇、梅陇镇、虹桥镇、马桥镇、吴...
8,宝山,吴淞街道、张庙街道、友谊路街道、庙行镇、罗店镇、大场镇、顾村镇、罗泾镇、杨行镇、月浦镇、淞南...
9,嘉定,嘉定镇街道、新成路街道、真新街道、马陆镇、南翔镇、江桥镇、安亭镇、外冈镇、徐行镇、华亭镇


### split the Neighborhood column so that the coordinates can be fetched for each entry

In [8]:
df_shanghai_expand = df_shanghai.drop('Neighborhood',axis = 1).join(df_shanghai['Neighborhood'].str.split("、",expand = True).stack().reset_index(level = 1,drop = True).rename('Neighborhood'))
df_shanghai_expand.shape

(214, 2)
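
The one-liner above splits each delimited string and stacks the pieces into rows. In pandas 0.25 or later, `Series.str.split` followed by `DataFrame.explode` is an equivalent and arguably more readable route. A minimal sketch on a two-row sample (the sample frame is illustrative, not the scraped data):

```python
import pandas as pd

sample = pd.DataFrame({
    "Borough": ["黄浦", "虹口"],
    "Neighborhood": ["外滩街道、豫园街道", "北外滩街道"],
})

expanded = (
    sample
    # turn each delimited string into a list of names
    .assign(Neighborhood=sample["Neighborhood"].str.split("、"))
    # one output row per list element; Borough is repeated automatically
    .explode("Neighborhood")
    .reset_index(drop=True)
)
```

Unlike the stacked version, `reset_index(drop=True)` gives every row its own index label instead of repeating the borough's original index.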

### 4. Use a geocoding API to get the latitude and longitude values

In [9]:
longitudeList = []
latitudeList = []

for neighborhood in df_shanghai_expand['Neighborhood']:
    #print(neighborhood)
    address = "上海"+neighborhood
    url="http://restapi.amap.com/v3/geocode/geo?key=%s&address=%s" %('5770b0da3ff6ab387efa7cb7894c807f',address)
    data=requests.get(url)
    contest=data.json()
    print(contest)
    try:
        contest=contest['geocodes'][0]['location']
        contest = contest.split(",")
        longitude = contest[0]
        latitude = contest[1]
        longitudeList.append(longitude)
        latitudeList.append(latitude)
    except (KeyError, IndexError, AttributeError):  # no usable geocoding result for this address
        longitudeList.append("Not found")
        latitudeList.append("Not found")
    

{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市黄浦区外滩街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '黄浦区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310101', 'street': [], 'number': [], 'location': '121.484146,31.240357', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市黄浦区南京东路街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '黄浦区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310101', 'street': [], 'number': [], 'location': '121.471941,31.226107', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市黄浦区半淞园路街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': 

{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市长宁区虹桥街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '长宁区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310105', 'street': [], 'number': [], 'location': '121.420137,31.195531', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市长宁区仙霞新村街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '长宁区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310105', 'street': [], 'number': [], 'location': '121.394701,31.203999', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市长宁区程家桥街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '

{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市普陀区宜川路街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '普陀区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310107', 'street': [], 'number': [], 'location': '121.433955,31.255842', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市普陀区甘泉路街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '普陀区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310107', 'street': [], 'number': [], 'location': '121.426557,31.263581', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市普陀区石泉路街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '

{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市杨浦区延吉新村街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '杨浦区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310110', 'street': [], 'number': [], 'location': '121.536901,31.288689', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市杨浦区五角场街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '杨浦区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310110', 'street': [], 'number': [], 'location': '121.503859,31.293929', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市杨浦区四平路街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': 

{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市宝山区月浦镇', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '宝山区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310113', 'street': [], 'number': [], 'location': '121.421874,31.420287', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市宝山区淞南镇', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '宝山区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310113', 'street': [], 'number': [], 'location': '121.489764,31.348028', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市宝山区高境镇', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '宝山区', 

{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市浦东新区川沙新镇', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '浦东新区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310115', 'street': [], 'number': [], 'location': '121.695947,31.193105', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市浦东新区合庆镇', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '浦东新区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310115', 'street': [], 'number': [], 'location': '121.723707,31.236879', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市浦东新区曹路镇', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '

{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市浦东新区南汇新城镇', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '浦东新区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310115', 'street': [], 'number': [], 'location': '121.924644,30.901626', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市金山区石化街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '金山区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310116', 'street': [], 'number': [], 'location': '121.336792,30.725684', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市金山区枫泾镇', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '金

{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市松江区小昆山镇', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '松江区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310117', 'street': [], 'number': [], 'location': '121.132203,31.029603', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市青浦区夏阳街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '青浦区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310118', 'street': [], 'number': [], 'location': '121.126023,31.148256', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市青浦区盈浦街道', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '青浦区

{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市崇明区向化镇', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '崇明区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310151', 'street': [], 'number': [], 'location': '121.723524,31.520566', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市崇明区绿华镇', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '崇明区', 'township': [], 'neighborhood': {'name': [], 'type': []}, 'building': {'name': [], 'type': []}, 'adcode': '310151', 'street': [], 'number': [], 'location': '121.220535,31.762452', 'level': '乡镇'}]}
{'status': '1', 'info': 'OK', 'infocode': '10000', 'count': '1', 'geocodes': [{'formatted_address': '上海市崇明区建设镇', 'country': '中国', 'province': '上海市', 'citycode': '021', 'city': '上海市', 'district': '崇明区', 
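
Each response printed above stores the coordinates as a single `'longitude,latitude'` string under `geocodes[0]['location']`. The helper below is a sketch of pulling that pair out as floats and returning `None` when nothing was found. `parse_location` is a hypothetical name, and the sample dictionary is a trimmed copy of the first response shown above.

```python
def parse_location(response):
    """Return (longitude, latitude) as floats, or None if no geocode is present."""
    geocodes = response.get("geocodes") or []
    if not geocodes:
        return None
    location = geocodes[0].get("location")
    # Guard against a missing or non-string location field.
    if not isinstance(location, str) or "," not in location:
        return None
    lng_text, lat_text = location.split(",", 1)
    return float(lng_text), float(lat_text)

# Trimmed copy of the first response printed above.
sample = {
    "status": "1",
    "geocodes": [{"formatted_address": "上海市黄浦区外滩街道",
                  "location": "121.484146,31.240357"}],
}
empty = {"status": "1", "geocodes": []}

coords = parse_location(sample)
missing = parse_location(empty)
```

Returning floats (or `None`) at this stage avoids mixing numbers with a sentinel string in the coordinate columns later on.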

In [10]:
df_coordinates = pd.DataFrame({"Neighborhood": df_shanghai_expand['Neighborhood'],"longitude": longitudeList,"latitude": latitudeList})

df_coordinates

Unnamed: 0,Neighborhood,longitude,latitude
0,外滩街道,121.484146,31.240357
0,南京东路街道,121.471941,31.226107
0,半淞园路街道,121.487103,31.206786
0,小东门街道,121.501063,31.220083
0,老西门街道,121.486244,31.215171
0,豫园街道,121.486984,31.225396
0,打浦桥街道,121.473506,31.203656
0,淮海中路街道,121.475832,31.214941
0,瑞金二路街道,121.466792,31.217426
0,五里桥街道,121.481785,31.200743


### 5. Merging

In [11]:
df_shanghai_expand = df_shanghai_expand.merge(df_coordinates, on="Neighborhood", how="left")
df_shanghai_expand

Unnamed: 0,Borough,Neighborhood,longitude,latitude
0,黄浦,外滩街道,121.484146,31.240357
1,黄浦,南京东路街道,121.471941,31.226107
2,黄浦,半淞园路街道,121.487103,31.206786
3,黄浦,小东门街道,121.501063,31.220083
4,黄浦,老西门街道,121.486244,31.215171
5,黄浦,豫园街道,121.486984,31.225396
6,黄浦,打浦桥街道,121.473506,31.203656
7,黄浦,淮海中路街道,121.475832,31.214941
8,黄浦,瑞金二路街道,121.466792,31.217426
9,黄浦,五里桥街道,121.481785,31.200743
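
A merge on `Neighborhood` only keeps the row count stable if every name is unique in both frames. One way to make that assumption explicit is the `validate` argument of `DataFrame.merge`, sketched here on illustrative two-row frames:

```python
import pandas as pd

left = pd.DataFrame({"Borough": ["黄浦", "黄浦"],
                     "Neighborhood": ["外滩街道", "豫园街道"]})
right = pd.DataFrame({"Neighborhood": ["外滩街道", "豫园街道"],
                      "longitude": ["121.484146", "121.486984"],
                      "latitude": ["31.240357", "31.225396"]})

# validate raises pandas.errors.MergeError if a key repeats on either side
merged = left.merge(right, on="Neighborhood", how="left", validate="one_to_one")

# A duplicated key on the right side is rejected instead of silently adding rows.
duplicated = pd.concat([right, right.iloc[[0]]], ignore_index=True)
try:
    left.merge(duplicated, on="Neighborhood", how="left", validate="one_to_one")
    duplicate_detected = False
except pd.errors.MergeError:
    duplicate_detected = True
```

If two township-level divisions in different boroughs ever shared a name, this check would fail loudly rather than duplicate rows.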


### Drop the neighborhoods whose latitude and longitude were not found

In [12]:
df_shanghai_expand = df_shanghai_expand[df_shanghai_expand.longitude != "Not found"].reset_index(drop=True)
df_shanghai_expand.shape

(212, 4)
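
Comparing against the sentinel string `"Not found"` works, but it leaves both coordinate columns as strings. A hedged alternative is to convert with `pd.to_numeric(..., errors="coerce")`, which turns anything non-numeric into `NaN`, and then drop those rows. The sample values below are illustrative, and the last neighborhood name is made up.

```python
import pandas as pd

sample = pd.DataFrame({
    "Neighborhood": ["外滩街道", "豫园街道", "示例街道"],  # last name is made up
    "longitude": ["121.484146", "121.486984", "Not found"],
    "latitude": ["31.240357", "31.225396", "Not found"],
})

# Non-numeric entries such as "Not found" become NaN.
for column in ["longitude", "latitude"]:
    sample[column] = pd.to_numeric(sample[column], errors="coerce")

# Keep only rows where both coordinates were parsed.
cleaned = sample.dropna(subset=["longitude", "latitude"]).reset_index(drop=True)
```

With numeric columns, later steps such as computing a map center or passing coordinates to an API need no further casting.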

### 6. Create a map of Shanghai with neighborhoods (township-level divisions) superimposed on top

In [13]:
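# center the map on the mean of the neighborhood coordinates rather than on the
# leftover loop variables (coordinates are stored as strings, so cast to float first)
center = [df_shanghai_expand['latitude'].astype(float).mean(), df_shanghai_expand['longitude'].astype(float).mean()]
map_shanghai = folium.Map(location=center, zoom_start=10)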

# add markers to map
for lat, lng, borough, neighborhood in zip(df_shanghai_expand['latitude'], df_shanghai_expand['longitude'], df_shanghai_expand['Borough'], df_shanghai_expand['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_shanghai)  
    
map_shanghai

### 7. Keep only the boroughs in the city center

In [14]:

borough_names = list(df_shanghai_expand.Borough.unique())

borough_with_shanghai = ["黄浦" ,"徐汇","静安","长宁","普陀","虹口","杨浦"]



In [15]:
df_inner= df_shanghai_expand[df_shanghai_expand['Borough'].isin(borough_with_shanghai)].reset_index(drop=True)
print(df_inner.shape)
df_inner

(77, 4)


Unnamed: 0,Borough,Neighborhood,longitude,latitude
0,黄浦,外滩街道,121.484146,31.240357
1,黄浦,南京东路街道,121.471941,31.226107
2,黄浦,半淞园路街道,121.487103,31.206786
3,黄浦,小东门街道,121.501063,31.220083
4,黄浦,老西门街道,121.486244,31.215171
5,黄浦,豫园街道,121.486984,31.225396
6,黄浦,打浦桥街道,121.473506,31.203656
7,黄浦,淮海中路街道,121.475832,31.214941
8,黄浦,瑞金二路街道,121.466792,31.217426
9,黄浦,五里桥街道,121.481785,31.200743


In [16]:
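# create map of the city center, centered on the mean of the inner-borough coordinates
# (coordinates are stored as strings, so cast to float first)
center_inner = [df_inner['latitude'].astype(float).mean(), df_inner['longitude'].astype(float).mean()]
map_inner = folium.Map(location=center_inner, zoom_start=10)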

# add markers to map
for lat, lng, borough, neighborhood in zip(df_inner['latitude'], df_inner['longitude'], df_inner['Borough'], df_inner['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_inner)  
    
map_inner

### 8. Use the Foursquare API to explore the neighborhoods

In [17]:
CLIENT_ID = '5UYFAAJU42TAPMYDWMWPY5IRVS3X5S2I23CXXZ0Y55TD5S3N' # your Foursquare ID
CLIENT_SECRET = 'POS3W12XBPMOI1IHK1C4LSZ2DRNI33PNUT1OJZFYMYXQUW4Y' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 5UYFAAJU42TAPMYDWMWPY5IRVS3X5S2I23CXXZ0Y55TD5S3N
CLIENT_SECRET:POS3W12XBPMOI1IHK1C4LSZ2DRNI33PNUT1OJZFYMYXQUW4Y
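
Hard-coding and printing credentials in a shared notebook exposes them to every reader. As a sketch of a common alternative, the values can be read from environment variables and masked before printing. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, and both helper functions, are assumptions chosen for this example.

```python
import os

def load_credentials(env=os.environ):
    """Read Foursquare credentials from environment variables (names are assumptions)."""
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first")
    return client_id, client_secret

def mask(value, visible=4):
    """Show only the first few characters when printing a secret."""
    return value[:visible] + "*" * max(len(value) - visible, 0)
```

A cell could then call `CLIENT_ID, CLIENT_SECRET = load_credentials()` and print `mask(CLIENT_SECRET)` to confirm the value loaded without revealing it.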


**get up to 100 venues within a radius of 500 meters of each neighborhood (the default `radius` of the function below)**

In [18]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    LIMIT = 100 # limit of number of venues returned by Foursquare API
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
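
The list comprehension above assumes every item has at least one category; an item with an empty `categories` list would raise an `IndexError`. The sketch below shows one way to flatten the same response shape defensively. The `sample_items` list is a hand-written stand-in that mirrors only the keys the function reads, and `venue_rows` is a hypothetical helper name.

```python
def venue_rows(name, lat, lng, items):
    """Flatten explore-style items into tuples, skipping venues without a category."""
    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories") or []
        if not categories:
            continue  # nothing to cluster on for this venue
        location = venue.get("location", {})
        rows.append((name, lat, lng, venue.get("name"),
                     location.get("lat"), location.get("lng"),
                     categories[0]["name"]))
    return rows

sample_items = [
    {"venue": {"name": "Bar Rouge",
               "location": {"lat": 31.240391, "lng": 121.485224},
               "categories": [{"name": "Lounge"}]}},
    {"venue": {"name": "Uncategorized place",  # made-up entry
               "location": {"lat": 31.24, "lng": 121.48},
               "categories": []}},
]

rows = venue_rows("外滩街道", 31.240357, 121.484146, sample_items)
```

Each tuple has the same seven fields, in the same order, as the column names assigned in `getNearbyVenues`.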

In [21]:
shanghai_venues = getNearbyVenues(names=df_inner['Neighborhood'],
                                   latitudes=df_inner['latitude'],
                                   longitudes=df_inner['longitude'])

外滩街道
南京东路街道
半淞园路街道
小东门街道
老西门街道
豫园街道
打浦桥街道
淮海中路街道
瑞金二路街道
五里桥街道
湖南路街道
天平路街道
枫林路街道
徐家汇街道
斜土路街道
长桥街道
漕河泾街道
康健新村街道
虹梅路街道
田林街道
凌云路街道
龙华街道
华泾镇
华阳路街道
新华路街道
江苏路街道
天山路街道
周家桥街道
虹桥街道
仙霞新村街道
程家桥街道
北新泾街道
新泾镇
江宁路街道
静安寺街道
南京西路街道
曹家渡街道
石门二路街道
天目西路街道
北站街道
宝山路街道
芷江西路街道
共和新路街道
大宁路街道
彭浦新村街道
临汾路街道
彭浦镇
长寿路街道
曹杨新村街道
长风新村街道
宜川路街道
甘泉路街道
石泉路街道
真如镇街道
万里街道
长征镇
桃浦镇
四川北路街道
北外滩街道
欧阳路街道
广中路街道
凉城新村街道
嘉兴路街道
曲阳路街道
江湾镇街道
定海路街道
大桥街道
平凉路街道
江浦路街道
控江路街道
殷行街道
长白新村街道
延吉新村街道
五角场街道
四平路街道
新江湾城街道
长海路街道


In [82]:
print(shanghai_venues.shape)
shanghai_venues

(1055, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,外滩街道,31.240357,121.484146,Fairmont Peace Hotel (和平饭店),31.240865,121.484742,Hotel
1,外滩街道,31.240357,121.484146,The Bund (外滩),31.239316,121.486065,Waterfront
2,外滩街道,31.240357,121.484146,Mr & Mrs Bund - Modern Eatery by Paul Pairet,31.24045,121.48562,French Restaurant
3,外滩街道,31.240357,121.484146,Hakkasan,31.240436,121.485537,Chinese Restaurant
4,外滩街道,31.240357,121.484146,The Shanghai EDITION (上海爱迪逊酒店),31.240001,121.481678,Hotel
5,外滩街道,31.240357,121.484146,The Swatch Art Peace Hotel,31.240719,121.485127,Hotel
6,外滩街道,31.240357,121.484146,Ultraviolet by Paul Pairet,31.240398,121.485271,Restaurant
7,外滩街道,31.240357,121.484146,Bar Rouge,31.240391,121.485224,Lounge
8,外滩街道,31.240357,121.484146,Imperial Treasure Fine Chinese Cuisine (御寶軒),31.242553,121.484388,Chinese Restaurant
9,外滩街道,31.240357,121.484146,The Peninsula Shanghai,31.243049,121.484564,Hotel


**count the venues returned for each neighborhood, then find out how many unique categories they cover**

In [83]:
shanghai_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
万里街道,28,28,28,28,28,28
临汾路街道,2,2,2,2,2,2
五角场街道,11,11,11,11,11,11
五里桥街道,6,6,6,6,6,6
仙霞新村街道,13,13,13,13,13,13
共和新路街道,4,4,4,4,4,4
凉城新村街道,4,4,4,4,4,4
凌云路街道,5,5,5,5,5,5
北外滩街道,20,20,20,20,20,20
北新泾街道,5,5,5,5,5,5


In [84]:
print('There are {} unique categories.'.format(len(shanghai_venues['Venue Category'].unique())))

There are 180 unique categories.


### 9. Analyze each area

In [85]:
# one hot encoding
shanghai_onehot = pd.get_dummies(shanghai_venues[['Venue Category']], prefix="", prefix_sep="")

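# add neighborhood column back to dataframe
# note: one venue category is itself named "Neighborhood", so this assignment
# overwrites that dummy column in place instead of appending a new column
shanghai_onehot['Neighborhood'] = shanghai_venues['Neighborhood']

# move the last column to the front
# (because of the overwrite above, the last column is a category, not the label column)
fixed_columns = [shanghai_onehot.columns[-1]] + list(shanghai_onehot.columns[:-1])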
shanghai_onehot = shanghai_onehot[fixed_columns]

shanghai_onehot.head()

Unnamed: 0,Zhejiang Restaurant,ATM,Airport,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Badminton Court,Bagel Shop,Bakery,Bar,Bed & Breakfast,Beer Bar,Bistro,Bookstore,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Buddhist Temple,Burger Joint,Bus Station,Business Center,Cafeteria,Café,Camera Store,Campground,Candy Store,Cantonese Restaurant,Cha Chaan Teng,Chinese Aristocrat Restaurant,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,College Soccer Field,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Dongbei Restaurant,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gastropub,Gay Bar,German Restaurant,Grocery Store,Guizhou Restaurant,Gym,Gym / Fitness Center,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hong Kong Restaurant,Hostel,Hotel,Hotel Bar,Hotel Pool,Hotpot Restaurant,Hunan Restaurant,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Karaoke Bar,Korean Restaurant,Lake,Latin American Restaurant,Lingerie Store,Lounge,Malay Restaurant,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Metro Station,Mexican Restaurant,Miscellaneous Shop,Mobile Phone Shop,Monument / Landmark,Motel,Movie Theater,Multiplex,Museum,Music Store,Music Venue,Nail Salon,Neighborhood,New American Restaurant,Night Market,Nightclub,Noodle House,Outdoor Sculpture,Park,Pedestrian Plaza,Performing Arts Venue,Pet Store,Photography Studio,Pie Shop,Pier,Pizza Place,Plaza,Polish Restaurant,Pool,Public Art,Racetrack,Ramen Restaurant,Residential Building (Apartment / Condo),Restaurant,Roof Deck,Salad Place,Sandwich Place,Seafood Restaurant,Shanghai Restaurant,Shopping Mall,Shopping Plaza,Smoothie Shop,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Club,Stadium,Steakhouse,Student Center,Supermarket,Sushi Restaurant,Szechuan Restaurant,Tailor Shop,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Park,Theme Restaurant,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Warehouse Store,Waterfront,Whisky Bar,Wine Bar,Women's Store,Xinjiang Restaurant,Yoga Studio,Yunnan Restaurant
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,外滩街道,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,外滩街道,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,外滩街道,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,外滩街道,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,外滩街道,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
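
Because one of the venue categories is itself named `Neighborhood`, assigning `shanghai_onehot['Neighborhood']` replaces that dummy column in place. That is why the label column sits mid-table above and `Zhejiang Restaurant` ends up first. A minimal sketch of one way to avoid the clash is to give the label column a distinct name (here `Area`, a hypothetical choice) and insert it at position 0. The sample rows are illustrative.

```python
import pandas as pd

venues = pd.DataFrame({
    "Neighborhood": ["外滩街道", "外滩街道", "豫园街道"],
    "Venue Category": ["Hotel", "Neighborhood", "Tea Room"],
})

# One dummy column per category, named after the category itself.
onehot = pd.get_dummies(venues[["Venue Category"]], prefix="", prefix_sep="")

# Insert the label under a name that cannot collide with a category.
onehot.insert(0, "Area", venues["Neighborhood"])
```

With this layout the `Neighborhood` category keeps its own indicator column, and grouping would use `Area` as the key.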


**group rows by neighborhood, taking the mean of the frequency of occurrence of each category**

In [86]:
shanghai_onehot.shape

(1055, 180)

In [87]:
shanghai_grouped = shanghai_onehot.groupby('Neighborhood').mean().reset_index()
shanghai_grouped

Unnamed: 0,Neighborhood,Zhejiang Restaurant,ATM,Airport,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Badminton Court,Bagel Shop,Bakery,Bar,Bed & Breakfast,Beer Bar,Bistro,Bookstore,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Buddhist Temple,Burger Joint,Bus Station,Business Center,Cafeteria,Café,Camera Store,Campground,Candy Store,Cantonese Restaurant,Cha Chaan Teng,Chinese Aristocrat Restaurant,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,College Soccer Field,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Dongbei Restaurant,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,English Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gastropub,Gay Bar,German Restaurant,Grocery Store,Guizhou Restaurant,Gym,Gym / Fitness Center,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hong Kong Restaurant,Hostel,Hotel,Hotel Bar,Hotel Pool,Hotpot Restaurant,Hunan Restaurant,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Karaoke Bar,Korean Restaurant,Lake,Latin American Restaurant,Lingerie Store,Lounge,Malay Restaurant,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Metro Station,Mexican Restaurant,Miscellaneous Shop,Mobile Phone Shop,Monument / Landmark,Motel,Movie Theater,Multiplex,Museum,Music Store,Music Venue,Nail Salon,New American Restaurant,Night Market,Nightclub,Noodle House,Outdoor Sculpture,Park,Pedestrian Plaza,Performing Arts Venue,Pet Store,Photography Studio,Pie Shop,Pier,Pizza Place,Plaza,Polish Restaurant,Pool,Public Art,Racetrack,Ramen Restaurant,Residential Building (Apartment / Condo),Restaurant,Roof Deck,Salad Place,Sandwich Place,Seafood Restaurant,Shanghai Restaurant,Shopping Mall,Shopping Plaza,Smoothie Shop,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Club,Stadium,Steakhouse,Student Center,Supermarket,Sushi Restaurant,Szechuan Restaurant,Tailor Shop,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Park,Theme Restaurant,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Warehouse Store,Waterfront,Whisky Bar,Wine Bar,Women's Store,Xinjiang Restaurant,Yoga Studio,Yunnan Restaurant
0,万里街道,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.107143,0.0,0.0,0.0,0.142857,0.0,0.035714,0.0,0.0,0.0,0.0,0.035714,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0
1,临汾路街道,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,五角场街道,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.272727,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,五里桥街道,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,仙霞新村街道,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.153846,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.076923,0.0,0.0,0.0,0.307692,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.230769,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,共和新路街道,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,凉城新村街道,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,凌云路街道,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,北外滩街道,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.2,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,北新泾街道,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


**Create a new dataframe and display the top 10 venues for each neighborhood**

In [52]:
shanghai_grouped.shape

(74, 180)

In [88]:
num_top_venues = 5

for hood in shanghai_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = shanghai_grouped[shanghai_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----万里街道----
                  venue  freq
0           Coffee Shop  0.14
1    Chinese Restaurant  0.11
2  Fast Food Restaurant  0.07
3     Hotpot Restaurant  0.07
4                  Café  0.04


----临汾路街道----
                 venue  freq
0   Athletics & Sports   0.5
1   Chinese Restaurant   0.5
2  Zhejiang Restaurant   0.0
3     Pedestrian Plaza   0.0
4          Music Venue   0.0


----五角场街道----
                 venue  freq
0   Chinese Restaurant  0.27
1  Zhejiang Restaurant  0.09
2                Hotel  0.09
3          Coffee Shop  0.09
4          Flea Market  0.09


----五里桥街道----
                 venue  freq
0  Japanese Restaurant  0.17
1                 Café  0.17
2        Metro Station  0.17
3          Coffee Shop  0.17
4           Theme Park  0.17


----仙霞新村街道----
                 venue  freq
0          Coffee Shop  0.31
1  Japanese Restaurant  0.23
2            BBQ Joint  0.15
3       Cha Chaan Teng  0.08
4                Hotel  0.08


----共和新路街道----
                  venue  freq
...

                 venue  freq
0   Chinese Restaurant  0.25
1              Airport  0.25
2        Metro Station  0.25
3       Shopping Plaza  0.25
4  Zhejiang Restaurant  0.00
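
The per-neighborhood loop above can also be written with `Series.nlargest`. The following is a minimal sketch on a made-up three-category frame; `toy_grouped` and `top_venues` are hypothetical names, and the frequencies are illustrative values, not Foursquare data.

```python
import pandas as pd

# Toy stand-in for shanghai_grouped (values are invented for illustration).
toy_grouped = pd.DataFrame({
    "Neighborhood": ["A", "B"],
    "Coffee Shop": [0.50, 0.10],
    "Hotel": [0.30, 0.20],
    "Park": [0.20, 0.70],
})

def top_venues(grouped, n):
    """Map each neighborhood to a Series of its n highest venue frequencies."""
    freqs = grouped.set_index("Neighborhood")
    return {hood: row.nlargest(n) for hood, row in freqs.iterrows()}

result = top_venues(toy_grouped, 2)
for hood, top in result.items():
    print("----" + hood + "----")
    print(top.round(2))
```

Note that when several categories share the same frequency (as with the four 0.25 entries in some rows), the order among the tied entries depends on the sorting method, so the listing for such neighborhoods should not be over-interpreted.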




In [89]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [109]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # only the first three ranks have a special suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = shanghai_grouped['Neighborhood']

for ind in np.arange(shanghai_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(shanghai_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,万里街道,Coffee Shop,Chinese Restaurant,Fast Food Restaurant,Hotpot Restaurant,Bubble Tea Shop,Bookstore,Grocery Store,Concert Hall,Museum,Noodle House
1,临汾路街道,Chinese Restaurant,Athletics & Sports,Falafel Restaurant,French Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop,Flea Market,Filipino Restaurant
2,五角场街道,Chinese Restaurant,Coffee Shop,Sushi Restaurant,Metro Station,Flea Market,Market,Zhejiang Restaurant,Ice Cream Shop,Hotel,Dumpling Restaurant
3,五里桥街道,Theme Park,Café,Japanese Restaurant,Metro Station,Coffee Shop,Fast Food Restaurant,Yunnan Restaurant,Food Court,Food & Drink Shop,Food
4,仙霞新村街道,Coffee Shop,Japanese Restaurant,BBQ Joint,Park,Chinese Restaurant,Hotel,Cha Chaan Teng,Food & Drink Shop,Food,Flower Shop
5,共和新路街道,Noodle House,Hotel,Coffee Shop,Fast Food Restaurant,Yunnan Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop,Flea Market
6,凉城新村街道,Convenience Store,Japanese Restaurant,Gym,Fast Food Restaurant,Yunnan Restaurant,Farmers Market,Food Court,Food & Drink Shop,Food,Flower Shop
7,凌云路街道,Bus Station,Fast Food Restaurant,Italian Restaurant,Market,Night Market,French Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop
8,北外滩街道,Hotel,Residential Building (Apartment / Condo),Chinese Restaurant,Bookstore,Metro Station,German Restaurant,Bar,Coffee Shop,Shopping Mall,BBQ Joint
9,北新泾街道,Seafood Restaurant,Hostel,Shopping Plaza,Motel,Bus Station,Dive Bar,Dessert Shop,Food Court,Food & Drink Shop,Food
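
The column labels above are built from a three-element suffix list, which is sufficient for ranks 1 to 10 but would label rank 21 as "21th". If `num_top_venues` were ever raised, a general helper could be used instead. This is a minimal sketch; the function name `ordinal` is hypothetical and not part of the notebook.

```python
def ordinal(n: int) -> str:
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

num_top_venues = 10
columns = ["Neighborhood"] + [
    f"{ordinal(rank)} Most Common Venue" for rank in range(1, num_top_venues + 1)
]
print(columns)
```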


### 10. Cluster Areas
Run k-means to cluster the city center areas into 5 clusters.

In [110]:
# set number of clusters
kclusters = 5

shanghai_grouped_clustering = shanghai_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(shanghai_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 4, 4, 2, 2, 0, 0, 0, 2, 2], dtype=int32)
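
The value `kclusters = 5` is chosen by hand. One common way to sanity-check such a choice is to compare silhouette scores for several values of k. The sketch below does this on synthetic data: the array `X` is invented and only stands in for `shanghai_grouped_clustering`, and this check is not part of the original notebook.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the feature matrix: three tight groups of five points.
centers = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
offsets = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (-0.5, 0.0), (0.0, -0.5)]
X = np.array([(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets])

# Fit k-means for each candidate k and record the mean silhouette score.
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The k with the highest score gives the most compact, best-separated clusters.
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

On real venue-frequency data the scores are usually much less clear-cut than on this toy example, so the result is a hint rather than a definitive answer.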

In [111]:
neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,万里街道,Coffee Shop,Chinese Restaurant,Fast Food Restaurant,Hotpot Restaurant,Bubble Tea Shop,Bookstore,Grocery Store,Concert Hall,Museum,Noodle House
1,临汾路街道,Chinese Restaurant,Athletics & Sports,Falafel Restaurant,French Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop,Flea Market,Filipino Restaurant
2,五角场街道,Chinese Restaurant,Coffee Shop,Sushi Restaurant,Metro Station,Flea Market,Market,Zhejiang Restaurant,Ice Cream Shop,Hotel,Dumpling Restaurant
3,五里桥街道,Theme Park,Café,Japanese Restaurant,Metro Station,Coffee Shop,Fast Food Restaurant,Yunnan Restaurant,Food Court,Food & Drink Shop,Food
4,仙霞新村街道,Coffee Shop,Japanese Restaurant,BBQ Joint,Park,Chinese Restaurant,Hotel,Cha Chaan Teng,Food & Drink Shop,Food,Flower Shop
5,共和新路街道,Noodle House,Hotel,Coffee Shop,Fast Food Restaurant,Yunnan Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop,Flea Market
6,凉城新村街道,Convenience Store,Japanese Restaurant,Gym,Fast Food Restaurant,Yunnan Restaurant,Farmers Market,Food Court,Food & Drink Shop,Food,Flower Shop
7,凌云路街道,Bus Station,Fast Food Restaurant,Italian Restaurant,Market,Night Market,French Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop
8,北外滩街道,Hotel,Residential Building (Apartment / Condo),Chinese Restaurant,Bookstore,Metro Station,German Restaurant,Bar,Coffee Shop,Shopping Mall,BBQ Joint
9,北新泾街道,Seafood Restaurant,Hostel,Shopping Plaza,Motel,Bus Station,Dive Bar,Dessert Shop,Food Court,Food & Drink Shop,Food


In [112]:
df_inner.shape

(77, 4)

In [113]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

shanghai_merged = df_inner

shanghai_merged = shanghai_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

shanghai_merged # check the last columns!

Unnamed: 0,Borough,Neighborhood,longitude,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,黄浦,外滩街道,121.484146,31.240357,2.0,French Restaurant,Coffee Shop,Chinese Restaurant,Hotel,Italian Restaurant,Dumpling Restaurant,Shopping Mall,Jazz Club,Restaurant,Café
1,黄浦,南京东路街道,121.471941,31.226107,2.0,Chinese Restaurant,Café,Hotel,Coffee Shop,Cocktail Bar,Spa,American Restaurant,Gastropub,Park,Shopping Mall
2,黄浦,半淞园路街道,121.487103,31.206786,4.0,Chinese Restaurant,Noodle House,Convenience Store,Coffee Shop,Park,Food & Drink Shop,Food,Flower Shop,Flea Market,Filipino Restaurant
3,黄浦,小东门街道,121.501063,31.220083,4.0,Indian Restaurant,Chinese Restaurant,Hotel Bar,Latin American Restaurant,Pool,Shopping Plaza,Flea Market,French Restaurant,Tapas Restaurant,Lounge
4,黄浦,老西门街道,121.486244,31.215171,0.0,Dumpling Restaurant,Shopping Mall,Bookstore,Fast Food Restaurant,Yunnan Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop,Flea Market
5,黄浦,豫园街道,121.486984,31.225396,2.0,Coffee Shop,Dumpling Restaurant,Chinese Restaurant,Buddhist Temple,Bakery,Pedestrian Plaza,Fast Food Restaurant,Shopping Mall,Food Court,Supermarket
6,黄浦,打浦桥街道,121.473506,31.203656,2.0,Fast Food Restaurant,Restaurant,Coffee Shop,Café,BBQ Joint,Supermarket,Chinese Restaurant,Mobile Phone Shop,Theme Park,Shopping Mall
7,黄浦,淮海中路街道,121.475832,31.214941,4.0,Noodle House,Dim Sum Restaurant,Photography Studio,Dumpling Restaurant,Eastern European Restaurant,Vegetarian / Vegan Restaurant,Chinese Restaurant,Gym,Shopping Mall,English Restaurant
8,黄浦,瑞金二路街道,121.466792,31.217426,2.0,Café,Coffee Shop,Cocktail Bar,Vegetarian / Vegan Restaurant,Italian Restaurant,Historic Site,Restaurant,Burger Joint,Soup Place,Shopping Mall
9,黄浦,五里桥街道,121.481785,31.200743,2.0,Theme Park,Café,Japanese Restaurant,Metro Station,Coffee Shop,Fast Food Restaurant,Yunnan Restaurant,Food Court,Food & Drink Shop,Food


In [118]:
shanghai_merged.dropna(axis = 0,inplace=True)
shanghai_merged

Unnamed: 0,Borough,Neighborhood,longitude,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,黄浦,外滩街道,121.484146,31.240357,2.0,French Restaurant,Coffee Shop,Chinese Restaurant,Hotel,Italian Restaurant,Dumpling Restaurant,Shopping Mall,Jazz Club,Restaurant,Café
1,黄浦,南京东路街道,121.471941,31.226107,2.0,Chinese Restaurant,Café,Hotel,Coffee Shop,Cocktail Bar,Spa,American Restaurant,Gastropub,Park,Shopping Mall
2,黄浦,半淞园路街道,121.487103,31.206786,4.0,Chinese Restaurant,Noodle House,Convenience Store,Coffee Shop,Park,Food & Drink Shop,Food,Flower Shop,Flea Market,Filipino Restaurant
3,黄浦,小东门街道,121.501063,31.220083,4.0,Indian Restaurant,Chinese Restaurant,Hotel Bar,Latin American Restaurant,Pool,Shopping Plaza,Flea Market,French Restaurant,Tapas Restaurant,Lounge
4,黄浦,老西门街道,121.486244,31.215171,0.0,Dumpling Restaurant,Shopping Mall,Bookstore,Fast Food Restaurant,Yunnan Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop,Flea Market
5,黄浦,豫园街道,121.486984,31.225396,2.0,Coffee Shop,Dumpling Restaurant,Chinese Restaurant,Buddhist Temple,Bakery,Pedestrian Plaza,Fast Food Restaurant,Shopping Mall,Food Court,Supermarket
6,黄浦,打浦桥街道,121.473506,31.203656,2.0,Fast Food Restaurant,Restaurant,Coffee Shop,Café,BBQ Joint,Supermarket,Chinese Restaurant,Mobile Phone Shop,Theme Park,Shopping Mall
7,黄浦,淮海中路街道,121.475832,31.214941,4.0,Noodle House,Dim Sum Restaurant,Photography Studio,Dumpling Restaurant,Eastern European Restaurant,Vegetarian / Vegan Restaurant,Chinese Restaurant,Gym,Shopping Mall,English Restaurant
8,黄浦,瑞金二路街道,121.466792,31.217426,2.0,Café,Coffee Shop,Cocktail Bar,Vegetarian / Vegan Restaurant,Italian Restaurant,Historic Site,Restaurant,Burger Joint,Soup Place,Shopping Mall
9,黄浦,五里桥街道,121.481785,31.200743,2.0,Theme Park,Café,Japanese Restaurant,Metro Station,Coffee Shop,Fast Food Restaurant,Yunnan Restaurant,Food Court,Food & Drink Shop,Food


In [120]:
shanghai_merged["Cluster Labels"] = shanghai_merged["Cluster Labels"].astype('int64')
shanghai_merged

Unnamed: 0,Borough,Neighborhood,longitude,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,黄浦,外滩街道,121.484146,31.240357,2,French Restaurant,Coffee Shop,Chinese Restaurant,Hotel,Italian Restaurant,Dumpling Restaurant,Shopping Mall,Jazz Club,Restaurant,Café
1,黄浦,南京东路街道,121.471941,31.226107,2,Chinese Restaurant,Café,Hotel,Coffee Shop,Cocktail Bar,Spa,American Restaurant,Gastropub,Park,Shopping Mall
2,黄浦,半淞园路街道,121.487103,31.206786,4,Chinese Restaurant,Noodle House,Convenience Store,Coffee Shop,Park,Food & Drink Shop,Food,Flower Shop,Flea Market,Filipino Restaurant
3,黄浦,小东门街道,121.501063,31.220083,4,Indian Restaurant,Chinese Restaurant,Hotel Bar,Latin American Restaurant,Pool,Shopping Plaza,Flea Market,French Restaurant,Tapas Restaurant,Lounge
4,黄浦,老西门街道,121.486244,31.215171,0,Dumpling Restaurant,Shopping Mall,Bookstore,Fast Food Restaurant,Yunnan Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop,Flea Market
5,黄浦,豫园街道,121.486984,31.225396,2,Coffee Shop,Dumpling Restaurant,Chinese Restaurant,Buddhist Temple,Bakery,Pedestrian Plaza,Fast Food Restaurant,Shopping Mall,Food Court,Supermarket
6,黄浦,打浦桥街道,121.473506,31.203656,2,Fast Food Restaurant,Restaurant,Coffee Shop,Café,BBQ Joint,Supermarket,Chinese Restaurant,Mobile Phone Shop,Theme Park,Shopping Mall
7,黄浦,淮海中路街道,121.475832,31.214941,4,Noodle House,Dim Sum Restaurant,Photography Studio,Dumpling Restaurant,Eastern European Restaurant,Vegetarian / Vegan Restaurant,Chinese Restaurant,Gym,Shopping Mall,English Restaurant
8,黄浦,瑞金二路街道,121.466792,31.217426,2,Café,Coffee Shop,Cocktail Bar,Vegetarian / Vegan Restaurant,Italian Restaurant,Historic Site,Restaurant,Burger Joint,Soup Place,Shopping Mall
9,黄浦,五里桥街道,121.481785,31.200743,2,Theme Park,Café,Japanese Restaurant,Metro Station,Coffee Shop,Fast Food Restaurant,Yunnan Restaurant,Food Court,Food & Drink Shop,Food
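
`df_inner` has 77 rows while `shanghai_grouped` has 74, so the join presumably leaves a few neighborhoods without venue data. That would explain why `Cluster Labels` first appears as floats (`2.0`) and why `dropna` is applied before the cast to integers. A minimal sketch of this behavior with hypothetical toy frames (`toy_inner`, `toy_sorted`):

```python
import pandas as pd

# Hypothetical stand-ins: three neighborhoods, but venue data for only two of them.
toy_inner = pd.DataFrame({"Neighborhood": ["A", "B", "C"]})
toy_sorted = pd.DataFrame({
    "Cluster Labels": [2, 0],
    "Neighborhood": ["A", "C"],
    "1st Most Common Venue": ["Café", "Noodle House"],
})

# Left join on Neighborhood: "B" has no match, so its new columns are NaN.
toy_merged = toy_inner.join(toy_sorted.set_index("Neighborhood"), on="Neighborhood")
dtype_after_join = toy_merged["Cluster Labels"].dtype  # NaN forces a float column

# Drop the unmatched rows, then restore integer labels.
toy_clean = toy_merged.dropna(axis=0).copy()
toy_clean["Cluster Labels"] = toy_clean["Cluster Labels"].astype("int64")
print(toy_clean)
```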


In [121]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(shanghai_merged['latitude'], shanghai_merged['longitude'], shanghai_merged['Neighborhood'], shanghai_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
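
The color scheme in the cell above boils down to picking one evenly spaced color per cluster from matplotlib's `rainbow` colormap. A minimal sketch of that step in isolation, assuming 0-based integer cluster labels; `palette` and `color_by_label` are hypothetical names:

```python
import numpy as np
from matplotlib import cm, colors

n_clusters = 5  # same value as kclusters in the notebook

# One evenly spaced color from the rainbow colormap per cluster, as hex strings.
palette = [colors.rgb2hex(rgba) for rgba in cm.rainbow(np.linspace(0, 1, n_clusters))]

# Look up a marker color by cluster label.
color_by_label = {label: palette[label] for label in range(n_clusters)}
print(color_by_label)
```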

## Result

#### Cluster 1

In [128]:
shanghai_merged.loc[shanghai_merged['Cluster Labels'] == 0,shanghai_merged.columns[[0]+[1] + list(range(5, shanghai_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,黄浦,老西门街道,Dumpling Restaurant,Shopping Mall,Bookstore,Fast Food Restaurant,Yunnan Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop,Flea Market
16,徐汇,漕河泾街道,Chinese Restaurant,Shanghai Restaurant,Hunan Restaurant,Fast Food Restaurant,Food & Drink Shop,Food,Flower Shop,Flea Market,Filipino Restaurant,Farmers Market
17,徐汇,康健新村街道,Fast Food Restaurant,Cosmetics Shop,Coffee Shop,Metro Station,Shopping Mall,Sandwich Place,Chinese Restaurant,Train Station,Bus Station,Japanese Restaurant
18,徐汇,虹梅路街道,Fast Food Restaurant,Park,Hotpot Restaurant,English Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop,Flea Market,Filipino Restaurant
20,徐汇,凌云路街道,Bus Station,Fast Food Restaurant,Italian Restaurant,Market,Night Market,French Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop
27,长宁,周家桥街道,Coffee Shop,Fast Food Restaurant,Malay Restaurant,Hotel,Public Art,Szechuan Restaurant,Noodle House,Tea Room,Shopping Mall,Yunnan Restaurant
28,长宁,虹桥街道,Fast Food Restaurant,Coffee Shop,Asian Restaurant,Yoga Studio,Hotpot Restaurant,Fried Chicken Joint,Gym / Fitness Center,Convenience Store,Seafood Restaurant,Athletics & Sports
42,静安,共和新路街道,Noodle House,Hotel,Coffee Shop,Fast Food Restaurant,Yunnan Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop,Flea Market
44,静安,彭浦新村街道,Restaurant,Campground,Fast Food Restaurant,Night Market,Yunnan Restaurant,Falafel Restaurant,Food & Drink Shop,Food,Flower Shop,Flea Market
61,虹口,凉城新村街道,Convenience Store,Japanese Restaurant,Gym,Fast Food Restaurant,Yunnan Restaurant,Farmers Market,Food Court,Food & Drink Shop,Food,Flower Shop


#### Cluster 2

In [130]:
shanghai_merged.loc[shanghai_merged['Cluster Labels'] == 1, shanghai_merged.columns[[0] +[1]+ list(range(5, shanghai_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
71,杨浦,长白新村街道,Bus Station,Yunnan Restaurant,Fried Chicken Joint,Food Court,Food & Drink Shop,Food,Flower Shop,Flea Market,Filipino Restaurant,Fast Food Restaurant


#### Cluster 3

In [132]:
shanghai_merged.loc[shanghai_merged['Cluster Labels'] == 2, shanghai_merged.columns[[0] +[1]+ list(range(5, shanghai_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,黄浦,外滩街道,French Restaurant,Coffee Shop,Chinese Restaurant,Hotel,Italian Restaurant,Dumpling Restaurant,Shopping Mall,Jazz Club,Restaurant,Café
1,黄浦,南京东路街道,Chinese Restaurant,Café,Hotel,Coffee Shop,Cocktail Bar,Spa,American Restaurant,Gastropub,Park,Shopping Mall
5,黄浦,豫园街道,Coffee Shop,Dumpling Restaurant,Chinese Restaurant,Buddhist Temple,Bakery,Pedestrian Plaza,Fast Food Restaurant,Shopping Mall,Food Court,Supermarket
6,黄浦,打浦桥街道,Fast Food Restaurant,Restaurant,Coffee Shop,Café,BBQ Joint,Supermarket,Chinese Restaurant,Mobile Phone Shop,Theme Park,Shopping Mall
8,黄浦,瑞金二路街道,Café,Coffee Shop,Cocktail Bar,Vegetarian / Vegan Restaurant,Italian Restaurant,Historic Site,Restaurant,Burger Joint,Soup Place,Shopping Mall
9,黄浦,五里桥街道,Theme Park,Café,Japanese Restaurant,Metro Station,Coffee Shop,Fast Food Restaurant,Yunnan Restaurant,Food Court,Food & Drink Shop,Food
10,徐汇,湖南路街道,Café,Cocktail Bar,Chinese Restaurant,Sushi Restaurant,Wine Bar,Convenience Store,Coffee Shop,Shopping Plaza,Bistro,Bookstore
12,徐汇,枫林路街道,Shopping Mall,Coffee Shop,Yunnan Restaurant,New American Restaurant,Movie Theater,Café,Bookstore,Thai Restaurant,Malay Restaurant,Grocery Store
13,徐汇,徐家汇街道,Coffee Shop,Gym / Fitness Center,Grocery Store,Chinese Restaurant,Bar,Japanese Restaurant,Dumpling Restaurant,Convenience Store,Fast Food Restaurant,Athletics & Sports
19,徐汇,田林街道,Coffee Shop,Chinese Restaurant,Fast Food Restaurant,Hotpot Restaurant,Bubble Tea Shop,Bookstore,Grocery Store,Concert Hall,Museum,Noodle House


#### Cluster 4

In [133]:
shanghai_merged.loc[shanghai_merged['Cluster Labels'] == 3, shanghai_merged.columns[[0] +[1]+ list(range(5, shanghai_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
38,静安,天目西路街道,Food & Drink Shop,Yunnan Restaurant,Deli / Bodega,French Restaurant,Food Court,Food,Flower Shop,Flea Market,Filipino Restaurant,Fast Food Restaurant


#### Cluster 5

In [136]:
shanghai_merged.loc[shanghai_merged['Cluster Labels'] == 4, shanghai_merged.columns[[0] +[1]+ list(range(5, shanghai_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,黄浦,半淞园路街道,Chinese Restaurant,Noodle House,Convenience Store,Coffee Shop,Park,Food & Drink Shop,Food,Flower Shop,Flea Market,Filipino Restaurant
3,黄浦,小东门街道,Indian Restaurant,Chinese Restaurant,Hotel Bar,Latin American Restaurant,Pool,Shopping Plaza,Flea Market,French Restaurant,Tapas Restaurant,Lounge
7,黄浦,淮海中路街道,Noodle House,Dim Sum Restaurant,Photography Studio,Dumpling Restaurant,Eastern European Restaurant,Vegetarian / Vegan Restaurant,Chinese Restaurant,Gym,Shopping Mall,English Restaurant
11,徐汇,天平路街道,Chinese Restaurant,Bar,Park,Dumpling Restaurant,Taiwanese Restaurant,Massage Studio,Food,Convenience Store,Japanese Restaurant,Hotel
14,徐汇,斜土路街道,Hotel,Pet Store,Asian Restaurant,Chinese Restaurant,Yunnan Restaurant,Falafel Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop
21,徐汇,龙华街道,Airport,Metro Station,Chinese Restaurant,Shopping Plaza,Yunnan Restaurant,Farmers Market,Food Court,Food & Drink Shop,Food,Flower Shop
23,长宁,华阳路街道,Chinese Restaurant,Noodle House,Gym / Fitness Center,Asian Restaurant,Szechuan Restaurant,Motel,Salad Place,French Restaurant,Dumpling Restaurant,Eastern European Restaurant
41,静安,芷江西路街道,Supermarket,Karaoke Bar,Bridal Shop,Metro Station,Gym,Chinese Restaurant,Hotel,Dongbei Restaurant,Filipino Restaurant,Food Court
43,静安,大宁路街道,Gym,Concert Hall,Coffee Shop,Chinese Restaurant,Theater,Yunnan Restaurant,Food & Drink Shop,Food,Flower Shop,Flea Market
45,静安,临汾路街道,Chinese Restaurant,Athletics & Sports,Falafel Restaurant,French Restaurant,Food Court,Food & Drink Shop,Food,Flower Shop,Flea Market,Filipino Restaurant


## Discussion

**The neighborhoods in Cluster 3 are dominated by cafés, coffee shops, cocktail bars, and Western restaurants, a venue mix close to the French Concession style. Visitors to the French Concession can therefore add these neighborhoods to the same trip.**

*Cluster 1, by contrast, is closer in style to traditional Chinese residential areas. These neighborhoods are a good place to see how ordinary Shanghai residents live.*
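
These readings are based on inspecting the tables by eye. One way to make them more systematic is to average the venue frequencies within each cluster and list the leading categories. The sketch below uses invented numbers; `toy`, `toy_labels`, and `top_per_cluster` are hypothetical names and the values do not come from the Foursquare data.

```python
import pandas as pd

# Toy venue-frequency table (invented values) and toy cluster labels.
toy = pd.DataFrame({
    "Neighborhood": ["A", "B", "C", "D"],
    "Café": [0.4, 0.5, 0.0, 0.1],
    "Cocktail Bar": [0.3, 0.2, 0.0, 0.0],
    "Fast Food Restaurant": [0.1, 0.1, 0.6, 0.5],
    "Noodle House": [0.2, 0.2, 0.4, 0.4],
})
toy_labels = [2, 2, 0, 0]

# Mean frequency of each venue category within each cluster.
features = toy.drop(columns="Neighborhood")
features["Cluster Labels"] = toy_labels
profile = features.groupby("Cluster Labels").mean()

# The two categories with the highest mean frequency in each cluster.
top_per_cluster = {
    label: row.nlargest(2).index.tolist() for label, row in profile.iterrows()
}
print(top_per_cluster)
```

Applied to `shanghai_grouped_clustering` together with `kmeans.labels_`, the same grouping would show which categories actually drive each cluster. Clusters containing a single neighborhood (Clusters 2 and 4 in the Result tables) simply reproduce that neighborhood's own profile.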