# 1. INTRODUCTION
There are many cities in China, and each city has different cultures and economies. Next, I will screen the major cities in China for cluster analysis to see which cities are similar. I will use the Foursquare API to explore the most common venue categories in each city and then use this feature to group cities. I will use the k-means clustering algorithm to accomplish this task. Finally, I will use the Folium library to visualize Chinese cities and their emerging clusters.

# 2.DATA OVERVIEW 
I have selected 36 major cities in China. Below are the names of these cities and the geographic coordinates of the city center. I will explore the most common site categories within 10 km of the city center and cluster analysis.

## Libraries, Packages

In [1]:
import pandas as pd
import numpy as np
from geopy.geocoders import Nominatim 
from sklearn.cluster import KMeans
import folium
import requests
import matplotlib.cm as cm
import matplotlib.colors as colors

In [2]:
data = pd.read_csv("China_Major_City.csv")
data

Unnamed: 0,City,Latitude,Longitude
0,Beijing,39.91667,116.41667
1,Shanghai,31.23,121.43333
2,Tianjin,39.13333,117.2
3,Hong Kong,22.2,114.1
4,Guangzhou,23.16667,113.23333
5,Shenzhen,22.61667,114.06667
6,Zhuhai,22.3,113.51667
7,Hangzhou,30.26667,120.2
8,Chongqing,29.56667,106.45
9,Qingdao,36.06667,120.33333


In [3]:
data.shape

(36, 3)

# 3.METHODOLOGY
k-means clustering.
Folium library to visualize Chinese cities and their emerging clusters.

## Define Foursquare Credentials and Version

In [13]:
CLIENT_ID = 'JVM13LVVCYCVWKMRBYTINDRTAI1MSICS0TAPF5SZMG35IPCO' 
CLIENT_SECRET = '3EJ202WMSYDJYIYRLOBGR2WQZDETEYAHWGUYPDDLHWYJJBCS' 
VERSION = '20180605'
radius = 5000
LIMIT = 300

## Query geographic coordinates

In [6]:
address = 'China'

geolocator = Nominatim()
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Manhattan are {}, {}.'.format(latitude, longitude))



The geograpical coordinate of Manhattan are 35.000074, 104.999927.


## Create a map of China with cities superimposed on top

In [14]:
map_china = folium.Map(location=[latitude, longitude], zoom_start=4)

for lat, lng, city in zip(data['Latitude'], data['Longitude'],data['City']):
    label = '{}'.format(city)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_china)  
    
map_china

##  Explore Cities in China

In [15]:
def getNearbyVenues(names, latitudes, longitudes, radius=10000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

### Explore each city in china

In [16]:
china_city_venues = getNearbyVenues(names=data['City'],
                                   latitudes=data['Latitude'],
                                   longitudes=data['Longitude']
                                  )

Beijing
Shanghai
Tianjin
Hong Kong
Guangzhou
Shenzhen
Zhuhai
Hangzhou
Chongqing
Qingdao
Xiamen
Fuzhou
Lanzhou
Guiyang
Changsha
Nanjing
Nanchang
Shenyang
Taiyuan
Chengdu
Lhasa
Urumqi
Kunming
Xi'an
Xining
Yinchuan
Hohhot
Harbin
Changchun
Wuhan
Zhengzhou
Shijiazhuang
Sanya
Haikou
Macao
Taipei


In [17]:
china_city_venues.rename(columns={'Neighborhood':'City'},inplace=True)
china_city_venues.head()

Unnamed: 0,City,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Beijing,39.91667,116.41667,Din Tai Fung (鼎泰丰),39.91363,116.405766,Dumpling Restaurant
1,Beijing,39.91667,116.41667,The Peninsula Beijing (王府半岛酒店),39.914167,116.410192,Hotel
2,Beijing,39.91667,116.41667,Duck de Chine 全鸭季,39.913152,116.414793,Chinese Restaurant
3,Beijing,39.91667,116.41667,1949 全鴨季 (金寶街),39.91317,116.414699,Beijing Restaurant
4,Beijing,39.91667,116.41667,The Grandma's (外婆家),39.915184,116.411919,Zhejiang Restaurant


In [18]:
china_city_venues.shape

(1905, 7)

### Let's check how many venues were returned for each city

In [19]:
china_city_venues.groupby('City').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
City,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Beijing,100,100,100,100,100,100
Changchun,23,23,23,23,23,23
Changsha,45,45,45,45,45,45
Chengdu,100,100,100,100,100,100
Chongqing,36,36,36,36,36,36
Fuzhou,30,30,30,30,30,30
Guangzhou,100,100,100,100,100,100
Guiyang,11,11,11,11,11,11
Haikou,16,16,16,16,16,16
Hangzhou,100,100,100,100,100,100


Let's find out how many unique categories can be curated from all the returned venues

In [20]:
print('There are {} uniques categories.'.format(len(china_city_venues['Venue Category'].unique())))

There are 235 uniques categories.


## Analyze Each City

In [21]:
# one hot encoding
china_city_onehot = pd.get_dummies(china_city_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
china_city_onehot['City'] = data['City'] 

# move neighborhood column to the first column
index = china_city_onehot.columns.get_loc("City")
fixed_columns = [china_city_onehot.columns[int(index)]] + list(china_city_onehot.columns[:int(index)]) + list(china_city_onehot.columns[int(index)+1:china_city_onehot.shape[1]])
china_city_onehot = china_city_onehot[fixed_columns]

china_city_onehot.head(20)

Unnamed: 0,City,Airport,Airport Lounge,American Restaurant,Aquarium,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,...,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Xinjiang Restaurant,Yoga Studio,Yunnan Restaurant,Zhejiang Restaurant,Zoo,Zoo Exhibit
0,Beijing,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Shanghai,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Tianjin,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Hong Kong,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Guangzhou,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,1,0,0
5,Shenzhen,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
6,Zhuhai,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
7,Hangzhou,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
8,Chongqing,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
9,Qingdao,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


### group rows by city and by taking the mean of the frequency of occurrence of each category

In [22]:
city_grouped = china_city_onehot.groupby('City').mean().reset_index()
city_grouped

Unnamed: 0,City,Airport,Airport Lounge,American Restaurant,Aquarium,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,...,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Xinjiang Restaurant,Yoga Studio,Yunnan Restaurant,Zhejiang Restaurant,Zoo,Zoo Exhibit
0,Beijing,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Changchun,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Changsha,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Chengdu,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Chongqing,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5,Fuzhou,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
6,Guangzhou,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,1,0,0
7,Guiyang,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
8,Haikou,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
9,Hangzhou,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


### print each cities along with the top 5 most common venues

In [23]:
num_top_venues = 5

for hood in city_grouped['City']:
    print("----"+hood+"----")
    temp = city_grouped[city_grouped['City'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Beijing----
                 venue  freq
0  Dumpling Restaurant   1.0
1                  Pub   0.0
2                 Park   0.0
3          Pastry Shop   0.0
4     Pedestrian Plaza   0.0


----Changchun----
                 venue  freq
0  Dumpling Restaurant   1.0
1                  Pub   0.0
2                 Park   0.0
3          Pastry Shop   0.0
4     Pedestrian Plaza   0.0


----Changsha----
                           venue  freq
0                           Park   1.0
1                        Airport   0.0
2          Portuguese Restaurant   0.0
3  Paper / Office Supplies Store   0.0
4                    Pastry Shop   0.0


----Chengdu----
                           venue  freq
0                  Historic Site   1.0
1                        Airport   0.0
2              Outdoor Sculpture   0.0
3  Paper / Office Supplies Store   0.0
4                           Park   0.0


----Chongqing----
                   venue  freq
0      French Restaurant   1.0
1                Airport   0.

### put that into a pandas dataframe

In [24]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [25]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['City']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
city_venues_sorted = pd.DataFrame(columns=columns)
city_venues_sorted['City'] = city_grouped['City']

for ind in np.arange(city_grouped.shape[0]):
    city_venues_sorted.iloc[ind, 1:] = return_most_common_venues(city_grouped.iloc[ind, :], num_top_venues)

city_venues_sorted

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Beijing,Dumpling Restaurant,Zoo Exhibit,Flower Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant
1,Changchun,Dumpling Restaurant,Zoo Exhibit,Flower Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant
2,Changsha,Park,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
3,Chengdu,Historic Site,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
4,Chongqing,French Restaurant,Zoo Exhibit,Flower Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,Fountain
5,Fuzhou,Motel,Zoo Exhibit,Gourmet Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant
6,Guangzhou,Zhejiang Restaurant,Zoo Exhibit,Flower Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant
7,Guiyang,Park,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
8,Haikou,Historic Site,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
9,Hangzhou,Hotel,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain


## Cluster Cities

In [26]:
# set number of clusters
kclusters = 3

city_grouped_clustering = city_grouped.drop('City', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(city_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 0, 2, 0, 0, 0, 0, 2, 1])

In [27]:
city_merged = data

# add clustering labels
city_merged['Cluster Labels'] = kmeans.labels_

# merge toronto_grouped with toronto_data to add latitude/longitude for each neighborhood
city_merged = city_merged.join(city_venues_sorted.set_index('City'), on='City')

city_merged.head() # check the last columns!

Unnamed: 0,City,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Beijing,39.91667,116.41667,0,Dumpling Restaurant,Zoo Exhibit,Flower Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant
1,Shanghai,31.23,121.43333,0,Hotel,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
2,Tianjin,39.13333,117.2,0,Chinese Restaurant,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
3,Hong Kong,22.2,114.1,2,Beijing Restaurant,Zoo Exhibit,Gourmet Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant
4,Guangzhou,23.16667,113.23333,0,Zhejiang Restaurant,Zoo Exhibit,Flower Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant


In [28]:
city_merged.shape

(36, 14)

### Finally, let's visualize the resulting clusters

In [29]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=4)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(city_merged['Latitude'], city_merged['Longitude'], city_merged['City'], city_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

# 4. RESULTS

Cluster１

In [35]:
city_merged.loc[city_merged['Cluster Labels'] == 0, city_merged.columns[[0] + list(range(4, city_merged.shape[1]))]]

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Beijing,Dumpling Restaurant,Zoo Exhibit,Flower Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant
1,Shanghai,Hotel,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
2,Tianjin,Chinese Restaurant,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
4,Guangzhou,Zhejiang Restaurant,Zoo Exhibit,Flower Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant
5,Shenzhen,Hotel,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
6,Zhuhai,Church,Zoo Exhibit,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain,Food Truck
7,Hangzhou,Hotel,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
10,Xiamen,Hotel,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
11,Fuzhou,Motel,Zoo Exhibit,Gourmet Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant
12,Lanzhou,Shopping Mall,Zoo Exhibit,Flea Market,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain


Cluster2

In [33]:
city_merged.loc[city_merged['Cluster Labels'] == 1, city_merged.columns[[0] + list(range(4, city_merged.shape[1]))]]

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,Qingdao,Hotel,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
16,Nanchang,History Museum,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
19,Chengdu,Historic Site,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
21,Urumqi,Historic Site,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
23,Xi'an,Historic Site,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
26,Hohhot,Shopping Mall,Zoo Exhibit,Flea Market,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
31,Shijiazhuang,Peking Duck Restaurant,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
32,Sanya,Brewery,Flower Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain


Cluster3

In [34]:
city_merged.loc[city_merged['Cluster Labels'] == 2, city_merged.columns[[0] + list(range(4, city_merged.shape[1]))]]

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Hong Kong,Beijing Restaurant,Zoo Exhibit,Gourmet Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant
8,Chongqing,French Restaurant,Zoo Exhibit,Flower Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,Fountain
15,Nanjing,Electronics Store,Zoo Exhibit,Flower Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant
28,Changchun,Dumpling Restaurant,Zoo Exhibit,Flower Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant
29,Wuhan,Historic Site,Zoo Exhibit,Flower Shop,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant,Fountain
30,Zhengzhou,Yunnan Restaurant,Zoo Exhibit,Flower Shop,German Restaurant,Gay Bar,Gastropub,Garden,Furniture / Home Store,Fujian Restaurant,French Restaurant


# 5. DISCUSSION

Cluster 1 consists of 22 cities, which are the following cities:<br>
1.Beijing<br>
2.Shanghai<br>
3.Tianjin<br>
4.Guangzhou<br>
5.Shenzhen<br>
6.Zhuhai<br>
7.Hangzhou<br>
8.Xiamen<br>
9.Fuzhou<br>
10.Lanzhou<br>
11.Guiyang<br>
12.Changsha<br>
13.Shenyang<br>
14.Taiyuan<br>
15.Lhasa<br>
16.Kunming<br>
17.Xining<br>
18.Yinchuan<br>
19.Harbin<br>
20.Haikou<br>
21.Macao<br>
22.Taipei<br>
Most China cities are this Cluster<br>
<br>
<br>
Cluster 2 consists of 8 cities, which are the following cities:<br>
1.Qingdao<br>
2.Nanchang<br>
3.Chengdu<br>
4.Urumqi<br>
5.Xi'an	<br>
6.Hohhot<br>
7.Shijiazhuang<br>
8.Sanya<br>
<br>
<br>
Cluster 3 consists of 6 cities, which are the following cities:<br>
1.Hong Kong<br>
2.Chongqing<br>
3.Nanjing<br>
4.Changchun<br>
5.Wuhan	<br>
6.Zhengzhou<br>

# 6. Conclusion

This project divides China's major cities into three categories. Although the data is biased, the results reflect to some extent the differences in consumption patterns of different urban agglomerations. Coastal cities have relatively developed economies and similar consumption habits. There are only a few economically developed cities in the interior that are similar to the consumption habits of coastal cities. A few cities in the central region have similar levels of economic development, and some consumption habits are similar.