# Clustering Similar African Cities using the Foursquare API


<img src="https://upload.wikimedia.org/wikipedia/commons/8/8a/Africa-regions.png" width="380" align="center">

## Clustering Similar African Cities using the Foursquare API: Introduction and Data


## Introduction


Africa is a diverse region, home to many cultures, histories and peoples. This diversity is also visible in the standard of living and economic growth across its different parts. In this project we use the locations of major African cities and the Foursquare API to find and cluster African cities that resemble one another.



By clustering similar cities, it becomes possible to enhance travel and tourism recommendations. This allows tourists to plan their trips better and get a 'feeling' of what to expect when visiting Africa. Secondly, Africa is an emerging region with abundant natural resources and a young population. This presents an opportunity for investors looking for higher returns and strong market positions or monopolies. With the clustering produced in this project, tourists, investors and others can better understand African cities, what they have to offer and the opportunities arising there.


<img src="https://upload.wikimedia.org/wikipedia/commons/5/50/Victoria_Falls_gorge1.jpg" width="350" align="center">

## Data


Our location data will be retrieved from Simple Maps and Wikipedia. The data provides the latitude and longitude of most cities. These locations will then be passed to the Foursquare API to retrieve information on venues around them. With the venue data and the location data we will then run a k-means algorithm to group cities with similar venues. The Folium library will be used to plot the resulting clusters.


The clusters will be coloured and clearly labelled for peer review. Secondly, Gross Domestic Product at purchasing power parity, GDP (PPP), and population figures will be used to compute GDP (PPP) per capita, a reasonable proxy for the standard of living in these countries.

Thus the first data set, from Simple Maps, will have columns for country, city, latitude and longitude. We shall then use a list of African countries retrieved from Wikipedia to filter the large list from Simple Maps. With this list we shall collect the locations of 150 venues within a 10 km radius of each city, which is possible with the Foursquare API. Finally, the complete list will contain the city, country, latitude, longitude, the 10 most common venues in each city and the GDP (PPP) per capita.

## Methodology

The goal of this project is to cluster similar African capital cities based on their per capita incomes and social life.

Firstly, the data for the African capitals is collected from Simple Maps, Foursquare and Wikipedia. This information includes locations, population, GDP corrected for purchasing power parity and venue data from Foursquare.

Secondly, the required columns are computed, which include GDP per capita and the 10 most popular venues in each city.

Lastly, similar African cities are clustered based on the popular venues and the per capita income.
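
The clustering step described above can be sketched on a few made-up rows. This is only an illustration, not the notebook's actual pipeline: the toy cities, the two venue columns, the income values, the use of `StandardScaler` and `n_clusters=2` are all assumptions. Scaling is included because per capita income is measured in thousands of dollars while venue frequencies lie between 0 and 1, so unscaled income would dominate the Euclidean distances that k-means relies on.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: venue frequencies per city plus per capita income.
toy = pd.DataFrame({
    'city': ['A', 'B', 'C', 'D'],
    'Hotel': [0.30, 0.28, 0.05, 0.04],
    'Café': [0.05, 0.06, 0.20, 0.22],
    'per_cap_inc': [2000.0, 2200.0, 14000.0, 15000.0],
})

# Drop the label column and put every feature on a comparable scale.
features = toy.drop(columns='city')
scaled = StandardScaler().fit_transform(features)

# Fit k-means; random_state is fixed only to make the sketch repeatable.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scaled)
toy['cluster'] = kmeans.labels_
print(toy[['city', 'cluster']])
```

In this toy case cities A and B share one label and C and D the other; which group receives label 0 is arbitrary.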





## Analysis



We begin by importing the libraries needed for this analysis.

In [143]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


##  1.2 Data retrieval

We then use pandas to load our data: the city database from a local CSV file and the GDP table straight from Wikipedia.

#### The global cities database

In [240]:
Wcity = pd.read_csv('worldcities.csv')
# keep only national capitals
Wcity = Wcity[Wcity['capital']=='primary']

In [241]:
# select the country, city, lat, lng and capital columns by position
City_coord = Wcity.iloc[:,[4,0,2,3,8]]

#### Africa GDP (PPP) and Population

In [149]:
W_pop = pd.read_html('https://en.wikipedia.org/wiki/List_of_African_countries_by_GDP_(PPP)')

In [159]:
a_gdp_ppp_pc = W_pop[0]

In [161]:
C_names = a_gdp_ppp_pc.iloc[0]

In [162]:
C_names

0                                           RegionRank
1                                              Country
2    Peak value of GDP (PPP) as of 2019Billions of ...
3                                            Peak Year
Name: 0, dtype: object

In [163]:
a_gdp_ppp_pc.columns = C_names

In [166]:
afr_income = a_gdp_ppp_pc.iloc[2:,]

In [168]:
afr_income.head(1)

Unnamed: 0,RegionRank,Country,Peak value of GDP (PPP) as of 2019Billions of International dollars,Peak Year
2,1,Egypt,1391.734,2019
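
The header handling in the cells above can be illustrated on a tiny table. The sketch below is a simplified stand-in, assuming a parsed table whose first row holds the column names; the `raw` frame and its shortened column names are hypothetical, while the two data rows reuse figures shown in this notebook. It also converts the GDP column to numbers, since cells parsed this way start out as strings.

```python
import pandas as pd

# Hypothetical stand-in for a table that was parsed without a header row.
raw = pd.DataFrame([
    ['RegionRank', 'Country', 'GDP'],
    ['1', 'Egypt', '1391.734'],
    ['3', 'South Africa', '813.1'],
])

header = raw.iloc[0]             # first row holds the column names
table = raw.iloc[1:].copy()      # remaining rows are the data
table.columns = list(header)
table = table.reset_index(drop=True)

# Parsed cells are strings, so convert the numeric column explicitly.
table['GDP'] = pd.to_numeric(table['GDP'])
print(table)
```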


In [174]:
population = pd.read_csv('population.csv', encoding = 'latin1')

In [176]:
popul = population.iloc[:,[0,5]]

In [187]:
popul.columns

Index(['Region, subregion, country or area *', '2018'], dtype='object')

#### The list of African Countries

In [242]:
Df = City_coord.merge(popul, how = 'inner', left_on = 'country', right_on = 'Region, subregion, country or area *')
Df.head(2)

Unnamed: 0,country,city,lat,lng,capital,"Region, subregion, country or area *",2018
0,South Africa,Pretoria,-25.7069,28.2294,primary,South Africa,57792.518
1,South Africa,Bloemfontein,-29.12,26.2299,primary,South Africa,57792.518


Finally, we merge the two datasets.

In [243]:
df1 = Df.merge(afr_income, how = 'inner', left_on = 'country', right_on = 'Country')
df1.head(1)

Unnamed: 0,country,city,lat,lng,capital,"Region, subregion, country or area *",2018,RegionRank,Country,Peak value of GDP (PPP) as of 2019Billions of International dollars,Peak Year
0,South Africa,Pretoria,-25.7069,28.2294,primary,South Africa,57792.518,3,South Africa,813.1,2019


In [196]:
df1.columns

Index(['country', 'city', 'lat', 'lng', 'capital',
       'Region, subregion, country or area *', '2018', 'RegionRank', 'Country',
       'Peak value of GDP (PPP) as of 2019Billions of International dollars',
       'Peak Year'],
      dtype='object')

In [244]:
df1['per_cap_inc']=df1['Peak value of GDP (PPP) as of 2019Billions of International dollars'].astype('float64')*1000000/df1['2018']

In [245]:
df1.head(1)

Unnamed: 0,country,city,lat,lng,capital,"Region, subregion, country or area *",2018,RegionRank,Country,Peak value of GDP (PPP) as of 2019Billions of International dollars,Peak Year,per_cap_inc
0,South Africa,Pretoria,-25.7069,28.2294,primary,South Africa,57792.518,3,South Africa,813.1,2019,14069.295268
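
The unit conversion behind `per_cap_inc` is worth spelling out. The check below uses the South Africa row and assumes, as the magnitudes suggest, that GDP is given in billions of international dollars and population in thousands of people. Under that assumption, multiplying by 1,000,000 and dividing by the population column is the same as dividing total dollars by total people.

```python
# Values taken from the South Africa row of the merged table.
gdp_ppp_billions = 813.1          # billions of international dollars
population_thousands = 57792.518  # thousands of people (assumed unit)

# Convert both quantities to base units before dividing.
gdp_dollars = gdp_ppp_billions * 1e9
population = population_thousands * 1e3
per_cap_inc = gdp_dollars / population

# Shortcut used in the notebook: billions * 1,000,000 / thousands.
shortcut = gdp_ppp_billions * 1_000_000 / population_thousands

print(round(per_cap_inc, 2), round(shortcut, 2))
```

Both expressions give roughly 14,069 international dollars per person, in line with the value in the table.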


In [246]:
africa_data = df1.iloc[:,[1,2,3,4,8,11]]

In [247]:
africa_data.tail()

Unnamed: 0,city,lat,lng,capital,Country,per_cap_inc
44,Conakry,9.5315,-13.6802,primary,Guinea,2306.776552
45,Malabo,3.75,8.7833,primary,Equatorial Guinea,28792.757692
46,Bissau,11.865,-15.5984,primary,Guinea-Bissau,1806.004685
47,Nairobi,-1.2833,36.8167,primary,Kenya,3715.907155
48,Moroni,-11.7042,43.2402,primary,Comoros,1662.817996


We now have the data in the preferred format.

## 2.0 Building our Map with the Folium Library and Foursquare API


In [214]:
address = 'Africa'

geolocator = Nominatim(user_agent="Africa_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Africa are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Africa are 11.5024338, 17.7578122.


In [248]:
map_africa = folium.Map(location=[latitude, longitude], zoom_start=3)

# add markers to map
for lat, lng, label in zip(africa_data['lat'], africa_data['lng'], africa_data['city']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_africa)  
    
map_africa

The map shows the distribution of the capital cities.

In [254]:
CLIENT_ID = 'YOUR_CLIENT_ID' # replace with your Foursquare client ID; never publish real credentials
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # replace with your Foursquare client secret
VERSION = '20180605' # Foursquare API version

print('My credentials are:')
print('CLIENT_ID: ' + 'Some random code')
print('CLIENT_SECRET: ' + 'A Secret')
  

My credentials are:
CLIENT_ID: Some random code
CLIENT_SECRET: A Secret
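
Hard-coded API keys are easy to leak when a notebook is shared. One common alternative is to read them from environment variables, as sketched here. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` are hypothetical and would need to be set in the shell before starting the notebook.

```python
import os

# Hypothetical environment variable names; an empty string means "not set".
CLIENT_ID = os.environ.get('FOURSQUARE_CLIENT_ID', '')
CLIENT_SECRET = os.environ.get('FOURSQUARE_CLIENT_SECRET', '')
VERSION = '20180605'  # Foursquare API version, as used in this notebook

if not CLIENT_ID or not CLIENT_SECRET:
    print('Foursquare credentials are not set.')
else:
    # Report only the length so the secret never appears in the output.
    print('Credentials loaded ({} character client ID).'.format(len(CLIENT_ID)))
```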


A function to find nearby venues

In [283]:
# This function finds venues around each of the given cities
def getNearbyVenues(names, latitudes, longitudes, radius=9000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['city', 
                  'city Latitude', 
                  'city Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
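
Two parts of the function above can be tried without calling the API: building the request URL and pulling fields out of the response. The sketch below uses only the standard library. The helper names `build_explore_url` and `extract_venues` are hypothetical, and the `sample` payload is hand-written to imitate only the keys that `getNearbyVenues` indexes into; it is not real Foursquare output. Unlike the original list comprehension, the parser tolerates a venue with no categories.

```python
from urllib.parse import urlencode

def build_explore_url(lat, lng, client_id, client_secret, version,
                      radius=9000, limit=30):
    """Build the explore URL with properly encoded query parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    return 'https://api.foursquare.com/v2/venues/explore?' + urlencode(params)

def extract_venues(city, payload):
    """Return (city, venue name, category) rows from a response-shaped dict."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Fall back to None when a venue has no category at all.
        category = categories[0]['name'] if categories else None
        rows.append((city, venue['name'], category))
    return rows

# Hand-written payload for illustration only.
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Burger Bistro', 'categories': [{'name': 'Burger Joint'}]}},
    {'venue': {'name': 'Unnamed Spot', 'categories': []}},
]}]}}

url = build_explore_url(-25.7069, 28.2294, 'ID', 'SECRET', '20180605')
rows = extract_venues('Pretoria', sample)
print(url)
print(rows)
```

Passing the parameters through `urlencode` escapes the comma in the `ll` value and avoids assembling the query string by hand.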

In [279]:
LIMIT = 30

In [284]:

africa_venues = getNearbyVenues(names=africa_data['city'],
                                   latitudes=africa_data['lat'],
                                   longitudes=africa_data['lng']
                                  )

Pretoria
Bloemfontein
Cape Town
Lusaka
Harare
Monrovia
Maseru
Tripoli
Rabat
Antananarivo
Bamako
Nouakchott
Port Louis
Lilongwe
Maputo
Windhoek
Niamey
Abuja
Kigali
Victoria
Khartoum
Freetown
Dakar
Mogadishu
Juba
Ndjamena
Lomé
Tunis
Kampala
Luanda
Ouagadougou
Bujumbura
Porto-Novo
Cotonou
Gaborone
Bangui
Yaounde
Djibouti
Algiers
Cairo
Asmara
Addis Ababa
Libreville
Accra
Conakry
Malabo
Bissau
Nairobi
Moroni


In [285]:
print(africa_venues.shape)
africa_venues.head()

(1064, 7)


Unnamed: 0,city,city Latitude,city Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Pretoria,-25.7069,28.2294,Burger Bistro,-25.722152,28.227975,Burger Joint
1,Pretoria,-25.7069,28.2294,Brewers BBQ,-25.703939,28.240994,BBQ Joint
2,Pretoria,-25.7069,28.2294,Café 41,-25.744691,28.222439,Gastropub
3,Pretoria,-25.7069,28.2294,Fruit Stop,-25.718634,28.205505,Farmers Market
4,Pretoria,-25.7069,28.2294,Royal Danish Icecream,-25.742076,28.242174,Ice Cream Shop


In [286]:
africa_venues.groupby('city').count()

city,city Latitude,city Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Abuja,30,30,30,30,30,30
Accra,30,30,30,30,30,30
Addis Ababa,30,30,30,30,30,30
Algiers,30,30,30,30,30,30
Antananarivo,30,30,30,30,30,30
Asmara,4,4,4,4,4,4
Bamako,17,17,17,17,17,17
Bangui,6,6,6,6,6,6
Bissau,10,10,10,10,10,10
Bloemfontein,30,30,30,30,30,30


In [287]:
print('There are {} unique categories.'.format(len(africa_venues['Venue Category'].unique())))

There are 160 unique categories.


In [288]:
# one hot encoding
africa_onehot = pd.get_dummies(africa_venues[['Venue Category']], prefix="", prefix_sep="")

# add city column back to dataframe
africa_onehot['city'] = africa_venues['city'] 

# move city column to the first column
fixed_columns = [africa_onehot.columns[-1]] + list(africa_onehot.columns[:-1])
africa_onehot = africa_onehot[fixed_columns]

africa_onehot.tail()

Unnamed: 0,city,African Restaurant,Airport,Airport Lounge,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,BBQ Joint,Bakery,Bar,Basketball Court,Beach,Bed & Breakfast,Beer Garden,Big Box Store,Bistro,Boarding House,Boat or Ferry,Bookstore,Border Crossing,Brazilian Restaurant,Breakfast Spot,Brewery,Building,Burger Joint,Café,Candy Store,Casino,Caucasian Restaurant,Chinese Restaurant,Church,City Hall,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comfort Food Restaurant,Concert Hall,Convenience Store,Creperie,Cricket Ground,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Diner,Dive Bar,Eastern European Restaurant,Ethiopian Restaurant,Fabric Shop,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market,Food,Food & Drink Shop,Food Court,Food Service,Football Stadium,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gas Station,Gastropub,German Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Halal Restaurant,Harbor / Marina,Historic Site,History Museum,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotel Pool,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Latin American Restaurant,Lebanese Restaurant,Lighthouse,Lounge,Market,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Moroccan Restaurant,Mosque,Motel,Movie Theater,Multiplex,Museum,Music Venue,Neighborhood,Nightclub,Optical Shop,Other Great Outdoors,Other Nightlife,Outdoor Sculpture,Park,Pastry Shop,Performing Arts Venue,Pier,Pizza Place,Playground,Plaza,Pool,Port,Portuguese Restaurant,Pub,Racetrack,Resort,Restaurant,Roof Deck,Salad Place,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shopping Mall,Snack Place,Soccer Field,Soccer Stadium,Spa,Spanish Restaurant,Sports Bar,Sports Club,Steakhouse,Supermarket,Surf Spot,Sushi Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Tennis Stadium,Thai Restaurant,Theater,Track Stadium,Trail,Train Station,Turkish Restaurant,Wine Bar,Wings Joint
1059,Moroni,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1060,Moroni,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1061,Moroni,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1062,Moroni,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1063,Moroni,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [289]:
africa_grouped = africa_onehot.groupby('city').mean().reset_index()
africa_grouped.head()

Unnamed: 0,city,African Restaurant,Airport,Airport Lounge,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,BBQ Joint,Bakery,Bar,Basketball Court,Beach,Bed & Breakfast,Beer Garden,Big Box Store,Bistro,Boarding House,Boat or Ferry,Bookstore,Border Crossing,Brazilian Restaurant,Breakfast Spot,Brewery,Building,Burger Joint,Café,Candy Store,Casino,Caucasian Restaurant,Chinese Restaurant,Church,City Hall,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comfort Food Restaurant,Concert Hall,Convenience Store,Creperie,Cricket Ground,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Diner,Dive Bar,Eastern European Restaurant,Ethiopian Restaurant,Fabric Shop,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market,Food,Food & Drink Shop,Food Court,Food Service,Football Stadium,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gas Station,Gastropub,German Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Halal Restaurant,Harbor / Marina,Historic Site,History Museum,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotel Pool,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Latin American Restaurant,Lebanese Restaurant,Lighthouse,Lounge,Market,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Moroccan Restaurant,Mosque,Motel,Movie Theater,Multiplex,Museum,Music Venue,Neighborhood,Nightclub,Optical Shop,Other Great Outdoors,Other Nightlife,Outdoor Sculpture,Park,Pastry Shop,Performing Arts Venue,Pier,Pizza Place,Playground,Plaza,Pool,Port,Portuguese Restaurant,Pub,Racetrack,Resort,Restaurant,Roof Deck,Salad Place,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shopping Mall,Snack Place,Soccer Field,Soccer Stadium,Spa,Spanish Restaurant,Sports Bar,Sports Club,Steakhouse,Supermarket,Surf Spot,Sushi Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Tennis Stadium,Thai Restaurant,Theater,Track Stadium,Trail,Train Station,Turkish Restaurant,Wine Bar,Wings Joint
0,Abuja,0.0,0.0,0.0,0.033333,0.066667,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.033333,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Accra,0.1,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.033333,0.133333,0.0,0.0,0.0,0.033333,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Addis Ababa,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.1,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.266667,0.0,0.0,0.0,0.033333,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Algiers,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.1,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0
4,Antananarivo,0.066667,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.166667,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
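
The one-hot encoding and the per-city mean are easier to follow on a miniature example. The `venues` frame below is hypothetical and stands in for `africa_venues`. Each venue becomes a row of zeros with a single one, so averaging those rows per city gives the share of that city's venues in each category.

```python
import pandas as pd

# Hypothetical miniature of africa_venues: one row per venue.
venues = pd.DataFrame({
    'city': ['X', 'X', 'X', 'X', 'Y', 'Y'],
    'Venue Category': ['Hotel', 'Hotel', 'Café', 'Bar', 'Hotel', 'Bar'],
})

# One indicator column per category, then put the city label back in front.
onehot = pd.get_dummies(venues['Venue Category'], dtype=float)
onehot.insert(0, 'city', venues['city'])

# The mean of the indicators is the frequency of each category per city.
grouped = onehot.groupby('city').mean().reset_index()
print(grouped)
```

City X has two hotels among four venues, so its Hotel frequency is 0.5, and each row of frequencies sums to 1.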


In [264]:
africa_grouped[africa_grouped['city'] == 'Nairobi'].T.reset_index().iloc[1:]

Unnamed: 0,index,20
1,African Restaurant,0.09375
2,Arts & Crafts Store,0.0
3,Athletics & Sports,0.0
4,Bakery,0.0
5,Bar,0.03125
6,Bed & Breakfast,0.0
7,Bookstore,0.0
8,Breakfast Spot,0.03125
9,Brewery,0.0
10,Buffet,0.0


In [290]:
num_top_venues = 5

for city in africa_grouped['city']:
    print("----"+city+"----")
    temp = africa_grouped[africa_grouped['city'] == city].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Abuja----
                venue  freq
0           BBQ Joint  0.07
1               Hotel  0.07
2              Arcade  0.07
3       Movie Theater  0.07
4  Chinese Restaurant  0.07


----Accra----
                venue  freq
0               Hotel  0.13
1  African Restaurant  0.10
2         Pizza Place  0.07
3              Lounge  0.03
4        Dessert Shop  0.03


----Addis Ababa----
                venue  freq
0               Hotel  0.27
1  Italian Restaurant  0.13
2                Café  0.10
3          Restaurant  0.07
4           Nightclub  0.07


----Algiers----
               venue  freq
0  French Restaurant  0.13
1              Hotel  0.10
2       Burger Joint  0.10
3             Lounge  0.07
4              Diner  0.07


----Antananarivo----
                venue  freq
0               Hotel  0.17
1          Restaurant  0.10
2  African Restaurant  0.07
3   French Restaurant  0.07
4       Grocery Store  0.03


----Asmara----
               venue  freq
0              Hotel  0.50
1 

4              Hotel  0.07


----Windhoek----
                venue  freq
0               Hotel  0.13
1          Restaurant  0.10
2       Shopping Mall  0.10
3  Italian Restaurant  0.07
4                Café  0.07


----Yaounde----
                venue  freq
0              Bakery  0.13
1               Hotel  0.10
2          Restaurant  0.10
3  African Restaurant  0.07
4              Lounge  0.07




In [291]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [292]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['city']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
cities_venues_sorted = pd.DataFrame(columns=columns)
cities_venues_sorted['city'] = africa_grouped['city']

for ind in np.arange(africa_grouped.shape[0]):
    cities_venues_sorted.iloc[ind, 1:] = return_most_common_venues(africa_grouped.iloc[ind, :], num_top_venues)

cities_venues_sorted.head()

Unnamed: 0,city,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Abuja,Restaurant,Movie Theater,Arcade,Chinese Restaurant,BBQ Joint,Hotel,Fried Chicken Joint,Steakhouse,Café,Pizza Place
1,Accra,Hotel,African Restaurant,Pizza Place,Pub,Music Venue,Modern European Restaurant,Lounge,Jazz Club,Italian Restaurant,Indian Restaurant
2,Addis Ababa,Hotel,Italian Restaurant,Café,Nightclub,Restaurant,Coffee Shop,Spa,Greek Restaurant,Indian Restaurant,Ethiopian Restaurant
3,Algiers,French Restaurant,Hotel,Burger Joint,Restaurant,Diner,Lounge,Steakhouse,Sandwich Place,Café,Plaza
4,Antananarivo,Hotel,Restaurant,African Restaurant,French Restaurant,Hostel,Mediterranean Restaurant,Sandwich Place,Italian Restaurant,Snack Place,Food Court
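
The column labels above rely on a three-item suffix list with a fallback to "th", which is fine for ten columns but would produce "21th" if more were requested. A small helper, named `ordinal` here purely for illustration, is one possible way to build the same labels for any count.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    # 11, 12 and 13 (and 111, 112, ...) take 'th' despite their last digit.
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

# Same labels as in the notebook, generated for the top 10 venues.
columns = ['city'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(columns[:4])
```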




In [303]:
africa_grouped_clustering1.head(1)

Unnamed: 0,Per_capita_Income,African Restaurant,Airport,Airport Lounge,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,BBQ Joint,Bakery,Bar,Basketball Court,Beach,Bed & Breakfast,Beer Garden,Big Box Store,Bistro,Boarding House,Boat or Ferry,Bookstore,Border Crossing,Brazilian Restaurant,Breakfast Spot,Brewery,Building,Burger Joint,Café,Candy Store,Casino,Caucasian Restaurant,Chinese Restaurant,Church,City Hall,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comfort Food Restaurant,Concert Hall,Convenience Store,Creperie,Cricket Ground,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Diner,Dive Bar,Eastern European Restaurant,Ethiopian Restaurant,Fabric Shop,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market,Food,Food & Drink Shop,Food Court,Food Service,Football Stadium,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gas Station,Gastropub,German Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Halal Restaurant,Harbor / Marina,Historic Site,History Museum,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotel Pool,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Latin American Restaurant,Lebanese Restaurant,Lighthouse,Lounge,Market,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Moroccan Restaurant,Mosque,Motel,Movie Theater,Multiplex,Museum,Music Venue,Neighborhood,Nightclub,Optical Shop,Other Great Outdoors,Other Nightlife,Outdoor Sculpture,Park,Pastry Shop,Performing Arts Venue,Pier,Pizza Place,Playground,Plaza,Pool,Port,Portuguese Restaurant,Pub,Racetrack,Resort,Restaurant,Roof Deck,Salad Place,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shopping Mall,Snack Place,Soccer Field,Soccer Stadium,Spa,Spanish Restaurant,Sports Bar,Sports Club,Steakhouse,Supermarket,Surf Spot,Sushi Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Tennis Stadium,Thai Restaurant,Theater,Track Stadium,Trail,Train Station,Turkish Restaurant,Wine Bar,Wings Joint
0,14069.295268,0.0,0.0,0.0,0.033333,0.066667,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.033333,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [348]:
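# set number of clusters
kclusters = 6

# drop the city name so that only numeric features remain
africa_grouped_clustering1 = africa_grouped.drop(columns='city')
# note: insert() aligns on the row index, not on the city name, and the income
# values are on a far larger scale than the venue frequencies
africa_grouped_clustering1.insert(0, 'Per_capita_Income', africa_data['per_cap_inc'])
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(africa_grouped_clustering1)

# check the cluster labels generated for the first ten rows
kmeans.labels_[0:10]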

array([0, 0, 0, 4, 1, 1, 4, 2, 3, 1])
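
The clustering cell above combines venue frequencies (values between 0 and 1) with per-capita income (values in the thousands), and it attaches income by row position. The following is a minimal sketch, not the notebook's own method, of how income could instead be looked up by city name and how all columns could be standardised before clustering. The frames `toy_data` and `toy_grouped` and their values are hypothetical stand-ins for `africa_data` and `africa_grouped`.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-ins for africa_data and africa_grouped (illustrative values only).
toy_data = pd.DataFrame({
    "city": ["Pretoria", "Lusaka", "Harare", "Accra"],
    "per_cap_inc": [14000.0, 4400.0, 2500.0, 5000.0],
})
toy_grouped = pd.DataFrame({
    "city": ["Accra", "Harare", "Lusaka", "Pretoria"],  # sorted by name, unlike toy_data
    "Hotel": [0.30, 0.10, 0.20, 0.05],
    "Coffee Shop": [0.00, 0.05, 0.05, 0.25],
})

# Look up income by city name rather than by row position.
income_by_city = toy_data.set_index("city")["per_cap_inc"]
features = toy_grouped.drop(columns="city")
features.insert(0, "Per_capita_Income", toy_grouped["city"].map(income_by_city))

# Standardise every column (zero mean, unit variance) so that income
# does not dominate the Euclidean distances used by k-means.
scaled = StandardScaler().fit_transform(features)

toy_kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scaled)
toy_labels = toy_kmeans.labels_
```

Whether income should carry the same weight as each venue category is a modelling choice; standardising is only one possible option.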

In [349]:
cities_venues_sorted.head()

Unnamed: 0,Cluster Labels,city,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,1,Abuja,Restaurant,Movie Theater,Arcade,Chinese Restaurant,BBQ Joint,Hotel,Fried Chicken Joint,Steakhouse,Café,Pizza Place
1,1,Accra,Hotel,African Restaurant,Pizza Place,Pub,Music Venue,Modern European Restaurant,Lounge,Jazz Club,Italian Restaurant,Indian Restaurant
2,1,Addis Ababa,Hotel,Italian Restaurant,Café,Nightclub,Restaurant,Coffee Shop,Spa,Greek Restaurant,Indian Restaurant,Ethiopian Restaurant
3,5,Algiers,French Restaurant,Hotel,Burger Joint,Restaurant,Diner,Lounge,Steakhouse,Sandwich Place,Café,Plaza
4,0,Antananarivo,Hotel,Restaurant,African Restaurant,French Restaurant,Hostel,Mediterranean Restaurant,Sandwich Place,Italian Restaurant,Snack Place,Food Court


In [350]:
africa_merged.head(1)
#['Cluster Labels']

Unnamed: 0,city,lat,lng,capital,Country,per_cap_inc,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Pretoria,-25.7069,28.2294,primary,South Africa,14069.295268,4,Coffee Shop,Farmers Market,Restaurant,Garden,Pizza Place,Nightclub,Music Venue,Ice Cream Shop,Gym,Grocery Store


In [351]:
# add clustering labels
#cities_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
cities_venues_sorted['Cluster Labels']= kmeans.labels_
africa_merged = africa_data

# join africa_data with cities_venues_sorted to add the cluster label and top venues for each city
africa_merged = africa_merged.join(cities_venues_sorted.set_index('city'), on='city', how='inner')

africa_merged.head() # check the last columns!

Unnamed: 0,city,lat,lng,capital,Country,per_cap_inc,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Pretoria,-25.7069,28.2294,primary,South Africa,14069.295268,5,Coffee Shop,Farmers Market,Restaurant,Garden,Pizza Place,Nightclub,Music Venue,Ice Cream Shop,Gym,Grocery Store
1,Bloemfontein,-29.12,26.2299,primary,South Africa,14069.295268,1,Coffee Shop,Hotel,Fast Food Restaurant,Shopping Mall,Breakfast Spot,Italian Restaurant,Snack Place,Seafood Restaurant,Restaurant,Gym
2,Cape Town,-33.92,18.435,primary,South Africa,14069.295268,5,Coffee Shop,Hotel,Café,Theater,Gym,Italian Restaurant,Gym / Fitness Center,Cocktail Bar,City Hall,Breakfast Spot
3,Lusaka,-15.4166,28.2833,primary,Zambia,4409.940508,1,Hotel,Shopping Mall,Café,Restaurant,Steakhouse,Cocktail Bar,Italian Restaurant,Movie Theater,Pizza Place,Farmers Market
4,Harare,-17.8178,31.0447,primary,Zimbabwe,2463.154492,2,Shopping Mall,Restaurant,Performing Arts Venue,Café,Hotel,Italian Restaurant,Coffee Shop,Dive Bar,Chinese Restaurant,Miscellaneous Shop


In [352]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=3)

# set color scheme for the clusters: one hex color per cluster label
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one circle marker per city, colored by its cluster label
for lat, lon, poi, cluster in zip(africa_merged['lat'], africa_merged['lng'], africa_merged['city'], africa_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

***We also examine the per-cluster averages to get an idea of what each cluster represents***

In [353]:
# average only the numeric columns (lat, lng, per_cap_inc) within each cluster
africa_c_groups = africa_merged.groupby('Cluster Labels').mean(numeric_only=True)
africa_c_groups = africa_c_groups.reset_index()
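
As a small illustration of what the per-cluster averages contain, the sketch below rebuilds the five rows shown by `africa_merged.head()` and summarises them by cluster. The frame `toy_merged` and the output column names (`n_cities`, `mean_income`, `mean_lat`, `mean_lng`) are hypothetical and chosen for this example only.

```python
import pandas as pd

# The five rows displayed by africa_merged.head(), reduced to the numeric columns.
toy_merged = pd.DataFrame({
    "city": ["Pretoria", "Bloemfontein", "Cape Town", "Lusaka", "Harare"],
    "lat": [-25.7069, -29.12, -33.92, -15.4166, -17.8178],
    "lng": [28.2294, 26.2299, 18.435, 28.2833, 31.0447],
    "per_cap_inc": [14069.295268, 14069.295268, 14069.295268,
                    4409.940508, 2463.154492],
    "Cluster Labels": [5, 1, 5, 1, 2],
})

# One row per cluster: how many cities it holds, their mean income,
# and the mean coordinates used to place the cluster on the map.
summary = (
    toy_merged.groupby("Cluster Labels")
    .agg(
        n_cities=("city", "size"),
        mean_income=("per_cap_inc", "mean"),
        mean_lat=("lat", "mean"),
        mean_lng=("lng", "mean"),
    )
    .reset_index()
)
print(summary)
```

Note that the mean latitude and longitude are only a rough centroid: a cluster whose cities are spread across the continent can end up plotted far from any of its members.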


In [354]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=3)

# set color scheme for the clusters: one hex color per cluster label
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one circle marker per cluster, placed at the mean coordinates of its cities
for lat, lon, income, cluster in zip(africa_c_groups['lat'], africa_c_groups['lng'], africa_c_groups['per_cap_inc'], africa_c_groups['Cluster Labels']):
    # popup shows the cluster label and its mean per-capita income
    label = folium.Popup(f'Cluster {cluster}: mean per-capita income {income:,.0f}', parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

# Results and Discussion

The k-means algorithm produced six clusters. On inspection, the clusters can be described as follows:
1. Spanish Atlantic coastal cities - Morocco, Malabo, Freetown and Bissau
2. Ancient African trading civilisations - Ethiopia, Nigeria, Angola, Ghana, Senegal and Burkina Faso
3. Desert regions with high incomes and small populations - Namibia, Chad, Egypt, Sudan, South Sudan, Niger, Mali and Gabon
4. High-income modern societies - Pretoria, Cape Town and Mauritius
5. Middle-income states, mostly in ex-British Africa - 
6. Crisis states - Tunisia, Zimbabwe and the Central African Republic

It is worth noting that there are a few anomalies in these classifications:
1. Freetown was not a Spanish colony, and it has lower incomes than its group members, yet it has been clustered with these cities. It would be interesting to investigate these linkages.
2. Tunisia is a high-income democracy, unlike CAR and Zimbabwe.

For the other groups, the classification appears broadly consistent with the history, geography and politics of the regions.
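
The assessment above is qualitative. One possible way to put a number on how well separated a set of clusters is, offered here only as a sketch and not something the notebook computed, is the silhouette score from scikit-learn. The example uses synthetic points (`toy_features`, a hypothetical stand-in); on the project's data one would pass the feature matrix used for clustering together with `kmeans.labels_`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the feature matrix: two tight, well-separated groups.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.1, size=(10, 2))
group_b = rng.normal(loc=5.0, scale=0.1, size=(10, 2))
toy_features = np.vstack([group_a, group_b])

toy_model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(toy_features)

# The score ranges from -1 to 1; values near 1 mean that points sit much
# closer to their own cluster than to the nearest other cluster.
toy_score = silhouette_score(toy_features, toy_model.labels_)
print(round(toy_score, 3))
```

Comparing this score across several values of `kclusters` is one common way to judge whether six clusters is a reasonable choice.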


# Conclusion

The purpose of this project was to classify similar African cities for the purposes of tourism and investment. This has been achieved through k-means clustering; although a few anomalies exist, most clusters hold true to reality.