## Neighborhoods around the World

### Introduction and Problem
Every country has its own culture and way of life. To thrive in a globalized economy, it is important to understand these differences so that we can interact respectfully with our counterparts around the world. This study examines how similar major cities around the world are. It focuses on city size, popular venues, available green space, and other amenities located in metropolitan centers to better understand the priorities of different populations. The study is meant to inform senior business leaders about how different cities are structured, which can offer insight into the behavior of the people who live there. This will give us a better understanding of their way of life before we do business with them.

<b>Question:</b> How similar are popular establishments in cities around the world? 

### Data Sources 
In this project, we will use the Foursquare API to extract location data for major cities around the world, specifically the top 10 most popular venues in each city. We will then use the World Cities Basic Database (https://simplemaps.com/data/world-cities) to obtain each city's population. This information will be fed into a K-means algorithm to identify groups of similar cities. We will then extend the analysis with country-level economic data from The World Bank (https://data.worldbank.org/indicator/NY.GDP.PCAP.CD), including GDP per capita and unemployment, downloaded as a CSV. Finally, we will use a similar algorithm to check whether these factors are related to which venues are popular in a city.
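The clustering step described above could look roughly like the sketch below. It assumes scikit-learn is available and that each city has already been summarized as a row of numeric features. The column names (`cafe_share`, `park_share`, `bar_share`), the city labels, and all values are made up for illustration and do not come from Foursquare or the World Cities database. Features are standardized first because population is on a much larger scale than the venue shares.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical feature table: one row per city, made-up values for illustration only
features = pd.DataFrame(
    {
        "cafe_share": [0.40, 0.35, 0.10, 0.15, 0.20, 0.05],
        "park_share": [0.10, 0.15, 0.40, 0.35, 0.20, 0.10],
        "bar_share": [0.20, 0.25, 0.10, 0.05, 0.30, 0.50],
        "population": [2_900_000, 3_400_000, 800_000, 650_000, 12_000_000, 9_500_000],
    },
    index=["city_a", "city_b", "city_c", "city_d", "city_e", "city_f"],
)


def cluster_cities(features, n_clusters=3, random_state=0):
    """Return a Series mapping each city to a K-means cluster label."""
    # Put every feature on a comparable scale before measuring distances
    scaled = StandardScaler().fit_transform(features)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = model.fit_predict(scaled)
    return pd.Series(labels, index=features.index, name="cluster")


clusters = cluster_cities(features)
print(clusters)
```

The number of clusters is a free choice here; in practice it would be tuned once the real venue data has been collected.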

Potential Cities:
- Toronto, Canada
- Tokyo, Japan
- São Paulo, Brazil
- Cairo, Egypt
- Madrid, Spain
- Berlin, Germany
- Houston, USA
- Shanghai, China
- Bogota, Colombia
- Istanbul, Turkey

Cities will be finalized after data is reviewed to ensure enough data is available to support the study.

### Data Analysis

In [6]:
import pandas as pd
import numpy as np
import folium

In [44]:
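# Load the World Cities database and drop the columns this study does not use
world_cities = pd.read_csv("worldcities.csv")
world_cities.drop(columns=["city_ascii", "iso2", "iso3", "capital", "id"], inplace=True)
# Keep cities with more than 500,000 residents, then draw a reproducible sample of 250
large_cities = world_cities[world_cities["population"] > 500000]
large_cities_samp = large_cities.sample(n=250, random_state=9)
large_cities_samp.set_index("city", inplace=True)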

In [50]:
large_cities_samp.drop("Fuzhou", inplace = True)

In [62]:
def colorbypop(mag):
    # Map a population (mag) to a (hex color, marker radius) pair
    if mag > 3000000:
        color = '#6e1010'
        rad = 10
    elif mag > 1000000:
        color = '#f5443d'
        rad = 7
    else:
        color = '#f5a73b'
        rad = 3
    return color, rad

In [68]:
# create map of World using latitude and longitude values
world_map = folium.Map(location=[33,48], zoom_start=2, width=1024, height=600)

# add markers to map
for lat, lng, city, ctry, pop in zip(large_cities_samp['lat'], large_cities_samp['lng'],
                                     large_cities_samp.index, large_cities_samp['country'],
                                     large_cities_samp['population']):
    label = '{}, {}'.format(city, ctry)
    # Use this row's own population; .loc[city, 'population'] returns a Series
    # (and breaks the comparison in colorbypop) when a city name appears more than once
    color, rad = colorbypop(pop)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=float(rad),
        popup=label,
        color=color,
        fill=True,
        fill_color=color,
        fill_opacity=0.5).add_to(world_map)

world_map

This map shows the cities that will be used for the study. Each marker represents a city, with the radius and color representing the population.
- Small orange points represent cities with a population above 500,000 and up to 1,000,000
- Medium red points represent cities with a population above 1,000,000 and up to 3,000,000
- Large maroon points represent cities with a population greater than 3,000,000
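To see how many cities fall into each marker tier, the same thresholds used in `colorbypop` can be applied with `pd.cut`. This is a minimal sketch: the helper name `population_tier`, the tier labels, and the sample populations are hypothetical and chosen only to show how the boundary values are handled.

```python
import numpy as np
import pandas as pd


def population_tier(population):
    """Bin populations into the three marker tiers used on the map."""
    # Right-inclusive bins mirror colorbypop: > 1,000,000 is medium, > 3,000,000 is large
    bins = [500_000, 1_000_000, 3_000_000, np.inf]
    labels = ["small", "medium", "large"]
    return pd.cut(population, bins=bins, labels=labels)


# Made-up populations that sit on and around the tier boundaries
sample = pd.Series([600_000, 1_000_000, 1_000_001, 3_000_000, 3_000_001], name="population")
tiers = population_tier(sample)
print(tiers.value_counts().sort_index())
```

On the real data, the equivalent call would be `population_tier(large_cities_samp["population"])`.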