### INTRODUCTION

A group of foreign investors has arrived in Colombia wishing to invest part of its capital by opening a commercial business in one of the municipalities of this beautiful coffee-growing country.

Because they know little about the country, about which municipality or municipalities would suit such a business, or about which type of business would be most profitable there, they decide to hire a data scientist to help answer these questions, so they can make a well-informed decision about what type of commercial establishment to open and in which municipality.

### DATA

The data needed for this analysis are:

- The political division of Colombia's municipalities with their geographic coordinates, obtained from the website of DANE (Departamento Administrativo Nacional de Estadística), the entity responsible for planning, collecting, processing, analyzing and disseminating Colombia's official statistics. https://geoportal.dane.gov.co/geovisores/territorio/consulta-divipola-division-politico-administrativa-de-colombia/

- The 10 municipalities with the highest GDP in Colombia for 2019. Since these are the municipalities with the highest economic income in the country to date, they are tentatively the best candidates for the investment location. The data come from Wikipedia, which cites DANE as its source. https://es.wikipedia.org/wiki/Anexo:Municipios_de_Colombia_por_Producto_Interno_Bruto

- The geographic coordinates of Colombia and of the most popular venues in the country's 10 most economically significant municipalities, which give a better idea of what type of business to invest in. The data come from Foursquare. https://es.foursquare.com/developers/projects

### METHODOLOGY

In [1]:
# Import the required libraries
import numpy as np
import pandas as pd
import json
from geopy.geocoders import Nominatim
import requests
from pandas import json_normalize  # pandas.io.json.json_normalize is deprecated since pandas 1.0
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
import folium

In [2]:
# Explore the Excel file with data on Colombia's municipalities
colombia_data = pd.read_excel('DIVIPOLA_Municipios.xlsx')
colombia_data


Unnamed: 0,COD_DPTO,NOM_DPTO,COD_MPIO,NOM_MPIO,TIPO,LATITUD,LONGITUD
0,5,ANTIOQUIA,5001,MEDELLÍN,Municipio,6.257590,-75.611031
1,5,ANTIOQUIA,5002,ABEJORRAL,Municipio,5.803728,-75.438474
2,5,ANTIOQUIA,5004,ABRIAQUÍ,Municipio,6.627569,-76.085978
3,5,ANTIOQUIA,5021,ALEJANDRÍA,Municipio,6.365534,-75.090597
4,5,ANTIOQUIA,5030,AMAGÁ,Municipio,6.032922,-75.708003
...,...,...,...,...,...,...,...
1116,97,VAUPÉS,97889,YAVARATÉ,Área no municipalizada,0.833312,-69.618678
1117,99,VICHADA,99001,PUERTO CARREÑO,Municipio,5.836530,-68.141222
1118,99,VICHADA,99524,LA PRIMAVERA,Municipio,5.517594,-69.620441
1119,99,VICHADA,99624,SANTA ROSALÍA,Municipio,4.968581,-70.659971


In [3]:
# Drop unnecessary columns
colombia_data = colombia_data.drop(['COD_DPTO', 'COD_MPIO', 'TIPO'], axis=1)
colombia_data

Unnamed: 0,NOM_DPTO,NOM_MPIO,LATITUD,LONGITUD
0,ANTIOQUIA,MEDELLÍN,6.257590,-75.611031
1,ANTIOQUIA,ABEJORRAL,5.803728,-75.438474
2,ANTIOQUIA,ABRIAQUÍ,6.627569,-76.085978
3,ANTIOQUIA,ALEJANDRÍA,6.365534,-75.090597
4,ANTIOQUIA,AMAGÁ,6.032922,-75.708003
...,...,...,...,...
1116,VAUPÉS,YAVARATÉ,0.833312,-69.618678
1117,VICHADA,PUERTO CARREÑO,5.836530,-68.141222
1118,VICHADA,LA PRIMAVERA,5.517594,-69.620441
1119,VICHADA,SANTA ROSALÍA,4.968581,-70.659971


In [4]:
# Reduce the table to only the 10 municipalities with the highest GDP in 2019
principales_data = colombia_data[colombia_data.NOM_MPIO.isin (['BOGOTÁ', 'MEDELLÍN','CALI','BARRANQUILLA','CARTAGENA DE INDIAS', 'BARRANCABERMEJA', 'BUCARAMANGA', 'PUERTO GAITÁN', 'PEREIRA','ENVIGADO'])]
principales_data

Unnamed: 0,NOM_DPTO,NOM_MPIO,LATITUD,LONGITUD
0,ANTIOQUIA,MEDELLÍN,6.25759,-75.611031
46,ANTIOQUIA,ENVIGADO,6.154395,-75.546868
125,ATLÁNTICO,BARRANQUILLA,10.981521,-74.827715
148,CUNDINAMARCA,BOGOTÁ,4.316108,-74.181073
149,BOLÍVAR,CARTAGENA DE INDIAS,10.463434,-75.458899
705,META,PUERTO GAITÁN,4.005034,-71.631574
831,RISARALDA,PEREIRA,4.803663,-75.795791
845,SANTANDER,BUCARAMANGA,7.155834,-73.11157
851,SANTANDER,BARRANCABERMEJA,7.054075,-73.782116
1005,VALLE DEL CAUCA,CALI,3.399044,-76.576493
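
Because `isin` matches names exactly (accents and upper case included), a misspelled name is dropped silently and the filtered table simply comes back shorter. A minimal sketch of a check for that, run on a small hypothetical stand-in table rather than on the real DIVIPOLA file; `missing_names` is a helper name introduced here only for illustration:

```python
import pandas as pd


def missing_names(df, column, wanted):
    """Return the requested names that do not appear in df[column]."""
    present = set(df[column])
    return [name for name in wanted if name not in present]


# Hypothetical stand-in for colombia_data (three rows only).
sample = pd.DataFrame({
    'NOM_MPIO': ['MEDELLÍN', 'ENVIGADO', 'CALI'],
    'LATITUD': [6.257590, 6.154395, 3.399044],
})

# 'MEDELLIN' lacks the accent, so an exact match would skip it.
not_found = missing_names(sample, 'NOM_MPIO', ['MEDELLIN', 'ENVIGADO', 'CALI'])
print(not_found)
```

An empty list would mean every requested municipality was found.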


In [5]:
# Get the latitude and longitude of Colombia
address = 'Colombia'

geolocator = Nominatim(user_agent="colombia_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographic coordinates of Colombia are {}, {}.'.format(latitude, longitude))

The geographic coordinates of Colombia are 4.099917, -72.9088133.


In [6]:
# create a map of Colombia using the latitude and longitude values
map_colombia = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to the map
for lat, lng, departamento, municipio in zip(principales_data['LATITUD'], principales_data['LONGITUD'], principales_data['NOM_DPTO'], principales_data['NOM_MPIO']):
    label = '{}, {}'.format(municipio, departamento)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_colombia)  
    
map_colombia

In [7]:
# Foursquare credentials, read from environment variables so the secrets
# stay out of the notebook (the variable names below are an assumption)
import os

CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID']
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']
VERSION = '20180605'
LIMIT = 100


In [8]:
# Explore venues in the main municipalities
def getNearbyVenues(nombres, latitudes, longitudes, radius=50000):

    lugares_list = []
    for nombre, lat, lng in zip(nombres, latitudes, longitudes):
        print(nombre)

        # build the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)

        # GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']

        # keep only the relevant information for each nearby venue
        lugares_list.append([(
            nombre,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-municipality lists into a single dataframe
    lugares_cercanos = pd.DataFrame([item for lugar_list in lugares_list for item in lugar_list])
    lugares_cercanos.columns = ['Municipio',
                  'Latitud del Municipio',
                  'Longitud del Municipio',
                  'Lugar',
                  'Latitud del Lugar',
                  'Longitud del Lugar',
                  'Categoria del Lugar']

    return lugares_cercanos
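
The list comprehension in this function assumes that the response always contains `groups` and that every venue has at least one category and a pair of coordinates. A minimal sketch of a more defensive parser, working on an already decoded JSON payload; `parse_explore_items` and the fake payload are hypothetical, and the payload only mimics the fields the function above reads:

```python
def parse_explore_items(nombre, lat, lng, payload):
    """Turn a decoded explore payload into rows, skipping incomplete venues."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories', [])
        if not categories or 'lat' not in location or 'lng' not in location:
            continue  # skip venues lacking a category or coordinates
        rows.append((nombre, lat, lng, venue.get('name'),
                     location['lat'], location['lng'],
                     categories[0]['name']))
    return rows


# Fake payload: one complete venue and one venue without categories.
fake_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Cafezinho',
               'location': {'lat': 6.2638, 'lng': -75.598035},
               'categories': [{'name': 'Café'}]}},
    {'venue': {'name': 'Sin categoria',
               'location': {'lat': 6.25, 'lng': -75.60},
               'categories': []}},
]}]}}

rows = parse_explore_items('MEDELLÍN', 6.25759, -75.611031, fake_payload)
print(rows)
```

Only the complete venue survives, and an empty or malformed payload yields an empty list instead of raising `KeyError` or `IndexError`.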

In [9]:
lugares_principales = getNearbyVenues(nombres=principales_data['NOM_MPIO'],
                                   latitudes=principales_data['LATITUD'],
                                   longitudes=principales_data['LONGITUD']
                                  )

MEDELLÍN
ENVIGADO
BARRANQUILLA
BOGOTÁ
CARTAGENA DE INDIAS
PUERTO GAITÁN
PEREIRA
BUCARAMANGA
BARRANCABERMEJA
CALI


In [10]:
lugares_principales

Unnamed: 0,Municipio,Latitud del Municipio,Longitud del Municipio,Lugar,Latitud del Lugar,Longitud del Lugar,Categoria del Lugar
0,MEDELLÍN,6.257590,-75.611031,Saludpan,6.246565,-75.590031,Vegetarian / Vegan Restaurant
1,MEDELLÍN,6.257590,-75.611031,Unidad Deportiva Atanasio Girardot,6.256149,-75.590842,Athletics & Sports
2,MEDELLÍN,6.257590,-75.611031,Cafezinho,6.263800,-75.598035,Café
3,MEDELLÍN,6.257590,-75.611031,La Miguería,6.242531,-75.597832,Bakery
4,MEDELLÍN,6.257590,-75.611031,Unidad Deportiva de Belén,6.235082,-75.588046,Athletics & Sports
...,...,...,...,...,...,...,...
795,CALI,3.399044,-76.576493,Restaurante La Tinaja,3.620307,-76.434374,Latin American Restaurant
796,CALI,3.399044,-76.576493,Parque vijes,3.700391,-76.442599,Plaza
797,CALI,3.399044,-76.576493,Juan Valdez Cafe - Aeropuerto,3.536647,-76.388320,Café
798,CALI,3.399044,-76.576493,Comidas Rápidas Billo's,3.533640,-76.297852,Fast Food Restaurant


In [11]:
# Check how many venues were returned for each municipality
lugares_principales.groupby('Municipio').count()

Municipio,Latitud del Municipio,Longitud del Municipio,Lugar,Latitud del Lugar,Longitud del Lugar,Categoria del Lugar
BARRANCABERMEJA,36,36,36,36,36,36
BARRANQUILLA,100,100,100,100,100,100
BOGOTÁ,100,100,100,100,100,100
BUCARAMANGA,60,60,60,60,60,60
CALI,100,100,100,100,100,100
CARTAGENA DE INDIAS,100,100,100,100,100,100
ENVIGADO,100,100,100,100,100,100
MEDELLÍN,100,100,100,100,100,100
PEREIRA,100,100,100,100,100,100
PUERTO GAITÁN,4,4,4,4,4,4
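
The counts are very uneven: PUERTO GAITÁN returned only 4 venues, so at most 4 categories can have a nonzero frequency there, and any longer ranking for it is padded with categories tied at zero. A minimal sketch that flags such municipalities before they are ranked; the counts are copied from the table above, while `MIN_VENUES` is an assumed threshold chosen here to match a top-10 ranking:

```python
import pandas as pd

# Venue counts copied from the table above.
venue_counts = pd.Series({
    'BARRANCABERMEJA': 36, 'BARRANQUILLA': 100, 'BOGOTÁ': 100,
    'BUCARAMANGA': 60, 'CALI': 100, 'CARTAGENA DE INDIAS': 100,
    'ENVIGADO': 100, 'MEDELLÍN': 100, 'PEREIRA': 100, 'PUERTO GAITÁN': 4,
})

MIN_VENUES = 10  # assumed threshold, not part of the original analysis

# Municipalities with too few venues to support a top-10 ranking.
sparse = venue_counts[venue_counts < MIN_VENUES].index.tolist()
print(sparse)
```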


In [12]:
# Analyze each municipality: one-hot encode the venue categories
principales_onehot = pd.get_dummies(lugares_principales[['Categoria del Lugar']], prefix="", prefix_sep="")

# Add the Municipio column to the dataframe
principales_onehot['Municipio'] = lugares_principales['Municipio']


# Move the Municipio column to the first position
fixed_columns = [principales_onehot.columns[-1]] + list(principales_onehot.columns[:-1])
principales_onehot = principales_onehot[fixed_columns]


principales_onehot.head()

Unnamed: 0,Municipio,Airport,Airport Lounge,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,...,Theme Park Ride / Attraction,Trail,University,Vegetarian / Vegan Restaurant,Video Game Store,Water Park,Waterfront,Wings Joint,Zoo,Zoo Exhibit
0,MEDELLÍN,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,0,0,0,0
1,MEDELLÍN,0,0,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
2,MEDELLÍN,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,MEDELLÍN,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,MEDELLÍN,0,0,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0


In [13]:
# Group rows by municipality, taking the mean of the one-hot columns (the relative frequency of each category)
principales_grouped = principales_onehot.groupby('Municipio').mean().reset_index()
principales_grouped

Unnamed: 0,Municipio,Airport,Airport Lounge,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,...,Theme Park Ride / Attraction,Trail,University,Vegetarian / Vegan Restaurant,Video Game Store,Water Park,Waterfront,Wings Joint,Zoo,Zoo Exhibit
0,BARRANCABERMEJA,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,BARRANQUILLA,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0
2,BOGOTÁ,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.06,0.0,...,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,BUCARAMANGA,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.016667,0.0,...,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0
4,CALI,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01
5,CARTAGENA DE INDIAS,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0
6,ENVIGADO,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.02,...,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0
7,MEDELLÍN,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.02,...,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0
8,PEREIRA,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,...,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0
9,PUERTO GAITÁN,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
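
Taking the mean of one-hot columns within each municipality gives the share of that municipality's venues that fall in each category, so every row sums to 1. A minimal sketch on a hypothetical toy table (municipalities `A` and `B` are made up; `dtype=float` is passed so the result does not depend on the default dummy dtype):

```python
import pandas as pd

# Toy data: municipality A has 4 venues, municipality B has 2.
toy = pd.DataFrame({
    'Municipio': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Categoria del Lugar': ['Hotel', 'Hotel', 'Café', 'Park', 'Café', 'Café'],
})

# One column per category, 1.0 where the venue belongs to it.
onehot = pd.get_dummies(toy[['Categoria del Lugar']],
                        prefix='', prefix_sep='', dtype=float)
onehot['Municipio'] = toy['Municipio']

# Mean per municipality = relative frequency of each category.
freq = onehot.groupby('Municipio').mean()
print(freq)
```

In this toy table, half of the venues in `A` are hotels, while every venue in `B` is a café.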


In [25]:
# The 5 most common venue types per municipality
num_top_venues = 5

for hood in principales_grouped['Municipio']:
    print("----"+hood+"----")
    temp = principales_grouped[principales_grouped['Municipio'] == hood].T.reset_index()
    temp.columns = ['Tipo de Lugar','Frecuencia']
    temp = temp.iloc[1:]
    temp['Frecuencia'] = temp['Frecuencia'].astype(float)
    temp = temp.round({'Frecuencia': 2})
    print(temp.sort_values('Frecuencia', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----BARRANCABERMEJA----
  Tipo de Lugar  Frecuencia
0         Hotel        0.14
1    Restaurant        0.11
2  Burger Joint        0.11
3     Multiplex        0.06
4          Park        0.06


----BARRANQUILLA----
   Tipo de Lugar  Frecuencia
0  Shopping Mall        0.07
1          Beach        0.07
2    Pizza Place        0.06
3          Hotel        0.05
4           Park        0.05


----BOGOTÁ----
      Tipo de Lugar  Frecuencia
0              Park        0.07
1             Hotel        0.06
2  Asian Restaurant        0.06
3       Coffee Shop        0.05
4       Pizza Place        0.05


----BUCARAMANGA----
    Tipo de Lugar  Frecuencia
0    Burger Joint        0.10
1           Hotel        0.10
2  Scenic Lookout        0.08
3      Restaurant        0.05
4      Steakhouse        0.05


----CALI----
               Tipo de Lugar  Frecuencia
0         Italian Restaurant        0.06
1                Pizza Place        0.05
2           Department Store        0.05
3  Latin American Res

In [15]:
# Sort venue categories in descending order of frequency
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [16]:
# The 10 most popular venue types per municipality
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# Create the columns according to the number of top venues
columns = ['Municipio']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Sitio mas popular'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Sitio mas popular'.format(ind+1))

# Create a new dataframe
municipios_lugares = pd.DataFrame(columns=columns)
municipios_lugares['Municipio'] = principales_grouped['Municipio']

for ind in np.arange(principales_grouped.shape[0]):
    municipios_lugares.iloc[ind, 1:] = return_most_common_venues(principales_grouped.iloc[ind, :], num_top_venues)

municipios_lugares

Unnamed: 0,Municipio,1st Sitio mas popular,2nd Sitio mas popular,3rd Sitio mas popular,4th Sitio mas popular,5th Sitio mas popular,6th Sitio mas popular,7th Sitio mas popular,8th Sitio mas popular,9th Sitio mas popular,10th Sitio mas popular
0,BARRANCABERMEJA,Hotel,Restaurant,Burger Joint,Multiplex,Park,Sandwich Place,BBQ Joint,Caribbean Restaurant,Café,Supermarket
1,BARRANQUILLA,Shopping Mall,Beach,Pizza Place,Hotel,Park,Bakery,Café,Furniture / Home Store,Italian Restaurant,Ice Cream Shop
2,BOGOTÁ,Park,Hotel,Asian Restaurant,Coffee Shop,Pizza Place,French Restaurant,Bakery,Breakfast Spot,Ice Cream Shop,BBQ Joint
3,BUCARAMANGA,Burger Joint,Hotel,Scenic Lookout,Restaurant,Steakhouse,Multiplex,Golf Course,South American Restaurant,Coffee Shop,Shopping Mall
4,CALI,Italian Restaurant,Pizza Place,Department Store,Latin American Restaurant,Park,Ice Cream Shop,Bar,Hotel,Plaza,Gym / Fitness Center
5,CARTAGENA DE INDIAS,Hotel,Beach,Restaurant,Plaza,Caribbean Restaurant,Historic Site,Resort,Seafood Restaurant,Café,Cocktail Bar
6,ENVIGADO,Shopping Mall,Bakery,Peruvian Restaurant,Café,Pizza Place,Supermarket,BBQ Joint,Hotel,Coffee Shop,Cocktail Bar
7,MEDELLÍN,Shopping Mall,Bakery,Café,Theater,Breakfast Spot,Park,Peruvian Restaurant,Hotel,Cocktail Bar,BBQ Joint
8,PEREIRA,Café,Restaurant,Hotel,Latin American Restaurant,BBQ Joint,Theme Park,Park,Theme Park Ride / Attraction,Supermarket,Burger Joint
9,PUERTO GAITÁN,Restaurant,Latin American Restaurant,Fast Food Restaurant,Bus Station,Pie Shop,Park,Pedestrian Plaza,Peruvian Restaurant,Pet Store,Pharmacy


In [17]:
# K-means to cluster the municipalities
kclusters = 5

principales_grouped_clustering = principales_grouped.drop(columns='Municipio')

# run k-means
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(principales_grouped_clustering)

# check the cluster labels generated for each row of the dataframe
kmeans.labels_[0:10]

array([3, 2, 2, 4, 2, 0, 2, 2, 2, 1])
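
With `kclusters = 5` and only 10 municipalities, four of the five labels in the array above are used by a single municipality each. One common way to sanity-check the number of clusters is the silhouette score. The sketch below is not part of the original analysis: `silhouette_by_k` is a hypothetical helper, and the toy points stand in for `principales_grouped_clustering`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def silhouette_by_k(X, k_values, random_state=0):
    """Fit k-means for each k and return {k: mean silhouette score}."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return scores


# Toy data: two tight, well-separated groups of three points each.
toy_X = np.array([[0.0, 0.0], [0.0, 0.1], [0.1, 0.0],
                  [5.0, 5.0], [5.0, 5.1], [5.1, 5.0]])

scores = silhouette_by_k(toy_X, k_values=[2, 3, 4])
best_k = max(scores, key=scores.get)
print(best_k)
```

On the real data, k must stay between 2 and the number of municipalities minus 1 for the silhouette score to be defined.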

In [18]:
# The 10 most popular venue types with their cluster label (K-means)
municipios_lugares.insert(0, 'Grupo', kmeans.labels_)

lugares_union = principales_data

# Join municipios_lugares with principales_data on the municipality name
lugares_union = lugares_union.join(municipios_lugares.set_index('Municipio'), on='NOM_MPIO')

lugares_union

Unnamed: 0,NOM_DPTO,NOM_MPIO,LATITUD,LONGITUD,Grupo,1st Sitio mas popular,2nd Sitio mas popular,3rd Sitio mas popular,4th Sitio mas popular,5th Sitio mas popular,6th Sitio mas popular,7th Sitio mas popular,8th Sitio mas popular,9th Sitio mas popular,10th Sitio mas popular
0,ANTIOQUIA,MEDELLÍN,6.25759,-75.611031,2,Shopping Mall,Bakery,Café,Theater,Breakfast Spot,Park,Peruvian Restaurant,Hotel,Cocktail Bar,BBQ Joint
46,ANTIOQUIA,ENVIGADO,6.154395,-75.546868,2,Shopping Mall,Bakery,Peruvian Restaurant,Café,Pizza Place,Supermarket,BBQ Joint,Hotel,Coffee Shop,Cocktail Bar
125,ATLÁNTICO,BARRANQUILLA,10.981521,-74.827715,2,Shopping Mall,Beach,Pizza Place,Hotel,Park,Bakery,Café,Furniture / Home Store,Italian Restaurant,Ice Cream Shop
148,CUNDINAMARCA,BOGOTÁ,4.316108,-74.181073,2,Park,Hotel,Asian Restaurant,Coffee Shop,Pizza Place,French Restaurant,Bakery,Breakfast Spot,Ice Cream Shop,BBQ Joint
149,BOLÍVAR,CARTAGENA DE INDIAS,10.463434,-75.458899,0,Hotel,Beach,Restaurant,Plaza,Caribbean Restaurant,Historic Site,Resort,Seafood Restaurant,Café,Cocktail Bar
705,META,PUERTO GAITÁN,4.005034,-71.631574,1,Restaurant,Latin American Restaurant,Fast Food Restaurant,Bus Station,Pie Shop,Park,Pedestrian Plaza,Peruvian Restaurant,Pet Store,Pharmacy
831,RISARALDA,PEREIRA,4.803663,-75.795791,2,Café,Restaurant,Hotel,Latin American Restaurant,BBQ Joint,Theme Park,Park,Theme Park Ride / Attraction,Supermarket,Burger Joint
845,SANTANDER,BUCARAMANGA,7.155834,-73.11157,4,Burger Joint,Hotel,Scenic Lookout,Restaurant,Steakhouse,Multiplex,Golf Course,South American Restaurant,Coffee Shop,Shopping Mall
851,SANTANDER,BARRANCABERMEJA,7.054075,-73.782116,3,Hotel,Restaurant,Burger Joint,Multiplex,Park,Sandwich Place,BBQ Joint,Caribbean Restaurant,Café,Supermarket
1005,VALLE DEL CAUCA,CALI,3.399044,-76.576493,2,Italian Restaurant,Pizza Place,Department Store,Latin American Restaurant,Park,Ice Cream Shop,Bar,Hotel,Plaza,Gym / Fitness Center


In [19]:
# create the map
map_agrupaciones = folium.Map(location=[latitude, longitude], zoom_start=11)

# set the color scheme for the clusters: one rainbow color per cluster label
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(lugares_union['LATITUD'], lugares_union['LONGITUD'], lugares_union['NOM_MPIO'], lugares_union['Grupo']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],  # labels run from 0 to kclusters-1, so index directly
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_agrupaciones)
       
map_agrupaciones

In [20]:
# Examine cluster 1 (label 0)
lugares_union.loc[lugares_union['Grupo'] == 0, lugares_union.columns[[1] + list(range(5, lugares_union.shape[1]))]]

Unnamed: 0,NOM_MPIO,1st Sitio mas popular,2nd Sitio mas popular,3rd Sitio mas popular,4th Sitio mas popular,5th Sitio mas popular,6th Sitio mas popular,7th Sitio mas popular,8th Sitio mas popular,9th Sitio mas popular,10th Sitio mas popular
149,CARTAGENA DE INDIAS,Hotel,Beach,Restaurant,Plaza,Caribbean Restaurant,Historic Site,Resort,Seafood Restaurant,Café,Cocktail Bar


In [21]:
# Examine cluster 2 (label 1)
lugares_union.loc[lugares_union['Grupo'] == 1, lugares_union.columns[[1] + list(range(5, lugares_union.shape[1]))]]

Unnamed: 0,NOM_MPIO,1st Sitio mas popular,2nd Sitio mas popular,3rd Sitio mas popular,4th Sitio mas popular,5th Sitio mas popular,6th Sitio mas popular,7th Sitio mas popular,8th Sitio mas popular,9th Sitio mas popular,10th Sitio mas popular
705,PUERTO GAITÁN,Restaurant,Latin American Restaurant,Fast Food Restaurant,Bus Station,Pie Shop,Park,Pedestrian Plaza,Peruvian Restaurant,Pet Store,Pharmacy


In [22]:
# Examine cluster 3 (label 2)
lugares_union.loc[lugares_union['Grupo'] == 2, lugares_union.columns[[1] + list(range(5, lugares_union.shape[1]))]]

Unnamed: 0,NOM_MPIO,1st Sitio mas popular,2nd Sitio mas popular,3rd Sitio mas popular,4th Sitio mas popular,5th Sitio mas popular,6th Sitio mas popular,7th Sitio mas popular,8th Sitio mas popular,9th Sitio mas popular,10th Sitio mas popular
0,MEDELLÍN,Shopping Mall,Bakery,Café,Theater,Breakfast Spot,Park,Peruvian Restaurant,Hotel,Cocktail Bar,BBQ Joint
46,ENVIGADO,Shopping Mall,Bakery,Peruvian Restaurant,Café,Pizza Place,Supermarket,BBQ Joint,Hotel,Coffee Shop,Cocktail Bar
125,BARRANQUILLA,Shopping Mall,Beach,Pizza Place,Hotel,Park,Bakery,Café,Furniture / Home Store,Italian Restaurant,Ice Cream Shop
148,BOGOTÁ,Park,Hotel,Asian Restaurant,Coffee Shop,Pizza Place,French Restaurant,Bakery,Breakfast Spot,Ice Cream Shop,BBQ Joint
831,PEREIRA,Café,Restaurant,Hotel,Latin American Restaurant,BBQ Joint,Theme Park,Park,Theme Park Ride / Attraction,Supermarket,Burger Joint
1005,CALI,Italian Restaurant,Pizza Place,Department Store,Latin American Restaurant,Park,Ice Cream Shop,Bar,Hotel,Plaza,Gym / Fitness Center


In [23]:
# Examine cluster 4 (label 3)
lugares_union.loc[lugares_union['Grupo'] == 3, lugares_union.columns[[1] + list(range(5, lugares_union.shape[1]))]]

Unnamed: 0,NOM_MPIO,1st Sitio mas popular,2nd Sitio mas popular,3rd Sitio mas popular,4th Sitio mas popular,5th Sitio mas popular,6th Sitio mas popular,7th Sitio mas popular,8th Sitio mas popular,9th Sitio mas popular,10th Sitio mas popular
851,BARRANCABERMEJA,Hotel,Restaurant,Burger Joint,Multiplex,Park,Sandwich Place,BBQ Joint,Caribbean Restaurant,Café,Supermarket


In [24]:
# Examine cluster 5 (label 4)
lugares_union.loc[lugares_union['Grupo'] == 4, lugares_union.columns[[1] + list(range(5, lugares_union.shape[1]))]]

Unnamed: 0,NOM_MPIO,1st Sitio mas popular,2nd Sitio mas popular,3rd Sitio mas popular,4th Sitio mas popular,5th Sitio mas popular,6th Sitio mas popular,7th Sitio mas popular,8th Sitio mas popular,9th Sitio mas popular,10th Sitio mas popular
845,BUCARAMANGA,Burger Joint,Hotel,Scenic Lookout,Restaurant,Steakhouse,Multiplex,Golf Course,South American Restaurant,Coffee Shop,Shopping Mall


### RESULTS

- Cluster #3 contains more than half of the municipalities (6 of 10).
- There is a clear tendency toward shopping malls in 3 of the municipalities (Medellín, Envigado and Barranquilla), where they are the most popular venue type.
- Hotels may be the second most frequent venue type across these municipalities, which is to be expected given the country's healthy level of tourism.
- The different types of restaurant are also a constant in all the municipalities.
- Fast food places and cafés appear in several of the municipalities, even where they are not the most popular venue type.
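
The observations about shopping malls and hotels can be checked by counting how often each category ranks first. A minimal sketch, with the first-place categories copied by hand from the ranking table in the methodology; `top1` is a name introduced here only for illustration:

```python
from collections import Counter

# Most popular venue type per municipality, copied from the ranking table.
top1 = {
    'BARRANCABERMEJA': 'Hotel',
    'BARRANQUILLA': 'Shopping Mall',
    'BOGOTÁ': 'Park',
    'BUCARAMANGA': 'Burger Joint',
    'CALI': 'Italian Restaurant',
    'CARTAGENA DE INDIAS': 'Hotel',
    'ENVIGADO': 'Shopping Mall',
    'MEDELLÍN': 'Shopping Mall',
    'PEREIRA': 'Café',
    'PUERTO GAITÁN': 'Restaurant',
}

# Count how many municipalities have each category in first place.
top1_counts = Counter(top1.values())
print(top1_counts.most_common(2))  # [('Shopping Mall', 3), ('Hotel', 2)]
```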

### CONCLUSIONS

The investors' decision about the municipality and type of business could be approached from several angles:
1. Compete with shopping malls in municipalities such as Medellín, Envigado or Barranquilla, where these establishments are already thriving, and take the risk of trying to win customers from established competitors.
2. Compete with hotels or restaurants in municipalities such as Bogotá, Cali or Pereira where, as in the previous option, the new business would try to attract the customers of established businesses.
3. Invest in hotels, cafés or different types of restaurant in cities such as Medellín and Bucaramanga, where these venue types are in the top 10 most popular but are not number 1.
4. This is the option I would personally choose: invest in a shopping mall in a city such as Cali, Pereira or Bogotá. Shopping malls are very popular in other cities but rarely appear in these, so it could be an innovative bet that, if it became as popular as elsewhere, would bring large profits.
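
The reasoning behind option 4 can be expressed as a lookup: find the municipalities whose top 10 does not include a given category. A minimal sketch using four rows copied by hand from the ranking table; `municipalities_without` is a hypothetical helper, and a category missing from a top 10 only suggests, without proving, that the category is scarce there:

```python
# Top-10 venue types for four municipalities, copied from the ranking table.
top10 = {
    'BOGOTÁ': ['Park', 'Hotel', 'Asian Restaurant', 'Coffee Shop',
               'Pizza Place', 'French Restaurant', 'Bakery',
               'Breakfast Spot', 'Ice Cream Shop', 'BBQ Joint'],
    'CALI': ['Italian Restaurant', 'Pizza Place', 'Department Store',
             'Latin American Restaurant', 'Park', 'Ice Cream Shop', 'Bar',
             'Hotel', 'Plaza', 'Gym / Fitness Center'],
    'PEREIRA': ['Café', 'Restaurant', 'Hotel', 'Latin American Restaurant',
                'BBQ Joint', 'Theme Park', 'Park',
                'Theme Park Ride / Attraction', 'Supermarket', 'Burger Joint'],
    'MEDELLÍN': ['Shopping Mall', 'Bakery', 'Café', 'Theater',
                 'Breakfast Spot', 'Park', 'Peruvian Restaurant', 'Hotel',
                 'Cocktail Bar', 'BBQ Joint'],
}


def municipalities_without(category, rankings):
    """Return, sorted, the municipalities whose ranking lacks the category."""
    return sorted(m for m, cats in rankings.items() if category not in cats)


candidates = municipalities_without('Shopping Mall', top10)
print(candidates)  # ['BOGOTÁ', 'CALI', 'PEREIRA']
```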