<h1 align=center><font size = 5>Final Capstone Project - The Battle of the Neighborhoods</font></h1>

<h2>Table of Contents</h2>

<div class="alert alert-block alert-info" style="margin-top: 20px">
<ol>
    <li><a href="#introduccion">Introduction</a></li>
    <li><a href="#problema">Business Problem</a></li>
    <li><a href="#datos">Data</a></li>
    <li><a href="#metodologia">Methodology</a></li>
    <li><a href="#analisis">Analysis</a></li>
</ol>
    
</div>
 
<hr>

<h2 id="introduccion">1. Introduction</h2>

In this project we analyze and visualize the neighborhoods of **Medellin, Colombia**, to determine which location is best suited for opening a **pizzeria**.

<h2 id="problema">2. Business Problem</h2>

A close friend wants to open a food business, specifically a pizzeria, but does not know which location would be most suitable. He has put his effort and savings into this venture, so choosing the right location is vital.
With the help of data science, we can give him a clear picture of where opening the pizzeria is feasible and where it is not.

<h2 id="datos">3. Data</h2>

<h3 id="datos">3.1. Data Sources</h3>

The data we will use comes from several sources. One of them is the database published by the **Medellin** mayor's office, which lists every neighborhood in Medellin, the comuna it belongs to (similar to a district in other cities), and whether it is urban or rural. The data can be consulted at this link: <a href="https://geomedellin-m-medellin.opendata.arcgis.com/datasets/M-Medellin::barrio-vereda/explore?location=6.268900%2C-75.595550%2C12.00&showTable=true">Data source</a>.

<p>We have the option of web scraping with the <b>BeautifulSoup</b> package, which we used in previous labs, or we can download the file in <b>.csv</b> format directly from the Medellin mayor's office website and load it into our notebook with Pandas.
We will choose the latter option for simplicity.</p>
<p>We will use the <b>GeoPy</b> library to retrieve the geospatial data for each neighborhood in the city and combine it into a single table.</p>
<p>We will also use data from the <b>Foursquare</b> API to retrieve the venues and ratings of businesses similar to the one for which we want to predict the best locations.</p>

<h3 id="datos">3.2. Data Cleaning</h3>

Because the data comes from several sources, some values may be missing or null. I decided to remove these records from our main DataFrame. We will also work only with the neighborhoods classified as **urban** and drop those listed as rural.
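The cleaning step described above could be sketched as follows. This is a minimal illustration on a made-up sample: the column names follow the `Barrio_Vereda.csv` file, the rural row and the row with a missing name are hypothetical, and it assumes that subtype `1` marks urban neighborhoods and `2` marks rural ones.

```python
import pandas as pd

# Small made-up sample using the column names of Barrio_Vereda.csv.
# The row with a missing name and the rural row are hypothetical.
sample = pd.DataFrame({
    "NOMBRE": ["Laureles", "Boston", None, "Vereda de ejemplo"],
    "SUBTIPO_BARRIOVEREDA": [1, 1, 1, 2],
    "NOMBRE_COMUNA_CORREGIMIENTO": [
        "Laureles Estadio", "La Candelaria", "Robledo", "Corregimiento de ejemplo",
    ],
})

# Drop rows with missing values, then keep only urban neighborhoods
# (assumption: subtype 1 = urban, subtype 2 = rural)
clean = sample.dropna()
clean = clean[clean["SUBTIPO_BARRIOVEREDA"] == 1].reset_index(drop=True)
print(clean)
```

Chaining `dropna()` with a boolean filter keeps the two cleaning rules visible as separate steps.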

<h3 id="metodologia">4. Methodology</h3>

We will approach the problem with a clustering technique we already know, **k-means**. This approach lets the audience see how the neighborhoods resemble one another, here in terms of the kinds of venues found in each one.
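As a rough illustration of the clustering idea, the sketch below runs scikit-learn's `KMeans` on a small made-up matrix of venue-category frequencies. The values, the three unnamed category columns, and the choice of two clusters are all assumptions for illustration, not results from this project.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical venue-category frequencies: one row per neighborhood,
# one column per category. Values are invented for illustration.
features = np.array([
    [0.8, 0.1, 0.1],  # dominated by the first category
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],  # dominated by the third category
    [0.0, 0.2, 0.8],
])

# Two clusters is an arbitrary choice for this toy example
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
labels = kmeans.labels_
print(labels)
```

Neighborhoods with similar category profiles receive the same cluster label. The label numbers themselves carry no meaning.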

### Import libraries

In [1]:
import numpy as np # library for handling vectorized data

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library for handling JSON files

import requests # library for handling HTTP requests
from pandas import json_normalize # converts JSON into a pandas DataFrame (pandas >= 1.0; older versions use pandas.io.json)

# Matplotlib and associated modules for plotting
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means for the clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes 
import folium # library for plotting maps

print('Libraries imported.')

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Libraries imported.


### Load the dataset of **Medellin** neighborhoods

In [2]:
dfBarrios = pd.read_csv('C:/Users/andrey.zapata/Downloads/Barrio_Vereda.csv')
dfBarrios.head(5)

Unnamed: 0,OBJECTID,CODIGO,NOMBRE,SUBTIPO_BARRIOVEREDA,NOMBRE_COMUNA_CORREGIMIENTO,SHAPEAREA,SHAPELEN
0,1112,510,Tricentenario,1,Castilla,420637.970349,2897.304229
1,1113,208,Villa Niza,1,Santa Cruz,143215.327504,1697.303318
2,1114,1108,Laureles,1,Laureles Estadio,707014.821267,3847.112683
3,1115,1303,Santa Rosa de Lima,1,San Javier,139970.996369,2158.954261
4,1116,1206,Santa Lucía,1,La América,275913.740234,3048.703385


#### Remove the neighborhoods classified as **rural**, those without a name, and those whose geospatial data GeoPy cannot find

In [3]:
# Entries to exclude: unnamed areas and places that GeoPy cannot geocode
excludedNames = [
    "Hospital San Vicente de Paúl",
    "Área de Expansión El Noral",
    "Facultad de Minas",
    "Facultad Veterinaria y Zootecnia U.de.A.",
    "U.P.B",
    "El Nogal-Los Almendros",
    "Cementerio Universal",
    "Centro Administrativo",
    "Facultad de Minas U. Nacional",
    "Las Acacias",
    "Plaza de Ferias",
    "Terminal de Transporte",
    "Oleoducto",
    "San Isidro",
    "Naranjal",
    "La Palma",
    "Las Palmas",
    "El Salado",
    "Altavista",
    "Sin Nombre",
]

# Rural entries have SUBTIPO_BARRIOVEREDA == 2
indexNames = dfBarrios[ (dfBarrios['SUBTIPO_BARRIOVEREDA'] == 2)
                | (dfBarrios['NOMBRE'].isin(excludedNames)) ].index
dfBarrios.drop(indexNames , inplace=True)
dfBarrios.head(5)

Unnamed: 0,OBJECTID,CODIGO,NOMBRE,SUBTIPO_BARRIOVEREDA,NOMBRE_COMUNA_CORREGIMIENTO,SHAPEAREA,SHAPELEN
0,1112,510,Tricentenario,1,Castilla,420637.970349,2897.304229
1,1113,208,Villa Niza,1,Santa Cruz,143215.327504,1697.303318
2,1114,1108,Laureles,1,Laureles Estadio,707014.821267,3847.112683
3,1115,1303,Santa Rosa de Lima,1,San Javier,139970.996369,2158.954261
4,1116,1206,Santa Lucía,1,La América,275913.740234,3048.703385


#### Check the number of resulting records

In [4]:
dfBarrios.shape

(251, 7)

#### Import the **GeoPy** library to obtain the geospatial data of the city's neighborhoods

In [5]:
from geopy.geocoders import Nominatim

In [6]:
address = 'Medellin, CO'

geolocator = Nominatim(user_agent="mde_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Medellin City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Medellin City are 6.2443382, -75.573553.


### Insert two new columns for latitude and longitude

In [7]:
dfBarrios.insert(7, "lat", "")
dfBarrios.insert(8, "lng", "")
dfBarrios.head(5)

Unnamed: 0,OBJECTID,CODIGO,NOMBRE,SUBTIPO_BARRIOVEREDA,NOMBRE_COMUNA_CORREGIMIENTO,SHAPEAREA,SHAPELEN,lat,lng
0,1112,510,Tricentenario,1,Castilla,420637.970349,2897.304229,,
1,1113,208,Villa Niza,1,Santa Cruz,143215.327504,1697.303318,,
2,1114,1108,Laureles,1,Laureles Estadio,707014.821267,3847.112683,,
3,1115,1303,Santa Rosa de Lima,1,San Javier,139970.996369,2158.954261,,
4,1116,1206,Santa Lucía,1,La América,275913.740234,3048.703385,,


#### Loop over the rows of our DataFrame and assign each one the **latitude** and **longitude** returned by the GeoPy library

In [8]:
address = 'Medellin,CO'

# Create the geocoder once instead of on every iteration
geolocator = Nominatim(user_agent="mde_explorer")

for i in dfBarrios.index:
    nomBarrio = dfBarrios.loc[i, 'NOMBRE']
    addressFull = nomBarrio + "," + address
    location = geolocator.geocode(addressFull)
    # geocode() returns None when the address is not found,
    # so check the result itself before reading its attributes
    if location is not None:
        latitude = location.latitude
        longitude = location.longitude
    else:
        print(nomBarrio + " has null geospatial data")
        latitude = "NoData"
        longitude = "NoData"

    dfBarrios.loc[i, 'lat'] = latitude
    dfBarrios.loc[i, 'lng'] = longitude
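The same lookup can be written as a small function that takes the geocoder as a parameter, which makes the missing-result case easy to check without network access. This is only a sketch: `add_coordinates` and `fake_geocode` are hypothetical names, and the stand-in geocoder returns the Laureles coordinates shown in this notebook's output.

```python
from types import SimpleNamespace

import pandas as pd


def add_coordinates(df, geocode, city="Medellin,CO"):
    """Return a copy of df with 'lat' and 'lng' columns filled in.

    `geocode` is any callable that takes an address string and returns an
    object with .latitude and .longitude, or None when nothing is found.
    """
    lats, lngs = [], []
    for name in df["NOMBRE"]:
        location = geocode(name + "," + city)
        if location is None:
            # Missing results become NaN so they can be dropped later
            lats.append(float("nan"))
            lngs.append(float("nan"))
        else:
            lats.append(location.latitude)
            lngs.append(location.longitude)
    out = df.copy()
    out["lat"] = lats
    out["lng"] = lngs
    return out


def fake_geocode(address):
    """Hypothetical offline stand-in for a real geocoder."""
    known = {
        "Laureles,Medellin,CO": SimpleNamespace(latitude=6.242, longitude=-75.5958),
    }
    return known.get(address)


demo = pd.DataFrame({"NOMBRE": ["Laureles", "Sin Nombre"]})
located = add_coordinates(demo, fake_geocode).dropna(subset=["lat", "lng"])
print(located)
```

With GeoPy, `geocode` could be `RateLimiter(Nominatim(user_agent="mde_explorer").geocode, min_delay_seconds=1)` from `geopy.extra.rate_limiter`, which spaces out the requests sent to the Nominatim service.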

#### Reset the index of our DataFrame and drop the unnecessary columns

In [9]:
dfBarrios.reset_index(inplace=True)
dfBarrios.drop(['index'], axis=1, inplace=True)
dfBarrios.head(10)

Unnamed: 0,OBJECTID,CODIGO,NOMBRE,SUBTIPO_BARRIOVEREDA,NOMBRE_COMUNA_CORREGIMIENTO,SHAPEAREA,SHAPELEN,lat,lng
0,1112,510,Tricentenario,1,Castilla,420637.970349,2897.304229,6.29107,-75.5663
1,1113,208,Villa Niza,1,Santa Cruz,143215.327504,1697.303318,6.29565,-75.5634
2,1114,1108,Laureles,1,Laureles Estadio,707014.821267,3847.112683,6.242,-75.5958
3,1115,1303,Santa Rosa de Lima,1,San Javier,139970.996369,2158.954261,6.2651,-75.6064
4,1116,1206,Santa Lucía,1,La América,275913.740234,3048.703385,6.25744,-75.6074
5,1117,1205,La Floresta,1,La América,422571.913797,2734.223663,6.25641,-75.6014
6,1118,1617,Las Mercedes,1,Belén,376395.954028,3346.632592,6.23573,-75.6094
7,1119,1016,Boston,1,La Candelaria,535706.432626,3120.282724,6.24804,-75.5576
8,1120,725,Nueva Villa de La Iguaná,1,Robledo,65026.710827,1838.740592,6.25994,-75.5817
9,1121,905,Alejandro Echavarría,1,Buenos Aires,365418.27313,3884.167537,6.23877,-75.5463


<h3 id="analisis">5. Analysis</h3>

#### Initialize the parameters used to generate the map of the city

In [10]:
address = 'Medellin, CO'

geolocator = Nominatim(user_agent="mde_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Medellin City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Medellin City are 6.2443382, -75.573553.


#### Generate the map with its markers

In [11]:
# create a map of Medellin using the latitude and longitude values
map_medellin = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to the map
for lat, lng, comuna, barrio in zip(dfBarrios['lat'], dfBarrios['lng'], dfBarrios['NOMBRE_COMUNA_CORREGIMIENTO'], dfBarrios['NOMBRE']):
    label = '{}, {}'.format(barrio, comuna)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_medellin)  
    
map_medellin

#### Initialize the arguments we will use for the Foursquare API

In [12]:
CLIENT_ID = '3KQ23TNDRE4U545FXH421FR5OEUAJ0UU4PKVNC4XNHRE3LKM' # your Foursquare ID
CLIENT_SECRET = 'L1MSXE30SXDOQ4CNBTSMIRK2ZAXX5NZ1S3S2IRU1OTMMEBEA' # your Foursquare secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # maximum number of results requested from the Foursquare API

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 3KQ23TNDRE4U545FXH421FR5OEUAJ0UU4PKVNC4XNHRE3LKM
CLIENT_SECRET:L1MSXE30SXDOQ4CNBTSMIRK2ZAXX5NZ1S3S2IRU1OTMMEBEA
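Because API credentials are sensitive, one common alternative is to read them from environment variables and mask them when printing. The sketch below shows one way this might look. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, the helper functions, and the placeholder values are all hypothetical.

```python
import os


def load_foursquare_credentials(env=None):
    """Read the Foursquare credentials from environment-style variables.

    The variable names are hypothetical; any naming scheme would work.
    """
    env = os.environ if env is None else env
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Foursquare credentials are not set")
    return client_id, client_secret


def mask(value, visible=4):
    """Show only the first few characters of a secret when printing it."""
    return value[:visible] + "*" * max(len(value) - visible, 0)


# Demo with placeholder values instead of real credentials
demo_env = {
    "FOURSQUARE_CLIENT_ID": "demo-id-123",
    "FOURSQUARE_CLIENT_SECRET": "demo-secret-456",
}
demo_id, demo_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID: " + mask(demo_id))
print("CLIENT_SECRET: " + mask(demo_secret))
```

Calling `load_foursquare_credentials()` without arguments reads from the real process environment, so the secret never needs to appear in the notebook.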


#### Build a function that retrieves the nearby venues for each neighborhood in the city

In [13]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # build the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
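The function above assumes every response contains a group and every venue has at least one category. A more defensive way to parse the same response structure might look like the sketch below. `extract_venues` is a hypothetical helper, and the sample payload is hand-built to mimic the shape the function reads (`response`, `groups`, `items`, `venue`). Its second venue is invented to show the no-category case.

```python
def extract_venues(payload, name, lat, lng):
    """Turn one explore-style response into a list of row tuples.

    Returns an empty list when the response has no groups, and uses None
    as the category when a venue lists no categories.
    """
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        rows.append((
            name,
            lat,
            lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows


# Hand-built sample payload; the second venue is hypothetical
sample_payload = {
    "response": {"groups": [{"items": [
        {"venue": {"name": "Parque Juanes de la Paz",
                   "location": {"lat": 6.292663, "lng": -75.568673},
                   "categories": [{"name": "Recreation Center"}]}},
        {"venue": {"name": "Hypothetical venue without category",
                   "location": {"lat": 6.29, "lng": -75.56},
                   "categories": []}},
    ]}]}
}

rows = extract_venues(sample_payload, "Tricentenario", 6.29107, -75.566325)
empty_rows = extract_venues({"response": {}}, "Tricentenario", 6.29107, -75.566325)
print(rows)
```

Separating the parsing from the HTTP request also makes it possible to test the parsing logic without calling the API.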

##### Call the function we built

In [15]:
Medellin_venues = getNearbyVenues(names=dfBarrios['NOMBRE'],
                                   latitudes=dfBarrios['lat'],
                                   longitudes=dfBarrios['lng']
                                  )

Tricentenario
Villa Niza
Laureles
Santa Rosa de Lima
Santa Lucía
La Floresta
Las Mercedes
Boston
Nueva Villa de La Iguaná
Alejandro Echavarría
Picacho
Moscú No.2
Santo Domingo Savio No.1
El Danubio
La Avanzada
Perpetuo Socorro
San Lucas
Calasanz Parte Alta
Playón de Los Comuneros
El Chagualo
Cristo Rey
Los Conquistadores
Asomadera No.3
La Gloria
El Pesebre
Lalinde
Manrique Central No.1
Las Independencias
Parque Juan Pablo II
Moscú No.1
Bosques de San Pablo
Los Mangos
Los Cerros El Vergel
Miravalle
El Rodeo
La Milagrosa
El Castillo
San José La Cima No.1
El Raizal
Villa Guadalupe
El Poblado
Trinidad
El Pomar
Oriente
La Isla
La Sierra
María Cano-Carambolas
Belén
La Pilarica
El Corazón
Calle Nueva
La Frontera
Tejelo
Lorena
Miraflores
Los Ángeles
Alejandría
Metropolitano
La Francia
Cuarta Brigada
Los Naranjos
Aures No.1
Cataluña
Carpinelo
Andalucía
La Castellana
Carlos E. Restrepo
Robledo
Juan XXIII La Quiebra
Moravia
San Benito
Veinte de Julio
Altamira
Las Lomas No.2
Santander
Las Playas
S

#### Check the size of the new DataFrame and its first rows

In [16]:
print(Medellin_venues.shape)
Medellin_venues.head()

(2605, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Tricentenario,6.29107,-75.566325,METRO - Estacion Tricentenario,6.290542,-75.564733,Metro Station
1,Tricentenario,6.29107,-75.566325,Parque Juanes de la Paz,6.292663,-75.568673,Recreation Center
2,Tricentenario,6.29107,-75.566325,Club De Tenis El Bosque,6.293351,-75.568521,Tennis Court
3,Tricentenario,6.29107,-75.566325,"Parche Tricen,Tienda mixta",6.29267,-75.564551,Grocery Store
4,Tricentenario,6.29107,-75.566325,Indu Aires LS,6.288768,-75.56742,Construction & Landscaping


Let's check how many venues were returned for each neighborhood

In [17]:
Medellin_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Aldea Pablo VI,2,2,2,2,2,2
Alejandro Echavarría,5,5,5,5,5,5
Alejandría,16,16,16,16,16,16
Alfonso López,1,1,1,1,1,1
Altamira,3,3,3,3,3,3
Altos del Poblado,4,4,4,4,4,4
Andalucía,5,5,5,5,5,5
Antonio Nariño,6,6,6,6,6,6
Aranjuez,4,4,4,4,4,4
Asomadera No.1,8,8,8,8,8,8


#### Find how many unique categories there are among all the returned venues

In [18]:
print('There are {} unique categories.'.format(len(Medellin_venues['Venue Category'].unique())))

There are 241 unique categories.


### Analyze each neighborhood

In [19]:
# one-hot encoding
Medellin_onehot = pd.get_dummies(Medellin_venues[['Venue Category']], prefix="", prefix_sep="")

# add the neighborhood column back to the DataFrame
Medellin_onehot['Neighbourhood'] = Medellin_venues['Neighborhood'] 

# move the neighborhood column to the first position
fixed_columns = [Medellin_onehot.columns[-1]] + list(Medellin_onehot.columns[:-1])
Medellin_onehot = Medellin_onehot[fixed_columns]

Medellin_onehot.head()

Unnamed: 0,Neighbourhood,Advertising Agency,Airport,Airport Service,Airport Terminal,American Restaurant,Amphitheater,Aquarium,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Auditorium,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Bakery,Bar,Baseball Field,Basketball Court,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Betting Shop,Big Box Store,Bike Rental / Bike Share,Bike Trail,Bistro,Boarding House,Bookstore,Botanical Garden,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Cable Car,Café,Campground,Caribbean Restaurant,Casino,Cemetery,Cheese Shop,Chinese Restaurant,Clothing Store,Club House,Cocktail Bar,Coffee Shop,Colombian Restaurant,Comfort Food Restaurant,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Electronics Store,Event Service,Eye Doctor,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Financial or Legal Service,Fish & Chips Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Garden Center,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gluten-free Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Gymnastics Gym,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Home Service,Hostel,Hot Dog Joint,Hotel,Housing Development,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Intersection,Italian Restaurant,Japanese Restaurant,Juice Bar,Karaoke Bar,Kids Store,Korean Restaurant,Lake,Latin American Restaurant,Lingerie Store,Liquor Store,Lounge,Luggage Store,Market,Medical Supply Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,Nature Preserve,Nightclub,Noodle House,Notary,Nursery School,Optical Shop,Other Great Outdoors,Other Nightlife,Other Repair Shop,Outdoor Event Space,Outdoor Supply Store,Paella Restaurant,Paper / Office Supplies Store,Park,Pastry Shop,Pedestrian Plaza,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Pie Shop,Pier,Pizza Place,Planetarium,Playground,Plaza,Poke Place,Pool,Print Shop,Pub,Public Art,Real Estate Office,Recreation Center,Rental Car Location,Rental Service,Resort,Rest Area,Restaurant,River,Road,Rock Club,Salad Place,Salon / Barbershop,Salsa Club,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shoe Store,Shopping Mall,Shopping Plaza,Skating Rink,Snack Place,Soccer Field,Soccer Stadium,South American Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Storage Facility,Street Art,Supermarket,Sushi Restaurant,TV Station,Taco Place,Tapas Restaurant,Tea Room,Tennis Court,Theater,Theme Park,Theme Restaurant,Tour Provider,Toy / Game Store,Track,Track Stadium,Tram Station,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Warehouse Store,Water Park,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo
0,Tricentenario,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Tricentenario,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Tricentenario,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Tricentenario,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Tricentenario,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


Let's check the new size of the DataFrame

In [20]:
Medellin_onehot.shape

(2605, 242)

#### Group the rows by neighborhood, taking the mean of the frequency of occurrence of each category

In [21]:
Medellin_grouped = Medellin_onehot.groupby('Neighbourhood').mean().reset_index()
Medellin_grouped

Unnamed: 0,Neighbourhood,Advertising Agency,Airport,Airport Service,Airport Terminal,American Restaurant,Amphitheater,Aquarium,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Auditorium,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Bakery,Bar,Baseball Field,Basketball Court,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Betting Shop,Big Box Store,Bike Rental / Bike Share,Bike Trail,Bistro,Boarding House,Bookstore,Botanical Garden,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Cable Car,Café,Campground,Caribbean Restaurant,Casino,Cemetery,Cheese Shop,Chinese Restaurant,Clothing Store,Club House,Cocktail Bar,Coffee Shop,Colombian Restaurant,Comfort Food Restaurant,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Electronics Store,Event Service,Eye Doctor,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Financial or Legal Service,Fish & Chips Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Stand,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Garden Center,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gluten-free Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Gymnastics Gym,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Home Service,Hostel,Hot Dog Joint,Hotel,Housing Development,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Intersection,Italian Restaurant,Japanese Restaurant,Juice Bar,Karaoke Bar,Kids Store,Korean Restaurant,Lake,Latin American Restaurant,Lingerie Store,Liquor Store,Lounge,Luggage Store,Market,Medical Supply Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,Nature Preserve,Nightclub,Noodle House,Notary,Nursery School,Optical Shop,Other Great Outdoors,Other Nightlife,Other Repair Shop,Outdoor Event Space,Outdoor Supply Store,Paella Restaurant,Paper / Office Supplies Store,Park,Pastry Shop,Pedestrian Plaza,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Pie Shop,Pier,Pizza Place,Planetarium,Playground,Plaza,Poke Place,Pool,Print Shop,Pub,Public Art,Real Estate Office,Recreation Center,Rental Car Location,Rental Service,Resort,Rest Area,Restaurant,River,Road,Rock Club,Salad Place,Salon / Barbershop,Salsa Club,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shoe Store,Shopping Mall,Shopping Plaza,Skating Rink,Snack Place,Soccer Field,Soccer Stadium,South American Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Storage Facility,Street Art,Supermarket,Sushi Restaurant,TV Station,Taco Place,Tapas Restaurant,Tea Room,Tennis Court,Theater,Theme Park,Theme Restaurant,Tour Provider,Toy / Game Store,Track,Track Stadium,Tram Station,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Warehouse Store,Water Park,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo
0,Aldea Pablo VI,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Alejandro Echavarría,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Alejandría,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1875,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Alfonso López,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Altamira,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Altos del Poblado,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Andalucía,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Antonio Nariño,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0
8,Aranjuez,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Asomadera No.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


Let's confirm the size of the DataFrame.

In [22]:
Medellin_grouped.shape

(231, 242)

#### Let's print each neighborhood along with its 5 most common venues

In [23]:
num_top_venues = 5

for hood in Medellin_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = Medellin_grouped[Medellin_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Aldea Pablo VI----
                        venue  freq
0  Construction & Landscaping   0.5
1                  Restaurant   0.5
2          Advertising Agency   0.0
3                 Pastry Shop   0.0
4                Noodle House   0.0


----Alejandro Echavarría----
                venue  freq
0      Ice Cream Shop   0.2
1           Multiplex   0.2
2       Shopping Mall   0.2
3  Athletics & Sports   0.2
4        Tram Station   0.2


----Alejandría----
              venue  freq
0             Hotel  0.19
1     Shopping Mall  0.12
2         BBQ Joint  0.06
3       Coffee Shop  0.06
4  Sushi Restaurant  0.06


----Alfonso López----
                venue  freq
0   Electronics Store   1.0
1  Advertising Agency   0.0
2                Park   0.0
3           Nightclub   0.0
4        Noodle House   0.0


----Altamira----
                        venue  freq
0  Construction & Landscaping  0.33
1              Breakfast Spot  0.33
2                      Bakery  0.33
3          Advertising Agency 

#### Let's put this into a DataFrame
First, let's write a function that sorts the venues in descending order of frequency.

In [24]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
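
As a quick sanity check, the function can be tried on a single hand-made row. This is only a sketch: the row below is a toy example with made-up categories and frequencies, not data taken from `Medellin_grouped`.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # Skip the first entry (the neighborhood name) and sort the frequencies
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Toy row: neighborhood name followed by venue frequencies (illustrative values)
toy_row = pd.Series({
    'Neighbourhood': 'Toy Barrio',
    'Bakery': 0.2,
    'Pizza Place': 0.5,
    'Park': 0.3,
    'Zoo': 0.0,
})

top_two = list(return_most_common_venues(toy_row, 2))
print(top_two)
```

When several categories share the same frequency (most often 0.0), their relative order depends on the sort implementation, so trailing entries in a top-10 list may carry no real information about the neighborhood.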

Let's build the new DataFrame and show the top 10 venues for each neighborhood.

In [25]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create the columns according to the number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the third use the generic 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Medellin_grouped['Neighbourhood']

for ind in np.arange(Medellin_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Medellin_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Aldea Pablo VI,Construction & Landscaping,Restaurant,Donut Shop,Food Service,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant
1,Alejandro Echavarría,Ice Cream Shop,Tram Station,Athletics & Sports,Shopping Mall,Multiplex,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Event Service
2,Alejandría,Hotel,Shopping Mall,Restaurant,Pizza Place,Frozen Yogurt Shop,Mediterranean Restaurant,Sushi Restaurant,Bar,BBQ Joint,Café
3,Alfonso López,Electronics Store,Zoo,Ice Cream Shop,Food Service,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant
4,Altamira,Construction & Landscaping,Breakfast Spot,Bakery,Event Service,Food Service,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service
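
The column names above rely on a three-element suffix list plus a fallback to `th`, which is correct for 1 through 20 but would yield `21th` for larger values. A minimal alternative sketch is shown below; the helper name `ordinal` is hypothetical and not part of the original notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    # Numbers ending in 10 to 20 (including 11, 12 and 13) always take 'th'
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns[:4])
```

For `num_top_venues = 10` this produces the same headers as the loop in the notebook, so it is only worth swapping in if more than 20 venues per neighborhood are ever needed.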


### Clustering the neighborhoods with **K-Means**
Let's run _k_-means to group the neighborhoods into 5 clusters.

In [26]:
# set the number of clusters
kclusters = 5

# drop the name column so that only the venue frequencies are clustered
Medellin_grouped_clustering = Medellin_grouped.drop(columns='Neighbourhood')

# run k-means
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Medellin_grouped_clustering)

# check the cluster labels generated for the first rows of the dataframe
kmeans.labels_[0:10]

array([0, 2, 2, 2, 2, 2, 3, 2, 2, 2])
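
Before interpreting the clusters it can help to see how many neighborhoods fall into each one. The sketch below counts labels with `np.bincount`; it uses only the ten labels printed above as sample input, whereas in the notebook the full `kmeans.labels_` array would be passed instead.

```python
import numpy as np

# First ten labels as printed above; in the notebook this would be kmeans.labels_
labels = np.array([0, 2, 2, 2, 2, 2, 3, 2, 2, 2])
kclusters = 5

# minlength keeps a slot for clusters that do not appear in this sample
cluster_sizes = np.bincount(labels, minlength=kclusters)
for cluster, size in enumerate(cluster_sizes):
    print('Cluster {}: {} neighborhoods'.format(cluster, size))
```

Even in this small sample, label 2 dominates, which suggests the clusters may be quite unbalanced.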

Let's build a new DataFrame that includes the cluster label as well as the 10 most common venues of each neighborhood.

In [28]:
# add the cluster labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

Medellin_merged = dfBarrios

# join dfBarrios with neighborhoods_venues_sorted to add the cluster label and top venues of each neighborhood
Medellin_merged = Medellin_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='NOMBRE')

Medellin_merged.head() # check the last columns

Unnamed: 0,OBJECTID,CODIGO,NOMBRE,SUBTIPO_BARRIOVEREDA,NOMBRE_COMUNA_CORREGIMIENTO,SHAPEAREA,SHAPELEN,lat,lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,1112,510,Tricentenario,1,Castilla,420637.970349,2897.304229,6.29107,-75.5663,2.0,Construction & Landscaping,Grocery Store,Recreation Center,Tennis Court,Metro Station,Zoo,Farm,Event Service,Eye Doctor,Falafel Restaurant
1,1113,208,Villa Niza,1,Santa Cruz,143215.327504,1697.303318,6.29565,-75.5634,2.0,Grocery Store,Real Estate Office,Furniture / Home Store,Farm,Health & Beauty Service,Zoo,Farmers Market,Event Service,Eye Doctor,Falafel Restaurant
2,1114,1108,Laureles,1,Laureles Estadio,707014.821267,3847.112683,6.242,-75.5958,2.0,Bar,Peruvian Restaurant,Pizza Place,Burger Joint,Italian Restaurant,Gym / Fitness Center,Bakery,Hotel,Mexican Restaurant,Food Stand
3,1115,1303,Santa Rosa de Lima,1,San Javier,139970.996369,2158.954261,6.2651,-75.6064,,,,,,,,,,,
4,1116,1206,Santa Lucía,1,La América,275913.740234,3048.703385,6.25744,-75.6074,2.0,Business Service,Supermarket,Soccer Field,Athletics & Sports,Metro Station,Eye Doctor,Falafel Restaurant,Farm,Farmers Market,Zoo
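
The empty row for Santa Rosa de Lima above is a consequence of the join: neighborhoods with no matching entry in the venues table keep their row but receive `NaN` in every joined column. A minimal sketch with toy stand-ins for `dfBarrios` and `neighborhoods_venues_sorted` (the frames and some values below are illustrative assumptions):

```python
import pandas as pd

# Toy stand-ins for dfBarrios and neighborhoods_venues_sorted (illustrative values)
barrios = pd.DataFrame({
    'NOMBRE': ['Laureles', 'Santa Rosa de Lima', 'Boston'],
    'lat': [6.242, 6.2651, 6.25],
})
venues = pd.DataFrame({
    'Neighborhood': ['Laureles', 'Boston'],
    'Cluster Labels': [2, 2],
    '1st Most Common Venue': ['Bar', 'Pizza Place'],
})

# DataFrame.join performs a left join by default, keeping every row of `barrios`
merged = barrios.join(venues.set_index('Neighborhood'), on='NOMBRE')

# Rows without a match get NaN, which also turns the integer labels into floats
unmatched = merged.loc[merged['Cluster Labels'].isna(), 'NOMBRE'].tolist()
print(unmatched)
print(merged['Cluster Labels'].dtype)
```

This also explains why the labels are displayed as `2.0` rather than `2` until they are cast back to integers.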


Before visualizing the clusters, let's remove some unnecessary columns.

In [29]:
Medellin_merged.drop(['OBJECTID', 'CODIGO', 'SUBTIPO_BARRIOVEREDA', 'SHAPEAREA', 'SHAPELEN' ], axis = 'columns', inplace=True)
Medellin_merged.head()

Unnamed: 0,NOMBRE,NOMBRE_COMUNA_CORREGIMIENTO,lat,lng,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Tricentenario,Castilla,6.29107,-75.5663,2.0,Construction & Landscaping,Grocery Store,Recreation Center,Tennis Court,Metro Station,Zoo,Farm,Event Service,Eye Doctor,Falafel Restaurant
1,Villa Niza,Santa Cruz,6.29565,-75.5634,2.0,Grocery Store,Real Estate Office,Furniture / Home Store,Farm,Health & Beauty Service,Zoo,Farmers Market,Event Service,Eye Doctor,Falafel Restaurant
2,Laureles,Laureles Estadio,6.242,-75.5958,2.0,Bar,Peruvian Restaurant,Pizza Place,Burger Joint,Italian Restaurant,Gym / Fitness Center,Bakery,Hotel,Mexican Restaurant,Food Stand
3,Santa Rosa de Lima,San Javier,6.2651,-75.6064,,,,,,,,,,,
4,Santa Lucía,La América,6.25744,-75.6074,2.0,Business Service,Supermarket,Soccer Field,Athletics & Sports,Metro Station,Eye Doctor,Falafel Restaurant,Farm,Farmers Market,Zoo


Data cleaning: rename columns that contain spaces, remove null values, etc.

In [36]:
Medellin_merged.rename(columns = {'Cluster Labels':'ClusterLabels'}, inplace = True)
Medellin_merged.ClusterLabels = Medellin_merged.ClusterLabels.fillna(9)
Medellin_merged.ClusterLabels = Medellin_merged.ClusterLabels.astype(int)
Medellin_merged.head(5)

Unnamed: 0,NOMBRE,NOMBRE_COMUNA_CORREGIMIENTO,lat,lng,ClusterLabels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Tricentenario,Castilla,6.29107,-75.5663,2,Construction & Landscaping,Grocery Store,Recreation Center,Tennis Court,Metro Station,Zoo,Farm,Event Service,Eye Doctor,Falafel Restaurant
1,Villa Niza,Santa Cruz,6.29565,-75.5634,2,Grocery Store,Real Estate Office,Furniture / Home Store,Farm,Health & Beauty Service,Zoo,Farmers Market,Event Service,Eye Doctor,Falafel Restaurant
2,Laureles,Laureles Estadio,6.242,-75.5958,2,Bar,Peruvian Restaurant,Pizza Place,Burger Joint,Italian Restaurant,Gym / Fitness Center,Bakery,Hotel,Mexican Restaurant,Food Stand
3,Santa Rosa de Lima,San Javier,6.2651,-75.6064,9,,,,,,,,,,
4,Santa Lucía,La América,6.25744,-75.6074,2,Business Service,Supermarket,Soccer Field,Athletics & Sports,Metro Station,Eye Doctor,Falafel Restaurant,Farm,Farmers Market,Zoo


In [39]:
indexNames = Medellin_merged[ (Medellin_merged['ClusterLabels'] == 9)].index
Medellin_merged.drop(indexNames, inplace=True)
Medellin_merged.head(5)

Unnamed: 0,NOMBRE,NOMBRE_COMUNA_CORREGIMIENTO,lat,lng,ClusterLabels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Tricentenario,Castilla,6.29107,-75.5663,2,Construction & Landscaping,Grocery Store,Recreation Center,Tennis Court,Metro Station,Zoo,Farm,Event Service,Eye Doctor,Falafel Restaurant
1,Villa Niza,Santa Cruz,6.29565,-75.5634,2,Grocery Store,Real Estate Office,Furniture / Home Store,Farm,Health & Beauty Service,Zoo,Farmers Market,Event Service,Eye Doctor,Falafel Restaurant
2,Laureles,Laureles Estadio,6.242,-75.5958,2,Bar,Peruvian Restaurant,Pizza Place,Burger Joint,Italian Restaurant,Gym / Fitness Center,Bakery,Hotel,Mexican Restaurant,Food Stand
4,Santa Lucía,La América,6.25744,-75.6074,2,Business Service,Supermarket,Soccer Field,Athletics & Sports,Metro Station,Eye Doctor,Falafel Restaurant,Farm,Farmers Market,Zoo
5,La Floresta,La América,6.25641,-75.6014,2,Metro Station,Burger Joint,Hotel,Park,Restaurant,Breakfast Spot,Supermarket,Zoo,Falafel Restaurant,Farm
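
The cells above mark unlabeled neighborhoods with the sentinel value 9 and then drop them. An equivalent result can be sketched without a sentinel by dropping the missing labels directly; the toy frame below is an illustrative assumption, not the notebook's data.

```python
import numpy as np
import pandas as pd

# Toy frame with one neighborhood that received no cluster label (illustrative values)
merged = pd.DataFrame({
    'NOMBRE': ['Laureles', 'Santa Rosa de Lima', 'Boston'],
    'ClusterLabels': [2.0, np.nan, 2.0],
})

# Drop unlabeled rows first, then cast the remaining labels to integers
cleaned = merged.dropna(subset=['ClusterLabels']).copy()
cleaned['ClusterLabels'] = cleaned['ClusterLabels'].astype(int)
print(cleaned)
```

Avoiding the sentinel removes the risk of it colliding with a real label if `kclusters` were ever raised to 10 or more.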


After adjusting the format of our data, we proceed to visualize the resulting clusters.

In [43]:
# create the map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=13)

# set the color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Medellin_merged['lat'], Medellin_merged['lng'], Medellin_merged['NOMBRE'], Medellin_merged['ClusterLabels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
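
One detail of the marker loop is worth spelling out: the labels run from 0 to 4, and the color is looked up with `cluster-1`. Because Python allows negative indices, label 0 wraps around to the last color instead of raising an error. The sketch below illustrates this with a hypothetical five-color palette standing in for the `rainbow` list (the hex values are assumptions).

```python
# Hypothetical five-color palette standing in for the `rainbow` list
palette = ['#8000ff', '#00b5eb', '#80ffb4', '#ffb360', '#ff0000']

for cluster in range(len(palette)):
    # Python's negative indexing makes cluster 0 wrap around to the last color
    print(cluster, palette[cluster - 1])

color_by_cluster = {cluster: palette[cluster - 1] for cluster in range(len(palette))}
```

Each cluster still gets a distinct color, so the map is readable; indexing with `cluster` directly would simply give a more predictable label-to-color order.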

#### Let's examine the resulting clusters

**Cluster 1** (`ClusterLabels == 0`)

In [46]:
Medellin_merged.loc[Medellin_merged['ClusterLabels'] == 0, Medellin_merged.columns[[0] + list(range(5, Medellin_merged.shape[1]))]]

Unnamed: 0,NOMBRE,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
43,Oriente,Construction & Landscaping,French Restaurant,Cosmetics Shop,Creperie,Food Service,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service
63,Carpinelo,Construction & Landscaping,Restaurant,Donut Shop,Food Service,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant
158,Brasilia,Restaurant,Latin American Restaurant,Zoo,Donut Shop,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant
218,Aldea Pablo VI,Construction & Landscaping,Restaurant,Donut Shop,Food Service,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant
225,Córdoba,Construction & Landscaping,Ice Cream Shop,Food Service,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market


**Cluster 2** (`ClusterLabels == 1`)

In [47]:
Medellin_merged.loc[Medellin_merged['ClusterLabels'] == 1, Medellin_merged.columns[[0] + list(range(5, Medellin_merged.shape[1]))]]

Unnamed: 0,NOMBRE,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,Moscú No.2,Construction & Landscaping,Rental Service,Gym,Park,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Food,Doner Restaurant,Farm
24,El Pesebre,Advertising Agency,Gym / Fitness Center,Mountain,Park,Cupcake Shop,Event Service,Food Court,Food & Drink Shop,Food,Creperie
30,Bosques de San Pablo,Pizza Place,Bakery,Park,Zoo,Electronics Store,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service
37,San José La Cima No.1,Park,Zoo,Electronics Store,Food Service,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant
39,Villa Guadalupe,Construction & Landscaping,Rental Service,Men's Store,Park,Electronics Store,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service
46,María Cano-Carambolas,Park,Grocery Store,Donut Shop,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market
48,La Pilarica,Park,Food,BBQ Joint,Grocery Store,Donut Shop,Food Court,Food & Drink Shop,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant
52,Tejelo,Ice Cream Shop,Burger Joint,Park,Electronics Store,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant
67,Robledo,Park,Motorcycle Shop,Gym / Fitness Center,Donut Shop,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant
79,Barrio Caicedo,Museum,Comfort Food Restaurant,Food Court,Park,Bar,Zoo,Farmers Market,Falafel Restaurant,Farm,Financial or Legal Service


**Cluster 3** (`ClusterLabels == 2`)

In [48]:
Medellin_merged.loc[Medellin_merged['ClusterLabels'] == 2, Medellin_merged.columns[[0] + list(range(5, Medellin_merged.shape[1]))]]

Unnamed: 0,NOMBRE,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Tricentenario,Construction & Landscaping,Grocery Store,Recreation Center,Tennis Court,Metro Station,Zoo,Farm,Event Service,Eye Doctor,Falafel Restaurant
1,Villa Niza,Grocery Store,Real Estate Office,Furniture / Home Store,Farm,Health & Beauty Service,Zoo,Farmers Market,Event Service,Eye Doctor,Falafel Restaurant
2,Laureles,Bar,Peruvian Restaurant,Pizza Place,Burger Joint,Italian Restaurant,Gym / Fitness Center,Bakery,Hotel,Mexican Restaurant,Food Stand
4,Santa Lucía,Business Service,Supermarket,Soccer Field,Athletics & Sports,Metro Station,Eye Doctor,Falafel Restaurant,Farm,Farmers Market,Zoo
5,La Floresta,Metro Station,Burger Joint,Hotel,Park,Restaurant,Breakfast Spot,Supermarket,Zoo,Falafel Restaurant,Farm
6,Las Mercedes,Pizza Place,Theater,Food & Drink Shop,Fast Food Restaurant,Sandwich Place,Café,Zoo,Food,Fish & Chips Shop,Financial or Legal Service
7,Boston,Pizza Place,Plaza,Theater,Park,Farmers Market,Salsa Club,Steakhouse,Bar,Caribbean Restaurant,BBQ Joint
8,Nueva Villa de La Iguaná,Seafood Restaurant,Housing Development,Gym,Shopping Mall,Fast Food Restaurant,BBQ Joint,Latin American Restaurant,Soccer Stadium,Cocktail Bar,Hotel
9,Alejandro Echavarría,Ice Cream Shop,Tram Station,Athletics & Sports,Shopping Mall,Multiplex,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Event Service
10,Picacho,Colombian Restaurant,Zoo,Electronics Store,Food Service,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant


**Cluster 4** (`ClusterLabels == 3`)

In [49]:
Medellin_merged.loc[Medellin_merged['ClusterLabels'] == 3, Medellin_merged.columns[[0] + list(range(5, Medellin_merged.shape[1]))]]

Unnamed: 0,NOMBRE,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
45,La Sierra,Cable Car,Convenience Store,Food Service,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market
64,Andalucía,Cable Car,Real Estate Office,Furniture / Home Store,Metro Station,Zoo,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant
68,Juan XXIII La Quiebra,Cable Car,Convenience Store,Food Service,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market
81,Santa Margarita,Cable Car,Home Service,Zoo,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market
84,Villa Turbay,Cable Car,Campground,Zoo,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market
141,Olaya Herrera,Cable Car,Convenience Store,Food Service,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market


**Cluster 5** (`ClusterLabels == 4`)

In [50]:
Medellin_merged.loc[Medellin_merged['ClusterLabels'] == 4, Medellin_merged.columns[[0] + list(range(5, Medellin_merged.shape[1]))]]

Unnamed: 0,NOMBRE,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
168,Las Estancias,Shoe Store,Zoo,Food Stand,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market
180,Barrios de Jesús,Shoe Store,Zoo,Food Stand,Food Court,Food & Drink Shop,Food,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market
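
Since the goal of the project is to pick a location for a pizzeria, one natural follow-up is to list the neighborhoods where `Pizza Place` already appears among the most common venues. The sketch below runs on a few rows copied from the tables above, trimmed to the first three venue columns; in the notebook the same filter could be applied to `Medellin_merged`.

```python
import pandas as pd

# A few rows copied from the tables above, trimmed to the first three venue columns
sample = pd.DataFrame({
    'NOMBRE': ['Tricentenario', 'Laureles', 'Las Mercedes', 'Boston'],
    '1st Most Common Venue': ['Construction & Landscaping', 'Bar', 'Pizza Place', 'Pizza Place'],
    '2nd Most Common Venue': ['Grocery Store', 'Peruvian Restaurant', 'Theater', 'Plaza'],
    '3rd Most Common Venue': ['Recreation Center', 'Pizza Place', 'Food & Drink Shop', 'Theater'],
})

# Select every "Most Common Venue" column and flag rows that mention a pizza place
venue_columns = sample.filter(like='Most Common Venue')
has_pizza = venue_columns.eq('Pizza Place').any(axis=1)
pizza_neighborhoods = sample.loc[has_pizza, 'NOMBRE'].tolist()
print(pizza_neighborhoods)
```

Whether existing pizza places signal proven demand or saturated competition is a judgment call that the clustering alone does not settle.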


### Thank you!!