# Week 5 

# Peer-graded Assignment: Capstone Project - The Battle of Neighborhoods (First Week)


## Educational Objectives

This notebook will be used for the final project. In this first week, I present the introduction, background, motivation, scope, data, and methodology. This capstone project was carried out by Conrado Montealegre.



<h1 id="Title: Comparison of two small cities (towns) in the Northern Sierra of Puebla - Zacatlán vs Chignahuapan">Title: Comparison of two small cities (towns) in the Northern Sierra of Puebla - Zacatlán vs Chignahuapan</h1>

<h2 id="1. Introduction (motivations and justification, business/social problem)">1. Introduction (motivations and justification, business/social problem)</h2>

In this work, I compare two small cities (towns) in rural Mexico, located in the so-called Northern Sierra of Puebla in the center of the country: Zacatlan and Chignahuapan. Both lie in a transition zone between the central Mexican plateau and the coastal zone of the Gulf of Mexico. This transition is characterized by a complicated orography of mountains and ravines that hinders access to the more isolated rural population. Both cities are municipal seats, so although they are reasonably developed, part of their population lives in small, isolated rural communities.

What makes the comparison interesting is that, although the two cities are close neighbors, they are not twins. Their successive governments have given each a particular path of development. Zacatlan has historically been a hub on the trade routes between central Mexico (Mexico City and Puebla) and the coastal zone of the Gulf of Mexico (Veracruz). In addition, Zacatlan's authorities have lately been trying to turn the city into an attractive tourist destination, so its downtown has been transformed from a typical rural Mexican town into a place with pleasant amenities intended to attract foreign tourists.

Chignahuapan's government, on the other hand, has promoted industrial and economic development. Chignahuapan is also trying to attract tourism, but of a different kind: visitors who come to purchase local products. Chief among these are "Christmas spheres" and other Christmas decorations. This economic activity is perhaps one of the most important of its kind in Mexico.

The purpose of this work is therefore to explore these similarities and differences. I will try to establish how the local population's way of life is affected by the different development plans of two cities close enough to be compared. I expect to find evidence that the number and type of amenities are a consequence of the different economic activities.

![Sierra](images/Sierra.jpg)

<h2 id="2. Data">2. Data</h2>

For this project, I will use 

* The data available on the internet, both on commercial websites and on government websites
* The official data from the Mexican National Institute of Statistics and Geography (INEGI, by its acronym in Spanish).
* Foursquare app for collecting information about the different landmarks across Zacatlan and Chignahuapan.







![Sierra](images/INEGI.jpg)

<h2 id="3. Methodology">3. Methodology</h2>

The methodology is the same one we have used throughout the course. First, I will download, process, and clean the available data. Next, I will perform an exploratory analysis of the collected data to gain insight into it. At this stage, I will visualize the data together with geodata.

I will use unsupervised analysis (clustering) to better identify any patterns in the available geodata. I will try to identify different classes of landmarks in order to characterize the socioeconomic orientation of both towns.

Finally, I will analyze and discuss the findings, trying to establish whether the different economic activities of these cities really determine the class and number of amenities and other landmarks.



![Sierra](images/Zacatlan.jpg)

![Sierra](images/Chignahuapan.jpg)

<h3 id="3.1. Getting/Creating Data">3.1. Getting/Creating Data</h3>

### Importing libraries

In [3]:
from config import CLIENT_ID, CLIENT_SECRET, ACCESS_TOKEN, VERSION, LIMIT, API_KEY
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
import numpy as np
from geopy.geocoders import Nominatim
import requests
import pickle
import folium

### Getting coordinates for Zacatlan and Chignahuapan

In [4]:
locator = Nominatim(user_agent="myGeocoder")

# Getting coordinates for Zacatlan
#str_Zacatlan = 'Zacatlan, Puebla, Mexico'
str_Zacatlan = 'Zacatlan, Puebla, Mexico'
zacatlan_loc = locator.geocode(str_Zacatlan)
print("{} Latitude = {}, Longitude = {}".format(str_Zacatlan,
zacatlan_loc.latitude, zacatlan_loc.longitude))
zacatlan_center = [zacatlan_loc.latitude, zacatlan_loc.longitude]

# Getting coordinates for Chignahuapan
str_Chignahuapan = 'Chignahuapan, Puebla, Mexico'
chignahuapan_loc = locator.geocode(str_Chignahuapan)
print("{} Latitude = {}, Longitude = {}".format(str_Chignahuapan,
chignahuapan_loc.latitude, chignahuapan_loc.longitude))
chignahuapan_center = [chignahuapan_loc.latitude, chignahuapan_loc.longitude]

Zacatlan, Puebla, Mexico Latitude = 19.96059515, Longitude = -98.00572136341704
Chignahuapan, Puebla, Mexico Latitude = 19.8213915, Longitude = -98.11775714603566


### I had to manually define the "center" of Zacatlan and Chignahuapan because the ones obtained from Nominatim are badly located.

In [5]:
zacatlan_center2 = [19.936115, -97.960550]
#chignahuapan_center2 = [19.836236, -98.030375]
chignahuapan_center2 = [19.835442, -98.031382]
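
To put a number on how far the Nominatim results are from the manually chosen centers, the great-circle (haversine) distance can be computed with the standard library alone. This is only a rough sketch: the function name `haversine_km` is my own, and the mean Earth radius of 6371 km is an assumed round value.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, earth_radius_km=6371.0):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

# Coordinates copied from the Nominatim output and the manual centers above.
zac_offset_km = haversine_km(19.96059515, -98.00572136341704, 19.936115, -97.960550)
chi_offset_km = haversine_km(19.8213915, -98.11775714603566, 19.835442, -98.031382)
print('Zacatlan offset: {:.2f} km'.format(zac_offset_km))
print('Chignahuapan offset: {:.2f} km'.format(chi_offset_km))
```

Both offsets come out at several kilometers, far more than the 300 m neighborhood radius used later, which is why the manual centers are used from here on.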

### Map of Zacatlan

In [6]:
map_zacatlan = folium.Map(location=zacatlan_center2, zoom_start=15)
folium.Marker(zacatlan_center, popup='Wrong center - Zacatlan').add_to(map_zacatlan)
folium.Marker(zacatlan_center2, popup='Zacatlan').add_to(map_zacatlan)
map_zacatlan

### Map of Chignahuapan

In [7]:
map_chignahuapan = folium.Map(location=chignahuapan_center2, zoom_start=15)
folium.Marker(chignahuapan_center, popup='Wrong center - Chignahuapan').add_to(map_chignahuapan)
folium.Marker(chignahuapan_center2, popup='Chignahuapan').add_to(map_chignahuapan)
map_chignahuapan

### Storing the correct centers as centers

In [8]:
zacatlan_center = zacatlan_center2.copy()
chignahuapan_center = chignahuapan_center2.copy()

Now let's create a grid of equally spaced candidate areas, centered on each town center and extending about 6.6 km from it (`MAX_DIST` in the constants below). Our neighborhoods will be defined as circular areas with a radius of 300 meters, so our neighborhood centers will be 600 meters apart.

To calculate distances accurately, we need to create our grid of locations in a 2D Cartesian coordinate system, which lets us work in meters rather than in latitude/longitude degrees. Then we'll project those coordinates back to latitude/longitude degrees so they can be shown on a Folium map. So let's create functions to convert between the WGS84 geographic coordinate system (latitude/longitude degrees) and the UTM Cartesian coordinate system (X/Y coordinates in meters).

Zacatlan and Chignahuapan are located in WGS84 UTM Zone 14N.
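
As a quick sanity check on the zone choice, the standard UTM rule numbers zones in 6-degree bands of longitude, counting eastward from 180°W. The helper name `utm_zone` below is my own, and the sketch ignores the special-case zones around Norway and Svalbard, which do not matter for Puebla.

```python
import math

def utm_zone(longitude_deg):
    """Return the standard UTM zone number (1-60) for a longitude in degrees."""
    # Zones are 6 degrees wide and numbered eastward from 180 degrees West.
    return int(math.floor((longitude_deg + 180.0) / 6.0)) % 60 + 1

# Longitudes of the manually chosen town centers from this notebook.
zone_zacatlan = utm_zone(-97.960550)
zone_chignahuapan = utm_zone(-98.031382)
print(zone_zacatlan, zone_chignahuapan)
```

Both towns fall in the same zone, so a single projection can serve both grids.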

In [9]:
#!pip install shapely
import shapely.geometry

#!pip install pyproj
import pyproj

import math

# Build the transformers once; pyproj.transform() is deprecated in recent pyproj releases.
# EPSG:4326 = WGS84 longitude/latitude, EPSG:32614 = WGS84 / UTM zone 14N.
# always_xy=True keeps the (lon, lat) / (x, y) argument order used in this notebook.
_to_utm = pyproj.Transformer.from_crs("EPSG:4326", "EPSG:32614", always_xy=True)
_to_lonlat = pyproj.Transformer.from_crs("EPSG:32614", "EPSG:4326", always_xy=True)

def lonlat_to_xy(lon, lat):
    x, y = _to_utm.transform(lon, lat)
    return x, y

def xy_to_lonlat(x, y):
    lon, lat = _to_lonlat.transform(x, y)
    return lon, lat

def calc_xy_distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)


### Constants for defining the scope of the analysis

In [10]:
RADIUS = 300
DISTANCE = RADIUS *2
NUM_CIRCLES = 11
MAX_DIST = DISTANCE * NUM_CIRCLES
LIMIT_CIRC = (NUM_CIRCLES*2) +1
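
For reference, the values these constants work out to can be checked with a few lines of plain Python. The constants are repeated so the snippet runs on its own; `k` is the hexagonal row-spacing factor used in the grid cells further down, and `rows` is a name I introduce here for the number of grid rows those cells loop over.

```python
import math

RADIUS = 300                       # neighborhood radius in meters
DISTANCE = RADIUS * 2              # spacing between neighboring centers
NUM_CIRCLES = 11                   # rings of neighborhoods around the center
MAX_DIST = DISTANCE * NUM_CIRCLES  # farthest allowed center, in meters
LIMIT_CIRC = NUM_CIRCLES * 2 + 1   # grid columns spanning the full diameter

k = math.sqrt(3) / 2               # row spacing factor of a hexagonal grid
rows = int(LIMIT_CIRC / k)         # number of grid rows generated later

print(MAX_DIST, LIMIT_CIRC, rows)
```

So each grid reaches 6600 m from its center and is built from 23 columns and 26 rows before the circular cut-off is applied.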

### Coordinate transformation check

In [11]:
print('Coordinate transformation check')
print('-------------------------------')
print('Zacatlan center longitude={}, latitude={}'.format(zacatlan_center[1], zacatlan_center[0]))
x, y = lonlat_to_xy(zacatlan_center[1], zacatlan_center[0])
print('Zacatlan center UTM X={}, Y={}'.format(x, y))
lo, la = xy_to_lonlat(x, y)
print('Zacatlan center longitude={}, latitude={}'.format(lo, la))
print('-------------------------------')
print('Chignahuapan center longitude={}, latitude={}'.format(chignahuapan_center[1], chignahuapan_center[0]))
x, y = lonlat_to_xy(chignahuapan_center[1], chignahuapan_center[0])
print('Chignahuapan center UTM X={}, Y={}'.format(x, y))
lo, la = xy_to_lonlat(x, y)
print('Chignahuapan center longitude={}, latitude={}'.format(lo, la))

Coordinate transformation check
-------------------------------
Zacatlan center longitude=-97.96055, latitude=19.936115
Zacatlan center UTM X=608780.3127579108, Y=2204748.294231874
Zacatlan center longitude=-97.96055, latitude=19.936114999999997
-------------------------------
Chignahuapan center longitude=-98.031382, latitude=19.835442
Chignahuapan center UTM X=601431.1251023441, Y=2193562.4309149412
Chignahuapan center longitude=-98.031382, latitude=19.835442


### Creating hexagonal grid of cells for Zacatlan town

In [12]:
zacatlan_center_x, zacatlan_center_y = lonlat_to_xy(zacatlan_center[1], zacatlan_center[0]) # City center in Cartesian coordinates

k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_min = zacatlan_center_x - MAX_DIST
x_step = DISTANCE
y_min = zacatlan_center_y - MAX_DIST - (int(LIMIT_CIRC/k)*k*DISTANCE - (MAX_DIST*2))/2
y_step = DISTANCE * k 

latitudes_zac = []
longitudes_zac = []
distances_from_center_zac = []
xs_zac = []
ys_zac = []
for i in range(0, int(LIMIT_CIRC/k)):
    y = y_min + i * y_step
    x_offset = RADIUS if i%2==0 else 0
    for j in range(0, LIMIT_CIRC):
        x = x_min + j * x_step + x_offset
        distance_from_center_zac = calc_xy_distance(zacatlan_center_x, zacatlan_center_y, x, y)
        if (distance_from_center_zac <= (MAX_DIST+1)):
            lon, lat = xy_to_lonlat(x, y)
            latitudes_zac.append(lat)
            longitudes_zac.append(lon)
            distances_from_center_zac.append(distance_from_center_zac)
            xs_zac.append(x)
            ys_zac.append(y)

print(len(latitudes_zac), 'candidate neighborhood centers generated for Zacatlan.')

439 candidate neighborhood centers generated for Zacatlan.


### Visualizing the grid for Zacatlan

In [13]:
map_zacatlan = folium.Map(location=zacatlan_center, zoom_start=13)
folium.Marker(zacatlan_center, popup='Zacatlan').add_to(map_zacatlan)
for lat, lon in zip(latitudes_zac, longitudes_zac):
    #folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_zacatlan) 
    folium.Circle([lat, lon], radius=RADIUS, color='blue', fill=False).add_to(map_zacatlan)
    #folium.Marker([lat, lon]).add_to(map_zacatlan)
map_zacatlan

### Creating hexagonal grid of cells for Chignahuapan town

In [14]:
chignahuapan_center_x, chignahuapan_center_y = lonlat_to_xy(chignahuapan_center[1], chignahuapan_center[0]) # City center in Cartesian coordinates

k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_min = chignahuapan_center_x - MAX_DIST
x_step = DISTANCE
y_min = chignahuapan_center_y - MAX_DIST - (int(LIMIT_CIRC/k)*k*DISTANCE - (MAX_DIST*2))/2
y_step = DISTANCE * k 

latitudes_chi = []
longitudes_chi = []
distances_from_center_chi = []
xs_chi = []
ys_chi = []
for i in range(0, int(LIMIT_CIRC/k)):
    y = y_min + i * y_step
    x_offset = RADIUS if i%2==0 else 0
    for j in range(0, LIMIT_CIRC):
        x = x_min + j * x_step + x_offset
        distance_from_center_chi = calc_xy_distance(chignahuapan_center_x, chignahuapan_center_y, x, y)
        if (distance_from_center_chi <= (MAX_DIST+1)):
            lon, lat = xy_to_lonlat(x, y)
            latitudes_chi.append(lat)
            longitudes_chi.append(lon)
            distances_from_center_chi.append(distance_from_center_chi)
            xs_chi.append(x)
            ys_chi.append(y)

print(len(latitudes_chi), 'candidate neighborhood centers generated for Chignahuapan.')

439 candidate neighborhood centers generated for Chignahuapan.
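
The two grid-building cells differ only in the center they start from, so the same logic could be wrapped in one helper. The sketch below mirrors those cells in plain Python and works directly in UTM meters, leaving out the conversion back to latitude/longitude. The name `hex_grid` and its parameters are my own, with defaults taken from the notebook's constants.

```python
import math

def hex_grid(center_x, center_y, radius=300, num_circles=11):
    """Return (x, y, distance) tuples for a hexagonal grid around a center."""
    distance = radius * 2                  # spacing between centers in a row
    max_dist = distance * num_circles      # cut-off distance from the center
    limit = num_circles * 2 + 1            # number of columns
    k = math.sqrt(3) / 2                   # row spacing factor
    rows = int(limit / k)                  # number of rows
    x_min = center_x - max_dist
    y_min = center_y - max_dist - (rows * k * distance - max_dist * 2) / 2
    points = []
    for i in range(rows):
        y = y_min + i * distance * k
        x_offset = radius if i % 2 == 0 else 0   # shift every other row
        for j in range(limit):
            x = x_min + j * distance + x_offset
            d = math.hypot(x - center_x, y - center_y)
            if d <= max_dist + 1:                # keep points inside the circle
                points.append((x, y, d))
    return points

# The count depends only on the grid geometry, not on where the center is.
grid = hex_grid(0.0, 0.0)
print(len(grid), 'candidate centers')
```

Calling it with each town's UTM center (for example `hex_grid(zacatlan_center_x, zacatlan_center_y)`) would give the X, Y, and distance values for that town, which can then be passed through `xy_to_lonlat`.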


### Visualizing the grid for Chignahuapan

In [15]:
map_chignahuapan = folium.Map(location=chignahuapan_center, zoom_start=13)
folium.Marker(chignahuapan_center, popup='Chignahuapan').add_to(map_chignahuapan)
for lat, lon in zip(latitudes_chi, longitudes_chi):
    #folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_chignahuapan) 
    folium.Circle([lat, lon], radius=RADIUS, color='red', fill=False).add_to(map_chignahuapan)
    #folium.Marker([lat, lon]).add_to(map_chignahuapan)
map_chignahuapan

In [16]:
### Visualizing grid for both Zacatlan and Chignahuapan towns

In [17]:
# Midpoint between the two town centers, used to center the combined map
all_center = [(zacatlan_center[0] + chignahuapan_center[0]) / 2,
              (zacatlan_center[1] + chignahuapan_center[1]) / 2]

map_all = folium.Map(location=all_center, zoom_start=12)
folium.Marker(zacatlan_center, popup='Zacatlan').add_to(map_all)
folium.Marker(chignahuapan_center, popup='Chignahuapan').add_to(map_all)
for lat, lon in zip(latitudes_zac, longitudes_zac):
    folium.Circle([lat, lon], radius=RADIUS, color='blue', fill=False).add_to(map_all)
for lat, lon in zip(latitudes_chi, longitudes_chi):
    folium.Circle([lat, lon], radius=RADIUS, color='red', fill=False).add_to(map_all)
map_all




### Now that we have the coordinates of the centers of the neighborhoods/areas to be evaluated, let's use the Google Maps API to get approximate addresses for those locations.

In [18]:
def get_address(api_key, latitude, longitude, verbose=False):
    try:
        url = 'https://maps.googleapis.com/maps/api/geocode/json?key={}&latlng={},{}'.format(api_key, latitude, longitude)
        response = requests.get(url).json()
        if verbose:
            print('Google Maps API JSON result =>', response)
        results = response['results']
        address = results[0]['formatted_address']
        return address
    except (requests.RequestException, KeyError, IndexError, ValueError):
        return None
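
Because `get_address` returns `None` for any failure, it can be hard to tell an empty result from a rejected request. A small parsing step that looks at the top-level `status` field of the Geocoding API response is one way to separate the cases. This is a sketch only: the function name `parse_geocode_response` is hypothetical, and the payloads are hand-written stand-ins, not real API output.

```python
def parse_geocode_response(payload):
    """Return the first formatted address, or None when there is no result."""
    # The Geocoding API reports success or failure in a top-level "status" field.
    if payload.get('status') != 'OK':
        return None
    results = payload.get('results') or []
    if not results:
        return None
    return results[0].get('formatted_address')

# Hand-written payloads for illustration only; they are not real API output.
ok_payload = {'status': 'OK',
              'results': [{'formatted_address': 'Example Street 1, Example Town'}]}
empty_payload = {'status': 'ZERO_RESULTS', 'results': []}
denied_payload = {'status': 'REQUEST_DENIED', 'results': []}

print(parse_geocode_response(ok_payload))
print(parse_geocode_response(empty_payload))
print(parse_geocode_response(denied_payload))
```

Logging the `status` value before returning `None` would make it clearer why a given location ended up as `'NO ADDRESS'`.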


### Checking reverse geocoding

In [19]:
addr = get_address(API_KEY, chignahuapan_center[0], chignahuapan_center[1])
#addr = get_address(API_KEY, 52.5219184, 13.4132147)
print('Reverse geocoding check')
print('-----------------------')
print('Address of [{}, {}] is: {}'.format(chignahuapan_center[0], chignahuapan_center[1], addr))

Reverse geocoding check
-----------------------
Address of [19.835442, -98.031382] is: Fco. Javier Mina 23, Teotlalpan, 73300 Chignahuapan, Pue., Mexico


### Obtaining Zacatlan and Chignahuapan location addresses. Making Pandas dataframes and saving data into local files.

In [20]:
print('Obtaining Zacatlan location addresses: ', end='')

# Try to load from local file system in case we did this before
df_locations_zac = pd.DataFrame()
loaded_zac = False
try:
    with open('locations_zac.pkl', 'rb') as f:
        df_locations_zac = pickle.load(f)
    print('Zacatlan locations loaded.')
    loaded_zac = True
except:
    pass

# If load failed use the Google Maps Geocoding API to get the data
if not loaded_zac:
    addresses_zac = []
    for lat, lon in zip(latitudes_zac, longitudes_zac):
        address = get_address(API_KEY, lat, lon)
        if address is None:
            address = 'NO ADDRESS'
        address = address.replace(', Mexico', '') # We don't need country part of address
        addresses_zac.append(address)
        print(' .', end='')
    print(' done.')
    print(addresses_zac[0:10])
    df_locations_zac = pd.DataFrame({'Address': addresses_zac,
                             'Latitude': latitudes_zac,
                             'Longitude': longitudes_zac,
                             'X': xs_zac,
                             'Y': ys_zac,
                             'Distance from center': distances_from_center_zac})
    df_locations_zac.to_pickle('./locations_zac.pkl')
    df_locations_zac.to_excel('./locations_zac.xlsx')
print(len(df_locations_zac))
print(df_locations_zac.head(10))   


Obtaining Zacatlan location addresses: Zacatlan locations loaded.
439
                                             Address   Latitude  Longitude  \
0                                   M Alemán, Puebla  19.879876 -97.978110   
1                                   M Alemán, Puebla  19.879843 -97.972379   
2                         México 119, Zacatlán, Pue.  19.879810 -97.966649   
3                         Camino a Cuacuilco, Puebla  19.879777 -97.960918   
4                         Camino a Cuacuilco, Puebla  19.879743 -97.955187   
5                         Camino a Cuacuilco, Puebla  19.879709 -97.949457   
6                          Camino a Otlatlán, Puebla  19.879675 -97.943726   
7                                 Unnamed Road, Pue.  19.884653 -97.992407   
8                  Carlos Salinas de Gortari, Puebla  19.884620 -97.986676   
9  Carlos Salinas de Gortari, Teoconchila, San Jo...  19.884587 -97.980945   

               X             Y  Distance from center  
0  606980.312758

In [21]:
print('Obtaining Chignahuapan location addresses: ', end='')

# Try to load from local file system in case we did this before
df_locations_chi = pd.DataFrame()
loaded_chi = False
try:
    with open('locations_chi.pkl', 'rb') as f:
        df_locations_chi = pickle.load(f)
    print('Chignahuapan locations loaded.')
    loaded_chi = True
except:
    pass

# If load failed use the Google Maps Geocoding API to get the data
if not loaded_chi:
    addresses_chi = []
    for lat, lon in zip(latitudes_chi, longitudes_chi):
        address = get_address(API_KEY, lat, lon)
        if address is None:
            address = 'NO ADDRESS'
        address = address.replace(', Mexico', '') # We don't need country part of address
        addresses_chi.append(address)
        print(' .', end='')
    print(' done.')
    print(addresses_chi[0:10])
    df_locations_chi = pd.DataFrame({'Address': addresses_chi,
                             'Latitude': latitudes_chi,
                             'Longitude': longitudes_chi,
                             'X': xs_chi,
                             'Y': ys_chi,
                             'Distance from center': distances_from_center_chi})
    df_locations_chi.to_pickle('./locations_chi.pkl')
    df_locations_chi.to_excel('./locations_chi.xlsx')
print(len(df_locations_chi))
print(df_locations_chi.head(10))  

Obtaining Chignahuapan location addresses: Chignahuapan locations loaded.
439
                                      Address   Latitude  Longitude  \
0                        Unnamed Road, Puebla  19.779194 -98.048904   
1                        Unnamed Road, Puebla  19.779163 -98.043177   
2  Chignahuapan - Tlaxco, Ciénega Larga, Pue.  19.779133 -98.037450   
3                        Unnamed Road, Puebla  19.779102 -98.031723   
4                        Unnamed Road, Puebla  19.779071 -98.025996   
5                        Unnamed Road, Puebla  19.779039 -98.020269   
6                     PUE 148, Aquixtla, Pue.  19.779008 -98.014542   
7                        Unnamed Road, Puebla  19.783965 -98.063195   
8                        Unnamed Road, Puebla  19.783934 -98.057467   
9                        Unnamed Road, Puebla  19.783904 -98.051740   

               X             Y  Distance from center  
0  599631.125102  2.187327e+06           6489.992296  
1  600231.125102  2.187327e+06
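
The load-from-file-or-rebuild pattern used in the two cells above could be factored into one small helper. The sketch below uses only the standard library; `load_or_build` and `fake_builder` are names I made up, and the builder stands in for the loop that calls the geocoding API.

```python
import os
import pickle
import tempfile

def load_or_build(path, build):
    """Load a pickled object from path, or build it, save it, and return it."""
    try:
        with open(path, 'rb') as f:
            return pickle.load(f)
    except (OSError, pickle.UnpicklingError, EOFError):
        # Only cache-related failures trigger a rebuild; other errors propagate.
        pass
    data = build()
    with open(path, 'wb') as f:
        pickle.dump(data, f)
    return data

# Demonstration with a throwaway file and a stand-in builder function.
calls = []

def fake_builder():
    calls.append(1)
    return {'Address': ['NO ADDRESS'], 'Latitude': [19.0]}

with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, 'locations_demo.pkl')
    first = load_or_build(cache_path, fake_builder)    # builds and saves
    second = load_or_build(cache_path, fake_builder)   # loads from the file

print(first == second, len(calls))
```

Catching only file and unpickling errors, rather than everything, means a typo or an interrupted run is not silently mistaken for a missing cache.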

### Creating a unified locations data frame

In [24]:
df_locations = pd.concat([df_locations_zac, df_locations_chi], axis=0, ignore_index = True)
df_locations.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 878 entries, 0 to 877
Data columns (total 6 columns):
Address                 878 non-null object
Latitude                878 non-null float64
Longitude               878 non-null float64
X                       878 non-null float64
Y                       878 non-null float64
Distance from center    878 non-null float64
dtypes: float64(5), object(1)
memory usage: 41.3+ KB


### Foursquare
Now that we have our location candidates, let's use Foursquare API to get info on landmarks in each neighborhood.

In [25]:
latitudes = latitudes_zac+latitudes_chi
longitudes = longitudes_zac+longitudes_chi
print(len(latitudes))
print(len(latitudes_zac))
print(len(latitudes_chi))
print(len(longitudes))
print(len(longitudes_zac))
print(len(longitudes_chi))
print(latitudes[240:250])
print(longitudes[240:250])

878
439
439
878
439
439
[19.940859968905983, 19.940826607063606, 19.940793060283074, 19.940759328565996, 19.94072541191397, 19.940691310328628, 19.940657023811593, 19.94062255236449, 19.94058789598898, 19.9405530546867]
[-97.96911843871385, -97.96338566471081, -97.95765290234846, -97.95192015169111, -97.94618741280311, -97.9404546857488, -97.9347219705925, -97.92898926739859, -97.92325657623138, -97.91752389715519]


In [26]:
def get_categories(categories):
    return [(cat['name'], cat['id']) for cat in categories]

def format_address(location):
    address = ', '.join(location['formattedAddress'])
    address = address.replace(', Puebla', '')
    address = address.replace('México', '')
    return address
    
def get_venues_near_location(lat, lon, client_id, client_secret, version, radius=500, limit=100):
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
        client_id, client_secret, version, lat, lon, radius, limit)
    #print(url)
    #try:
    results = requests.get(url).json()['response']['groups'][0]['items']
    #results = requests.get(url).json()
    venues = [(item['venue']['id'],
                item['venue']['name'],
                get_categories(item['venue']['categories']),
                (item['venue']['location']['lat'], item['venue']['location']['lng']),
                format_address(item['venue']['location']),
                item['venue']['location']['distance']) for item in results]        
    #except Exception as e:
    #    print(e)
    #    venues = []
    return venues

In [27]:
test_venue = get_venues_near_location(latitudes[242], longitudes[242], CLIENT_ID, CLIENT_SECRET, VERSION, radius=RADIUS+50, limit=100)
test_venue

KeyError: 'groups'
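
The `KeyError: 'groups'` means the JSON that came back had no `groups` entry under `response`; the notebook does not show why the request failed. A more defensive parsing step could surface whatever the server reported. The sketch below assumes the response carries a `meta` object with `code`, `errorType`, and `errorDetail` fields; the function name `extract_items` and both payloads are made up for illustration.

```python
def extract_items(payload):
    """Return the list of explore items, raising a readable error otherwise."""
    meta = payload.get('meta', {})
    groups = payload.get('response', {}).get('groups')
    if not groups:
        raise RuntimeError('Foursquare request failed: code={} type={} detail={}'.format(
            meta.get('code'), meta.get('errorType'), meta.get('errorDetail')))
    return groups[0].get('items', [])

# Hand-written payloads for illustration only; they are not real API output.
ok_payload = {'meta': {'code': 200},
              'response': {'groups': [{'items': [{'venue': {'id': 'v1'}}]}]}}
error_payload = {'meta': {'code': 400, 'errorType': 'example_error',
                          'errorDetail': 'example detail'},
                 'response': {}}

print(len(extract_items(ok_payload)))
try:
    extract_items(error_payload)
except RuntimeError as err:
    error_message = str(err)
    print(error_message)
```

With a message like this, the cause (credentials, quota, or a changed endpoint) shows up in the output instead of a bare `KeyError`.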

In [None]:
# Let's now go over our neighborhood locations and get nearby venues; we'll also maintain a dictionary of all venues found

def get_venues(lats, lons):
    list_venues = {}
    location_venues = []

    print('Obtaining venues around candidate locations:', end='')
    for lat, lon in zip(lats, lons):
        # Using radius=RADIUS+50 to make sure we have overlaps/full coverage so we don't miss any venue (we're using dictionaries to remove any duplicates resulting from area overlaps)
        venues = get_venues_near_location(lat, lon, CLIENT_ID, CLIENT_SECRET, VERSION, radius=RADIUS+50, limit=LIMIT)
        area_venues = []
        for venue in venues:
            venue_id = venue[0]
            venue_name = venue[1]
            venue_categories = venue[2]
            venue_latlon = venue[3]
            venue_address = venue[4]
            venue_distance = venue[5]
            x, y = lonlat_to_xy(venue_latlon[1], venue_latlon[0])
            venue_el = (venue_id, venue_name, venue_latlon[0], venue_latlon[1], venue_address, venue_distance, x, y)
            if venue_distance<=RADIUS:
                area_venues.append(venue_el)
            list_venues[venue_id] = venue_el
        location_venues.append(area_venues)
        print(' .', end='')
    print(' done.')
    return list_venues, location_venues

# Try to load from local file system in case we did this before
list_venues = {}
location_venues = []
loaded = False
try:
    with open('venues_350.pkl', 'rb') as f:
        list_venues = pickle.load(f)
    with open('location_venues_350.pkl', 'rb') as f:
        location_venues = pickle.load(f)
    print('Venues data loaded.')
    loaded = True
except:
    pass

# If load failed use the Foursquare API to get the data
if not loaded:
    list_venues, location_venues = get_venues(latitudes, longitudes)
    
    # Let's persists this in local file system
    with open('venues_350.pkl', 'wb') as f:
        pickle.dump(list_venues, f)
    with open('location_venues_350.pkl', 'wb') as f:
        pickle.dump(location_venues, f)
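
The de-duplication idea in `get_venues` can be seen in isolation with made-up data: each area is queried with a slightly larger radius, a dictionary keyed by venue id keeps one entry per venue, and only venues within `RADIUS` count toward an area. The function name `merge_area_results` and the sample venues below are hypothetical.

```python
RADIUS = 300  # same neighborhood radius as in the notebook, in meters

def merge_area_results(areas, radius=RADIUS):
    """Deduplicate venues across overlapping areas and list them per area."""
    unique_venues = {}      # venue id -> venue name, one entry per venue
    venues_per_area = []    # for each area, ids of venues within the radius
    for venues in areas:
        inside = []
        for venue_id, name, distance in venues:
            if distance <= radius:
                inside.append(venue_id)
            unique_venues[venue_id] = name   # same id overwrites, no duplicates
        venues_per_area.append(inside)
    return unique_venues, venues_per_area

# Made-up results for two overlapping areas; "v2" is returned by both queries.
area_a = [('v1', 'Plaza', 120), ('v2', 'Market', 340)]
area_b = [('v2', 'Market', 90), ('v3', 'Bakery', 310)]
unique_venues, venues_per_area = merge_area_results([area_a, area_b])
print(len(unique_venues), venues_per_area)
```

The shared venue is stored once in the dictionary, and it is attributed only to the area whose center lies within 300 m of it.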

In [None]:
print('Total number of venues:', len(list_venues))
print('Average number of venues in neighborhood:', np.array([len(r) for r in location_venues]).mean())

In [None]:
list_venues