<a href="https://colab.research.google.com/github/pilarandre25/geocoding/blob/main/geocoding.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src='https://raw.githubusercontent.com/geografope/geocodificacion-con-python/main/img/banner.png'>

This tutorial was prepared by **Geografo.PE**

Social media:
- YouTube: www.youtube.com/@ambarja
- TikTok: https://www.tiktok.com/@geografo.pe
- LinkedIn: https://www.linkedin.com/in/antonybarja/
- GitHub: https://github.com/geografope

For more information, visit my personal page: *https://geografo.pe*

### **1. Installing the required libraries**


In [1]:
!pip install geopy
!pip install pandas
!pip install geopandas
!pip install mapclassify

Collecting mapclassify
  Downloading mapclassify-2.9.0-py3-none-any.whl.metadata (3.1 kB)
Downloading mapclassify-2.9.0-py3-none-any.whl (286 kB)
Installing collected packages: mapclassify
Successfully installed mapclassify-2.9.0


### **2. Importing the libraries**

In [2]:
from geopy.geocoders import Nominatim, ArcGIS, MapBox
import pandas as pd
import geopandas as gpd
from functools import partial

### **3. Reading the raw data from an Excel file**

In [20]:
url = '/content/EJEMPLO.xlsx'
rawdata = pd.read_excel(url)

In [21]:
rawdata.shape

(10, 13)

### **4. Function to standardize the address text for geocoding**

In [22]:
# Standard text structure for geocoding with Nominatim:
# address, district, province, department, country
# Example: "Calle Los Angeles 123, Carabayllo, Lima, Lima, Perú"
def concatenar_campos(row):
    return f"{row['Addr']}"

In [23]:
rawdata['nogeo'] = rawdata.apply(concatenar_campos, axis= 1)
rawdata.head()

Unnamed: 0,id,COD_DPTO,COD_MPIO,COD_DPTO_num,COD_MPIO_num,Address,City,Region,Country,Addr,TIPO_USARIO,ESTADO_CIVIL,GENERO,nogeo
0,1,5,5001,5,5001,CR 52 # 30 - 20,Medellín,Antioquia,Colombia,"CR 52 # 30 - 20, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CR 52 # 30 - 20, Medellín, Antioquia, Colombia"
1,2,5,5001,5,5001,CRR 80 48 00,Medellín,Antioquia,Colombia,"CRR 80 48 00, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CRR 80 48 00, Medellín, Antioquia, Colombia"
2,3,5,5001,5,5001,CR 51 # 79 - 10,Medellín,Antioquia,Colombia,"CR 51 # 79 - 10, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CR 51 # 79 - 10, Medellín, Antioquia, Colombia"
3,4,5,5001,5,5001,CRR 64 C 88 00,Medellín,Antioquia,Colombia,"CRR 64 C 88 00, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CRR 64 C 88 00, Medellín, Antioquia, Colombia"
4,5,5,5001,5,5001,CR 63 # 123 - 12,Medellín,Antioquia,Colombia,"CR 63 # 123 - 12, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CR 63 # 123 - 12, Medellín, Antioquia, Colombia"
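
The comment in the function cell gives the target structure (address, district, province, department, country), while `concatenar_campos` simply returns the prebuilt `Addr` column. When a dataset lacks such a column, the string could be assembled from the separate fields. A minimal sketch, assuming the column names `Address`, `City`, `Region`, and `Country` shown in the output above; `build_address` is a hypothetical helper, not part of the original notebook:

```python
def build_address(row, fields=("Address", "City", "Region", "Country")):
    """Join the non-empty address parts of a row with ', '.

    `row` can be a dict or a pandas Series; `fields` is an assumed column order.
    """
    parts = []
    for name in fields:
        value = row.get(name)
        # Skip missing values: None or NaN (NaN is the only value not equal to itself)
        if value is None or value != value:
            continue
        text = str(value).strip()
        if text:  # also skip blank strings
            parts.append(text)
    return ", ".join(parts)


example = {"Address": "CL 67 # 56 -22 ", "City": "Medellín",
           "Region": "Antioquia", "Country": "Colombia"}
print(build_address(example))
```

In the notebook this could replace the current function with `rawdata['nogeo'] = rawdata.apply(build_address, axis=1)`, which also trims stray spaces such as the one after `-22` in row 8.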


In [24]:
rawdata_for_osm = rawdata.copy()
rawdata_for_mapbox = rawdata.copy()
rawdata_for_arcgis = rawdata.copy()

### **5. Forward geocoding with the OSM API**
- Nominatim: an open-source tool for geocoding with OpenStreetMap data.
Reference: *https://github.com/osm-search/Nominatim*

In [25]:
# Forward geocoding
geolocator = Nominatim(user_agent="geografo_pe", timeout=5)
# Bind the language once so every request asks for results in Spanish
geocode = partial(geolocator.geocode, language="es")
def tidygeocode(row):
    # Use the partial defined above (it was previously created but never called)
    location = geocode(row['nogeo'])
    if location:
        return pd.Series({'latitude': location.latitude, 'longitude': location.longitude})
    else:
        return pd.Series({'latitude': None, 'longitude': None})

In [26]:
rawdata_for_osm[['latitude', 'longitude']] = rawdata_for_osm.apply(tidygeocode, axis=1)
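
Applying `tidygeocode` row by row sends one request per address as fast as the loop runs. The public Nominatim service asks clients to stay at or below one request per second, so larger tables may need pacing. geopy ships `geopy.extra.rate_limiter.RateLimiter` for this purpose; the sketch below shows the same idea with only the standard library. `throttle` and `fake_geocode` are hypothetical names, and the fake function stands in for `geolocator.geocode` so the example runs offline:

```python
import time


def throttle(func, min_delay_seconds=1.0):
    """Wrap `func` so consecutive calls start at least `min_delay_seconds` apart."""
    last_call = [None]  # mutable cell holding the start time of the previous call

    def wrapper(*args, **kwargs):
        if last_call[0] is not None:
            wait = min_delay_seconds - (time.monotonic() - last_call[0])
            if wait > 0:
                time.sleep(wait)
        last_call[0] = time.monotonic()
        return func(*args, **kwargs)

    return wrapper


def fake_geocode(query):
    # Stand-in for geolocator.geocode; returns a (lat, lon) tuple or None
    return (6.25277, -75.573975) if "CR 46" in query else None


slow_geocode = throttle(fake_geocode, min_delay_seconds=0.2)  # short delay for the demo
start = time.monotonic()
results = [slow_geocode(q) for q in ["CR 46 CL 50, Medellín", "BUENOS AIRES, Medellín"]]
elapsed = time.monotonic() - start
print(results, round(elapsed, 2))
```

With geopy itself, the equivalent would be `geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)` after importing `RateLimiter` from `geopy.extra.rate_limiter`.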

In [27]:
rawdata_for_osm

Unnamed: 0,id,COD_DPTO,COD_MPIO,COD_DPTO_num,COD_MPIO_num,Address,City,Region,Country,Addr,TIPO_USARIO,ESTADO_CIVIL,GENERO,nogeo,latitude,longitude
0,1,5,5001,5,5001,CR 52 # 30 - 20,Medellín,Antioquia,Colombia,"CR 52 # 30 - 20, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CR 52 # 30 - 20, Medellín, Antioquia, Colombia",,
1,2,5,5001,5,5001,CRR 80 48 00,Medellín,Antioquia,Colombia,"CRR 80 48 00, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CRR 80 48 00, Medellín, Antioquia, Colombia",,
2,3,5,5001,5,5001,CR 51 # 79 - 10,Medellín,Antioquia,Colombia,"CR 51 # 79 - 10, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CR 51 # 79 - 10, Medellín, Antioquia, Colombia",,
3,4,5,5001,5,5001,CRR 64 C 88 00,Medellín,Antioquia,Colombia,"CRR 64 C 88 00, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CRR 64 C 88 00, Medellín, Antioquia, Colombia",,
4,5,5,5001,5,5001,CR 63 # 123 - 12,Medellín,Antioquia,Colombia,"CR 63 # 123 - 12, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CR 63 # 123 - 12, Medellín, Antioquia, Colombia",,
5,6,5,5001,5,5001,CALLE 16 A SUR CARRERA 9 F,Medellín,Antioquia,Colombia,"CALLE 16 A SUR CARRERA 9 F, Medellín, Antioqu...",Afiliado,Casado,M,"CALLE 16 A SUR CARRERA 9 F, Medellín, Antioqu...",,
6,7,5,5001,5,5001,BUENOS AIRES,Medellín,Antioquia,Colombia,"BUENOS AIRES, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"BUENOS AIRES, Medellín, Antioquia, Colombia",6.230229,-75.556921
7,8,5,5001,5,5001,CR 46 CL 50,Medellín,Antioquia,Colombia,"CR 46 CL 50, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CR 46 CL 50, Medellín, Antioquia, Colombia",6.25277,-75.573975
8,9,5,5001,5,5001,CL 67 # 56 -22,Medellín,Antioquia,Colombia,"CL 67 # 56 -22 , Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CL 67 # 56 -22 , Medellín, Antioquia, Colombia",6.264485,-75.566958
9,10,5,5001,5,5001,CR 72 # 27 - 22,Medellín,Antioquia,Colombia,"CR 72 # 27 - 22, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CR 72 # 27 - 22, Medellín, Antioquia, Colombia",,
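
In the output above only three of the ten addresses received coordinates from Nominatim. A quick way to quantify this before mapping is to compute the match rate. A small sketch using the `latitude` values printed above; `match_rate` is a hypothetical helper:

```python
import math


def match_rate(latitudes):
    """Return the fraction of entries that hold a real coordinate."""
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    matched = [v for v in latitudes if not is_missing(v)]
    return len(matched) / len(latitudes) if latitudes else 0.0


# Latitudes from the Nominatim output above (None = no match)
osm_latitudes = [None, None, None, None, None, None,
                 6.230229, 6.25277, 6.264485, None]
print(f"Matched {match_rate(osm_latitudes):.0%} of the addresses")
```

On the DataFrame itself the same figure could be obtained with `rawdata_for_osm['latitude'].notna().mean()`.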


### **6. Visualizing the spatial data**

In [28]:
# Drop the rows that were not geocoded
geo_rawdata = rawdata_for_osm.dropna(subset=['latitude', 'longitude'])
# Convert the DataFrame to a GeoDataFrame (x = longitude, y = latitude)
geo_rawdata = gpd.GeoDataFrame(data=geo_rawdata, geometry=gpd.points_from_xy(geo_rawdata.longitude, geo_rawdata.latitude), crs=4326)

In [29]:
# Interactive map
geo_rawdata.explore(tiles = "Esri.WorldImagery",marker_kwds={'radius': 10} )
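
`gpd.points_from_xy` expects x (longitude) first and y (latitude) second, and swapping them is an easy mistake that would place the points far from Medellín. A bounding-box check before building the geometry can catch it. A minimal sketch; `in_bbox` is a hypothetical helper and `STUDY_AREA` is a rough, illustrative box chosen around the coordinates printed in this notebook, not an official boundary:

```python
def in_bbox(longitude, latitude, bbox):
    """True when the point lies inside bbox = (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= longitude <= max_lon and min_lat <= latitude <= max_lat


# Rough, illustrative box around the geocoded points (an assumption)
STUDY_AREA = (-76.0, 5.5, -75.0, 7.0)

print(in_bbox(-75.556921, 6.230229, STUDY_AREA))  # longitude, latitude order
print(in_bbox(6.230229, -75.556921, STUDY_AREA))  # swapped order
```

A global range check (latitude within ±90, longitude within ±180) would not be enough here, because the swapped pair is still a valid coordinate.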

### **7. Reverse geocoding**

In [None]:
# Reverse geocoding: coordinates (latitude, longitude) to address text
def tidygeocode_inv(row):
    address = geolocator.reverse([row['latitude'], row['longitude']])
    return str(address)

In [None]:
geocode_inv = geo_rawdata.copy()

In [None]:
geocode_inv['direccion_geo_inv'] = geocode_inv.apply(tidygeocode_inv, axis = 1)
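
`geolocator.reverse` can return `None` when no address is found near a point, and `str(None)` would then store the text `'None'` in the column. A defensive variant is sketched below. `reverse_address` is a hypothetical helper that receives the reverse function as a parameter, and `fake_reverse` only imitates the `.address` attribute of a geopy `Location` (with a made-up address) so the example runs without network access:

```python
from types import SimpleNamespace


def reverse_address(row, reverse):
    """Return the address text for a row's coordinates, or None when nothing is found."""
    location = reverse((row["latitude"], row["longitude"]))  # (lat, lon) order
    return location.address if location is not None else None


def fake_reverse(point):
    # Stand-in for geolocator.reverse: knows a single point with a made-up address
    known = {(6.25277, -75.573975): "Example street, Medellín"}
    text = known.get(point)
    return SimpleNamespace(address=text) if text else None


print(reverse_address({"latitude": 6.25277, "longitude": -75.573975}, fake_reverse))
print(reverse_address({"latitude": 0.0, "longitude": 0.0}, fake_reverse))
```

In the notebook it could be applied as `geocode_inv.apply(reverse_address, axis=1, reverse=geolocator.reverse)`, which keeps real missing values instead of the string `'None'`.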

In [None]:
# Rebuild the geometry from the table's own coordinate columns
geocode_inv = geocode_inv.drop(columns=['geometry'])
geocode_inv = gpd.GeoDataFrame(data=geocode_inv, geometry=gpd.points_from_xy(geocode_inv.longitude, geocode_inv.latitude), crs=4326)

In [None]:
geocode_inv

### **8. Exporting the spatial data**

In [30]:
# Export the data in GeoPackage (.gpkg) format
geo_rawdata.to_file('geocoding_directo.gpkg')
geocode_inv.to_file('geocoding_inverso.gpkg')

NameError: name 'geocode_inv' is not defined

### **9. Geocoding with the MapBox API**
To obtain a MapBox API key, register at the following links:
 - Sign up: *https://account.mapbox.com/auth/signup/*
 - Create an access token: *https://account.mapbox.com/access-tokens/create*

#### *Free geocoding: 100,000 requests per month*

In [31]:
api_mapbox = 'PUT_YOUR_API_KEY_HERE'
geolocator = MapBox(api_key = api_mapbox)
def tidygeocode(row):
    location = geolocator.geocode(row['nogeo'])
    if location:
        return pd.Series({'latitude': location.latitude, 'longitude': location.longitude})
    else:
        return pd.Series({'latitude': None, 'longitude': None})

In [32]:
rawdata_for_mapbox[['latitude', 'longitude']] = rawdata_for_mapbox.apply(tidygeocode, axis=1)

GeocoderAuthenticationFailure: Non-successful status code 401
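
The 401 error above is what the service returns when the key is rejected, which is expected while a placeholder string is still in place. One way to avoid pasting a real key into the notebook is to read it from an environment variable and fail early with a clear message. A minimal sketch; the variable name `MAPBOX_API_KEY` and the helper `read_api_key` are assumptions, not names defined by MapBox or geopy:

```python
import os


def read_api_key(env_var="MAPBOX_API_KEY"):
    """Return the key stored in `env_var`, or raise if it is missing or blank."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise ValueError(f"Set the {env_var} environment variable to a valid key")
    return key


os.environ["MAPBOX_API_KEY"] = "demo-key"  # demo value only, not a real token
print(read_api_key())
```

The geocoder could then be created with `MapBox(api_key=read_api_key())`, so a missing key stops the run before any request is sent.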

In [33]:
rawdata_for_mapbox

Unnamed: 0,id,COD_DPTO,COD_MPIO,COD_DPTO_num,COD_MPIO_num,Address,City,Region,Country,Addr,TIPO_USARIO,ESTADO_CIVIL,GENERO,nogeo
0,1,5,5001,5,5001,CR 52 # 30 - 20,Medellín,Antioquia,Colombia,"CR 52 # 30 - 20, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CR 52 # 30 - 20, Medellín, Antioquia, Colombia"
1,2,5,5001,5,5001,CRR 80 48 00,Medellín,Antioquia,Colombia,"CRR 80 48 00, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CRR 80 48 00, Medellín, Antioquia, Colombia"
2,3,5,5001,5,5001,CR 51 # 79 - 10,Medellín,Antioquia,Colombia,"CR 51 # 79 - 10, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CR 51 # 79 - 10, Medellín, Antioquia, Colombia"
3,4,5,5001,5,5001,CRR 64 C 88 00,Medellín,Antioquia,Colombia,"CRR 64 C 88 00, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CRR 64 C 88 00, Medellín, Antioquia, Colombia"
4,5,5,5001,5,5001,CR 63 # 123 - 12,Medellín,Antioquia,Colombia,"CR 63 # 123 - 12, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CR 63 # 123 - 12, Medellín, Antioquia, Colombia"
5,6,5,5001,5,5001,CALLE 16 A SUR CARRERA 9 F,Medellín,Antioquia,Colombia,"CALLE 16 A SUR CARRERA 9 F, Medellín, Antioqu...",Afiliado,Casado,M,"CALLE 16 A SUR CARRERA 9 F, Medellín, Antioqu..."
6,7,5,5001,5,5001,BUENOS AIRES,Medellín,Antioquia,Colombia,"BUENOS AIRES, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"BUENOS AIRES, Medellín, Antioquia, Colombia"
7,8,5,5001,5,5001,CR 46 CL 50,Medellín,Antioquia,Colombia,"CR 46 CL 50, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CR 46 CL 50, Medellín, Antioquia, Colombia"
8,9,5,5001,5,5001,CL 67 # 56 -22,Medellín,Antioquia,Colombia,"CL 67 # 56 -22 , Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CL 67 # 56 -22 , Medellín, Antioquia, Colombia"
9,10,5,5001,5,5001,CR 72 # 27 - 22,Medellín,Antioquia,Colombia,"CR 72 # 27 - 22, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CR 72 # 27 - 22, Medellín, Antioquia, Colombia"


### **10. Geocoding with the ArcGIS API**
To obtain an ArcGIS API key, register at the following links:
 * Sign up: *https://developers.arcgis.com/sign-up/*
 * Activate the API: *https://developers.arcgis.com/dashboard/#*

#### *Free geocoding: 20,000 requests*

In [34]:
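# Note: geopy's ArcGIS class has no API-key parameter. It authenticates with
# username, password, and referer; `auth_domain` only names the token-service host.
# Without credentials, requests go to the public endpoint anonymously,
# which is how the results below were obtained.
geolocator = ArcGIS()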
def tidygeocode(row):
    location = geolocator.geocode(row['nogeo'])
    if location:
        return pd.Series({'latitude': location.latitude, 'longitude': location.longitude})
    else:
        return pd.Series({'latitude': None, 'longitude': None})

In [35]:
rawdata_for_arcgis[['latitude', 'longitude']] = rawdata_for_arcgis.apply(tidygeocode, axis=1)
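
The same `tidygeocode` body is redefined for each provider, and each cell rebinds the global `geolocator`, so re-running an earlier cell would silently use a different service. One alternative is a small factory that binds the geocoding function once. A sketch under that assumption; `make_tidygeocode` is a hypothetical helper, it returns a plain dict to stay dependency-free, and `fake_geocode` imitates a geopy `Location` so the example runs offline:

```python
from types import SimpleNamespace


def make_tidygeocode(geocode, column="nogeo"):
    """Build a row function bound to one provider's geocode callable."""
    def tidygeocode(row):
        location = geocode(row[column])
        if location:
            return {"latitude": location.latitude, "longitude": location.longitude}
        return {"latitude": None, "longitude": None}
    return tidygeocode


def fake_geocode(query):
    # Stand-in for e.g. ArcGIS().geocode; coordinates taken from this section's output
    if query.startswith("CR 46 CL 50"):
        return SimpleNamespace(latitude=6.248739, longitude=-75.565223)
    return None


geocode_row = make_tidygeocode(fake_geocode)
print(geocode_row({"nogeo": "CR 46 CL 50, Medellín, Antioquia, Colombia"}))
print(geocode_row({"nogeo": "unknown place"}))
```

In the notebook each provider would get its own function, for example `osm_row = make_tidygeocode(Nominatim(user_agent="geografo_pe").geocode)`, applied with `df.apply(lambda r: pd.Series(osm_row(r)), axis=1)`.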



In [36]:
rawdata_for_arcgis

Unnamed: 0,id,COD_DPTO,COD_MPIO,COD_DPTO_num,COD_MPIO_num,Address,City,Region,Country,Addr,TIPO_USARIO,ESTADO_CIVIL,GENERO,nogeo,latitude,longitude
0,1,5,5001,5,5001,CR 52 # 30 - 20,Medellín,Antioquia,Colombia,"CR 52 # 30 - 20, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CR 52 # 30 - 20, Medellín, Antioquia, Colombia",6.298179,-75.557789
1,2,5,5001,5,5001,CRR 80 48 00,Medellín,Antioquia,Colombia,"CRR 80 48 00, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CRR 80 48 00, Medellín, Antioquia, Colombia",6.25944,-75.59765
2,3,5,5001,5,5001,CR 51 # 79 - 10,Medellín,Antioquia,Colombia,"CR 51 # 79 - 10, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CR 51 # 79 - 10, Medellín, Antioquia, Colombia",6.273509,-75.562313
3,4,5,5001,5,5001,CRR 64 C 88 00,Medellín,Antioquia,Colombia,"CRR 64 C 88 00, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CRR 64 C 88 00, Medellín, Antioquia, Colombia",6.281505,-75.57177
4,5,5,5001,5,5001,CR 63 # 123 - 12,Medellín,Antioquia,Colombia,"CR 63 # 123 - 12, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CR 63 # 123 - 12, Medellín, Antioquia, Colombia",6.187188,-75.643068
5,6,5,5001,5,5001,CALLE 16 A SUR CARRERA 9 F,Medellín,Antioquia,Colombia,"CALLE 16 A SUR CARRERA 9 F, Medellín, Antioqu...",Afiliado,Casado,M,"CALLE 16 A SUR CARRERA 9 F, Medellín, Antioqu...",6.22907,-75.542464
6,7,5,5001,5,5001,BUENOS AIRES,Medellín,Antioquia,Colombia,"BUENOS AIRES, Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"BUENOS AIRES, Medellín, Antioquia, Colombia",6.240703,-75.556065
7,8,5,5001,5,5001,CR 46 CL 50,Medellín,Antioquia,Colombia,"CR 46 CL 50, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CR 46 CL 50, Medellín, Antioquia, Colombia",6.248739,-75.565223
8,9,5,5001,5,5001,CL 67 # 56 -22,Medellín,Antioquia,Colombia,"CL 67 # 56 -22 , Medellín, Antioquia, Colombia",Afiliado,Soltero,F,"CL 67 # 56 -22 , Medellín, Antioquia, Colombia",6.26474,-75.56972
9,10,5,5001,5,5001,CR 72 # 27 - 22,Medellín,Antioquia,Colombia,"CR 72 # 27 - 22, Medellín, Antioquia, Colombia",Afiliado,Casado,M,"CR 72 # 27 - 22, Medellín, Antioquia, Colombia",6.229058,-75.592649
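
The Nominatim and ArcGIS outputs give different coordinates for the same address, so it can help to measure how far apart they are. The sketch below applies the haversine great-circle formula to row 8 (`CL 67 # 56 -22`), using the values printed in both tables. `haversine_m` is a hypothetical helper, and the Earth radius of 6,371 km is the usual mean-radius approximation:

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


osm = (6.264485, -75.566958)    # row 8, Nominatim output
arcgis = (6.26474, -75.56972)   # row 8, ArcGIS output
distance = haversine_m(*osm, *arcgis)
print(f"{distance:.0f} m between the two results")
```

A gap of a few hundred metres, as in this row, suggests the two services resolved the address to different points along the street grid, so the choice of provider matters for street-level analysis.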


In [40]:
geo_rawdata_for_arcgis = gpd.GeoDataFrame(data = rawdata_for_arcgis, geometry=gpd.points_from_xy(rawdata_for_arcgis.longitude, rawdata_for_arcgis.latitude),crs = 4326)


In [41]:
geo_rawdata_for_arcgis.to_file('geocoding_directoARCGIS2.gpkg')