# Introduction
We will learn about two common manipulations for geospatial data: geocoding and table joins.

## Geocoding
Geocoding is the process of converting the name of a place or an address to a location on a map. Searching for a place by name in Google Maps is an everyday example.

In [1]:
from geopandas.tools import geocode

To use the geocoder, we need to provide:<br>
1) The name or address as a Python string<br>
2) The name of the provider. To avoid having to provide an API key, we will use the OpenStreetMap Nominatim geocoder.<br>
<br>
If the geocoding is successful, geocode returns a GeoDataFrame with 2 columns:<br>
1) The "geometry" column, which contains the location as a point (x is the longitude, y is the latitude)<br>
2) The "address" column, which contains the full address<br>
<br>
Cell In [2] uses geopy's Nominatim class directly instead, which returns a Location object rather than a GeoDataFrame.
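
As a minimal sketch of the geopandas call described above, assuming network access to the public Nominatim service: the helper name geocode_place is hypothetical, and the user_agent value simply reuses the one from this notebook.

```python
def geocode_place(place, user_agent="ad_application"):
    """Look up one place name with geopandas and Nominatim.

    Returns (latitude, longitude, address). Needs network access when called.
    """
    # Imported inside the function so the helper is self-contained.
    from geopandas.tools import geocode

    # geocode returns a GeoDataFrame with "geometry" and "address" columns.
    result = geocode(place, provider="nominatim", user_agent=user_agent)
    point = result.geometry.iloc[0]
    # For a point geometry, x is the longitude and y is the latitude.
    return point.y, point.x, result["address"].iloc[0]
```

Calling geocode_place('Taj mahal') should return the latitude, longitude and full address, though the exact values depend on the current OpenStreetMap data.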

In [2]:
from geopy.geocoders import Nominatim

nom = Nominatim(user_agent='ad_application')
place = 'Taj mahal'
result = nom.geocode(place)
result

Location(Taj Mahal, Taj Mahal Internal Path, Taj Ganj, Agra, Uttar Pradesh, 282001, India, (27.1750123, 78.04209683661315, 0.0))

In [3]:
result.latitude

27.1750123

In [4]:
result.longitude

78.04209683661315

A common use case is geocoding many different addresses at once. For instance, say we want to obtain the locations of every university in a list.
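
When many names are geocoded in a row, it helps to pause between requests (the public Nominatim service asks for at most one request per second) and to keep going when a single lookup fails. The sketch below shows one way to do that. The names geocode_many and fake_geocoder are hypothetical, and the stand-in geocoder only exists so the example runs without network access.

```python
import time

def geocode_many(names, geocode_fn, min_delay_seconds=1.0, sleep=time.sleep):
    """Geocode each distinct name once, pausing between lookups.

    geocode_fn is any callable that maps a name to a (latitude, longitude)
    tuple, or returns None when the name cannot be found.
    """
    results = {}
    for name in names:
        if name in results:
            continue  # already looked up, skip the duplicate
        if results:
            sleep(min_delay_seconds)  # pause before every lookup after the first
        try:
            results[name] = geocode_fn(name)
        except Exception:
            results[name] = None  # record the failure and keep going
    return results

# Hypothetical stand-in for a real geocoder, using the coordinates shown earlier.
def fake_geocoder(name):
    known = {"Taj Mahal": (27.1750123, 78.04209683661315)}
    return known.get(name)

found = geocode_many(
    ["Taj Mahal", "Nowhere University", "Taj Mahal"],
    fake_geocoder,
    min_delay_seconds=0,  # no need to wait for the offline stand-in
)
```

With a real geocoder, geocode_fn could wrap nom.geocode and return (result.latitude, result.longitude), keeping in mind that nom.geocode returns None when nothing is found.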

In [6]:
import pandas as pd

universities = pd.read_csv('../Datasets/University.csv')
universities.head()

Unnamed: 0.1,Unnamed: 0,Universities
0,1,Andrews University
1,3,Alabama A&M University
2,4,Arizona State University (ASU)
3,5,Alabama State University
4,6,Athens State University


In [10]:
universities.drop('Unnamed: 0', axis=1, inplace=True)

In [11]:
universities.head()

Unnamed: 0,Universities
0,Andrews University
1,Alabama A&M University
2,Arizona State University (ASU)
3,Alabama State University
4,Athens State University


In [13]:
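import numpy as np
import geopandas as gpd

def my_geocoder(name):
    # Return a Series with the same three keys on success and on failure.
    # Returning None on failure can make apply() yield a single column, and the
    # assignment below then fails with
    # "ValueError: Must have equal len keys and value when setting with an iterable".
    try:
        # geocode (imported from geopandas.tools in the first cell) returns a GeoDataFrame;
        # nom.geocode returns a geopy Location, which has no .geometry attribute.
        point = geocode(name, provider='nominatim', user_agent='ad_application').geometry.iloc[0]
        return pd.Series({'Latitude': point.y, 'Longitude': point.x, 'geometry': point})
    except Exception:
        return pd.Series({'Latitude': np.nan, 'Longitude': np.nan, 'geometry': None})

universities[['Latitude', 'Longitude', 'geometry']] = universities['Universities'].apply(my_geocoder)

print("{}% of addresses were geocoded!".format(universities['Latitude'].notna().mean() * 100))

# Drop universities that were not successfully geocoded
universities = universities.loc[universities['Latitude'].notna()]

# Build the GeoDataFrame and set its coordinate reference system to WGS 84
universities = gpd.GeoDataFrame(universities, geometry='geometry', crs='EPSG:4326')
universities.head()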