# 1. Adding coordinates to a CSV with addresses

**Table of contents**:
* [Introduction](#Introduction)
* [Starting point: CSV without coordinates](#Starting-point:-CSV-without-coordinates)
* [Geocode addresses](#Geocode-addresses)
* [Display results on a map](#Display-results-on-a-map)

## Introduction
Before they can implement a location intelligence strategy, many organisations must first work out **how to derive coordinates from the geographic component of their data**, such as a postal address. This notebook addresses that challenge.

![Geocoding](./images/geocoding.png)

**Prerequisites:**
* AWS SDK for Python ([Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html)) to call Amazon Location Service
* [python-dotenv](https://pypi.org/project/python-dotenv/) to read key-value pairs from a `.env` file and set them as environment variables
* [pandas](https://pandas.pydata.org/) for data manipulation
* [leafmap](https://leafmap.org/) to visualize the coordinates

In [None]:
! pip3 install boto3
! pip3 install python-dotenv
! pip3 install pandas
! pip3 install leafmap

In [11]:
import boto3
from dotenv import load_dotenv
import os
import pandas as pd

## Starting point: CSV without coordinates

In [12]:
df = pd.read_csv("data/stores/uk_stores_without_coordinates.csv")
df

Unnamed: 0,id,name,address,hours
0,1,Birmingham New Street,"Station Street, Birmingham, B2 4QA",09:00-17:00
1,2,Bristol Temple Meads,"Station Approach, Bristol, BS1 6QF",09:00-17:00
2,3,Clapham Junction,"St John's Hill, Clapham, Greater London, SW11 2QP",09:00-17:00
3,4,Guildford,"Station Approach, Guildford, GU1 4UT",09:00-17:00
4,5,Leeds City,"New Station Street, Leeds, LS1 4DY",08:00-20:30
5,6,Liverpool Lime Street,"Lime Street, Liverpool, L1 1JD",09:00-17:30
6,7,London Bridge,"Tooley Street, London, SE1 3QX",08:00-16:30
7,8,London Cannon Street,"Cannon Street, London, EC4N 6AP",09:00-17:30
8,9,London Charing Cross,"The Strand, London, WC2N 5HF",08:00-18:30
9,10,London Euston,"Euston Road, London, NW1 2RT",09:00-17:30


## Geocode addresses

To initialise Boto3, **you must create a `.env` file in the root directory** with the following structure:

```
AWS_ACCESS_KEY = YOUR_ACCESS_KEY
AWS_SECRET_ACCESS_KEY = YOUR_SECRET_KEY
AWS_REGION = YOUR_REGION
```
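
A misspelt or missing key in the `.env` file tends to surface later as a less obvious Boto3 error, so it can help to check the environment right after loading it. The following is a minimal sketch: `missing_settings` is a hypothetical helper (it is not part of python-dotenv or Boto3), and the key names are the three listed above.

```python
import os

# Key names taken from the .env structure shown above
REQUIRED_KEYS = ("AWS_ACCESS_KEY", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")


def missing_settings(environ=os.environ, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in `environ`."""
    return [key for key in required if not environ.get(key)]


# Call this after load_dotenv(); an empty list means every key was found.
missing = missing_settings()
if missing:
    print(f"Missing settings in .env: {', '.join(missing)}")
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to try without touching real credentials.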

In [13]:
# Load environment variables from .env file
load_dotenv()

# Create a client for Amazon Location Service
amazon_location_client = boto3.client(
    "location", # https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/location.html
    aws_access_key_id = os.getenv("AWS_ACCESS_KEY"),
    aws_secret_access_key = os.getenv("AWS_SECRET_ACCESS_KEY"),
    region_name = os.getenv("AWS_REGION"),
)

Now we will use the [search_place_index_for_text](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/location/client/search_place_index_for_text.html) function to send a request to the [SearchPlaceIndexForText API](https://docs.aws.amazon.com/location/latest/APIReference/API_SearchPlaceIndexForText.html).

In this case, we will use a **Place index resource with data storage enabled**:
![Place index resource - Data storage options screenshot](./images/place-index-resource-data-storage.png)
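
Besides the index name and the search text, the request accepts optional filters. As a rough sketch, the parameters can be assembled in a dictionary first, which makes it easy to add `FilterCountries` (a list of ISO 3166 three-letter country codes) only when needed. `build_search_request` is a hypothetical helper introduced for illustration, and restricting the search to `"GBR"` is an assumption based on the UK addresses in this dataset.

```python
def build_search_request(address, index_name, countries=None, max_results=1):
    """Assemble keyword arguments for search_place_index_for_text."""
    request = {
        "IndexName": index_name,
        "Text": address,
        "MaxResults": max_results,
    }
    if countries:
        # Three-letter ISO 3166 codes, e.g. "GBR" for the United Kingdom
        request["FilterCountries"] = list(countries)
    return request


request = build_search_request(
    "Station Street, Birmingham, B2 4QA",
    "AWS-Esri-Demo-PlaceIndex",
    countries=["GBR"],
)
# The dictionary could then be unpacked into the client call:
# amazon_location_client.search_place_index_for_text(**request)
```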

In [14]:
def amazon_location_geocoder(address):
    result = amazon_location_client.search_place_index_for_text(
        IndexName = 'AWS-Esri-Demo-PlaceIndex',
        Text = address,
        MaxResults = 1,
        # For fine-grained control you can use: BiasPosition, FilterBBox, FilterCountries, ...
    )
    if len(result["Results"]) == 0:
        print(f'No results found for: {address}')
        return None
    else:
        top_result = result["Results"][0]
        # Add additional logic to ensure data match
        place = top_result["Place"]
        geometry = place.get("Geometry")
        if geometry:
            longitude = geometry.get("Point", [None, None])[0]
            latitude = geometry.get("Point", [None, None])[1]
        else:
            longitude = None
            latitude = None
        return {
            "street_number": place.get("AddressNumber"),
            "street_name": place.get("Street"),
            "city": place.get("Municipality"),
            "province": place.get("Region"),
            "postal_code": place.get("PostalCode"),
            "country": place.get("Country"),
            "label": place.get("Label"),
            "longitude": longitude,
            "latitude": latitude,
        }
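
The comment about adding logic to ensure a data match is left open in the function above. One possible check, sketched here under the assumption that every input address ends with a postal code (as the sample rows do), is to confirm that the postal code returned in `Place` also appears in the original address. `postal_code_matches` is a hypothetical helper, and the comparison is deliberately loose: a provider that returns only part of the postal code would still pass.

```python
def _normalise(text):
    """Remove all whitespace and upper-case the text for comparison."""
    return "".join(text.split()).upper()


def postal_code_matches(address, place):
    """Return True when the geocoded postal code appears in the input address."""
    postal_code = place.get("PostalCode")
    if not postal_code:
        return False
    return _normalise(postal_code) in _normalise(address)


# Example with a hand-written Place dictionary (not a real API response)
sample_place = {"PostalCode": "B2 4QA"}
is_match = postal_code_matches("Station Street, Birmingham, B2 4QA", sample_place)
```

A result that fails this check could be flagged for manual review rather than written straight to the output file.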

In [15]:
for index, row in df.iterrows():
    address = row['address']
    result = amazon_location_geocoder(address)
    if result is None:
        # The geocoder already reported the miss; leave the coordinates empty
        continue
    lon = result['longitude']
    lat = result['latitude']
    print(f'Geocoding: {address} - [{lon}, {lat}]')
    df.at[index, 'longitude'] = lon
    df.at[index, 'latitude'] = lat

df.to_csv('data/stores/uk_stores_with_coordinates.csv', index = False)

print("Geocoded stores have been saved locally!")

Geocoding: Station Street, Birmingham, B2 4QA - [-1.897900002961, 52.477159980907]
Geocoding: Station Approach, Bristol, BS1 6QF - [-2.583003846773, 51.448783907766]
Geocoding: St John's Hill, Clapham, Greater London, SW11 2QP - [-0.169860022222, 51.463619997475]
Geocoding: Station Approach, Guildford, GU1 4UT - [-0.565452200312, 51.240223266611]
Geocoding: New Station Street, Leeds, LS1 4DY - [-1.54675119936, 53.795010905595]
Geocoding: Lime Street, Liverpool, L1 1JD - [-2.979140942791, 53.406507913708]
Geocoding: Tooley Street, London, SE1 3QX - [-0.085979970212, 51.503960001983]
Geocoding: Cannon Street, London, EC4N 6AP - [-0.088112832385, 51.511199101651]
Geocoding: The Strand, London, WC2N 5HF - [-0.125497431872, 51.508524845703]
Geocoding: Euston Road, London, NW1 2RT - [-0.134224371636, 51.526194853955]
Geocoding: Euston Road, London, N1 9AL - [-0.123195845726, 51.530367545945]
Geocoding: Bishopsgate, London, EC2M 7PY - [-0.082299979442, 51.518560020029]
Geocoding: Praed Street
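
Before mapping the results, a quick sanity check can catch swapped or missing coordinates. The sketch below tests whether a longitude/latitude pair falls inside a rough bounding box for the United Kingdom. The box values are approximate assumptions chosen for illustration, and `is_within_bounds` is a hypothetical helper.

```python
# Approximate bounding box for the United Kingdom (illustrative values)
UK_BOUNDS = {"min_lon": -8.7, "max_lon": 1.8, "min_lat": 49.8, "max_lat": 60.9}


def is_within_bounds(longitude, latitude, bounds=UK_BOUNDS):
    """Return True when the point lies inside the bounding box."""
    if longitude is None or latitude is None:
        return False
    # NaN values fail both comparisons, so they are reported as out of bounds
    return (
        bounds["min_lon"] <= longitude <= bounds["max_lon"]
        and bounds["min_lat"] <= latitude <= bounds["max_lat"]
    )


# Coordinates taken from the Birmingham row in the output above
birmingham_ok = is_within_bounds(-1.897900002961, 52.477159980907)
# The same pair with longitude and latitude swapped
swapped_ok = is_within_bounds(52.477159980907, -1.897900002961)
```

Applied row by row to the DataFrame, for example with `df.apply(lambda r: is_within_bounds(r['longitude'], r['latitude']), axis=1)`, this would highlight rows worth reviewing.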

## Display results on a map

In [17]:
import leafmap

m = leafmap.Map(center=[51.50852507962595, -0.12549702207699787], zoom=6, draw_control=False, measure_control=False)
m.add_basemap("Esri.WorldTopoMap")
m.add_xy_data('data/stores/uk_stores_with_coordinates.csv', x="longitude", y="latitude", layer_name="UK stores")
m

Map(center=[51.50852507962595, -0.12549702207699787], controls=(ZoomControl(options=['position', 'zoom_in_text…