## Problem 1: Geocode shopping centers (5 points)

The overall aim of problems 1-3 is to find out **how many people live within a walking distance (1.5 km) from certain shopping centres in Helsinki**.

Problem 1 concerns the locations of shopping centres: find their addresses and translate them into coordinates.

---

### a) Prepare an input file containing the addresses of shopping centres (1 point)

Find out the addresses of the following shopping centres (e.g., by using your favourite search engine), and collect them in a text file called `shopping_centres.txt`:

 - Itis
 - Forum
 - Iso-omena
 - Sello
 - Jumbo
 - REDI
 - Tripla 
 
The text file should be in semicolon-separated format (`;`) and include the following columns:

- `id` (integer) a unique identifier for each shopping centre (a
- `name` (string) of each shopping center
- `addr` (string) the address 


See an example of how to format the text file [in the lesson 3 materials](https://autogis-site.readthedocs.io/en/latest/lessons/lesson-3/geocoding-in-geopandas.html). Remember to *add*, *commit*, and *push* the file to your git repository.

---


### b) Read the list of addresses (1 point)

Read the list of addresses you just prepared into a `pandas.DataFrame` called `shopping_centres`

In [10]:
import pathlib
NOTEBOOK_PATH = pathlib.Path().resolve()
DATA_DIRECTORY = NOTEBOOK_PATH / "data"
DATA_DIRECTORY

PosixPath('/Users/cheunghy/Documents/GitHub/Automating-GIS-Process-2023-Exercise-3/data')

In [11]:
# ADD YOUR OWN CODE HERE
import pandas as pd
shopping_centres = pd.read_csv(DATA_DIRECTORY / "shopping_centres.txt", sep = '\t', encoding='latin-1')
shopping_centres.head()

Unnamed: 0,id,name,addr
0,1,Itis,"Itäkatu 1-7, 00930 Helsinki, Finland"
1,2,Forum,"Mannerheimintie 1420, 00100 Helsinki, Finland"
2,3,Iso-omena,"Piispansilta 11, 02230 Espoo, Finland"
3,4,Sello,"Leppävaarankatu 3-9, 02600 Espoo, Finland"
4,5,Jumbo,"Vantaanportinkatu 3, 01510 Vantaa, Finland"


In [12]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
import pandas
assert isinstance(shopping_centres, pandas.DataFrame)
for column in ("id", "name", "addr"):
    assert column in shopping_centres.columns


---

### c) Geocode the addresses (2 points)

Geocode the addresses using the Nominatim geocoding service. Join the results with the input data, and store them in a `geopandas.GeoDataFrame` with the same name (`shopping_centres`). 

Remember to define a custom `user_agent` string!

In [4]:
type(shopping_centres)

pandas.core.frame.DataFrame

In [4]:
# ADD YOUR OWN CODE HERE
import geopandas
import pyproj
shopping_centres_addr = geopandas.tools.geocode(
    shopping_centres["addr"],
    provider = "nominatim",
    user_agent = "autogis2023",
    timeout = 10
)
shopping_centres = shopping_centres_addr.join(shopping_centres)
shopping_centres.head()

Unnamed: 0,geometry,address,id,name,addr
0,POINT (25.07947 60.21046),"CAP-Autokoulu, 1-7, Itäkatu, Itäkeskus, Vartio...",1,Itis,"Itäkatu 1-7, 00930 Helsinki, Finland"
1,POINT (24.93738 60.17078),"Mannerheimintie, Keskusta, Kluuvi, Eteläinen s...",2,Forum,"Mannerheimintie 1420, 00100 Helsinki, Finland"
2,POINT (24.73884 60.16151),"Pentik, 11, Piispansilta, Matinkylä, Suur-Mati...",3,Iso-omena,"Piispansilta 11, 02230 Espoo, Finland"
3,POINT (24.81279 60.21846),"Dr. Denim, 3-9, Leppävaarankatu, Ruusutorppa, ...",4,Sello,"Leppävaarankatu 3-9, 02600 Espoo, Finland"
4,POINT (24.96282 60.29245),"Stockmann, 3, Vantaanportinkatu, Vantaanportti...",5,Jumbo,"Vantaanportinkatu 3, 01510 Vantaa, Finland"


In [5]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
import geopandas
assert isinstance(shopping_centres, geopandas.GeoDataFrame)
for column in ("id", "name", "addr", "geometry"):
    assert column in shopping_centres.columns

Check that the coordinate reference system of the geocoded result is correctly defined, and **reproject the layer into ETRS GK-25** (EPSG:3879):

In [6]:
# ADD YOUR OWN CODE HERE
print(shopping_centres.crs)
shopping_centres.crs = pyproj.CRS("EPSG:3879")

EPSG:4326


In [7]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
import pyproj
assert shopping_centres.crs == pyproj.CRS("EPSG:3879")


---

### d) Save the resulting vector data set (1 point)

Save `shopping_centres` as a *GeoPackage* named `shopping_centres.gpkg`:

In [8]:
shopping_centres

Unnamed: 0,geometry,address,id,name,addr
0,POINT (25.079 60.210),"CAP-Autokoulu, 1-7, Itäkatu, Itäkeskus, Vartio...",1,Itis,"Itäkatu 1-7, 00930 Helsinki, Finland"
1,POINT (24.937 60.171),"Mannerheimintie, Keskusta, Kluuvi, Eteläinen s...",2,Forum,"Mannerheimintie 1420, 00100 Helsinki, Finland"
2,POINT (24.739 60.162),"Pentik, 11, Piispansilta, Matinkylä, Suur-Mati...",3,Iso-omena,"Piispansilta 11, 02230 Espoo, Finland"
3,POINT (24.813 60.218),"Dr. Denim, 3-9, Leppävaarankatu, Ruusutorppa, ...",4,Sello,"Leppävaarankatu 3-9, 02600 Espoo, Finland"
4,POINT (24.963 60.292),"Stockmann, 3, Vantaanportinkatu, Vantaanportti...",5,Jumbo,"Vantaanportinkatu 3, 01510 Vantaa, Finland"
5,POINT (24.980 60.187),"Redi, 5, Hermannin rantatie, Verkkosaari, Kala...",6,REDI,"Hermannin rantatie 5, 00580 Helsinki, Finland"
6,POINT (24.931 60.199),"Delhi Rasoi Tripla, 1, Fredikanterassi, Keski-...",7,Tripla,"Fredikanterassi 1, 00520 Helsinki, Finland"


In [9]:
# ADD YOUR OWN CODE HERE
shopping_centres.to_file(DATA_DIRECTORY / "shopping_centres.gpkg")


---

Well done! Now you can continue to [problem 2](Exercise-3-Problem-2.ipynb)