## Problem 1: Geocode shopping centers (5 points)

The overall aim of problems 1-3 is to find out **how many people live within walking distance (1.5 km) of certain shopping centres in Helsinki**.

Problem 1 concerns the locations of shopping centres: find their addresses and translate them into coordinates.

---

### a) Prepare an input file containing the addresses of shopping centres

Find out the addresses of the following shopping centres (e.g., by using your favourite search engine), and collect them in a text file called `shopping_centres.txt`:

 - Itis
 - Forum
 - Iso-omena
 - Sello
 - Jumbo
 - REDI
 - Tripla 
 
The text file should be in semicolon-separated format (`;`) and include the following columns:

- `id` (integer): a unique identifier for each shopping centre
- `name` (string): the name of each shopping centre
- `addr` (string): the address


See an example of how to format the text file [in the lesson 3 materials](https://autogis-site.readthedocs.io/en/latest/lessons/lesson-3/geocoding-in-geopandas.html). Remember to *add*, *commit*, and *push* the file to your git repository.
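
As a rough illustration of the expected layout, the sketch below parses a semicolon-separated text with the three required columns. The `id` values and the `<street address>` placeholders are made up for this example and are not the real addresses; `io.StringIO` only stands in for the file on disk.

```python
import io

import pandas as pd

# Hypothetical file contents: the ids are arbitrary and the addresses are
# placeholders, not the real addresses of these shopping centres.
EXAMPLE_TEXT = """id;name;addr
1;Itis;<street address>, Helsinki, Finland
2;Forum;<street address>, Helsinki, Finland
"""

# io.StringIO stands in for the file on disk; with a real file, pass its
# path (for example DATA_DIRECTORY / "shopping_centres.txt") instead.
example = pd.read_csv(io.StringIO(EXAMPLE_TEXT), sep=";")

print(example.columns.tolist())
print(example.dtypes["id"])
```

Because the separator is a semicolon, commas inside the address stay within a single `addr` field and need no quoting.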

---


### b) Read the list of addresses

Read the list of addresses you just prepared into a `pandas.DataFrame` called `shopping_centres`:

In [86]:
import pathlib
import json
import pandas as pd
import geopandas as gpd
from geopandas.tools import geocode

NOTEBOOK_PATH = pathlib.Path().resolve()
DATA_DIRECTORY = NOTEBOOK_PATH / "data"

In [87]:
with open("query_7.json", "r") as read_file:
    json_data = json.load(read_file)
json_to_df = json_data["features"]

In [106]:
# ADD YOUR OWN CODE HERE
my_list = []
for feature in json_to_df:
    my_list.append(
        {'id': feature["id"],
         'name': feature["properties"]["INF_NOME"],
         'addr': f'{feature["properties"]["INF_MORADA"]}, {feature["properties"]["FREGUESIA"]}, Lisbon, Portugal'
        }        
    )
shopping_centres = pd.DataFrame(my_list)
shopping_centres

Unnamed: 0,id,name,addr
0,1,Armazéns da Ajuda,"Calçada da Ajuda, 89-93;, Ajuda, Lisbon, Portugal"
1,30,Galerias Saldanha Residence,"Avenida Fontes Pereira de Melo, 42-42E, Arroio..."
2,3,Armazéns do Chiado,"Rua Nova do Almada, 102-126;, Santa Maria Maio..."
3,4,Centro Comercial e Cultural Espaço Chiado,"Rua Nova da Trindade, 5-5G;, Santa Maria Maior..."
4,7,Centro Comercial Mouraria,"Rua Fernandes da Fonseca, 1-1B;, Santa Maria M..."


In [107]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
import pandas
assert isinstance(shopping_centres, pandas.DataFrame)
for column in ("id", "name", "addr"):
    assert column in shopping_centres.columns


---

### c) Geocode the addresses

Geocode the addresses using the Nominatim geocoding service. Join the results with the input data, and store them in a `geopandas.GeoDataFrame` with the same name (`shopping_centres`). 

Remember to define a custom `user_agent` string!

In [108]:
# ADD YOUR OWN CODE HERE
geocoded_addresses = geocode(
    shopping_centres["addr"],
    provider="nominatim",
    user_agent="autogis2022",
    timeout=10
)
geocoded_addresses.head()

Unnamed: 0,geometry,address
0,POINT (-9.19945 38.70381),"Calçada da Ajuda, Ajuda, Lisboa, 1300-008, Por..."
1,POINT (-9.14599 38.73178),"Avenida Fontes Pereira de Melo, Saldanha, Arro..."
2,POINT (-9.13916 38.70989),"Rua Nova do Almada, Chiado, Santa Maria Maior,..."
3,POINT (-9.14242 38.71215),"Rua Nova da Trindade, Sacramento, Santa Maria ..."
4,POINT (-9.13511 38.71676),"Rua Fernandes da Fonseca, Socorro, Santa Maria..."


In [109]:
shopping_centres = geocoded_addresses[["geometry"]].join(shopping_centres)

In [110]:
shopping_centres

Unnamed: 0,geometry,id,name,addr
0,POINT (-9.19945 38.70381),1,Armazéns da Ajuda,"Calçada da Ajuda, 89-93;, Ajuda, Lisbon, Portugal"
1,POINT (-9.14599 38.73178),30,Galerias Saldanha Residence,"Avenida Fontes Pereira de Melo, 42-42E, Arroio..."
2,POINT (-9.13916 38.70989),3,Armazéns do Chiado,"Rua Nova do Almada, 102-126;, Santa Maria Maio..."
3,POINT (-9.14242 38.71215),4,Centro Comercial e Cultural Espaço Chiado,"Rua Nova da Trindade, 5-5G;, Santa Maria Maior..."
4,POINT (-9.13511 38.71676),7,Centro Comercial Mouraria,"Rua Fernandes da Fonseca, 1-1B;, Santa Maria M..."
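
The join used here matches rows by index label, which works because the geocoding result carries the same index as the input addresses. The sketch below illustrates that behaviour on made-up data; the names `addresses`, `geocoded`, and `joined`, the coordinates, and the index labels are all assumptions for illustration, and no geocoding service is contacted.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

# Made-up input table with a non-default index to make the alignment visible.
addresses = pd.DataFrame(
    {
        "id": [10, 20],
        "name": ["Centre A", "Centre B"],
        "addr": ["Address A", "Address B"],
    },
    index=[5, 7],
)

# Stand-in for a geocoding result: same index labels, made-up points.
geocoded = gpd.GeoDataFrame(
    {
        "address": ["Resolved A", "Resolved B"],
        "geometry": [Point(24.94, 60.17), Point(24.95, 60.18)],
    },
    index=[5, 7],
    crs="EPSG:4326",
)

# Selecting the geometry column keeps a GeoDataFrame; join() then matches
# the rows of both tables on their index labels.
joined = geocoded[["geometry"]].join(addresses)

print(type(joined).__name__)
print(joined.columns.tolist())
```

Starting the join from the geocoded side is what keeps the result a `GeoDataFrame` with its CRS intact.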


In [111]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
import geopandas
assert isinstance(shopping_centres, geopandas.GeoDataFrame)
for column in ("id", "name", "addr", "geometry"):
    assert column in shopping_centres.columns

Check that the coordinate reference system of the geocoded result is correctly defined, and **reproject the layer into ETRS GK-25** (EPSG:3879):
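
One possible way to do the check before reprojecting is sketched below on a single made-up point. It assumes the geocoded layer arrives in WGS 84 (EPSG:4326), which is worth confirming on the real data rather than taking for granted; the names `points` and `reprojected` are illustrative.

```python
import geopandas as gpd
from shapely.geometry import Point

# A single made-up point near central Helsinki, in longitude/latitude order.
points = gpd.GeoDataFrame(
    {"name": ["example point"], "geometry": [Point(24.94, 60.17)]},
    crs="EPSG:4326",
)

# to_crs() cannot transform a layer without a defined CRS, so check first.
assert points.crs is not None
print(points.crs.to_epsg())  # EPSG code of the current CRS

# Reproject to ETRS GK-25; coordinates are now in metres.
reprojected = points.to_crs("EPSG:3879")
print(reprojected.crs.name)
print(reprojected.geometry.iloc[0])
```

If `crs` turned out to be `None`, `set_crs()` would be the way to declare it; `to_crs()` only transforms coordinates that already have a known CRS.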

In [112]:
# ADD YOUR OWN CODE HERE
shopping_centres = shopping_centres.to_crs("EPSG:3879")

In [113]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
import pyproj
assert shopping_centres.crs == pyproj.CRS("EPSG:3879")

In [115]:
# EPSG:3879 is a projected coordinate system for Finland, but these shopping
# centres are in Portugal, so reproject to ETRS89 / Portugal TM06 (EPSG:3763).
shopping_centres = shopping_centres.to_crs("EPSG:3763")


---

### d) Save the resulting vector data set

Save `shopping_centres` as a *GeoPackage* named `shopping_centres.gpkg`:

In [116]:
# ADD YOUR OWN CODE HERE
shopping_centres.to_file('shopping_centres.gpkg', driver='GPKG')


---

Well done! Now you can continue to [problem 2](Exercise-3-Problem-2.ipynb)