## Problem 1: Geocode shopping centres (5 points)

The overall aim of problems 1-3 is to find out **how many people live within walking distance (1.5 km) of certain shopping centres in Helsinki**.

Problem 1 concerns the locations of shopping centres: find their addresses and translate them into coordinates.

---

### a) Prepare an input file containing the addresses of shopping centres

Find out the addresses of the following shopping centres (e.g., by using your favourite search engine), and collect them in a text file called `shopping_centres.txt`:

 - Itis
 - Forum
 - Iso-omena
 - Sello
 - Jumbo
 - REDI
 - Tripla 
 
The text file should be in semicolon-separated format (`;`) and include the following columns:

- `id` (integer): a unique identifier for each shopping centre
- `name` (string): the name of each shopping centre
- `addr` (string): the address of each shopping centre


See an example of how to format the text file [in the lesson 3 materials](https://autogis-site.readthedocs.io/en/latest/lessons/lesson-3/geocoding-in-geopandas.html). Remember to *add*, *commit*, and *push* the file to your git repository.
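As a rough illustration of the expected layout, the sketch below writes two rows in semicolon-separated form using only the standard library. The two rows are copied from the table shown in part b) of this notebook; writing to an in-memory buffer instead of `shopping_centres.txt` is an assumption made to keep the example self-contained.

```python
import csv
import io

# Two example rows taken from the table in part b); the remaining centres are omitted.
rows = [
    {"id": 1100, "name": "Itis", "addr": "Itäkatu 1-7, 00930 Helsinki, Finnland"},
    {"id": 1104, "name": "Jumbo", "addr": "Vantaanportinkatu 3, 01510 Vantaa, Finnland"},
]

# Write to an in-memory buffer (assumption: a real run would open shopping_centres.txt instead).
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["id", "name", "addr"],
    delimiter=";",          # semicolon-separated, as required above
    lineterminator="\n",
)
writer.writeheader()
writer.writerows(rows)

text = buffer.getvalue()
print(text)
```

Because the addresses contain commas, a semicolon delimiter keeps each address in a single column without any quoting.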

---


### b) Read the list of addresses

Read the list of addresses you just prepared into a `pandas.DataFrame` called `shopping_centres`:

In [1]:
# Import relevant libraries
import pandas as pd
import numpy as np

# Read the semicolon-separated file shopping_centres.txt into a DataFrame
shopping_centres = pd.read_csv("shopping_centres.txt", sep=";")
shopping_centres

Unnamed: 0,id,name,addr
0,1100,Itis,"Itäkatu 1-7, 00930 Helsinki, Finnland"
1,1101,Forum,"Mannerheimintie 14–20, 00100 Helsinki, Finnland"
2,1102,Iso-omena,"Piispansilta 11, 02230 Espoo, Finnland"
3,1103,Sello,"Leppävaarankatu 3-9, 02600 Espoo, Finnland"
4,1104,Jumbo,"Vantaanportinkatu 3, 01510 Vantaa, Finnland"
5,1105,REDI,"Hermannin rantatie 5, 00580 Helsinki, Finnland"
6,1106,Tripla,"Mall of Tripla, Fredikanterassi 1, 00520 Helsi..."


In [2]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
import pandas
assert isinstance(shopping_centres, pandas.DataFrame)
for column in ("id", "name", "addr"):
    assert column in shopping_centres.columns


---

### c) Geocode the addresses

Geocode the addresses using the Nominatim geocoding service. Join the results with the input data, and store them in a `geopandas.GeoDataFrame` with the same name (`shopping_centres`). 

Remember to define a custom `user_agent` string!

In [3]:
! pip install geopandas
! pip install geopy



In [9]:
import geopandas as gpd

# Geocode the shopping centres data
geocoded_shopping_centres = gpd.tools.geocode(
    shopping_centres['addr'],
    provider='nominatim',
    user_agent='autogis2022',
    timeout=20
)

# Join the results to the original dataframe
shopping_centres = geocoded_shopping_centres.join(shopping_centres)

type(shopping_centres)

geopandas.geodataframe.GeoDataFrame

In [10]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
import geopandas
assert isinstance(shopping_centres, geopandas.GeoDataFrame)
for column in ("id", "name", "addr", "geometry"):
    assert column in shopping_centres.columns
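The join in the geocoding cell works because `DataFrame.join` matches rows on the index by default, and the geocoded result keeps the index of the input addresses. The sketch below shows the same index-based join with plain `pandas` objects; the `geocoded` frame and its placeholder values are hypothetical stand-ins for a real geocoder response, not actual Nominatim output.

```python
import pandas as pd

# Input table with the default integer index (0, 1).
centres = pd.DataFrame({"id": [1100, 1101], "name": ["Itis", "Forum"]})

# Hypothetical stand-in for a geocoding result: same index, one extra column.
geocoded = pd.DataFrame(
    {"address": ["placeholder result A", "placeholder result B"]},
    index=centres.index,
)

# join() aligns rows on the index, so each result lands next to its input row.
joined = geocoded.join(centres)
print(joined)
```

Calling `join` on the geocoded object, as in the cell above, keeps the left-hand object's type, which is why the result there is a `GeoDataFrame`.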

Check that the coordinate reference system of the geocoded result is correctly defined, and **reproject the layer into ETRS GK-25** (EPSG:3879):

In [11]:
# The coordinate reference system of the geocoded data
shopping_centres.crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

In [17]:
# Reproject the layer to ETRS GK-25 (EPSG:3879)
shopping_centres = shopping_centres.to_crs("EPSG:3879")
shopping_centres.crs

<Derived Projected CRS: EPSG:3879>
Name: ETRS89 / GK25FIN
Axis Info [cartesian]:
- N[north]: Northing (metre)
- E[east]: Easting (metre)
Area of Use:
- name: Finland - nominally onshore between 24°30'E and 25°30'E but may be used in adjacent areas if a municipality chooses to use one zone over its whole extent.
- bounds: (24.5, 59.94, 25.5, 68.9)
Coordinate Operation:
- name: Finland Gauss-Kruger zone 25
- method: Transverse Mercator
Datum: European Terrestrial Reference System 1989 ensemble
- Ellipsoid: GRS 1980
- Prime Meridian: Greenwich

In [18]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
import pyproj
assert shopping_centres.crs == pyproj.CRS("EPSG:3879")
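One way to see why a metric CRS matters for the 1.5 km walking distance: in EPSG:4326 the coordinates are degrees, and a degree of longitude covers far less ground than a degree of latitude this far north. The sketch below is a back-of-the-envelope estimate, assuming a spherical Earth with a mean radius of 6371 km and a round latitude of 60°N for Helsinki; it is not how the reprojection itself is computed.

```python
import math

EARTH_RADIUS_KM = 6371.0   # assumption: mean radius of a spherical Earth
LATITUDE_DEG = 60.0        # assumption: round value close to Helsinki's latitude

# Ground distance covered by one degree along a meridian (latitude direction).
km_per_deg_lat = 2 * math.pi * EARTH_RADIUS_KM / 360

# One degree along a parallel shrinks with the cosine of the latitude.
km_per_deg_lon = km_per_deg_lat * math.cos(math.radians(LATITUDE_DEG))

# The same 1.5 km corresponds to different amounts of degrees in each direction.
walk_km = 1.5
print(f"1.5 km in degrees of latitude:  {walk_km / km_per_deg_lat:.5f}")
print(f"1.5 km in degrees of longitude: {walk_km / km_per_deg_lon:.5f}")
```

In EPSG:3879 both axes are in metres, so a 1.5 km distance can be expressed directly as `1500`.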


---

### d) Save the resulting vector data set

Save `shopping_centres` as a *GeoPackage* named `shopping_centres.gpkg`:

In [19]:
# Save the final output as a GeoPackage (driver named explicitly rather than inferred from the extension)
shopping_centres.to_file("shopping_centres.gpkg", driver="GPKG")


---

Well done! Now you can continue to [problem 2](Exercise-3-Problem-2.ipynb).