## Problem 3: How many people live near shopping centers? (10 points)

In the last step of this analysis, use a *spatial join* to relate a population grid data set to the buffer layer created in *problem 2*, and find out how many people live in the population grid cells that are **within** 1.5 km of each shopping centre.

Use the same population grid data set as during [lesson 3](https://autogis-site.readthedocs.io/en/latest/lessons/lesson-3/spatial-join.html) (load it directly from WFS, don’t forget to assign a CRS).


*Feel free to divide your solution into more code blocks than prepared! Remember to add comments to your code :)*

### a) Load the population grid data set and the buffer geometries (2 points)

Use the same population grid data set as during [lesson 3](https://autogis-site.readthedocs.io/en/latest/lessons/lesson-3/spatial-join.html) (load it directly from WFS, don’t forget to assign a CRS). Load the data into a `GeoDataFrame` called `population_grid`.

(optional) If you want, discard unneeded columns and translate the remaining column names from Finnish to English.

In [31]:
# ADD YOUR OWN CODE HERE
import geopandas as gpd

# Load the 2020 population grid directly from the HSY WFS endpoint
population_grid = gpd.read_file(
    "https://kartta.hsy.fi/geoserver/wfs"
    "?service=wfs"
    "&version=2.0.0"
    "&request=GetFeature"
    "&typeName=asuminen_ja_maankaytto:Vaestotietoruudukko_2020"
    "&srsName=EPSG:3879"
)

# Assign the CRS explicitly (the coordinates were requested in EPSG:3879)
population_grid = population_grid.set_crs("EPSG:3879")

# Keep only the population count and the geometry, and translate the column name
population_grid = population_grid[["asukkaita", "geometry"]]
population_grid = population_grid.rename(columns={"asukkaita": "population"})

In [33]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
import geopandas
import pyproj

assert isinstance(population_grid, geopandas.GeoDataFrame)
assert population_grid.crs == pyproj.CRS("EPSG:3879")



Load the buffers computed in *problem 2* into a `GeoDataFrame` called `shopping_centre_buffers`. Add an `assert` statement to check whether the two data frames are in the same CRS.
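
The CRS check described above can be sketched on toy data. The two layers and their coordinates below are hypothetical stand-ins, not the exercise data; the sketch only illustrates comparing `.crs` attributes and, as one possible remedy for a mismatch, reprojecting with `to_crs()`.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical stand-in layers that deliberately use different CRSs
points_wgs84 = gpd.GeoDataFrame(geometry=[Point(24.94, 60.17)], crs="EPSG:4326")
grid_like = gpd.GeoDataFrame(geometry=[Point(25496000, 6672000)], crs="EPSG:3879")

# Reproject only when the two layers disagree
if points_wgs84.crs != grid_like.crs:
    points_projected = points_wgs84.to_crs(grid_like.crs)
else:
    points_projected = points_wgs84

# The same kind of assert the exercise asks for
assert points_projected.crs == grid_like.crs
```

Note that `set_crs()` only declares which CRS the existing coordinates are in, whereas `to_crs()` transforms the coordinates themselves.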

In [34]:
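# ADD YOUR OWN CODE HERE
# Read the buffer polygons saved in problem 2
shopping_centre_buffers = gpd.read_file('./data/shopping_centres.gpkg', layer='buffers')

# Both layers must share a CRS before they can be joined spatially
assert shopping_centre_buffers.crs == population_grid.crs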

In [35]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
assert isinstance(shopping_centre_buffers, geopandas.GeoDataFrame)
assert shopping_centre_buffers.geometry.geom_type.unique() == ["Polygon"]
assert shopping_centre_buffers.crs == pyproj.CRS("EPSG:3879")


---

### b) Carry out a *spatial join* between the `population_grid` and the `shopping_centre_buffers` (4 points)

Join the shopping centre’s `id` column (and others, if you want) to the population grid data frame, for all population grid cells that are **within** the buffer area of each shopping centre. [Use a *join-type* that retains only rows from both input data frames for which the geometric predicate is true](https://geopandas.org/en/stable/gallery/spatial_joins.html#Types-of-spatial-joins). 
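
One way to read the task statement above is sketched below on toy data: the grid cells form the left frame, the buffers the right frame, and `how="inner"` keeps only the matching pairs. The three grid cells, the single buffer, and the names `grid`, `buffers` and "Toy Centre" are hypothetical and stand in for the real layers.

```python
import geopandas as gpd
from shapely.geometry import Point, box

CRS = "EPSG:3879"

# Hypothetical grid: two cells near the origin, one cell far away
grid = gpd.GeoDataFrame(
    {"population": [10, 20, 30]},
    geometry=[
        box(0, 0, 100, 100),
        box(100, 0, 200, 100),
        box(1000, 0, 1100, 100),
    ],
    crs=CRS,
)

# Hypothetical buffer: a 300 m circle around a single toy shopping centre
buffers = gpd.GeoDataFrame(
    {"id": [1], "name": ["Toy Centre"]},
    geometry=[Point(100, 50).buffer(300)],
    crs=CRS,
)

# "within" is evaluated as left-geometry within right-geometry, so each grid
# cell is tested against each buffer; the inner join drops unmatched cells
joined = grid.sjoin(buffers, how="inner", predicate="within")
print(joined[["population", "id", "name"]])
```

The order of the two frames matters with `within`: swapping them would instead ask whether a buffer lies inside a grid cell.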


In [36]:
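# ADD YOUR OWN CODE HERE
# Left join with the buffers as the left frame: every buffer row is kept, and
# a grid cell is attached only where the buffer lies within that cell
shopping_centres_pop_data = shopping_centre_buffers.sjoin(
    population_grid,
    how='left',
    predicate='within'
)
shopping_centres_pop_data.head()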

| | geometry | index_right | population |
|---|---|---|---|
| 0 | POLYGON ((25496286.135 6672912.233, 25496286.1... | 3392.0 | 303.0 |
| 1 | POLYGON ((25496548.33 6672921.492, 25496548.32... | 3438.0 | 133.0 |
| 2 | POLYGON ((25496770.508 6672992.639, 25496770.5... | 3486.0 | 12.0 |
| 3 | POLYGON ((25497133.775 6672936.908, 25497133.7... | NaN | NaN |
| 4 | POLYGON ((25496993.975 6672784.27, 25496993.96... | 3486.0 | 12.0 |



---

### c) Compute the population sum around shopping centres (4 points)

Group the resulting (joined) data frame by shopping centre (`id` or `name`), and calculate the `sum()` of the population living inside the 1.5 km radius around them.

Print the results, for instance, in the form "12345 people live within 1.5 km from REDI".
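
The grouping and printing steps described above can be sketched with plain pandas. The table below is hypothetical: it assumes one row per matched (grid cell, shopping centre) pair, with made-up names and counts.

```python
import pandas as pd

# Hypothetical join result: one row per (grid cell, shopping centre) match
joined = pd.DataFrame(
    {
        "name": ["Centre A", "Centre A", "Centre B"],
        "population": [120, 80, 45],
    }
)

# Sum the population of all grid cells matched to each shopping centre
population_by_centre = joined.groupby("name")["population"].sum()

# Format one sentence per shopping centre
lines = [
    f"{population} people live within 1.5 km from {name}"
    for name, population in population_by_centre.items()
]
print("\n".join(lines))
```

Grouping by `name` assumes the names are unique; grouping by `id` would be the safer key if they are not.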

In [37]:
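import pandas as pd

# Load the shopping centre attributes; their geometry is not needed here
shopping_centres = gpd.read_file('./data/shopping_centres.gpkg', layer='shopping_centres')
del shopping_centres['geometry']
# Attach the attributes to the join result (rows are aligned on the index)
data = pd.concat((shopping_centres, shopping_centres_pop_data), axis=1)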



In [39]:
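# Buffers without a matching grid cell have no population value; count them as 0
data['population'] = data['population'].fillna(0).astype(int)

# Print one sentence per shopping centre
for name, population in zip(data['name'], data['population']):
    print(f"{population} people live within 1.5 km from {name}")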

303 people live within 1.5 km from Kamppi Shopping Center
133 people live within 1.5 km from Forum
12 people live within 1.5 km from Citycenter
0 people live within 1.5 km from Kluuvi
12 people live within 1.5 km from Kämp Galleria
0 people live within 1.5 km from Itis (Itäkeskus)
647 people live within 1.5 km from REDI
298 people live within 1.5 km from Kaari
429 people live within 1.5 km from Columbus
213 people live within 1.5 km from Ruoholahti
1659 people live within 1.5 km from Arabia
350 people live within 1.5 km from Malmin Nova
384 people live within 1.5 km from Ogeli
462 people live within 1.5 km from Hertsi
188 people live within 1.5 km from Kontulan Ostari
492 people live within 1.5 km from Lauttis
534 people live within 1.5 km from Tripla Mall
12 people live within 1.5 km from Stockmann Department Store
307 people live within 1.5 km from Puhos Shopping Center
45 people live within 1.5 km from Kaisaniemi Shopping Center



---

### d) Reflection

Good job! You are almost done with this week’s exercise. Please quickly answer the following short questions:
    
- How challenging did you find problems 1-3 (on a scale of 1 to 5), and why?
- What was easy?
- What was difficult?

Add your answers in a new *Markdown* cell below:

In [None]:
easy!