## Problem 3: How many people live near shopping centers? (8 points)

In the last step of this analysis, use a *spatial join* to relate a population grid data set to the buffer layer created in *problem 2*, and find out how many people live in the population grid cells that are **within** 1.5 km of each shopping centre.


*Feel free to divide your solution into more code blocks than prepared! Remember to add comments to your code :)*

### a) Load the population grid data set and the buffer geometries

Use the same population grid data set as during [lesson 3](https://autogis-site.readthedocs.io/en/latest/lessons/lesson-3/spatial-join.html) (load it directly from WFS, don’t forget to assign a CRS). Load the data into a `GeoDataFrame` called `population_grid`.

(optional) If you want, discard unneeded columns and translate the remaining column names from Finnish to English.

In [3]:
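# Work around SSL certificate verification errors by disabling verification
# (acceptable for this exercise, but unsafe for untrusted sources)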
import ssl
ssl._create_default_https_context = ssl._create_unverified_context

In [4]:
# Import relevant library
import geopandas

# Load population grid data directly from WFS
population_grid = geopandas.read_file(
    (
        "https://kartta.hsy.fi/geoserver/wfs"
        "?service=wfs"
        "&version=2.0.0"
        "&request=GetFeature"
        "&typeName=asuminen_ja_maankaytto:Vaestotietoruudukko_2020"
        "&srsName=EPSG:3879"
    ),
)
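# Assign the CRS explicitly (the coordinates are already in EPSG:3879)
population_grid = population_grid.set_crs("EPSG:3879", allow_override=True)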
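
Assigning a CRS is not the same as reprojecting. The following is a minimal sketch on made-up toy data (the point coordinates and the `name` value are assumptions for illustration, not taken from the population grid): `set_crs()` only attaches CRS metadata and leaves the coordinates untouched, whereas `to_crs()` transforms the coordinates.

```python
import geopandas
from shapely.geometry import Point

# Toy data: a single point whose coordinates are assumed to be
# EPSG:3879 metres already (values made up for illustration)
gdf = geopandas.GeoDataFrame(
    {"name": ["example"]},
    geometry=[Point(25496000, 6673000)],
)
print(gdf.crs)  # no CRS has been assigned yet

# set_crs() attaches the CRS; the coordinates stay exactly the same
gdf = gdf.set_crs("EPSG:3879")
print(gdf.geometry.iloc[0])

# to_crs() reprojects: the coordinates change to longitude/latitude
gdf_wgs84 = gdf.to_crs("EPSG:4326")
print(gdf_wgs84.geometry.iloc[0])
```

For data loaded from the WFS, the first of the two is what is needed here: the server already delivers coordinates in EPSG:3879, so only the metadata has to be set.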

In [12]:
population_grid.head()

| | gml_id | index | asukkaita | asvaljyys | ika0_9 | ika10_19 | ika20_29 | ika30_39 | ika40_49 | ika50_59 | ika60_69 | ika70_79 | ika_yli80 | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Vaestotietoruudukko_2020.1 | 703 | 5 | 51 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | POLYGON ((25472499.995 6685998.998, 25472499.9... |
| 1 | Vaestotietoruudukko_2020.2 | 710 | 8 | 44 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | POLYGON ((25472499.995 6684249.004, 25472499.9... |
| 2 | Vaestotietoruudukko_2020.3 | 711 | 5 | 90 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | POLYGON ((25472499.995 6683999.005, 25472499.9... |
| 3 | Vaestotietoruudukko_2020.4 | 715 | 13 | 34 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | POLYGON ((25472499.995 6682998.998, 25472499.9... |
| 4 | Vaestotietoruudukko_2020.5 | 848 | 5 | 53 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | POLYGON ((25472749.993 6690249.003, 25472749.9... |


In [13]:
# Select the columns necessary for the study
population_grid = population_grid[["asukkaita", "geometry"]]

# Translate the "asukkaita" column to English
population_grid = population_grid.rename(columns={"asukkaita":"population"})

In [14]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
import geopandas
import pyproj

assert isinstance(population_grid, geopandas.GeoDataFrame)
assert population_grid.crs == pyproj.CRS("EPSG:3879")



Load the buffers computed in *problem 2* into a `GeoDataFrame` called `shopping_centre_buffers`. Add an `assert` statement to check whether the two data frames are in the same CRS.

In [17]:
# Load the buffers layer
shopping_centre_buffers = geopandas.read_file("shopping_centres.gpkg", layer="buffers")

# Ensure the two data frames have the same CRS
assert population_grid.crs == shopping_centre_buffers.crs, "CRS are different"

In [18]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
assert isinstance(shopping_centre_buffers, geopandas.GeoDataFrame)
assert shopping_centre_buffers.geometry.geom_type.unique() == ["Polygon"]
assert shopping_centre_buffers.crs == pyproj.CRS("EPSG:3879")


---

### b) Carry out a *spatial join* between the `population_grid` and the `shopping_centre_buffers`

Join the shopping centre’s `id` column (and others, if you want) to the population grid data frame, for all population grid cells that are **within** the buffer area of each shopping centre. [Use a *join-type* that retains only rows from both input data frames for which the geometric predicate is true](https://geopandas.org/en/stable/gallery/spatial_joins.html#Types-of-spatial-joins). 
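
To make the effect of the join type and the predicate concrete, here is a small sketch on hypothetical toy data (the cells, the buffer, and the name `Toy centre` are invented for illustration; the tiny coordinates are not realistic EPSG:3879 values). It compares an inner join with a left join, and `within` with `intersects`.

```python
import geopandas
from shapely.geometry import Point, box

# Hypothetical toy grid: four 1x1 cells with made-up population counts
cells = geopandas.GeoDataFrame(
    {"population": [10, 20, 40, 30]},
    geometry=[
        box(0, 0, 1, 1),      # completely inside the buffer
        box(1, 0, 2, 1),      # completely inside the buffer
        box(2, 0, 3, 1),      # only partially inside the buffer
        box(10, 10, 11, 11),  # far away from the buffer
    ],
    crs="EPSG:3879",
)

# Hypothetical buffer: a circle of radius 1.5 around a single point
buffers = geopandas.GeoDataFrame(
    {"name": ["Toy centre"]},
    geometry=[Point(1, 0.5).buffer(1.5)],
    crs="EPSG:3879",
)

# Inner join: keeps only cells for which the predicate is true
inner_within = cells.sjoin(buffers, how="inner", predicate="within")

# Left join: keeps every cell; unmatched cells get NaN in the joined columns
left_within = cells.sjoin(buffers, how="left", predicate="within")

# "intersects" also matches the cell that is only partially inside
inner_intersects = cells.sjoin(buffers, how="inner", predicate="intersects")

print(len(inner_within), len(left_within), len(inner_intersects))
```

In this toy setup the inner `within` join keeps the two fully contained cells, the left join keeps all four cells, and `intersects` additionally picks up the partially covered cell.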


In [28]:
# Join the two data frames based on the "within" predicate
shopping_centres_with_population = population_grid.sjoin(
    shopping_centre_buffers,
    how="inner",
    predicate="within"
)

shopping_centres_with_population

| | population | geometry | index_right | address | id | name | addr |
|---|---|---|---|---|---|---|---|
| 1147 | 128 | POLYGON ((25484250.000 6672499.005, 25484250.0... | 2 | Iso Omena, 11, Piispansilta, Matinkylä, Suur-M... | 1102 | Iso-omena | Piispansilta 11, 02230 Espoo, Finnland |
| 1148 | 81 | POLYGON ((25484250.000 6672249.006, 25484250.0... | 2 | Iso Omena, 11, Piispansilta, Matinkylä, Suur-M... | 1102 | Iso-omena | Piispansilta 11, 02230 Espoo, Finnland |
| 1149 | 20 | POLYGON ((25484250.000 6671748.997, 25484250.0... | 2 | Iso Omena, 11, Piispansilta, Matinkylä, Suur-M... | 1102 | Iso-omena | Piispansilta 11, 02230 Espoo, Finnland |
| 1211 | 110 | POLYGON ((25484499.998 6672749.004, 25484499.9... | 2 | Iso Omena, 11, Piispansilta, Matinkylä, Suur-M... | 1102 | Iso-omena | Piispansilta 11, 02230 Espoo, Finnland |
| 1212 | 136 | POLYGON ((25484499.998 6672499.005, 25484499.9... | 2 | Iso Omena, 11, Piispansilta, Matinkylä, Suur-M... | 1102 | Iso-omena | Piispansilta 11, 02230 Espoo, Finnland |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 5288 | 340 | POLYGON ((25505499.998 6677248.998, 25505499.9... | 0 | Itis, 1-7, Itäkatu, Itäkeskus, Vartiokylä, Itä... | 1100 | Itis | Itäkatu 1-7, 00930 Helsinki, Finnland |
| 5341 | 131 | POLYGON ((25505749.995 6677999.006, 25505749.9... | 0 | Itis, 1-7, Itäkatu, Itäkeskus, Vartiokylä, Itä... | 1100 | Itis | Itäkatu 1-7, 00930 Helsinki, Finnland |
| 5342 | 369 | POLYGON ((25505749.995 6677748.997, 25505749.9... | 0 | Itis, 1-7, Itäkatu, Itäkeskus, Vartiokylä, Itä... | 1100 | Itis | Itäkatu 1-7, 00930 Helsinki, Finnland |
| 5343 | 130 | POLYGON ((25505749.995 6677498.998, 25505749.9... | 0 | Itis, 1-7, Itäkatu, Itäkeskus, Vartiokylä, Itä... | 1100 | Itis | Itäkatu 1-7, 00930 Helsinki, Finnland |



---

### c) Compute the population sum around shopping centres

Group the resulting (joined) data frame by shopping centre (`id` or `name`), and calculate the `sum()` of the population living inside the 1.5 km radius around them.

Print the results, for instance, in the form "12345 people live within 1.5 km from REDI".

In [36]:
# Group the joined data frame by shopping centre name
shopping_centres_with_population_grouped = shopping_centres_with_population.groupby(by="name")

# Print the population living within 1.5 km of each shopping centre
for name, group in shopping_centres_with_population_grouped:
    total_pop_within_1500m = group["population"].sum()
    print(f"{total_pop_within_1500m} people live within 1.5 km from {name}.")



56099 people live within 1.5 km from Forum.
26698 people live within 1.5 km from Iso-omena.
21020 people live within 1.5 km from Itis.
10907 people live within 1.5 km from Jumbo.
26605 people live within 1.5 km from REDI.
24601 people live within 1.5 km from Sello.
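
The same per-centre totals can also be computed in a single step, without looping over the groups. The sketch below uses a hypothetical stand-in for the join result (the names `A` and `B` and the population values are made up), so it only illustrates the pattern, not the real figures.

```python
import pandas

# Hypothetical stand-in for the joined data frame:
# one row per (grid cell, shopping centre) pair
joined = pandas.DataFrame(
    {
        "name": ["A", "A", "B"],
        "population": [100, 50, 70],
    }
)

# Sum the population column per shopping centre; the result is a Series
# indexed by shopping centre name
population_per_centre = joined.groupby("name")["population"].sum()

# Iterate over (name, total) pairs to print one sentence per centre
for name, total in population_per_centre.items():
    print(f"{total} people live within 1.5 km from {name}.")
```

Keeping the totals in a `Series` makes them easy to reuse later, for example for sorting or plotting.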



---

### d) Reflection

Good job! You are almost done with this week’s exercise. Please quickly answer the following short questions:
    
- How challenging did you find problems 1-3 (on a scale of 1-5), and why?
- What was easy?
- What was difficult?

Add your answers in a new *Markdown* cell below: