## Problem 3: How many people live near shopping centers? (8 points)

In the last step of this analysis, use a *spatial join* to relate a population grid data set to the buffer layer created in *problem 2*, and find out how many people live in the population grid cells that are **within** 1.5 km of each shopping centre.

Use the same population grid data set as during [lesson 3](https://autogis-site.readthedocs.io/en/latest/lessons/lesson-3/spatial-join.html) (load it directly from WFS, don’t forget to assign a CRS).


*Feel free to divide your solution into more codeblocks than prepared! Remember to add comments to your code :)*

### a) Load the population grid data set and the buffer geometries (2 points)

Use the same population grid data set as during [lesson 3](https://autogis-site.readthedocs.io/en/latest/lessons/lesson-3/spatial-join.html) (load it directly from WFS, don’t forget to assign a CRS). Load the data into a `GeoDataFrame` called `population_grid`.

(optional) If you want, discard unneeded columns and translate the remaining column names from Finnish to English.
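
The optional clean-up step could look roughly like the sketch below. It uses a small, made-up `pandas` table (the name `grid` and its values are hypothetical) instead of the real WFS layer; with the real `GeoDataFrame`, the `geometry` column would have to be kept as well.

```python
import pandas

# Toy stand-in for the population grid; the values are made up
grid = pandas.DataFrame(
    {
        "gml_id": ["cell.1", "cell.2"],
        "asukkaita": [5, 8],
        "asvaljyys": [51, 44],
    }
)

# Keep only the column of interest and give it an English name
# ("asukkaita" is Finnish for "inhabitants")
grid = grid[["asukkaita"]].rename(columns={"asukkaita": "population"})
print(grid)
```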

In [1]:
import pathlib
NOTEBOOK_PATH = pathlib.Path().resolve()
DATA_DIRECTORY = NOTEBOOK_PATH / "data"
DATA_DIRECTORY

PosixPath('/Users/cheunghy/Documents/GitHub/population-around-shopping-centres/data')

In [2]:
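# ADD YOUR OWN CODE HERE
import geopandas
import ssl

# Workaround: disable HTTPS certificate verification for this session.
# This is unsafe in general; use it only if the WFS request fails with a certificate error.
ssl._create_default_https_context = ssl._create_unverified_context

# Load the 2020 population grid directly from the HSY WFS endpoint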
population_grid = geopandas.read_file(
    (
        "https://kartta.hsy.fi/geoserver/wfs"
        "?service=wfs"
        "&version=2.0.0"
        "&request=GetFeature"
        "&typeName=asuminen_ja_maankaytto:Vaestotietoruudukko_2020"
        "&srsName=EPSG:3879"
    ),
)
population_grid.crs = "EPSG:3879"  # for WFS data, the CRS needs to be specified manually

In [3]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
import pyproj

assert isinstance(population_grid, geopandas.GeoDataFrame)
assert population_grid.crs == pyproj.CRS("EPSG:3879")



Load the buffers computed in *problem 2* into a `GeoDataFrame` called `shopping_centre_buffers`. Add an `assert` statement to check whether the two data frames are in the same CRS.
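
A minimal sketch of such a CRS check, assuming a single made-up point near Helsinki (the names `points` and `target_crs` are illustrative and not part of the exercise). If the layers differ, `to_crs()` reprojects the coordinates, whereas assigning `.crs` only relabels them.

```python
import geopandas
import pyproj
from shapely.geometry import Point

target_crs = pyproj.CRS("EPSG:3879")

# Hypothetical point in geographic coordinates (longitude, latitude)
points = geopandas.GeoDataFrame(
    {"id": [1]},
    geometry=[Point(24.94, 60.17)],
    crs="EPSG:4326",
)

# Reproject only if the layer is not yet in the target CRS
if points.crs != target_crs:
    points = points.to_crs(target_crs)

assert points.crs == target_crs, "layers must share one CRS before a spatial join"
print(points.geometry.iloc[0])
```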

In [4]:
# ADD YOUR OWN CODE HERE
shopping_centre_buffers = geopandas.read_file(DATA_DIRECTORY / "shopping_centres.gpkg", layer="buffers")
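# Both layers must share the same CRS before they can be joined spatially
assert shopping_centre_buffers.crs == population_grid.crs, "CRS mismatch between buffers and population grid"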

In [5]:
# NON-EDITABLE CODE CELL FOR TESTING YOUR SOLUTION
assert isinstance(shopping_centre_buffers, geopandas.GeoDataFrame)
assert shopping_centre_buffers.geometry.geom_type.unique() == ["Polygon"]
assert shopping_centre_buffers.crs == pyproj.CRS("EPSG:3879")


---

### b) Carry out a *spatial join* between the `population_grid` and the `shopping_centre_buffers`  (2 points)

Join the shopping centre’s `id` column (and others, if you want) to the population grid data frame, for all population grid cells that are **within** the buffer area of each shopping centre. [Use a *join-type* that retains only rows from both input data frames for which the geometric predicate is true](https://geopandas.org/en/stable/gallery/spatial_joins.html#Types-of-spatial-joins). 
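
To illustrate the difference between join types, here is a small sketch on made-up geometries (the names `cells` and `buffers` and all coordinates are hypothetical). With `predicate="within"`, an inner join keeps only the cells that lie inside a buffer, while a left join keeps every cell and fills the buffer columns with `NaN` where nothing matches.

```python
import geopandas
from shapely.geometry import Point, box

# Three toy grid cells: two close to the centre, one far away
cells = geopandas.GeoDataFrame(
    {"asukkaita": [10, 20, 30]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(50, 50, 51, 51)],
    crs="EPSG:3879",
)

# One toy buffer with a radius of 5 units around a made-up centre
buffers = geopandas.GeoDataFrame(
    {"id": [1]},
    geometry=[Point(1, 0.5).buffer(5)],
    crs="EPSG:3879",
)

inner = cells.sjoin(buffers[["id", "geometry"]], how="inner", predicate="within")
left = cells.sjoin(buffers[["id", "geometry"]], how="left", predicate="within")

print(len(inner), "rows with how='inner'")
print(len(left), "rows with how='left'")
```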


In [6]:
print(shopping_centre_buffers.shape)
#print(shopping_centre_buffers['geometry'][0])

(7, 5)


In [7]:
print(population_grid.shape)
#print(population_grid.geometry[0])

(5837, 14)


In [8]:
shopping_centre_buffers.head()

Unnamed: 0,address,id,name,addr,geometry
0,"CAP-Autokoulu, 1-7, Itäkatu, Itäkeskus, Vartio...",1,Itis,"Itäkatu 1-7, 00930 Helsinki, Finland","POLYGON ((4521.466 207.413, 4523.273 133.812, ..."
1,"Mannerheimintie, Keskusta, Kluuvi, Eteläinen s...",2,Forum,"Mannerheimintie 1420, 00100 Helsinki, Finland","POLYGON ((4521.324 207.374, 4523.131 133.772, ..."
2,"Pentik, 11, Piispansilta, Matinkylä, Suur-Mati...",3,Iso-omena,"Piispansilta 11, 02230 Espoo, Finland","POLYGON ((4521.125 207.365, 4522.932 133.763, ..."
3,"Dr. Denim, 3-9, Leppävaarankatu, Ruusutorppa, ...",4,Sello,"Leppävaarankatu 3-9, 02600 Espoo, Finland","POLYGON ((4521.199 207.421, 4523.006 133.820, ..."
4,"Stockmann, 3, Vantaanportinkatu, Vantaanportti...",5,Jumbo,"Vantaanportinkatu 3, 01510 Vantaa, Finland","POLYGON ((4521.349 207.495, 4523.156 133.894, ..."


In [9]:
population_grid.head()

Unnamed: 0,gml_id,index,asukkaita,asvaljyys,ika0_9,ika10_19,ika20_29,ika30_39,ika40_49,ika50_59,ika60_69,ika70_79,ika_yli80,geometry
0,Vaestotietoruudukko_2020.1,703,5,51,99,99,99,99,99,99,99,99,99,"POLYGON ((25472499.995 6685998.998, 25472499.9..."
1,Vaestotietoruudukko_2020.2,710,8,44,99,99,99,99,99,99,99,99,99,"POLYGON ((25472499.995 6684249.004, 25472499.9..."
2,Vaestotietoruudukko_2020.3,711,5,90,99,99,99,99,99,99,99,99,99,"POLYGON ((25472499.995 6683999.005, 25472499.9..."
3,Vaestotietoruudukko_2020.4,715,13,34,99,99,99,99,99,99,99,99,99,"POLYGON ((25472499.995 6682998.998, 25472499.9..."
4,Vaestotietoruudukko_2020.5,848,5,53,99,99,99,99,99,99,99,99,99,"POLYGON ((25472749.993 6690249.003, 25472749.9..."


In [10]:
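# ADD YOUR OWN CODE HERE
# Join each grid cell to the buffer it lies completely within.
# how="left" keeps every grid cell and fills `id` with NaN where nothing matches;
# how="inner" would keep only the matching rows, as the task asks.
shopping_centre_with_population_data = population_grid.sjoin(
    shopping_centre_buffers[['id', 'geometry']],
    how="left",
    predicate="within"
)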
print(shopping_centre_with_population_data.shape)
shopping_centre_with_population_data.head()

(5837, 16)


Unnamed: 0,gml_id,index,asukkaita,asvaljyys,ika0_9,ika10_19,ika20_29,ika30_39,ika40_49,ika50_59,ika60_69,ika70_79,ika_yli80,geometry,index_right,id
0,Vaestotietoruudukko_2020.1,703,5,51,99,99,99,99,99,99,99,99,99,"POLYGON ((25472499.995 6685998.998, 25472499.9...",,
1,Vaestotietoruudukko_2020.2,710,8,44,99,99,99,99,99,99,99,99,99,"POLYGON ((25472499.995 6684249.004, 25472499.9...",,
2,Vaestotietoruudukko_2020.3,711,5,90,99,99,99,99,99,99,99,99,99,"POLYGON ((25472499.995 6683999.005, 25472499.9...",,
3,Vaestotietoruudukko_2020.4,715,13,34,99,99,99,99,99,99,99,99,99,"POLYGON ((25472499.995 6682998.998, 25472499.9...",,
4,Vaestotietoruudukko_2020.5,848,5,53,99,99,99,99,99,99,99,99,99,"POLYGON ((25472749.993 6690249.003, 25472749.9...",,


In [11]:
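# Keep only the grid cells that matched a buffer. The result is empty.
# The WFS data may have changed, but a more likely cause is visible in the outputs above:
# the buffer coordinates (around 4521, 207) are nowhere near the grid coordinates
# (around 25472500, 6686000), so the buffers from problem 2 are probably not in EPSG:3879 units.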
shopping_centre_with_population_data[~shopping_centre_with_population_data['id'].isna()]

Unnamed: 0,gml_id,index,asukkaita,asvaljyys,ika0_9,ika10_19,ika20_29,ika30_39,ika40_49,ika50_59,ika60_69,ika70_79,ika_yli80,geometry,index_right,id
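
An empty join result like this can be diagnosed by comparing the extents of the two layers. The sketch below uses a hypothetical helper, `bounds_overlap`, and illustrative extents loosely based on the coordinates printed above; with the real data, the inputs would be `population_grid.total_bounds` and `shopping_centre_buffers.total_bounds`, which return `(minx, miny, maxx, maxy)`.

```python
def bounds_overlap(a, b):
    """Return True if two (minx, miny, maxx, maxy) boxes intersect."""
    a_minx, a_miny, a_maxx, a_maxy = a
    b_minx, b_miny, b_maxx, b_maxy = b
    return (
        a_minx <= b_maxx
        and b_minx <= a_maxx
        and a_miny <= b_maxy
        and b_miny <= a_maxy
    )


# Illustrative extents only, not the true bounds of either layer
grid_bounds = (25472499.0, 6682998.0, 25472750.0, 6690250.0)
buffer_bounds = (4521.0, 133.0, 4524.0, 208.0)

overlap = bounds_overlap(grid_bounds, buffer_bounds)
print("Extents overlap:", overlap)
```

If the extents do not overlap, no grid cell can be within any buffer, and the buffer geometries need to be recomputed in the grid's CRS.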



---

### c) Compute the population sum around shopping centres (2 points)

Group the resulting (joined) data frame by shopping centre (`id` or `name`), and calculate the `sum()` of the population living inside the 1.5 km radius around them.

Print the results, for instance, in the form "12345 people live within 1.5 km from REDI".
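
As a rough sketch of the grouping step, assume a joined table with one row per matched grid cell. The data frame `joined` and its values are made up for illustration.

```python
import pandas

# Toy join result: each row is one grid cell matched to a shopping centre
joined = pandas.DataFrame(
    {
        "name": ["A", "A", "B"],
        "asukkaita": [5, 8, 20],
    }
)

# Sum the inhabitants of all cells that belong to the same centre
population_by_centre = joined.groupby("name")["asukkaita"].sum()

messages = [
    f"{population} people live within 1.5 km from {name}"
    for name, population in population_by_centre.items()
]
for message in messages:
    print(message)
```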

In [12]:
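# ADD YOUR OWN CODE HERE
# Look up shopping centre names by id
centre_names = shopping_centre_buffers.set_index("id")["name"]

# Sum the inhabitants ("asukkaita") of all grid cells joined to each centre;
# groupby() skips rows whose id is NaN (cells outside every buffer)
population_sums = shopping_centre_with_population_data.groupby("id")["asukkaita"].sum()

# Prints nothing as long as the join above matches no grid cells
for centre_id, population in population_sums.items():
    print(f"{population} people live within 1.5 km from {centre_names[int(centre_id)]}")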


---

### d) Reflection

Good job! You are almost done with this week’s exercise. Please quickly answer the following short questions:
    
- How challenging did you find problems 1-3 (on a scale of 1-5), and why?
- What was easy?
- What was difficult?

Add your answers in a new *Markdown* cell below: