# Task 3: Analysing green spaces in Berlin and Munich
**The task is to link Census 2022 data for Berlin and Munich with the green space indicator from the IOER-Monitor (1 km raster).**
- The Census 2022 datasets offer the population sum and the population difference sum (compared with 2011), aggregated on a 500 m raster grid (cell centroids are used as points).
- The IOER-Monitor offers green space indicators; please use the one for the year 2022 on the 1,000 m raster grid.
- The linking should be a simple lookup.
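
To make "simple lookup" concrete, here is a minimal sketch of the idea: each 500 m centroid falls into exactly one 1 km cell and takes that cell's indicator value. The coordinates are made up, and the assumption that the 1 km grid is aligned to multiples of 1,000 m in EPSG:3035 is ours; this is not how the SoRa service is necessarily implemented.

```r
# Toy centroids of 500 m census cells in EPSG:3035 (made-up coordinates)
pts <- data.frame(
  id = 1:4,
  x  = c(4550250, 4550750, 4551250, 4551750),
  y  = c(3270250, 3270250, 3270750, 3271250)
)

# Lower-left corner of the 1 km cell that contains each point,
# ASSUMING the 1 km grid is aligned to multiples of 1000 m
cell_size <- 1000
pts$cell_x <- floor(pts$x / cell_size) * cell_size
pts$cell_y <- floor(pts$y / cell_size) * cell_size

# Points 1 and 2 share one 1 km cell and would receive the same indicator value
pts
```

Because four 500 m cells fit into one 1 km cell, several census points will usually carry identical indicator values after the linking.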

## Load functions from SoRa R package
These steps are currently required to load all R functions from the /R/ directory. In the future, the SoRa R package will be installed directly.

In [None]:
# load R functions from SoRa R Package
path <- "/home/jovyan/R/"
sora_functions <- dir(path)
for (i in sora_functions) {
  source(paste0(path, i))
}
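
If the directory ever contains files that are not R scripts, `source()` would fail on them. The following sketch shows one possible way to restrict the loop to `*.R` files; it uses a throwaway directory and a made-up function `demo_hello()` so that it runs on its own, and it is not part of the SoRa package.

```r
# Build a throwaway directory with one R file and one non-R file
demo_dir <- file.path(tempdir(), "sora_demo_R")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("demo_hello <- function() 'hello'", file.path(demo_dir, "hello.R"))
writeLines("not R code", file.path(demo_dir, "README.txt"))

# List only *.R files; full.names = TRUE returns complete paths,
# so no paste0() is needed inside the loop
r_files <- list.files(demo_dir, pattern = "\\.[Rr]$", full.names = TRUE)
for (f in r_files) {
  source(f)
}

demo_hello()
```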

Load ggplot2 for plots and maps

In [None]:
# load ggplot2
library(ggplot2)

## Check that your changed SORA_API_KEY is set
- It is read from the environment variable defined in the .Renviron file.


In [None]:
#check environment variable for SORA_API_KEY
Sys.getenv("SORA_API_KEY")
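
Printing the key shows it in the notebook output. If you only want to know whether it is set, a check along the following lines may be enough. The helper `has_env_value()` and the variable `SORA_DEMO_KEY` are hypothetical and only serve the demonstration, so your real key is not touched.

```r
# Hypothetical helper: TRUE if the environment variable holds a non-empty value
has_env_value <- function(name) {
  # Sys.getenv() returns "" for an unset variable
  nzchar(Sys.getenv(name, unset = ""))
}

# Demonstration with a made-up variable
Sys.setenv(SORA_DEMO_KEY = "abc123")
has_env_value("SORA_DEMO_KEY")   # variable was set above

Sys.unsetenv("SORA_DEMO_KEY")
has_env_value("SORA_DEMO_KEY")   # variable is unset again
```

In the notebook you could then call `has_env_value("SORA_API_KEY")` instead of printing the value.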

## Load, explore and prepare input survey data

In [None]:
path_data <- "/home/jovyan/data/"

Load the datasets and explore them. We start with Berlin.

**Do you know what the five columns mean?**

In [None]:
## data Berlin
berlin <- read.csv(paste0(path_data, "<--->.csv"))
head(berlin)

In [None]:
#show dimension of the data (number of rows and columns)
dim(berlin)
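
Beyond `head()` and `dim()`, a few base R calls help to spot missing values or duplicated ids before linking. The sketch below uses a toy table with made-up values; the column names are taken from the code in this notebook, so please verify them against your actual file.

```r
# Toy stand-in for the census table (all values are made up)
toy <- data.frame(
  id = c("a", "b", "c"),
  x = c(4550250, 4550750, 4551250),
  y = c(3270250, 3270250, 3270750),
  pop_sum = c(120, 0, 45),
  pop_diff_sum = c(10, NA, -5)
)

str(toy)                    # column types
summary(toy$pop_diff_sum)   # value range and number of NAs
anyDuplicated(toy$id)       # 0 means every id is unique
colSums(is.na(toy))         # missing values per column
```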

In [None]:
#draw map of Berlin with population difference sum (between year 2022 and 2011) for each 500m grid cell
ggplot(berlin, aes(x = x, y = y, color = pop_diff_sum)) +
  geom_point(size = 2) +
  scale_color_gradient2(low = "darkred", mid = "white", high = "darkgreen", midpoint = 0) +
  theme_minimal()

We can see that the population of Berlin increased between 2011 and 2022 in many parts of the city.

In [None]:
## plot Berlin
plot(berlin$x, berlin$y, 
     xlab = "x", ylab = "y",
     main = "Census grid (500m raster) from Berlin",
     sub = "crs = 3035")
grid()

And now load and explore the data for Munich.

In [None]:
## data Munich
munich <- read.csv(paste0(path_data, "<--->.csv"))
head(munich)

In [None]:
#show dimension of the data (number of rows and columns)
dim(munich)

In [None]:
#draw map of Munich with population sum for each 500m grid cell
ggplot(munich, aes(x = x, y = y, color = pop_sum)) +
  geom_point(size = 2) +
  scale_color_gradient(low = "white", high = "red") +
  theme_minimal()

The center of Munich has a higher population density than the peripheral parts.

In [None]:
## plot Munich
plot(munich$x, munich$y, 
     xlab = "x", ylab = "y",
     main = "Census grid (500m raster) from Munich",
     sub = "crs = 3035")
grid()

The following function can be used in your scientific R script to stop the execution if the Geolinking Service SoRa is not available:

In [None]:
## check whether SoRa is available; stop if there is a problem
stopifnot(sora_available())

### Prepare and execute the linking jobs for Berlin and Munich
Hint:
- This simple linking method needs only the parameter "method" and no further parameters.

In [None]:
## linking job berlin and munich

## reduce survey data to only id, x and y and add Coordinate Reference System (CRS) using sora_custom() function
sora_data_berlin <- sora_custom(.data = berlin, crs = 3035)

sora_data_munich <- sora_custom(.data = munich, crs = 3035)

## define spatial dataset
spat_data <- sora_spatial(id = "ioer-monitor-f01rg-2022-1000m")

In [None]:
# define linking
linking <- sora_linking(
  method = "<--->"
)

In [None]:
# start the linking request for Berlin

job_id_berlin <- sora_request(dataset = sora_data_berlin, link_to = spat_data, method = linking)

**Please wait a few seconds ;)**

In [None]:
# start the linking request for Munich

job_id_munich <- sora_request(dataset = sora_data_munich, link_to = spat_data, method = linking)

### Get results

First, check whether the linking job is done. If the result is TRUE, you can get the result data.

In [None]:
sora_job_done(job_id_berlin)

In [None]:
sora_job_done(job_id_munich)
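
Instead of re-running the check by hand, a script could poll until the job reports that it is done. The following is a minimal sketch of such a loop; `wait_until_done()` and `fake_job_done()` are hypothetical names introduced here, and the fake job only simulates a service that finishes on the third check.

```r
# Hypothetical helper: call `is_done()` until it returns TRUE
# or the maximum number of attempts is reached
wait_until_done <- function(is_done, max_tries = 10, wait_seconds = 1) {
  for (i in seq_len(max_tries)) {
    if (isTRUE(is_done())) {
      return(TRUE)
    }
    Sys.sleep(wait_seconds)
  }
  FALSE
}

# Stand-in for a job that reports "done" on the third check
calls <- 0
fake_job_done <- function() {
  calls <<- calls + 1
  calls >= 3
}

finished <- wait_until_done(fake_job_done, max_tries = 5, wait_seconds = 0.1)
finished
```

In this notebook, the same helper could be called as `wait_until_done(function() sora_job_done(job_id_berlin))`. The upper limit on attempts keeps the script from waiting forever if a job fails.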

In [None]:
## get results for Berlin

if (sora_job_done(job_id_berlin)){
  results_berlin <- sora_results(job_id = job_id_berlin)
  head(results_berlin)
}

In [None]:
## get results for Munich

if (sora_job_done(job_id_munich)){
  results_munich <- sora_results(job_id = job_id_munich)
  head(results_munich)
}

### Merge the datasets

Merge the result data with the original census datasets (to include the population columns).

In [None]:
# merge Berlin data
linked_berlin <- merge(berlin, results_berlin, by="id")
head(linked_berlin)

In [None]:
# merge Munich data
linked_munich <- merge(munich, results_munich, by="id")
head(linked_munich)
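
Note that `merge()` keeps only ids that occur in both tables by default, so census cells without a linking result would silently disappear. The sketch below illustrates this with toy tables (made-up values) and shows one way to detect and keep such rows.

```r
# Toy census table and toy linking result (made-up values)
census <- data.frame(id = 1:4, pop_sum = c(10, 20, 30, 40))
result <- data.frame(id = c(1, 2, 4), value = c(55.0, 12.5, 80.1))

# Default behaviour: only ids present in both tables are kept
linked <- merge(census, result, by = "id")
nrow(census) - nrow(linked)     # number of census cells without a result

# all.x = TRUE keeps every census row and fills the gaps with NA
linked_all <- merge(census, result, by = "id", all.x = TRUE)
setdiff(census$id, result$id)   # ids that did not get a value
```

Comparing `nrow()` before and after the merge is a cheap sanity check for the Berlin and Munich tables as well.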

Plot the results.

### Plots of Berlin

In [None]:
# plot population sum (Census 2022) against proportion of open space
plot(linked_berlin$pop_sum, linked_berlin$value,
     type = "p",
     main = "population sum vs \n proportion of open space",
     xlab = "Population sum", ylab = "Proportion of open space")

In [None]:
# plot population difference (2022 vs. 2011) against proportion of open space
plot(linked_berlin$pop_diff_sum, linked_berlin$value,
     type = "p",
     main = "population difference vs \n proportion of open space",
     xlab = "Population difference", ylab = "Proportion of open space")

### Plots of Munich

In [None]:
# plot population sum (Census 2022) against proportion of open space
plot(linked_munich$pop_sum, linked_munich$value,
     type = "p",
     main = "population sum vs \n proportion of open space",
     xlab = "Population sum", ylab = "Proportion of open space")

In [None]:
# plot population difference (2022 vs. 2011) against proportion of open space
plot(linked_munich$pop_diff_sum, linked_munich$value,
     type = "p",
     main = "population difference vs \n proportion of open space",
     xlab = "Population difference", ylab = "Proportion of open space")
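
To complement the scatter plots with a single number, one option is a rank correlation between population and the indicator. The sketch below uses a toy table with made-up values; the choice of Spearman's method is our assumption for illustration and not a recommendation from the task.

```r
# Toy linked table (made-up values, one missing indicator value)
toy_linked <- data.frame(
  pop_sum = c(0, 50, 200, 800, 1500, 3000),
  value   = c(95, 80, 60, NA, 30, 10)
)

# Spearman rank correlation; rows containing NA are dropped
rho <- cor(toy_linked$pop_sum, toy_linked$value,
           method = "spearman", use = "complete.obs")
rho
```

The toy values decrease strictly with population, so `rho` is -1 here. With the real data, you would pass `linked_berlin$pop_sum` and `linked_berlin$value` (or the Munich columns) instead.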