I am trying to understand what drives high shares of necessary land. In the following, I look at certain attributes of regions and check whether they coincide with higher land demand.

In [1]:
import numpy as np
import pandas as pd
import geopandas as gpd

import sys
sys.path.append("..")
from src.conversion import area_in_squaremeters
from src.eligible_land import FARM, FOREST, GlobCover, ProtectedArea

In [2]:
regions = gpd.read_file("../build/municipal/regions.geojson").set_index("id")
pop = pd.read_csv("../build/municipal/population.csv", index_col=0)["population_sum"]
land_cover = pd.read_csv("../build/municipal/land-cover.csv", index_col=0)
protected_areas = pd.read_csv("../build/municipal/protected-areas.csv", index_col=0)
industrial_demand_share = pd.read_csv("../build/municipal/demand.csv", index_col=0)["industrial_demand_fraction"]
necessary_land = pd.read_csv("../build/municipal/full-protection/necessary-land.csv", index_col=0)["fraction_land_necessary"]
necessary_land_no_protection = pd.read_csv("../build/municipal/zero-protection/necessary-land.csv", index_col=0)["fraction_land_necessary"]

In [3]:
# Treat infinite values as missing so that they do not distort the means below.
necessary_land[necessary_land == np.inf] = np.nan
necessary_land_no_protection[necessary_land_no_protection == np.inf] = np.nan
# Population density in inhabitants per km² (area converted from m² to km²).
pop_density = pop / (area_in_squaremeters(regions) / 1e6)
# Shares of farmland and forest in the total land cover of each region.
farmland_share = (land_cover[[f"lc_{cover.value}" for cover in FARM]].sum(axis=1) /
                  land_cover[[f"lc_{cover.value}" for cover in GlobCover]].sum(axis=1))
forest_share = (land_cover[[f"lc_{cover.value}" for cover in FOREST]].sum(axis=1) /
                land_cover[[f"lc_{cover.value}" for cover in GlobCover]].sum(axis=1))
# Share of protected area in each region.
protection_share = (protected_areas[f"pa_{ProtectedArea.PROTECTED.value}"] /
                    protected_areas[[f"pa_{cover.value}" for cover in ProtectedArea]].sum(axis=1))

In [4]:
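urban = pop_density > 1000  # more than 1000 inhabitants per km²
farmland = farmland_share > 0.5  # more than half of the land cover is farmland
forest = forest_share > 0.5  # more than half of the land cover is forest
industry = industrial_demand_share > 0.0  # any industrial demand at all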

## Effect on necessary land

In [5]:
print(necessary_land[urban].mean())
print(necessary_land[~urban].mean())

0.384707554584
0.22619123013


Urban areas need a larger share of their land: 38% on average, compared to 23% elsewhere. Let's exclude them in the following to find the drivers in non-urban regions.
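
The comparison above, and the ones that follow, all have the same shape: split the regions with a boolean mask and compare the means of the two groups. A minimal sketch of that pattern as a helper, using made-up numbers; `compare_means` is a hypothetical name and not part of the project's `src` package.

```python
import pandas as pd


def compare_means(values: pd.Series, mask: pd.Series) -> pd.Series:
    """Mean of `values` inside and outside of `mask` (hypothetical helper)."""
    # Align the mask with the values; regions missing from the mask count as False.
    mask = mask.reindex(values.index, fill_value=False)
    return pd.Series({
        "inside mask": values[mask].mean(),
        "outside mask": values[~mask].mean(),
        "share of regions inside mask": mask.mean(),
    })


# Synthetic example data, not taken from the analysis.
land = pd.Series([0.9, 0.7, 0.2, 0.1, 0.3], index=list("abcde"))
dense = pd.Series([True, True, False, False, False], index=list("abcde"))
result = compare_means(land, dense)
print(result)
```

Reporting the share of regions inside the mask alongside the means helps to judge how much weight a difference between the two groups deserves.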

Another logical driver would be environmental protection. So let's look at the impact of that:

In [6]:
print(necessary_land[protection_share > 0.5].mean())
print(necessary_land[protection_share <= 0.5].mean())

0.797939770031
0.0738162191692


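OK, so environmental protection is an important driver: regions in which more than half of the area is protected need on average 80% of their land, compared to 7% in all other regions.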

In [7]:
print(necessary_land_no_protection[urban].mean())
print(necessary_land_no_protection[~urban].mean())

0.374397537554
0.0469200157844


When ignoring environmental protection, not much changes in the cities. But there are dramatic changes in the rural areas: necessary land goes down from 23% to 5%.

That basically says that environmental protection is the largest driver in rural areas, and population density the largest driver in urban areas.
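
The two comparisons can also be put side by side in a single table. A sketch with synthetic values, assuming both scenarios are available as Series on the same region index; the column and group labels are chosen for illustration only.

```python
import pandas as pd

# Synthetic stand-ins for the two scenario results and the urban mask.
index = ["r1", "r2", "r3", "r4"]
full_protection = pd.Series([0.40, 0.36, 0.30, 0.10], index=index)
zero_protection = pd.Series([0.38, 0.36, 0.06, 0.04], index=index)
urban = pd.Series([True, True, False, False], index=index)

scenarios = pd.DataFrame({
    "full protection": full_protection,
    "zero protection": zero_protection,
})
# Mean necessary land per scenario, split into urban and non-urban regions.
groups = urban.map({True: "urban", False: "non-urban"})
table = scenarios.groupby(groups).mean()
# Reduction of the mean land share when protection is ignored.
table["difference"] = table["full protection"] - table["zero protection"]
print(table)
```

In this made-up example the difference is small for the urban group and large for the non-urban group, which is the pattern described above.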

Let's still look at a few more driver candidates:

In [8]:
print(necessary_land[~urban & industry].mean())
print(necessary_land[~urban & ~industry].mean())
print(industry.sum() / necessary_land.count())

0.722735605418
0.225463455971
0.00176794869734


Industrial loads have a large impact on land demand, but regions with industrial loads make up only 0.2% of all regions.

In [9]:
print(necessary_land[~urban & farmland].mean())
print(necessary_land[~urban & ~farmland].mean())

0.0699881333357
0.413204565942


Non-urban regions with a high share of farmland have a much lower land demand (7% compared to 41%).

In [10]:
print(necessary_land[~urban & forest].mean())
print(necessary_land[~urban & ~forest].mean())

0.825422393076
0.0678863004357


Non-urban regions with a high share of forest have a much higher land demand (83% compared to 7%).

How come? Maybe it's simply that there is almost no population wherever there is forest? Let's have a look.

In [11]:
print(pop[~urban & forest].mean())
print(pop[~urban & ~forest].mean())

2747.82045342
2957.33924023


In [12]:
print(pop[~urban & farmland].mean())
print(pop[~urban & ~farmland].mean())

2974.63907996
2840.40255246


Population does not seem to drive either effect, as the average population of the regions is roughly equal in both groups.

What then drives this effect? Maybe forests are more likely to be protected?

In [13]:
print(protection_share[~urban & forest].mean())
print(protection_share[~urban & ~forest].mean())

0.532708602582
0.374648303198


They are more likely to be protected, but this does not seem to explain the whole effect.

In [14]:
print(necessary_land_no_protection[~urban & forest].mean())
print(necessary_land_no_protection[~urban & ~forest].mean())

0.0512577997423
0.0457735454908


Nope, that was wrong: when excluding environmental protection, the remaining effect is rather small.

In [15]:
print(necessary_land_no_protection[~urban & farmland].mean())
print(necessary_land_no_protection[~urban & ~farmland].mean())

0.0486088453954
0.0448985087129


The same goes for farmland: if we exclude environmental protection, the effect of farmland is no longer visible.
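
The scenario comparison removes protection altogether. A complementary check is to compare forested and non-forested regions within the same protection class, which shows whether protection confounds the land cover effect. A sketch with synthetic data; all numbers and labels are made up.

```python
import pandas as pd

# Synthetic regions: land share, dominant cover, and protection class.
df = pd.DataFrame({
    "necessary_land": [0.9, 0.8, 0.1, 0.1, 0.8, 0.1, 0.1, 0.2],
    "cover": ["forest", "forest", "forest", "other",
              "other", "other", "other", "other"],
    "protection": ["high", "high", "low", "low",
                   "high", "low", "low", "low"],
})

# Naive comparison: forest vs. other, ignoring protection.
naive = df.groupby("cover")["necessary_land"].mean()
# Stratified comparison: forest vs. other within each protection class.
stratified = (df.groupby(["protection", "cover"])["necessary_land"]
                .mean()
                .unstack("cover"))
print(naive)
print(stratified)
```

If the gap between the cover classes is large in the naive comparison but small within each protection class, as in this toy data, protection explains most of the apparent land cover effect.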

### Wait a minute:

Farmland shouldn't/cannot be protected. Can we please quantify the overlap here?

In [16]:
import rasterio

In [17]:
with rasterio.open("../build/protected-areas-europe.tif") as src:
    protected_areas = src.read(1)
with rasterio.open("../build/land-cover-europe.tif") as src:
    land_cover = src.read(1)

In [18]:
protected_farm_land = (np.isin(land_cover, FARM)) & (protected_areas == ProtectedArea.NOT_PROTECTED)

In [19]:
protected_farm_land.sum() / land_cover.sum()

0.00086783555423798945

This is almost negligible (fortunately!). How, then, can environmental protection determine whether farmland shows an effect or not?
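
The overlap between two aligned rasters can be expressed as the share of farmland pixels that are also flagged as protected. A minimal sketch with tiny synthetic arrays; `FARM_CODES` and `PROTECTED_CODE` are hypothetical stand-ins for the values defined in `src.eligible_land`, and counting pixels assumes that all pixels cover roughly the same area.

```python
import numpy as np

# Hypothetical class codes; the real ones are defined in src.eligible_land.
FARM_CODES = [11, 14]
PROTECTED_CODE = 255

# Tiny synthetic rasters with identical shape and alignment.
land_cover = np.array([[11, 14, 50],
                       [11, 50, 50],
                       [14, 11, 210]])
protected = np.array([[0, 255, 255],
                      [0, 255, 0],
                      [0, 0, 255]])

is_farm = np.isin(land_cover, FARM_CODES)
is_protected = protected == PROTECTED_CODE
# Share of farmland pixels that are also protected.
overlap_share = (is_farm & is_protected).sum() / is_farm.sum()
print(overlap_share)
```

Normalising by the number of farmland pixels is one possible choice of denominator; the total number of pixels would give the share of all land instead.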

### End Wait a minute

In the end, it simply must be the non-farmland areas that drive it. Could it be that the more farmland there is in a region, the less environmental protection there is?

In [20]:
print(protection_share[farmland].mean())
print(protection_share[~farmland].mean())

0.347809230507
0.454027169977


Yes, regions with more farmland simply have fewer protected areas. Another effect that might play a role is that farmland is rather flat and easier to build upon than the land in regions with little farmland.
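
The relation between farmland and protection can also be checked without the 50% threshold by correlating the two shares directly. A sketch with made-up shares, assuming both are Series on the same region index; the Pearson coefficient used here is only one possible measure.

```python
import pandas as pd

# Synthetic shares for five regions, not taken from the analysis.
farmland_share = pd.Series([0.9, 0.7, 0.5, 0.3, 0.1])
protection_share = pd.Series([0.1, 0.2, 0.4, 0.5, 0.8])

# Pearson correlation (the default of Series.corr); missing values are ignored.
corr = farmland_share.corr(protection_share)
print(corr)
```

A clearly negative coefficient would point in the same direction as the group means above: more farmland goes along with less protection.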

## Effect on undersupplied population

Let's do the same analysis, but this time considering the undersupplied population.

In [21]:
LAND_THRESHOLD = 1.0

In [22]:
# Population of regions above the threshold, normalised so that the shares sum to 1.
undersupplied_pop = pop[necessary_land > LAND_THRESHOLD].pipe(lambda x: x / x.sum())

In [23]:
print(undersupplied_pop[urban].sum())

0.949490266582


95% of the undersupplied population lives in urban areas.

In [24]:
print(undersupplied_pop[~urban & (protection_share > 0.5)].sum())

0.0284415574676


Of the remaining ~5%, 2.8 percentage points live in non-urban regions with high environmental protection. This leaves only 2.2% of the undersupplied population unexplained.



In [25]:
print(undersupplied_pop[~urban & (protection_share <= 0.5) & industry].sum())

0.015857597175


Of the remaining ~2.2%, 1.6 percentage points live in regions with industrial loads. This leaves only 0.6% unexplained.
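
The step-by-step attribution used here assigns each region to the first factor that applies and sums the population shares per factor. A sketch of that logic with synthetic data; `attribute_shares` is a hypothetical helper, and the order of the masks determines where overlapping regions end up.

```python
import pandas as pd


def attribute_shares(weights: pd.Series, masks: dict) -> pd.Series:
    """Assign each region to the first matching mask and sum its weight share."""
    shares = weights / weights.sum()
    remaining = pd.Series(True, index=weights.index)
    result = {}
    for name, mask in masks.items():
        # Only regions not yet attributed to an earlier factor are considered.
        selected = remaining & mask.reindex(weights.index, fill_value=False)
        result[name] = shares[selected].sum()
        remaining = remaining & ~selected
    result["unexplained"] = shares[remaining].sum()
    return pd.Series(result)


# Synthetic undersupplied regions, not taken from the analysis.
idx = list("abcd")
population = pd.Series([800, 100, 60, 40], index=idx)
masks = {
    "urban": pd.Series([True, False, False, False], index=idx),
    "protected": pd.Series([True, True, False, False], index=idx),
    "industry": pd.Series([False, False, True, False], index=idx),
}
attribution = attribute_shares(population, masks)
print(attribution)
```

Region `a` is both urban and protected in this toy data, but it is counted only once, under the first factor in the dictionary.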

Is this result sensitive to the land threshold? Let's try another one.

In [26]:
undersupplied_pop = pop[necessary_land > 0.3].pipe(lambda x: x / x.sum())
print(undersupplied_pop[urban].sum())
print(undersupplied_pop[~urban & (protection_share > 0.5)].sum())
print(undersupplied_pop[~urban & (protection_share <= 0.5) & industry].sum())

0.912331101357
0.0393102148922
0.00829310881584


When using 30% as the threshold, the shares are 91%, 4%, and 0.8%, so the three factors still explain almost all of the undersupplied population. So no, the result does not seem to be very sensitive to the threshold.
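
Instead of checking a single alternative threshold, the sensitivity can be examined over a range of thresholds. A sketch with synthetic data; the threshold values are arbitrary and only the urban share is tracked to keep the example short.

```python
import pandas as pd

# Synthetic regions, not taken from the analysis.
idx = list("abcde")
necessary_land = pd.Series([1.5, 0.6, 0.4, 0.2, 1.2], index=idx)
pop = pd.Series([500, 300, 100, 50, 50], index=idx)
urban = pd.Series([True, True, False, False, False], index=idx)

rows = {}
for threshold in [0.3, 0.5, 1.0]:
    # Population of regions above the threshold, normalised to shares.
    undersupplied = pop[necessary_land > threshold]
    shares = undersupplied / undersupplied.sum()
    # Share of the undersupplied population living in urban regions.
    rows[threshold] = shares[urban.reindex(shares.index)].sum()

urban_share = pd.Series(rows, name="urban share of undersupplied population")
print(urban_share)
```

If the share barely moves across the range, the conclusion does not hinge on the particular threshold.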

## Conclusion

Outside of urban areas, the main drivers are environmental protection and industrial loads. For municipalities this means that striving for autarky would require either cutting environmental protection or using land that is neither protected nor eligible, such as agricultural areas or forests, for energy farming.