## RQ1 Creating FAO forest map

[Add Description]

Goal: create an FAO definition approximation following the steps in Johnson et al. (2023).

Steps:
1. Data Preparation
    1. ~~Clip the 5 GER LULC shapefiles to the Natura 2000 areas~~ (SKIP FOR NOW)
    2. Merge the 3 required GER LULC shapefiles into 1 shapefile
    3. Reclassify & vectorise the 5m JAXA raster
2. Intersect JAXA and GER LULC shapefiles
3. Conditional reclassification
4. Convert to 5m raster

In [1]:
# SETUP

# Note: this .ipynb file depends on files & folder structures created in rq1_step1_data_prep.ipynb

# Import packages
import glob
import os

import pandas as pd
import geopandas as gpd


### Step 1: Data Preparation

#### Step 1.1: Clip GER LULC SHPs

~~In the rq1_step1_data_prep.ipynb file, I clipped all the output **rasters** to the Germany Natura 2000 areas. In this first data prep step, I do the same for the GER LULC shapefiles, as I will be working with vector data for creating the FAO-aligned forest map.~~

IMPORTANT: Skipping this clipping step for now as I didn't actually manage to clip the output rasters. 

In [3]:
# CLIP GER LULC SHPS

# Store paths to reprojected GER LULC SHPs in a list
#ger_lulc_paths = glob.glob('./processing/clc5_class*xx_3035_DE.shp')

# Create a function which clips each shp to the Germany Natura 2000 areas (& saves it to the processing folder)
#def clip_shp_to_natura(input_paths):
#    # Load Germany Natura 2000 areas
#    natura_de_gdf = gpd.read_file("./outputs/natura2000_3035_DE.shp")
    # Iterate through the GER LULC shp paths 
#    for path in input_paths:
        # Open the shp for each path 
#        ger_lulc_shp = gpd.read_file(path)
        
        # Clip input GER LULC shp to Natura shp
#        shp_clip  = gpd.clip(ger_lulc_shp, natura_de_gdf)

        # For output file naming: extract the input file name (with extension)
#        name_w_ext = os.path.split(path)[1] 
        # For output file naming: remove extension from input file name 
#        name_wo_ext = os.path.splitext(name_w_ext)[0]
        # For output file naming: create the new name for clipped shp
#        new_name = name_wo_ext + "_clipped.shp"

        # Write the clipped shp to the processing folder
#        shp_clip.to_file('./processing/' + new_name)

# Run the function for the reprojected GER LULC shps
#clip_shp_to_natura(ger_lulc_paths)


#### Step 1.2: Merge GER LULC to 1 shp 

To make the next steps easier, the three required GER LULC shps (classes 2xx, 3xx and 4xx) are merged into 1 master shp.

Help for merging/appending shps: https://geopandas.org/en/stable/docs/user_guide/mergingdata.html 

In [2]:
# MERGE GER LULC SHPS

# Load all the GER LULC SHPs
ger_lulc_class2_shp = gpd.read_file("./processing/clc5_class2xx_3035_DE.shp")
ger_lulc_class3_shp = gpd.read_file("./processing/clc5_class3xx_3035_DE.shp")
ger_lulc_class4_shp = gpd.read_file("./processing/clc5_class4xx_3035_DE.shp")

# Append the shapefiles together (ignore_index=True gives the merged rows a unique index)
merged_ger_lulc_shp = pd.concat([ger_lulc_class2_shp,
                                 ger_lulc_class3_shp,
                                 ger_lulc_class4_shp,
                                 ], ignore_index=True)

# Check outputs
print(merged_ger_lulc_shp[1:20])

# Write the merged output to file
merged_ger_lulc_shp.to_file('./processing/clc5_classes234_3035_DE.shp')


   CLC18                                           geometry
1    211  POLYGON ((4146745.819 2717004.087, 4146734.385...
2    211  POLYGON ((4165806.543 2717207.554, 4165818.199...
3    211  POLYGON ((4146387.414 2717223.442, 4146387.103...
4    211  POLYGON ((4146153.622 2717666.462, 4146159.853...
5    211  POLYGON ((4151519.606 2717859.941, 4151522.605...
6    211  POLYGON ((4149840.835 2718128.946, 4149845.087...
7    211  POLYGON ((4169308.498 2718137.965, 4169495.029...
8    211  POLYGON ((4147925.397 2718503.85, 4147932.249 ...
9    211  POLYGON ((4171379.434 2717936.674, 4171376.431...
10   211  POLYGON ((4300299.085 2716461.898, 4300302.822...
11   211  POLYGON ((4173074.078 2718192.411, 4173072.01 ...
12   211  POLYGON ((4172046.043 2718344.172, 4172045.252...
13   211  POLYGON ((4164891.098 2718556.067, 4164894.27 ...
14   211  POLYGON ((4164024.105 2718509.396, 4164030.984...
15   211  POLYGON ((4172833.873 2718627.483, 4172835.567...
16   211  POLYGON ((4167303.57 2718765.2


#### Step 1.3: Reclassify & vectorise clipped JAXA

In order to implement the workflow for creating the FAO-aligned forest map, all input data must be vectors. In this step I vectorise the clipped JAXA 5m raster - but first I reclassify the raster to convert the information to a true Forest-Nonforest map.

For JAXA, the reclassification to true Forest-Nonforest is as follows:

| Original Value | Original Label               | New Value | New Label  |
| -------------- | ---------------------------- | --------- | ---------- |
| 0              | No Data                      | 0         | Non-Forest |
| 1              | Forest (>90% canopy cover)   | 1         | Forest     |
| 2              | Forest (10-90% canopy cover) | 1         | Forest     |
| 3              | Non-Forest                   | 0         | Non-Forest |
| 4              | Water                        | 0         | Non-Forest |

Both forest categories are converted to Forest here because together they cover all canopy cover from 10% upwards, which fits the FAO canopy cover threshold (more than 10%).
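As a minimal sketch of the reclassification table above, the mapping can be applied to an in-memory array with a NumPy lookup table. This is an illustration only, not the method used in this notebook: the names `JAXA_TO_FNF` and `reclassify_jaxa` are hypothetical, and the sketch assumes the raster band has already been read into an integer array holding only the values 0 to 4.

```python
import numpy as np

# Lookup table indexed by the original JAXA value (0-4);
# the entry at each index is the new Forest (1) / Non-Forest (0) value.
# Hypothetical name, values taken from the table above.
JAXA_TO_FNF = np.array([0, 1, 1, 0, 0], dtype=np.uint8)


def reclassify_jaxa(arr):
    """Map original JAXA classes (0-4) to a binary Forest/Non-Forest array."""
    arr = np.asarray(arr)
    # Assumption: only the five documented classes occur in the raster
    if arr.min() < 0 or arr.max() >= len(JAXA_TO_FNF):
        raise ValueError("Unexpected JAXA class value outside 0-4")
    # Integer-array indexing replaces every pixel by its lookup entry
    return JAXA_TO_FNF[arr]


# Toy example covering every original class
sample = np.array([[0, 1, 2],
                   [3, 4, 1]])
fnf_sample = reclassify_jaxa(sample)
print(fnf_sample)
```

Raising an error on unexpected values is a deliberate choice in this sketch, so that an undocumented class is not silently turned into Non-Forest.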

Help with reclassifying (use gdal_calc): https://gis.stackexchange.com/questions/245170/reclassifying-raster-using-gdal 

Help with raster to vector: https://py.geocompx.org/05-raster-vector

