# Area statistics for protected areas in mainland Norway

[![colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ac-willeke/pygdal-geo-engineer/blob/main/notebooks/2024.01_conservation_and_preservation.ipynb) [![github](https://img.shields.io/badge/GitHub-View%20on%20GitHub-blue?logo=github)](https://github.com/ac-willeke/pygdal-geo-engineer/blob/main/notebooks/2024.01_conservation_and_preservation.ipynb)

**Author**: Willeke A'Campo

**Description:** This notebook calculates area statistics for protected areas in mainland Norway. The statistics are calculated for the datasets **Verneområder** (protected areas) and **Foreslåtte verneområder** (proposed protected areas) from the Norwegian Environment Agency.

The area statistics are divided into three groups:

1. Area variables for the protected areas:
    - Area (terrestrial and marine)
    - Perimeter (terrestrial and marine)
    - Land area (terrestrial)
    - Perimeter land area (terrestrial)

2. Overlay statistics for the protected area and the following datasets:
    - AR50 - Bonitet
        - Area proportion of land cover quality classes 
    - Bioklimatiske soner
        - Area proportion of bioclimatic zone classes
    - Infrastrukturindeks
        - Mean value of infrastructure index per protected area 
        - Area proportion of infrastructure index classes
    - Høydelag
        - Area proportion of elevation classes 

3. Spatial indices for the protected areas:
    - Density of protected area per 10x10km (SSB grid)
    - Shape Index for each protected area (land + marine)
    - Shape index for each protected area (land)






## Configure Workflow

In [None]:
# Variables to control the execution of the workflow
gis_server = True # True = use GIS server, False = use local files
load_data = False # True = load data from source, False = data is already loaded
test_area = False # True = test area, False = full area
prepare_duckdb = False # True = prepare duckdb, False = duckdb is already prepared
run_ar50 = False # True = run AR50 area calc , False = AR50 is already run

# protected area
protected_area = None  # TODO: not set in this notebook

In [None]:
import os 
from osgeo import gdal
from pathlib import Path
from itertools import islice

import pandas as pd
import geopandas as gpd
from shapely import wkt
from shapely.geometry import box

# mapping libraries
# https://leafmap.org/faq/#how-to-use-a-specific-plotting-backend
import leafmap.foliumap as leafmap

In [None]:
project_path= Path.cwd().parents[0]
shell_path= os.path.join(project_path, "src", "shell")
python_path= os.path.join(project_path, "src", "python")

if gis_server:
    user = "willeke.acampo" #input("Username: firstname.lastname")
    data_path = f"/home/NINA.NO/{user}/Mounts/scratch/wilaca/vern_og_bevaring/data"
else:
    data_path = os.path.join(project_path, "data")

print(f"Gdal Version: {gdal.__version__} ")
print(f"Project Path: {project_path}")
print(f"Path to shell scripts: {shell_path}")
print(f"Path to python scripts: {python_path}")
print(f"Path to data: {data_path}")

In [None]:
# import local scripts
import sys 
sys.path.append(python_path)

from ogr_utils import import_gpkg, print_layer_schema
from lookup import create_lookup_dict, lookup_value
from duckdb_utils import *

## Download and Import Data

Get the data and import into geopackages `vern_25833.gpkg` and `bevaring_25833.gpkg` using ogr2ogr.

```python
# change path to data folder
os.chdir(os.path.join(data_path, "raw"))
print(f"Download data to: {os.getcwd()}")
```

```bash
# ArcGIS REST: Naturvern (02.2024)
ogr2ogr -f "GPKG" -t_srs EPSG:25833 -nln naturvernomrade -nlt MULTIPOLYGON vern_25833.gpkg "https://kart.miljodirektoratet.no/arcgis/rest/services/vern/mapserver/0/query?where=1%3D1&outfields=*&f=json&resultRecordCount=1"

# ArcGIS REST: Foreslåtte verneområder (02.2024)
# -update adds the layer to the existing vern_25833.gpkg
ogr2ogr -update -f "GPKG" -t_srs EPSG:25833 -nln foreslatt_vern -nlt MULTIPOLYGON vern_25833.gpkg "https://kart.miljodirektoratet.no/arcgis/rest/services/vern/mapserver/4/query?where=1%3D1&outfields=*&f=json&resultRecordCount=1"

# AR50 - 2022 
# Stored on NINA's server /GeoSpatialData

# Bioklima
# Received from V. Bakkestuen (NINA)

# SSB grid
curl -o Ruter_10KM_norge.zip "https://www.ssb.no/natur-og-miljo/_attachment/375082?_ts=1685c0e69b8"
unzip Ruter_10KM_norge.zip

# County boundaries (fylker)
# Stored on NINA's server /GeoSpatialData

# Infrastruktur
# Received from V. Bakkestuen (NINA)
```

Downloaded datasets are stored in the `/data` folder:

**vern_25833.gpkg** contains the following layers:

| Layer | Dataset Name | Description | Year | Source |
|-------| ------------ | ----------- | ---- | ------ |
| naturvernomrade | Naturvernområder | Nature protected areas | 2024 | [Miljødirektoratet](https://kartkatalog.miljodirektoratet.no/Dataset/Details/0) |
| foreslatt_vern | Foreslåtte naturvernområder | Proposed nature protected areas | 2024 | [Miljødirektoratet](https://kartkatalog.miljodirektoratet.no/Dataset/Details/1) |

**bevaring_25833.gpkg** contains the following layers:

| Layer | Dataset Name | Description | Year | Source |
|-------| ------------ | ----------- | ---- | ------ |
| ar50_2022 | AR50 | Land cover classes | 2022 | [NIBIO](https://kart8.nibio.no/nedlasting/dashboard) |
| bioklima_soner_2017 | Bioklimatiske soner | Bioclimatic zones | 2017 | [Artsdatabanken](https://data.artsdatabanken.no/Natur_i_Norge/Natursystem/Beskrivelsessystem/Regional_naturvariasjon/Bioklimatisk_sone) |

**other files**

| Filename | Dataset Name | Description | Year | Source |
|----------| ------------ | ----------- | ---- | ------ |
| ssb_grid_5km.geojson | SSB rutenett (5x5 km)| Grid for Norway | 2024 | [SSB](https://kart.ssb.no/) |
| fylker_2024.geojson | Fylker, 2024 | County boundaries | 2024 | [SSB](https://www.ssb.no/en/kart/griddata) |
| Infra25m.tif | Infrastrukturindeks | Infrastructure index (25m) | 2024 | Internal NINA dataset |

In [None]:
if load_data: 
    # run shell scripts from /src/shell
    os.chdir(shell_path)

    input_file = "vern_25833.gpkg"
    path = os.path.join(data_path, "interim")

    ! gdalinfo --version
    ! chmod +x gdal_gpkg-info.sh
    ! ./gdal_gpkg-info.sh {path} {input_file}

## Data Preparation

### Study Area | Norway Mainland

This workflow was developed and run for mainland Norway. To test or adapt it, we recommend running it on a smaller study area first. Bounding box coordinates (EPSG:25833) are provided below for the Dovrefjell, Fosen and Trondheim areas.

**Create Bounding Box** 

In [None]:
# select the test area: "dovrefjell", "fosen" or "trondheim"
bounding_box = "fosen"

# bounding box coordinates (xmin, ymin, xmax, ymax) in EPSG:25833
if bounding_box == "dovrefjell":
    xmin, ymin, xmax, ymax = 160000.00, 6900000.00, 260000.00, 6950000.00
elif bounding_box == "fosen":
    xmin, ymin, xmax, ymax = 180000.00, 7010000.00, 290000.00, 7150000.00
elif bounding_box == "trondheim":
    xmin, ymin, xmax, ymax = 260520.12, 7032142.5, 278587.56, 7045245.27
else:
    raise ValueError(f"Unknown bounding box: {bounding_box}")

# Create a bounding box
boxBB = box(xmin, ymin, xmax, ymax)
crs = 'EPSG:25833'

gdf_BB = gpd.GeoDataFrame(geometry=[boxBB])
gdf_BB['name'] = f"{bounding_box}_BB"  # name follows the selected test area
gdf_BB.crs = crs
bounds = gdf_BB.bounds.to_numpy().tolist()[0]

In [None]:
if load_data: 
    # run shell scripts from /src/shell
    os.chdir(shell_path)
    for file in ["vern", "bevaring", "admin"]:
        input_file = os.path.join(data_path, "interim", f"{file}_25833.gpkg")
        output_file = os.path.join(data_path, "tmp", f"{file}_25833_bbox.gpkg")
        
        #for layer_name in layer_names:
        ! chmod +x gdal_copy-file-bbox.sh
        ! ./gdal_copy-file-bbox.sh {input_file} {output_file} {xmin} {ymin} {xmax} {ymax}

### AR50 - Bonitet

Translate the Bonitet classes using a lookup table.
 
&emsp; $Bonitet = (artype, artreslag, arskogbon, arjordbr, ardyrking, arveget)$ 
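
The translation can be pictured as a dictionary keyed on the six attribute codes. The sketch below is a minimal, standard-library illustration. `build_lookup` is a hypothetical stand-in for the project's `create_lookup_dict` helper, whose implementation is not shown in this notebook, and the code values in `rows` are made up for the example rather than taken from the real lookup table.

```python
# Hypothetical sketch: map a tuple of AR50 attribute codes to a Bonitet class.
KEYS = ("artype", "arskogbon", "artreslag", "arveget", "arjordbr", "ardyrking")


def build_lookup(rows, keys=KEYS, value="ar50_bonitet"):
    """Return {(code, ...): bonitet} from an iterable of row dicts."""
    return {tuple(int(row[k]) for k in keys): int(row[value]) for row in rows}


# Illustrative rows only; the real codes live in AR50_bonitet_lookup.csv.
rows = [
    {"artype": 30, "arskogbon": 14, "artreslag": 31, "arveget": 98,
     "arjordbr": 98, "ardyrking": 98, "ar50_bonitet": 3},
    {"artype": 82, "arskogbon": 98, "artreslag": 98, "arveget": 98,
     "arjordbr": 98, "ardyrking": 98, "ar50_bonitet": 17},
]
lookup = build_lookup(rows)

# A feature's codes, in the same order as KEYS, select its Bonitet class.
feature_codes = (82, 98, 98, 98, 98, 98)
bonitet = lookup.get(feature_codes)  # None when the combination is unknown
print(bonitet)
```

Because the key is a tuple, the order of the codes matters: the feature codes must be read in the same order as the lookup keys.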

In [None]:
if load_data: 
    # create a temporary test gpkg
    tmp_gpkg = os.path.join(data_path, "tmp", "tmp.gpkg")
    in_gpkg = os.path.join(data_path, "interim", "bevaring_25833.gpkg")
    print(tmp_gpkg)

    sql_query = f'SELECT * FROM "ar50_flate" LIMIT 5'
    path = os.path.join(data_path, "tmp")
    ! ogr2ogr -f "GPKG" -nln "ar50_flate" -sql "$sql_query" {tmp_gpkg} {in_gpkg}

In [None]:
# ar50 layer
layer_name, new_field_name = "ar50_flate", "ar50_bonitet"

if test_area:
    # import gpkg into ogr object 
    # add field name if it does not exist
    tmp_gpkg = os.path.join(data_path, "tmp", "bevaring_25833_bbox.gpkg")
    ds, lyr = import_gpkg(tmp_gpkg, layer_name, new_field_name)
    #print_layer_schema(lyr)
else:
    # import gpkg into ogr object 
    # add field name if it does not exist
    in_gpkg = os.path.join(data_path, "interim", "bevaring_25833.gpkg")
    ds, lyr = import_gpkg(in_gpkg, layer_name, new_field_name)
    #print_layer_schema(lyr)

# Convert the first 5 features of the layer to a DataFrame
# use islice to limit the number of features to 5, to reduce computation time
df_AR50 = pd.DataFrame(feature.items() for feature in islice(lyr, 5))

# Print the DataFrame
df_AR50

In [None]:
# lookup table 
lookup_csv = os.path.join(data_path, "AR50_bonitet_lookup.csv")
lookup_df = pd.read_csv(lookup_csv)

# rename cols to correspond with AR50 lyr
lookup_df.rename(
    columns={
        "ARTYPE kode": "artype",
        "ARTRESLAG kode": "artreslag",
        "ARSKOGBON kode": "arskogbon",
        "ARJORDBR kode": "arjordbr",
        "ARDYRKING kode": "ardyrking",
        "ARVEGET kode": "arveget",
        "Bonitet kode": "ar50_bonitet",
    },
    inplace=True,
)

# reorder cols to correspond with AR50 lyr
lookup_df = lookup_df[
    [
        "artype",
        "arskogbon",
        "artreslag",
        "arveget",
        "arjordbr",
        "ardyrking",
        "Beskrivelse",
        "ar50_bonitet",
    ]
]

display(lookup_df.head(11))

In [None]:
# create lookup dict
# keys must be in same order as gpkg lyr fields
keys = (
    "artype",
    "arskogbon",
    "artreslag",
    "arveget",
    "arjordbr",
    "ardyrking",
    )

value= "ar50_bonitet"

lookup_dict = create_lookup_dict(
    lookup_df,
    keys=keys,
    value=value
)

# print first 11 entries of dict
print({k: lookup_dict[k] for k in list(lookup_dict)[:11]})

In [None]:
if load_data:
    # loop through the features and reclassify the attribute value "ar50_bonitet"
    features_to_update = []

    for feature in lyr:
        # get the attribute values
        artype = feature.GetField("artype")
        arskogbon = feature.GetField("arskogbon")
        artreslag = feature.GetField("artreslag")
        arveget = feature.GetField("arveget")
        arjordbr = feature.GetField("arjordbr")
        ardyrking = feature.GetField("ardyrking")

        key = (int(artype), int(arskogbon), int(artreslag), int(arveget), int(arjordbr), int(ardyrking))
        if key in lookup_dict:
            new_value = lookup_dict[key]
            feature.SetField("ar50_bonitet", new_value)
            features_to_update.append(feature)

    print(f"Number of features to update: {len(features_to_update)}")
    # Batch update features
    for feature in features_to_update:
        lyr.SetFeature(feature)

    print("Finished updating Bonitet.")

    df_AR50_bon = pd.DataFrame(feature.items() for feature in islice(lyr, 5))
    display(df_AR50_bon.head())

# close OGR object
ds = None

In [None]:
if load_data:
    # Define the SQL queries
    sql_marine = f"SELECT * FROM ar50_flate WHERE ar50_bonitet = 17"
    sql_terrestisk = f"SELECT * FROM ar50_flate WHERE ar50_bonitet != 17"

    if test_area:
        # test area: read from and write to the bounding-box geopackages
        in_gpkg = os.path.join(data_path, "tmp", "bevaring_25833_bbox.gpkg")
        out_gpkg = os.path.join(data_path, "tmp", "interim_25833_bbox.gpkg")
    else:
        # full area: read from and write to the interim geopackages
        in_gpkg = os.path.join(data_path, "interim", "bevaring_25833.gpkg")
        out_gpkg = os.path.join(data_path, "interim", "interim_25833.gpkg")

    print(out_gpkg)

    # Create the ar50_marine layer
    ! ogr2ogr -f "GPKG" -nln "ar50_marine" -sql "$sql_marine" -append {out_gpkg} {in_gpkg}
    # Create the ar50_terrestisk layer
    ! ogr2ogr -f "GPKG" -nln "ar50_terrestisk" -sql "$sql_terrestisk" -append {out_gpkg} {in_gpkg}

### Infrastructure index

- Calculate and display the distribution of the infrastructure index for the whole of Norway
- Calculate and display the distribution of the infrastructure index per region
- Calculate and display the distribution of the infrastructure index per protected area
- Define the infrastructure index classes


### Topographic height

Classify the topographic height into 4 classes:
- 0-300m
- 301-600m
- 601-900m
- over 900m
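
As a minimal sketch of this classification, the function below assigns a height in metres to one of the four classes. It assumes that the upper bounds are inclusive (300 m falls in "0-300m") and that values between two listed bounds, such as 300.5 m, belong to the next class up. The names `UPPER_BOUNDS`, `LABELS` and `elevation_class` are illustrative and not part of the project code.

```python
from bisect import bisect_left

# Upper bounds (inclusive, in metres) of the first three classes.
UPPER_BOUNDS = [300, 600, 900]
LABELS = ["0-300m", "301-600m", "601-900m", "over 900m"]


def elevation_class(height_m):
    """Return the elevation class label for a height in metres."""
    # bisect_left returns the index of the first bound >= height_m,
    # or len(UPPER_BOUNDS) when the height exceeds every bound.
    return LABELS[bisect_left(UPPER_BOUNDS, height_m)]


heights = [12.0, 300, 450.5, 900, 1432]
classes = [elevation_class(h) for h in heights]
print(classes)
```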

## Load data into Spatial Database (DuckDB)


### Load data into DuckDB

In [None]:
if test_area:
    db_path = os.path.join(data_path, "tmp", "verg_og_bevaring_tmp.db")
else:
    db_path = os.path.join(data_path, "interim", "verg_og_bevaring.db")

**Load Bioklima soner into DuckDB**

In [None]:
if prepare_duckdb:
    # bioclimatic zones into duckdb
    if test_area:
        in_gpkg = os.path.join(data_path, "tmp", "bevaring_25833_bbox.gpkg")
        db_path = os.path.join(data_path, "tmp", "verg_og_bevaring_tmp.db")
        load_gpkg_layers(db_path, in_gpkg, "bioklima_2017")
        load_gpkg_layers(db_path, in_gpkg, "soner_2017_1km")
    else:
        in_gpkg = os.path.join(data_path, "interim", "bevaring_25833.gpkg")
        db_path = os.path.join(data_path, "interim", "verg_og_bevaring.db")
        load_gpkg_layers(db_path, in_gpkg, "bioklima_2017")
        load_gpkg_layers(db_path, in_gpkg, "soner_2017_1km")

**Load AR50 into DuckDB**

In [None]:
prepare_duckdb_ar50 = False
if prepare_duckdb_ar50:
    # AR50 into duckdb
    if test_area:
        in_gpkg = os.path.join(data_path, "tmp", "bevaring_25833_bbox.gpkg")
        db_path = os.path.join(data_path, "tmp", "verg_og_bevaring_tmp.db")
        load_gpkg_layers(db_path, in_gpkg, "ar50_flate")
        load_gpkg_layers(db_path, in_gpkg, "ar50_flate_msk")
    else:
        in_gpkg = os.path.join(data_path, "interim", "bevaring_25833.gpkg")
        db_path = os.path.join(data_path, "interim", "verg_og_bevaring.db")
        load_gpkg_layers(db_path, in_gpkg, "ar50_flate")

**Load Protected Areas into DuckDB**

In [None]:
# import protected areas into DuckDB

# if test_area is True, import the test gpkg, otherwise import the original gpkg
if prepare_duckdb:
    if test_area:
        in_gpkg = os.path.join(data_path, "tmp", "vern_25833_bbox.gpkg")
        db_path = os.path.join(data_path, "tmp", "verg_og_bevaring_tmp.db")
        load_gpkg_layers(db_path, in_gpkg, "naturvernomrade")
        load_gpkg_layers(db_path, in_gpkg, "foreslatt_vern")
    else:
        in_gpkg = os.path.join(data_path, "interim", "vern_25833.gpkg")
        db_path = os.path.join(data_path, "interim", "verg_og_bevaring.db")
        load_gpkg_layers(db_path, in_gpkg, "naturvernomrade")
        load_gpkg_layers(db_path, in_gpkg, "foreslatt_vern")

### Preprocess DuckDB Tables 

**Convert BLOB columns to GEOM columns**

In [None]:
if prepare_duckdb:
    # cast BLOB (Binary Large OBject) to geometry for spatial operations
    
    import duckdb

    con = duckdb.connect(db_path)
    con.install_extension('spatial')
    con.load_extension('spatial')
    
    # duckdb tables names to list
    tables = con.execute("SHOW TABLES;").fetchdf()
    tables = tables["name"].to_list()
    print(tables)
    
    for table in tables:
        
        # TODO add check if geom field exists
        
        blob_to_geom(
            db_path=db_path,
            tbl_name=table,
            blob_field="geometry",
            geom_field="geom",   
        )

### Display DuckDB tables 

In [None]:
import duckdb

con = duckdb.connect(db_path)
con.install_extension('spatial')
con.load_extension('spatial')

In [None]:
# print tables
con.sql("SHOW TABLES;")

In [None]:
# display first 5 rows of the table
table = "foreslatt_vern"
con.sql(f"SELECT * FROM {table} LIMIT 5")

In [None]:
# close
con.close()

## Display Data on the Map

See the ipyleaflet installation instructions: https://ipyleaflet.readthedocs.io/en/latest/installation/index.html#using-pip


In [None]:
# Display data on map using leafmap
# TODO display duckdb tables as well, not only WMS

init_location = [62.223207, 9.550195]  # Hjerkinn
zoom_start = 13  
min_zoom = 8  # keeps user from zooming out too far
basemap = leafmap.Map(
    location=init_location, 
    zoom=zoom_start, 
    min_zoom=min_zoom, 
    max_bounds=True
    )

# set background
basemap.add_basemap("SATELLITE", opacity=0.7)

# add WMS verneområder
wms_url ="https://kart.miljodirektoratet.no/arcgis/services/vern/mapserver/WMSServer"
wms_layer = "naturvern_klasser_omrade"
wms_name = "Protected Areas"
basemap.add_wms_layer(
    url=wms_url, 
    layers=wms_layer, 
    name=wms_name, 
    wms_format="image/png",
    )

if test_area:
    # add the test-area bounding box
    basemap.add_gdf(gdf_BB, layer_name="Test Area", fill_color="blue", fill_opacity=0.2, weight=2)

basemap

## Methods

**Bioklimatiske soner**

| Protected area | Sone 1 (%) | Sone 2 (%) | Sone 3 (%) | ... |
|----------------|------------|------------|------------|-----|
| *NaturvernId*  |*bioklima_1*|*bioklima_2*|*bioklima_3*|...  |
| area A         |            |            |            |     |
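
The percentages in this table follow from dividing the overlap area of each zone by the total area of the protected area. The sketch below illustrates that arithmetic with plain Python. The function name `zone_percentages` and the numbers are assumptions made for the example, and whether the project helper `bioklima_area_class` stores square metres or percentages is not visible in this notebook.

```python
def zone_percentages(zone_areas_m2, total_area_m2):
    """Return {zone: percent of total area}, rounded to two decimals."""
    if total_area_m2 <= 0:
        raise ValueError("total_area_m2 must be positive")
    return {
        zone: round(100.0 * area / total_area_m2, 2)
        for zone, area in zone_areas_m2.items()
    }


# Illustrative overlap areas (m2) for one protected area; zones without overlap are 0.
overlap = {"6SO-1": 250_000.0, "6SO-2": 750_000.0, "6SO-3": 0.0}
percentages = zone_percentages(overlap, total_area_m2=1_000_000.0)
print(percentages)
```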

In [None]:
# TODO
run_bioklima = True
if run_bioklima:
    tbl_study_area = "foreslatt_vern"
    id = "identifikasjon_lokalId"
    tbl_bioklima = "soner_2017_1km"
    bioklima_field = "Sone_Kode"
    bioklima_zones = ["6SO-1", "6SO-2", "6SO-3", "6SO-4", "6SO-5"]
    for area_class in bioklima_zones: # 6SO-1 ... 6SO-5
        print(f"Calculating Bioklima class {area_class}")
        new_field = f"s_{area_class}" # e.g. s_6SO-1
        new_field = new_field.replace("-", "_") # e.g. s_6SO_1
        remove_field(db_path, tbl_study_area, new_field)
        bioklima_area_class(db_path, tbl_study_area, id, tbl_bioklima, bioklima_field, area_class, new_field)
    
    # if the zone field is null, set to 0
    for area_class in bioklima_zones: # 6SO-1 ... 6SO-5
        new_field = f"s_{area_class}" # e.g. s_6SO-1
        new_field = new_field.replace("-", "_") # e.g. s_6SO_1
        con = duckdb.connect(db_path)
        con.install_extension('spatial')
        con.load_extension('spatial')
        con.sql(f"UPDATE foreslatt_vern SET {new_field} = 0 WHERE {new_field} IS NULL")
        con.close()

In [None]:
con = duckdb.connect(db_path)
con.sql(f"SELECT * FROM foreslatt_vern LIMIT 5")

### 1. Overlap protected areas and AR50 (land cover)

**Divide the protected areas into terrestrial and marine areas**

AR50 polygons classified as "Hav" (sea) are considered marine areas; all other polygons are considered terrestrial.
- Bonitet = 17 (Hav) 
- Arealtype = 82 (Hav)

<br>

**AR50 - Bonitet**

The area of overlap with each AR50 Bonitet class is calculated per protected area.

| Protected area | Fulldyrka og overflatedyrka jord (m2) | Innmarksbeite (m2) | Skog, høg og særs høg bonitet (m2) | ... |
|----------------|--------------------------------------|-------------------|-----------------------------------|-----|
| *NaturvernId*  |*ar50_bon1*|*ar50_bon2*|*ar50_bon3*|...|
| area A         |                                      |                   |                                   |     |


In [None]:
# calculate ar50 area overlap with protected areas
tbl_study_area = "foreslatt_vern"
id = "identifikasjon_lokalId"
tbl_ar50 = "ar50_flate"
ar50_field = "ar50_bonitet"

if run_ar50:
    for area_class in range(1, 19): # 1-18
        print(f"Calculating Bonitet class {area_class}")
        new_field = f"ar50_bon{area_class}" # ar50_bon1
        remove_field(db_path, tbl_study_area, new_field)
        ar50_area_class(db_path, tbl_study_area, id, tbl_ar50, ar50_field, area_class, new_field)

    # if ar50_bonx is null, set to 0
    for area_class in range(1, 19): # 1-18
        new_field = f"ar50_bon{area_class}" # ar50_bon1
        con = duckdb.connect(db_path)
        con.install_extension('spatial')
        con.load_extension('spatial')
        con.sql(f"UPDATE foreslatt_vern SET {new_field} = 0 WHERE {new_field} IS NULL")
        con.close()

In [None]:
# load the first 5 rows of the table "foreslatt_vern" as a dataframe
con = duckdb.connect(db_path)
df = con.execute(f"SELECT * FROM foreslatt_vern LIMIT 5").fetchdf()

# display columns: identifikasjon_lokalId, ar50_bonx
col_id= [id]
cols_ar50 = [col for col in df.columns if "ar50_bon" in col and col != "sum_ar50_bon_m2"]
cols = col_id + cols_ar50
df_ar50 = df[cols]
display(df_ar50)

# close
con.close()

**Check if the sum of all AR50 classes corresponds with the total area**
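
The idea behind this check is that the class areas should add up to the polygon area, apart from small differences caused by slivers or gaps in the overlay. The sketch below shows the comparison in plain Python. The function name `area_mismatch`, the sign convention (class sum minus polygon area) and the 1 % tolerance are assumptions made for illustration; the project helper `area_difference` may define them differently.

```python
import math


def area_mismatch(class_areas_m2, total_area_m2, rel_tol=0.01):
    """Return (difference_m2, within_tolerance) for one protected area."""
    class_sum = math.fsum(class_areas_m2)
    diff = class_sum - total_area_m2  # negative: classes cover less than the polygon
    ok = math.isclose(class_sum, total_area_m2, rel_tol=rel_tol)
    return diff, ok


# Illustrative values (m2): three class areas against a 1 km2 polygon.
diff, ok = area_mismatch([400_000.0, 350_000.0, 245_000.0], 1_000_000.0)
print(diff, ok)
```

Sorting the protected areas by this difference, as the following cells do with `area_diff_m2`, brings the largest mismatches to the top for inspection.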

In [None]:
# Calculating sum of all bonitet classes
remove_field(db_path, tbl_study_area, "sum_ar50_bon_m2")
sum_area_cols(db_path, tbl_study_area, id, cols_ar50, "sum_ar50_bon_m2")

# Calculating area of each polygon in the study area
remove_field(db_path, tbl_study_area, "areal_m2")
geom_area(db_path, tbl_study_area, "geom", "areal_m2")

# Calculating the difference between the sum of all bonitet classes and the area of each polygon
remove_field(db_path, tbl_study_area, "area_diff_m2")
area_difference(db_path, tbl_study_area, "sum_ar50_bon_m2", "areal_m2", "area_diff_m2")

In [None]:
# load the table "foreslatt_vern"
con = duckdb.connect(db_path)

# fetch all rows as dataframe
df = con.execute(f"SELECT * FROM foreslatt_vern").fetchdf()

# display columns: identifikasjon_lokalId, navn and the area check columns
col_id= [id]
cols = col_id + ["navn", "sum_ar50_bon_m2", "areal_m2", "area_diff_m2"]
df_ar50 = df[cols]

# sort by area_diff_m2
df_ar50 = df_ar50.sort_values(by="area_diff_m2", ascending=True)
display(df_ar50)

# close
con.close()

In [None]:
# print number of rows 
print(f"Number of rows: {df.shape[0]}")

# print number of unique ids "identifikasjon_lokalId"
print(f"Number of unique ids: {df['identifikasjon_lokalId'].nunique()}")

# print number of areas with a marine area (ar50_bon17 > 0; nulls and zeros are not counted)
print(f"Number of areas with Marine area: {(df['ar50_bon17'] > 0).sum()}")

### 2. Geometry variables for protected areas

| Protected area | Total area (m2) | Perimeter (m) | Land area (m2) | Perimeter land area (m) |
|----------------|------------|----------------|-----------------|-------------------------|
| *NaturvernId* | *areal_m2* | *omkrets_m* | *landareal_m2* | *landomkrets_m* |
| area A         |            |                |                 |                         |

***
$\mathbf{\text{Shape Index}}$<br>
***
The shape index is a measure of how compact the shape is compared to a circle with the same area. The shape index is calculated for the entire protected area and the land area.

- $P$ = perimeter
- $r$ = radius
- $A$ = area


**Shape Index:**&emsp;      $SI = \frac{P}{2\pi r}$

**Radius:**&emsp; $r = \sqrt{\frac{A}{\pi}}$

- SI = 1: the shape is a perfect circle
- SI > 1: the shape is less compact than a circle
- SI < 1: not possible, because a circle has the smallest perimeter for a given area
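
As a worked illustration of the two formulas, the sketch below computes the shape index from a perimeter and an area. The function name `shape_index` is illustrative; the project helpers `geom_index` and `geom_index_byID` are defined elsewhere and may differ in detail.

```python
import math


def shape_index(perimeter, area):
    """SI = P / (2 * pi * r), with r = sqrt(A / pi)."""
    if area <= 0:
        raise ValueError("area must be positive")
    radius = math.sqrt(area / math.pi)  # radius of a circle with the same area
    return perimeter / (2 * math.pi * radius)


# Circle with a radius of 1000 m.
circle_si = shape_index(2 * math.pi * 1000, math.pi * 1000**2)
# Square of 1 km x 1 km.
square_si = shape_index(4 * 1000, 1000**2)
print(round(circle_si, 3), round(square_si, 3))
```

The circle gives SI = 1, while the square gives $2/\sqrt{\pi} \approx 1.128$, which shows how the index grows as a shape becomes less compact.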

In [None]:
# create new table with only terrestrial part of the protected areas
remove_table(db_path, "f_vern_land")
remove_table(db_path, "f_vern_sjo")
split_by_polygon(
    db_path, tbl_study_area, id, tbl_ar50, "ar50_bonitet", 17, "f_vern_sjo", "f_vern_land"
)

In [None]:
dict_geom = {
    "foreslatt_vern": ("areal_m2", "omkrets_m", "formindeks"),
    "f_vern_land": ("landareal_m2", "landomkrets_m", "land_formindeks"),
    "f_vern_sjo": ("sjoareal_m2", "sjoomkrets_m", "sjo_formindeks"),
}

for key, value in dict_geom.items():
    
    # areal
    remove_field(db_path, key, "areal_not_grouped")
    #geom_area(db_path, key, "geom", "areal_not_grouped")
    
    # areal by ID
    remove_field(db_path, key, value[0])
    geom_area_byID(db_path, key, "identifikasjon_lokalId", "geom", value[0])
    
    # omkrets
    remove_field(db_path, key, "omkrets_not_grouped")
    #geom_peri(db_path, key, "geom", "omkrets_not_grouped")
    
    # omkrets by ID
    remove_field(db_path, key, value[1])
    geom_peri_byID(db_path, key, "identifikasjon_lokalId", "geom", value[1])
    
    # formindeks
    remove_field(db_path, key, "formindeks_not_grouped")
    #geom_index(db_path, key, "geom", "formindeks_not_grouped")
    
    # formindeks by ID
    remove_field(db_path, key, value[2])
    geom_index_byID(db_path, key, "identifikasjon_lokalId", "geom", value[2])


In [None]:
# print first 5 rows of the table "foreslatt_vern"
con = duckdb.connect(db_path)

con.install_extension('spatial')
con.load_extension('spatial')

In [None]:
con.sql("SHOW TABLES;")

In [None]:
con.sql(f"SELECT * FROM f_vern_land LIMIT 5")

In [None]:
con.sql(f"SELECT * FROM f_vern_sjo LIMIT 5")

**Export to GDF and GPKG layer**

In [None]:
# to dataframe
df_land = con.execute(f"SELECT ST_AsText(geom) as geometry, * FROM f_vern_land").fetchdf()

# drop "geom"
df_land = df_land.drop(columns=["geom"])

# Convert the geom column to gpd GeoDataFrame
df_land['geometry'] = df_land['geometry'].apply(wkt.loads)

# to gpd
gdf_land = gpd.GeoDataFrame(df_land, geometry='geometry')
print(f"Number of unique ids: {df_land['identifikasjon_lokalId'].nunique()}")

display(gdf_land.head())

# Export GDF to file 
filepath = os.path.join(data_path, "processed", "vern_og_bevaring")
gdf_land.crs = "EPSG:25833"
# Write to existing .geopackage
gdf_land.to_file(os.path.join(filepath + '.gpkg'), driver='GPKG', layer='f_vern_land', mode='w')


In [None]:
# to dataframe
df_sjo = con.execute(f"""
    SELECT ST_AsText(geom) as geometry, * 
    FROM f_vern_sjo
    WHERE ST_GeometryType(geom) IN ('POLYGON', 'MULTIPOLYGON')
""").fetchdf()

# delete POLYGON EMPTY
df_sjo = df_sjo[df_sjo['geometry'] != "POLYGON EMPTY"]

# drop "geom"
df_sjo = df_sjo.drop(columns=["geom"])

# Convert the geom column to gpd GeoDataFrame
df_sjo['geometry'] = df_sjo['geometry'].apply(wkt.loads)

# to gpd
gdf_sjo = gpd.GeoDataFrame(df_sjo, geometry='geometry')
print(f"Number of unique ids: {gdf_sjo['identifikasjon_lokalId'].nunique()}")

display(gdf_sjo)

# Export GDF to file 
filepath = os.path.join(data_path, "processed", "vern_og_bevaring")
gdf_sjo.crs = "EPSG:25833"
# Write to existing .geopackage
gdf_sjo.to_file(os.path.join(filepath + '.gpkg'), driver='GPKG', layer='f_vern_sjo', mode='w')

In [None]:
# drop geometry column in duckdb and keep only unique ids in table

# drop geom column
remove_field(db_path, "f_vern_land", "geom")
remove_field(db_path, "f_vern_sjo", "geom")

# drop duplicates 
remove_duplicates(db_path, "f_vern_land", "identifikasjon_lokalId")
remove_duplicates(db_path, "f_vern_sjo", "identifikasjon_lokalId")

In [None]:
# number of rows in the new table
con.sql(f"SELECT COUNT(*) FROM f_vern_land")

In [None]:
# number of unique ids in the new table
con.sql(f"SELECT COUNT(DISTINCT identifikasjon_lokalId) FROM f_vern_land")

In [None]:
remove_table(db_path, "f_vern_land_sjo")
join_tables_create_new(
    db_path, 
    tbl1="f_vern_land",
    tbl2="f_vern_sjo",
    id_field="identifikasjon_lokalId",
    new_tbl="f_vern_land_sjo"
)

remove_field(db_path, "f_vern_land_sjo", "identifikasjon_lokalId_1") 
con.sql("SHOW TABLES;")

In [None]:
con.sql(f"SELECT * FROM f_vern_land_sjo LIMIT 5")

In [None]:
# number of rows in the new table
con.sql(f"SELECT COUNT(*) FROM f_vern_land_sjo")

In [None]:
# number of unique ids in the new table
con.sql(f"SELECT COUNT(DISTINCT identifikasjon_lokalId) FROM f_vern_land_sjo")

In [None]:
remove_table(db_path, "f_vern_geovar")
join_tables_create_new(
    db_path, 
    tbl1="foreslatt_vern",
    tbl2="f_vern_land_sjo",
    id_field="identifikasjon_lokalId",
    new_tbl="f_vern_geovar"
)

In [None]:
con.sql("SHOW TABLES;")

In [None]:
con.sql(f"SELECT COUNT(DISTINCT identifikasjon_lokalId) FROM f_vern_geovar")

**Clean table**
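
The cleaning step in the cell below applies a 5 % threshold: a land or marine part that covers less than 5 % of the total area, or that is missing, has its area and perimeter set to 0. The sketch below expresses that rule as a small Python function. The name `clean_part` and the sample values are assumptions made for illustration.

```python
def clean_part(part_area_m2, part_perimeter_m, total_area_m2, min_share=0.05):
    """Zero out a land or marine part that is missing or below min_share of the total."""
    if part_area_m2 is None or part_area_m2 < min_share * total_area_m2:
        return 0.0, 0.0
    return part_area_m2, part_perimeter_m


# Illustrative values: a 3 % sliver, a 40 % part and a missing part.
sliver = clean_part(30_000.0, 900.0, total_area_m2=1_000_000.0)
kept = clean_part(400_000.0, 5_200.0, total_area_m2=1_000_000.0)
missing = clean_part(None, None, total_area_m2=1_000_000.0)
print(sliver, kept, missing)
```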

In [None]:
# a marine part below 5% of the total area is ignored (the area is treated as terrestrial):
#   set sjoareal and sjoomkrets to 0 if sjoareal is less than 5% of the total area or NULL
# a land part below 5% of the total area is ignored (the area is treated as marine):
#   set landareal and landomkrets to 0 if landareal is less than 5% of the total area or NULL

# update table 
con.sql(f"""
    UPDATE f_vern_geovar
    SET sjoareal_m2 = 0, sjoomkrets_m = 0
    WHERE sjoareal_m2 < 0.05 * areal_m2
""")

# update table
con.sql(f"""
    UPDATE f_vern_geovar
    SET landareal_m2 = 0, landomkrets_m = 0
    WHERE landareal_m2 < 0.05 * areal_m2
""")

# if null set to 0
con.sql(f"""
    UPDATE f_vern_geovar
    SET sjoareal_m2 = 0, sjoomkrets_m = 0
    WHERE sjoareal_m2 IS NULL
""")

# if null set to 0
con.sql(f"""
    UPDATE f_vern_geovar
    SET landareal_m2 = 0, landomkrets_m = 0
    WHERE landareal_m2 IS NULL
""")

# remove cols
remove_field(db_path, "f_vern_geovar", "area_m2")
remove_field(db_path, "f_vern_geovar", "identifikasjon_lokalId_1")


# print first 5 rows of the table "f_vern_geovar"
con.sql(f"SELECT * FROM f_vern_geovar LIMIT 5")

In [None]:
# to dataframe
df = con.execute(f"SELECT ST_AsText(geom) as geometry, * FROM f_vern_geovar").fetchdf()

# drop "geom"
df = df.drop(columns=["geom"])

# Convert the geom column to gpd GeoDataFrame
df['geometry'] = df['geometry'].apply(wkt.loads)

# to gpd
gdf = gpd.GeoDataFrame(df, geometry='geometry')
print(f"Number of unique ids: {gdf['identifikasjon_lokalId'].nunique()}")


display(gdf)

In [None]:
# Export GDF to file 
filepath = os.path.join(data_path, "processed", "vern_og_bevaring")
gdf.crs = "EPSG:25833"
# Write to existing .geopackage
gdf.to_file(os.path.join(filepath + '.gpkg'), driver='GPKG', layer='f_vern_geovar', mode='w')

# Write to .csv
gdf.to_csv(os.path.join(filepath + '.csv'))

**Infrastrukturindeks**

| Protected area | Mean value | Lav (%) | Middels (%) | Høy (%) | ... |
|----------------|------------|---------|-------------|---------|-----|
| area A         |            |         |             |         |     |


**Høydelag**

| Protected area | 0-300 m (%) | 301-600 m (%) | 601-900 m (%) | over 900 m (%) |
|----------------|-------------|---------------|---------------|----------------|
| area A         |             |               |               |                |


### 3. Regional Statistics


Administrative regions:
- Municipalities
- Counties
- Regions 
    - **Nord**: Finnmark, Troms, Nordland
    - **Midt**: Trøndelag, Møre og Romsdal
    - **Sør**: Agder, Vestfold, Telemark
    - **Øst**: Østfold, Akershus, Oslo, Innlandet, Buskerud
    - **Vest**: Vestland, Rogaland


| Protected area | Municipality | County | Region |
|----------------|---------|-------|--------|
| *NaturvernId*  |*kommune*|*fylke*|*region*|
| area A         |         |       |        |

*Regional dataset*:

| Region | Area | Protected Area | Land Protected Area | Marine Protected Area |
|--------|------|----------------|---------------------|-----------------------|
| *Region* | *areal* | *a_vern* | *a_landvern* | *a_marinvern* |
| Nord | | | | |
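
One way to fill this table is to sum the protected area per region and divide by the region's area. The sketch below shows that aggregation in plain Python. The function name `protected_share_by_region` and all numbers are made up for illustration, and the sketch assumes that each protected area has already been assigned to exactly one region.

```python
from collections import defaultdict


def protected_share_by_region(protected_areas, region_area_m2):
    """Return {region: percent of the region's area that is protected}."""
    protected_sum = defaultdict(float)
    for region, area_m2 in protected_areas:
        protected_sum[region] += area_m2
    return {
        region: 100.0 * protected_sum[region] / total
        for region, total in region_area_m2.items()
    }


# Illustrative numbers only (m2), not real statistics.
protected_areas = [("Nord", 2.0e9), ("Nord", 1.0e9), ("Midt", 0.5e9)]
region_area_m2 = {"Nord": 1.0e11, "Midt": 5.0e10, "Vest": 4.0e10}
share = protected_share_by_region(protected_areas, region_area_m2)
print(share)
```

Protected areas that cross a regional border would first need to be split along the border for the sums to be exact.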


### 4. Grid Statistics (10x10 km)

**Grid statistics:**

| SSB grid cell | Area | Density (land + marine) | Density (land) |
|---------------|------|-------------------------|----------------|
| *grid_ID* | *grid_areal* | *tetthet_tot_vern* | *tetthet_landvern* |
| cell 1 | 100 km2 | | |





***
$\mathbf{\text{Density}}$<br>
***
The density of protected area per 10x10 km SSB grid cell is calculated by dividing the protected area $A$ within the cell by the cell area of 100 km$^2$.

**Density:**&emsp;      $Density = \frac{A}{100\ \mathrm{km}^2}$
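
A minimal sketch of this formula is shown below. It assumes that the protected area inside a grid cell is available in square metres, as in the `*_m2` fields used elsewhere in this notebook; the names `CELL_AREA_KM2` and `density` are illustrative.

```python
CELL_AREA_KM2 = 100.0  # 10 km x 10 km SSB grid cell


def density(protected_area_m2, cell_area_km2=CELL_AREA_KM2):
    """Protected share of a grid cell: A / cell area, as a fraction."""
    protected_km2 = protected_area_m2 / 1_000_000.0  # m2 -> km2
    return protected_km2 / cell_area_km2


cell_density = density(25_000_000.0)  # 25 km2 protected inside one cell
print(cell_density)
```

The same function can be applied to the land-only protected area to obtain the land density column.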



In [None]:
# TODO: code for density calculation here
# see the land cover fraction notebook

# Planned steps for the regional statistics:
# 1. load the admin layer
# 2. group by region
# 3. calculate the region area with geom_area()
# 4. select by region
# 5. SUM protected area in region
# 6. Area % = SUM(protected_area_m2) / areal_m2 * 100

