# Area Habitat Statistics

## Your Area Statistics

Land cover, habitat types and water hydroperiod are all Living Wales layers spanning the years 2018 to 2023. This notebook quantifies the area and percentage covered by each land cover class or habitat type, as well as the proportion of the land covered by water (hydroperiod).

This notebook stems from a novel collaboration between Monmouthshire County, Aberystwyth University, Dwr Cymru Welsh Water and Natural Resources Wales. The collaboration linked *Living Wales* to national initiatives that give free and open access to remotely sensed data and products, in order to support wise use of the Welsh landscape and a better collective outlook for current and future generations.

## Information about your area

This notebook extracts information on your area for different years from the newly developed Welsh Data Cube (WDC), which houses all satellite data acquired over Wales since 2018 together with products derived from them, including land cover, broad habitats and water/moisture persistence.

**Land Cover** is the physical and biological cover of the land surface and includes vegetation (managed or semi-natural), water and bare surfaces. The land cover maps generated through Living Land Management use the legends of the United Nations Food and Agriculture Organisation (FAO) Land Cover Classification System (LCCS).

**Habitats** represent the natural environments in which individuals or groups of plant or animal species live. The habitat maps are generated from satellite data and are based on Wales' Phase 1 Habitat Taxonomy.

The **water/moisture persistence** is obtained from time series of radar data, which are acquired almost every day over Wales, and indicates the relative frequency of wet conditions across the landscape.

In [None]:
import os
import sys

import datacube
import pandas as pd
import numpy as np
import geopandas as gpd

from datacube.utils.geometry import Geometry, CRS
from ipyleaflet import GeoData

from matplotlib.colors import ListedColormap
import matplotlib.colors as colors
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

sys.path.append("../wales_utils/data_cube_utilities")
from display_tools import map_geom, rgb
from wdc_datahandling import geopolygon_masking

## Define a function to produce summary statistics

For code that will be used multiple times in a notebook it is cleaner to define a function that can be called later. Here we create a function that will perform summary statistics and print them out. We'll come back to this function later in the notebook.

In [None]:
def stat_summary(xarr, scheme):
    """
    Calculate summary statistics for an xarray object and return them as a pandas
    dataframe with the area (ha) and percentage covered by each class.
    """
    # Find the classes present in the area and count the pixels in each
    class_values, pixel_counts = np.unique(xarr, return_counts=True)

    # Create dictionary to store outputs. Will convert this to a pandas data frame
    out_stat_dict = {"CATEGORY": [], "HECTARE": []}

    for color, label in scheme.items():
        # Skip class 0 (not classified) and classes that do not occur in the area
        if (label[0] in class_values) and (label[0] != 0):
            out_stat_dict["CATEGORY"].append(label[1])
            n_pixels = pixel_counts[list(class_values).index(label[0])]
            # Each pixel covers 10 m x 10 m = 100 m2, and 10,000 m2 = 1 ha
            area_ha = (n_pixels * 100) / 10000
            out_stat_dict["HECTARE"].append(area_ha)

    # Convert to a pandas dataframe
    out_stat_df = pd.DataFrame.from_dict(out_stat_dict)

    # Calculate each class as a percentage of the total classified area
    out_stat_df["PERCENT"] = (
        100 * out_stat_df["HECTARE"] / out_stat_df["HECTARE"].sum()
    )
    return out_stat_df
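
The area calculation relies on the pixel size: with the 10 m `resolution` used when the data are loaded later in the notebook, each pixel covers 100 m², and 10,000 m² make one hectare. The following is a minimal sketch of that conversion on a small made-up array; `pixels_to_hectares` and the class values are illustrative and not part of the notebook's utilities.

```python
import numpy as np


def pixels_to_hectares(pixel_count, pixel_size_m=10):
    """Convert a number of square pixels to hectares (1 ha = 10,000 m2)."""
    return pixel_count * pixel_size_m**2 / 10_000


# Made-up class raster: one pixel of class 0, two of class 1, three of class 2
classes = np.array([[1, 1, 2], [2, 2, 0]])

# Count the pixels in each class, then convert each count to an area
values, counts = np.unique(classes, return_counts=True)
areas_ha = {int(v): pixels_to_hectares(int(c)) for v, c in zip(values, counts)}
print(areas_ha)
```

If the data were loaded at a different resolution, the factor of 100 in `stat_summary` would need to change accordingly.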

## Load in Data
Here we load in the Living Wales land cover and habitat products. These have been calculated and indexed into the open data cube so can be loaded as products.

In [None]:
dc = datacube.Datacube(app="Farms")

In [None]:
product = "lw_landcover_lw"

measurements = dc.list_measurements()
measurements.loc[product]

In [None]:
product = "lw_habitats_lw"

measurements = dc.list_measurements()
measurements.loc[product]

## Area selection 

A shapefile is used to define the area of interest. The `vectors` folder contains an example shapefile for the BBNP (`habstat_Boundary.*`); you can use this or upload one of your own datasets. If you are using the example, download it to your machine from the folder view and then upload it to the `uploads` folder.

### Using Personal datasets

If you want to use your own shapefiles, upload them to the `uploads` folder and change `file_name` in the code below to match.
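
A shapefile is made up of several files that share one name, and the `.shp`, `.shx` and `.dbf` parts all need to be uploaded for it to be read. The sketch below checks which parts are missing before you try to open the file; `missing_shapefile_parts` is a hypothetical helper written for this example, and the folder and file name simply mirror the ones used in this notebook.

```python
from pathlib import Path

# Parts a shapefile needs in order to be read. A .prj file is optional, but it
# carries the coordinate reference system, so it is worth uploading as well.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(folder, file_name):
    """Return the required extensions that are absent for `file_name` in `folder`."""
    folder = Path(folder)
    return [
        ext
        for ext in REQUIRED_EXTENSIONS
        if not (folder / f"{file_name}{ext}").is_file()
    ]


# An empty list means every required part was found
print(missing_shapefile_parts("../uploads", "habstat1_Boundary"))
```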

In [1]:
# Provide a name for the shapefile you have uploaded 
file_name = "habstat1_Boundary"
# Provide a name for your area to use in output files.
# We are just using 'test_area' here
area_name = "test_area"
# Select a year to use
year = "2022"

In [None]:
# Open and read the shapefiles
boundary_path = f"../uploads/{file_name}.shp"
boundary_exists = os.path.isfile(boundary_path)


if boundary_exists:
    boundary = gpd.read_file(boundary_path)
else:
    print("Could not find file, please check the name")
    print(os.listdir("../uploads/"))

In [None]:
boundary.head(3)

In [None]:
# Transform shapefile boundaries into geographic data (and apply a style)
geo_data = GeoData(
    geo_dataframe=boundary.to_crs(epsg=4326),
    style={
        "color": "black",
        "fillColor": "#3366cc",
        "opacity": 0.05,
        "weight": 1.9,
        "dashArray": "2",
        "fillOpacity": 0.6,
    },
    hover_style={"fillColor": "red", "fillOpacity": 0.2},
    name="Boundary",
)

# Map the geographic data on a dynamic map
m = map_geom(geo_data)
m

In [None]:
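# Build a datacube geometry from the first feature in the shapefile.
# This assumes the shapefile coordinates are in British National Grid (EPSG:27700).
geom = Geometry(geom=boundary.iloc[0].geometry, crs=CRS("epsg:27700"))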
geom

In [None]:
query = {
    "geopolygon": geom,
    "time": (year + "-01-01", year + "-12-31"),
    "output_crs": "EPSG:27700",
    "resolution": (-10, 10),
    "dask_chunks": {"y": 2048, "x": 2048},
}
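
The `resolution` and `dask_chunks` settings together determine how large the loaded array is and how many chunks it is split into. As a rough sketch, the grid size can be estimated from a bounding box; the coordinates below are hypothetical British National Grid values rather than the example boundary, and the shape that is actually loaded may differ by a pixel because of how the grid is aligned.

```python
import math


def grid_shape(min_x, min_y, max_x, max_y, resolution_m=10):
    """Approximate (rows, cols) of a raster covering the bounding box."""
    cols = math.ceil((max_x - min_x) / resolution_m)
    rows = math.ceil((max_y - min_y) / resolution_m)
    return rows, cols


# Hypothetical bounding box in metres (EPSG:27700), 12.5 km wide and 15 km tall
rows, cols = grid_shape(300000, 210000, 312500, 225000)

# Number of 2048 x 2048 chunks needed to cover the grid
chunk_size = 2048
n_chunks = math.ceil(rows / chunk_size) * math.ceil(cols / chunk_size)
print(rows, cols, n_chunks)
```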

In [None]:
# Load land cover data for our polygon and time period
lc_dataset = dc.load(product="lw_landcover_lw", **query)
lc_dataset_masked = geopolygon_masking(lc_dataset, geopolygon=geom)

In [None]:
# Where level3 is 112, add the lifeform code to split the class further;
# keep all other classes unchanged and set pixels with no data to 0
level3 = lc_dataset_masked.level3
level3plus = level3.where(
    level3 != 112, level3 + lc_dataset_masked.lifeform
).fillna(0)
lc_dataset["level3plus"] = level3plus
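
The effect of this step is easier to see on a tiny array. The sketch below uses plain NumPy rather than the data cube, and it assumes lifeform codes of 1 (woody) and 2 (herbaceous) purely for illustration, so that class 112 becomes 113 or 114 while every other class is left unchanged.

```python
import numpy as np

# Made-up level 3 classes and lifeform codes for a 2 x 2 area
level3 = np.array([[111.0, 112.0], [112.0, 220.0]])
lifeform = np.array([[0.0, 1.0], [2.0, 0.0]])  # assumed codes: 1 = woody, 2 = herbaceous

# Add the lifeform code only where the class is 112
level3plus = np.where(level3 == 112, level3 + lifeform, level3)
print(level3plus)
```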

In [None]:
# Level3plus colour scheme
level3plus_scheme = {
    "#FFFFFF": [0.0, "Not classified"],
    "#D1E133": [111.0, "Cultivated or managed terrestrial vegetation"],
    "#007A02": [113.0, "Semi-natural terrestrial woody vegetation"],
    "#95c748": [114.0, "Semi-natural terrestrial herbaceous vegetation"],
    "#4EEEE8": [123.0, "Cultivated or managed aquatic vegetation"],
    "#02C077": [124.0, "Semi-natural aquatic vegetation"],
    "#DA5C69": [215.0, "Artificial surface"],
    "#F3AB69": [216.0, "Bare surface"],
    "#4D9FDC": [220.0, "Water"],
}

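# Colour map
level3plus_cmap = ListedColormap(list(level3plus_scheme.keys()))
# Define a normalization from class values -> colours. BoundaryNorm treats the
# values as bin edges, so add one edge above the last class value to give each
# class its own bin
level3plus_values = [value[0] for value in level3plus_scheme.values()]
level3plus_norm = colors.BoundaryNorm(
    level3plus_values + [level3plus_values[-1] + 1], len(level3plus_values)
)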
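
`BoundaryNorm` interprets the numbers it is given as bin edges, so N classes need N + 1 edges if each class value is to map to its own colour. The sketch below shows this on a shortened, three-class version of the scheme; `colour_for` is a hypothetical helper added for the example.

```python
import numpy as np
from matplotlib import colors
from matplotlib.colors import ListedColormap

# Shortened scheme for illustration: colour -> class value
scheme = {"#FFFFFF": 0.0, "#D1E133": 111.0, "#4D9FDC": 220.0}

cmap = ListedColormap(list(scheme.keys()))
values = list(scheme.values())
# One extra edge above the last class value closes the final bin
norm = colors.BoundaryNorm(values + [values[-1] + 1], cmap.N)


def colour_for(value):
    """Return the hex colour that a class value is drawn with."""
    return colors.to_hex(cmap(norm(value)))


# Each class value lands in its own bin, in scheme order
indices = [int(i) for i in np.asarray(norm(np.array(values)))]
print(indices, colour_for(220.0))
```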

## Plot Land Cover Map

In [None]:
# Plotting
lc_fig, ax = plt.subplots(figsize=(20, 10))

lc_plot = ax.imshow(
    lc_dataset.level3plus.isel(time=0),
    cmap=level3plus_cmap,
    norm=level3plus_norm,
    extent=[
        lc_dataset.x.min().data,
        lc_dataset.x.max().data,
        lc_dataset.y.min().data,
        lc_dataset.y.max().data,
    ],
)

# Specify whether to show the boundary on the image (if it exists)
show_bounds = True
if boundary_exists and show_bounds:
    boundary.boundary.plot(ax=ax, ec="#e72323", linewidth=3)

patches = [
    Patch(color=color, label=label[1]) for color, label in level3plus_scheme.items()
]

ax.legend(handles=patches, bbox_to_anchor=(1.35, 0.3), facecolor="white")

# ax.set_axis_off()
plt.show()

### Save Figures to file
The figure can be saved as a PNG file for inclusion in reports. It is also possible to skip the `show` command and save the figure straight to a file, which is useful when producing a lot of figures (e.g., for different years).

In [None]:
lc_fig.savefig(f"Land_cover_{area_name}_{year}.png")
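
When producing figures for several years, one option is to wrap the plotting in a loop that saves each figure and then closes it instead of showing it. This is a minimal sketch using random arrays in place of the land cover data; `save_maps` is a hypothetical helper, and the file name pattern follows the one used above.

```python
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np


def save_maps(arrays_by_year, area_name, out_dir):
    """Save one PNG per year without displaying the figures."""
    out_dir = Path(out_dir)
    saved = []
    for year, array in arrays_by_year.items():
        fig, ax = plt.subplots(figsize=(4, 4))
        ax.imshow(array)
        ax.set_title(f"{area_name} {year}")
        out_path = out_dir / f"Land_cover_{area_name}_{year}.png"
        fig.savefig(out_path)
        plt.close(fig)  # release the figure so memory does not build up
        saved.append(out_path)
    return saved


# Random stand-in data for two years, written to a temporary folder
rng = np.random.default_rng(0)
arrays = {year: rng.integers(0, 3, size=(5, 5)) for year in ("2021", "2022")}
saved = save_maps(arrays, "test_area", tempfile.mkdtemp())
print([path.name for path in saved])
```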

## Quantify the area of each land cover in your area

Here we are using the `stat_summary` function defined at the start of the notebook to print summary statistics.

In [None]:
landcover_stats_df = stat_summary(lc_dataset.level3plus, level3plus_scheme)
landcover_stats_df

### Calculate the total area
Now that the results are in a pandas data frame, it is possible to calculate further statistics on them, for example the sum to get the total area.

In [None]:
total_area = landcover_stats_df["HECTARE"].sum()
print(f"Total area {total_area:.2f} ha")
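
Other summaries work in the same way. The sketch below uses a small made-up data frame with the same `CATEGORY` and `HECTARE` columns to find the largest class, the mean area per class and a ranking by area; the numbers are illustrative only.

```python
import pandas as pd

# Illustrative values in the same layout as the summary table
stats = pd.DataFrame(
    {
        "CATEGORY": ["Water", "Bare surface", "Artificial surface"],
        "HECTARE": [12.5, 3.0, 7.25],
    }
)

# Category with the largest area
largest = stats.loc[stats["HECTARE"].idxmax(), "CATEGORY"]
# Mean area per category
mean_area = stats["HECTARE"].mean()
# Categories ranked from largest to smallest
ranked = stats.sort_values("HECTARE", ascending=False).reset_index(drop=True)

print(largest, f"{mean_area:.2f}")
```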

## Produce some plots
Pandas also has built-in functions to produce plots from the data. Here we produce a pie chart to show the proportion of each category.

In [None]:
landcover_stats_df.set_index("CATEGORY").plot.pie(
    y="HECTARE", legend=False, ylabel="Area"
)

### Save out to a CSV file
It is also possible to save out pandas dataframes to a CSV file so they can be opened in Excel. Here we are specifying we only want outputs to two decimal places using `float_format`.

In [None]:
landcover_stats_df.to_csv(
    f"Land_cover_{area_name}_{year}_stats.csv", float_format="%.2f", index=False
)
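
To see what `float_format` does without writing a file, the table can be written to an in-memory buffer and read back. This is a small sketch with made-up values; note that the rounding is applied to the text that is written, so the extra decimal places are not recoverable from the CSV.

```python
import io

import pandas as pd

# Illustrative single-row table
stats = pd.DataFrame(
    {"CATEGORY": ["Water"], "HECTARE": [12.3456], "PERCENT": [100.0]}
)

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
stats.to_csv(buffer, float_format="%.2f", index=False)
csv_text = buffer.getvalue()

# Read the text back in to check what a spreadsheet would see
round_trip = pd.read_csv(io.StringIO(csv_text))
print(csv_text)
```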

## Habitat data
We can perform a similar process of subsetting data and calculating statistics using the habitat data provided by Living Wales.

In [None]:
# Load habitat data for our polygon and time period
habitat_dataset = dc.load(product="lw_habitats_lw", **query)
habitat_dataset_masked = geopolygon_masking(habitat_dataset, geopolygon=geom)

In [None]:
# Broad habitat colour scheme
broadhabitat_scheme = {
    "#FFFFFF": [0.0, "Not classified"],
    "#00C502": [1.0, "Broadleaved woodland"],
    "#006902": [2.0, "Needle-leaved woodland"],
    "#CEF191": [3.0, "Semi-natural grassland"],
    "#C91FCC": [4.0, "Heathland and Scrub"],
    "#F2A008": [5.0, "Bracken"],
    "#F8F8C9": [6.0, "Bog"],
    "#177E88": [7.0, "Fen/Marsh/Swamp"],
    "#FFFF00": [8.0, "Cultivated or managed vegetation"],
    "#00DDA4": [9.0, "Coastal habitat"],
    "#0E00ED": [10.0, "Open Water"],
    "#908E8D": [11.0, "Natural Bare Surfaces"],
    "#000000": [12.0, "Artificial Bare Surfaces"],
    "#DAC654": [13.0, "Young trees/Felled/Coppice"],
    "#5d994e": [14.0, "Woodland and scrub"],
}

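# Habitat colour map
broadhabitat_cmap = ListedColormap(list(broadhabitat_scheme.keys()))
# Define a normalization from class values -> colours. BoundaryNorm treats the
# values as bin edges, so add one edge above the last class value
broadhabitat_values = [value[0] for value in broadhabitat_scheme.values()]
broadhabitat_norm = colors.BoundaryNorm(
    broadhabitat_values + [broadhabitat_values[-1] + 1], len(broadhabitat_values)
)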

### Plot Habitat map

In [None]:
# Plotting
habitat_fig, ax = plt.subplots(figsize=(20, 10))

habitat_plot = ax.imshow(
    habitat_dataset_masked.broad.isel(time=0),
    cmap=broadhabitat_cmap,
    norm=broadhabitat_norm,
    extent=[
        habitat_dataset.x.min().data,
        habitat_dataset.x.max().data,
        habitat_dataset.y.min().data,
        habitat_dataset.y.max().data,
    ],
)

show_bounds = True
if boundary_exists and show_bounds:
    boundary.boundary.plot(ax=ax, ec="#e72323", linewidth=3)


patches = [
    Patch(color=color, label=label[1]) for color, label in broadhabitat_scheme.items()
]

ax.legend(handles=patches, bbox_to_anchor=(1.35, 1), facecolor="white")

# ax.set_axis_off()
plt.show()

### Save the figure to a file

In [None]:
habitat_fig.savefig(f"Broad_habitats_{area_name}_{year}.png")

### Quantify the area of each habitat in your area

In [None]:
habitat_stats_df = stat_summary(habitat_dataset_masked.broad, broadhabitat_scheme)
habitat_stats_df

You may want to produce some plots or save to a CSV, as for land cover.

## Water persistence data

In [None]:
# Water persistence colour scheme
waterper_scheme = {
    "#FFFFFF": [0.0, "Not affected"],
    "#0a549e": [1.0, "9+ months"],
    "#2172b6": [2.0, "8 months"],
    "#3e8ec4": [3.0, "7 months"],
    "#60a6d2": [4.0, "6 months"],
    "#89bfdd": [5.0, "5 months"],
    "#b0d2e8": [6.0, "4 months"],
    "#cde0f2": [7.0, "3 months"],
    # Each class needs its own colour, because a repeated dictionary key would
    # silently drop a class. This shade is an assumed intermediate value.
    "#dbe9f6": [8.0, "2 months"],
    "#e8f2fb": [9.0, "1 month"],
}

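# Water/wetness persistence colour map
waterper_cmap = ListedColormap(list(waterper_scheme.keys()))
# Define a normalization from class values -> colours. BoundaryNorm treats the
# values as bin edges, so add one edge above the last class value
waterper_values = [value[0] for value in waterper_scheme.values()]
waterper_norm = colors.BoundaryNorm(
    waterper_values + [waterper_values[-1] + 1], len(waterper_values)
)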
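
Because the colour schemes are dictionaries keyed by hex colour, two classes cannot share a colour: a repeated key in a dictionary literal raises no error and keeps only the last value. The sketch below demonstrates this and shows one way to guard against it; `build_scheme` is a hypothetical helper and is not used elsewhere in the notebook.

```python
def build_scheme(entries):
    """Build a colour scheme from (colour, value, name) tuples, rejecting repeats."""
    scheme = {}
    seen_colours = set()
    seen_values = set()
    for colour, value, name in entries:
        if colour.lower() in seen_colours:
            raise ValueError(f"Colour {colour} is used more than once")
        if value in seen_values:
            raise ValueError(f"Class value {value} is used more than once")
        seen_colours.add(colour.lower())
        seen_values.add(value)
        scheme[colour] = [value, name]
    return scheme


# A dict literal with a repeated key keeps only the last value
literal = {"#cde0f2": [7.0, "3 months"], "#cde0f2": [8.0, "2 months"]}
print(len(literal), literal["#cde0f2"])

# The helper raises an error for the same input instead
try:
    build_scheme([("#cde0f2", 7.0, "3 months"), ("#cde0f2", 8.0, "2 months")])
except ValueError as err:
    print(err)
```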

### Map Water persistence

In [None]:
# Plotting
waterper_fig, ax = plt.subplots(figsize=(20, 10))

waterper_plot = ax.imshow(
    lc_dataset_masked.waterpersist.isel(time=0),
    cmap=waterper_cmap,
    norm=waterper_norm,
    extent=[
        lc_dataset.x.min().data,
        lc_dataset.x.max().data,
        lc_dataset.y.min().data,
        lc_dataset.y.max().data,
    ],
)

show_bounds = True
if boundary_exists and show_bounds:
    boundary.boundary.plot(ax=ax, ec="#e72323", linewidth=3)

patches = [
    Patch(color=color, label=label[1]) for color, label in waterper_scheme.items()
]

ax.legend(handles=patches, bbox_to_anchor=(1.35, 1), facecolor="white")
ax.set_axis_off()
plt.show()

### Save the figure to a file

In [None]:
waterper_fig.savefig(f"Soil_moisture_persistence_{area_name}_{year}.png")

### Show how much land has persistent water on the surface and for how long

In [None]:
waterper_stats_df = stat_summary(lc_dataset_masked.waterpersist, waterper_scheme)
waterper_stats_df
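
Since the rows follow the order of the scheme, from the longest to the shortest persistence, a running total gives the area that is wet for at least a given number of months. The sketch below shows this on made-up values in the same layout as the summary table.

```python
import pandas as pd

# Illustrative values only, ordered from longest to shortest persistence
waterper = pd.DataFrame(
    {
        "CATEGORY": ["9+ months", "8 months", "7 months", "6 months"],
        "HECTARE": [1.5, 0.5, 2.0, 4.0],
    }
)

# Running total: area wet for at least as long as each category
waterper["AT_LEAST_HA"] = waterper["HECTARE"].cumsum()
print(waterper)
```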