In [1]:
%load_ext watermark
import pandas as pd
import numpy as np

# Review of land-use process (in progress)

## Notes 1'500 meter buffer (circular)

The land-use chapter in the federal report generated a lot of interest. It inspired an academic article and a collaboration with Wageningen University and Research. The principle was applied using an empirical Bayes method with the Solid-Waste-Team at EPFL.

1. The area of the buffer is $\pi \times 1500^2$, or approximately 7065000 m² (taking $\pi \approx 3.14$).
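
As a quick check of the arithmetic, the value 7065000 corresponds to $\pi$ rounded to 3.14. The sketch below compares it with the full-precision area; the variable names are illustrative only and do not appear elsewhere in this notebook.

```python
import math

RADIUS_M = 1500  # buffer radius in meters, from the notes above

exact_area = math.pi * RADIUS_M**2   # full-precision value of pi
rounded_area = 3.14 * RADIUS_M**2    # pi rounded to 3.14, as in the notes

print(f"exact:   {exact_area:,.0f} m²")
print(f"rounded: {rounded_area:,.0f} m²")
print(f"relative difference: {(exact_area - rounded_area) / exact_area:.4%}")
```

The difference is small relative to the buffer area, but the same constant should be used wherever the dry-land value is computed.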

### Land-use and land-cover

__Land-cover:__ Elements such as buildings, forest, orchard or undefined. There is at least one land-cover element for each survey location. Taken together, the land-cover elements account for 100% of the available dry land.

__Land-use:__ Elements that may or may not be in the buffer, such as schools, hospitals and water-treatment plants. Individually they cover only a small portion of the available dry land.

#### Extracting land-cover and land-use:

For this method we use the land-cover layer from swissTLMRegio.

In QGIS:

1. create a buffer around each survey point
   * make sure that the survey location and feature_type are in the attributes of the new buffer layer
   * the survey locations are loaded as points from a .csv file
   * reproject the points layer to the project CRS 

2. use the new buffer layer as an overlay to the land-cover layer
   * use the overlay intersection tool
   * select the fields to keep from the buffer (slug and feature type)
   * select the fields to keep from the land-cover layer
   * run the function
   * this creates a temporary layer called _intersection_

3. get the surface area of all the land-cover and land-use features in each buffer of the temporary layer
   * use the field calculator for the attribute table of the layer
   * in the field calculator, make a new field and enter the formula `$area`
   * for this example the ellipsoid is _Bessel 1841 (EPSG:7004)_
   * this is set in the properties of the QGIS project
   * export the layer as .csv

4. verify the land-use features per location
   * drop duplicate values: use location, feature and area to define duplicates
   * attention! different names for lake and reservoir
     * change Stausee to See

5. make a dry land feature
   * this is the surface area of the buffer that is not covered by water
   * subtract the area of See from the area of the buffer
   * identify survey locations that have significant water features but are not listed as lakes
  
6. Scale the land-use attributes of interest to the available dry-land
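
Step 4 above can be sketched in pandas. This is a minimal illustration on invented rows, assuming the exported .csv has the columns `slug`, `feature` and `area`; the slugs and values are hypothetical. The reservoir label is renamed before dropping duplicates so that a `Stausee` row and a `See` row for the same polygon are treated as one.

```python
import pandas as pd

# Toy stand-in for the exported .csv; slugs and areas are invented
raw = pd.DataFrame({
    "slug": ["loc-a", "loc-a", "loc-a", "loc-b"],
    "feature": ["See", "See", "Wald", "Stausee"],
    "area": [1000.0, 1000.0, 500.0, 2000.0],
})

# Stausee (reservoir) and See (lake) are treated as the same feature
cleaned = raw.replace({"feature": {"Stausee": "See"}})

# a duplicate is a row that repeats location, feature and area
cleaned = cleaned.drop_duplicates(subset=["slug", "feature", "area"])
cleaned = cleaned.reset_index(drop=True)
print(cleaned)
```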
  
__Example making dry land columns and scaling the land-use__

    
```python
# locations with significant still water feature but
# not listed as a lake
see_not_lake = l_u[(l_u.feature == "See")&(l_u.feature_ty != "l")]
snl = see_not_lake.slug.unique()

# lakes and locations in snl
# recall that feature type has a designator for lakes
lakes = lg[(lg.feature_ty == 'l') | lg.slug.isin(snl)].copy()

# from this subset of data separate the surface area covered by water
# set the slug to the index and subtract the surface area of the water
# from the surface area of the buffer
lake_only = lakes[lakes.feature == "See"]
lo = lake_only[["slug", "area"]].set_index("slug")

# subtract the lake value from the area of the buffer
lo["dry"] = 7065000 - lo.area
lodry = lo["dry"]

# merge the original land use data from lakes with
# the dry land results
lgi = lakes.merge(lo["dry"], left_on="slug", right_index=True)
# remove the lake feature from the features columns
lgi = lgi[lgi.feature != "See"].copy()

# scale the landuse feature to the available dry land
lgi["scale"] = (lgi.area/lgi.dry).round(3)

# repeat the process for locations that do not have a lake feature
# these locations are accounted for above
eliminate = [*snl, *lo.index]
# recuperate all other locations
rivers_parcs = lg[~lg.slug.isin(eliminate)].copy()
# define the dry land as the area of the buffer
rivers_parcs["dry"] = 7065000
# scale the features with the dry land value
rivers_parcs["scale"] = rivers_parcs.area/rivers_parcs.dry

# combine the two results
land_cover = pd.concat([rivers_parcs, lgi])
```
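
The example above depends on data frames that are not defined in this notebook. The following self-contained sketch applies the same idea to invented rows. It is not the method used above: it swaps the two-branch approach for a `groupby` and `reindex`, so that locations without a `See` row get a water area of zero. All slugs and areas are hypothetical.

```python
import pandas as pd

BUFFER_AREA = 7065000  # buffer area in m², as used in the notes

# Invented land-cover rows: loc-a has a lake (See), loc-b does not
lc = pd.DataFrame({
    "slug": ["loc-a", "loc-a", "loc-a", "loc-b", "loc-b"],
    "feature": ["See", "Wald", "Siedl", "Wald", "Siedl"],
    "area": [3065000, 1000000, 3000000, 2065000, 5000000],
})

# water area per location; locations without a See row get 0
water = lc[lc["feature"] == "See"].groupby("slug")["area"].sum()
dry = BUFFER_AREA - water.reindex(lc["slug"].unique(), fill_value=0)

# attach the dry-land value, drop the water rows, and scale
scaled = lc[lc["feature"] != "See"].copy()
scaled["dry"] = scaled["slug"].map(dry)
scaled["scale"] = (scaled["area"] / scaled["dry"]).round(3)
print(scaled)
```

With these toy values the scaled land-cover shares of each location add up to 1, which is the property expected of land-cover relative to dry land.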



### quantity

In [31]:
def collect_vitals(data):
    total = data.quantity.sum()
    median = data.pcs_m.median()
    samples = data.loc_date.nunique()
    ncodes = data.code.nunique()
    nlocations = data.slug.nunique()
    nbodies = data.feature_name.nunique()
    ncities = data.city.nunique()
    min_date = data["date"].min()
    max_date = data["date"].max()
    
    return total, median, samples, ncodes, nlocations, nbodies, ncities, min_date, max_date

def find_missing(more_than, less_than):
    return np.setdiff1d(more_than, less_than)
def find_missing_loc_dates(done, dtwo):
    locs_one = done.loc_date.unique()
    locs_two = dtwo.loc_date.unique()
    return find_missing(locs_one, locs_two)

def aggregate_gcaps_gfoams_gfrags(data, codes, columns=("Gfoams", "Gfrags", "Gcaps")):
    for col in columns:
        change = codes.loc[codes.parent_code == col].index
        data.loc[data.code.isin(change), "code"] = col
        
    return data

def make_a_summary(vitals, add_summary_name=False):

    a_summary = f"""
    Number of objects: {vitals[0]}
    
    Median pieces/meter: {vitals[1]}
    
    Number of samples: {vitals[2]}
    
    Number of unique codes: {vitals[3]}
    
    Number of sample locations: {vitals[4]}
    
    Number of features: {vitals[5]}
    
    Number of cities: {vitals[6]}
    
    Start date: {vitals[7]}
    
    End date: {vitals[8]}
    
    """

    if add_summary_name:
        a_summary = f"""
        Summary name = {add_summary_name}

        {a_summary}
        """
        
    return a_summary
def combine_survey_files(list_of_files):

    files = []
    for afile in list_of_files:
        files.append(pd.read_csv(afile))
    return pd.concat(files)

def indexed_feature_data(file, index: str = "code"):
    df = pd.read_csv(file)
    df.set_index(index, drop=True, inplace=True)
    return df
code_cols = ['material', 'description', 'source', 'parent_code', 'single_use', 'groupname']

group_by_columns = [
    'loc_date', 
    'date', 
    'feature_name', 
    'slug',     
    'parent_boundary',
    'length',
    'groupname',
    'city',
    'code', 
]
agg_this = {
    "quantity":"sum",
    "pcs_m": "sum"
}




survey_data = [
    "data/end_process/after_may_2021.csv",
    "data/end_process/iqaasl.csv",
    "data/end_process/mcbp.csv",
    "data/end_process/slr.csv",
]

code_data =  "data/end_process/codes.csv"
beach_data = "data/end_process/beaches.csv"
land_cover_data = "data/end_process/land_cover.csv"
land_use_data = "data/end_process/land_use.csv"

surveys = combine_survey_files(survey_data)
codes = indexed_feature_data(code_data, index="code")
beaches = indexed_feature_data(beach_data, index="slug")
land_cover = pd.read_csv(land_cover_data)
land_use = pd.read_csv(land_use_data)
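
The cell above defines `aggregate_gcaps_gfoams_gfrags`, `group_by_columns` and `agg_this` but does not show them in use. The sketch below illustrates one plausible use on invented rows: child codes are relabeled to their parent code, then quantities are summed per sample and code. The toy codes, their parent assignments and the reduced grouping columns are assumptions made for illustration, not values from the survey data.

```python
import pandas as pd

# Invented survey rows; G21 and G23 are assumed here to be children of Gcaps
toy = pd.DataFrame({
    "loc_date": ["a-2021-01-01"] * 3,
    "code": ["G21", "G23", "G27"],
    "quantity": [2, 3, 5],
    "pcs_m": [0.2, 0.3, 0.5],
})
toy_codes = pd.DataFrame(
    {"parent_code": ["Gcaps", "Gcaps", "G27"]},
    index=pd.Index(["G21", "G23", "G27"], name="code"),
)

# same relabeling logic as aggregate_gcaps_gfoams_gfrags, applied to a copy
relabeled = toy.copy()
for parent in ("Gfoams", "Gfrags", "Gcaps"):
    children = toy_codes.loc[toy_codes["parent_code"] == parent].index
    relabeled.loc[relabeled["code"].isin(children), "code"] = parent

# sum quantity and pcs_m for each sample and code
totals = relabeled.groupby(["loc_date", "code"], as_index=False).agg(
    {"quantity": "sum", "pcs_m": "sum"}
)
print(totals)
```

Working on a copy keeps the original survey rows intact, whereas the function in the cell above modifies the data frame it receives.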

In [22]:
land_cover[land_cover.slug == "parc-des-pierrettes"]

| Unnamed: 0 | slug | feature_ty | feature | area | dry | scale |
|---|---|---|---|---|---|---|
| 141 | parc-des-pierrettes | l | undefined | 68285 | 3832559 | 0.018 |
| 623 | parc-des-pierrettes | l | Siedl | 3606885 | 3832559 | 0.941 |
| 624 | parc-des-pierrettes | l | Wald | 157389 | 3832559 | 0.041 |
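
The land-cover rows for parc-des-pierrettes can be checked by hand. The sketch below copies the `area` and `dry` values from the output above and confirms that the land-cover areas add up to the dry-land value, and that each `scale` is the area divided by the dry land. The water area is inferred here as the buffer area minus the dry land; it is not a value reported in the output.

```python
BUFFER_AREA = 7065000  # m², buffer area used in the notes

# area values copied from the land_cover rows for parc-des-pierrettes
areas = {"undefined": 68285, "Siedl": 3606885, "Wald": 157389}
dry = 3832559

# the land-cover areas fill the available dry land exactly
assert sum(areas.values()) == dry

# scale is the share of dry land, rounded to three decimals
scales = {name: round(area / dry, 3) for name, area in areas.items()}

# area implied for the lake inside the buffer
water = BUFFER_AREA - dry
print(scales, water)
```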


In [32]:
land_use[land_use.slug == "parc-des-pierrettes"]

| Unnamed: 0 | slug | feature_ty | dry | feature | area | scale |
|---|---|---|---|---|---|---|
| 685 | parc-des-pierrettes | l | 3832559 | Abwasserreinigungsareal | 49488 | 0.012913 |
| 686 | parc-des-pierrettes | l | 3832559 | Friedhof | 11325 | 0.002955 |
| 687 | parc-des-pierrettes | l | 3832559 | Oeffentliches Parkareal | 175591 | 0.045816 |
| 688 | parc-des-pierrettes | l | 3832559 | Schrebergartenareal | 17306 | 0.004516 |
| 689 | parc-des-pierrettes | l | 3832559 | Schul- und Hochschulareal | 1189161 | 0.310279 |


In [24]:
%watermark -a hammerdirt-analyst -co --iversions

Author: hammerdirt-analyst

conda environment: cantonal_report

numpy : 1.25.2
pandas: 2.0.3

