# swisslandstats preprocessing

We will create an extract of the Canton of Vaud (Switzerland) derived from the [Swiss Land Statistics (SLS) datasets from the Swiss Federal Statistical Office](https://www.bfs.admin.ch/bfs/en/home/services/geostat/swiss-federal-statistics-geodata/land-use-cover-suitability/swiss-land-use-statistics.html). The data is first downloaded using GNU Make.

The SLS dataset is provided as a comma-separated values (CSV) file where each row corresponds to one of the hectometric pixels that make up the Swiss territory. It features three main groups of columns [1]:

* the coordinates of each pixel's centroid: `E`, `N` in the LV95 coordinate reference system (CRS) or `X`, `Y` in the LV03 CRS
* columns starting with `FJ`, which denote the year in which the observation was taken
* columns starting with `LU`, which denote the land use/land cover (LULC) code of each pixel
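
To make the three column groups concrete, the following minimal sketch sorts the header of a CSV file into those groups using only the standard library. The sample rows, their values, and the helper name `group_columns` are hypothetical and only illustrate the naming scheme described above; the real SLS file has many more rows and columns.

```python
import csv
import io

# Hypothetical three-row sample with made-up values (not real SLS data).
SAMPLE_CSV = """E,N,FJ85,FJ97,LU85_4,LU97_4
2538000,1152000,1980,1992,1,1
2538100,1152000,1980,1992,2,1
2538200,1152000,1981,1992,2,2
"""


def group_columns(header):
    """Split column names into coordinate, survey-year and LULC groups."""
    groups = {"coordinates": [], "survey_year": [], "lulc": []}
    for name in header:
        if name in ("E", "N", "X", "Y"):
            groups["coordinates"].append(name)
        elif name.startswith("FJ"):
            groups["survey_year"].append(name)
        elif name.startswith("LU"):
            groups["lulc"].append(name)
    return groups


# Only the header row is needed to classify the columns.
reader = csv.reader(io.StringIO(SAMPLE_CSV))
header = next(reader)
column_groups = group_columns(header)
print(column_groups)
```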

We will now preprocess this data using the [swisslandstats-geopy](https://github.com/martibosch/swisslandstats-geopy) library:

In [None]:
from os import path

import swisslandstats as sls

Below are the parameters needed to run this notebook (to be filled directly in the notebook or using [papermill](https://github.com/nteract/papermill)):

In [None]:
lulc_columns = ["LU85_4", "LU97_4", "LU09_4", "LU18_4"]
dst_dir = "../data/processed"
nominatim_query = ""
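
As an illustration of the papermill route, the sketch below builds the parameter dictionary and passes it to `papermill.execute_notebook`. The notebook paths and the query string `"Canton de Vaud"` are assumptions made for this example, not values taken from this notebook, and papermill only injects parameters when the cell above is tagged `parameters`.

```python
def build_parameters(nominatim_query, dst_dir="../data/processed"):
    """Collect the notebook parameters into a dict that papermill can inject."""
    return {
        "lulc_columns": ["LU85_4", "LU97_4", "LU09_4", "LU18_4"],
        "dst_dir": dst_dir,
        "nominatim_query": nominatim_query,
    }


def run_notebook(input_path, output_path, parameters):
    """Execute the notebook with the given parameters."""
    # Imported lazily so the rest of the sketch works without papermill installed.
    import papermill as pm

    return pm.execute_notebook(input_path, output_path, parameters=parameters)


# "Canton de Vaud" is an assumed query string; adjust it to the extent of interest.
parameters = build_parameters("Canton de Vaud")
print(parameters)

# Hypothetical notebook paths, left commented out so that nothing is executed here:
# run_notebook("preprocessing.ipynb", "preprocessing-vaud.ipynb", parameters)
```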

We will now:

1. read the data into a land data frame:

In [None]:
ldf = sls.load_dataset(dataset_key="sls")

2. clip it to the extent of interest (this step is skipped when `nominatim_query` is empty):

In [None]:
if nominatim_query:
    ldf = ldf.clip_by_nominatim(nominatim_query)

3. dump the LULC columns of interest into GeoTIFF files:

In [None]:
for lulc_column in lulc_columns:
    ldf.to_geotiff(path.join(dst_dir, f"{lulc_column}.tif"), lulc_column)
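
The loop above writes into `dst_dir`. It is not stated here whether `to_geotiff` creates missing directories, so the sketch below assumes it does not and creates the directory before building one output path per column. The helper name `build_output_paths` and the temporary demo directory are hypothetical.

```python
import os
import tempfile
from os import path


def build_output_paths(dst_dir, lulc_columns):
    """Create dst_dir if needed and map each LULC column to its GeoTIFF path."""
    os.makedirs(dst_dir, exist_ok=True)
    return {
        lulc_column: path.join(dst_dir, f"{lulc_column}.tif")
        for lulc_column in lulc_columns
    }


# A temporary directory stands in for "../data/processed" in this demo.
demo_dir = path.join(tempfile.mkdtemp(), "processed")
output_paths = build_output_paths(demo_dir, ["LU85_4", "LU97_4"])

for lulc_column, dst_filepath in output_paths.items():
    print(lulc_column, "->", dst_filepath)
    # With a land data frame at hand, the export would be the same call as above:
    # ldf.to_geotiff(dst_filepath, lulc_column)
```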

## References

1. Bosch, M. (2019). swisslandstats-geopy: Python tools for the land statistics datasets from the Swiss Federal Statistical Office. Journal of Open Source Software, 4(41), 1511.