# TroMoM
### Tropical Mosquito Monitor

This notebook reads the preprocessed data as if it had been provided by the customer.
The data is then analyzed and processed further until the hazard map is produced.

The processing steps, as specified in the project plan, are the following:


![](data_flow.png)

### Imports

In [None]:
import os

import numpy as np

import rasterio

### Read Data
As a first step, read preprocessed analysis-ready data from file.

In [None]:
""" Expected data structure

<area_name>_EPSG<epsg>_<date:YYYY-MM-DD> (one folder per sample, defined by time and area)
|
|- surface_temperature.tiff
|- soil_moisture.tiff
|- ndvi.tiff
|- population_density.tiff
"""
dir_data = "data/processed/borneo_EPSG4326_2023-02-15"    # example data directory


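def read_data(data_dir_list):
    """Read the four layers of every sample and stack them into one array per sample."""
    # Layer order of the stacked array; thresholds must be passed in the same order later.
    layer_files = ["soil_moisture.tiff", "surface_temperature.tiff", "ndvi.tiff", "population_density.tiff"]
    out = []
    for data_dir in data_dir_list:
        layers = []
        for fname in layer_files:
            # rasterio.open returns a dataset handle, not an array;
            # read(1) loads band 1 as a 2D array (assumes single-band files)
            with rasterio.open(os.path.join(data_dir, fname)) as src:
                layers.append(src.read(1))

        # np.stack requires all layers of a sample to share the same grid
        sample = np.stack(layers)   # shape: (n_layers, height, width)
        print(sample.shape, type(sample))

        out.append(sample)

    return out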

In [None]:
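dir_processed = "data/processed"
# os.listdir returns bare folder names, so prepend the parent directory
data_dir_list = [os.path.join(dir_processed, d) for d in sorted(os.listdir(dir_processed))]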
# optionally provide spatial/temporal filtering options

data = read_data(data_dir_list)     # a list of stacked data samples
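
The optional spatial/temporal filtering mentioned in the cell above could be driven by the folder naming scheme `<area_name>_EPSG<epsg>_<date:YYYY-MM-DD>`. The following is a minimal sketch, assuming folder names follow that scheme exactly. `parse_sample_name` and `filter_samples` are hypothetical helpers, not part of the project code.

```python
import re
from datetime import date

# Pattern for "<area_name>_EPSG<epsg>_<YYYY-MM-DD>"; area names may contain underscores.
SAMPLE_PATTERN = re.compile(r"^(?P<area>.+)_EPSG(?P<epsg>\d+)_(?P<date>\d{4}-\d{2}-\d{2})$")


def parse_sample_name(name):
    """Split a sample folder name into (area, epsg, date); return None if it does not match."""
    match = SAMPLE_PATTERN.match(name)
    if match is None:
        return None
    return match["area"], int(match["epsg"]), date.fromisoformat(match["date"])


def filter_samples(names, area=None, start=None, end=None):
    """Keep folder names that match the optional area and the inclusive date range."""
    selected = []
    for name in names:
        parsed = parse_sample_name(name)
        if parsed is None:
            continue  # skip entries that do not follow the naming scheme
        sample_area, _, sample_date = parsed
        if area is not None and sample_area != area:
            continue
        if start is not None and sample_date < start:
            continue
        if end is not None and sample_date > end:
            continue
        selected.append(name)
    return selected
```

The filter works on bare folder names, so it would be applied to the output of `os.listdir` before the names are joined with the parent directory.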

### Process Data
Now, the actual processing for our product starts, following the data flow specified at the top.
The simplest algorithm thresholds each data layer separately, so we first specify a `[lower, upper]` range for every layer.

In [None]:
# Placeholder [lower, upper] thresholds per layer.
# Realistic values still need to be determined by research and by inspecting
# the downloaded data in swampy areas.
thresh_temp = [20, 35]
thresh_moisture = [.3, .9]
thresh_ndvi = [.3, 1]
thresh_pop = [.1, 1]
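
Because thresholds are matched to data layers by position, it may help to key them by layer name and derive the ordered list from the stacking order. This is a sketch only: the `layer_order` list is an assumption about how the samples are stacked, `ordered_thresholds` is a hypothetical helper, and the values are the placeholder values from above.

```python
# Placeholder thresholds keyed by layer name (same example values as above).
thresholds = {
    "surface_temperature": (20, 35),
    "soil_moisture": (0.3, 0.9),
    "ndvi": (0.3, 1),
    "population_density": (0.1, 1),
}

# Assumed stacking order of the layers; adjust it to match how the samples are stacked.
layer_order = ["soil_moisture", "surface_temperature", "ndvi", "population_density"]


def ordered_thresholds(thresholds, layer_order):
    """Return the (lower, upper) pairs in stacking order and check each range."""
    ordered = []
    for name in layer_order:
        lower, upper = thresholds[name]  # raises KeyError if a layer has no threshold
        if lower >= upper:
            raise ValueError(f"Lower bound must be below upper bound for {name}")
        ordered.append((lower, upper))
    return ordered


threshs = ordered_thresholds(thresholds, layer_order)
```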

In [None]:
def classify_by_threshold(data_list, threshs):
    out = []
    for data in data_list:
        # data is layer-first: (n_layers, height, width)
        assert data.shape[0] == len(threshs), "Number of thresholds must equal number of data layers."
        classif = np.zeros(data.shape[1:], dtype=int)  # one integer code per pixel

        for i, (layer_, thresh_) in enumerate(zip(data, threshs)):
            lower, upper = thresh_
            # parentheses are required because & binds tighter than the comparisons
            in_range = (layer_ > lower) & (layer_ < upper)
            classif[in_range] += 10**i   # digit i records whether layer i is within its range

        out.append(classif)

    return out
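
Since each layer adds `10**i` to the pixel value, digit `i` of the result (counted from the right) records whether layer `i` lay within its thresholds. Below is a minimal sketch of how such codes could be decoded and turned into a binary hazard mask. `decode_conditions` is a hypothetical helper, and requiring all conditions to be met is an assumption, not the method specified in the project plan.

```python
import numpy as np


def decode_conditions(classif, n_layers):
    """Return a boolean array of shape (n_layers, height, width).

    Entry i is True where layer i was within its thresholds.
    """
    codes = np.asarray(classif).astype(int)
    return np.stack([(codes // 10**i) % 10 == 1 for i in range(n_layers)])


# Tiny synthetic example with 4 layers: a code of 1111 means all conditions are met.
classif = np.array([[1111, 101], [0, 1000]])
met = decode_conditions(classif, n_layers=4)

# Assumed hazard rule: a pixel is hazardous only if every layer is within its range.
hazard = met.all(axis=0)
```

A weighted combination of the decoded layers would be an alternative to the strict all-conditions rule, at the cost of having to choose weights.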