# Milestone 2 example workflow: Lake Water Quality Monitoring

## Introduction

Water quality monitoring in lakes and reservoirs has a number of applications, from climate science to aquaculture. Earth observation techniques let us complement ground sensors by monitoring a larger area, and can provide information where no ground sensors are present.

Here we want to go through the process of calculating some of the properties used in modelling water quality. For our case study, we will focus on a part of Lake Victoria in Southern Uganda.

We want to measure values for different water quality properties. [Several different properties can be estimated using remote sensing](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5017463/pdf/sensors-16-01298.pdf). Commonly, in-situ data are used to calibrate empirical relationships. Here, rather than deriving physical concentrations, we simply estimate relative quantities, focusing on the following as examples:
1. Chlorophyll-a, chl-a
2. Colored dissolved organic matter, CDOM
3. Water surface temperature

In [1]:
import rasterio
import geopandas as gpd
import contextily as ctx
from shapely.geometry import Polygon  # asPolygon was removed in Shapely 2.0
import numpy as np
from pathlib import Path
import matplotlib.pyplot as plt
import seaborn as sns
import glob

# yt packages
import yt
import yt.extensions.geotiff

In [None]:
# Bounding box (lon, lat) of the study area; the first vertex is repeated to close the ring
footprint_geom = Polygon(
    [
        [31.629638671875, -1.2962761196418089],
        [34.11529541015625, -1.2962761196418089],
        [34.11529541015625, 0.6426867176331666],
        [31.629638671875, 0.6426867176331666],
        [31.629638671875, -1.2962761196418089],
    ]
)

In [None]:
footprint = gpd.GeoDataFrame([{'id':0, 'geometry': footprint_geom}])
footprint.crs = "epsg:4326"
footprint
ax = footprint.to_crs(epsg=3857).boundary.plot(figsize=(14,12))
ctx.add_basemap(ax, source=ctx.providers.Stamen.TonerLite)  # `source` replaces the deprecated `url` keyword
ax.set_axis_off()

For chl-a, we can estimate values using the different reflectance bands in the following equations:

`chl-a ~ R(443 nm)/R(560 nm) ~ (B1/B3)` <= Case 1: for water dominated by phytoplankton

`chl-a ~ R(705 nm)/R(665 nm) ~ (B5/B4)` <= Case 2: more complex mixture of optically active components (including CDOM)

A more precise index, commonly used to provide an estimated upper limit on the chlorophyll content, is the Maximum Chlorophyll Index: `MCI = R(705 nm) - R(665 nm) - 0.53*(R(740 nm) - R(665 nm)) = B5 - B4 - 0.53*(B6 - B4)`.

And we can estimate CDOM absorption at 440 nm using the following:

`CDOM(440 nm) ~ 8 * (B3 / B2)^(-1.4)`
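
As a rough illustration, the relations above can be written with plain NumPy arrays. This is a minimal sketch, not the way yt_geotiff defines its derived fields: the function names are hypothetical, and `b1` to `b6` are assumed to be reflectance arrays for Sentinel-2 bands B1 to B6 that have already been resampled to a common grid.

```python
import numpy as np

def safe_ratio(numerator, denominator):
    """Element-wise ratio; NaN wherever the denominator is zero."""
    numerator = np.asarray(numerator, dtype=float)
    denominator = np.asarray(denominator, dtype=float)
    out = np.full(np.broadcast(numerator, denominator).shape, np.nan)
    np.divide(numerator, denominator, out=out, where=denominator != 0)
    return out

def chl_a_case1(b1, b3):
    """Relative chl-a for phytoplankton-dominated water: R(443)/R(560)."""
    return safe_ratio(b1, b3)

def chl_a_case2(b5, b4):
    """Relative chl-a for optically complex water: R(705)/R(665)."""
    return safe_ratio(b5, b4)

def mci(b4, b5, b6):
    """Maximum Chlorophyll Index: B5 - B4 - 0.53*(B6 - B4)."""
    b4, b5, b6 = (np.asarray(b, dtype=float) for b in (b4, b5, b6))
    return b5 - b4 - 0.53 * (b6 - b4)

def cdom_440(b2, b3):
    """Relative CDOM absorption at 440 nm: 8 * (B3/B2)^(-1.4)."""
    return 8.0 * safe_ratio(b3, b2) ** -1.4

# Toy reflectance values (illustrative only, not real measurements)
b2 = np.array([0.040, 0.050])
b3 = np.array([0.050, 0.050])
b4 = np.array([0.020, 0.030])
b5 = np.array([0.030, 0.030])
b6 = np.array([0.025, 0.030])
print(mci(b4, b5, b6), cdom_440(b2, b3))
```

All four quantities are relative indicators; turning them into concentrations would require calibration against in-situ data.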

## Loading, resampling and plotting Sentinel-2 data
Sentinel-2 data products with European Space Agency (ESA) Level-1C and Level-2A processing can be loaded and queried with yt_geotiff. All band files within the Sentinel-2 directory are up- or down-sampled to match the originally loaded image. Cubic spline interpolation can be applied to improve the visual appearance.
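
To make the idea of matching grids concrete, here is a minimal NumPy sketch of resampling by an integer factor. It is not yt_geotiff's implementation: block averaging and nearest-neighbour repetition are used as simple stand-ins for the cubic spline option mentioned above, and both function names are hypothetical. Sentinel-2 bands come at 10 m, 20 m and 60 m pixel sizes, so integer factors such as 3 or 6 cover the common cases.

```python
import numpy as np

def downsample_mean(band, factor):
    """Coarsen a 2D band by averaging non-overlapping factor x factor blocks."""
    rows, cols = band.shape
    if rows % factor or cols % factor:
        raise ValueError("band shape must be divisible by factor")
    blocks = band.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

def upsample_nearest(band, factor):
    """Refine a 2D band by repeating each pixel factor times along both axes."""
    return np.repeat(np.repeat(band, factor, axis=0), factor, axis=1)

# A 6x6 "20 m" toy band brought onto a 2x2 "60 m" grid and back again
fine = np.arange(36.0).reshape(6, 6)
coarse = downsample_mean(fine, 3)
restored = upsample_nearest(coarse, 3)
print(coarse.shape, restored.shape)
```

Block averaging preserves the mean signal when coarsening, which is usually preferable to simply dropping pixels; a spline-based resampler would give smoother results when refining.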

In [None]:
fn1= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/T36MVE_20210315T075701_B01.jp2'
fn2= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/T36MVE_20210315T075701_B02.jp2'
fn3= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/T36MVE_20210315T075701_B03.jp2'
fn4= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/T36MVE_20210315T075701_B04.jp2'
fn5= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/T36MVE_20210315T075701_B05.jp2'
fn6= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/T36MVE_20210315T075701_B06.jp2'
fn7= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/T36MVE_20210315T075701_B07.jp2'
fn8= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/T36MVE_20210315T075701_B08.jp2'
fn9= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/T36MVE_20210315T075701_B8A.jp2'
fn10= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/T36MVE_20210315T075701_B09.jp2'
fn11= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/T36MVE_20210315T075701_B10.jp2'
fn12= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/T36MVE_20210315T075701_B11.jp2'
fn13= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/T36MVE_20210315T075701_B12.jp2'
fn14= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/LC08_L2SP_171060_20210227_20210304_02_T1_SR_B1.TIF'
fn15= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/LC08_L2SP_171060_20210227_20210304_02_T1_SR_B2.TIF'
fn16= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/LC08_L2SP_171060_20210227_20210304_02_T1_SR_B3.TIF'
fn17= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/LC08_L2SP_171060_20210227_20210304_02_T1_SR_B4.TIF'
fn18= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/LC08_L2SP_171060_20210227_20210304_02_T1_SR_B5.TIF'
fn19= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/LC08_L2SP_171060_20210227_20210304_02_T1_SR_B6.TIF'
fn20= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/LC08_L2SP_171060_20210227_20210304_02_T1_SR_B7.TIF'
fn21= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/LC08_L2SP_171060_20210227_20210304_02_T1_ST_B10.TIF'
fn22= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/s1b-iw-grd-vv-20210324t062938-20210324t063003-026153-031ee7-001.tiff'
fn23= 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/s1b-iw-grd-vh-20210324t062938-20210324t063003-026153-031ee7-001.tiff'

In [174]:
# Search for image files in directory
directory = 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/S2_Landsat_test/'
types = ('*.jp2', '*.TIF')  # file patterns to match
files_grabbed = []
for pattern in types:
    files_grabbed.extend(glob.glob(directory + pattern))

In [180]:
file_num = range(0,len(files_grabbed))

In [181]:
file_num

range(0, 22)

In [182]:
#ds = yt.load(fn5,fn18) # S2 + L8: Test 1
ds = yt.load(*[str(files_grabbed[fn]) for fn in file_num])
#ds = yt.load(fn5) # S2:
#ds = yt.load(fn18) # L8


yt : [INFO     ] 2021-04-12 16:40:16,708 Parameters: domain_dimensions         = [1830 1830    1]
yt : [INFO     ] 2021-04-12 16:40:16,710 Parameters: domain_left_edge          = [ 399960. 9890200.       0.] m
yt : [INFO     ] 2021-04-12 16:40:16,711 Parameters: domain_right_edge         = [5.0976e+05 1.0000e+07 1.0000e+00] m


In [183]:
ds.field_list

[('bands', 'LS_B1'),
 ('bands', 'LS_B10'),
 ('bands', 'LS_B2'),
 ('bands', 'LS_B3'),
 ('bands', 'LS_B4'),
 ('bands', 'LS_B5'),
 ('bands', 'LS_B6'),
 ('bands', 'LS_B7'),
 ('bands', 'S2_B01'),
 ('bands', 'S2_B02'),
 ('bands', 'S2_B03'),
 ('bands', 'S2_B04'),
 ('bands', 'S2_B05'),
 ('bands', 'S2_B06'),
 ('bands', 'S2_B07'),
 ('bands', 'S2_B08'),
 ('bands', 'S2_B09'),
 ('bands', 'S2_B10'),
 ('bands', 'S2_B11'),
 ('bands', 'S2_B12'),
 ('bands', 'S2_B8A'),
 ('bands', 'S2_TCI')]

In [184]:

# Centres are given on the Sentinel-2 grid, which defines the dataset domain (see the load log above)
p = ds.plot(('bands', 'LS_B5'), height=(10., 'km'), width=(20., 'km'), center=ds.arr([471696,9989860],'m'))
p.set_log(('bands', 'LS_B5'), True)
#p.set_zlim(('bands', 'LS_B5'), 10000, 20000)
p.set_cmap(('bands', 'LS_B5'), 'turbo')
p.show()

p = ds.plot(('bands', 'S2_B05'), height=(10., 'km'), width=(20., 'km'), center=ds.arr([471696,9989860],'m'))
p.set_log(('bands', 'S2_B05'), False)
p.set_cmap(('bands', 'S2_B05'), 'B-W LINEAR')
p.show()

# The Landsat-native centre ds.arr([471590,-10285],'m') lies outside this domain and appears to be what raised:
# RuntimeError: Region right edge[1] < left edge: width = 0.0

In [None]:
ds.derived_field_list

## 1) Picking out water bodies

Sentinel-2 NDWI for water body detection can be constructed from the "Green" Band 3 (559 nm) and the "NIR" Band 8A (864 nm):

`NDWI = (Green - NIR)/(Green + NIR)`
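
For reference, the same index can be computed directly from two arrays. This is a minimal sketch with a hypothetical function name, assuming `green` and `nir` are reflectance arrays on the same grid; it is independent of how the `NDWI` derived field is defined in yt_geotiff.

```python
import numpy as np

def ndwi(green, nir):
    """NDWI = (Green - NIR) / (Green + NIR); NaN where both bands are zero."""
    green = np.asarray(green, dtype=float)
    nir = np.asarray(nir, dtype=float)
    total = green + nir
    out = np.full(total.shape, np.nan)
    np.divide(green - nir, total, out=out, where=total != 0)
    return out

# Toy values: a water-like pixel, a vegetation-like pixel, and a no-data pixel
green = np.array([0.08, 0.05, 0.0])
nir = np.array([0.02, 0.30, 0.0])
print(ndwi(green, nir))
```

For non-negative reflectances the index lies between -1 and 1, with open water tending towards positive values because water absorbs strongly in the near infrared.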

### Query the NDWI derivable field and convert to numpy array

In [None]:
# Define dimensions
width = ds.arr(5., 'km')
height = ds.arr(5.,'km')
rectangle_centre = ds.arr([457770,9946294],'m')

rectangular_yt_container = ds.rectangle_from_center(rectangle_centre,width,height)
ndwi_data = rectangular_yt_container[('bands', 'NDWI')].d

In [None]:
ndwi_data.shape

### Have a look at the histogram of pixel values for NDWI

In [None]:
fig1, (ax1) = plt.subplots(1,1, figsize=(14,6))
sns.set_theme(style="ticks")
sns.histplot(data=ndwi_data, ax=ax1, color="b")
ax1.set_xlabel('NDWI')

### Isolate water pixels based on a threshold:

##### < 0.1 - Non-water
##### >= 0.1 - Water

In [None]:
water_pixels = ndwi_data[ndwi_data>=0.1]

### Calculate area coverage of water pixels:

In [None]:
# Select cells with NDWI >= 0.1 and sum their area; append .to('km**2') to report the result in km**2
water = rectangular_yt_container.cut_region(["obj['bands', 'NDWI'] >= 0.1"])
print(water["index", "area"].sum())
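
As a cross-check on the area reported by yt, the same quantity can be approximated by counting thresholded pixels. This is a minimal sketch with a hypothetical function name; the 60 m pixel size is an assumption derived from the load log above (a domain about 109800 m wide divided into 1830 cells).

```python
import numpy as np

def water_area_km2(ndwi_values, pixel_size_m, threshold=0.1):
    """Area in km**2 covered by pixels whose NDWI meets the threshold."""
    mask = np.asarray(ndwi_values, dtype=float) >= threshold  # NaN compares as False
    return mask.sum() * pixel_size_m ** 2 / 1e6

# Toy NDWI values: two pixels pass the threshold, one fails, one is no-data
sample = np.array([0.5, 0.1, 0.05, np.nan])
print(water_area_km2(sample, pixel_size_m=60.0))
```

This counts whole pixels only, so small differences from the yt result are expected wherever the selection region cuts through cells.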

## 2) Maximum Chlorophyll Index

The MCI is calculated for the water pixels selected above, using `MCI = B5 - B4 - 0.53*(B6 - B4)` as introduced earlier.

In [None]:
mci_data= water[('bands', 'MCI')].d

In [None]:
fig2, (ax1) = plt.subplots(1,1, figsize=(14,6))
sns.histplot(data=mci_data, ax=ax1, color="g")
ax1.set_xlabel('MCI')

## 3) Water temperature

Water surface temperature is estimated from Landsat 8 data, exposed here through the `LS_temperature` field.
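
For context, Landsat Collection 2 Level-2 surface temperature bands (such as the `ST_B10` file loaded earlier) store scaled integers. The sketch below applies the scale factor and offset published by USGS for that product to obtain degrees Celsius. Whether `LS_temperature` performs this exact conversion is an assumption, and the function name is hypothetical.

```python
import numpy as np

# USGS Landsat Collection 2 Level-2 surface temperature scaling
ST_SCALE = 0.00341802   # Kelvin per digital number
ST_OFFSET = 149.0       # Kelvin
ST_FILL = 0             # digital number marking no-data pixels

def st_dn_to_celsius(dn):
    """Convert ST_B10 digital numbers to degrees Celsius; fill pixels become NaN."""
    dn = np.asarray(dn)
    kelvin = dn.astype(float) * ST_SCALE + ST_OFFSET
    kelvin = np.where(dn == ST_FILL, np.nan, kelvin)
    return kelvin - 273.15

# Toy digital numbers: one fill pixel and one valid pixel
print(st_dn_to_celsius(np.array([0, 44177], dtype=np.uint16)))
```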

In [None]:
# Define dimensions
width = ds.arr(5000., 'm')
height = ds.arr(5000.,'m')
rectangle_centre = ds.arr([444808,9951471],'m')

rectangular_yt_container = ds.rectangle_from_center(rectangle_centre,width,height)

In [None]:
rectangular_yt_container[('bands', 'LS_temperature')]

In [None]:
p = ds.plot(('bands', 'LS_temperature'), height=height, width=width, center=rectangle_centre)
p.set_log(('bands', 'LS_temperature'), False)
p.set_cmap(('bands', 'LS_temperature'), 'B-W LINEAR')
p.show()

In [None]:
ds.filename_list

## Sentinel-1 data (THIS MIGHT BE REMOVED)

In [None]:
filepath = 'C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/s1_test/temp.tiff'

In [None]:
ds = yt.load(filepath)

In [None]:
ds.field_list

In [None]:
# Define dimensions
width = ds.arr(1000, 'm')
height = ds.arr(1000,'m') 
rectangle_centre = ds.arr([447768,9950530],'m')

rectangular_yt_container = ds.rectangle_from_center(rectangle_centre,width,height)

In [None]:
rectangular_yt_container[('bands','temp')]

In [None]:
with rasterio.open(filepath) as src:
    print(src.crs)
    print(src.transform)
    
    #rasterio.transform.from_gcps(gcps)

In [None]:
with rasterio.open(filepath, 'r') as src:
    gcps, gcp_crs = src.gcps

In [None]:
gcp_crs

In [None]:
src.dtypes

In [None]:
rasterio.transform.from_gcps(gcps)

In [None]:
new_dataset = rasterio.open("C:/Users/arevi/OneDrive/YT_GITHUB_v2/TEST_DATASETS/new2.tiff",
                             'w', 
                             driver="GTiff",
                             height=src.shape[0],
                             width=src.shape[1],
                             count=1,
                             dtype='uint8',
                             crs='+proj=latlong',
                             transform=rasterio.transform.from_gcps(gcps))


write temp file