# <b>MODIS Water MODAPS Ancillary Validation Notebook</b>

Purpose: Validate ancillary-masked C61 MOD44W products by comparing them with the pre-masked C61 products and with the previous collection, C6 MOD44W.

*Note: We are following an incremental development lifecycle. This notebook is the first version that meets most of the requirements. Expect incremental releases that work toward fully meeting the requirements and expanding what the user can do.*

Installation requirements:

```bash
pip install localtileserver
```

TODO:
- ipysheet for user to input comments
- load layers from toolbar
- move everything inside a class to avoid user input

Some references:

- https://towardsdatascience.com/bring-your-jupyter-notebook-to-life-with-interactive-widgets-bc12e03f0916
- https://github.com/giswqs/geodemo/blob/master/geodemo/common.py

Version: 0.0.1
Date: 02/09/2022

*For DSG internal use*

### <b> WARNING </b>

Do not run all cells at once. Doing so shuts down the local tile servers before you can interact with the map.

Uncomment the line in the next cell if `localtileserver` is not installed.

In [1]:
# !pip install localtileserver

In [1]:
import sys
sys.path.append('../src')
from ingest_mw_hdf import search_hdf_file_path, get_hdf_subdataset_path, read_hdf
from diff_products import read_path

In [2]:
import os
import re
import json
import joblib
import tempfile
import ipysheet
import numpy as np
import pandas as pd
import rasterio as rio
import rioxarray as rxr
import xarray as xr
import geopandas as gpd
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import ipywidgets as widgets
import warnings

from osgeo import gdal
from pprint import pprint

from glob import glob
from ipysheet import from_dataframe
from localtileserver import TileClient, get_leaflet_tile_layer, examples
from ipyleaflet import Map, Marker, basemaps, ScaleControl, LayersControl, AwesomeIcon
from ipyleaflet import LegendControl, FullScreenControl, MarkerCluster, Popup

os.environ['LOCALTILESERVER_CLIENT_PREFIX'] = \
    f"{os.environ['JUPYTERHUB_SERVICE_PREFIX'].lstrip('/')}/proxy/{{port}}"

import localtileserver
from localtileserver import get_leaflet_tile_layer, TileClient

## Tile and year selection

Choose a tile and a year. Use the MODIS grid image below to find the tile ID.

The `h` followed by two digits is the <b>horizontal</b> tile ID. Read it from the column labels of the grid.

The `v` followed by two digits is the <b>vertical</b> tile ID. Read it from the row labels of the grid.

For example, the tile 9 columns to the right of and 5 rows down from the upper-left tile (`h00v00`) is `h09v05`.

Example:
```python
TILE = 'h09v05'
```
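
As a rough sketch of the naming scheme, a tile ID can be built from, or split back into, its zero-based column and row indices. The helper names `format_tile_id` and `parse_tile_id` are hypothetical and not part of this repository.

```python
import re

def format_tile_id(h: int, v: int) -> str:
    """Build a tile ID such as 'h09v05' from zero-based column (h) and row (v) indices."""
    return f'h{h:02d}v{v:02d}'

def parse_tile_id(tile: str) -> tuple:
    """Split a tile ID such as 'h09v05' into its (h, v) integer indices."""
    match = re.fullmatch(r'h(\d{2})v(\d{2})', tile)
    if match is None:
        raise ValueError(f'Not a valid tile ID: {tile!r}')
    return int(match.group(1)), int(match.group(2))

print(format_tile_id(9, 5))
print(parse_tile_id('h12v04'))
```

Validating the ID before building file paths from it can surface a typo earlier than a missing-file error further down the notebook.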

![MODIS Grid Overlay](../imgs/modis_overlay.png)

In [3]:
TILE = 'h12v04'
# TILE = 'h12v05'
# TILE = 'h12v09'
# TILE = 'h11v10'

In [4]:
YEAR = 2020

You should not need to change anything below this point.

In [5]:
MOD44W_C6_BASEPATH = '/explore/nobackup/people/mcarrol2/MODIS_water/v5_outputs/'
MOD44W_C61_BASEPATH = '/explore/nobackup/projects/ilab/data/MODIS/PRODUCTION/MODAPS_test1_11302022'
MOD44W_C61_VERSION = '001'
MOD44W_C61_ANCILLARY_BASEPATH = '/explore/nobackup/people/cssprad1/projects/modis_water/' + \
    'code/ancillary_masks/modis_water_src_change/data'
MOD44W_C61_ANCILLARY_VERSION = '001'
MOD44W = 'MOD44W'

C6_FILE_TYPE = '.tif'
C61_FILE_TYPE = '.hdf'
C61_ANCILLARY_FILE_TYPE = '.bin'

TMP_FILE_TYPE = '.tif'

C61_ANCILLARY_ALG_TYPE = 'Simple'
C61_ANCILLARY_WATER_MASK = 'AnnualWaterProduct'
C61_ANCILLARY_WATER_MASK_QA = 'AnnualWaterProductQA'
C61_ANCILLARY_SEVEN_CLASS = 'AnnualSevenClass'

HDF_NAME_PRE_STR: str = 'MOD44W.A'
HDF_PRE_STR: str = 'HDF4_EOS:EOS_GRID:"'
WATER_MASK_POST_STR: str = '":MOD44W_250m_GRID:water_mask'
SEVEN_CLASS_POST_STR: str = '":MOD44W_250m_GRID:seven_class'
QA_MASK_POST_STR: str = '":MOD44W_250m_GRID:water_mask_QA'

SEVEN_CLASS = 'seven_class'
WATER_MASK = 'water_mask'
WATER_MASK_QA = 'qa_mask'

if YEAR > 2019:
    warnings.warn('Using 2019 C6 MOD44W')
    MOD44_C6_YEAR = 2019
else:
    MOD44_C6_YEAR = YEAR

tiles_basemap: str = 'https://mt1.google.com/vt/lyrs=s&x={x}&y={y}&z={z}'
water_c6_cmap: list = ['#E3B878', '#2d7d86']
water_c61_cmap: list = ['#194d33', '#8ed1fc']
water_c61_ancillary_cmap: list = ['#0057d7', '#6482ff']
water_qa_cmap: list = ['#79d2a6', '#ff6900', '#e4efe9']
water_ancillary_qa_cmap: list = ['#31ff00', '#91ffd3', '#e64a19', '#dd00ff', '#F78DA7', '#ff6900', '#767676', '#79d2a6']

CACHE_DIR = '.cache'
os.makedirs(CACHE_DIR, exist_ok=True)



In [6]:
mod44w_c6_path = os.path.join(MOD44W_C6_BASEPATH, str(MOD44_C6_YEAR), f'MOD44W_{TILE}_{MOD44_C6_YEAR}_v5.tif')
if not os.path.exists(mod44w_c6_path):
    raise FileNotFoundError(f'Could not find the MOD44W C6 file: {mod44w_c6_path}')

In [7]:
def parse_qa(qa_array):
    qa_array_parsed = np.where(qa_array == 0, 0, -1)
    qa_array_parsed = np.where(qa_array == 4, 1, qa_array_parsed)
    qa_array_parsed = np.where(qa_array == 6, 2, qa_array_parsed)
    qa_array_parsed = np.where(qa_array == 9, 3, qa_array_parsed)
    return qa_array_parsed

def parse_ancillary_qa(qa_array):
    qa_array_parsed = np.where(qa_array == 0, 0, qa_array)
    qa_array_parsed = np.where(qa_array == 10, 0, qa_array_parsed)
    qa_array_parsed = np.where(qa_array == 9, 8, qa_array_parsed)
    print(np.unique(qa_array_parsed))
    return qa_array_parsed
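
To make the remapping in `parse_qa` easier to check by eye, here is a minimal pure-Python sketch of the same value mapping applied to a flat list. The names `QA_REMAP` and `remap_qa_values` are illustrative only; the notebook itself uses the chained `np.where` calls above.

```python
# Raw QA value -> display class, mirroring parse_qa above.
# Any raw value not listed here maps to -1.
QA_REMAP = {0: 0, 4: 1, 6: 2, 9: 3}

def remap_qa_values(values):
    """Apply QA_REMAP to a flat list of raw QA values."""
    return [QA_REMAP.get(value, -1) for value in values]

raw = [0, 4, 6, 9, 2]
print(remap_qa_values(raw))
```

Display classes 1 to 3 appear to line up with the three colors in `water_qa_cmap`, while class 0 is treated as nodata when the layer is rendered.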

In [8]:
hdf_base_path = os.path.join(MOD44W_C61_BASEPATH, MOD44W, str(YEAR), MOD44W_C61_VERSION)
c61_file_path = search_hdf_file_path(hdf_base_path, YEAR, TILE)

c61_water_mask_subdataset_path = get_hdf_subdataset_path(c61_file_path, WATER_MASK_POST_STR)
c61_qa_mask_subdataset_path = get_hdf_subdataset_path(c61_file_path, QA_MASK_POST_STR)

mod44w_c61_data_array = read_hdf(c61_water_mask_subdataset_path)['ndarray']
mod44w_c61_qa_data_array = read_hdf(c61_qa_mask_subdataset_path)['ndarray']

mod44w_c61_ancillary_regex = os.path.join(MOD44W_C61_ANCILLARY_BASEPATH, str(YEAR), TILE, f'MOD44W.A{YEAR}.{TILE}.{C61_ANCILLARY_ALG_TYPE}.{C61_ANCILLARY_WATER_MASK}.*{C61_ANCILLARY_FILE_TYPE}')
mod44w_c61_ancillary_qa_regex = os.path.join(MOD44W_C61_ANCILLARY_BASEPATH, str(YEAR), TILE, f'MOD44W.A{YEAR}.{TILE}.{C61_ANCILLARY_ALG_TYPE}.{C61_ANCILLARY_WATER_MASK_QA}.*{C61_ANCILLARY_FILE_TYPE}')

mod44w_c61_ancillary_path = sorted(glob(mod44w_c61_ancillary_regex))[0]
mod44w_c61_ancillary_qa_path = sorted(glob(mod44w_c61_ancillary_qa_regex))[0]

mod44w_c61_ancillary_data_dict = read_path(mod44w_c61_ancillary_path)
mod44w_c61_ancillary_data_array = mod44w_c61_ancillary_data_dict['ndarray']
mod44w_c61_ancillary_qa_data_array = read_path(mod44w_c61_ancillary_qa_path)['ndarray']

mod44w_c61_qa_data_array = parse_qa(mod44w_c61_qa_data_array)
mod44w_c61_ancillary_qa_data_array = parse_ancillary_qa(mod44w_c61_ancillary_qa_data_array)

[0 1 2 3 4 5 6 7 8]
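
The constants `HDF_PRE_STR` and `WATER_MASK_POST_STR` suggest that the subdataset paths follow GDAL's HDF4-EOS naming convention. The sketch below shows how such a name could be assembled. It is an assumption about what `get_hdf_subdataset_path` does, not a copy of it, and both `build_subdataset_name` and the file path are hypothetical.

```python
HDF_PRE_STR = 'HDF4_EOS:EOS_GRID:"'
WATER_MASK_POST_STR = '":MOD44W_250m_GRID:water_mask'

def build_subdataset_name(hdf_path: str, post_str: str) -> str:
    """Wrap an HDF file path in the prefix and suffix that name one grid field."""
    return f'{HDF_PRE_STR}{hdf_path}{post_str}'

# Hypothetical file path, for illustration only
example_path = '/tmp/example.hdf'
print(build_subdataset_name(example_path, WATER_MASK_POST_STR))
```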


In [9]:
crs = 'PROJCS["Sinusoidal",GEOGCS["Sphere",DATUM["Sphere",SPHEROID["Sphere",6371000,0]],PRIMEM["Greenwich",0],' + \
    'UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]]],PROJECTION["Sinusoidal"]' + \
    ',PARAMETER["longitude_of_center",0],PARAMETER["false_easting",0],PARAMETER["false_northing",0]' + \
',UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH]]'
mod44w_c6_ds = gdal.Open(mod44w_c6_path)
transform = mod44w_c6_ds.GetGeoTransform()
mod44w_c6_ds = None  # release the GDAL dataset handle
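
`GetGeoTransform` returns six coefficients that map pixel indices to projected coordinates. A small sketch of that affine mapping follows. `pixel_to_coords` is a hypothetical helper, and the example coefficients are made-up round numbers, not the values of any real tile.

```python
def pixel_to_coords(geotransform, col, row):
    """Map a pixel's (col, row) index to the projected (x, y) of its upper-left corner."""
    x_origin, pixel_width, row_rotation, y_origin, col_rotation, pixel_height = geotransform
    x = x_origin + col * pixel_width + row * row_rotation
    y = y_origin + col * col_rotation + row * pixel_height
    return x, y

# Made-up north-up geotransform: origin (100, 200), 10-unit pixels, no rotation.
# pixel_height is negative because row indices increase downward.
example_gt = (100.0, 10.0, 0.0, 200.0, 0.0, -10.0)
print(pixel_to_coords(example_gt, col=2, row=3))
```

Because the C61 arrays are written out with the C6 file's geotransform, the approach in this notebook relies on both products sharing the same grid for a given tile.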

In [13]:
def open_and_write_temp(data_array, transform, projection,
                        year, tile, name = None, files_to_rm = None) -> str:
    tmpdir = tempfile.gettempdir()
    # `!whoami` returns a list of output lines; the first entry is the user ID
    userid = !whoami
    name_to_use = data_array.name if not name else name
    tempfile_name = f'MOD44W.A{year}001.{tile}.061.{name_to_use}.{userid[0]}.tif'
    tempfile_fp = os.path.join(tmpdir, tempfile_name)
    if os.path.exists(tempfile_fp):
        os.remove(tempfile_fp)
    driver = gdal.GetDriverByName('GTiff')
    outDs = driver.Create(tempfile_fp, 4800, 4800, 
                          1, gdal.GDT_Float32, 
                          options=['COMPRESS=LZW'])
    outDs.SetGeoTransform(transform)
    outDs.SetProjection(projection)
    outBand = outDs.GetRasterBand(1)
    outBand.WriteArray(data_array)
    outBand.SetNoDataValue(250)
    outDs.FlushCache()
    # Release the band before the dataset so the file is closed cleanly
    outBand = None
    outDs = None
    driver = None
    # Record the file so the cleanup cell can delete it later
    if files_to_rm is not None:
        files_to_rm.append(tempfile_fp)
    return tempfile_fp

In [11]:
temporary_files_to_delete = []

mod44w_c61_water_mask = open_and_write_temp(mod44w_c61_data_array, transform, crs, YEAR, TILE, name='c61_mask', files_to_rm=temporary_files_to_delete)
mod44w_c61_water_mask_qa = open_and_write_temp(mod44w_c61_qa_data_array, transform, crs,  YEAR, TILE, name='qa_mask', files_to_rm= temporary_files_to_delete)

mod44w_c61_ancillary_water_mask = open_and_write_temp(mod44w_c61_ancillary_data_array, transform, crs, YEAR, TILE, name='ancillary_c61_mask', files_to_rm=temporary_files_to_delete)
mod44w_c61_ancillary_water_mask_qa = open_and_write_temp(mod44w_c61_ancillary_qa_data_array, transform, crs,  YEAR, TILE, name='ancillary_qa_mask', files_to_rm= temporary_files_to_delete)

In [12]:
mod44w_c6_client = TileClient(mod44w_c6_path)
mod44w_c61_water_client = TileClient(mod44w_c61_water_mask)
mod44w_c61_water_qa_client = TileClient(mod44w_c61_water_mask_qa)
mod44w_c61_ancillary_water_client = TileClient(mod44w_c61_ancillary_water_mask)
mod44w_c61_ancillary_qa_client = TileClient(mod44w_c61_ancillary_water_mask_qa)

In [14]:
mod44w_c6_water_mask_layer = get_leaflet_tile_layer(
    mod44w_c6_client, nodata=0, show=False, 
    vmin=0, vmax=1,
    cmap=water_c6_cmap, 
    name=f'MOD44W C6 Water Mask {YEAR} {TILE}',
    max_zoom=20)

mod44w_c61_water_mask_layer = get_leaflet_tile_layer(
    mod44w_c61_water_client, nodata=0, show=False,
    vmin=0, vmax=1,
    cmap=water_c61_cmap, 
    name=f'MOD44W C61 Water Mask {YEAR} {TILE}',
    max_zoom=20)

mod44w_c61_qa_layer = get_leaflet_tile_layer(
    mod44w_c61_water_qa_client, nodata=0,
    n_colors=3, show=False,
    vmin=1, vmax=3,
    cmap=water_qa_cmap, 
    name=f'MOD44W C61 QA Mask {YEAR} {TILE}',
    max_zoom=20)


mod44w_c61_ancillary_water_mask_layer = get_leaflet_tile_layer(
    mod44w_c61_ancillary_water_client, nodata=0, show=False,
    vmin=0, vmax=1,
    cmap=water_c61_ancillary_cmap, 
    name=f'MOD44W C61 ANCILLARY MASKED Water Mask {YEAR} {TILE}',
    max_zoom=20)

mod44w_c61_ancillary_qa_layer = get_leaflet_tile_layer(
    mod44w_c61_ancillary_qa_client , nodata=0,
    n_colors=5, show=False,
    vmin=1, vmax=8,
    scheme='discrete',
    cmap=water_ancillary_qa_cmap, 
    name=f'MOD44W C61 ANCILLARY MASKED QA Mask {YEAR} {TILE}',
    max_zoom=20)

In [15]:
legend_dict = {}

c61_qa_mask_legend_dict = {'QA- High Slope Surface': '#79d2a6', #'#8ED1FC',
                           'QA- Burn Scar (from MCD64A1)': '#ff6900', # '#ABB8C3'
                           'QA- No data (outside of projected area)': '#e4efe9'}

c61_ancillary_qa_mask_legend_dict = {
    'ANCILLARY QA- High Conf Water': '#31ff00',
    'ANCILLARY QA- Low Conf Water': '#91ffd3',
    'ANCILLARY QA- Low Conf Land': '#e64a19',
    'ANCILLARY QA- Ocean Mask': '#dd00ff',
    'ANCILLARY QA- Ocean Mask but no water': '#F78DA7',
    'ANCILLARY QA- Burn Scar (from MCD64A1)': '#ff6900', # '#ABB8C3'
    'ANCILLARY QA- Impervious Surface:': '#767676',
    'ANCILLARY QA- High Slope Surface': '#79d2a6',} #'#8ED1FC',
    # 'ANCILLARY QA- Changed to land': '#f6e656'}

c6_water_mask_legend_dict = {'C6- Water': '#2d7d86'}
c61_water_mask_legend_dict = {'C61- Water': '#8ed1fc'} 
c61_ancillary_water_mask_legend_dict = {'C61 ANCILLARY MASKED- Water': '#6482ff'}

legend_dict.update(c61_qa_mask_legend_dict)
legend_dict.update(c61_ancillary_qa_mask_legend_dict)
legend_dict.update(c6_water_mask_legend_dict)
legend_dict.update(c61_water_mask_legend_dict)
legend_dict.update(c61_ancillary_water_mask_legend_dict)

c61_mask_legend = LegendControl(legend_dict)

In [16]:
def get_location(cache_dir: str, tile: str, def_location: list) -> list:
    cache_fp = os.path.join(cache_dir, f'{tile}.marker.location.sv')
    if os.path.exists(cache_fp):
        location = joblib.load(cache_fp)
    else:
        location = def_location
    return location

def cache_location(tile: str, location: list) -> None:
    cache_fp = os.path.join(CACHE_DIR, f'{tile}.marker.location.sv')
    joblib.dump(location, cache_fp)

def initialize_marker(tile: str, location: list, cache_dir: str) -> Marker:
    name = 'Location Marker'
    location = get_location(cache_dir, tile, location)
    marker = Marker(name=name, title=name, location=location)
    return marker

def initialize_message(location: list) -> widgets.HTML:
    ll_message = widgets.HTML()
    ll_message.value = str(location)
    return ll_message
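
The load-or-default pattern behind `get_location` and `cache_location` can be sketched with the standard library alone. This version swaps `joblib` for `json` and uses hypothetical names (`load_location`, `save_location`, and the `.json` file name). It illustrates the idea rather than reproducing the notebook's helpers.

```python
import json
import os
import tempfile

def load_location(cache_dir, tile, default):
    """Return the cached marker location for a tile, or the default if none is cached."""
    cache_fp = os.path.join(cache_dir, f'{tile}.marker.location.json')
    if os.path.exists(cache_fp):
        with open(cache_fp) as f:
            return json.load(f)
    return default

def save_location(cache_dir, tile, location):
    """Write the marker location for a tile to the cache directory."""
    cache_fp = os.path.join(cache_dir, f'{tile}.marker.location.json')
    with open(cache_fp, 'w') as f:
        json.dump(location, f)

with tempfile.TemporaryDirectory() as cache_dir:
    first = load_location(cache_dir, 'h12v04', [0.0, 0.0])   # nothing cached yet
    save_location(cache_dir, 'h12v04', [45.0, -79.3])
    second = load_location(cache_dir, 'h12v04', [0.0, 0.0])  # now read from the cache

print(first, second)
```

Keying the cache file on the tile ID means each tile remembers its own marker position between sessions.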

In [17]:
m = Map(
    center=mod44w_c6_client.center(),
    zoom=mod44w_c6_client.default_zoom,
    basemap=basemaps.Esri.WorldImagery,
    scroll_wheel_zoom=True,
    keyboard=True,
    layout=widgets.Layout(height='600px')
)
marker_location = mod44w_c6_client.center()
marker = initialize_marker(tile=TILE, location=marker_location, cache_dir=CACHE_DIR)
latlon_message = initialize_message(marker.location)

def handle_click(**kwargs):
    latlon_message.value = str(marker.location)
    marker.popup = latlon_message
    cache_location(tile=TILE, location=marker.location)

m.add_layer(marker)
marker.on_click(handle_click)
m.add_layer(mod44w_c6_water_mask_layer)
m.add_layer(mod44w_c61_water_mask_layer)
m.add_layer(mod44w_c61_ancillary_water_mask_layer)
m.add_layer(mod44w_c61_qa_layer)
m.add_layer(mod44w_c61_ancillary_qa_layer)
m.add_control(c61_mask_legend)
m.add_control(ScaleControl(position='bottomleft'))
m.add_control(LayersControl(position='topright'))
m.add_control(FullScreenControl())

## MODIS Water Ancillary Validation Map Visualization

<b>Usage Tips:</b>

- ![Layer Control](../imgs/layer_control.png)    Hover over to select and deselect which layers are visible

- ![Full Screen Control](../imgs/full_screen.png)    Click for full screen

- Use the scroll wheel on the mouse to zoom in and out, or use [+] and [-]

The legend lists entries for every layer, whether or not that layer is currently visible. Each entry is prefixed with the layer it belongs to:

- "QA-": MOD44W C61 QA Mask

- "ANCILLARY QA-": MOD44W C61 QA Mask with ancillary masking applied

- "C6-": MOD44W C6 Water Mask

- "C61-": MOD44W C61 Water Mask

- "C61 ANCILLARY MASKED-": MOD44W C61 with ancillary masking applied

In [25]:
display(m)

Map(center=[45.00004911190665, -79.30705743147163], controls=(ZoomControl(options=['position', 'zoom_in_text',…

In [26]:
userid = !whoami
notes_path = f'../notes/{TILE}-{userid[0]}-notes.csv'
if os.path.exists(notes_path):
    notes_df = pd.read_csv(notes_path)
    notes_df = notes_df.drop(columns=['Unnamed: 0'])
    sheet_notes = ipysheet.from_dataframe(notes_df)
else:
    tile = [' ' for _ in range(75)]
    year = [' ' for _ in range(75)]
    location = [' ' for _ in range(75)]
    note = [' ' for _ in range(75)]
    data = {'Tile': tile, 'Year': year, 'Location': location, 'Note': note}
    notes_df = pd.DataFrame(data=data)
    sheet_notes = ipysheet.from_dataframe(notes_df)
sheet_notes.column_width = [3,3,4,10]
sheet_notes.layout = widgets.Layout(width='100%',height='100%')
sheet_notes

Sheet(cells=(Cell(column_end=0, column_start=0, numeric_format=None, row_end=74, row_start=0, squeeze_row=Fals…

## Save notes

Run this cell to save the notes to `notes_path`, a per-tile, per-user CSV file under `../notes/`.

In [27]:
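sheet_notes_df = ipysheet.to_dataframe(sheet_notes)
# Create the notes directory if it does not exist yet
os.makedirs(os.path.dirname(notes_path), exist_ok=True)
sheet_notes_df.to_csv(notes_path)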

### <b>DO NOT RUN THIS CELL UNTIL FINISHED WITH VALIDATION</b>
*Note: This will shut down the local tile servers*

*Ignore warnings such as:*
```
Server for key (default) not found.
```

In [28]:
# Iterate over a copy: removing items from a list while looping over it skips entries
for path_to_delete in list(temporary_files_to_delete):
    if os.path.exists(path_to_delete):
        os.remove(path_to_delete)
    temporary_files_to_delete.remove(path_to_delete)

mod44w_c6_client.shutdown(True)