# Introduction

This Notebook preprocesses NASA VNP46A2 HDF5 files. It takes raw `.h5` files and completes the following preprocessing tasks:

* Extracts radiance and quality flag bands;
* Masks radiance for fill values, clouds, and sea water;
* Fills masked data with NaN values;
* Creates a georeferencing transform;
* Creates export metadata; and,
* Exports radiance data to GeoTiff format.

This Notebook uses the following folder structure:

```
├── 01-code-scripts
│   ├── clip_vnp46a1.ipynb
│   ├── clip_vnp46a1.py
│   ├── concatenate_vnp46a1.ipynb
│   ├── concatenate_vnp46a1.py
│   ├── download_laads_order.ipynb
│   ├── download_laads_order.py
│   ├── preprocess_vnp46a1.ipynb
│   ├── preprocess_vnp46a1.py
│   ├── preprocess_vnp46a2.ipynb
│   ├── preprocess_vnp46a2.py
│   └── viirs.py
├── 02-raw-data
├── 03-processed-data
├── 04-graphics-outputs
└── 05-papers-writings
```

By default, the Notebook expects to run from the `01-code-scripts/` folder: the environment setup changes the working directory to the parent folder, and all data paths are relative to it. If the Notebook runs from a different folder, adjust the paths in the environment setup section.

# Environment Setup

In [None]:
# Load Notebook formatter
%load_ext nb_black
# %reload_ext nb_black

In [None]:
# Import packages
import os
import warnings
import glob
import viirs

In [None]:
# Set options
warnings.simplefilter("ignore")

In [None]:
# Set working directory
os.chdir("..")

# User-Defined Variables

In [None]:
# Define path to folder containing input VNP46A2 HDF5 files
hdf5_input_folder = os.path.join(
    "02-raw-data", "hdf", "south-korea", "vnp46a2"
)

# Define path to output folder to store exported GeoTiff files
geotiff_output_folder = os.path.join(
    "03-processed-data", "raster", "south-korea", "vnp46a2-grid"
)

# Data Preprocessing

In [None]:
# Preprocess each HDF5 file: extract bands; mask fill values,
#  poor-quality and no-retrieval pixels, clouds, and sea water;
#  fill masked values with NaN; export to GeoTiff
# Sort the file list so the processing order is reproducible
hdf5_files = sorted(glob.glob(os.path.join(hdf5_input_folder, "*.h5")))
total_files = len(hdf5_files)
for processed_files, hdf5 in enumerate(hdf5_files, start=1):
    viirs.preprocess_vnp46a2(
        hdf5_path=hdf5, output_folder=geotiff_output_folder
    )
    print(f"Preprocessed file: {processed_files} of {total_files}\n\n")

# Notes and References

**File download:**

VNP46A2 HDF5 files were first downloaded using the `01-code-scripts/download_laads_order.py` script. This script requires a valid [NASA Earthdata](https://urs.earthdata.nasa.gov/) account and an existing order for the files.

<br>

**Useful links:**

* [VNP46A2 Product Information](https://ladsweb.modaps.eosdis.nasa.gov/missions-and-measurements/products/VNP46A2/)
* [VIIRS Black Marble User Guide](https://viirsland.gsfc.nasa.gov/PDF/VIIRS_BlackMarble_UserGuide.pdf)
* [NASA Earthdata Scripts](https://git.earthdata.nasa.gov/projects/LPDUR/repos/nasa-viirs/browse/scripts)

<br>

**File naming convention:**

`VNP46A2.AYYYYDDD.hXXvYY.CCC.YYYYDDDHHMMSS.h5`

* VNP46A2 = Short-name
* AYYYYDDD = Acquisition Year and Day of Year
* hXXvYY = Tile Identifier (horizontalXXverticalYY)
* CCC = Collection Version
* YYYYDDDHHMMSS = Production Date – Year, Day, Hour, Minute, Second
* h5 = Data Format (HDF5)
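
The naming convention above can be parsed with a regular expression, which is handy when selecting files by date or tile. The following is a minimal sketch: the function name `parse_vnp46a2_name`, the assumption that the collection version is exactly three digits, and the example file name are all hypothetical and not part of `viirs.py`.

```python
import re
from datetime import datetime

# Pattern follows the naming convention listed above; the three-digit
# collection width is an assumption based on the "CCC" placeholder.
VNP46A2_PATTERN = re.compile(
    r"^VNP46A2\.A(?P<acquired>\d{7})\.h(?P<h>\d{2})v(?P<v>\d{2})"
    r"\.(?P<collection>\d{3})\.(?P<produced>\d{13})\.h5$"
)


def parse_vnp46a2_name(filename):
    """Split a VNP46A2 file name into its documented components."""
    match = VNP46A2_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Not a VNP46A2 file name: {filename}")
    return {
        # %j parses the three-digit day of year
        "acquisition_date": datetime.strptime(
            match["acquired"], "%Y%j"
        ).date(),
        "tile_h": int(match["h"]),
        "tile_v": int(match["v"]),
        "collection": match["collection"],
        "production_time": datetime.strptime(
            match["produced"], "%Y%j%H%M%S"
        ),
    }


# Hypothetical file name, used only to illustrate the pattern
example = parse_vnp46a2_name("VNP46A2.A2020001.h30v05.001.2020004003738.h5")
print(example["acquisition_date"], example["tile_h"], example["tile_v"])
```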

<br>

**Bands of interest (User Guide pp. 12-13):**

| Scientific Dataset     | Units             | Description                 | Data Type               | Fill Value | Valid Range | Scale Factor | Offset |
|:-----------------------|:------------------|:----------------------------|:------------------------|:-----------|:------------|:-------------|:-------|
| DNB_BRDF-Corrected_NTL | nW_per_cm2_per_sr | BRDF corrected DNB NTL      | 16-bit unsigned integer | 65,535     | 0 - 65,534  | 0.1          | 0.0    |
| Mandatory_Quality_Flag | Unitless          | Mandatory quality flag      | 8-bit unsigned integer  | 255        | 0 - 3       | N/A          | N/A    |
| QF_Cloud_Mask          | Unitless          | Quality flag for cloud mask | 16-bit unsigned integer | 65,535     | 0 - 65,534  | N/A          | N/A    |
| Snow_Flag              | Unitless          | Flag for snow cover         | 8-bit unsigned integer  | 255        | 0 - 1       | N/A          | N/A    |
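
The fill value, scale factor, and offset in the radiance row combine as `radiance = raw * scale_factor + offset`, with fill pixels excluded. The sketch below illustrates this with NumPy, assuming the raw band holds integer counts as listed in the table; the function name `scale_radiance` and the sample array are made up for illustration and are not taken from `viirs.py`.

```python
import numpy as np

# Values taken from the DNB_BRDF-Corrected_NTL row of the table above
FILL_VALUE = 65535
SCALE_FACTOR = 0.1
OFFSET = 0.0


def scale_radiance(raw_ntl):
    """Convert raw counts to radiance, setting fill pixels to NaN."""
    raw_ntl = np.asarray(raw_ntl)
    # astype() returns a new float array, so the input is left unchanged
    radiance = raw_ntl.astype("float64") * SCALE_FACTOR + OFFSET
    radiance[raw_ntl == FILL_VALUE] = np.nan
    return radiance


# Made-up 2x2 array of raw counts; 65535 marks a fill pixel
raw = np.array([[0, 125], [65535, 40]], dtype="uint16")
scaled = scale_radiance(raw)
print(scaled)
```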

<br>

**Masking Criteria/Workflow:**

* mask where `dnb_brdf_corrected_ntl == 65535` (Fill Value)
* mask where `mandatory_quality_flag == 2` (Poor Quality)
* mask where `mandatory_quality_flag == 255` (No Retrieval)
* mask where `cloud_detection_bitmask == 2` (Probably Cloudy)
* mask where `cloud_detection_bitmask == 3` (Confident Cloudy)
* mask where `land_water_bitmask == 3` (Sea Water)
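
The criteria above can be combined into a single boolean mask. This is a minimal NumPy sketch, assuming the four arrays have already been extracted and share the same shape; the function name `build_mask` and the sample values are hypothetical, and the actual implementation in `viirs.py` may differ.

```python
import numpy as np


def build_mask(
    dnb_brdf_corrected_ntl,
    mandatory_quality_flag,
    cloud_detection_bitmask,
    land_water_bitmask,
):
    """Return True wherever a pixel meets any masking criterion."""
    return (
        (dnb_brdf_corrected_ntl == 65535)  # fill value
        | (mandatory_quality_flag == 2)  # poor quality
        | (mandatory_quality_flag == 255)  # no retrieval
        | (cloud_detection_bitmask == 2)  # probably cloudy
        | (cloud_detection_bitmask == 3)  # confident cloudy
        | (land_water_bitmask == 3)  # sea water
    )


# Made-up values for five pixels, one array per input band
ntl = np.array([120, 65535, 80, 45, 300])
quality = np.array([0, 255, 2, 0, 1])
cloud = np.array([0, 0, 0, 3, 1])
land_water = np.array([1, 1, 1, 1, 3])

mask = build_mask(ntl, quality, cloud, land_water)
# Replace masked pixels with NaN (the result is a float array)
masked_ntl = np.where(mask, np.nan, ntl)
print(mask)
```

Only the first sample pixel passes every check; the other four each trigger at least one criterion.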

<br>

**Preprocessing Workflow:**

* Extract bands
* Apply scale factor
* Mask for fill values
* Mask for poor quality and no retrieval
* Mask for clouds
* Mask for sea water
* Fill masked values
* Create transform
* Create metadata
* Export array to GeoTiff
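
For the "Create transform" step, the georeferencing of a tile can be derived from its `hXXvYY` identifier. The sketch below rests on assumptions that are not stated in this Notebook: that tiles are 10° by 10° on a linear latitude/longitude grid, that each tile is 2400 by 2400 pixels, and that tile `h00v00` starts at 180° W, 90° N. The function name `tile_geotransform` is hypothetical, and the six coefficients are returned in GDAL-style order; `viirs.py` may build its transform differently.

```python
TILE_SIZE_DEGREES = 10  # assumed tile width and height in degrees
PIXELS_PER_TILE = 2400  # assumed number of rows and columns per tile


def tile_geotransform(tile_h, tile_v):
    """Return (west, pixel_width, 0, north, 0, -pixel_height) for a tile."""
    # Upper-left corner, assuming h00v00 starts at (-180, 90)
    west = -180 + tile_h * TILE_SIZE_DEGREES
    north = 90 - tile_v * TILE_SIZE_DEGREES
    pixel_size = TILE_SIZE_DEGREES / PIXELS_PER_TILE
    # Negative pixel height because rows run from north to south
    return (west, pixel_size, 0.0, north, 0.0, -pixel_size)


# Tile h30v05 chosen only as an illustration
transform = tile_geotransform(30, 5)
print(transform)
```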

<br>

**QF_Cloud_Mask (base-10) (Adapted from User Guide p. 14):**

| Bit | Flag Description Key                          | Interpretation                                                                            |
|:-----|:-----------------------------------------------|:-------------------------------------------------------------------------------------------|
| 0   | Day/Night                                     | 0 = Night <br> 1 = Day                                                                         |
| 1-3 | Land/Water Background                         | 0 = Land & Desert <br> 1 = Land no Desert <br> 2 = Inland Water <br> 3 = Sea Water <br> 5 = Coastal |
| 4-5 | Cloud Mask Quality                            | 0 = Poor <br> 1 = Low <br> 2 = Medium <br> 3 = High                                                  |
| 6-7 | Cloud Detection Results & Confidence Indicator | 0 = Confident Clear <br> 1 = Probably Clear <br> 2 = Probably Cloudy <br> 3 = Confident Cloudy     |
| 8   | Shadow Detected                               | 0 = No <br> 1 = Yes                                                                             |
| 9   | Cirrus Detection (IR) (BTM15 – BTM16)         | 0 = No Cloud <br> 1 = Cloud                                                                   |
| 10  | Snow/Ice Surface                              | 0 = No Snow/Ice <br> 1 = Snow/Ice     |
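
Each field in the table above can be recovered from a `QF_Cloud_Mask` value by shifting and masking bits. The following sketch assumes that bit 0 is the least significant bit; the helper names `extract_bits` and `decode_qf_cloud_mask` and the example value are hypothetical. Fields 1-3 and 6-7 correspond to `land_water_bitmask` and `cloud_detection_bitmask` in the masking criteria.

```python
def extract_bits(value, start_bit, bit_count):
    """Return the integer held in `bit_count` bits starting at `start_bit`."""
    return (value >> start_bit) & ((1 << bit_count) - 1)


def decode_qf_cloud_mask(value):
    """Split a QF_Cloud_Mask value into the fields listed in the table."""
    return {
        "day_night": extract_bits(value, 0, 1),
        "land_water_bitmask": extract_bits(value, 1, 3),
        "cloud_mask_quality": extract_bits(value, 4, 2),
        "cloud_detection_bitmask": extract_bits(value, 6, 2),
        "shadow_detected": extract_bits(value, 8, 1),
        "cirrus_detection": extract_bits(value, 9, 1),
        "snow_ice_surface": extract_bits(value, 10, 1),
    }


# Made-up value: sea water (3 in bits 1-3), high cloud mask quality
# (3 in bits 4-5), and confident cloudy (3 in bits 6-7)
example_value = (3 << 1) | (3 << 4) | (3 << 6)
decoded = decode_qf_cloud_mask(example_value)
print(example_value, decoded)
```

Because `extract_bits` uses only shift and bitwise-and operators, the same function should also work element-wise on an integer NumPy array.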

<br>

**Mandatory_Quality_Flag (base-10) (User Guide p. 16):**

| Value | Retrieval Quality | Algorithm Instance                                                      |
|:-------|:-------------------|:-------------------------------------------------------------------------|
| 0     | High-quality      | Main algorithm (Persistent nighttime lights)                            |
| 1     | High-quality      | Main algorithm (Ephemeral Nighttime Lights)                             |
| 2     | Poor-quality      | Main algorithm (Outlier, potential cloud contamination or other issues) |
| 255   | No retrieval      | Fill value                                                              |

<br>

**Snow_Flag (base-10) (User Guide p. 16):**

| Flag Description Key | Value | Interpretation |
|:---------------------|:------|:---------------|
| Snow/Ice Surface     | 0     | No Snow/Ice    |
| Snow/Ice Surface     | 1     | Snow/Ice       |
| Snow/Ice Surface     | 255   | Fill Value     |