## Script to replace missing MODIS Terra Files with the respective File from the Aqua Satellite 

After downloading the ten-year daily MODIS Terra and Aqua time series, a script is needed that searches for missing Terra files and replaces each of them with the respective file from the Aqua folder. In the rare case that there is neither a Terra nor an Aqua file for a day, a new file containing only zeros is created and saved in the Terra folder instead.

#### Before running this script: 

 1. Use the script Download_MODIS.ipynb to download the required MODIS snow cover data from Terra and Aqua.

In [None]:
# import the needed modules
import os
import shutil
import xarray
import h5py
import glob
import sys
import rasterio
from osgeo import gdal
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import pandas as pd

In [None]:
# define folders and variables

# folders
MODIS_terra_file_folder = 'path/to/your/MODIS/Terra/folder/'
MODIS_aqua_file_folder = 'path/to/your/MODIS/Aqua/folder/'

# variables
NDSI_threshold_not_so_certain_snow = 10
NDSI_threshold_quite_certain_snow = 40

year_min = 2012
year_max = 2022

### Define important Functions

#### 1. Read MODIS Files

Function to read the MODIS Terra files. It creates an array of zeros with the same shape as the original data; this array stores the processed snow cover data. Conditional assignments then classify the pixels of the MODIS data into snow, uncertain snow, snow free, inland water, ocean, and missing data. In addition, the quality assurance (QA) band of each file is read as an array and each pixel's QA value is unpacked into its binary representation. Bit 0 of this representation flags inland water and is used to reassign the affected pixels to the "water" category.
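
The classification described above can be tried out on a small synthetic array without any MODIS file. The following is only a sketch: the helper name `classify_ndsi` and the toy input arrays are assumptions made for illustration, while the thresholds and class codes follow the values used in this notebook.

```python
import numpy as np

# Thresholds as defined in this notebook (NDSI scaled to 0-100).
NDSI_NOT_SO_CERTAIN = 10
NDSI_QUITE_CERTAIN = 40

def classify_ndsi(raw, qa):
    """Hypothetical helper: map raw NDSI values and QA flags to class codes."""
    out = np.zeros(raw.shape, dtype=np.uint8)                    # 0: missing data
    out = np.where(raw < NDSI_NOT_SO_CERTAIN, 32, out)           # 32: snow free
    out = np.where((raw >= NDSI_NOT_SO_CERTAIN) & (raw < NDSI_QUITE_CERTAIN), 64, out)  # 64: uncertain snow
    out = np.where((raw >= NDSI_QUITE_CERTAIN) & (raw <= 100), 128, out)                # 128: snow
    out = np.where(raw == 237, 16, out)                          # 16: inland water
    out = np.where(raw == 239, 8, out)                           # 8: ocean
    # Unpack each QA byte into 8 bits; with bitorder='little', index 0 is bit 0.
    bits = np.unpackbits(qa[:, :, np.newaxis], axis=-1, bitorder="little")
    # Land or snow pixels whose QA bit 0 is set are reassigned to inland water.
    out = np.where((out > 16) & (bits[:, :, 0] == 1), 16, out)
    return out.astype(np.uint8)

# Toy input: three NDSI values, inland water, ocean, and a fill value (250).
raw = np.array([[5, 25, 60], [237, 239, 250]], dtype=np.uint8)
# QA flags: bit 0 is set only for the pixel with NDSI 25.
qa = np.array([[0, 1, 0], [0, 0, 0]], dtype=np.uint8)

classes = classify_ndsi(raw, qa)
print(classes)
```

In this toy case the pixel with an NDSI of 25 would normally be classed as uncertain snow (64), but its QA bit 0 moves it to the water class (16).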

#### 2. Name MODIS Files

Utility function to format the names of the files from the Aqua folder that replace missing Terra files. It takes an integer "n", the day of the year of the Aqua file, and converts it into a three-digit string with leading zeros. MODIS file names encode the day of the year in this form (001 to 365, or 366 in leap years).
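
As a brief illustration of the padding, the sketch below formats a few day numbers and converts one padded date code back into a calendar date. The helper name `day_code` is an assumption for this example and is not part of the notebook.

```python
from datetime import datetime

def day_code(n):
    """Hypothetical helper: pad a day of year to three digits."""
    return f"{n:03d}"

codes = [day_code(n) for n in (7, 60, 366)]
print(codes)

# Year plus three-digit day of year can be parsed with the %j directive.
parsed = datetime.strptime("2016" + day_code(60), "%Y%j").date()
print(parsed)  # day 60 of the leap year 2016
```

Day 60 falls on 29 February in a leap year and on 1 March otherwise, which is why the day range has to extend to 366 for leap years.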

In [None]:
# functions:

# read a MODIS .hdf snow file:
def read_MODIS_snow(file):
    hdf_ds = gdal.Open(file, gdal.GA_ReadOnly)
    band_ds = gdal.Open(hdf_ds.GetSubDatasets()[0][0], gdal.GA_ReadOnly) # 'Snow_Cover_Daily_Tile' v5, 'NDSI_Snow_Cover' v6
    data = band_ds.ReadAsArray()
    temp = np.copy(data)
    data = np.full(temp.shape, 0, dtype=np.uint8) # Missing Data (cloud, polar night etc.)
    data = np.where(temp < NDSI_threshold_not_so_certain_snow, 32, data) # Snow Free (NDSI < 0.1)
    data = np.where((temp >= NDSI_threshold_not_so_certain_snow) & (temp < NDSI_threshold_quite_certain_snow), 64, data) # Snow not so certain  (NDSI >= 0.1 & NDSI < 0.4)
    data = np.where((temp >= NDSI_threshold_quite_certain_snow) & (temp <= 100), 128, data) # Snow (NDSI >= 0.4)
    data = np.where(temp == 237, 16, data) # Inland Water
    data = np.where(temp == 239, 8, data) # Ocean
    snow_qa_ds = gdal.Open(hdf_ds.GetSubDatasets()[2][0], gdal.GA_ReadOnly) # "NDSI_Snow_Cover_Algorithm_Flags_QA"
    snow_qa = snow_qa_ds.ReadAsArray()
    snow_qa = snow_qa[:,:,np.newaxis]
    snow_qa_bits = np.unpackbits(snow_qa, axis=-1, bitorder='little') # Bit 0: Inland water
    data = np.where((data > 16) & (snow_qa_bits[:,:,0]==1), 16, data) #Pixels are re-assigned to water
    return data

# get the right MODIS file name (especially for leap years)
def with_leading_zeros(n):
    # pad the day of year to three digits, e.g. 7 -> "007"
    return str(n).zfill(3)

In [None]:
# create two lists (MODIS Terra and Aqua) and fill them with the respective file paths

# create two empty lists
terra_files, aqua_files = [], []

for filename in os.listdir(MODIS_terra_file_folder) + os.listdir(MODIS_aqua_file_folder):
    parts = filename.split('.')
    # skip files that do not follow the MODIS naming scheme or have another version
    if len(parts) < 4 or parts[3] not in ('006', '061'):
        continue
    if filename.startswith('MOD10A1.') and filename.endswith('hdf'):
        terra_files.append(os.path.join(MODIS_terra_file_folder, filename))
    elif filename.startswith('MYD10A1.') and filename.endswith('hdf'):
        aqua_files.append(os.path.join(MODIS_aqua_file_folder, filename))

#### Replace missing Terra Files

Once the files in the Terra and Aqua folders have been listed, the code loops over every year in the configured range and checks whether it is a leap year. Within each year it iterates over the days from 1 to 365 (or 366 for leap years). For each day, it checks whether a Terra file exists by searching for files that match the MODIS naming convention. If a Terra file is found, it proceeds to the next day. If none is found, it searches the Aqua folder using the same pattern. If an Aqua file is found, it is copied to the Terra folder and the prefix is changed from "MYD" to "MOD" to match the Terra naming convention. If neither a Terra nor an Aqua file is found for the current day, a new file containing only zeros is created in the Terra folder under a Terra-style name.

In summary, the loop ensures that there is a Terra data file for each day within the configured range of years.
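
The gap search itself can be illustrated in isolation. The sketch below is a minimal example under stated assumptions: the helper `missing_days`, the temporary folder, and the empty dummy files are all hypothetical and only stand in for a real Terra folder.

```python
import calendar
import glob
import os
import tempfile

def missing_days(folder, prefix, year):
    """Hypothetical helper: list the days of the year without a matching file."""
    n_days = 366 if calendar.isleap(year) else 365
    missing = []
    for doy in range(1, n_days + 1):
        pattern = os.path.join(folder, f"{prefix}.A{year}{doy:03d}.*")
        if not glob.glob(pattern):
            missing.append(doy)
    return missing

# Create empty dummy files for days 1, 2 and 4 of 2016 in a temporary folder.
with tempfile.TemporaryDirectory() as terra_dir:
    for doy in (1, 2, 4):
        name = f"MOD10A1.A2016{doy:03d}.h23v04.061.0000000000000.hdf"
        open(os.path.join(terra_dir, name), "w").close()
    gaps = missing_days(terra_dir, "MOD10A1", 2016)

print(gaps[:2], len(gaps))
```

Running the same helper on the Aqua folder with the prefix "MYD10A1" would show which of the Terra gaps can be filled from Aqua and which need a placeholder file.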

In [None]:
# replace missing MODIS Terra files
import calendar

# tile, version and production time used in the names of the replacement files
replacement_suffix = ".h23v04.061.0000000000000.hdf"

# search for missing files in the Terra folder and replace them with the corresponding Aqua file;
# if there is neither a Terra nor an Aqua file, a file containing only zeros is created
for year in range(year_min, year_max + 1):
    days_in_year = 366 if calendar.isleap(year) else 365

    for d in range(1, days_in_year + 1):
        date_code = "A" + str(year) + with_leading_zeros(d)

        # skip days for which a Terra file already exists
        if glob.glob(os.path.join(MODIS_terra_file_folder, "MOD10A1." + date_code + "*")):
            continue

        # Terra-style name ("MOD" with the letter O) for the replacement file
        terra_file = os.path.join(MODIS_terra_file_folder, "MOD10A1." + date_code + replacement_suffix)
        aqua_matches = glob.glob(os.path.join(MODIS_aqua_file_folder, "MYD10A1." + date_code + "*"))

        if aqua_matches:
            # copy the Aqua file into the Terra folder under the Terra-style name
            p = shutil.copy(aqua_matches[0], terra_file)
            print(p)
        else:
            # neither Terra nor Aqua is available: write a 2400 x 2400 array of zeros
            data = np.full([2400, 2400], 0, dtype=np.uint8)
            with h5py.File(terra_file, "w") as file:
                file.create_dataset("data", data=data)