# An example of reading binary data and saving it as GeoTIFF files


Author: Irene Garousi-Nejad

**Last updated**: 5/8/2023

**Description**: The purpose of this notebook is to transform binary ET data (.bin) into raster data (.tif). The data come from the [global long-term (1983-2013) daily Evapotranspiration record](https://www.umt.edu/numerical-terradynamic-simulation-group/project/global-et.php) and can be downloaded [here](http://files.ntsg.umt.edu/data/CONUS_ET/). The following sections download the binary data for a specific year (e.g., 1983) and read it into a one-dimensional array. The geographic extent and grid dimensions needed to split that array into one chunk per day are taken from this [readme file](http://files.ntsg.umt.edu/data/CONUS_ET/Readme.pdf). Each daily chunk is then written as a raster file, so the output folder ends up with one GeoTIFF per day.

**Software Requirements**

This notebook was developed using the following software versions.

>Python: 3.8 \
numpy: 1.23.1 \
gdal: 3.5.0

In [None]:
# Download the data
!wget http://files.ntsg.umt.edu/data/CONUS_ET/Daily_CONUS_ET_1983.bin

In [None]:
import os
import numpy as np
from osgeo import gdal, gdal_array, osr

In [None]:
# Define input parameters
input_file = "Daily_CONUS_ET_1983.bin"
output_folder = "output_folder"
year = 1983
start_date = f"{year}-01-01"

# Define the geographic extent
# This information is obtained from http://files.ntsg.umt.edu/data/CONUS_ET/Readme.pdf
cell_size = 0.07272727
num_rows, num_cols = 372, 812
xmin, ymin = -125.013548, 24.059730
xmax, ymax = -65.95900476, 51.11427444
missing_values = -9999.0
data_type = np.float32
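
The extent and grid parameters above can be cross-checked against each other before any data are read. The following is a minimal sketch, assuming the values from the readme describe the outer edges of the grid; the tolerance of `1e-6` degrees is an arbitrary choice made for illustration, not a value from the readme.

```python
import math

# Values copied from the cell above (originally from the readme file).
cell_size = 0.07272727
num_rows, num_cols = 372, 812
xmin, ymin = -125.013548, 24.059730
xmax, ymax = -65.95900476, 51.11427444

# Extent implied by the grid dimensions and the cell size.
width_from_grid = num_cols * cell_size
height_from_grid = num_rows * cell_size

# Assumed tolerance (in degrees) for comparing the two extents.
tolerance = 1e-6
width_ok = math.isclose(width_from_grid, xmax - xmin, abs_tol=tolerance)
height_ok = math.isclose(height_from_grid, ymax - ymin, abs_tol=tolerance)

print("width consistent:", width_ok)
print("height consistent:", height_ok)
```

If either check fails, the cell size, grid dimensions, or corner coordinates were probably mistyped.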

In [None]:
# Calculate the number of days in the year,
# accounting for leap years
import calendar
num_days = 366 if calendar.isleap(year) else 365

In [None]:
# Create a new folder to save the GeoTIFF files if it doesn't exist
if not os.path.exists(output_folder):
    os.makedirs(output_folder)

In [None]:
# Open the input binary file
with open(input_file, "rb") as f:
    # Read the binary file into a NumPy array
    # The readme file states that values are stored as 32-bit floats (4 bytes per value)
    data = np.fromfile(f, dtype=data_type)
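
To illustrate why the `dtype` passed to `np.fromfile` matters, here is a small sketch that uses a synthetic file rather than the ET data. The file name `sample.bin` and the tiny 2 x 2 x 3 array are made up for this example. The raw file has no header, so NumPy only knows how to group the bytes from the `dtype` it is given.

```python
import os
import tempfile

import numpy as np

# Tiny synthetic stand-in for the ET file: 2 days x 2 rows x 3 columns.
original = np.arange(12, dtype=np.float32).reshape(2, 2, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.bin")  # hypothetical file name
    original.tofile(sample_path)  # writes raw values only, no header

    # Reading with the matching dtype recovers all 12 values as a 1-D array.
    flat = np.fromfile(sample_path, dtype=np.float32)

    # Reading with the wrong dtype silently groups the bytes differently.
    wrong = np.fromfile(sample_path, dtype=np.float64)

print(flat.shape)   # (12,)
print(wrong.shape)  # (6,)

# The flat array can be restored to (days, rows, cols) with reshape.
restored = flat.reshape(2, 2, 3)
print(np.array_equal(restored, original))  # True
```

Because a wrong `dtype` raises no error, it helps to compare the array size against the expected number of values.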

In [None]:
# Print the shape of the data array (total number of values read)
data.shape

In [None]:
# Expected file size in bytes: 4 bytes per value x days x columns x rows
4*num_days*num_cols*num_rows
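
The same arithmetic can be turned around to infer the number of days from the size of the file on disk. The helper below is a sketch; `infer_num_days` is a hypothetical name introduced here, and it assumes the file contains nothing but consecutive daily grids with no header or padding.

```python
import os


def infer_num_days(path, num_rows, num_cols, bytes_per_value=4):
    """Return the number of daily grids stored in a raw binary file.

    Hypothetical helper: assumes the file holds only consecutive
    (num_rows x num_cols) grids of fixed-size values.
    """
    bytes_per_day = num_rows * num_cols * bytes_per_value
    total_bytes = os.path.getsize(path)
    days, remainder = divmod(total_bytes, bytes_per_day)
    if remainder:
        raise ValueError(
            f"{path} has {total_bytes} bytes, which is not a whole "
            f"number of {bytes_per_day}-byte daily grids"
        )
    return days
```

Calling `infer_num_days(input_file, num_rows, num_cols)` and comparing the result with `num_days` would flag a truncated download or a wrong grid size before the reshape step.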

In [None]:
# Open the input binary file
# This assumes that there is only one input file

with open(input_file, "rb") as f:
    # Read the binary file into a NumPy array
    data = np.fromfile(f, dtype=data_type)
    
    # Reshape the data into a 3D array of shape (num_days, num_rows, num_cols)
    data = data.reshape(num_days, num_rows, num_cols)
    
    # Create a GeoTIFF driver
    driver = gdal.GetDriverByName("GTiff")
    
    # Define the spatial reference. WGS84 is used because the extent
    # in the readme file is given in geographic coordinates
    # (longitude/latitude), not in a projected coordinate system.
    srs = osr.SpatialReference()
    srs.ImportFromEPSG(4326)  # WGS84
    
    # Loop over all days and save each day as a separate GeoTIFF file
    for day in range(num_days):
        # Extract the data for the current day
        day_data = data[day]
        
        # Create a filename 
        date = np.datetime64(start_date) + np.timedelta64(day, "D")
        filename = os.path.join(output_folder, f"{date}.tif")
        
        # Add geospatial information and create the output GeoTIFF file
        dst_ds = driver.Create(filename, num_cols, num_rows, 1, gdal.GDT_Float32)
        dst_ds.SetGeoTransform((xmin, cell_size, 0, ymax, 0, -cell_size))
        dst_ds.SetProjection(srs.ExportToWkt())
        
        # Write the data to the output GeoTIFF file
        dst_ds.GetRasterBand(1).WriteArray(day_data)
        dst_ds.GetRasterBand(1).SetNoDataValue(missing_values)
        
        # Close the output GeoTIFF file
        dst_ds = None
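
The six numbers passed to `SetGeoTransform` define an affine mapping from pixel indices to coordinates. The sketch below spells out that mapping in plain Python so the corner coordinates can be checked by hand. `pixel_to_lonlat` is a hypothetical helper written for this example, not a GDAL function, and the values are copied from the parameter cell above.

```python
def pixel_to_lonlat(geotransform, row, col):
    """Return (lon, lat) of the upper-left corner of pixel (row, col).

    Hypothetical helper that applies the six-element affine
    geotransform in the order GDAL uses.
    """
    x_origin, pixel_width, row_rot, y_origin, col_rot, pixel_height = geotransform
    lon = x_origin + col * pixel_width + row * row_rot
    lat = y_origin + col * col_rot + row * pixel_height
    return lon, lat


# Same values as in the notebook: origin at the upper-left corner,
# negative pixel height because rows run from north to south.
cell_size = 0.07272727
num_rows, num_cols = 372, 812
xmin, ymax = -125.013548, 51.11427444
geotransform = (xmin, cell_size, 0, ymax, 0, -cell_size)

upper_left = pixel_to_lonlat(geotransform, 0, 0)
lower_right = pixel_to_lonlat(geotransform, num_rows, num_cols)

print("upper-left corner:", upper_left)
print("lower-right corner:", lower_right)
```

With these inputs the upper-left corner is `(xmin, ymax)`, and the lower-right corner should land close to the `(xmax, ymin)` values from the readme, which is a quick way to confirm the sign of the pixel height and the order of rows and columns.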