# Task 4

**Conversion of an atmospheric liquid water path (LWP) measurement from binary format to NetCDF format.**

Wenfu Sun 2021-07-01

## 1. Load modules

Here I use `xarray` to build the NetCDF file. The standard-library `struct` module is used to read the binary file.

In [1]:
import os
import struct
import datetime
import xarray as xr
import numpy as np

## 2. Data preparation
Read the binary file and decode it according to the file layout described in Manual_Profilers.

In [2]:
# Initialize empty lists for the variables T, RF, LWP, and LWPAng
T      = []
RF     = []
LWP    = []
LWPAng = []

# Open the binary file
file_path = os.path.join('Data', '04-Conversion_of_binary_to_netcdf', '21060900.LWP')
with open(file_path, mode='rb') as file: # b is important -> binary
    # Header, as laid out in Manual_Profilers: six 4-byte fields, little-endian
    # (two unsigned ints, two floats, two unsigned ints)
    LWPCode, N, LWPMin, LWPMax, LWPTimeRef, LWPRetrieval = struct.unpack('<IIffII', file.read(4 * 6))
    # Then read one 13-byte record per recorded sample.
    # struct.unpack always returns a tuple, so each list element is a 1-tuple.
    for _ in range(N):
        T.append(struct.unpack('<I', file.read(4)))      # unsigned int: seconds since 2001-01-01
        RF.append(struct.unpack('<b', file.read(1)))     # char: rain flag
        LWP.append(struct.unpack('<f', file.read(4)))    # float: LWP
        LWPAng.append(struct.unpack('<f', file.read(4))) # float: LWP angle

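# Validate the unpacking: the first header field must equal the LWP file code
if LWPCode == 934501978:
    print('Unpack method is right!')
else:
    raise ValueError(f'Unexpected file code {LWPCode}; check the byte order and format string')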

Unpack method is right!
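
The same layout can also be parsed with precompiled `struct.Struct` objects, which makes the record size explicit and lets the file length be checked against `N`. The following is only a minimal sketch under the layout assumed in the cell above: the helper `parse_lwp` and the names `HEADER`, `RECORD`, and `LWP_CODE` are hypothetical, and the two-sample byte string is synthetic data built purely to exercise the parser.

```python
import struct

# Layout assumed from the cell above: a 6-field header, then N packed records.
HEADER = struct.Struct('<IIffII')  # code, N, min, max, time reference, retrieval
RECORD = struct.Struct('<Ibff')    # T, RF, LWP, LWPAng ('<' means no padding)
LWP_CODE = 934501978               # file code checked in the cell above


def parse_lwp(raw):
    """Hypothetical helper: split raw LWP bytes into (header, records)."""
    header = HEADER.unpack_from(raw, 0)
    code, n = header[0], header[1]
    if code != LWP_CODE:
        raise ValueError(f'unexpected file code {code}')
    body = raw[HEADER.size:HEADER.size + n * RECORD.size]
    if len(body) != n * RECORD.size:
        raise ValueError('file is shorter than the header claims')
    # iter_unpack yields one (T, RF, LWP, LWPAng) tuple per record
    return header, list(RECORD.iter_unpack(body))


# Synthetic two-sample "file" (not real measurements)
samples = [(644889745, 0, 12.5, 90.0), (644889747, 1, 30.25, 90.0)]
raw = HEADER.pack(LWP_CODE, len(samples), 12.5, 30.25, 1, 0)
raw += b''.join(RECORD.pack(*s) for s in samples)

header, records = parse_lwp(raw)
# Transpose the records into one list per variable
times, rain_flags, lwp, lwp_angle = (list(col) for col in zip(*records))
```

Unlike the loop above, this variant yields plain scalars rather than 1-tuples, so no `[0]` indexing or `np.hstack` flattening would be needed afterwards.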


### Convert the time in seconds to UTC time

In [3]:
T_converted = []
# 2001-01-01 00:00:00 is the reference time
reference_time = datetime.datetime(year=2001, month=1, day=1, hour=0, minute=0, second=0)
for itime in T:
    time_delta = datetime.timedelta(seconds = itime[0])
    sample_time = reference_time + time_delta
    T_converted.append(sample_time)

In [4]:
# Preview the first 5 elements in T_converted
T_converted[:5]

[datetime.datetime(2021, 6, 9, 0, 2, 25),
 datetime.datetime(2021, 6, 9, 0, 2, 27),
 datetime.datetime(2021, 6, 9, 0, 2, 28),
 datetime.datetime(2021, 6, 9, 0, 2, 29),
 datetime.datetime(2021, 6, 9, 0, 2, 31)]
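
A quick way to gain confidence in this conversion is a round trip: convert a known timestamp to seconds since the reference time and back. The sketch below assumes the same 2001-01-01 00:00:00 reference and treats naive `datetime` objects as UTC, as the cell above does. The helper names `seconds_to_utc` and `utc_to_seconds` are hypothetical.

```python
from datetime import datetime, timedelta

# Reference time used in the cell above (naive datetime, interpreted as UTC)
REFERENCE_TIME = datetime(2001, 1, 1)


def seconds_to_utc(seconds):
    """Hypothetical helper: seconds since the reference time -> datetime."""
    return REFERENCE_TIME + timedelta(seconds=seconds)


def utc_to_seconds(moment):
    """Hypothetical helper: datetime -> whole seconds since the reference time."""
    return int((moment - REFERENCE_TIME).total_seconds())


# First timestamp from the preview above
first_sample = datetime(2021, 6, 9, 0, 2, 25)
raw_seconds = utc_to_seconds(first_sample)

# The round trip should reproduce the original timestamp exactly
round_trip = seconds_to_utc(raw_seconds)
```

If `round_trip` ever differed from `first_sample`, the reference time or the unit of the stored values would be the first things to check.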

## 3. Create the NetCDF file

The NetCDF file has a single dimension, `time` (in UTC). The four data variables `T`, `RF`, `LWP`, and `LWPAng` are all indexed by `time`.

In [5]:
ds = xr.Dataset(
    # Data variables
    data_vars=dict(
        T   = (["time"], np.hstack(T)),
        RF  = (["time"], np.hstack(RF)),
        LWP = (["time"], np.hstack(LWP)),
        LWPAng = (["time"], np.hstack(LWPAng)),
    ),
    # Data coordinates
    coords=dict(
        # sample = (["sample"], np.arange(0,N,1)),
        time= np.hstack(T_converted),
        reference_time = reference_time
    ),
    # Global attributes
    attrs=dict(
        LWP_File_Code = LWPCode,
        Number_of_recorded_samples = N,
        Minimum_of_recorded_LWP_values = LWPMin,
        Maximum_of_recorded_LWP_values = LWPMax,
        Time_reference = str(LWPTimeRef)+' UTC',
        LWPRetrieval   = str(LWPRetrieval)+' 0: lin. Reg., 1 : quad. Reg., 2: Neur. Net.'
    )
)

# Variable attributes
ds.T.attrs=dict(Description='Time of sample (# of sec. since 1.1.2001)')
ds.RF.attrs=dict(Description='Rain flag of sample (0: no rain, 1: rain)')
ds.LWP.attrs=dict(Description='LWP of sample [g/m^2]')
ds.LWPAng.attrs=dict(Description='LWP angle of sample [DEG]')

# Preview the nc file
ds

In [6]:
# Because `time` is a coordinate, a variable can be selected directly by UTC timestamp. For example:

ds['RF'].sel(time='2021-06-09 00:02:31')

## 4. Save the NetCDF file

In [7]:
os.makedirs('Results', exist_ok=True)
ds.to_netcdf(path=os.path.join('Results', 'T04_LWP_2021060900.nc'))