## Preprocess HSAF Data

The following scripts preprocess HSAF data, mainly with CDO (Climate Data Operators) and partly with NCO (netCDF Operators).

https://code.mpimet.mpg.de/projects/cdo 

https://nco.sourceforge.net/

Two methods were tried for regridding the HSAF data:

### 1. Using `lat_lon_0.nc` Information:

#### Step 1)

- Change variable names

- Trim the data to include Germany and surroundings for a smaller file size [longmin: -1.1, longmax: 18.4, latmin: 44.1, latmax: 56.5].



*The changes made in step 1 are saved in `HSAF_DUMP`.*

#### Step 2)

- Merge all the NetCDF files generated in step 1 into one single NetCDF file using `mergetime`

- Regrid the data onto the DE05 grid.

- Record all the changes in the `history` attribute in a human-readable way.

*The changes made in step 2 are saved in `HSAF_PP`.*

```
REMAPBIL - Bilinear interpolation
REMAPBIC - Bicubic interpolation
REMAPNN - Nearest neighbor remapping
REMAPDIS - Distance weighted average remapping
REMAPCON - First order conservative remapping
REMAPCON2 - Second order conservative remapping
REMAPLAF - Largest area fraction remapping
```
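The operator names above are passed to `cdo` in lower case. As a minimal sketch, a script can check a method name against this list before calling `cdo`. The helper name `is_remap_method` and the array name `REMAP_METHODS` are hypothetical and not part of the original scripts.

```bash
#!/usr/bin/env bash
# Remapping operators from the list above, in the lower-case form used on the cdo command line.
REMAP_METHODS=(remapbil remapbic remapnn remapdis remapcon remapcon2 remaplaf)

# Hypothetical helper: succeed only if $1 is one of the listed operators.
is_remap_method() {
    local candidate=$1 method
    for method in "${REMAP_METHODS[@]}"; do
        [[ $method == "$candidate" ]] && return 0
    done
    return 1
}

if is_remap_method "remapcon"; then
    echo "remapcon is a listed remapping operator"
fi
```

A check like this catches a misspelled method before a long remapping job is started.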


#### Command guides used for CDO and NCO:

CDO:

Important: when several CDO operators are chained, they are written in inverse order of execution, so the rightmost operator is applied first.
In this workflow, `mergetime` is run on its own and not chained with other operators.

```
                -L: Lock IO
                -O: Overwrite existing output file
                -b: Number of bits for the output (F64 64-bit floating point type)
          -selname: Select parameters by name
           -deltat: Difference between timesteps
      -seltimestep: Select timesteps
        -mergetime: Merge datasets sorted by date and time
           -chname: Change name of a variable
     -setattribute: Set attributes
             -mulc: Multiply with a constant
```
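The inverse-order rule for chained operators can be illustrated with a small helper that takes operators in the order they should run and prints them in the order `cdo` expects. This is only a sketch: `build_cdo_chain` is a hypothetical name, and the function merely builds a string without calling `cdo`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: arguments are operators in execution order,
# output is the same operators in cdo command-line order (last one first).
build_cdo_chain() {
    local chain="" op
    for op in "$@"; do
        # Prepend each operator so that the first one ends up rightmost
        chain="-$op${chain:+ $chain}"
    done
    printf '%s\n' "$chain"
}

# Crop first, then rename the variable, then set the calendar
build_cdo_chain "selindexbox,1400,2200,3000,3500" "chname,acc_rr,pr" "setcalendar,standard"
# prints: -setcalendar,standard -chname,acc_rr,pr -selindexbox,1400,2200,3000,3500
```

The printed chain has the same shape as the step 1 command in this notebook, where `-selindexbox` sits next to the input file and is applied first.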

NCO: 

```
           ncatted: Attribute editor
                -O: Overwrite existing output file
                -h: Do not append the command to the history attribute
                -a: Attribute modification (att_nm,var_nm,mode,att_type,att_val)
                ,a: Mode "append" (the script below uses ,o: overwrite)
```

```  
              ncks: netCDF Kitchen Sink (combines selected features from various NCO commands into one)
                -A: Append
                -h: Do not add to the history attribute
```
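The argument of `ncatted -a` is a comma-separated specification of the form `att_nm,var_nm,mode,att_type,att_val`. The sketch below assembles such a specification for the `history` attribute. The helper name `ncatted_spec` and the file names `in.nc` and `out.nc` are placeholders chosen for illustration, and the command is only printed, not executed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: join the five fields of an ncatted -a specification.
ncatted_spec() {
    local att=$1 var=$2 mode=$3 type=$4 value=$5
    printf '%s,%s,%s,%s,%s\n' "$att" "$var" "$mode" "$type" "$value"
}

# Overwrite (o) the global history attribute with a character (c) value
spec=$(ncatted_spec history global o c "cdo remapbil -mergetime")

# Dry run: show the command instead of running it (in.nc/out.nc are placeholders)
echo "ncatted -O -h -a \"$spec\" in.nc out.nc"
```

With `-h`, `ncatted` does not append its own command line to `history`, so the attribute contains only the text written here.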

### 2. Using `GDAL` Information (deprecated):

Note: This method distorts the data for an unknown reason, so it was never finalized. Originally, we used the GDAL information to translate the coordinate information. Later, we found out that HSAF already provides lat/lon information separately in the utilities folder (see DATA_MANAGE.ipynb:`HSAF_UTI`).

*Adapted from Niklas Wagner and https://gis.stackexchange.com/a/192722:*

HSAF data files contain no coordinate variables. Each file holds a data matrix and a projection string stored in the global attributes, so the projection string has to be translated into coordinates using GDAL. We use `gdal_translate` to assign the projection and extent, and `gdalwarp` to reproject to lon/lat coordinates.

`gdal_translate -a_srs "+proj=geos +h=35785832 +a=6378169 +b=6356584 +no_defs" -a_ullr -5568000 5568000 5568000 -5568000 NETCDF:"h61_20221011_2300_01_fdk.nc":acc_rr h61_20221011_2300_01_fdk_translated.nc`

`gdalwarp -t_srs EPSG:4326 -wo SOURCE_EXTRA=100 h61_20221011_2300_01_fdk_translated.nc h61_20221011_2300_01_fdk_translated_wraped.nc`

`bash easyResample.sh --method="bil" --outgrid="DE06_2000x2000.griddes.txt" h61_20221011_2300_01_fdk_translated_wraped.nc`

### 1. Using `lat_lon_0.nc` Information:


In [12]:
source bashenv
rm -r $HSAF_PP $HSAF_DUMP $HSAF_LOG $HSAF_RG
mkdir $HSAF_PP $HSAF_DUMP $HSAF_LOG $HSAF_RG
echo "directories are wiped"

# STEP 1) Crop, attach lat/lon, set the time axis and rename variables
for ncfile in $(ls $HSAF_OR)
do
    # The fixed offsets below depend on the layout of the showattribute output
    time=$(cdo -w showattribute,end_of_accumulation_time $HSAF_OR/$ncfile)
    yyyy=${time:39:4}
    mm=${time:43:2}
    dd=${time:45:2}
    tt=${time:48:2}

    # Operators run right to left: selindexbox first, setattribute last.
    # Note: the log file is overwritten on every iteration.
    cdo -L -O -w --no_history \
        -setattribute,pr@coordinates="lat lon" -setattribute,qind@coordinates="lat lon" \
        -chname,long,lon -chname,latg,lat -chname,acc_rr,pr \
        -settaxis,$yyyy"-"$mm"-"$dd,$tt":00:00",1hour -setdate,$yyyy"-"$mm"-"$dd -settime,$tt":00:00" \
        -setcalendar,standard -merge $HSAF_UTI/lat_lon_0.cr.re.nc -selindexbox,1400,2200,3000,3500 $HSAF_OR/$ncfile \
        $HSAF_DUMP/$ncfile.cr.nc >& $HSAF_LOG/sample1.s
done
echo "step 1 is done"

directories are wiped
cdo    showattribute: Open failed on >/p/scratch/deepacf/kiste/patakchiyousefi1/H_SAF/h61_20210225_1100_01_fdk.nc<
                      Unsupported file structure
Aborted
step 1 is done
cdo    mergetime: Processed 2861276130 values from 7130 variables over 3565 timesteps [376.48s 27GB].
cdo    mergetime: Processed 2338782228 values from 5828 variables over 2914 timesteps [323.97s 22GB].
cdo    mergetime: Processed 1756895778 values from 4378 variables over 2189 timesteps [188.19s 17GB].
cdo    mergetime: Processed 6956954136 values from 6 variables over 8668 timesteps [157.65s 147MB].
step 2 is done
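The step 1 cell reads the date and hour from the `end_of_accumulation_time` attribute with fixed character offsets, which yields empty values when `cdo showattribute` fails, as it does for one file in the output above. A pattern-based parser is one possible alternative. The sketch below assumes that the timestamp has the form `YYYYMMDD<separator>HH...`; both this format and the helper name `parse_accumulation_time` are assumptions, and the sample string is made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: find a timestamp of the assumed form YYYYMMDD<sep>HH
# anywhere in the text and print it as "YYYY-MM-DD HH:00:00".
# Returns non-zero when no such timestamp is found.
parse_accumulation_time() {
    local text=$1
    local re='([0-9]{4})([0-9]{2})([0-9]{2})[^0-9]([0-9]{2})'
    if [[ $text =~ $re ]]; then
        printf '%s-%s-%s %s:00:00\n' \
            "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}" "${BASH_REMATCH[4]}"
    else
        return 1
    fi
}

# Made-up sample text; the real showattribute output may look different
parse_accumulation_time 'end_of_accumulation_time = "20210225T1100"'

# An empty result (e.g. from an unreadable file) is reported instead of being used
if ! parse_accumulation_time ""; then
    echo "no timestamp found, file would be skipped"
fi
```

The non-zero return value gives the loop a way to skip a file instead of passing empty date fields to `-settaxis`.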


In [None]:
# STEP 2) Mergetime
# note: the number of simultaneously open files is limited (see ulimit -n), so the files are merged in multi-month batches first and the batches are merged together afterwards.
#ulimit -n ~ 4000
cd $HSAF_DUMP
cdo -L -O --no_history -mergetime *202010*cr.nc *202011*cr.nc *202012*cr.nc *202101*cr.nc *202102*cr.nc MG_PART_1.cr.nc 
cdo -L -O --no_history -mergetime *202103*cr.nc *202104*cr.nc *202105*cr.nc *202106*cr.nc MG_PART_2.cr.nc 
cdo -L -O --no_history -mergetime *202107*cr.nc *202108*cr.nc *202109*cr.nc MG_PART_3.cr.nc
cdo -L -O --no_history -mergetime MG_PART*.cr.nc $HSAF_PP/HSAF_PP_OCT_2020_2021.cr.nc

cp $HSAF_PP/HSAF_PP_OCT_2020_2021.cr.nc $HSAF_RG/HSAF_PP_OCT_2020_2021.cr.nc
echo "step 2 is done"

# Disabled block: generate the source grid (H-SAF) corner information using NCL.
# It is needed for CDO conservative remapping.
# {
# source bashenv
# module load Stages/2020  GCC/9.3.0  ParaStationMPI/5.4.7-1
# module load NCL
# } &> /dev/null
#
# ncl grid_corners.ncl

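The batching in step 2 follows from the open-file limit: the merged file above holds 8668 timesteps, while `ulimit -n` is noted as roughly 4000. The sketch below shows how the month of a file can be derived from its name and how the file count can be compared with the limit. The helper names `month_of` and `needs_batching` are hypothetical, and the file name pattern is taken from the file names that appear in this notebook.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract YYYYMM from a name such as h61_20210225_1100_01_fdk.nc.cr.nc
month_of() {
    local name=${1##*/}      # strip any leading directory
    local date=${name#*_}    # drop the product prefix up to the first underscore
    printf '%s\n' "${date:0:6}"
}

# Hypothetical helper: succeed if the file count reaches the open-file limit.
# The limit defaults to the current `ulimit -n` and can be overridden by $2.
needs_batching() {
    local nfiles=$1 limit=${2:-$(ulimit -n)}
    [[ $limit == unlimited ]] && return 1
    (( nfiles >= limit ))
}

month_of "h61_20210225_1100_01_fdk.nc.cr.nc"   # prints 202102

if needs_batching 8668 4000; then
    echo "merge month by month first, then merge the monthly files"
fi
```

Grouping by the `YYYYMM` key gives the same kind of partial merges as the `*202010*cr.nc` style globs used in step 2.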
In [None]:
source bashenv

# STEP 3) HSAF_RG
remapmethods=("remapbil" "remapbic" "remapnn" "remapdis" "remapcon" "remapcon2" "remaplaf")

for method in "${remapmethods[@]}"
do
    # Attach the NCL-generated source grid, then remap onto the target grid
    cdo -L -O $method,$HRES_UTI/hres_grid.txt -setgrid,$HSAF_UTI/hsaf_grid_ncl.nc $HSAF_PP/HSAF_PP_OCT_2020_2021.cr.nc $HSAF_RG"/HSAF_PP_OCT_2020_2021.cr.nc."$method".nohist.nc"
    
    # Write a human-readable summary of all processing steps into the history attribute
    ncatted -O -h -a history,global,o,c,"cdo remap -mergetime -setattribute,pr@coordinates -setattribute,qind@coordinates -chname,long,lon -chname,latg,lat -chname,acc_rr,pr -settaxis -setdate -settime -setcalendar,standard -merge HSAF_UTI/lat_lon_0.cr.re.nc -selindexbox,1400,2200,3000,3500" $HSAF_RG"/HSAF_PP_OCT_2020_2021.cr.nc."$method".nohist.nc" $HSAF_RG"/HSAF_PP_OCT_2020_2021."$method".cr.nc"
    
    # Remove the intermediate file without the history attribute
    rm $HSAF_RG"/HSAF_PP_OCT_2020_2021.cr.nc."$method".nohist.nc"
    
    echo "$method remapping is done"
done

echo "step 3 is done"

### 2. Using `GDAL` Information (deprecated):


In [None]:
#!/bin/bash
{
#------------------------
module load Stages/2022  GCC/11.2.0  OpenMPI/4.1.1
module load CDO/2.0.2
module load NCO
#------------------------
} &> /dev/null

HSAF_OR=/p/scratch/deepacf/kiste/patakchiyousefi1/H_SAF
HSAF_PP=/p/scratch/deepacf/kiste/patakchiyousefi1/H_SAF_PP
HSAF_DUMP=/p/scratch/deepacf/kiste/patakchiyousefi1/H_SAF_DUMP
HSAF_LOG=/p/scratch/deepacf/kiste/patakchiyousefi1/H_SAF_LOG

rm -r $HSAF_PP $HSAF_DUMP $HSAF_LOG
mkdir $HSAF_PP $HSAF_DUMP $HSAF_LOG

cd "/p/project/deepacf/kiste/patakchiyousefi1/MISC/Test_Resample_HSAF_KPI/"
bash easyResample.sh --method="bil" --outgrid="DE06_2000x2000.griddes.txt" h61_20221011_2300_01_fdk_translated_wraped.nc

In [None]:
# USING gdal

rm $EXA_DIR/h61_20221011_2300_01_fdk.nc
rm $EXA_DIR/h61_20221011_2300_01_fdk_renamed.nc
rm $EXA_DIR/h61_20221011_2300_01_fdk_renamed_setatt.nc
rm $EXA_DIR/h61_20221011_2300_01_fdk_renamed_setatt_cropped.nc
cp $HSAF_RET/h61_20221011_2300_01_fdk.nc $EXA_DIR/

gdal_translate -a_srs "+proj=geos +h=35785832 +a=6378169 +b=6356584 +no_defs" -a_ullr -5568000 5568000 5568000 -5568000 NETCDF:"h61_20221011_2300_01_fdk.nc":acc_rr h61_20221011_2300_01_fdk_translated.nc
gdalwarp -t_srs EPSG:4326 -wo SOURCE_EXTRA=100 h61_20221011_2300_01_fdk_translated.nc h61_20221011_2300_01_fdk_translated_wraped.nc