## Data/Directory Management

This section documents where the data directories are located and how the data are managed (copying, transferring and maintaining).

### 1. HRES Data

- Internal directories:

`HRES_OR` is the directory for raw HRES data (downloaded directly from ECMWF for the extended DE05 domain).

`HRES_PP` is the directory where the HRES data are preprocessed according to step 2 in `HRES_PP.ipynb`.

`HRES_DUMP` is the directory where the preprocessed HRES data are dumped in step 1 of `HRES_PP.ipynb`.

`HRES_LOG` is the directory used for saving log files of data preprocessing.

- External directories:

`HRES_RET` is where the retrieved HRES data are stored (pfgpude05 project as of October 2022).

#### 1.1. Retrieving HRES data from 01.10.2020 until 30.09.2021

#### 1.2. Retrieving HRES data from 01.07.2020 until 25.04.2023


### 2. HSAF Data

- Internal directories:

`HSAF_OR` is the directory for raw HSAF data (downloaded directly from EUMETSAT for the h61 product coverage).

`HSAF_PP` is the directory where the single NetCDF preprocessed file is stored according to `HSAF_PP.ipynb`.

`HSAF_DUMP` is the directory where the preprocessed HSAF data are dumped in the first step of `HSAF_PP.ipynb`.

`HSAF_LOG` is the directory used for saving log files of data preprocessing.

`HSAF_UTI` is the directory where the HSAF Utilities are saved.

`HSAF_RG` is the directory where the preprocessed and regridded single NetCDF file is stored according to `HSAF_PP.ipynb`.

- External directories:

`HSAF_RET` is where the retrieved HSAF data are stored (shared_data in slts largedata as of October 2022).
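
Every script below relies on these variables being defined by `bashenv`, so it can help to confirm they are set before running anything destructive. The following is a minimal sketch, not part of the original notebooks: `check_dirs` is a hypothetical helper, and it assumes `bashenv` defines the names listed above as ordinary shell variables.

```shell
#!/usr/bin/env bash
# Report variables that are unset/empty or that do not point to an existing
# directory. Returns 0 only if every named variable passes both checks.
check_dirs() {
    local status=0 name
    for name in "$@"; do
        # ${!name} is bash indirect expansion: the value of the variable called $name
        if [ -z "${!name:-}" ]; then
            echo "unset: $name" >&2
            status=1
        elif [ ! -d "${!name}" ]; then
            echo "missing directory: $name=${!name}" >&2
            status=1
        fi
    done
    return "$status"
}

# Example (after `source bashenv`):
# check_dirs HRES_OR HRES_PP HRES_DUMP HRES_LOG HRES_RET || echo "check bashenv first"
```

Passing variable names rather than values lets the helper say which variable is wrong, not just which path.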

#### 2.1. Retrieving HSAF data from 01.10.2020 until 30.09.2021

#### 2.2. Retrieving HSAF data from 01.07.2020 until 25.04.2023

### 1. HRES Data

#### 1.1. Retrieving HRES data from 01.10.2020 until 30.09.2021


In [None]:
# Data management scripts for retrieving HRES data from 01.10.2020 until 30.09.2021
source bashenv

# remove all files in HRES_OR, then extract the 2020 archive from HRES_RET into it
# (the :? guard aborts if HRES_OR is unset or empty, so rm never expands to /*)
rm "${HRES_OR:?}"/*
tar -xvf "$HRES_RET/2020.tar" -C "$HRES_OR/"
# move the files one directory up and delete the now-empty 2020 folder
mv "$HRES_OR"/2020/* "$HRES_OR"
rm -r "$HRES_OR/2020"
# copy the 2021 files
cp "$HRES_RET"/2021/* "$HRES_OR/"

cd "$HRES_OR"
# keep only the 0-90 files that fall within the specified dates
rm *144* *202001* *202002* *202003* *202004* *202005* *202006* *202007* *202008* *202009* *202110* *202111* *202112*
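
The extract, move and delete steps above can also be collapsed into a single `tar` call. This is only a sketch: it assumes GNU tar and that every member of the archive sits under one top-level folder (here `2020/`, as the `mv` step implies). `extract_flat` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Extract ARCHIVE into DEST and drop the first path component of every member,
# so that 2020/<file> lands directly at DEST/<file>.
extract_flat() {
    local archive=$1 dest=$2
    mkdir -p "$dest"
    tar -xf "$archive" -C "$dest" --strip-components=1
}

# Example (after `source bashenv`):
# extract_flat "$HRES_RET/2020.tar" "$HRES_OR"
```

With `--strip-components=1` no intermediate `2020` folder is created, so there is nothing to move or clean up afterwards.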

#### 1.2. Retrieving HRES data from 01.07.2020 until 25.04.2023 (1,025 days, 24600 hours)


In [None]:
# Data management scripts for retrieving HRES data from 07.2020 until 04.2023
source bashenv

# remove all files in HRES_OR, then extract the 2020 archive from HRES_RET into it
# (the :? guard aborts if HRES_OR is unset or empty, so rm never expands to /*)
rm -r "${HRES_OR:?}"/*
tar -xf "$HRES_RET/2020.tar" -C "$HRES_OR/"
echo "tar extract done"

# move the files one directory up and delete the now-empty 2020 folder
mv "$HRES_OR"/2020/* "$HRES_OR"
rm -r "$HRES_OR/2020"

# copy the files from the other years
cp "$HRES_RET"/2021/* "$HRES_OR/"
cp "$HRES_RET"/2022/* "$HRES_OR/"
cp "$HRES_RET"/2023/* "$HRES_OR/"

cd "$HRES_OR"
# keep only the 0-90 files that fall within the specified dates
rm *202001* *202002* *202003* *202004* *202005* *202006* *202305* *144*
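
After a bulk `rm` like the one above, a quick tally of the remaining files per month shows whether the intended date range survived. This is a rough sketch: `count_by_month` is a hypothetical helper, and it assumes that the first `20YYMM`-looking run of digits in each file name is the file's year and month. That assumption should be checked against the actual HRES naming scheme.

```shell
#!/usr/bin/env bash
# Print "YYYYMM count" for the regular files in DIR, sorted by month.
# Files whose names contain no 20YYMM-like pattern are ignored.
count_by_month() {
    local dir=$1 path name
    local re='(20[0-9]{2})(0[1-9]|1[0-2])'   # year 20xx followed by month 01-12
    for path in "$dir"/*; do
        [ -f "$path" ] || continue
        name=${path##*/}                      # strip the directory part
        if [[ $name =~ $re ]]; then           # leftmost match in the name
            echo "${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
        fi
    done | sort | uniq -c | awk '{print $2, $1}'
}

# Example (after `source bashenv`):
# count_by_month "$HRES_OR"
```

A month missing from the output, or one outside the requested range, points to a pattern in the `rm` line that matched too much or too little.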

### 2. HSAF Data

#### 2.1. Retrieving HSAF data from 01.10.2020 until 30.09.2021

In [None]:
# Data management scripts for retrieving HSAF data from 01.10.2020 until 30.09.2021
source bashenv

# create the directories if they do not exist yet
mkdir -p "$HSAF_OR" "$HSAF_PP" "$HSAF_DUMP" "$HSAF_LOG"
# copy the data from HSAF_RET to HSAF_OR
# (zero-padded ranges such as {01..09} need bash 4 or newer)
cd "$HSAF_RET"
cp *2020{10..12}*_01_fdk.nc *2021{01..09}*_01_fdk.nc "$HSAF_OR"/
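
When the date range changes, the list of month patterns has to be rebuilt. A small helper can generate it from the first and last month instead. This is a sketch only: `months_between` is a hypothetical function, not something provided by `bashenv`.

```shell
#!/usr/bin/env bash
# Print every YYYYMM from START to END (both inclusive), one per line.
months_between() {
    local start=$1 end=$2
    # 10# forces base 10 so that "08" and "09" are not read as invalid octal
    local year=$((10#${start:0:4})) month=$((10#${start:4:2}))
    local end_year=$((10#${end:0:4})) end_month=$((10#${end:4:2}))
    while (( year < end_year || (year == end_year && month <= end_month) )); do
        printf '%04d%02d\n' "$year" "$month"
        if (( month == 12 )); then
            year=$((year + 1)); month=1
        else
            month=$((month + 1))
        fi
    done
}

# Example: the twelve months used in section 2.1
# months_between 202010 202109
```

The output can be fed to a `for` loop that builds one glob per month, which keeps the range in a single place.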

#### 2.2. Retrieving HSAF data from 01.07.2020 until 25.04.2023 (1,025 days, 24600 hours)

In [1]:
source bashenv
# create the directories if they do not exist yet:
# mkdir -p "$HSAF_OR" "$HSAF_PP" "$HSAF_DUMP" "$HSAF_LOG"
# copy the data from HSAF_RET to HSAF_OR
# (zero-padded ranges such as {01..12} need bash 4 or newer)
cd "$HSAF_RET"
cp *2020{07..12}*_01_fdk.nc \
   *2021{01..12}*_01_fdk.nc \
   *2022{01..12}*_01_fdk.nc \
   *2023{01..04}*_01_fdk.nc \
   "$HSAF_OR"/
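
As a variation on the copy above, the loop below copies one month at a time and reports months for which no file matched, which makes gaps in `HSAF_RET` easier to spot. It is a sketch under assumptions: `copy_months` is a hypothetical helper, and the `_01_fdk.nc` suffix is taken from the patterns above.

```shell
#!/usr/bin/env bash
# Copy SRC/*<YYYYMM>*_01_fdk.nc into DST for each YYYYMM given after SRC and DST.
# Months without any matching file are reported on stderr and skipped.
copy_months() {
    local src=$1 dst=$2 ym
    local -a files
    shift 2
    for ym in "$@"; do
        files=("$src"/*"$ym"*_01_fdk.nc)
        # without nullglob an unmatched glob stays literal, so check the first entry
        if [ -e "${files[0]}" ]; then
            cp -- "${files[@]}" "$dst"/
        else
            echo "no files for $ym" >&2
        fi
    done
}

# Example (after `source bashenv`), using brace expansion for the month list:
# copy_months "$HSAF_RET" "$HSAF_OR" 2020{07..12} 2021{01..12} 2022{01..12} 2023{01..04}
```

Copying per month also keeps each `cp` argument list short, which helps when a directory holds many files.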
