![logo](../../images/logo_diive1_128px.png)

<span style='font-size:32px; display:block;'>
<b>
    Read multiple EddyPro _fluxnet_ output files with MultiDataFileReader
</b>
</span>

---
**Notebook version**: `1` (20 Apr 2024)  
**Author**: Lukas Hörtnagl (holukas@ethz.ch) 

<br>

# Description

This example shows how to read multiple `EddyPro` `_fluxnet_` output files with `MultiDataFileReader` and the pre-defined filetype `EDDYPRO-FLUXNET-CSV-30MIN`.

# Imports

In [1]:
import importlib.metadata
import warnings
from datetime import datetime

from diive.core.io.filereader import MultiDataFileReader, search_files

warnings.filterwarnings('ignore')
version_diive = importlib.metadata.version("diive")
print(f"diive version: v{version_diive}")

diive version: v0.85.0


# Using `MultiDataFileReader` with pre-defined filetype `EDDYPRO-FLUXNET-CSV-30MIN`

In [2]:
filepaths = search_files(
    searchdirs=r"..\..\diive\configs\exampledata\EDDYPRO-FLUXNET-CSV-30MIN_multiple",
    pattern='eddypro_CH-HON_FR-*_fluxnet_*_adv.csv')
filepaths

[WindowsPath('../../diive/configs/exampledata/EDDYPRO-FLUXNET-CSV-30MIN_multiple/eddypro_CH-HON_FR-20240818-090003_fluxnet_2024-08-18T090018_adv.csv'),
 WindowsPath('../../diive/configs/exampledata/EDDYPRO-FLUXNET-CSV-30MIN_multiple/eddypro_CH-HON_FR-20240819-090003_fluxnet_2024-08-19T090019_adv.csv'),
 WindowsPath('../../diive/configs/exampledata/EDDYPRO-FLUXNET-CSV-30MIN_multiple/eddypro_CH-HON_FR-20240820-090004_fluxnet_2024-08-20T090021_adv.csv')]
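The call above collects the files in the search directory whose names match the wildcard pattern. As a rough illustration of that idea, and not of how `search_files` is implemented, the sketch below runs a comparable search with the standard library only. The helper `find_files`, the recursive search into subfolders, and the stand-in filenames are assumptions made for this example.

```python
import tempfile
from pathlib import Path


def find_files(searchdir, pattern):
    """Return all files below searchdir whose name matches pattern, sorted by path."""
    # rglob also descends into subfolders; sorting makes the order reproducible
    return sorted(p for p in Path(searchdir).rglob(pattern) if p.is_file())


# Build a throwaway folder with stand-in files (hypothetical names, no real data)
with tempfile.TemporaryDirectory() as tmp:
    for name in ("eddypro_SITE_FR-20240818_fluxnet_a_adv.csv",
                 "eddypro_SITE_FR-20240819_fluxnet_b_adv.csv",
                 "notes.txt"):
        (Path(tmp) / name).touch()
    found = [p.name for p in find_files(tmp, "eddypro_SITE_FR-*_fluxnet_*_adv.csv")]

print(found)
```

Only the two files that match the pattern are returned; `notes.txt` is skipped.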

In [3]:
mdf = MultiDataFileReader(filetype='EDDYPRO-FLUXNET-CSV-30MIN',
                          filepaths=filepaths,
                          output_middle_timestamp=True)

Reading file eddypro_CH-HON_FR-20240818-090003_fluxnet_2024-08-18T090018_adv.csv ...
Reading file eddypro_CH-HON_FR-20240819-090003_fluxnet_2024-08-19T090019_adv.csv ...
Reading file eddypro_CH-HON_FR-20240820-090004_fluxnet_2024-08-20T090021_adv.csv ...
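
The reader was called with `output_middle_timestamp=True`, and the resulting data are indexed by `TIMESTAMP_MIDDLE`. The sketch below illustrates the arithmetic behind a middle timestamp for 30-minute records. It assumes that the end of each averaging period is stored as text in the form `YYYYMMDDHHMM`; the helper `middle_timestamp` is hypothetical and not part of `diive`.

```python
from datetime import datetime, timedelta


def middle_timestamp(timestamp_end, period=timedelta(minutes=30)):
    """Shift an end-of-period timestamp back by half the averaging period."""
    end = datetime.strptime(timestamp_end, "%Y%m%d%H%M")  # assumed text format
    return end - period / 2


# A record ending at 01:30 covers 01:00-01:30, so its middle is 01:15
mid = middle_timestamp("202408150130")
print(mid)  # 2024-08-15 01:15:00
```

A middle timestamp identifies the averaging period without having to remember whether a label marks its start or its end.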


In [4]:
df = mdf.data_df
meta = mdf.metadata_df

The data from all files are now stored in a single dataframe:

In [5]:
df

| TIMESTAMP_MIDDLE | AIR_MV | AIR_DENSITY | AIR_RHO_CP | AIR_CP | AOA_METHOD | AXES_ROTATION_METHOD | BOWEN | BURBA_METHOD | BADM_LOCATION_LAT | BADM_LOCATION_LONG | BADM_LOCATION_ELEV | BADM_HEIGHTC | BADM_INST_SAMPLING_INT | BADM_INST_AVERAGING_INT | BADM_INST_MODEL_SA | ... | W_T_SONIC_COV_IBROM_N0004 | W_NUM_SPIKES | WD_FILTER_NREX | W_SPIKE_NREX | W_ABSLIM_NREX | W_VM97_TEST | W_LGD | W_KID | W_ZCD | W_ITC | W_ITC_TEST | WBOOST_APPLIED | WPL_APPLIED | ZL | ZL_UNCORR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2024-08-15 01:15:00 | 0.026289 | 1.10046 | 1119.00 | 1016.85 | 0 | 1 | -0.397400 | 0 | 47.4189 | 8.49131 | 527.0 | 0.5 | 20 | 30 | | ... | | 1 | 0 | 1 | 0 | 801000011 | 0.0 | 3.92798 | 1626 | 16 | 2 | 0 | 1 | 2.708700 | 2.398160 |
| 2024-08-15 01:45:00 | 0.026281 | 1.10081 | 1119.08 | 1016.59 | 0 | 1 | -0.067049 | 0 | 47.4189 | 8.49131 | 527.0 | 0.5 | 20 | 30 | | ... | | 1 | 0 | 1 | 0 | 800000111 | 0.0 | 5.27995 | 1161 | 87 | 5 | 0 | 1 | 14.255900 | 8.720120 |
| 2024-08-15 02:15:00 | 0.026254 | 1.10196 | 1120.16 | 1016.51 | 0 | 1 | -0.113645 | 0 | 47.4189 | 8.49131 | 527.0 | 0.5 | 20 | 30 | | ... | | 0 | 0 | 0 | 0 | 801000011 | 0.0 | 4.23637 | 1109 | 13 | 1 | 0 | 1 | 3.178640 | 2.030050 |
| 2024-08-15 02:45:00 | 0.026242 | 1.10246 | 1120.76 | 1016.60 | 0 | 1 | -0.906105 | 0 | 47.4189 | 8.49131 | 527.0 | 0.5 | 20 | 30 | | ... | | 1 | 0 | 3 | 0 | 800000011 | 0.0 | 5.62788 | 2537 | 5 | 1 | 0 | 1 | 4.078210 | 3.826030 |
| 2024-08-15 03:15:00 | 0.026237 | 1.10266 | 1121.05 | 1016.68 | 0 | 1 | -0.898270 | 0 | 47.4189 | 8.49131 | 527.0 | 0.5 | 20 | 30 | | ... | | 0 | 0 | 0 | 0 | 800000011 | 0.0 | 4.53595 | 1836 | 2 | 1 | 0 | 1 | 3.585550 | 3.362940 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-08-19 16:45:00 | 0.026464 | 1.09337 | 1110.11 | 1015.31 | 0 | 1 | -0.976782 | 0 | 47.4189 | 8.49131 | 527.0 | 0.5 | 20 | 30 | | ... | | 0 | 0 | 0 | 0 | 800000100 | 0.0 | 14.18050 | 134 | 41 | 3 | 0 | 1 | 0.057519 | 0.054155 |
| 2024-08-19 17:15:00 | 0.026443 | 1.09423 | 1111.11 | 1015.43 | 0 | 1 | -0.509692 | 0 | 47.4189 | 8.49131 | 527.0 | 0.5 | 20 | 30 | | ... | | 3 | 0 | 3 | 0 | 800000100 | 0.0 | 11.72300 | 117 | 22 | 2 | 0 | 1 | 0.058328 | 0.051651 |
| 2024-08-19 17:45:00 | 0.026424 | 1.09501 | 1111.99 | 1015.51 | 0 | 1 | -0.281252 | 0 | 47.4189 | 8.49131 | 527.0 | 0.5 | 20 | 30 | | ... | | 7 | 0 | 10 | 0 | 800000000 | 0.0 | 15.29790 | 175 | 15 | 1 | 0 | 1 | 0.032435 | 0.025775 |
| 2024-08-19 18:15:00 | 0.026384 | 1.09665 | 1113.67 | 1015.52 | 0 | 1 | -0.277239 | 0 | 47.4189 | 8.49131 | 527.0 | 0.5 | 20 | 30 | | ... | | 6 | 0 | 9 | 0 | 800000100 | 0.0 | 13.36820 | 41 | 10 | 1 | 0 | 1 | 0.020879 | 0.016491 |


Metadata are also stored, including units (if available in the data files) and tags (for later processing). In this example, the column `TIMESTAMP_END` was used to parse the timestamp index and was therefore removed from the data columns, so that the index and a data column do not share the same name. The original column still appears in the metadata, but its row carries no units or tags.

In [6]:
meta

| | UNITS | TAGS | ADDED | VARINDEX |
|---|---|---|---|---|
| TIMESTAMP_START | -no-units- | [#orig] | 2025-01-25 01:46:09.642055 | 0 |
| TIMESTAMP_END | | | NaT | |
| DOY_START | -no-units- | [#orig] | 2025-01-25 01:46:09.642055 | 1 |
| DOY_END | -no-units- | [#orig] | 2025-01-25 01:46:09.642055 | 2 |
| FILENAME_HF | -no-units- | [#orig] | 2025-01-25 01:46:09.642055 | 3 |
| ... | ... | ... | ... | ... |
| CUSTOM_AGC_MEAN | -no-units- | [#orig] | 2025-01-25 01:46:09.642055 | 475 |
| CUSTOM_FAST_T_MEAN | -no-units- | [#orig] | 2025-01-25 01:46:09.642055 | 476 |
| CUSTOM_AIR_P_MEAN | -no-units- | [#orig] | 2025-01-25 01:46:09.642055 | 477 |
| CUSTOM_COOLER_V_MEAN | -no-units- | [#orig] | 2025-01-25 01:46:09.642055 | 478 |


# End of notebook

In [7]:
dt_string = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print(f"Finished {dt_string}")

Finished 2025-01-25 01:46:16
