![logo](../../images/logo_diive1_128px.png)

<span style='font-size:32px; display:block;'>
<b>
    Read single EddyPro _fluxnet_ output file with DataFileReader
</b>
</span>

---
**Notebook version**: `1` (20 Apr 2024)  
**Author**: Lukas Hörtnagl (holukas@ethz.ch) 

<br>

# Description

This example shows how to read a single `EddyPro` `_fluxnet_` output file with `DataFileReader` by specifying the file format parameters manually.

# Imports

In [1]:
import importlib.metadata
import warnings
from datetime import datetime

warnings.filterwarnings('ignore')
from diive.core.io.filereader import DataFileReader

version_diive = importlib.metadata.version("diive")
print(f"diive version: v{version_diive}")

diive version: v0.83.2


## Using `DataFileReader` with parameters

In [2]:
FILE = r"L:\Sync\luhk_work\TMP\eddypro_CH-HON_FR-20241024-090003_fluxnet_2024-10-24T090023_adv.csv"
dfr = DataFileReader(filepath=FILE,
                     data_header_section_rows=[0],  # Header section (before data) comprises 1 row
                     data_skip_rows=[],  # Skip no rows
                     # Header rows with variable names and units, in this case only variable names in first row of header
                     data_header_rows=[0],
                     data_varnames_row=0,  # Variable names are in first row of header
                     data_varunits_row=None,  # Header does not contain any variable units
                     # List of values interpreted as missing values, EddyPro uses -9999 for missing values in output file
                     data_na_vals=[-9999],
                     data_freq="30min",  # Time resolution of the data is 30 minutes
                     data_delimiter=",",  # This csv file uses the comma as delimiter
                     # Number of data rows to read from the file, mainly used for testing, in this case None to read all rows
                     data_nrows=None,
                     timestamp_idx_col=["TIMESTAMP_END"],  # Name of the column that is used for the timestamp index
                     timestamp_datetime_format="%Y%m%d%H%M",  # Timestamp in the files looks like this: 202107010300
                     # Timestamp in the file defined in *timestamp_idx_col* refers to the END of the averaging interval
                     timestamp_start_middle_end="end",
                     # Timestamp in output dataframe (after reading the file) refers to the MIDDLE of the averaging interval
                     output_middle_timestamp=True,
                     compression=None)  # File is not compressed (not zipped)
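
To make the effect of these parameters more tangible, the following is a minimal sketch in plain `pandas` of roughly what they describe: one header row, `-9999` as missing value, an end-of-interval timestamp that is parsed and then shifted to the middle of the 30-minute interval. It is not how `DataFileReader` is implemented internally; the small in-memory file, the variable names `csv_text`, `raw` and `sketch`, and the reduced set of columns are assumptions made for illustration only.

```python
import io

import pandas as pd

# Tiny stand-in for a _fluxnet_ file (illustrative values): one header row, -9999 marks missing values
csv_text = """TIMESTAMP_START,TIMESTAMP_END,H,LE
202410210100,202410210130,-5.31735,-9999
202410210130,202410210200,-4.36049,-9999
202410210200,202410210230,-1.4298,11.5433
"""

FREQ = "30min"  # Time resolution of the data

# Read the file, interpret -9999 as missing, keep the timestamp column as text for parsing
raw = pd.read_csv(io.StringIO(csv_text), delimiter=",", na_values=[-9999],
                  dtype={"TIMESTAMP_END": str})

# Parse the END-of-interval timestamp with the same format string as used above
ts_end = pd.to_datetime(raw["TIMESTAMP_END"], format="%Y%m%d%H%M")

# Shift by half the averaging interval so that the index marks the MIDDLE of the interval
ts_middle = ts_end - pd.Timedelta(FREQ) / 2

# The column used for the index is removed from the data columns
sketch = raw.drop(columns=["TIMESTAMP_END"])
sketch.index = pd.DatetimeIndex(ts_middle, name="TIMESTAMP_MIDDLE")
print(sketch)
```

In this sketch, an interval ending at 01:30 gets the middle timestamp 01:15, which matches the relationship between `TIMESTAMP_START` and the index in the dataframe output of this notebook.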

In [3]:
df, meta = dfr.get_data()

File data are now stored in a dataframe:

In [4]:
df

TIMESTAMP_MIDDLE,TIMESTAMP_START,DOY_START,DOY_END,FILENAME_HF,SW_IN_POT,NIGHT,EXPECT_NR,FILE_NR,CUSTOM_FILTER_NR,WD_FILTER_NR,SONIC_NR,T_SONIC_NR,CO2_NR,H2O_NR,CH4_NR,NONE_NR,TAU_NR,H_NR,FC_NR,LE_NR,FCH4_NR,FNONE_NR,TAU,H,LE,...,BADM_INST_GA_CP_TUBE_IN_DIAM_GA_CH4,BADM_INST_GA_CP_TUBE_FLOW_RATE_GA_CH4,HPATH_GA_CH4,VPATH_GA_CH4,RESPONSE_TIME_GA_CH4,MANUFACTURER_GA_NONE,BADM_INST_MODEL_GA_NONE,BADM_INSTPAIR_NORTHWARD_SEP_GA_NONE,BADM_INSTPAIR_EASTWARD_SEP_GA_NONE,BADM_INSTPAIR_HEIGHT_SEP_GA_NONE,BADM_INST_GA_CP_TUBE_LENGTH_GA_NONE,BADM_INST_GA_CP_TUBE_IN_DIAM_GA_NONE,BADM_INST_GA_CP_TUBE_FLOW_RATE_GA_NONE,HPATH_GA_NONE,VPATH_GA_NONE,RESPONSE_TIME_GA_NONE,NUM_CUSTOM_VARS,CUSTOM_DATA_SIZE_IRGA75_MEAN,CUSTOM_STATUS_CODE_IRGA75_MEAN,CUSTOM_GA_DIAG_CODE_IRGA75_MEAN,CUSTOM_AGC_MEAN,CUSTOM_FAST_T_MEAN,CUSTOM_AIR_P_MEAN,CUSTOM_COOLER_V_MEAN,NUM_BIOMET_VARS
2024-10-21 01:15:00,202410210100,295.0416,295.0624,,0.0000,1,36000,36000,36000,36000,36000,36000,,,,,36000,36000,,,,,0.002065,-5.317350,,...,,,,,,,,,,,,,,,,,7,2.0,200.0,,,,,,0
2024-10-21 01:45:00,202410210130,295.0624,295.0833,,0.0000,1,36000,36000,36000,36000,36000,36000,,,,,36000,36000,,,,,-0.009801,-4.360490,,...,,,,,,,,,,,,,,,,,7,2.0,200.0,,,,,,0
2024-10-21 02:15:00,202410210200,295.0833,295.1041,,0.0000,1,36000,36000,36000,36000,36000,36000,,,,,36000,36000,,,,,0.001444,-1.429800,,...,,,,,,,,,,,,,,,,,7,2.0,200.0,,,,,,0
2024-10-21 02:45:00,202410210230,295.1041,295.1249,,0.0000,1,36000,36000,36000,36000,36000,36000,,,,,36000,36000,,,,,0.001528,-1.297420,,...,,,,,,,,,,,,,,,,,7,2.0,200.0,,,,,,0
2024-10-21 03:15:00,202410210300,295.1249,295.1458,,0.0000,1,36000,36000,36000,36000,36000,36000,,,,,36000,36000,,,,,-0.006617,-0.364722,,...,,,,,,,,,,,,,,,,,7,2.0,200.0,,,,,,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2024-10-23 16:45:00,202410231630,297.6873,297.7082,,148.3740,0,36000,36000,36000,36000,36000,36000,36000.0,36000.0,,,36000,36000,36000.0,36000.0,,,-0.117624,-2.663600,11.54330,...,,,,,,,,,,,,,,,,,7,16.0,0.0,250.0,62.5,13.1612,97080.6,1.18792,0
2024-10-23 17:15:00,202410231700,297.7082,297.7290,,46.3137,0,36000,36000,36000,36000,36000,36000,36000.0,36000.0,,,36000,36000,36000.0,36000.0,,,-0.160577,-2.241410,7.72480,...,,,,,,,,,,,,,,,,,7,16.0,0.0,250.0,62.5,12.9368,97058.5,1.18654,0
2024-10-23 17:45:00,202410231730,297.7290,297.7498,,0.0000,1,36000,36000,36000,36000,36000,36000,36000.0,36000.0,,,36000,36000,36000.0,36000.0,,,-0.109488,-9.735060,25.62300,...,,,,,,,,,,,,,,,,,7,16.0,0.0,250.0,62.5,12.8063,97075.7,1.18603,0
2024-10-23 18:15:00,202410231800,297.7498,297.7707,,0.0000,1,36000,36000,36000,36000,36000,36000,36000.0,36000.0,,,36000,36000,36000.0,36000.0,,,-0.050314,-7.564570,28.06370,...,,,,,,,,,,,,,,,,,7,16.0,0.0,250.0,62.5,12.7774,97085.4,1.18505,0


Metadata are also stored, including units (if available in the data files) and tags (for later processing). In this example, the column `TIMESTAMP_END` was used to parse the timestamp index and was therefore removed from the data columns to avoid having identical names for index and data column. However, the original data column still shows up in the metadata.

In [5]:
meta

,UNITS,TAGS,ADDED,VARINDEX
TIMESTAMP_START,-no-units-,[#orig],2024-10-25 00:18:57.063243,0
TIMESTAMP_END,,,,
DOY_START,-no-units-,[#orig],2024-10-25 00:18:57.063243,1
DOY_END,-no-units-,[#orig],2024-10-25 00:18:57.063243,2
FILENAME_HF,-no-units-,[#orig],2024-10-25 00:18:57.063243,3
...,...,...,...,...
CUSTOM_AGC_MEAN,-no-units-,[#orig],2024-10-25 00:18:57.063243,475
CUSTOM_FAST_T_MEAN,-no-units-,[#orig],2024-10-25 00:18:57.063243,476
CUSTOM_AIR_P_MEAN,-no-units-,[#orig],2024-10-25 00:18:57.063243,477
CUSTOM_COOLER_V_MEAN,-no-units-,[#orig],2024-10-25 00:18:57.063243,478
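
The relationship between data columns and metadata described above can be checked with ordinary `pandas` operations. The sketch below assumes that `df` and `meta` are regular dataframes shaped as in the outputs shown in this notebook, with variables as rows in `meta`. To stay self-contained it uses two small mock frames, `df_mock` and `meta_mock`, which are hypothetical stand-ins and not objects returned by `diive`.

```python
import pandas as pd

# Mock stand-ins shaped like the `df` and `meta` outputs shown above (reduced to a few entries)
df_mock = pd.DataFrame(
    {"TIMESTAMP_START": [202410210100], "DOY_START": [295.0416]},
    index=pd.DatetimeIndex(["2024-10-21 01:15:00"], name="TIMESTAMP_MIDDLE"),
)
meta_mock = pd.DataFrame(
    {"UNITS": ["-no-units-", None, "-no-units-"],
     "TAGS": [["#orig"], None, ["#orig"]]},
    index=["TIMESTAMP_START", "TIMESTAMP_END", "DOY_START"],
)

# Variables that are listed in the metadata but are not available as data columns
only_in_meta = [name for name in meta_mock.index if name not in df_mock.columns]
print(only_in_meta)

# Look up the units of a single variable
print(meta_mock.loc["DOY_START", "UNITS"])
```

With the real objects, the same two lookups would be applied to `df.columns` and `meta` directly, provided they have the layout assumed here.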


# End of notebook

In [6]:
dt_string = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print(f"Finished {dt_string}")

Finished 2024-10-25 00:18:57
