# Comparison of NA Screening Results from Multiple CSV Files

This notebook processes one or more CSV files, each holding data from the same site at a different sampling time, using MiBiPreT functions. It loads, standardizes, and analyzes each file, then extracts only the following columns for comparison:

- sample_nr
- obs_well
- na_traffic_light
- intervention_traffic
- intervention_number
- intervention_contaminants

The filtered results from each file are then combined side by side.

## 1. Import Required Libraries

The following cell imports all necessary modules and libraries.

In [1]:
import mibipret.analysis.sample.screening_NA as na
from mibipret.data.check_data import standardize
from mibipret.data.load_data import load_csv
from IPython.display import display
import pandas as pd

## 2. Define CSV File Paths

List the file paths for the CSV files to be compared. Modify the list below with your actual file paths.

In [2]:
file_paths = [
    "../data/cleaned/na_screening/cw_T0_BTEXN.csv",
    "../data/cleaned/na_screening/cw_T1_BTEXN.csv",
    "../data/cleaned/na_screening/cw_T2_BTEXN.csv"
]

## 3. Ensure `file_paths` Is a List

If a single file path is provided as a string, it is converted into a list to ensure uniform processing.

In [3]:
if isinstance(file_paths, str):
    file_paths = [file_paths]
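
The same normalization can be wrapped in a small helper that also accepts `pathlib.Path` objects. This is only a sketch: `as_path_list` is a hypothetical name introduced here for illustration and is not part of MiBiPreT.

```python
from pathlib import Path


def as_path_list(paths):
    """Return `paths` as a list of strings.

    Hypothetical helper (not part of MiBiPreT): a single str or Path
    is wrapped in a list; any other iterable is converted element-wise.
    """
    if isinstance(paths, (str, Path)):
        return [str(paths)]
    return [str(p) for p in paths]


# A single path becomes a one-element list
single = as_path_list("../data/cleaned/na_screening/cw_T0_BTEXN.csv")
# A mixed list of str and Path is returned as a list of str
several = as_path_list(["a.csv", Path("b.csv")])
print(single)
print(several)
```

Checking for `Path` as well as `str` matters because a `Path` is not a string, so the plain `isinstance(file_paths, str)` test would let a single `Path` through unwrapped.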

## 4. Initialize Results Dictionary

Create an empty dictionary to store the filtered NA screening results from each CSV file.

In [4]:
results = {}

## 5. Process Each CSV File

Loop through each CSV file: load the data, standardize it, run the screening analysis, and extract the selected columns for comparison.

In [5]:
selected_columns = [
    "sample_nr", 
    "obs_well", 
    "na_traffic_light", 
    "intervention_traffic", 
    "intervention_number", 
    "intervention_contaminants"
]

for file in file_paths:
    print(f"Processing file: {file}")
    
    # Load raw data and units from the CSV file
    data_raw, units = load_csv(file, verbose=True)
    
    # Standardize the data (reducing to known quantities and cleaning values)
    data, units = standardize(data_raw, reduce=True, verbose=True)
    
    # Optional: compute the individual screening quantities step by step for inspection
    tot_redct = na.reductors(data, verbose=True, ea_group="ONS")
    tot_oxi = na.oxidators(data, verbose=True, contaminant_group="BTEX")
    e_bal = na.electron_balance(data, verbose=True)
    na_traffic = na.NA_traffic(data, verbose=True)
    tot_cont = na.total_contaminant_concentration(data, verbose=True, contaminant_group="BTEX")
    na_intervention = na.thresholds_for_intervention(data, verbose=True, contaminant_group="BTEX")
    
    # Get the final NA screening table for this dataset
    data_na = na.screening_NA(data)
    
    # Extract only the selected columns for comparison
    data_na_filtered = data_na[selected_columns]
    
    # Store the filtered result using the file name as key
    results[file] = data_na_filtered

Processing file: ../data/cleaned/na_screening/cw_T0_BTEXN.csv
 Running function 'load_csv()' on data file  ../data/cleaned/na_screening/cw_T0_BTEXN.csv
Units of quantities:
-------------------
  sample_nr obs_well Benzene Toluene Ethylbenzene O Xylene P/M Xylene  \
0       NaN      NaN    ug/l    ug/l         ug/l     ug/l       ug/l   

  Naphthalene depth  pH    O2 Redox Fe II    Mn chloride nitrite Nitrite - N  \
0        ug/l     m NaN  mg/l    mV  mg/l  mg/l     mg/l    mg/l        mg/l   

  nitrate nitrate - N sulfate  
0    mg/l        mg/l    mg/l  
________________________________________________________________
Loaded data as pandas DataFrame:
--------------------------------
      sample_nr     obs_well Benzene Toluene Ethylbenzene O Xylene P/M Xylene  \
0           NaN          NaN    ug/l    ug/l         ug/l     ug/l       ug/l   
1    NL_CW_W_01  CW1_EFF-1-1     150      21          110       43         35   
2    NL_CW_W_02  CW1MF09-1-2     110       2           44    
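
Because the loop stores each result under its full file path, those paths later become the top-level column labels of the combined table. A shorter key may make that table easier to read. The sketch below assumes the file names follow the `cw_T<n>_BTEXN.csv` pattern seen in the path list; `time_label` is a hypothetical helper, not a MiBiPreT function.

```python
import re
from pathlib import Path


def time_label(path):
    """Extract a label such as 'T0' from a file name like 'cw_T0_BTEXN.csv'.

    Hypothetical helper: falls back to the file stem when no 'T<digits>'
    token delimited by underscores is found.
    """
    stem = Path(path).stem
    match = re.search(r"(?:^|_)(T\d+)(?:_|$)", stem)
    return match.group(1) if match else stem


file_paths = [
    "../data/cleaned/na_screening/cw_T0_BTEXN.csv",
    "../data/cleaned/na_screening/cw_T1_BTEXN.csv",
    "../data/cleaned/na_screening/cw_T2_BTEXN.csv",
]
labels = [time_label(p) for p in file_paths]
print(labels)
```

Inside the processing loop, `results[time_label(file)] = data_na_filtered` would then store each table under its short label instead of the full path.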

## 6. Display the Comparison Table

If multiple CSV files have been processed, combine the filtered results side by side using `pd.concat`. If only one file is processed, display its result directly.

In [6]:
if len(results) == 1:
    display(list(results.values())[0])
else:
    # Concatenate the filtered screening results along columns
    comparison_table = pd.concat(results, axis=1)
    display(comparison_table)

Columns are grouped by source file: T0 = `../data/cleaned/na_screening/cw_T0_BTEXN.csv`, T1 = `../data/cleaned/na_screening/cw_T1_BTEXN.csv`, T2 = `../data/cleaned/na_screening/cw_T2_BTEXN.csv`.

| | T0 sample_nr | T0 obs_well | T0 na_traffic_light | T0 intervention_traffic | T0 intervention_number | T0 intervention_contaminants | T1 sample_nr | T1 obs_well | T1 na_traffic_light | T1 intervention_traffic | T1 intervention_number | T1 intervention_contaminants | T2 sample_nr | T2 obs_well | T2 na_traffic_light | T2 intervention_traffic | T2 intervention_number | T2 intervention_contaminants |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | NL_CW_W_01 | CW1_EFF-1-1 | green | red | 2 | [benzene, naphthalene] | NL_CW_W_11 | CW1_EFF-1-2 | green | red | 2.0 | [benzene, naphthalene] | NL_CW_W_11 | CW1_EFF-1-2 | green | red | 2.0 | [benzene, naphthalene] |
| 2 | NL_CW_W_02 | CW1MF09-1-2 | green | red | 1 | [benzene] | NL_CW_W_12 | CW1MF09-1-3 | green | red | 5.0 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... | NL_CW_W_12 | CW1MF09-1-3 | green | red | 3.0 | [benzene, o_xylene, naphthalene] |
| 3 | NL_CW_W_03 | CW1MF10-1-2 | green | red | 5 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... | NL_CW_W_13 | CW1MF10-1-3 | green | red | 5.0 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... | NL_CW_W_13 | CW1MF10-1-3 | green | red | 5.0 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... |
| 4 | NL_CW_W_04 | CW1MF06-1-1 | red | red | 1 | [benzene] | NL_CW_W_14 | CW1MF06-1-2 | green | red | 5.0 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... | NL_CW_W_14 | CW1MF06-1-2 | green | red | 5.0 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... |
| 5 | NL_CW_W_05 | INF-1-1 | green | red | 5 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... | NL_CW_W_15 | IINF-1-2 | green | red | 5.0 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... | NL_CW_W_15 | IINF-1-2 | green | red | 5.0 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... |
| 6 | NL_CW_W_06 | CW1MF01-1-2 | red | red | 1 | [benzene] | NL_CW_W_16 | CW1MF01-1-3 | green | red | 3.0 | [benzene, o_xylene, naphthalene] | NL_CW_W_16 | CW1MF01-1-3 | green | red | 4.0 | [benzene, ethylbenzene, pm_xylene, o_xylene] |
| 7 | NL_CW_W_07 | CW1MF02-1-2 | green | red | 5 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... | NL_CW_W_17 | CW1MF02-1-3 | green | red | 5.0 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... | NL_CW_W_17 | CW1MF02-1-3 | green | red | 5.0 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... |
| 8 | NL_CW_W_08 | CW1MF05-1-1 | green | green | 0 | [] | NL_CW_W_18 | CW1MF05-1-2 | red | red | 5.0 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... | NL_CW_W_18 | CW1MF05-1-2 | green | red | 5.0 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... |
| 9 | NL_CW_W_09 | CW2MF05-1-2 | green | red | 5 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... | NL_CW_W_19 | CW2MF05-1-2 | red | red | 1.0 | [benzene] | NL_CW_W_19 | CW2MF05-1-2 | green | red | 1.0 | [benzene] |
| 10 | NL_CW_W_010 | CW2MF06-1-1 | green | red | 5 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... | NL_CW_W_110 | CW2MF06-1-2 | green | red | 5.0 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... | NL_CW_W_110 | CW2MF06-1-2 | green | red | 5.0 | [benzene, ethylbenzene, pm_xylene, o_xylene, n... |
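
Passing a dictionary to `pd.concat` with `axis=1` yields two-level column labels: the dictionary key on the outer level and the original column name on the inner level. To follow one quantity across files, the inner level can be selected with `DataFrame.xs`. The sketch below uses small made-up frames with hypothetical keys `T0` and `T1` rather than real screening output.

```python
import pandas as pd

# Toy stand-ins for two filtered screening tables (values are made up)
toy_results = {
    "T0": pd.DataFrame(
        {"obs_well": ["A", "B"], "na_traffic_light": ["green", "red"]},
        index=[1, 2],
    ),
    "T1": pd.DataFrame(
        {"obs_well": ["A", "B"], "na_traffic_light": ["green", "green"]},
        index=[1, 2],
    ),
}

# Dictionary keys become the outer column level
toy_table = pd.concat(toy_results, axis=1)

# Cross-section on the inner level: one column per file
traffic = toy_table.xs("na_traffic_light", axis=1, level=1)
print(traffic)

# Rows whose status differs between the two files
changed = traffic[traffic["T0"] != traffic["T1"]]
print(changed)
```

Note that `pd.concat` aligns rows by index label, so a row-wise comparison like this is only meaningful when the same index refers to the same well in every file.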
