# Notebook to compute metrics for a collection of PDFs


**Required input files:**
- a file listing the PDFs to consider, e.g. `data/input/rpqs_eval_list_1.csv`
- the true indicator values for each PDF, in `data/input/sispea_vs_pdf_indic_values/`
- the answers given by the model for benchmark X, one file per PDF, i.e. `data/output/benchmark_X/answers/RPQS_*****_answers.csv`


**Output files:**
- a copy of each answer file with 2 additional columns (`true_sispea_value`, `true_pdf_value`) giving the true values, e.g. `data/output/benchmark_X/answers/RPQS_*****_answers_vs_true.csv`
- the metrics per pdf in `data/output/all_metrics_per_pdf.csv`
- the metrics per indicator in `data/output/all_metrics_per_indic.csv`

**NOTE**  
In the present version, metrics calculated for new benchmark versions are appended to `all_metrics_per_pdf.csv` and/or `all_metrics_per_indic.csv`, i.e. without deleting previous data.  
Two rows of the output metrics table with the same "benchmark_version", "pdf_list_file" and "indicator" are treated as duplicates, and only the first row is kept. In particular, if `compute_metrics.ipynb` has already computed the metrics of benchmark X, and `run_cleaning_step.ipynb` is then run without incrementing the benchmark version to X+1, running `compute_metrics.ipynb` again for benchmark X will not save the new metrics.  
When in doubt, delete the metrics files and recalculate them from scratch for all benchmark versions.
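
The keep-first behaviour described in the note can be illustrated with a small standard-library sketch. This is not the code used by `MetricsCalculator`, whose saving logic may differ; the helper `append_keep_first`, the `metric` column and the sample values are hypothetical and serve only to show why recomputed rows are silently dropped.

```python
# Illustrative sketch only: the real saving logic in MetricsCalculator may differ.
KEY_COLUMNS = ("benchmark_version", "pdf_list_file", "indicator")


def append_keep_first(existing_rows, new_rows, key_columns=KEY_COLUMNS):
    """Append new_rows to existing_rows, keeping only the first row per key."""
    seen = set()
    merged = []
    for row in list(existing_rows) + list(new_rows):
        key = tuple(row[col] for col in key_columns)
        if key in seen:
            continue  # a row with this key is already stored: the newer row is dropped
        seen.add(key)
        merged.append(row)
    return merged


# Hypothetical rows, for illustration only
saved = [{"benchmark_version": "benchmark_X", "pdf_list_file": "rpqs_eval_list_1.csv",
          "indicator": "indic_A", "metric": 0.5}]
recomputed = [{"benchmark_version": "benchmark_X", "pdf_list_file": "rpqs_eval_list_1.csv",
               "indicator": "indic_A", "metric": 0.9}]

result = append_keep_first(saved, recomputed)
print(result[0]["metric"])  # the previously saved value survives
```

Because the key does not change between the two runs, the recomputed row never reaches the file, which is why incrementing the benchmark version or deleting the metrics files is needed.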

### Import modules

In [15]:
import sys
sys.path.append("../")    # Add the path to the root directory (where we can find the folder narval/)

%load_ext autoreload
%autoreload 2 

from narval.utils import get_data_dir, FileSystem
from narval.metrics import MetricsCalculator

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [16]:
fs = FileSystem()
dir = get_data_dir()

### Instantiate the Metrics Calculator

In [17]:
metrics_calc = MetricsCalculator()

### Compute metrics per pdf

Choose the PDFs and benchmark versions for which the metrics should be calculated.  
Only answers for PDFs listed in `pdf_list_name` below are considered.

In [19]:
benchmark_list = [f"benchmark_{i}" for i in range(4, 33)] + ["benchmark_table_28", "benchmark_table_29", "benchmark_table_32"]

pdf_list_name = "rpqs_eval_list_1+2.csv"
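
Since not every benchmark version has been run on every PDF, it can be useful to count the missing answer files before computing anything. The sketch below assumes the path layout given in the list of required input files (`data/output/benchmark_X/answers/RPQS_*****_answers.csv`); the helpers `answer_path` and `count_missing_answers` are hypothetical and not part of `narval`.

```python
from pathlib import Path


def answer_path(data_dir, benchmark_version, pdf_name):
    """Build the expected answer-file path (layout assumed, see lead-in)."""
    pdf_main_name = pdf_name.split(".")[0]  # same rule as in the notebook
    return (Path(data_dir) / "data" / "output" / benchmark_version
            / "answers" / f"{pdf_main_name}_answers.csv")


def count_missing_answers(data_dir, benchmark_list, pdf_list):
    """Return {benchmark_version: number of PDFs without an answer file}."""
    return {
        benchmark: sum(not answer_path(data_dir, benchmark, pdf).exists()
                       for pdf in pdf_list)
        for benchmark in benchmark_list
    }
```

Once the list of PDF names has been read from `pdf_list_name`, calling `count_missing_answers(dir, benchmark_list, pdf_list)` would give one count per benchmark version.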

Compute metrics 

In [20]:
eval_df = fs.read_csv_to_df(dir+"/data/input/"+pdf_list_name, sep=";", usecols=["pdf_name"])
pdf_list = eval_df['pdf_name'].values.tolist()


for benchmark_version in benchmark_list:
    for pdf_name in pdf_list:
        pdf_main_name = pdf_name.split(".")[0]
        answer_file = pdf_main_name + "_answers.csv"
        answer_vs_true_file = pdf_main_name + "_answers_vs_true.csv"
        try:
            # These calls only succeed if the RPQS has been queried for this
            # benchmark version, i.e. if its answer file exists
            metrics_calc.write_answers_vs_true_file(answer_file, benchmark_version)
            metrics_calc.fill_metrics_df_per_pdf(answer_vs_true_file, benchmark_version)
        except FileNotFoundError:
            print(f"Metrics for {pdf_name}.pdf and {benchmark_version} have not been calculated. The answer files do not exist.")

Metrics for RPQS_Ahun_cp23150_rpqsid_674494_AC_2021.pdf and benchmark_4 have not been calculated. The answer files do not exist.
Metrics for RPQS_Amagne_cp08300_rpqsid_651153_AC_2022.pdf and benchmark_4 have not been calculated. The answer files do not exist.
Metrics for RPQS_Artaix_cp71110_rpqsid_303861_AC_2019.pdf and benchmark_4 have not been calculated. The answer files do not exist.
Metrics for RAD_Cabasse_AC_2022.pdf and benchmark_4 have not been calculated. The answer files do not exist.
Metrics for RPQS_Cartelegue_cp33390_rpqsid_787673_AC_2023.pdf and benchmark_4 have not been calculated. The answer files do not exist.
Metrics for RPQS_Chalautre-la-Petite_cp77160_rpqsid_794713_AC_2019.pdf and benchmark_4 have not been calculated. The answer files do not exist.
Metrics for RPQS_Chantrigne_cp53300_rpqsid_739213_AC_2021.pdf and benchmark_4 have not been calculated. The answer files do not exist.
Metrics for RPQS_Charchigne_cp53250_rpqsid_778433_AC_2022.pdf and benchmark_4 have not been calculated. The answer files do not exist.


Metrics for RPQS_Ahun_cp23150_rpqsid_674494_AC_2021.pdf and benchmark_5 have not been calculated. The answer files do not exist.
Metrics for RPQS_Amagne_cp08300_rpqsid_651153_AC_2022.pdf and benchmark_5 have not been calculated. The answer files do not exist.
Metrics for RPQS_Artaix_cp71110_rpqsid_303861_AC_2019.pdf and benchmark_5 have not been calculated. The answer files do not exist.
Metrics for RAD_Cabasse_AC_2022.pdf and benchmark_5 have not been calculated. The answer files do not exist.
Metrics for RPQS_Cartelegue_cp33390_rpqsid_787673_AC_2023.pdf and benchmark_5 have not been calculated. The answer files do not exist.
Metrics for RPQS_Chalautre-la-Petite_cp77160_rpqsid_794713_AC_2019.pdf and benchmark_5 have not been calculated. The answer files do not exist.
Metrics for RPQS_Chantrigne_cp53300_rpqsid_739213_AC_2021.pdf and benchmark_5 have not been calculated. The answer files do not exist.
Metrics for RPQS_Charchigne_cp53250_rpqsid_778433_AC_2022.pdf and benchmark_5 have not been calculated. The answer files do not exist.


Metrics for RPQS_Ahun_cp23150_rpqsid_674494_AC_2021.pdf and benchmark_6 have not been calculated. The answer files do not exist.
Metrics for RPQS_Amagne_cp08300_rpqsid_651153_AC_2022.pdf and benchmark_6 have not been calculated. The answer files do not exist.
Metrics for RPQS_Artaix_cp71110_rpqsid_303861_AC_2019.pdf and benchmark_6 have not been calculated. The answer files do not exist.
Metrics for RAD_Cabasse_AC_2022.pdf and benchmark_6 have not been calculated. The answer files do not exist.
Metrics for RPQS_Cartelegue_cp33390_rpqsid_787673_AC_2023.pdf and benchmark_6 have not been calculated. The answer files do not exist.
Metrics for RPQS_Chalautre-la-Petite_cp77160_rpqsid_794713_AC_2019.pdf and benchmark_6 have not been calculated. The answer files do not exist.
Metrics for RPQS_Chantrigne_cp53300_rpqsid_739213_AC_2021.pdf and benchmark_6 have not been calculated. The answer files do not exist.
Metrics for RPQS_Charchigne_cp53250_rpqsid_778433_AC_2022.pdf and benchmark_6 have not been calculated. The answer files do not exist.


### Compute metrics per indicator

Choose the PDF list and benchmark versions for which the metrics should be calculated.  
Include only benchmark versions that have been run on all PDFs in `pdf_list_name` below.
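
One way to select such benchmark versions automatically is sketched below. It treats the existence of an answer file as a proxy for "has been run on this PDF", which is an assumption, and it relies on the path layout `data/output/benchmark_X/answers/RPQS_*****_answers.csv` stated at the top of this notebook. The helper `fully_answered_benchmarks` is hypothetical and not part of `narval`.

```python
from pathlib import Path


def fully_answered_benchmarks(data_dir, benchmark_list, pdf_list):
    """Keep only the benchmark versions that have an answer file for every PDF."""
    complete = []
    for benchmark in benchmark_list:
        # Assumed layout: <data_dir>/data/output/<benchmark>/answers/
        answers_dir = Path(data_dir) / "data" / "output" / benchmark / "answers"
        if all((answers_dir / f"{pdf.split('.')[0]}_answers.csv").exists()
               for pdf in pdf_list):
            complete.append(benchmark)
    return complete
```

The returned list could then replace a hand-written `benchmark_list`, at the cost of hiding which versions were excluded, so printing the difference between the two lists is advisable.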

In [None]:
# For rpqs_eval_list_1 only (because benchmark versions < 27 have not been run for other PDFs)
#benchmark_list = [f"benchmark_{i}" for i in range(4, 33)] + ["benchmark_table_28", "benchmark_table_29", "benchmark_table_32"]
# For rpqs_eval_list_1, rpqs_eval_list_2, rpqs_eval_list_1+2
benchmark_list = [f"benchmark_{i}" for i in range(27, 33)] + ["benchmark_table_28", "benchmark_table_29", "benchmark_table_32"]

pdf_list_name = "rpqs_eval_list_1+2.csv"


Compute metrics

In [28]:
for benchmark_version in benchmark_list:
    metrics_calc.fill_metrics_df_per_indic(pdf_list_name, benchmark_version)

### Save results

In [29]:
metrics_calc.save_metrics()

Metrics per pdf are saved to all_metrics_per_pdf.csv
Metrics per indicator are saved to all_metrics_per_indic.csv
