# Install requirements

Install the dependencies listed in requirements.txt from the repository.

# Starting Corruption
Define all the parameters in the config file related to paths, dataset, corruption, layout analysis, and model.

## Short explanation of each field:
### Paths:
- base_path: The root directory for relative paths.
- augmented_dataset: Path to the JSON file containing the augmented dataset.
- output_corrupted: Destination for the JSON file with all corrupted (unanswerable) questions.
- output_corrupted_cleaned: Path for the cleaned version of the corrupted questions.
- patch_saving_dir: Directory where patch files (modifications applied during corruption) are saved.
- layout_saving_dir: Directory where layout analysis outputs are stored.

### Dataset:
- type: The dataset in use, either MPDocVQA or DUDE.
- split: Specifies the data split (e.g., train) being processed.
- dataset_json_path: File path to the original dataset JSON file.

### Corruption:
- percentage: The proportion of the data to be corrupted (here, 100%).
- complexity: Sets the corruption complexity level (1, 2, or 3).
- generated_sample_per_complexity_greater_than_1: Number of corrupted samples to generate for complexities higher than 1.
- types: A set of boolean flags indicating which types of corruption to apply (numerical, temporal, entity, location, document; all enabled here).

### Layout Analysis:
- model: Specifies the layout analysis model used to process document layouts.

### Model:
- provider: Indicates the model hosting service (huggingface).
- name: The primary model used for processing.
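
As a rough illustration of how the fields above could fit together, here is a minimal sketch that builds a config as a Python dictionary, runs a few sanity checks, and serializes it to JSON. The key names come from the field list; the section names, the nesting, the `check_config` helper, and every `<...>` value are assumptions, so compare against the config.json shipped with the repository before relying on it.

```python
import json

# Hypothetical config layout. The key names come from the field list above;
# the section names, the nesting, and every "<...>" value are placeholders.
config = {
    "paths": {
        "base_path": "<base-path>",
        "augmented_dataset": "<augmented-dataset>.json",
        "output_corrupted": "<output-corrupted>.json",
        "output_corrupted_cleaned": "<output-corrupted-cleaned>.json",
        "patch_saving_dir": "<patch-dir>",
        "layout_saving_dir": "<layout-dir>",
    },
    "dataset": {
        "type": "MPDocVQA",  # or "DUDE"
        "split": "train",
        "dataset_json_path": "<dataset>.json",
    },
    "corruption": {
        "percentage": 100,  # assumed to be expressed on a 0-100 scale
        "complexity": 1,
        "generated_sample_per_complexity_greater_than_1": 1,  # placeholder value
        "types": {
            "numerical": True,
            "temporal": True,
            "entity": True,
            "location": True,
            "document": True,
        },
    },
    "layout_analysis": {"model": "<layout-model>"},
    "model": {"provider": "huggingface", "name": "<model-name>"},
}


def check_config(cfg):
    """Return a list of problems found by a few basic sanity checks."""
    problems = []
    if cfg["dataset"]["type"] not in {"MPDocVQA", "DUDE"}:
        problems.append("dataset.type must be MPDocVQA or DUDE")
    if cfg["corruption"]["complexity"] not in {1, 2, 3}:
        problems.append("corruption.complexity must be 1, 2, or 3")
    if not 0 < cfg["corruption"]["percentage"] <= 100:
        problems.append("corruption.percentage must be in (0, 100]")
    if not any(cfg["corruption"]["types"].values()):
        problems.append("enable at least one corruption type")
    return problems


problems = check_config(config)
config_json = json.dumps(config, indent=2)  # text to save as config.json
```

Running the checks before launching the corruption script catches simple mistakes (for example, a complexity of 4) without waiting for the models to load.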

After defining all the parameters, run the main script with the config path as a parameter.


In [None]:
!python /corruption-scripts/corruption/main.py --config /corruption-scripts/config.json

# Verification step
After the corruption process, edit the verification parameters in config.json.

### Verification:
- provider: The service used for verifying the corrupted outputs (gemini/openai).
- api_key: Placeholder for the API key.
- verification_input_file: Path to the cleaned corrupted questions that need to be verified.
- verification_output_file: Path where the verification results will be saved.
- verification_percentage: The percentage of the cleaned corrupted samples to verify (here, 100%).
- model_name: The verification model to use (gemini-2.0-flash).
- log_file: File path to save the log output from the verification process.
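
Since api_key is only a placeholder in the config, one option is to fill it from an environment variable when the verification section is assembled. The sketch below assumes exactly that; the `verification_section` helper and the `VERIFIER_API_KEY` variable name are hypothetical and not part of the repository.

```python
import os

ALLOWED_PROVIDERS = {"gemini", "openai"}


def verification_section(provider, model_name, input_file, output_file,
                         log_file, percentage=100,
                         api_key_env="VERIFIER_API_KEY"):
    """Build the verification fields, reading the API key from the environment.

    The helper and the environment variable name are illustrative assumptions.
    """
    if provider not in ALLOWED_PROVIDERS:
        raise ValueError(f"provider must be one of {sorted(ALLOWED_PROVIDERS)}")
    if not 0 < percentage <= 100:
        raise ValueError("verification_percentage must be in (0, 100]")
    api_key = os.environ.get(api_key_env)
    if not api_key:
        raise RuntimeError(f"Set the {api_key_env} environment variable first")
    return {
        "provider": provider,
        "api_key": api_key,
        "verification_input_file": input_file,
        "verification_output_file": output_file,
        "verification_percentage": percentage,
        "model_name": model_name,
        "log_file": log_file,
    }
```

Keeping the key out of config.json avoids committing it by accident when the notebook or the config is shared.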

In [None]:
!python /corruption-scripts/verification/answerability_verifier.py --config /corruption-scripts/config.json

AnswerabilityVerifier using device: cuda with provider: gemini
Will verify 100% of questions
E0000 00:00:1739451499.763338   29011 init.cc:232] grpc_wait_for_shutdown_with_timeout() timed out.


## JUST FALSE RESULTS
Extract the questions with a false verification result (the unanswerable ones) from the verifier output file.


In [None]:
!python /corruption-scripts/verification/just_false.py --input_file /corruption-scripts/results/MPDocVQA_unanswerable_corrupted_questions_verified.json --output_file /corruption-scripts/results/MPDocVQA_unanswerable_corrupted_questions_verified_just_false.json

Total questions: 19
Questions with false verification: 19
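
The filtering step can be pictured with a minimal sketch like the one below. It assumes the verified file is a list of records, each carrying a boolean verification flag; the field name `verified`, the `keep_false` helper, and the sample records are made up for illustration and may not match what just_false.py actually reads.

```python
def keep_false(records, key="verified"):
    """Keep only the records whose verification flag is exactly False."""
    return [r for r in records if r.get(key) is False]


# Made-up sample records; the real file structure may differ.
sample = [
    {"question": "q1", "verified": False},
    {"question": "q2", "verified": True},
    {"question": "q3", "verified": False},
]

kept = keep_false(sample)
print(f"Total questions: {len(sample)}")
print(f"Questions with false verification: {len(kept)}")
```

Using `is False` rather than a truthiness test means records with a missing flag are dropped instead of being counted as unanswerable.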


# VQA Models Testing
After the verification process, VQA models can be tested on corrupted questions. 

You can run each model by passing the VQA_config file path as a parameter.

## NOTE
The results file follows this nomenclature:
- results_w#: prompt Explicit
- results_w#_ocr: prompt Explicit + OCR
- results_w#_unable: prompt None
- results_w#_ocr_unable: prompt OCR
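
The naming scheme above can be expressed as a small helper. This is a sketch under assumptions: `results_dir_name` is a hypothetical function, `w` stands for the number that replaces `#`, and the `_unable` suffix is assumed to correspond to the `unable_to_respond_aware` config flag being true (the flag set to false means the Explicit prompt, as this section notes).

```python
def results_dir_name(w, use_ocr, unable_to_respond_aware):
    """Compose a results name following the nomenclature listed above."""
    name = f"results_w{w}"
    if use_ocr:
        name += "_ocr"
    if unable_to_respond_aware:
        # Assumed mapping: the flag set to true selects the "_unable" variants.
        name += "_unable"
    return name


def prompt_label(use_ocr, unable_to_respond_aware):
    """Return the prompt label paired with each name in the list above."""
    if unable_to_respond_aware:
        return "OCR" if use_ocr else "None"
    return "Explicit + OCR" if use_ocr else "Explicit"
```

For example, `results_dir_name(1, True, False)` gives `results_w1_ocr`, which the list pairs with the Explicit + OCR prompt.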

The parameter batch_size represents the number of pages considered in the evaluation.

"unable_to_respond_aware" set to false means that the prompt is Explicit.

## QWEN

In [None]:
!python /VQA_analysis/models/llm/qwen_evaluator.py --config /VQA_analysis/models/config.json

## DOCOWL

In [None]:
!python /VQA_analysis/docowl_evaluator.py --config /VQA_analysis/models/config.json

## INTERNVL

In [None]:
!python /VQA_analysis/models/llm/internvl_evaluator.py --config /VQA_analysis/models/config.json

## OVIS

In [None]:
!python /VQA_analysis/models/llm/ovis_evaluator.py --config /VQA_analysis/models/config.json

## PHI

In [None]:
!python /VQA_analysis/models/llm/phi_evaluator.py --config /VQA_analysis/models/config.json

## MOLMO

In [None]:
!python /VQA_analysis/models/llm/molmo_evaluator.py --config /VQA_analysis/models/config.json

# LM ANALYSIS

## BLIP

In [None]:
!python /VQA_analysis/models/lm/blip_evaluator.py --config /VQA_analysis/models/config.json

## UDOP

In [None]:
!python /VQA_analysis/models/lm/udop_evaluator.py

## LAYOUTLMV3

In [None]:
!python /VQA_analysis/models/lm/layoutLMV3_evaluator.py

# ANSWER CONVERSION AND AUGMENTATION

Standardize the LLM outputs.

In [None]:
!python /VQA_analysis/models/results/MPDocVQA/LLM/unable_converter.py

Gemini model initialized successfully
Processing: /content/drive/MyDrive/Thesis/VQA_analysis/models/results/MPDocVQA/LLM/results_w1/original/Qwen_vqa_analysis_results.json
Saving to: /content/drive/MyDrive/Thesis/VQA_analysis/models/results/MPDocVQA/LLM/results_w1/converted/Qwen_vqa_analysis_results_converted.json
Processed file saved to /content/drive/MyDrive/Thesis/VQA_analysis/models/results/MPDocVQA/LLM/results_w1/converted/Qwen_vqa_analysis_results_converted.json
Successfully processed: /content/drive/MyDrive/Thesis/VQA_analysis/models/results/MPDocVQA/LLM/results_w1/original/Qwen_vqa_analysis_results.json
E0000 00:00:1739471691.324773   59935 init.cc:232] grpc_wait_for_shutdown_with_timeout() timed out.


Standardize the LM outputs.

In [None]:
!python /VQA_analysis/models/results/MPDocVQA/LM/unable_converter_binary.py

Preprocess the metrics files by adding patch_entities.

In [None]:
!python /VQA_analysis/models/results/metrics_file_preprocessing.py

Gemini model initialized successfully
Processing: /content/drive/MyDrive/Thesis/VQA_analysis/models/results/MPDocVQA/LLM/results_w1/converted/Qwen_vqa_analysis_results_converted.json
Saving to: /content/drive/MyDrive/Thesis/VQA_analysis/models/results/MPDocVQA/LLM/results_w1/augmented/Qwen_vqa_analysis_results_converted_augmented.json
Processed file saved to /content/drive/MyDrive/Thesis/VQA_analysis/models/results/MPDocVQA/LLM/results_w1/augmented/Qwen_vqa_analysis_results_converted_augmented.json
Successfully processed: /content/drive/MyDrive/Thesis/VQA_analysis/models/results/MPDocVQA/LLM/results_w1/converted/Qwen_vqa_analysis_results_converted.json
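
The logs above suggest a naming pattern across stages: a file under `original/` is written to `converted/` with a `_converted` suffix, and that file is then written to `augmented/` with an `_augmented` suffix. The sketch below reproduces the pattern with `pathlib`; `next_stage_path` is a hypothetical helper inferred from the logged paths, not a function from the repository.

```python
from pathlib import PurePosixPath


def next_stage_path(path, stage):
    """Move a results file to a sibling stage folder and append the stage name.

    Inferred from the logged paths: .../original/X.json becomes
    .../converted/X_converted.json, and so on.
    """
    p = PurePosixPath(path)
    return str(p.parent.parent / stage / f"{p.stem}_{stage}{p.suffix}")


original = (
    "/content/drive/MyDrive/Thesis/VQA_analysis/models/results/MPDocVQA/LLM/"
    "results_w1/original/Qwen_vqa_analysis_results.json"
)
converted = next_stage_path(original, "converted")
augmented = next_stage_path(converted, "augmented")
```

Deriving the output location this way makes it easy to check that a conversion or augmentation step produced its file before the next step runs.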


Save the metrics

In [None]:
!python /VQA_analysis/models/results/result_analysis.py

2025-02-13 18:43:38.255153: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1739472218.288382   62260 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1739472218.298555   62260 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-02-13 18:43:38.329159: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Initializing Entity Verifier...
Fetching 5 files: 100% 5/5 [00:00<00:00, 6828.89it/s]
Entity Verifier initialized
Bas