isspek/ecir-2025-multilang-consistency

Consistency Check on LLM Answers to Health-Related Questions

A Case Study for English, Chinese, Turkish and German

This repository contains the source code of the paper "Do LLMs Provide Consistent Answers to Health-Related Questions across Languages?" [1].

Dataset Extension

The dataset is based on HealthFC [2]. To download it, follow the instructions in the HealthFC source repository.

Disease Classification

We categorize the diseases based on the disease entities mentioned in the text. We applied a semi-automatic method to construct a dictionary for disease categorization. The final dictionary is healthFC_diseases_wd_icd10_maps_v2.csv.
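As an illustration of how a dictionary of this kind could be applied, the sketch below maps extracted entity mentions to categories by case-insensitive lookup. The column names `entity` and `category`, the helper names, and the sample rows are assumptions made for this example; they are not taken from healthFC_diseases_wd_icd10_maps_v2.csv, whose actual schema may differ.

```python
import csv
import io


def load_disease_map(csv_file):
    """Read a CSV with (assumed) columns 'entity' and 'category' into a dict."""
    reader = csv.DictReader(csv_file)
    return {row["entity"].strip().lower(): row["category"].strip() for row in reader}


def categorize(entities, disease_map):
    """Return the sorted, de-duplicated categories of the mentions found in the map."""
    found = set()
    for mention in entities:
        key = mention.strip().lower()
        if key in disease_map:
            found.add(disease_map[key])
    return sorted(found)


# Illustrative in-memory rows standing in for the real dictionary file.
SAMPLE_CSV = (
    "entity,category\n"
    "migraine,Diseases of the nervous system\n"
    "asthma,Diseases of the respiratory system\n"
)

disease_map = load_disease_map(io.StringIO(SAMPLE_CSV))
categories = categorize(["Migraine", "vitamin C", "asthma"], disease_map)
print(categories)
```

Mentions that are missing from the dictionary (here "vitamin C") are simply skipped, which is one reason a manual review step is useful in a semi-automatic pipeline.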

Download the NER model:

pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.5.4/en_core_sci_md-0.5.4.tar.gz

Extract the named entities:

python -m src.ne_extraction

Enrich the entities with their alternate names from Wikidata:

bash scripts/enrich_wd.sh
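For readers unfamiliar with Wikidata aliases, the sketch below shows one way alternate names could be collected through the public `wbgetentities` API action. It is not the code behind scripts/enrich_wd.sh; the function names, the `User-Agent` string, and the placeholder entity ID `Q0` with its sample payload are assumptions for illustration only.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_alias_url(entity_id, language="en"):
    """Build a wbgetentities request URL asking for labels and aliases."""
    params = {
        "action": "wbgetentities",
        "ids": entity_id,
        "props": "labels|aliases",
        "languages": language,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urllib.parse.urlencode(params)


def extract_names(payload, entity_id, language="en"):
    """Return the label followed by its aliases, without duplicates."""
    entity = payload.get("entities", {}).get(entity_id, {})
    names = []
    label = entity.get("labels", {}).get(language)
    if label:
        names.append(label["value"])
    for alias in entity.get("aliases", {}).get(language, []):
        if alias["value"] not in names:
            names.append(alias["value"])
    return names


def fetch_names(entity_id, language="en"):
    """Query Wikidata over the network (defined here but not called in this sketch)."""
    request = urllib.request.Request(
        build_alias_url(entity_id, language),
        headers={"User-Agent": "alias-sketch/0.1"},  # hypothetical client name
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_names(json.load(response), entity_id, language)


# Hand-written payload in the shape returned by wbgetentities; "Q0" is a placeholder ID.
sample_payload = {
    "entities": {
        "Q0": {
            "labels": {"en": {"language": "en", "value": "asthma"}},
            "aliases": {"en": [{"language": "en", "value": "bronchial asthma"}]},
        }
    }
}
names = extract_names(sample_payload, "Q0")
print(names)
```

Separating URL construction and response parsing from the network call keeps the parsing logic easy to check offline.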

Translations

To run the translation:

bash scripts/translation.sh

The dataset extras can be accessed at this link. To check the English/German questions, fetch the corresponding cells from the original data.

Answer Generation

HF Models (Local)

Create a virtual environment and run the models:

conda create --name hf_local python=3.10
conda activate hf_local

bash scripts/run_hf_models.sh

Alternatively, you can interact with the HF models through Ollama:

bash scripts/run_ollama.sh

HF Inference Endpoint

This endpoint is required for running inference with Llama3.1-70B:

bash scripts/run_llama3.sh

OpenAI Inference Endpoint

Use this script to run inference with the OpenAI models:

bash scripts/run_openai.sh

Evaluation

To evaluate consistency between answers, you first need to parse the answers, and then run the consistency-check function.

Parsing

bash scripts/parse_prompts.sh

After parsing the answers, merge the answer pairs:

bash scripts/merge_results.sh
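To make the merging step concrete, here is a minimal sketch of pairing answers to the same question across languages. It is not the logic of scripts/merge_results.sh: the input layout (a dict of language code to `{question_id: answer}`), the field names, and the sample answers are all assumptions for illustration, and the actual script may pair languages differently (for example, only against English).

```python
from itertools import combinations


def merge_answer_pairs(answers_by_language):
    """Pair up answers to the same question for every pair of languages."""
    pairs = []
    for lang_a, lang_b in combinations(sorted(answers_by_language), 2):
        answers_a = answers_by_language[lang_a]
        answers_b = answers_by_language[lang_b]
        # Only questions answered in both languages can be compared.
        for question_id in sorted(answers_a.keys() & answers_b.keys()):
            pairs.append(
                {
                    "question_id": question_id,
                    "lang_a": lang_a,
                    "lang_b": lang_b,
                    "answer_a": answers_a[question_id],
                    "answer_b": answers_b[question_id],
                }
            )
    return pairs


# Made-up parsed answers keyed by language code and question ID.
sample_answers = {
    "en": {"q1": "There is no reliable evidence.", "q2": "Yes, studies support this."},
    "de": {"q1": "Es gibt keine verlaesslichen Belege."},
    "tr": {"q1": "Guvenilir kanit yok.", "q2": "Evet, calismalar bunu destekliyor."},
}
merged = merge_answer_pairs(sample_answers)
print(len(merged), merged[0]["lang_a"], merged[0]["lang_b"])
```

Questions missing in one of the two languages (here q2 in German) produce no pair, so every merged record can be passed directly to a comparison step.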

Checking Consistency

bash scripts/consistency_check.sh
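As a rough illustration of what a consistency score over merged pairs can look like, the sketch below computes the fraction of pairs whose parsed verdicts agree. This is a simplified stand-in under stated assumptions, not the method implemented in scripts/consistency_check.sh, which may compare the answer texts in a different way; the verdict labels and the function name are hypothetical.

```python
def consistency_rate(verdict_pairs):
    """Return the share of (verdict_a, verdict_b) pairs with identical verdicts."""
    if not verdict_pairs:
        return 0.0
    agreeing = sum(1 for verdict_a, verdict_b in verdict_pairs if verdict_a == verdict_b)
    return agreeing / len(verdict_pairs)


# Made-up verdicts for four question pairs (e.g. English vs. another language).
sample_verdicts = [
    ("yes", "yes"),
    ("no", "yes"),
    ("unclear", "unclear"),
    ("no", "no"),
]
rate = consistency_rate(sample_verdicts)
print(rate)
```

Exact label agreement is the simplest possible criterion; it ignores differences in the supporting explanation, which a text-level comparison would capture.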

References (Bib)

[1] The citation information for our paper.

@misc{schlicht2025llmsprovideconsistentanswers,
      title={Do LLMs Provide Consistent Answers to Health-Related Questions across Languages?}, 
      author={Ipek Baris Schlicht and Zhixue Zhao and Burcu Sayin and Lucie Flek and Paolo Rosso},
      year={2025},
      eprint={2501.14719},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.14719}, 
}

[2] The citation information for the original HealthFC paper.

@inproceedings{vladika-etal-2024-healthfc-verifying,
    title = "{H}ealth{FC}: Verifying Health Claims with Evidence-Based Medical Fact-Checking",
    author = "Vladika, Juraj  and
      Schneider, Phillip  and
      Matthes, Florian",
    editor = "Calzolari, Nicoletta  and
      Kan, Min-Yen  and
      Hoste, Veronique  and
      Lenci, Alessandro  and
      Sakti, Sakriani  and
      Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italy",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.709",
    pages = "8095--8107",
}
