This repository contains the code and data for the paper "A Comparative Analysis of Conversational Large Language Models in Knowledge-Based Text Generation", which was accepted at the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2024).
For citing this study in academic papers, presentations, or theses, please use the following BibTeX entry:
@inproceedings{schneider-etal-2024-comparative,
title = "A Comparative Analysis of Conversational Large Language Models in Knowledge-Based Text Generation",
author = "Schneider, Phillip and
Klettner, Manuel and
Simperl, Elena and
Matthes, Florian",
editor = "Graham, Yvette and
Purver, Matthew",
booktitle = "Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)",
month = mar,
year = "2024",
address = "St. Julian{'}s, Malta",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.eacl-short.31",
pages = "358--367",
abstract = "Generating natural language text from graph-structured data is essential for conversational information seeking. Semantic triples derived from knowledge graphs can serve as a valuable source for grounding responses from conversational agents by providing a factual basis for the information they communicate. This is especially relevant in the context of large language models, which offer great potential for conversational interaction but are prone to hallucinating, omitting, or producing conflicting information. In this study, we conduct an empirical analysis of conversational large language models in generating natural language text from semantic triples. We compare four large language models of varying sizes with different prompting techniques. Through a series of benchmark experiments on the WebNLG dataset, we analyze the models{'} performance and identify the most common issues in the generated predictions. Our findings show that the capabilities of large language models in triple verbalization can be significantly improved through few-shot prompting, post-processing, and efficient fine-tuning techniques, particularly for smaller models that exhibit lower zero-shot performance.",
}
- results: Contains the predictions and evaluations for each model-prompt combination.
- results/human_evaluation: Contains code to create a file of instances for human annotators to label, as well as code for analyzing the annotation results.
- lora_adapter: Contains the WebNLG training dataset and adapter that can be merged with the LLaMA-7B model to create a fine-tuned model. We refer to this model as LLaMA-FT-7B in the paper and as LoRA-7B in the code.
- scripts/WebNLG_Preparation.ipynb: This script converts the XML files of the WebNLG dataset to JSON.
- scripts/WebNLG_Finetune_Dataset.ipynb: This script was used to create the fine-tuning dataset. It produces webnlg_finetune_dataset_chat.json, which we use to create the LoRA-7B model based on LLaMA-7B.
  - Running this script is only needed if you want to change the size of the existing fine-tuning dataset.
- scripts/WebNLG_Finetune.ipynb: Contains the code to fine-tune the LLaMA model using the LoRA (Low-Rank Adaptation) approach and webnlg_finetune_dataset_chat.json as data.
- scripts/WebNLG_DataToText_Prediction.ipynb: Contains the code to generate verbalizations from triples, either with models running on a local server (e.g., LLaMA, Vicuna) or via the OpenAI API (e.g., GPT-3.5-Turbo).
- scripts/WebNLG_DataToText_Evaluation.ipynb: Script to transform the predictions into the format expected by the official WebNLG evaluation script. Since the official evaluation script aggregates the results, we provide additional code to generate evaluation metrics for specific instances to enable a detailed analysis of the results.
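As a rough illustration of the XML-to-JSON conversion that WebNLG_Preparation.ipynb performs, the sketch below parses a small WebNLG-style snippet with Python's standard library. It is not the notebook's code: the repository relies on the WebNLG Corpus XML Reader, whereas this sketch swaps in `xml.etree.ElementTree`. The sample XML, the assumed element layout (`entry`, `mtriple`, `lex`), the function name `entries_to_records`, and the output keys `category` and `references` are assumptions for illustration. Only the triple keys `object`, `property`, and `subject` are taken from the prompts shown in this README.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample in the style of a WebNLG XML file (layout is an assumption).
SAMPLE_XML = """<benchmark>
  <entries>
    <entry category="SportsTeam" eid="Id1" size="1">
      <modifiedtripleset>
        <mtriple>Tennessee_Titans | coach | Mike_Mularkey</mtriple>
      </modifiedtripleset>
      <lex comment="good" lid="Id1">Mike Mularkey is the coach of the Tennessee Titans.</lex>
    </entry>
  </entries>
</benchmark>"""


def entries_to_records(xml_text):
    """Convert WebNLG-style XML into a list of JSON-serializable records."""
    root = ET.fromstring(xml_text)
    records = []
    for entry in root.iter("entry"):
        triples = []
        for mtriple in entry.iter("mtriple"):
            # Each triple is assumed to be stored as "subject | property | object".
            subject, prop, obj = [part.strip() for part in mtriple.text.split(" | ", 2)]
            triples.append({"object": obj, "property": prop, "subject": subject})
        # Collect all reference verbalizations attached to the entry.
        references = [(lex.text or "").strip() for lex in entry.iter("lex")]
        records.append(
            {
                "category": entry.get("category"),
                "triples": triples,
                "references": references,
            }
        )
    return records


records = entries_to_records(SAMPLE_XML)
print(json.dumps(records, indent=2))
```

The actual notebook may keep additional fields or use a different JSON layout, so treat the record structure here as a placeholder.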
- Clone this repository to your workspace
- Download the WebNLG dataset
- Download the WebNLG Corpus XML Reader
- Use the WebNLG_Preparation.ipynb notebook to convert the XML files of WebNLG to JSON
- Set up the LLaMA Large Language Model (LLM)
- Set up the Vicuna LLM using FastChat
- If you want to use OpenAI models (e.g., GPT-3.5-turbo), rename the .env.dist file to .env and add your OpenAI API key
- Use the WebNLG_DataToText_Prediction.ipynb notebook to transform RDF triples into text, using different LLMs and prompts
- Use the WebNLG_DataToText_Evaluation.ipynb notebook to evaluate the generated predictions
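For the OpenAI step above, the following is a minimal sketch of how an API key could be read from a `.env` file using only the standard library. The variable name `OPENAI_API_KEY`, the helper `read_env_file`, and the placeholder key are assumptions. Check `.env.dist` for the name the notebooks actually expect, and note that the notebooks may load the file differently.

```python
import tempfile
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Demo with a temporary file so the sketch runs without touching a real .env.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("# local secrets\nOPENAI_API_KEY=sk-placeholder\n", encoding="utf-8")
    env = read_env_file(env_path)

api_key = env.get("OPENAI_API_KEY")  # assumed variable name
print("key loaded:", api_key is not None)
```

Keep the real `.env` file out of version control so the key is not committed by accident.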
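The applied prompts are defined in the file Prompts.py. The first prompt below is the zero-shot variant; the second is the few-shot variant, which adds three worked examples before the actual input.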
Generate a concise text for the given set of triples. Ensure that the generated output only includes the provided information from the triples.
Input triples: <triples>
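The `<triples>` placeholder in the prompt above is replaced by the triples of each test instance. A minimal sketch of that substitution follows, assuming the triples are inserted as the string form of a Python list of dictionaries (as in the few-shot examples of this README). The names `ZERO_SHOT_TEMPLATE` and `build_zero_shot_prompt` and the line break between the instruction and the input are hypothetical; the real definitions live in Prompts.py.

```python
# Template text copied from the README; the constant name is hypothetical.
ZERO_SHOT_TEMPLATE = (
    "Generate a concise text for the given set of triples. "
    "Ensure that the generated output only includes the provided information "
    "from the triples.\n"
    "Input triples: <triples>"
)


def build_zero_shot_prompt(triples):
    """Fill the <triples> placeholder with the string form of the triples."""
    return ZERO_SHOT_TEMPLATE.replace("<triples>", str(triples))


triples = [{"object": "Mike_Mularkey", "property": "coach", "subject": "Tennessee_Titans"}]
prompt = build_zero_shot_prompt(triples)
print(prompt)
```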
Generate a concise text for the given set of triples. Ensure that the generated output only includes the provided information from the triples.
Input triples: [{’object’: ’Mike_Mularkey’, ’property’: ’coach’, ’subject’: ’Tennessee_Titans’}]
Output text: Mike Mularkey is the coach of the Tennessee Titans.
Input triples: [{’object’: ’Albert_E._Austin’, ’property’: ’successor’, ’subject’: ’Alfred_N._Phillips’}, {’object’: ’Connecticut’, ’property’: ’birthPlace’, ’subject’: ’Alfred_N._Phillips’}, {’object’: ’United_States_House_of_Representatives’, ’property’: ’office’, ’subject’: ’Alfred_N._Phillips’}]
Output text: Albert E. Austin succeeded Alfred N. Phillips who was born in Connecticut and worked at the United States House of Representatives.
Input triples: [{’object’: ’College_of_William_&_Mary’, ’property’: ’owner’, ’subject’: ’Alan_B._Miller_Hall’}, {’object’: ’2009-06-01’, ’property’: ’completionDate’, ’subject’: ’Alan_B._Miller_Hall’}, {’object’: ’101 Ukrop Way’, ’property’: ’address’, ’subject’: ’Alan_B._Miller_Hall’}, {’object’: ’Williamsburg,_Virginia’, ’property’: ’location’, ’subject’: ’Alan_B._Miller_Hall’}, {’object’: ’Robert_A._M._Stern’, ’property’: ’architect’, ’subject’: ’Alan_B._Miller_Hall’}]
Output text: The Alan B Miller Hall’s location is 101 Ukrop Way, Williamsburg, Virginia. It was designed by Robert A.M. Stern and was completed on 1 June 2009. Its owner is the College of William and Mary.
Input triples: <triples>
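The few-shot prompt above follows a regular pattern: the instruction, then alternating "Input triples" and "Output text" lines for each demonstration, then the triples to verbalize. The sketch below assembles a prompt of that shape from a list of demonstration pairs. The function name `build_few_shot_prompt`, the use of newlines as separators, and the single shortened demonstration are assumptions for illustration, not the contents of Prompts.py.

```python
INSTRUCTION = (
    "Generate a concise text for the given set of triples. "
    "Ensure that the generated output only includes the provided information "
    "from the triples."
)


def build_few_shot_prompt(examples, triples):
    """Assemble instruction, demonstrations, and the final input into one prompt.

    `examples` is a list of (triples, reference_text) pairs.
    """
    lines = [INSTRUCTION]
    for example_triples, example_text in examples:
        lines.append(f"Input triples: {example_triples}")
        lines.append(f"Output text: {example_text}")
    # The prompt ends with the new input, mirroring the template in this README.
    lines.append(f"Input triples: {triples}")
    return "\n".join(lines)


# One demonstration taken from the README; the full prompt uses three.
examples = [
    (
        [{"object": "Mike_Mularkey", "property": "coach", "subject": "Tennessee_Titans"}],
        "Mike Mularkey is the coach of the Tennessee Titans.",
    )
]
new_triples = [{"object": "Connecticut", "property": "birthPlace", "subject": "Alfred_N._Phillips"}]
few_shot_prompt = build_few_shot_prompt(examples, new_triples)
print(few_shot_prompt)
```

Keeping the demonstrations in a list makes it straightforward to vary the number of shots when comparing prompting setups.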