FrancescoPeriti/ChatGPTvBERT

(Chat)GPT v BERT: Dawn of Justice for Semantic Change Detection

This is the official repository for our paper (Chat)GPT v BERT: Dawn of Justice for Semantic Change Detection.

Figure 1: The title of this paper draws inspiration from the movie Batman v Superman: Dawn of Justice. In this paper, we leverage the analogy of (Chat)GPT and BERT, powerful and popular PFMs, as two lexical superheroes often erroneously associated with solving similar problems. Our aim is to shed light on the potential of (Chat)GPT for semantic change detection.

❗❗❗ NOTE: ❗❗❗ We are sorry to inform you that, after publication, we discovered an error in the code: we compared ChatGPT via the web interface with ChatGPT via the API, instead of comparing it with the foundational GPT model. Use openai.Completion.create instead of openai.ChatCompletion.create for prompting the OpenAI foundation models.


Abstract

In the universe of Natural Language Processing, Transformer-based language models like BERT and (Chat)GPT have emerged as lexical superheroes with great power to solve open research problems. In this paper, we specifically focus on the temporal problem of semantic change, and evaluate their ability to solve two diachronic extensions of the Word-in-Context (WiC) task: TempoWiC and HistoWiC. In particular, we investigate the potential of a novel, off-the-shelf technology like ChatGPT (and GPT) 3.5 compared to BERT, which represents a family of models that currently stand as the state-of-the-art for modeling semantic change. Our experiments represent the first attempt to assess the use of (Chat)GPT for studying semantic change. Our results indicate that ChatGPT performs significantly worse than the foundational GPT version. Furthermore, our results demonstrate that (Chat)GPT achieves slightly lower performance than BERT in detecting long-term changes but performs significantly worse in detecting short-term changes.

ChatGPT Conversations

To access the answers generated by ChatGPT for multiple experiments, run these commands:

unzip chatgpt-conversations.zip
mv chatgpt-conversations/* .
unzip dump-web-chat.zip

chatgpt-conversations

The chatgpt-conversations folder contains 10 sub-folders named chat-conversation{i} for each experiment run, where i ranges from 1 to 10. Within each chat-conversation{i} sub-folder, you will find the following three sub-folders:
  • HistoWiC: contains the ChatGPT conversations from the experiments on HistoWiC.
  • TempoWiC: contains the ChatGPT conversations from the experiments on TempoWiC.
  • LSC: contains the ChatGPT conversations from the experiments on LSC.

Additionally, both HistoWiC and TempoWiC contain two sub-folders named zsp and fsp, corresponding to Zero-shot prompting and Few-shot prompting, respectively. The LSC folder contains a sub-folder called graded. Each of these sub-folders (i.e. zsp, fsp, graded) contains one file per temperature value tested during the respective run (e.g., 0.0.json, ..., 2.0.json).

Navigate through the folders to access the data related to each experiment run and its corresponding temperature values.

  • chat-conversation1
    • HistoWiC
      • zsp
        • 0.0.json
        • ...
        • 2.0.json
      • fsp
        • 0.0.json
        • ...
        • 2.0.json
    • TempoWiC
      • zsp
        • 0.0.json
        • ...
        • 2.0.json
      • fsp
        • 0.0.json
        • ...
        • 2.0.json
    • LSC
      • graded
        • 0.0.json
        • ...
        • 2.0.json
  (The same structure repeats for chat-conversation2 through chat-conversation10.)
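The conversation files can be traversed programmatically. A minimal sketch, assuming only the folder layout shown above; the internal schema of the JSON files is not documented here, so the example treats each file as opaque JSON keyed by the temperature encoded in its filename (e.g. 0.4.json):

```python
import json
from pathlib import Path

def conversation_paths(root="chatgpt-conversations", runs=range(1, 11)):
    """Yield every conversation JSON file for each run/task/setting combination."""
    settings = {
        "HistoWiC": ("zsp", "fsp"),
        "TempoWiC": ("zsp", "fsp"),
        "LSC": ("graded",),
    }
    for i in runs:
        for task, subs in settings.items():
            for sub in subs:
                folder = Path(root) / f"chat-conversation{i}" / task / sub
                yield from sorted(folder.glob("*.json"))

def load_by_temperature(paths):
    """Group parsed conversations by the temperature encoded in the filename."""
    by_temp = {}
    for p in paths:
        temp = float(p.stem)  # "0.4.json" -> 0.4
        by_temp.setdefault(temp, []).append(json.loads(p.read_text()))
    return by_temp
```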

dump-web-chat

The dump-web-chat folder contains a dump of our ChatGPT Web conversations (ChatGPT -> Setting -> Data controls -> Export data).

Getting Started

Before you begin, ensure you have met the following requirements:

  • Python 3.8+
  • Required Python packages (listed in requirements.txt)

To install the required packages, you can use pip:

pip install -r requirements.txt

Reproducing Results

Data

  • Download data and generate prompts
python download-histowic.py
python download-tempowic.py
python generate-prompts.py

mkdir prompt-data/HistoTempoWiC
cat prompt-data/TempoWiC/zsp.txt > prompt-data/HistoTempoWiC/zsp.txt
tail -n+2 prompt-data/HistoWiC/zsp.txt >> prompt-data/HistoTempoWiC/zsp.txt
mkdir prompt-truth/HistoTempoWiC
cat prompt-truth/TempoWiC/test.txt > prompt-truth/HistoTempoWiC/test.txt
cat prompt-truth/HistoWiC/test.txt >> prompt-truth/HistoTempoWiC/test.txt
cat prompt-truth/TempoWiC/train.txt > prompt-truth/HistoTempoWiC/train.txt
cat prompt-truth/HistoWiC/train.txt >> prompt-truth/HistoTempoWiC/train.txt

mkdir data/HistoTempoWiC
cat data/TempoWiC/test.txt > data/HistoTempoWiC/test.txt
cat data/HistoWiC/test.txt >> data/HistoTempoWiC/test.txt
cat data/TempoWiC/train.txt > data/HistoTempoWiC/train.txt
cat data/HistoWiC/train.txt >> data/HistoTempoWiC/train.txt
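The shell concatenation above can also be done portably in Python. A sketch of the same merge logic; note that for the prompt files, tail -n+2 drops a header line from every file after the first, which the skip_header_after_first flag mirrors here:

```python
from pathlib import Path

def merge_files(sources, destination, skip_header_after_first=False):
    """Concatenate text files into `destination`, optionally dropping the
    first line (a header) of every file after the first -- the Python
    equivalent of `cat a > out; tail -n+2 b >> out`."""
    out_lines = []
    for i, src in enumerate(sources):
        lines = Path(src).read_text().splitlines(keepends=True)
        if skip_header_after_first and i > 0:
            lines = lines[1:]
        out_lines.extend(lines)
    Path(destination).parent.mkdir(parents=True, exist_ok=True)
    Path(destination).write_text("".join(out_lines))
```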
  • Download data for Lexical Semantic Change detection (LSC)
wget https://www2.ims.uni-stuttgart.de/data/sem-eval-ulscd/semeval2020_ulscd_eng.zip
unzip semeval2020_ulscd_eng.zip

ChatGPT - WebInterface

  • Use the bot4chatgpt script to chat with ChatGPT through the OpenAI web GUI. Follow the instructions provided by the script.
python bot4chatgpt.py -d TempoWiC -p ZSp
python bot4chatgpt.py -d TempoWiC -p FSp
python bot4chatgpt.py -d TempoWiC -p MSp
python bot4chatgpt.py -d HistoWiC -p ZSp
python bot4chatgpt.py -d HistoWiC -p FSp
python bot4chatgpt.py -d HistoWiC -p MSp

GPT - API

  • Create a file named 'your_api' containing your OpenAI API token.
  • Chat with ChatGPT through the OpenAI API using various prompts and temperature settings. Execute the following commands (each run will test different temperature values):
python chatgpt-api.py -a your_api -d TempoWiC -p zsp 
python chatgpt-api.py -a your_api -d TempoWiC -p fsp 
python chatgpt-api.py -a your_api -d HistoWiC -p zsp 
python chatgpt-api.py -a your_api -d HistoWiC -p fsp
python chatgpt-api.py -a your_api -d HistoTempoWiC -p zsp  
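As the erratum note at the top of this README points out, the foundation models should be queried with openai.Completion.create rather than openai.ChatCompletion.create. A minimal sketch of that call, assuming the legacy (pre-1.0) openai Python client; the model name below is only an illustrative example, not necessarily the one used by chatgpt-api.py:

```python
def read_api_key(path="your_api"):
    """Read the OpenAI token from the local file, as chatgpt-api.py expects."""
    with open(path) as f:
        return f.read().strip()

def query_foundation_model(prompt, temperature, model="text-davinci-003"):
    """Query a completion-style foundation model (not the chat endpoint)."""
    import openai  # legacy (<1.0) client assumed; imported lazily
    response = openai.Completion.create(
        model=model,
        prompt=prompt,
        temperature=temperature,
        max_tokens=16,
    )
    return response["choices"][0]["text"].strip()
```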

Lexical Semantic Change (LSC)

  • Test ChatGPT's knowledge of historical semantic change.
python chatgpt-api-LSC.py

BERT

  • Extract embeddings
python store-target-embeddings.py -d data/HistoWiC/ --model bert-base-uncased --batch_size 16 --train_set --test_set --use_gpu
python store-target-embeddings.py -d data/TempoWiC/ --model bert-base-uncased --batch_size 16 --train_set --test_set --use_gpu
python store-target-embeddings.py -d data/HistoTempoWiC/ --model bert-base-uncased --batch_size 16 --train_set --test_set --use_gpu
  • Run the following commands to use the Train set as a Dev set (for finding the optimal threshold)
mv data/HistoWiC/target_embeddings/bert-base-uncased/train/ data/HistoWiC/target_embeddings/bert-base-uncased/dev/
mv data/TempoWiC/target_embeddings/bert-base-uncased/train/ data/TempoWiC/target_embeddings/bert-base-uncased/dev/
mv data/HistoTempoWiC/target_embeddings/bert-base-uncased/train/ data/HistoTempoWiC/target_embeddings/bert-base-uncased/dev/
cp data/TempoWiC/train.txt data/TempoWiC/dev.txt
cp data/HistoWiC/train.txt data/HistoWiC/dev.txt
cp data/HistoTempoWiC/train.txt data/HistoTempoWiC/dev.txt
  • Compute BERT stats on Test set
python bert-wic-stats.py -d data/TempoWiC -m bert-base-uncased --test_set --dev_set
python bert-wic-stats.py -d data/HistoWiC -m bert-base-uncased --test_set --dev_set
python bert-wic-stats.py -d data/HistoTempoWiC -m bert-base-uncased --test_set --dev_set
  • Explore the statistics with pandas
import pandas as pd
pd.read_csv('data/HistoWiC/wic_stats.tsv', sep='\t')
pd.read_csv('data/TempoWiC/wic_stats.tsv', sep='\t')
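The Dev set is used above to find an optimal decision threshold. The exact criterion in bert-wic-stats.py is not shown here, but a typical WiC approach is to pick the similarity threshold that maximizes accuracy on dev pairs. An illustrative sketch of that idea, not the repository's actual implementation:

```python
def best_threshold(scores, labels):
    """Pick the score threshold maximizing accuracy over (score, label)
    pairs, where label 1 means 'same meaning' and a pair is predicted
    positive when its score >= threshold. Illustrative only."""
    best_t, best_acc = None, -1.0
    for t in sorted(set(scores)):
        preds = [1 if s >= t else 0 for s in scores]
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc
```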

Plots

  • Run the ChatGPTvBERT.ipynb notebook.

References

@inproceedings{periti-etal-2024-chat,
    title = {{(Chat)GPT v BERT: Dawn of Justice for Semantic Change Detection}},
    author = "Periti, Francesco  and Dubossarsky, Haim  and Tahmasebi, Nina",
    editor = "Graham, Yvette  and Purver, Matthew",
    booktitle = "Findings of the Association for Computational Linguistics: EACL 2024",
    month = mar,
    year = "2024",
    address = "St. Julian{'}s, Malta",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-eacl.29",
    pages = "420--436"
}
@article{periti2024montanelli,
    author = {Periti, Francesco and Montanelli, Stefano},
    title = {{Lexical Semantic Change through Large Language Models: a Survey}},
    year = {2024},
    issue_date = {November 2024},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    volume = {56},
    number = {11},
    issn = {0360-0300},
    url = {https://doi.org/10.1145/3672393},
    doi = {10.1145/3672393},
    journal = {ACM Comput. Surv.},
    month = jun,
    articleno = {282},
    numpages = {38},
    keywords = {Lexical semantics, lexical semantic change, semantic shift detection, large language models}
}
