The purpose of this project is to provide an evaluation framework for intent discovery over the chats conducted with the VIRA chatbot. This repository is released alongside the paper Benchmark Data and Evaluation Framework for Intent Discovery Around COVID-19 Vaccine Hesitancy.
VIRA's chat dataset (VIRADialogs) is available to download from Johns Hopkins Bloomberg School of Public Health. This code base is compatible with the dataset snapshot as of May 2022.
Users are welcome to use the dataset and framework for evaluating new algorithms for intent discovery.
- Clone this repository
- Download VIRADialogs from Johns Hopkins Bloomberg School of Public Health. The dataset downloads as vira_logs.zip.
- Unpack the file in a temporary location and copy the file vira_logs_<DATE>.csv into resources/snapshot
- Activate a python (3.7+) environment
- Install the dependencies listed in requirements.txt
pip install -r requirements.txt
- Run the preparation script for splitting the data to train and test
python prepare.py
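The unpack-and-copy step above can also be scripted. The following is a minimal sketch using only the Python standard library; the function name install_snapshot is hypothetical and not part of this repository, and it assumes the archive contains exactly one file matching vira_logs_<DATE>.csv.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def install_snapshot(zip_path, snapshot_dir="resources/snapshot"):
    """Unpack the archive in a temporary directory and copy the CSV into snapshot_dir."""
    snapshot = Path(snapshot_dir)
    snapshot.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)
        # Assumption: exactly one vira_logs_<DATE>.csv exists somewhere in the archive.
        matches = sorted(Path(tmp).rglob("vira_logs_*.csv"))
        if len(matches) != 1:
            raise FileNotFoundError(
                f"expected one vira_logs_<DATE>.csv, found {len(matches)}"
            )
        target = snapshot / matches[0].name
        # Copy before the temporary directory is cleaned up.
        shutil.copy2(matches[0], target)
    return target
```

Calling install_snapshot("vira_logs.zip") from the repository root would place the CSV under resources/snapshot, after which python prepare.py can be run as described.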
This repository contains the results (predictions) of all systems mentioned in the paper.
To view the results, run the user interface and follow the instructions shown on screen:
streamlit run ui.py
It is possible to reproduce the results of the sIB and K-Means systems, but not those of the KPA and RBC systems, which are closed-source.
- Open the file baselines.py, locate the enum baselines and set the value Runstatus.Run for sIB and K-Means.
- Run the baselines generation file
python baselines.py
- Run the user interface to check the results
streamlit run ui.py
- After inspecting the results, it is helpful to revert sIB and K-Means in the enum baselines to Runstatus.Skip, to avoid re-generating their results in subsequent runs.
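To illustrate why switching entries between Runstatus.Run and Runstatus.Skip controls what gets regenerated, here is a small sketch of the general pattern. It is not the code in baselines.py: the Runstatus definition, the run_selected helper, the list-of-tuples layout, and the use of plain strings in place of Algorithm members are all assumptions made for illustration.

```python
from enum import Enum


class Runstatus(Enum):
    # Hypothetical stand-in for the Runstatus used in baselines.py.
    Run = "run"
    Skip = "skip"


def run_selected(baselines, df, output_root):
    """Call generate(algorithm, df, output_dir) for every entry marked Run.

    `baselines` is assumed to be an iterable of (algorithm, generate, status)
    tuples. Returns the algorithms that were actually executed.
    """
    executed = []
    for algorithm, generate, status in baselines:
        if status is Runstatus.Skip:
            # Skipped entries keep whatever predictions already exist on disk.
            continue
        generate(algorithm, df, f"{output_root}/{algorithm}")
        executed.append(algorithm)
    return executed
```

Under this pattern, leaving an entry on Run means its predictions are regenerated on every invocation, which is why reverting to Skip after a successful run saves time.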
Evaluating a new algorithm is fairly straightforward:
- Edit the file algorithms.py as follows:
  - Add an entry to the enum Algorithm
  - Add the title of the new algorithm to the dictionary titles
  - Add the path where predictions are stored to the dictionary paths
- Edit the file baselines.py as follows:
  - Add a new entry (tuple) to the dictionary baselines with 3 values as described below:
    - The enum of the new algorithm
    - A function for generating the algorithm result. The function should have the signature (algorithm: Algorithm, df: pd.DataFrame, output_dir: str) -> None (see cluster_and_extract_intents in clustering.py for an example). Alternatively, you can put generated_externally if the generation is done by a separate, offline process.
    - The value Runstatus.Run to include the algorithm in the next run of the evaluation.
  - Whether a function was specified or the generation is done externally, the results should be stored in a file named predictions.csv under the output path given to the new algorithm. The CSV consists of 3 columns: slot, intent and id. The slot is a time-frame marker. We use a date format to indicate it, so for example 2021-07-01 is used to indicate the whole month of July 2021. The intent is a predicted intent in that slot, and the id is the id of the text associated with this intent in that slot. For example, see the file under resources/predictions/kmeans.
- Run the baselines generation file
python baselines.py
- Run the user interface to check the results
streamlit run ui.py
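For algorithms whose predictions are generated externally, the predictions.csv layout described above can be produced with the standard library alone. The sketch below is illustrative: the helper names month_slot and write_predictions are hypothetical, monthly slots are assumed (following the 2021-07-01 example), and the presence of a header row is an assumption that should be checked against the file under resources/predictions/kmeans.

```python
import csv
from datetime import date
from pathlib import Path

COLUMNS = ["slot", "intent", "id"]


def month_slot(day):
    """Return the slot marker for the month containing `day` (first day, ISO format)."""
    return day.replace(day=1).isoformat()


def write_predictions(rows, output_dir):
    """Write (day, intent, text_id) triples to <output_dir>/predictions.csv."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "predictions.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # Assumption: the file starts with a header naming the three columns.
        writer.writerow(COLUMNS)
        for day, intent, text_id in rows:
            writer.writerow([month_slot(day), intent, text_id])
    return path
```

Each row pairs one text id with the intent predicted for it in a given slot, so a text dated anywhere in July 2021 is written with the slot 2021-07-01.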
The framework relies on a transformers-based classifier for classifying user utterances from the dataset to (at most) one COVID-19 vaccine intent, and for matching intents discovered by an algorithm to the target intents.
This model is based on RoBERTa large (Liu, 2019), fine-tuned on a dataset of intent expressions available here and also on 🤗 Transformer datasets hub here. The model is available on 🤗 Transformer models hub here. The model is downloaded automatically as part of the evaluation processes.
Users can experiment with alternative models by modifying the reference in the file model.py or by training in a different framework. The dataset of intent expressions can be downloaded manually from the links mentioned above or programmatically as is done in trainer.py.
Copyright IBM Corporation 2022
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
If you would like to see the detailed LICENSE click here.
If you have any questions or issues you can create a new issue here.
Benchmark Data and Evaluation Framework for Intent Discovery Around COVID-19 Vaccine Hesitancy. Shai Gretz, Assaf Toledo, Roni Friedman, Dan Lahav, Rose Weeks, Naor Bar-Zeev, João Sedoc, Pooja Sangha, Yoav Katz, Noam Slonim. arXiv, 2022