This repository contains the code to replicate experiments for our participation at SemEval Task 6: LegalAI. Note that we only participate in subtask (A), predicting Rhetorical Roles.
- Create a folder `data`.
- Place 3 JSON files called `train.json`, `dev.json`, and `test.json` in the folder `data`. These files are provided by the shared task organisers, but have to be renamed accordingly.
- Run `python main.py`. This will train and save all models and output test predictions in the file `test_predictions.pickle`.
- Run `python make_test_predictions.py`. This will create a file called `RR_TEST_DATA_FS.json`, which contains test set predictions in the right format for submission to the shared task.
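
Before starting a long training run, it can help to confirm that the data folder is laid out as described in the steps above. The following is a minimal sketch, not part of this repository: the helper `missing_data_files` is a hypothetical name, and it only checks that the three renamed files exist, not that their contents are valid.

```python
from pathlib import Path

# File names taken from the setup steps; the helper itself is hypothetical
# and is not shipped with this repository.
EXPECTED_FILES = ("train.json", "dev.json", "test.json")


def missing_data_files(data_dir="data"):
    """Return the expected file names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing from data/:", ", ".join(missing))
    else:
        print("All three data files found.")
```

If the script reports missing files, the most likely cause is that the organisers' files have not yet been renamed.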
Note that training the MLPs and fine-tuning the LMs requires a GPU, as well as internet access to download models from the Hugging Face model hub.
- torch (with GPU support)
- transformers
- datasets
- nltk
- numpy
- scipy
- pandas
- tqdm
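
A quick way to check that the packages listed above are installed is sketched below. This is an illustrative snippet rather than part of the repository: `missing_packages` is a hypothetical helper, and it assumes each package's import name matches the name in the list. The GPU check uses `torch.cuda.is_available()` and only runs once every package is found.

```python
import importlib.util

# Import names assumed to match the package names in the list above.
REQUIRED = ("torch", "transformers", "datasets", "nltk",
            "numpy", "scipy", "pandas", "tqdm")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located as importable modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        import torch  # imported late so the check above works without it

        print("CUDA available:", torch.cuda.is_available())
```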