This repository hosts the resources for the EMNLP 2021 paper "Learning Constraints and Descriptive Segmentation for Subevent Detection". It contains the source code and datasets used in the paper.
Event mentions in text correspond to real-world events of varying degrees of granularity. The task of subevent detection aims to resolve this granularity issue, recognizing the membership of multi-granular events in event complexes. Since knowing the span of descriptive contexts of event complexes helps infer the membership of events, we propose the task of event-based text segmentation (EventSeg) as an auxiliary task to improve the learning for subevent detection. To bridge the two tasks together, we propose an approach to learning and enforcing constraints that capture dependencies between subevent detection and EventSeg prediction, as well as guiding the model to make globally consistent inference. Specifically, we adopt Rectifier Networks for constraint learning and then convert the learned constraints to a regularization term in the loss function of the neural model. Experimental results show that the proposed method outperforms baseline methods by 2.3% and 2.5% on benchmark datasets for subevent detection, HiEve and IC, respectively, while achieving a decent performance on EventSeg prediction.
Two datasets (HiEve and IC) are used for training in the paper.
git clone git@github.com:CogComp/Subevent_EventSeg.git
conda env create -n conda-env -f env/environment.yml
pip install -r env/requirements.txt
python -m spacy download en_core_web_sm
mkdir rst_file
mkdir config
mkdir output_redirect
mkdir model_params
cd model_params
mkdir HiEve_best
mkdir IC_best
mkdir cons_learn
cd ..
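The directory setup above can also be scripted. The following is a minimal sketch, not part of the repository: the helper name `create_layout` is hypothetical, and only the directory names are taken from the commands above. It is safe to run more than once because existing directories are left untouched.

```python
from pathlib import Path

# Directories named in the setup commands above, relative to the repository root.
REQUIRED_DIRS = [
    "rst_file",
    "config",
    "output_redirect",
    "model_params/HiEve_best",
    "model_params/IC_best",
    "model_params/cons_learn",
]


def create_layout(root="."):
    """Create every required directory under `root` and return the paths.

    Hypothetical helper: it mirrors the mkdir commands and nothing more.
    """
    created = []
    for rel in REQUIRED_DIRS:
        path = Path(root) / rel
        # parents=True creates model_params as needed; exist_ok=True makes reruns harmless.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `create_layout()` from the repository root should give the same layout as the shell commands.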
python main.py <DEVICE_ID> <RESULT_FILE>
<DEVICE_ID>: choose from "gpu_0", "gpu_1", "gpu_5,6,7", etc.
<RESULT_FILE>: for example, "1236.rst"
nohup python main.py gpu_1 1236.rst > output_redirect/1236.out 2>&1 &
To look at the standard output: cat output_redirect/1236.out
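To make the `<DEVICE_ID>` format concrete, here is a small sketch of one plausible way to read strings such as "gpu_1" or "gpu_5,6,7". The function `parse_device_id` is hypothetical and is not taken from `main.py`, which may interpret the argument differently.

```python
def parse_device_id(device_id):
    """Split a string such as "gpu_5,6,7" into a list of integer GPU indices.

    Hypothetical helper for illustration; the repository's own parsing may differ.
    """
    prefix, sep, indices = device_id.partition("_")
    if prefix != "gpu" or not sep or not indices:
        raise ValueError(f"expected something like 'gpu_0', got {device_id!r}")
    # int() raises ValueError on empty or non-numeric entries such as "gpu_5,".
    return [int(index) for index in indices.split(",")]


def to_visible_devices(device_id):
    """Format the indices the way the CUDA_VISIBLE_DEVICES variable expects."""
    return ",".join(str(index) for index in parse_device_id(device_id))
```

Under these assumptions, `to_visible_devices("gpu_5,6,7")` gives a string that could be assigned to `CUDA_VISIBLE_DEVICES` before launching training.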
Input should be a JSON file that contains a list of dictionaries. Each dictionary contains six key-value pairs: two sentences, two character indices denoting the start positions of the events, and the two event mentions. Examples can be found in the example folder.
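As a rough illustration of the six fields described above, the sketch below builds one input record and writes a list of records to a JSON file. All key names (`sentence_1`, `e1_start_char`, and so on) and both helper functions are hypothetical; the actual key names should be copied from the files in the example folder.

```python
import json


def make_pair(sentence_1, mention_1, sentence_2, mention_2):
    """Build one record with two sentences, two start offsets, and two mentions.

    The key names are placeholders, not the repository's schema.
    """
    # str.find returns the first occurrence, so repeated words need manual offsets.
    start_1 = sentence_1.find(mention_1)
    start_2 = sentence_2.find(mention_2)
    if start_1 < 0 or start_2 < 0:
        raise ValueError("each event mention must occur in its sentence")
    return {
        "sentence_1": sentence_1,
        "sentence_2": sentence_2,
        "e1_start_char": start_1,
        "e2_start_char": start_2,
        "mention_1": mention_1,
        "mention_2": mention_2,
    }


def write_input(records, path):
    """Write the list of records as a single JSON array."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2)
```

A useful sanity check before prediction is that slicing each sentence at its start offset reproduces the event mention.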
Output will also be a JSON file under the output folder. It contains a dictionary with two key-value pairs: the labels and the predicted probabilities.
python predict.py <INPUT_FILE> <OUTPUT_FILE>
<INPUT_FILE>: a json file
<OUTPUT_FILE>: name for a json file
python predict.py example/subevent_example_input.json predict_subevent.json
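Once prediction has finished, the output file can be inspected with a few lines of Python. This is a sketch under two assumptions that are not confirmed by the repository: each of the two values is a list with one entry per input pair, and the file is found at whatever path you pass in. The function `summarize_output` is hypothetical and deliberately does not hard-code the key names.

```python
import json


def summarize_output(path):
    """Load a prediction file and report how many entries each key holds.

    Assumes a dictionary with exactly two keys whose values are lists.
    """
    with open(path, encoding="utf-8") as handle:
        result = json.load(handle)
    if not isinstance(result, dict) or len(result) != 2:
        raise ValueError("expected a dictionary with two key-value pairs")
    # Both lists should have the same length if there is one prediction per pair.
    return {key: len(value) for key, value in result.items()}
```

If the two reported lengths differ, the file is probably not in the format described above.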
Store 0429_5_cons.pt and 0429_6_cons.pt under model_params/cons_learn; store 1236.pt and 1233.pt under model_params/IC_best and model_params/HiEve_best, respectively.
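Before running prediction, it can help to confirm that the checkpoint files are where the instructions above place them. The sketch below is not part of the repository and the function `missing_checkpoints` is hypothetical; the file-to-folder mapping follows the order given above (1236.pt in IC_best, 1233.pt in HiEve_best).

```python
from pathlib import Path

# Expected checkpoint locations, relative to the repository root.
EXPECTED_CHECKPOINTS = {
    "model_params/cons_learn": ["0429_5_cons.pt", "0429_6_cons.pt"],
    "model_params/IC_best": ["1236.pt"],
    "model_params/HiEve_best": ["1233.pt"],
}


def missing_checkpoints(root="."):
    """Return the expected checkpoint paths that do not exist under `root`."""
    missing = []
    for folder, names in EXPECTED_CHECKPOINTS.items():
        for name in names:
            path = Path(root) / folder / name
            if not path.is_file():
                missing.append(str(path))
    return missing
```

An empty list from `missing_checkpoints()` means all four files are in place.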
Changing the google account settings
Bibtex:
@inproceedings{WZCR21,
author = {Haoyu Wang and Hongming Zhang and Muhao Chen and Dan Roth},
title = {{Learning Constraints and Descriptive Segmentation for Subevent Detection}},
booktitle = {Proc. of the Conference on Empirical Methods in Natural Language Processing (EMNLP)},
year = {2021},
url = "https://cogcomp.seas.upenn.edu/papers/WZCR21.pdf",
funding = {KAIROS, BETTER},
}