Does Petrarch2 take care of Event Coreference Resolution? #16
Comments
It depends on what you mean by event coreference resolution. If you mean something like cross-document or cross-sentence linkage of events, then no. If you mean whether PETRARCH2 will return multiple copies of the same coded event per sentence, then also no.
By event coreference resolution, do you mean determining whether multiple texts that code to the same event tuple refer to the same thing? If so, no. In most of the work up until the past five years, event data sets were generally coded from a single source (typically Reuters or Agence France-Presse for the machine-coded data, the New York Times in the human-coded systems prior to that), and this wasn't a big issue because it was fairly easy to detect multiple stories reporting on the same actions. With the advent of data sets generated from large numbers of sources (ICEWS, Phoenix) it is a very big issue, and the "one-a-day" filter method that most systems use (including the Phoenix pipeline; ICEWS apparently does no deduplication) has some decided drawbacks: this paper (http://eventdata.parusanalytics.com/papers.dir/Schrodt.TAD-NYU.EventData.pdf) discusses the issue in detail. There's an emerging consensus that we need to do document-level resolution first, either by de-duplication (there is a large NLP literature on this) or by clustering (some method similar to Google News or European Media Monitor), but we haven't worked out any open source solutions for this yet.
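To make the document-level de-duplication idea concrete, here is a rough sketch of one common approach from that literature: comparing stories by the Jaccard similarity of their word shingles. This is only an illustration, not something PETRARCH2 or the Phoenix pipeline provides; the function names, the shingle size `k`, and the `threshold` value are all assumptions chosen for the example.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles in a text (lowercased, whitespace-split)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def near_duplicates(docs, threshold=0.8):
    """Return index pairs (i, j), i < j, whose shingle sets meet the threshold.

    The 0.8 threshold is an arbitrary illustrative value, not a tuned one.
    """
    sets = [shingles(d) for d in docs]
    pairs = []
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            if jaccard(sets[i], sets[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

The pairwise comparison is quadratic in the number of stories, so a real multi-source feed would likely need something like MinHash or locality-sensitive hashing on top of this.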
Thanks for the clarification!
PETRARCH does that within a sentence. For cross-sentence duplicates, we apply a one-a-day filter to the final output. See the phoenix_pipeline for more details on that, specifically this script. In other words, PETRARCH aims to do one thing: code event data. Pre- or post-processing is designed to occur elsewhere.
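For readers unfamiliar with the idea, a one-a-day filter can be sketched roughly as follows: keep at most one event per day for each unique (source, target, event code) combination. This is a minimal illustration under that assumption, not the actual phoenix_pipeline code; the `Event` fields, the `one_a_day` function name, and the sample values are hypothetical.

```python
from collections import namedtuple

# Hypothetical event record; the real pipeline's output format may differ.
Event = namedtuple("Event", ["date", "source", "target", "code", "story_id"])


def one_a_day(events):
    """Keep the first event seen for each (date, source, target, code) key."""
    seen = set()
    kept = []
    for ev in events:
        key = (ev.date, ev.source, ev.target, ev.code)
        if key not in seen:
            seen.add(key)
            kept.append(ev)
    return kept


events = [
    Event("20150101", "USA", "RUS", "042", "story-1"),
    Event("20150101", "USA", "RUS", "042", "story-2"),  # same tuple, same day: dropped
    Event("20150102", "USA", "RUS", "042", "story-3"),  # next day: kept
    Event("20150101", "RUS", "USA", "042", "story-4"),  # reversed actors: kept
]
filtered = one_a_day(events)
```

Note that the filter cannot tell whether the two dropped stories described the same real-world action or two distinct ones that happened to code to the same tuple, which is the kind of drawback mentioned above.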