How to annotate citations in scientific articles? #381

Open
arademaker opened this Issue Dec 15, 2016 · 6 comments


@arademaker
Contributor
arademaker commented Dec 15, 2016 edited

Examples:

It first transforms the graph into a tree, then decodes into a string using a weighted tree-to-string transducer and a language model (Graehl and Knight, 2004).

But we can also have things like

Mamede, and Markov (2014) discuss lexical and parsing issues of integrating a lexicon-grammar of Portuguese.

Besides the format, the citation can be related to the whole sentence or only to a specific word or phrase.

@nschneid
Contributor
nschneid commented Dec 15, 2016 edited

See http://www.isi.edu/~ulf/amr/lib/amr-dict.html#citation — does that answer your question?

@jnivre
Contributor
jnivre commented Dec 16, 2016

For parenthetical citations, I would use parataxis. For syntactic citations, whatever relation is relevant in the syntactic structure (nsubj in the above example). For the internal structure of the citation, I would probably choose flat in v2. How does that sound?

@nschneid
Contributor

(Oops, I got my GitHub issue trackers crossed. Ignore my comment about AMR.)

@livyreal

Cool, @nschneid, we are thinking about AMRs too. :)

@jnivre I think parataxis is good too, but for the internal structure I was thinking of using nummod. nummod is for quantities, but I'd like to preserve the information that we have dates (numbers) inside the reference.

@dan-zeman
Member

I agree with @jnivre about the external relations. Within the citation, I would stick with the normal guidelines whenever applicable. Graehl and Knight is coordination. 2004 is nummod. Although I think it would be nice to distinguish numbers used as identifiers from numbers used as quantities, we do not do that in UD (except for SynTagRus, which I believe has a language-specific extension of nummod for that purpose).

@jnivre
Contributor
jnivre commented Dec 16, 2016

Graehl and Knight is definitely coordination. Using nummod for 2004 is fine too, although I think eventually some subtype of flat could be an alternative (to capture the distinction @dan-zeman cares about).
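
To make the proposal above concrete, here is a minimal Python sketch that writes out the parenthetical citation from the first example as CoNLL-U-style rows. It is only an illustration under some assumptions: "model" stands in for whatever word the citation attaches to, Graehl is taken as the head of the citation because it is the first conjunct, and the attachment of the comma and parentheses is a guess. The helper names `to_conllu` and `children` are made up for this example.

```python
# Each token: (id, form, head, deprel). Head 0 marks the root of this fragment.
# Assumption: "model" stands in for the word the citation attaches to.
TOKENS = [
    (1, "model", 0, "root"),
    (2, "(", 3, "punct"),          # assumed attachment for the brackets
    (3, "Graehl", 1, "parataxis"), # parenthetical citation -> parataxis
    (4, "and", 5, "cc"),           # UD v2: cc attaches to the following conjunct
    (5, "Knight", 3, "conj"),      # "Graehl and Knight" is coordination
    (6, ",", 7, "punct"),          # assumed attachment for the comma
    (7, "2004", 3, "nummod"),      # year as nummod, as suggested in the thread
    (8, ")", 3, "punct"),
]


def to_conllu(tokens):
    """Render tokens as 10-column CoNLL-U lines; unused columns are '_'."""
    lines = []
    for tid, form, head, deprel in tokens:
        cols = [str(tid), form, "_", "_", "_", "_", str(head), deprel, "_", "_"]
        lines.append("\t".join(cols))
    return "\n".join(lines)


def children(tokens, head_id):
    """Return (form, deprel) pairs for all direct dependents of head_id."""
    return [(form, deprel) for _, form, head, deprel in tokens if head == head_id]


if __name__ == "__main__":
    print(to_conllu(TOKENS))
```

For a syntactic citation such as the second example, the same internal structure would apply, with the head of the citation taking nsubj instead of parataxis.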
