The V2 documentation for punct says
"A punctuation mark separating coordinated units is attached to the first conjunct."
but gives an example:
We have apples , pears , oranges , and bananas
where the commas are attached to pears, oranges and bananas, respectively. Shouldn't all commas in this example be attached to apples, if we follow the guideline?
That is a bug in the documentation. The example is right; the guideline text is the V1 rule, not the V2 one.
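To make the difference concrete, here is a minimal sketch of the two comma attachments for the example sentence. The token ids are 1-based as in CoNLL-U; the variable and function names are made up for illustration and are not part of any UD tool.

```python
# Token ids are 1-based, following the CoNLL-U convention.
tokens = ["We", "have", "apples", ",", "pears", ",", "oranges", ",", "and", "bananas"]

# Head of each comma (token id -> head id) under the two readings discussed here.
comma_heads_v1 = {4: 3, 6: 3, 8: 3}    # quoted guideline: every comma on the first conjunct
comma_heads_v2 = {4: 5, 6: 7, 8: 10}   # documented example: pears, oranges, bananas


def describe(comma_heads):
    """Return (dependent form, head form) pairs in token order."""
    return [(tokens[dep - 1], tokens[head - 1])
            for dep, head in sorted(comma_heads.items())]


print(describe(comma_heads_v1))
print(describe(comma_heads_v2))
```

The first line printed pairs every comma with "apples", the second pairs the commas with "pears", "oranges" and "bananas" in turn, which is what the example in the documentation shows.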
Taking the same example, why not use the relation cc instead of punct for cases where the punctuation mark plays the role of a conjunction like "and"?
The analysis of commas as coordinating conjunctions is controversial, and there would be lots of tricky borderline cases. On a more general note, we lack a good theory of the role played by punctuation in syntactic structure, and the UD policy is therefore (a) to give all punctuation marks the relation "punct" and (b) to exclude these links when evaluating parsers. Any exception to this policy would have unwanted negative effects.
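As a rough illustration of point (b), the sketch below computes an unlabeled attachment score that skips tokens whose gold relation is "punct". It is a simplified stand-in, not the official evaluation script; the function name, the data layout, and the hypothetical parser output are all assumptions made for this example.

```python
def attachment_score(gold, predicted, skip_punct=True):
    """Unlabeled attachment score over parallel lists of (head, deprel) pairs.

    Tokens whose gold relation is "punct" are ignored when skip_punct is True.
    """
    correct = 0
    total = 0
    for (gold_head, gold_rel), (pred_head, _pred_rel) in zip(gold, predicted):
        if skip_punct and gold_rel == "punct":
            continue
        total += 1
        if gold_head == pred_head:
            correct += 1
    return correct / total if total else 0.0


# Gold analysis of "We have apples , pears , oranges , and bananas"
# as (head id, relation) per token, with 1-based ids and 0 for the root.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (5, "punct"), (3, "conj"),
        (7, "punct"), (3, "conj"), (10, "punct"), (10, "cc"), (3, "conj")]

# Hypothetical parser output that attaches every comma to "apples" (id 3).
predicted = [(2, "nsubj"), (0, "root"), (2, "obj"), (3, "punct"), (3, "conj"),
             (3, "punct"), (3, "conj"), (3, "punct"), (10, "cc"), (3, "conj")]

print(attachment_score(gold, predicted))
print(attachment_score(gold, predicted, skip_punct=False))
```

With these inputs the score is 1.0 when punct links are skipped and 0.7 when they are counted, so a parser is not penalized for its choice of comma attachment under the policy described above.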
Hi @jnivre, thank you for your answer. Can you point me to some references about these issues?
Not easily. There is very little written on these issues, so the policy is mainly based on practical experience. On the theoretical side, there is the issue that punctuation does not exist at all in spoken language, which is the primary form of all languages.
Fixed the documentation in 983a063, closing the issue.