Clitics with discourse function #709
Comments
We have such particles in Naija, an English-based creole spoken in Nigeria. When they attach to words, such as sha, we attached them with @dan-zeman By the way, we had a problem with the validator when they attach to a CCONJ, because When they are more discursive, such as o, we used |
Yes, this is the same problem I'm having, with At least for
|
Out of curiosity, do they modify the words they are attached to semantically, or are they really clause level evidentials, emphatics, etc. that just happen to piggy-back on adjacent phonological units? Are they selective in any way with regard to the host word? |
If these particles are to be attached to the single words, it seems that By the way, what is the other particle э which always accompanies the ъым in your sample? |
@Stormur so if I have a
The glossing guidelines say that this is "the most frequent particle" but do not give any other information, and neither Dunn's grammar nor the sketch by Muravyova mentions it. I've asked the original annotators this question as well as @amir-zeldes's; let's see what they say. |
I tried it and apparently it didn't. In Latin too we are going to use some particles with |
I tried it, and it doesn't seem to work; I get the same errors,
Here is an example that breaks,
|
Sorry, my bad. Yes, the validator treats any dependency of a What does work is that But still, if the same particle appears over and over in the same sentence, isn't it really just a discourse tied to the root? And are we really going to have non-projectivities? |
Yes, I was aware of the Finnish solution, but it does not convince me. I think that if the clitic deserves to be annotated like that, it is not just a variant of that word and has its own syntactic role. So I would stick just to Just some thoughts about |
@amir-zeldes @Stormur here is the answer I got back:
|
I support the view that these particles should not be @Stormur : I think there are not many (if any) restrictions the validator places on morphological features at present. But I hope we will be able to add tests of that sort in the future :-) |
@dan-zeman that sounds reasonable. Now the file passes validation :) |
I'm working with a spoken corpus of Chukchi where there are clitics with a discourse function, "particles", as Dunn describes:
Example 018:
I have been annotating them with the discourse relation and attaching them to the word they are cliticised to. However, the guidelines for the discourse relation state that they should be attached "to the head of the most relevant nearby clause".

This is a bit inconvenient when there are a lot of these in a single clause... pretty much anything can have them attached, as Dunn notes. For example in the following tree,
Applying the rule to this sentence, we would end up with a tree like,
It isn't the case in this example, but one could imagine a lot of non-projectivities could be introduced because of this.
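To make the non-projectivity worry concrete, here is a minimal Python sketch that lists the non-projective arcs of a tree given as a list of head indices. The four-token sentence in the usage note (adjective, clitic, noun, verb) is a made-up illustration, not data from the Chukchi corpus, and the function name is hypothetical.

```python
def non_projective_arcs(heads):
    """Return (head, dependent) pairs for non-projective arcs.

    heads[i] is the 1-based head of token i+1; 0 marks the root.
    An arc is projective when every token strictly between the head
    and the dependent is dominated by the head.
    """
    def dominated_by(tok, ancestor):
        # Walk up the head chain from tok until we meet ancestor or the root.
        while tok != 0:
            if tok == ancestor:
                return True
            tok = heads[tok - 1]
        return False

    bad = []
    for dep, head in enumerate(heads, start=1):
        if head == 0:
            continue  # the root arc has no real head token to check
        lo, hi = sorted((head, dep))
        if any(not dominated_by(k, head) for k in range(lo + 1, hi)):
            bad.append((head, dep))
    return bad


# Hypothetical tokens: 1 = adjective, 2 = clitic, 3 = noun, 4 = verb (root).
host_attached = [3, 1, 4, 0]    # clitic attached to its host word
clause_attached = [3, 4, 4, 0]  # clitic attached to the clause head
```

With the clitic on its host, `non_projective_arcs(host_attached)` finds nothing. With the clitic moved to the clause head, the adjective-noun arc now spans a token that the noun does not dominate, so `non_projective_arcs(clause_attached)` reports the arc `(3, 1)`.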
The validator does not complain about most cases, but it does complain about cc having dependents,

So, I'd like to ask if anyone has any suggestions? Does this happen in other languages, and what are the potential solutions? I've seen that Czech has advmod:emph, would that do the job?
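For anyone who wants to find the affected tokens before running the validator, the following is a rough Python sketch that flags every cc token with at least one dependent. It is an assumption-laden simplification, not the validator's actual logic (the real tool may allow some dependents), and the function name and the `(id, head, deprel)` triple format are hypothetical.

```python
def cc_with_dependents(tokens):
    """Map each cc token id to the list of its dependents.

    tokens: list of (id, head, deprel) triples with integer ids.
    Subtypes such as cc:preconj are treated as cc. This is a
    simplified check, NOT the official validator rule.
    """
    cc_ids = {tid for tid, _, deprel in tokens
              if deprel.split(":")[0] == "cc"}
    flagged = {}
    for tid, head, deprel in tokens:
        if head in cc_ids:
            flagged.setdefault(head, []).append((tid, deprel))
    return flagged


# Hypothetical three-token fragment: a conjunction hosting a discourse clitic.
example = [(1, 3, "cc"), (2, 1, "discourse"), (3, 0, "root")]
```

Here `cc_with_dependents(example)` returns a dictionary with the conjunction (token 1) as key and its clitic dependent as value, which is the configuration the validator complains about.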