Add information about the POS of the whole MWE #11

Open
bguil opened this issue Oct 20, 2017 · 0 comments

For a multiword expression (MWE), the POS of the whole expression is not given.

For instance, in the example below, en is ADP and particulier is ADJ, but en particulier as a whole is ADV.

12	et	et	CCONJ	_	_	17	cc	_	_
13	en	en	ADP	_	_	12	advmod	_	_
14	particulier	particulier	ADJ	_	Gender=Masc|Number=Sing	13	fixed	_	_
15	à	à	ADP	_	_	17	case	_	_
16	l'	le	DET	_	Definite=Def|Number=Sing|PronType=Art	17	det	_	SpaceAfter=No
17	inspecteur	inspecteur	NOUN	_	Gender=Masc|Number=Sing	9	conj	_	_

There is no standard in UD for this.

This problem is mentioned here.
I propose following the UD_Portuguese treebank and using the MISC column for this. For instance:

7	,	,	PUNCT	PU|@PU	_	6	punct	_	_
8	por	por	ADP	PRP|@<ADVL	_	6	cc	_	MWE=por_exemplo|MWEPOS=CCONJ
9	exemplo	exemplo	NOUN	N|M|S|@P<	Gender=Masc|Number=Sing	8	fixed	_	SpaceAfter=No
10	,	,	PUNCT	PU|@PU	_	15	punct	_	_

And then, for en particulier, we would have:

12	et	et	CCONJ	_	_	17	cc	_	_
13	en	en	ADP	_	_	12	advmod	_	MWE=en_particulier|MWEPOS=ADV
14	particulier	particulier	ADJ	_	Gender=Masc|Number=Sing	13	fixed	_	_
15	à	à	ADP	_	_	17	case	_	_
16	l'	le	DET	_	Definite=Def|Number=Sing|PronType=Art	17	det	_	SpaceAfter=No
17	inspecteur	inspecteur	NOUN	_	Gender=Masc|Number=Sing	9	conj	_	_
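
To illustrate how a consumer could read the proposed annotation, here is a minimal sketch in Python. It assumes the `MWE=` and `MWEPOS=` keys sit in the MISC column (column 10) of the first token of the expression, as in the examples above. The helper names `parse_misc` and `mwe_annotations` are hypothetical and are not part of any existing tool.

```python
def parse_misc(misc):
    """Parse a CoNLL-U MISC field ("A=1|B=2" or "_") into a dict."""
    if misc == "_":
        return {}
    # Keep only key=value items; split on the first "=" only.
    return dict(item.split("=", 1) for item in misc.split("|") if "=" in item)


def mwe_annotations(conllu_lines):
    """Yield (token_id, form, mwe, mwepos) for tokens whose MISC has an MWE key."""
    for line in conllu_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed CoNLL-U token line
        misc = parse_misc(cols[9])
        if "MWE" in misc:
            # MWEPOS is assumed optional here; None if it is absent.
            yield cols[0], cols[1], misc["MWE"], misc.get("MWEPOS")


# Two token lines taken from the proposed "en particulier" annotation.
sample = [
    "13\ten\ten\tADP\t_\t_\t12\tadvmod\t_\tMWE=en_particulier|MWEPOS=ADV",
    "14\tparticulier\tparticulier\tADJ\t_\tGender=Masc|Number=Sing\t13\tfixed\t_\t_",
]
print(list(mwe_annotations(sample)))
# → [('13', 'en', 'en_particulier', 'ADV')]
```

Because the information lives in MISC, tools that ignore that column keep working unchanged, and the `fixed` relations still identify which tokens belong to the expression.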