Unify the format of interrogative adverb entries #41

Closed
leoalenc opened this issue May 21, 2022 · 8 comments


leoalenc commented May 21, 2022

At the moment, following Navarro (2016), the glossary has:

masuí (adv.) - 1. (interr.) de onde? 2. (afirm.) de onde; donde
mayé 1 (adv. interr.) - como?

Based on the first entry, the tagger classifies masuí only as adv., whereas it should be classified as adv. interr., just like mayé.
Another question is the word class in examples such as this one (NAVARRO, 2016, p. 17):

Nti akwáu masuí Pedro usika. - I don't know where Pedro arrived from.
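The mismatch between the two glossary entries above can be made concrete with a minimal sketch. It assumes entries follow the shape `lemma [homonym number] (class) - gloss`, and that a sense-level marker such as `(interr.)` should be promoted into the word class. The function name `entry_tags` and the regular expressions are hypothetical and are not taken from the project's code.

```python
import re

# Hypothetical pattern for one glossary line:
# lemma, optional homonym number, word class in parentheses, " - ", gloss.
ENTRY = re.compile(
    r"^(?P<lemma>\S+)(?:\s+\d+)?\s+\((?P<pos>[^)]+)\)\s+-\s+(?P<gloss>.+)$"
)
# Numbered senses inside the gloss that carry their own marker, e.g. "1. (interr.)".
SENSE = re.compile(r"\d+\.\s+\((?P<marker>[^)]+)\)")


def entry_tags(line):
    """Return (lemma, tags), promoting sense-level '(interr.)' into the class."""
    m = ENTRY.match(line)
    if m is None:
        raise ValueError(f"unrecognized entry: {line!r}")
    pos = m.group("pos").strip()
    markers = SENSE.findall(m.group("gloss"))
    if pos == "adv." and markers:
        # One tag per sense: interrogative senses become "adv. interr.",
        # every other sense keeps the plain "adv." class.
        tags = ["adv. interr." if mk == "interr." else "adv." for mk in markers]
    else:
        tags = [pos]
    return m.group("lemma"), tags


print(entry_tags("masuí (adv.) - 1. (interr.) de onde? 2. (afirm.) de onde; donde"))
print(entry_tags("mayé 1 (adv. interr.) - como?"))
```

Under these assumptions both entries yield the tag adv. interr., so the tagger would no longer depend on which of the two formats an entry happens to use.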

@leoalenc leoalenc added invalid This doesn't seem right question Further information is requested lexicon This issue relates to lexical data labels May 21, 2022
@leoalenc leoalenc self-assigned this May 21, 2022
@leoalenc
Contributor Author

The Penn Treebank tagset uses only the tag WRB (Wh-adverb).
UDPipe 2.0, with the english-ewt-ud-2.6-200830 model, analyzes where in the example below as ADV | WRB | PronType=Int.

I don't know where these people live.

@leoalenc
Contributor Author

The adverb where is also analyzed as a relative in this treebank (http://universal.grew.fr/?custom=6289182c06ed8):

# newdoc id = n01068
# sent_id = n01068029
# text = The constituency is in the council area of North Kesteven, where 62% of voters backed leaving the EU.
1 The the DET DT Definite=Def|PronType=Art 2 det _ _
2 constituency constituency NOUN NN Number=Sing 7 nsubj _ _
3 is be AUX VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 7 cop _ _
4 in in ADP IN _ 7 case _ _
5 the the DET DT Definite=Def|PronType=Art 7 det _ _
6 council council NOUN NN Number=Sing 7 compound _ _
7 area area NOUN NN Number=Sing 0 root _ _
8 of of ADP IN _ 10 case _ _
9 North north ADJ JJ Degree=Pos 10 compound _ _
10 Kesteven Kesteven PROPN NNP Number=Sing 7 nmod _ SpaceAfter=No
11 , , PUNCT , _ 17 punct _ _
12 where where ADV WRB PronType=Rel 17 advmod _ _
13 62 62 NUM CD NumType=Card 14 nummod _ SpaceAfter=No
14 % % SYM NN Number=Sing 17 nsubj _ _
15 of of ADP IN _ 16 case _ _
16 voters voter NOUN NNS Number=Plur 14 nmod _ _
17 backed back VERB VBD Mood=Ind|Tense=Past|VerbForm=Fin 7 acl:relcl _ _
18 leaving leave VERB VBG VerbForm=Ger 17 xcomp _ _
19 the the DET DT Definite=Def|PronType=Art 20 det _ _
20 EU EU PROPN NNP Number=Sing 18 obj _ SpaceAfter=No
21 . . PUNCT . _ 7 punct _ _
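To compare analyses like the one above across treebanks, it helps to pull UPOS, XPOS and PronType out of a token line. The following is a minimal sketch, not a full CoNLL-U reader: it assumes ten whitespace-separated columns (real CoNLL-U files use tabs, and forms containing spaces would break this split). The names `Token` and `parse_token` are hypothetical.

```python
from typing import Dict, NamedTuple


class Token(NamedTuple):
    form: str
    upos: str
    xpos: str
    feats: Dict[str, str]
    deprel: str


def parse_token(line: str) -> Token:
    """Parse one CoNLL-U token line into the fields relevant to this issue."""
    cols = line.split()
    if len(cols) != 10:
        raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
    feats: Dict[str, str] = {}
    if cols[5] != "_":  # "_" marks an empty FEATS column
        for pair in cols[5].split("|"):
            key, value = pair.split("=", 1)
            feats[key] = value
    return Token(cols[1], cols[3], cols[4], feats, cols[7])


tok = parse_token("12 where where ADV WRB PronType=Rel 17 advmod _ _")
print(tok.upos, tok.xpos, tok.feats.get("PronType"), tok.deprel)
# prints: ADV WRB Rel advmod
```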

@leoalenc
Contributor Author

In the Bosque corpus, onde is tagged as PRON with PronType=Rel in this example:

# text = Meu avô materno foi prefeito de Limeira, meu pai foi duas vezes prefeito de Itapira, onde passei toda minha infância.
# sent_id = CF74-4
# source = CETENFolha n=74 cad=Revista Folha sec=nd sem=94b
1 Meu meu DET _ Gender=Masc|Number=Sing|PronType=Prs 2 det _ _
2 avô avô NOUN _ Gender=Masc|Number=Sing 5 nsubj _ _
3 materno materno ADJ _ Gender=Masc|Number=Sing 2 amod _ _
4 foi ser AUX _ Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin 5 cop _ _
5 prefeito prefeito NOUN _ Gender=Masc|Number=Sing 0 root _ _
6 de de ADP _ _ 7 case _ _
7 Limeira Limeira PROPN _ Gender=Fem|Number=Sing 5 nmod _ SpaceAfter=No
8 , , PUNCT _ _ 5 punct _ _
9 meu meu DET _ Gender=Masc|Number=Sing|PronType=Prs 10 det _ _
10 pai pai NOUN _ Gender=Masc|Number=Sing 14 nsubj _ _
11 foi ser AUX _ Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin 14 cop _ _
12 duas dois NUM _ NumType=Card 13 nummod _ _
13 vezes vez NOUN _ Gender=Fem|Number=Plur 14 nmod _ _
14 prefeito prefeito NOUN _ Gender=Masc|Number=Sing 5 parataxis _ _
15 de de ADP _ _ 16 case _ _
16 Itapira Itapira PROPN _ Gender=Fem|Number=Sing 14 nmod _ SpaceAfter=No
17 , , PUNCT _ _ 19 punct _ _
18 onde onde PRON _ Gender=Fem|Number=Sing|PronType=Rel 19 obl _ _
19 passei passar VERB _ Mood=Ind|Number=Sing|Person=1|Tense=Past|VerbForm=Fin 16 acl:relcl _ _
20 toda toda DET _ Gender=Fem|Number=Sing|PronType=Tot 22 det _ _
21 minha meu DET _ Gender=Fem|Number=Sing|PronType=Prs 22 det _ _
22 infância infância NOUN _ Gender=Fem|Number=Sing 19 obj _ SpaceAfter=No
23 . . PUNCT _ _ 5 punct _ _

In another example, donde is ADV.

@leoalenc
Contributor Author

According to Avila (2021, p. 461), masuí is an adverb (1) or a pronoun (2):

(1) Kurumiwasú, masuí taá reyuri? (Amorim, 284, adap.) - Young man, where did you come from?

(2) Se iwá-itá, masuí usinhĩ kurí amú-itá se yawé upurakari arama iwí. (Amorim, 215, adap.) - They are my fruits, from which others like me shall be born to fill the earth.

@leoalenc
Contributor Author

In the Tycho Brahe Corpus:

~/tycho/pos$ grep -Eioh "[[:space:]]onde\/[^[:space:]]+" *_pos.txt | sort | uniq -c

1 onde/ADV
1 ONDE/ADV
5 onde/NPR
2 Onde/NPR
1840 onde/WADV
7 Onde/WADV
2 ONDE/WADV
1 onde/WADV-1
1 onde/WADV-2

~/tycho/pos$ grep -Eioh "[[:space:]]donde\/[^[:space:]]+" *_pos.txt | sort | uniq -c

1 donde/D
1 Donde/NPR
1 donde/P+ADV
348 donde/P+WADV
57 donde/WADV
1 donde/WADVP
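The same tally can be reproduced without the shell. This is a minimal sketch of the `grep -Eioh ... | sort | uniq -c` pipeline above, assuming the corpus files contain whitespace-separated `form/TAG` tokens. The function name `count_tags` and the sample string are made up for illustration and are not corpus data.

```python
import re
from collections import Counter


def count_tags(text: str, word: str) -> Counter:
    """Count form/TAG tokens for `word`, matching the form case-insensitively.

    Like the grep pattern, a token only matches when preceded by whitespace,
    so "donde/..." is not counted when searching for "onde".
    """
    pattern = re.compile(r"\s(" + re.escape(word) + r"/\S+)", re.IGNORECASE)
    return Counter(m.group(1) for m in pattern.finditer(text))


# Made-up sample in the assumed form/TAG layout.
sample = "E/CONJ onde/WADV estava/VB ,/, Onde/WADV e/CONJ donde/P+WADV onde/ADV"
for token, n in sorted(count_tags(sample, "onde").items()):
    print(n, token)
```

As with `uniq -c`, the counts stay case-sensitive per distinct form, which is why onde/WADV and Onde/WADV are reported separately.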

@leoalenc
Contributor Author

Pronominal adverbs in the Universal Dependencies framework:

There is a closed subclass of pronominal adverbs that refer to circumstances in context, rather than naming them directly; similarly to pronouns, these can be categorized as interrogative, relative, demonstrative etc. Pronominal adverbs also get the ADV part-of-speech tag but they are differentiated by additional features.

@leoalenc
Contributor Author

According to Avila (2021, p. 461), masuí is an adverb (1) or a pronoun (2):

(1) Kurumiwasú, masuí taá reyuri? (Amorim, 284, adap.) - Young man, where did you come from?

(2) Se iwá-itá, masuí usinhĩ kurí amú-itá se yawé upurakari arama iwí. (Amorim, 215, adap.) - They are my fruits, from which others like me shall be born to fill the earth.

With this commit, we now distinguish between interrogative adverbs and relative adverbs, that is, relative pronominal adverbs.
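One way to express this distinction in Universal Dependencies terms is to keep the ADV part-of-speech tag and separate the two classes by the PronType feature. The mapping below is only a sketch: the label `adv. rel.` and the names `UD_BY_GLOSSARY_TAG` and `to_ud` are assumptions for illustration, and the tags actually introduced by the commit may differ.

```python
# Hypothetical mapping from glossary word classes to (UPOS, FEATS).
UD_BY_GLOSSARY_TAG = {
    "adv.": ("ADV", {}),
    "adv. interr.": ("ADV", {"PronType": "Int"}),
    "adv. rel.": ("ADV", {"PronType": "Rel"}),  # assumed label, not from the glossary
}


def to_ud(tag):
    """Return (UPOS, FEATS string) for a glossary tag; '_' means no features."""
    upos, feats = UD_BY_GLOSSARY_TAG[tag]
    feats_str = "|".join(f"{k}={v}" for k, v in sorted(feats.items())) or "_"
    return upos, feats_str


print(to_ud("adv. interr."))  # ('ADV', 'PronType=Int')
print(to_ud("adv. rel."))     # ('ADV', 'PronType=Rel')
```

Keeping both classes under ADV follows the UD guideline quoted above, in which pronominal adverbs share the ADV tag and differ only in their features.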
