Word agglutination in German but not in French #27
Comments
hello!
|
Slightly related, what should we do with translations, for example of concepts:
1) the existence of a <ENAMEX type="CONCEPT">Volksgemeinschaft</ENAMEX> ("people's
community").
2) the existence of a <ENAMEX type="CONCEPT">Volksgemeinschaft</ENAMEX> ("<ENAMEX
type="CONCEPT">people's community</ENAMEX>").
3) the existence of a Volksgemeinschaft ("<ENAMEX type="CONCEPT">people's
community</ENAMEX>").
? |
In my opinion, since we want to keep the same approach as what was said before, I would go with the 3rd answer. |
We discussed this this afternoon. I'll try to explain:
|
In the other examples,
|
For (2) 👍, this was the point @everzeni raised and it does make sense 🕺 Regarding (1), does it make sense to annotate it at all then?
For the first case, both of them are clear entities; shall we then annotate both? For the second case, I hope these examples are relevant |
I also have multiple cases of translated entities here. What should we do? Annotate both? Just the original? Just the translation?
The “<ENAMEX type="INSTITUTION">Archives Générales du Royaume</ENAMEX>” (<ENAMEX type="INSTITUTION">National Archives of Belgium</ENAMEX>) “<ENAMEX type="INSTITUTION">Archives de l’État dans les Provinces</ENAMEX>” (<ENAMEX type="INSTITUTION">State Archives in the Provinces</ENAMEX>) in other words the <ENAMEX type="INSTITUTION">State Archives</ENAMEX> are a federal academic establishment that forms part of the “<ENAMEX type="INSTITUTION">Service Public Fédéral de Programmation Politique scientifique</ENAMEX> ”(<ENAMEX type="INSTITUTION">Belgian Federal Science Policy Office</ENAMEX>).</sentence>
Given that there will probably be examples where the translation is not really an entity (like "people's community" for Volksgemeinschaft), and following the rule that we annotate foreign words with existing classes, I vote to annotate just the original words. |
@ebenaissa for me I would annotate everything in your examples. @kermitt2 does it make sense to you? |
yes, this all makes perfect sense 💃 |
In the sentence:
I would tend to annotate only Aryan certificate like this:
So I have two questions:
1) If it's correct to annotate only "Aryan" in Aryan certificate, what should we do with Ariernachweis?
2) If we annotate the NP Aryan certificate as a whole (same for the NP Ariernachweis), what is the right class? ARTIFACT maybe?
grobid-ner/resources/dataset/ner/corpus/xml/generated/Wikipedia_holocaust.2.training.xml:41