Out-of-tagset tags should map to the generic type #858

Closed
reckart opened this issue May 26, 2016 · 1 comment

reckart commented May 26, 2016

Different tagset mappings currently handle the catch-all rule differently. Some map out-of-tagset tags to the generic type POS, others to the "other" tag POS_X (called O in earlier versions of DKPro Core). This is inconsistent and should be harmonized across all mappings.
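To illustrate the inconsistency, here is a minimal sketch of a lookup with a catch-all rule. It is not the DKPro Core implementation; the `resolve_type` helper and the two mini-mappings are hypothetical, and only the target names POS and POS_X come from this issue.

```python
def resolve_type(mapping, tag):
    """Return the type name for `tag`, falling back to the `*` catch-all rule."""
    if tag in mapping:
        return mapping[tag]
    # Out-of-tagset tag: use whatever the catch-all rule points to.
    return mapping["*"]


# Hypothetical mini-mappings; the single NN entry is for illustration only.
mapping_generic = {"NN": "POS_NOUN", "*": "POS"}
mapping_other = {"NN": "POS_NOUN", "*": "POS_X"}

# The same unknown tag ends up as a different type depending on the mapping.
print(resolve_type(mapping_generic, "XYZ"))
print(resolve_type(mapping_other, "XYZ"))
```

With the first mapping, an unknown tag resolves to the generic type; with the second, it resolves to the "other" tag. This is the difference the issue asks to harmonize.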

@reckart reckart added the ⭐️ Enhancement New feature or request label May 26, 2016
@reckart reckart added this to the 1.9.0 milestone May 26, 2016
@reckart reckart modified the milestones: 1.10.0, 1.9.0 Jul 26, 2017
@reckart reckart modified the milestones: 1.10.0, 1.11.0 Dec 8, 2017
@reckart reckart modified the milestones: 1.10.0, 1.11.0 Jul 28, 2018

reckart commented Jan 12, 2019

Nowadays the rule is represented in the mapping files as *=POS_X and can be found in various mappings:

bn-utpal-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
de-stts-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
de-tiger-rftagger-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
en-arktweet-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
en-brown-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset (2 matches)
en-browntei-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
en-c5-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
en-lbj-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
en-medpost-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
en-ptb-emory-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
en-ptb-tt-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
es-conll2009-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
et-tartu-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
fa-upc-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
fa-upc-reduced-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
fr-ftb-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
fr-stein-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
it-stein-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
it-tanl-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
ru-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
sv-suc-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
zh-ctb-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
zh-lcmc-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset

In other mappings, the *=POS rule is used.

ar-atb-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
da-ddt-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
de-ud-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
en-ptb-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
en-ud-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
es-ancora-ixa-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
es-ancora-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
es-crater-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
es-parole-reduced-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
eu-ud-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
fr-corenlp34-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
fr-melt-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
gl-xiada-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
gz-gamallo-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
it-stein-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
it-ud-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
la-brandolini-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
nl-alpino-ixa-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
nl-tt-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
no-ud-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
pl-ncp-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
pt-bosque-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
pt-gamallo-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
ru-msd-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
sk-smt-reduced-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset
sw-swatwol-pos.map - de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/lexmorph/tagset

So the split is roughly 50/50.
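A listing like the one above could be reproduced with a small script. This is only a sketch under assumptions: the `catch_all_targets` helper is hypothetical, it uses a simplified `key=value` parser rather than a full Java properties parser, and it assumes the catch-all rule is written literally as `*=...` on one line.

```python
from collections import defaultdict
from pathlib import Path


def catch_all_targets(tagset_dir):
    """Group *-pos.map files by the value of their `*` (catch-all) rule.

    Files without a `*` rule are collected under the key None.
    """
    groups = defaultdict(list)
    for path in sorted(Path(tagset_dir).glob("*-pos.map")):
        target = None
        text = path.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            line = line.strip()
            # Skip blank lines and comment lines.
            if not line or line.startswith(("#", "!")):
                continue
            key, sep, value = line.partition("=")
            if sep and key.strip() == "*":
                target = value.strip()  # if the rule occurs twice, the last one wins
        groups[target].append(path.name)
    return dict(groups)
```

Calling `catch_all_targets` on the `tagset` resource folder named in the listing would return a dictionary keyed by catch-all target (for example `POS` or `POS_X`), which makes it easy to check whether all mappings agree after the change.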

reckart added a commit that referenced this issue Jan 12, 2019
- Make catch-all mapping consistent over all mappings
@reckart reckart added 🐛Bug Something isn't working and removed ⭐️ Enhancement New feature or request labels Jan 12, 2019
@reckart reckart self-assigned this Jan 12, 2019
reckart added a commit that referenced this issue Jan 13, 2019
reckart added a commit that referenced this issue Jan 15, 2019
…uld-map-to-the-generic-type

#858 - Out-of-tagset tags should map to the generic type
@reckart reckart closed this as completed Feb 12, 2019
reckart added a commit that referenced this issue Mar 19, 2019
* master:
  #1322 - Upgrade to OpenNLP 1.9.1
  #1308 - integrate mystem
  #1327 - Update LIF support
  #1327 - Update LIF support
  #1329 - Span annotations with slot features may disappear from WebAnno TSV
  #1329 - Span annotations with slot features may disappear from WebAnno TSV
  #1329 - Span annotations with slot features may disappear from WebAnno TSV
  #1327 - Update LIF support
  #1323 - File extension generated by BinaryCasWriter does not contain dot
  #858 - Out-of-tagset tags should map to the generic type
  #1239 - Rename NYTCollectionReader to NitfReader
  #858 - Out-of-tagset tags should map to the generic type
  #1317 - Standard parameter to disable type mapping
  No issue. If a DKProTextContext is available, then TestRunner generates an XMI file from the processed data and stores it in the test output folder.
  No issue - Log names of files with license issues to the console.
  #1160 - Better support for CoNLL-U v2 (1.11.0)

% Conflicts:
%	dkpro-core-asl/pom.xml
reckart added a commit to tilmanbeck/dkpro-core that referenced this issue Apr 19, 2019
* master:
  dkpro#1325 - Avoid datasets being extracted outside their target directory
  dkpro#1325 - Avoid datasets being extracted outside their target directory
  dkpro#1325 - Avoid datasets being extracted outside their target directory
  dkpro#1338 - Factor CAS <-> brat conversion code into Pojos
  dkpro#1338 - Factor CAS <-> brat conversion code into Pojos
  dkpro#1322 - Upgrade to OpenNLP 1.9.1
  dkpro#1308 - integrate mystem
  dkpro#1327 - Update LIF support
  dkpro#1327 - Update LIF support
  dkpro#1329 - Span annotations with slot features may disappear from WebAnno TSV
  dkpro#1329 - Span annotations with slot features may disappear from WebAnno TSV
  dkpro#1329 - Span annotations with slot features may disappear from WebAnno TSV
  dkpro#1327 - Update LIF support
  dkpro#1325 - Avoid datasets being extracted outside their target directory
  dkpro#1325 - Avoid datasets being extracted outside their target directory
  dkpro#1323 - File extension generated by BinaryCasWriter does not contain dot
  dkpro#858 - Out-of-tagset tags should map to the generic type
  dkpro#858 - Out-of-tagset tags should map to the generic type
reckart added a commit that referenced this issue Apr 26, 2019
* master: (21 commits)
  #1305 - Update TreeTagger models in build.xml
  #1325 - Avoid datasets being extracted outside their target directory
  #1325 - Avoid datasets being extracted outside their target directory
  #1325 - Avoid datasets being extracted outside their target directory
  #1338 - Factor CAS <-> brat conversion code into Pojos
  #1338 - Factor CAS <-> brat conversion code into Pojos
  #1322 - Upgrade to OpenNLP 1.9.1
  #1308 - integrate mystem
  #1327 - Update LIF support
  #1327 - Update LIF support
  #1329 - Span annotations with slot features may disappear from WebAnno TSV
  #1329 - Span annotations with slot features may disappear from WebAnno TSV
  #1329 - Span annotations with slot features may disappear from WebAnno TSV
  #1327 - Update LIF support
  #1325 - Avoid datasets being extracted outside their target directory
  #1325 - Avoid datasets being extracted outside their target directory
  #1323 - File extension generated by BinaryCasWriter does not contain dot
  #858 - Out-of-tagset tags should map to the generic type
  #1239 - Rename NYTCollectionReader to NitfReader
  #858 - Out-of-tagset tags should map to the generic type
  ...

% Conflicts:
%	dkpro-core-asl/pom.xml
%	dkpro-core-io-lif-asl/src/test/java/de/tudarmstadt/ukp/dkpro/core/io/lif/LifReaderWriterTest.java
%	dkpro-core-io-lif-asl/src/test/java/de/tudarmstadt/ukp/dkpro/core/io/lif/LifWriterTest.java