Improved mechanism for loading models and mappings #40

Closed
reckart opened this issue May 12, 2015 · 4 comments
Labels
⭐️ Enhancement New feature or request

@reckart
Member

reckart commented May 12, 2015

Currently DKPro TreeTagger supports auto-lookup of model files: it looks up and
loads the appropriate language model automatically according to the document
language. None of the other DKPro analysis engines (AEs) has this ability yet.

Dive into DKPro TreeTagger and find out how it performs this auto-lookup. Can
the mechanism be encapsulated in an ExternalResource? The goal is for an AE to
gain the auto-lookup feature automatically when such an object is passed in as
the parameter for the model file location.
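
To make the intended behaviour concrete, here is a minimal sketch of language-based lookup. It is not how TreeTagger does it; the class name, the `${language}` placeholder and the location pattern are all made up for illustration. An explicitly configured location wins, otherwise the document language is filled into a pattern.

```java
/** Hypothetical sketch: resolve a model location from the document language. */
final class ModelLocationSketch {
    // Assumed naming convention for illustration only; not taken from DKPro.
    static final String DEFAULT_PATTERN = "classpath:/models/tagger-${language}.model";

    /** An explicit location wins; otherwise the language is filled into the pattern. */
    static String resolve(String explicitLocation, String pattern, String language) {
        if (explicitLocation != null) {
            return explicitLocation;
        }
        if (language == null || language.isEmpty()) {
            throw new IllegalStateException(
                    "No model location configured and the document has no language");
        }
        // String.replace substitutes the placeholder literally (no regex involved).
        return pattern.replace("${language}", language);
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, DEFAULT_PATTERN, "de"));
        System.out.println(resolve("file:/opt/models/custom.model", DEFAULT_PATTERN, "de"));
    }
}
```

Failing loudly when neither a location nor a language is available seems preferable to silently picking some default model.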

Furthermore, specific default paths should be configurable via property files.
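
A rough sketch of what property-based defaults could look like, using only java.util.Properties. The key scheme (`model.<language>` with a `model.default` fallback) and the paths are assumptions for illustration; a real component would presumably read the properties from a file on the classpath rather than from a string.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical sketch: default model paths read from properties. */
final class DefaultPathsSketch {
    /** Parses properties text; stands in for loading a .properties file. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Looks up "model.<language>", then "model.default"; null if neither is set. */
    static String defaultModelPath(Properties props, String language) {
        return props.getProperty("model." + language, props.getProperty("model.default"));
    }

    public static void main(String[] args) {
        Properties props = parse(
                "model.default=/models/generic.bin\n" + "model.de=/models/german.bin\n");
        System.out.println(defaultModelPath(props, "de"));
        System.out.println(defaultModelPath(props, "fr"));
    }
}
```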

Lastly, can it load concrete resources lazily, that is, load the resource the
moment it is first used? (Good starting point: ExternalResourceFactory of
UIMAFit, line 220)

For lazy-loading resources, have a look at the class ParametrizedResource
in org.uimafit.factory.ExternalResourceFactoryTest.
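
Independent of how uimaFIT handles this, lazy loading itself can be sketched in plain Java. The `LazyResource` class below is hypothetical and only illustrates the idea: the loader runs on the first call to `get()` and the result is cached afterwards.

```java
import java.util.function.Supplier;

/** Hypothetical sketch: a resource that is loaded on first use and then cached. */
final class LazyResource<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    LazyResource(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Runs the loader on the first call only; synchronized so it runs at most once. */
    synchronized T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    synchronized boolean isLoaded() {
        return loaded;
    }

    public static void main(String[] args) {
        LazyResource<String> model = new LazyResource<>(() -> "expensive model");
        System.out.println(model.isLoaded()); // nothing has been loaded yet
        System.out.println(model.get());      // triggers the load
        System.out.println(model.isLoaded());
    }
}
```

A separate `loaded` flag is used instead of a null check so that a loader returning null is not invoked again.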

There is one more aspect to this issue: tags produced by the TreeTagger or
other analysis components do not directly correspond to UIMA types. We usually
have a generic base type, e.g. POS for Part-of-Speech annotations and more
specific subtypes, e.g. V for verbs, N for nouns, etc. The same applies to
parsers and named entity recognition. The generic model resource should also
have a method getUimaType(String tag) that takes a tag and returns the UIMA
type to use for the annotation. See
de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerTT4JBase.getTagType(DKProModel,
String, TypeSystem) for how this is done in the TreeTagger component.
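
As a rough illustration of the proposed getUimaType(String tag), the sketch below maps tags to type names with a fallback to the generic base type. It is not the TreeTagger implementation: types are represented as plain strings instead of UIMA Type objects, and the class name and the example tags (NN, VVFIN) are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: map tagger output to type names, with a generic fallback. */
final class TagMappingSketch {
    private final Map<String, String> tagToType = new HashMap<>();
    private final String fallbackType;

    TagMappingSketch(String fallbackType) {
        this.fallbackType = fallbackType;
    }

    /** Registers a mapping and returns this instance so calls can be chained. */
    TagMappingSketch map(String tag, String typeName) {
        tagToType.put(tag, typeName);
        return this;
    }

    /** Returns the mapped type name, or the generic base type for unknown tags. */
    String getUimaType(String tag) {
        return tagToType.getOrDefault(tag, fallbackType);
    }

    public static void main(String[] args) {
        // Illustrative tags; a real mapping would come from a per-tagset resource.
        TagMappingSketch mapping = new TagMappingSketch("POS")
                .map("NN", "N")
                .map("VVFIN", "V");
        System.out.println(mapping.getUimaType("NN"));
        System.out.println(mapping.getUimaType("XYZ"));
    }
}
```

Falling back to the base type means an unknown tag still yields an annotation, just a less specific one.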

Original issue reported on code.google.com by richard.eckart on 2011-10-03 19:19:13

@reckart
Member Author

reckart commented May 12, 2015

50% done.

Encapsulated auto-lookup mechanism in AutoResourceResolver.

Specific paths can be configured in Java and overridden at runtime by UIMA
parameters.




Original issue reported on code.google.com by s.yang@ishuo.de on 2012-01-03 14:34:46

@reckart
Member Author

reckart commented May 12, 2015

(No text was entered with this change)

Original issue reported on code.google.com by richard.eckart on 2012-02-08 22:51:53

  • Labels added: Milestone-1.4.0

@reckart
Member Author

reckart commented May 12, 2015

Changed the title to reflect a reorientation in this task. For the time being we no
longer try to model this using an external resource, but rather first try to harmonize
the model/mapping loading across components.

Original issue reported on code.google.com by richard.eckart on 2012-05-08 18:11:25

@reckart
Member Author

reckart commented May 12, 2015

This works pretty well now for POS tags in many components. For further enhancements,
separate bugs will be opened.

Original issue reported on code.google.com by richard.eckart on 2012-07-01 18:38:03

@reckart reckart closed this as completed May 12, 2015
@reckart reckart modified the milestone: 1.4.0 May 14, 2015