Improved mechanism for loading models and mappings #40

Closed
GoogleCodeExporter opened this issue Mar 22, 2015 · 4 comments

Comments

@GoogleCodeExporter

Currently, DKPro TreeTagger supports auto-lookup of model files: it looks up
and loads the appropriate language model automatically according to the
document language. No other DKPro analysis engine (AE) has this ability yet.

Dive into DKPro TreeTagger and learn how it performs this auto-lookup. Can the
mechanism be encapsulated in an ExternalResource? The goal is for an AE to gain
the auto-lookup feature automatically when such an object is passed in as the
parameter for the model file location.
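
As a rough illustration of what such an encapsulated lookup might look like, here is a minimal sketch. It is not the TreeTagger implementation: the class name `ModelLocationResolver`, the `${language}` placeholder syntax, and the example path are all assumptions made for this sketch.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper (not a DKPro class): derives a model location from the document language. */
final class ModelLocationResolver {
    // Assumed placeholder syntax, chosen only for this sketch.
    private static final String LANGUAGE_PLACEHOLDER = "${language}";

    private final String locationPattern;

    ModelLocationResolver(String locationPattern) {
        this.locationPattern = locationPattern;
    }

    /** Substitutes the lower-cased language code into the location pattern. */
    String expand(String documentLanguage) {
        if (documentLanguage == null || documentLanguage.isEmpty()) {
            throw new IllegalArgumentException("Document language is not set");
        }
        return locationPattern.replace(LANGUAGE_PLACEHOLDER,
                documentLanguage.toLowerCase(Locale.ROOT));
    }

    /** Returns the expanded location if it exists on the classpath, otherwise empty. */
    Optional<String> resolve(String documentLanguage) {
        String location = expand(documentLanguage);
        boolean exists =
                ModelLocationResolver.class.getClassLoader().getResource(location) != null;
        return exists ? Optional.of(location) : Optional.empty();
    }
}
```

An AE could hold one such resolver per model parameter and call `resolve(...)` with the document language when processing starts, falling back to an explicitly configured location when the result is empty.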

Furthermore, specific default paths should be configurable via property files.
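
One possible shape for this, sketched with `java.util.Properties`. The class name `DefaultModelPaths` and the key convention `<component>.model.path` are hypothetical and only serve the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

/** Hypothetical holder for default model paths read from a property file. */
final class DefaultModelPaths {
    private final Properties properties = new Properties();

    /** Loads key=value pairs from the given reader (e.g. a FileReader on a .properties file). */
    DefaultModelPaths(Reader source) throws IOException {
        properties.load(source);
    }

    /**
     * Looks up the default path for a component.
     * The key convention "<component>.model.path" is an assumption of this sketch.
     */
    String pathFor(String component, String fallback) {
        return properties.getProperty(component + ".model.path", fallback);
    }
}
```

With a property file containing `treetagger.model.path=/opt/models/treetagger`, a call to `pathFor("treetagger", "models")` would return the configured path, while an unconfigured component would get the fallback.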

Lastly, can it load concrete resources lazily, that is, load the resource the
moment it is first used? (Good starting point: ExternalResourceFactory of
UIMAFit, line 220)

For lazy-loading resources, have a look at the class ParametrizedResource
in org.uimafit.factory.ExternalResourceFactoryTest.
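
To make the lazy-loading idea concrete, here is a generic sketch that is independent of uimaFIT. `LazyResource` is a hypothetical name; the loader passed in stands for whatever actually reads the model.

```java
import java.util.function.Supplier;

/** Hypothetical wrapper that defers loading until the resource is first requested. */
final class LazyResource<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    LazyResource(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Runs the loader on the first call only; later calls return the cached value. */
    synchronized T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    /** Reports whether the loader has already run. */
    synchronized boolean isLoaded() {
        return loaded;
    }
}
```

Constructing the wrapper is cheap, so it can be created during AE initialization while the potentially expensive model load is postponed until the first document needs it.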

There is one more aspect to this issue: tags produced by the TreeTagger or
other analysis components do not directly correspond to UIMA types. We usually
have a generic base type, e.g. POS for part-of-speech annotations, and more
specific subtypes, e.g. V for verbs, N for nouns, etc. The same holds for
parsers and named entity recognition. The generic model resource should also
have a method getUimaType(String tag) where you pass in a tag and it returns
the UIMA type to use for the annotation. See
de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerTT4JBase.getTagType(DKProModel,
String, TypeSystem) for how this is done in the TreeTagger component.
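
A minimal sketch of such a mapping follows. It is not the TreeTaggerTT4JBase code: `TagMapping` is a hypothetical class, and for simplicity it returns the type name as a String, whereas a real implementation would presumably resolve a UIMA Type from the TypeSystem. The tags NN and VVFIN are only illustrative inputs.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tag-to-type mapping with a generic fallback type. */
final class TagMapping {
    private final Map<String, String> tagToType = new HashMap<>();
    private final String defaultType;

    /** @param defaultType type name used for tags without a specific mapping, e.g. "POS" */
    TagMapping(String defaultType) {
        this.defaultType = defaultType;
    }

    /** Registers a specific type name for a tag; returns this for chaining. */
    TagMapping map(String tag, String typeName) {
        tagToType.put(tag, typeName);
        return this;
    }

    /** Returns the specific type name for the tag, or the generic base type if none is mapped. */
    String getUimaType(String tag) {
        return tagToType.getOrDefault(tag, defaultType);
    }
}
```

For example, `new TagMapping("POS").map("NN", "N").map("VVFIN", "V")` would map the two listed tags to the noun and verb subtypes and send every other tag to the generic POS type.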

Original issue reported on code.google.com by richard.eckart on 3 Oct 2011 at 7:19

@GoogleCodeExporter

50% done.

Encapsulated auto-lookup mechanism in AutoResourceResolver.

Specific paths can be configured in Java and overridden at runtime via UIMA
parameters.

Original comment by s.y...@ishuo.de on 3 Jan 2012 at 2:34

@GoogleCodeExporter

Original comment by richard.eckart on 8 Feb 2012 at 10:51

  • Added labels: Milestone-1.4.0

@GoogleCodeExporter

Changed the title to reflect a reorientation of this task. For the time being,
we no longer try to model this using an external resource, but first try to
harmonize the model/mapping loading across components.

Original comment by richard.eckart on 8 May 2012 at 6:11

  • Changed title: Improved mechanism for loading models and mappings

@GoogleCodeExporter

This works pretty well now for POS tags in many components. For further
enhancements, separate bugs will be opened.

Original comment by richard.eckart on 1 Jul 2012 at 6:38

  • Changed state: Fixed
