OGER hackathon #34
Comments
Is the non-XML preface in the XMI file a Unicode BOM (Byte Order Mark)? In theory the files should be UTF-8, which I don't believe requires a BOM, but I know we've had a problem in GATE before (outside of OpenMinTeD) where XML files from odd sources had a BOM prefix. If it helps, the code we use in GATE to ensure we always discard the BOM can be found at https://github.com/GateNLP/gate-core/blob/master/src/main/java/gate/util/BomStrippingInputStreamReader.java |
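For anyone handling the payload outside Java, the same idea can be sketched in Python. This is a minimal sketch, not the GATE implementation: it assumes the payload arrives as UTF-8 bytes and that the preface is either a UTF-8 BOM or other stray bytes before the first `<`. The function name `strip_xml_preface` and the sample payload are hypothetical.

```python
import codecs
import xml.etree.ElementTree as ET


def strip_xml_preface(payload: bytes) -> bytes:
    """Drop a leading UTF-8 BOM and any other bytes before the first '<'.

    Assumes a UTF-8 payload; UTF-16 or UTF-32 input is not handled here.
    """
    if payload.startswith(codecs.BOM_UTF8):
        payload = payload[len(codecs.BOM_UTF8):]
    start = payload.find(b"<")
    if start == -1:
        raise ValueError("payload contains no XML markup")
    return payload[start:]


# Illustrative payload only; not a real OMTD request.
raw = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="UTF-8"?><doc>text</doc>'
root = ET.fromstring(strip_xml_preface(raw))
print(root.tag)
```

Cutting at the first `<` also covers prefaces that are not a BOM, at the cost of silently discarding whatever preceded the markup, so it may be worth logging the dropped bytes while debugging.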
To make sure we are as well prepared as possible to help during the hackathon sessions could you please add/attach to this issue:
|
Dear @greenwoodma:
|
For some reason the code that is generating Galaxy XML wrappers didn't work as expected. The typesystem you provided was not copied. I do not know why... So, I deleted your record and re-registered it. Then I used the registered app to process the thalamus corpus. Finished .... :-) :-) :-) Output is here. Please check it. I do not see any NER annotations. Maybe we need some help from the University of Manchester, which developed the web service. The typesystem is required by the web service client to serialize the results. If it is not there |
Yeah, this is the issue we're currently investigating, and which we were hoping to discuss during the Hackathon. OGER sends NER annotations, but OMTD doesn't seem to care for them when it re-parses our results. I'm actually a bit at a loss as to what sort of typesystem we should provide, and how. We have this file ready on our server (typesystem.xml.zip), which I would've expected to provide the necessary information. However, OMTD never sends a request for this file. If you have any more information on precisely what sort of typesystem file we need to add, and where, that would be greatly appreciated. |
Please see this one as an example. |
@Aequivinius There is a minor semantic error in your metadata. Your component takes as input a whole corpus of documents, not a single document, and generates annotations for the corpus, thus an annotated corpus. Correct? If that's the case, please change the processingResourceType from document to corpus in both inputContentResourceInfo and outputResourceInfo, in the final version of your metadata. |
@gkirtzou Done @galanisd | @nguyennth I have a few questions:
|
The NeuroScience maven artifact was registered as follows: It seems identical to yours. The web service executor that I created downloads this artifact and adds it |
Does anyone know why this https://test.openminted.eu/landingPage/application/OGERWS has disappeared? Was it deleted by someone? |
I noticed it, too, currently using this (
https://test.openminted.eu/landingPage/application/b8fb9bbd-603c-4b53-b86d-15c6c753302d).
It is set to private so I can easily play around with different
typesystems, but I can set it to public if you need me to.
…On Tue, Apr 17, 2018 at 5:07 PM, Dimitrios Galanis ***@***.*** > wrote:
Does anyone know why this
https://test.openminted.eu/landingPage/application/OGERWS
has disappeared?
It was deleted by someone?
There is a new landing page?
|
I am sure that I didn't delete it |
I've tried now these maven coordinates in the omtd-share.xml, which seem correct: mvn:de.tudarmstadt.ukp.dkpro.core:de.tudarmstadt.ukp.dkpro.core.api.ner-asl:1.9.1 This should point to this repository, if I'm not mistaken, which includes the necessary info. However, our namedEntity annotations are still missing from OMTD. |
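As a quick sanity check on the coordinate string itself, the sketch below splits it into its parts. It assumes the `mvn:groupId:artifactId:version` layout seen in the comment above; `parse_mvn_coordinate` and `MavenCoordinate` are hypothetical helpers, not part of OMTD or Maven.

```python
from typing import NamedTuple


class MavenCoordinate(NamedTuple):
    group_id: str
    artifact_id: str
    version: str


def parse_mvn_coordinate(value: str) -> MavenCoordinate:
    """Split 'mvn:groupId:artifactId:version' into its three parts."""
    prefix = "mvn:"
    if not value.startswith(prefix):
        raise ValueError(f"expected a '{prefix}' prefix: {value!r}")
    parts = value[len(prefix):].split(":")
    # Exactly three non-empty fields are expected in this assumed layout.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected groupId:artifactId:version: {value!r}")
    return MavenCoordinate(*parts)


coord = parse_mvn_coordinate(
    "mvn:de.tudarmstadt.ukp.dkpro.core:"
    "de.tudarmstadt.ukp.dkpro.core.api.ner-asl:1.9.1"
)
print(coord.group_id)
print(coord.artifact_id)
print(coord.version)
```

The three fields correspond to the `groupId`, `artifactId` and `version` elements of a Maven dependency entry, so a typo in any of them would resolve to a different artifact or to none at all.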
You are expecting things like this? |
I re-registered your app. and processed the thalamus corpus. Output here: |
@Aequivinius welcome to OpenMinTeD. |
Hi, sorry for my late reply. As far as I understand, you're using an existing type system that was already uploaded to Maven Central, i.e., the NER type system by DKPro. This means that you don't need to create a new type system. You only need to include the type system as a dependency in the pom.xml of the web service project. As @galanisd showed above, I believe it works now. In case you need to create a new type system, please let me know and we can discuss the details later. |
@galanisd Fascinating, this is precisely what we were after. Wonder if the re-registering did the trick? Anyway, this is what we wanted, so it seems all is well! Thanks for your help! Should we now proceed to register the service on services.openminted.eu? |
Not yet. services.openminted.eu has not been updated for quite some time. Thanks! Dimitris |
@Aequivinius I was taking a final look at your metadata (the one registered here) and I noticed that you had declared in your input that the annotation type is Named Entity (i.e. http://w3id.org/meta-share/omtd-share/NamedEntity). Semantically, that means that your input needs to be annotated at that level before using your application. Is that the case? If not, and your input is just a raw corpus, then I would suggest removing the annotation type in the inputContentResourceInfo section. Also, I would like to ask, for statistical reasons, whether you performed the registration via the registration form or via XML? |
This is a mistake, I'll remove it from the XML and upload it correctly next
time (the registration form doesn't let me delete the value for this
specific field once set). I mostly used the web registration form, only
occasionally tinkering with the XML.
…On Wed, Apr 18, 2018 at 9:45 AM, Katerina Gkirtzou ***@***.*** > wrote:
@Aequivinius <https://github.com/Aequivinius> I was taking a final look
into your metadata (as the one registered here
<https://test.openminted.eu/landingPage/application/b8fb9bbd-603c-4b53-b86d-15c6c753302d>
) and I noticed that you had declared in your input that the annotation
type is Named Entity (i.e. http://w3id.org/meta-share/omtd-share/NamedEntity).
Semantically, that means that your input needs
to be annotated at that level before using your application. Is that the
case? If not, and your input is just a raw corpus, then I would suggest
removing the annotation type in the inputContentResourceInfo section.
Also I would like to ask, for statistical reasons, whether you
performed the registration via the registration form or via xml?
|
@Aequivinius I didn't know that the registration form didn't allow you to delete specific fields once set. I will report this bug to the responsible technical person. Thanks for sharing! |
Also, when you make the last changes to the OMTD-SHARE descriptor, could you please upload it here as well so I can do a final check? In case I missed anything :) |
@gkirtzou Here you go! 18-4-removed_input.xml.zip |
The metadata seems fine. I would only suggest two things
Otherwise, the metadata are correct and your application has also been tested. All that remains is the final registration on the platform, once @greenwoodma informs you. |
@gkirtzou Thank you for your help! Find attached the most recent version of our share descriptor. |
@Aequivinius Perfect! I have no further comments/recommendations. |
@Aequivinius You can now proceed to the final uploading of your application at services.openminted.eu. If you encounter any problems, please let us know. |
@Aequivinius My mistake, please refrain from uploading at services.openminted.eu until further notice. |
@Aequivinius I have taken the liberty to upload your application at services.openminted.eu and tested it. It seems to work ok. The application is available at: https://services.openminted.eu/landingPage/application/71345d18-297f-4ac5-b4de-38ef3cacbe75 You can also test it yourself. |
Perfect, thanks! |
@Aequivinius I have a question; in your proposal and the description of the application, you mention the Bio Term Hub, and I'm trying to understand the relation between the two. When you say that the OGER is built on top of the BTH, you mean that you use the terminologies from the reference databases? And this aggregation of terminologies is already in the docker image you have provided? Or should we expect another component/application? |
@pennyl67 No, there will be no further components or applications.
BTH is an aggregator of terminologies and produces a unified
terminology. The terminology created in this way can be used by OGER.
However, the two components can also be used independently. The term
list provided by BTH could be used for other purposes; and OGER can be
provided with a term list obtained from other sources.
We submitted OGER as a web service as an application to OMTD. This web
service uses BTH to obtain up to date terminologies in the background.
Furthermore, we also wanted to make BTH available to the public, so we
created a Docker image that researchers can run locally.
Alternatively, they may use our own webservice at
https://pub.cl.uzh.ch/projects/ontogene/biotermhub/. However, BTH uses a
web interface in which desired resources are manually selected. Because
of that, it was not suited to be integrated into the OMTD platform,
which is why we provide a separate link for the research community where
they can download a Dockerized version of BTH
(https://github.com/OntoGene/BioTermHub_dockerized).
Kind regards,
- Nico Colic
…On 15.5.2018 17:06, Penny Labropoulou wrote:
@Aequivinius I have a question; in your proposal and the
description of the application, you mention the Bio Term Hub, and I'm
trying to understand the relation between the two. When you say that
the OGER is built on top of the BTH, you mean that you use the
terminologies from the reference databases? And this aggregation of
terminologies is already in the docker image you have provided? Or
should we expect another component/application?
|
Thanks for the explanations. It's clear now! Given that your application is already uploaded and public in the platform, if you agree, I will close this issue. |
Dear organisers
We're preparing our submission of OGER, a dictionary-based entity recogniser, as a webservice for openminted. We're currently in the process of fixing a few remaining issues that relate to how we parse the XMI that we receive from openminted. As it currently stands, it looks like the payload of the requests includes some non-XML preface, which we need to cut in order to parse the document to be annotated. Would you have a sample of how OMTD constructs the request payload?
As for the hackathon, would it be possible to find a time on Tuesday afternoon? Most people from our group can make it then. Apart from that, Thursday or Friday would suit us, too.
Thanks for your help & kind regards,