NER step is excluding everything but proteins; why is that? #9

Closed
ajmazurie opened this issue May 23, 2013 · 2 comments

Comments

@ajmazurie

When I looked at the output of the BANNER step I realized only Protein entities were annotated, excluding everything else. Switching to Cocoa didn't change a thing despite the fact I know Cocoa annotates much more than proteins.

Looking at the code (Tools/BANNER.py) I realize that only proteins are even considered. I was wondering why? I am interested in including all the annotations from BANNER (actually, from Cocoa) and am wondering if something will break downstream.

Best,
Aurélien

@jbjorne
Owner

jbjorne commented May 23, 2013

Dear Aurélien,

BANNER does not assign a type to the entities it detects. The scope of its
output is roughly comparable to the Protein entity type of the GENIA
task, which is why we label its output as Protein entities.

If you want to use other methods (such as Cocoa) for detecting entities
of different types, this should be fine. Depending on the task, TEES
will take into account entity types corresponding to the given entities
(shared task a1 files) of that task. The presence of entities not used
in the task model may reduce performance, but should not otherwise
affect the system.
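
One way to check which entity types would reach the task model is to count the types in an a1 file before running the pipeline. The following is a minimal sketch, not TEES code: it assumes the usual tab-separated text-bound lines of the shared task standoff format (`T1<TAB>Type start end<TAB>text`), and the function names, the sample data, and the `Chemical` type are hypothetical, chosen only for illustration.

```python
from collections import Counter


def parse_a1(text):
    """Parse text-bound entity lines (IDs starting with 'T') from a1 content."""
    entities = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("T"):
            continue  # skip blank lines and anything that is not an entity
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # malformed line: no type/offset field
        fields = parts[1].split()
        entities.append({
            "id": parts[0],
            "type": fields[0],
            # First and last offsets give the overall extent of the entity.
            "start": int(fields[1]),
            "end": int(fields[-1]),
            "text": parts[2] if len(parts) > 2 else "",
        })
    return entities


def count_types(entities):
    """Return how many entities of each type are present."""
    return Counter(e["type"] for e in entities)


def unexpected_types(entities, task_types):
    """Return the entity types that are not in the given set of task types."""
    return sorted({e["type"] for e in entities} - set(task_types))


# Hypothetical sample: two Protein entities and one entity of another type.
SAMPLE_A1 = (
    "T1\tProtein 0 4\tIL-2\n"
    "T2\tChemical 20 27\tglucose\n"
    "T3\tProtein 40 43\tTNF\n"
)

if __name__ == "__main__":
    ents = parse_a1(SAMPLE_A1)
    print(count_types(ents))
    print(unexpected_types(ents, {"Protein"}))
```

Types reported by `unexpected_types` are the ones that, per the explanation above, the task model would not use and that may reduce performance.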

Regards,
Jari

@ajmazurie
Author

Thanks for the information. I modified Tools/Cocoa.py to add the entity type information, and TEES apparently didn't object to this. I am currently figuring out which output file(s) contain the results I want, and how to visualize them with BRAT.

jbjorne closed this as completed on Nov 24, 2015