Assorted comments/questions #24

Closed
eaubin opened this issue Oct 12, 2018 · 2 comments

eaubin commented Oct 12, 2018

How do you create a ta or tajson file from a text file? Is there a config format for plain text?

If I click on a highlighted entity, e.g. Miss Watson in doc-EnglishTa.txt, and select a different label such as 'ORG', it doesn't change color or appear in the log. Relabeling does appear to work for things I've annotated myself.

It might be nice to be able to remove a label from the label selection dialog (easier to find/remember than right-clicking, unless you have closely read the readme).

Is the idea that we can reconstruct the labels by replaying the log? I can find the orig-annotation-user files, but how do I read them?

Where do initial labels come from? They are not in the log, so I'm not clear how to get them.

What happens when you submit a definition?

@mayhewsw
Member

I'll answer each in order:

Use txt2tajson.py. There is no config format for plain text because we always save to the format we load, and there's no established way of storing NER annotations in plain text.

Are you clicking on just one token (e.g. Watson, or Miss)? If you click and drag (start on Miss, hold down, move to Watson) and then select a label, it should work. If this doesn't work, please open a separate issue with steps to reproduce.

There used to be a Non-entity button, but I removed it. Thanks for the suggestion, we'll consider it for future updates.

The log is intended as a log, not necessarily for recreating. I'm not sure I understand the question about the orig-annotation-user files.

Input data may be labeled. In this case, I guess only the ta/eng data has labels (sorry, I can see how that is confusing), but any input data may contain labels.

The definition should be stored in a local dictionary, and also written to file. Once you add a definition, you should be able to check "Show dictionary" and all definitions will be shown inline. This appears not to work at the moment. Thanks for pointing it out.
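To make the first answer a bit more concrete, here is a rough sketch of the kind of conversion a plain-text-to-JSON step performs: split the text into tokens and record where sentences end. It is not the code in txt2tajson.py, and the field names (`id`, `text`, `tokens`, `sentenceEndPositions`, `labels`) are illustrative assumptions rather than the actual tajson schema.

```python
import json


def text_to_token_doc(text, doc_id="doc-0"):
    """Turn plain text into a token-level dict.

    Assumes one sentence per line and whitespace-separated tokens.
    The returned field names are hypothetical, not the real tajson schema.
    """
    tokens = []
    sentence_ends = []
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue  # skip blank lines
        tokens.extend(words)
        # Exclusive end index of this sentence in the token list.
        sentence_ends.append(len(tokens))
    return {
        "id": doc_id,
        "text": text,
        "tokens": tokens,
        "sentenceEndPositions": sentence_ends,
        "labels": [],  # plain text carries no NER annotations
    }


doc = text_to_token_doc("Miss Watson lives here .\nShe is kind .")
print(json.dumps(doc, indent=2))
```

The `labels` list is empty because plain text has nowhere to store annotations, which is why there is no plain-text config format.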

Member

mayhewsw commented Oct 12, 2018

Just pushed a commit (b74c836) that fixes the dictionary bug. Thanks again for pointing it out.

The dictionary is now saved to dicts/CONFIGNAME-USERNAME.txt, and that file will be loaded the next time you open talen.
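For anyone who wants to read or pre-populate that file outside the tool, a minimal sketch of the naming scheme follows. Only the dicts/CONFIGNAME-USERNAME.txt pattern comes from the comment above. The one-entry-per-line, tab-separated layout and the helper names are assumptions for illustration, so check an actual saved file before relying on them.

```python
from pathlib import Path


def dictionary_path(config_name, user_name, root="dicts"):
    """Build dicts/CONFIGNAME-USERNAME.txt for a given config and user."""
    return Path(root) / f"{config_name}-{user_name}.txt"


def save_definitions(path, definitions):
    """Write word/definition pairs, one per line (assumed tab-separated)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for word, definition in sorted(definitions.items()):
            f.write(f"{word}\t{definition}\n")


def load_definitions(path):
    """Read the file back into a dict; a missing file gives an empty dict."""
    definitions = {}
    if not path.exists():
        return definitions
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            word, _, definition = line.partition("\t")
            definitions[word] = definition
    return definitions


print(dictionary_path("eng", "alice").as_posix())
```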

eaubin closed this as completed Oct 19, 2018