
Start and end positions of tokens #29

Closed
dinani65 opened this issue May 17, 2021 · 3 comments

Comments

@dinani65

Thanks for your interesting work.
As far as I know, the start and end of the tokens to be linked must be marked in the input sentence, and only one tag is possible per sentence at a time.
So, if we want to use it to annotate the content of a web page, we first have to identify those words ourselves, right?
Could you please explain what get_entity_spans does?
#20
Is it responsible for detecting the tags and their start and end positions?

@nicola-decao
Contributor

Do you want to do mention detection, entity disambiguation, or entity linking?

@dinani65
Author

dinani65 commented May 17, 2021

In fact, I am looking for a multilingual entity linking approach that can disambiguate names.
GENRE is not multilingual, but it allows more than one tag in the input text, whereas mGENRE is multilingual but has the restrictions mentioned above.
Mention detection also has to be done before using GENRE/mGENRE.
Another question: is there any restriction on the size of the input text?

@nicola-decao
Contributor

nicola-decao commented May 18, 2021

  1. You can use GENRE to do entity linking in English only (both mention detection and entity disambiguation).
  2. You can use GENRE to do entity disambiguation in English only.
  3. You can use mGENRE to do entity disambiguation in 100 languages.
  4. You can combine an off-the-shelf mention detection model (like FLAIR) with mGENRE if you want a multilingual entity linking system.

Also, the size of the input text is currently limited to 1024 BPE tokens (a limitation that comes from BART).
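
For option 4, a minimal sketch of the glue step might look like the following. It assumes a mention detector (FLAIR or anything else) has already produced character-level `(start, end)` spans, and it emits one marked copy of the text per mention, since only one tagged mention per input is handled at a time. The helper name `mark_mentions` is hypothetical, and the `[START]` / `[END]` marker strings are an assumption here; check the repository's examples for the exact markers the model expects.

```python
from typing import List, Tuple


def mark_mentions(
    text: str,
    spans: List[Tuple[int, int]],
    start_tag: str = "[START]",  # assumed marker, verify against the repo examples
    end_tag: str = "[END]",      # assumed marker, verify against the repo examples
) -> List[str]:
    """Return one copy of `text` per span, with only that span wrapped in tags.

    `spans` are character offsets (start inclusive, end exclusive), e.g. as
    produced by an external mention detection model.
    """
    marked = []
    for start, end in spans:
        if not (0 <= start < end <= len(text)):
            raise ValueError(
                f"invalid span ({start}, {end}) for text of length {len(text)}"
            )
        marked.append(
            f"{text[:start]}{start_tag} {text[start:end]} {end_tag}{text[end:]}"
        )
    return marked


if __name__ == "__main__":
    # Spans are hard-coded for illustration; in practice they would come
    # from the mention detector.
    text = "Einstein was born in Ulm."
    spans = [(0, 8), (21, 24)]
    for sentence in mark_mentions(text, spans):
        print(sentence)
```

Each returned string can then be passed to the disambiguation model as a separate input. Long documents would still need to be cropped around the mention to stay within the 1024 BPE limit mentioned above.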
