
: Idiomfier as an NER tagger #15

Closed
1 of 6 tasks
Tracked by #4
eubinecto opened this issue Mar 12, 2022 · 1 comment

Comments


eubinecto commented Mar 12, 2022

How?

First of all, let's try the baseline approach: just a simple linear layer on top of the encoder. Performance is not what matters right now.

Could we use BART for this? We could, but BART is an auto-regressive model: when processing a token, it cannot attend to future tokens. BERT, which encodes bidirectional context, is a better choice than BART for token-level tagging.
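To make the baseline concrete, here is a minimal dependency-free sketch of the idea: one linear layer, shared across all token positions, maps each token's contextual embedding to tag scores. The dimensions, tag names, and function names here are illustrative, not taken from the repo.

```python
import random

random.seed(0)

# Illustrative sizes: 4-dim "contextual embeddings", 3 BIO-style tags.
HIDDEN, NUM_TAGS = 4, 3
TAGS = ["O", "B-IDIOM", "I-IDIOM"]

# One weight matrix and bias, shared across every token position.
W = [[random.uniform(-1, 1) for _ in range(NUM_TAGS)] for _ in range(HIDDEN)]
b = [0.0] * NUM_TAGS

def classify_tokens(hidden_states):
    """Apply the same linear layer to each token's hidden vector,
    then take the argmax tag per token."""
    preds = []
    for h in hidden_states:
        scores = [sum(h[i] * W[i][j] for i in range(HIDDEN)) + b[j]
                  for j in range(NUM_TAGS)]
        preds.append(TAGS[scores.index(max(scores))])
    return preds

# Three tokens, each with a toy 4-dim embedding (in practice these
# would come from BERT's last hidden state).
sentence = [[0.1, 0.2, -0.3, 0.5],
            [0.9, -0.1, 0.4, 0.0],
            [-0.2, 0.3, 0.1, 0.7]]
print(classify_tokens(sentence))  # one tag per token
```

In a real implementation this is exactly what `BertForTokenClassification` does: BERT's per-token hidden states feed a single `nn.Linear(hidden_size, num_labels)` head.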

To-do's

  • delete tokenizer-related fetchers and paths
  • change the builders: InputsBuilder
    • make sure the tokens are not split any further than they are now
  • explore_inputs_builder
  • change the builders: LabelsBuilder
  • explore_labels_builder
  • rewrite Idiomifier to learn NER with BERT
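As a sketch of what the `LabelsBuilder` step might produce, here is a hypothetical BIO-labelling function: given the tokens and the token-index span of the idiom, it emits one tag per token. The function name, tag set, and span convention (inclusive start, exclusive end) are assumptions for illustration, not the repo's actual API.

```python
def build_labels(tokens, idiom_span):
    """Hypothetical LabelsBuilder logic: BIO-tag the idiom span.
    idiom_span = (inclusive start index, exclusive end index)."""
    start, end = idiom_span
    labels = []
    for i, _ in enumerate(tokens):
        if i == start:
            labels.append("B-IDIOM")   # first token of the idiom
        elif start < i < end:
            labels.append("I-IDIOM")   # inside the idiom
        else:
            labels.append("O")         # outside
    return labels

tokens = ["He", "kicked", "the", "bucket", "yesterday"]
print(build_labels(tokens, (1, 4)))
# → ['O', 'B-IDIOM', 'I-IDIOM', 'I-IDIOM', 'O']
```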
@eubinecto eubinecto mentioned this issue Mar 12, 2022
@eubinecto eubinecto changed the title : Idiomfier as an NER tagger. : Idiomfier as an NER tagger Mar 22, 2022

eubinecto commented Mar 22, 2022

How should we handle subwords in a token-level classification task?

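A common convention for this (assumed here, not confirmed by the repo) is to keep the word's label on its first subword and mask the continuation subwords with `-100`, the default `ignore_index` of PyTorch's `CrossEntropyLoss`, so they contribute nothing to the loss. A pure-Python sketch, using the `word_ids` mapping that Hugging Face fast tokenizers expose:

```python
IGNORE = -100  # PyTorch CrossEntropyLoss's default ignore_index

def align_labels(word_ids, word_labels):
    """Map word-level labels onto subword tokens: label the first
    subword of each word, mask special tokens and continuations."""
    labels, prev = [], None
    for wid in word_ids:
        if wid is None:            # special tokens like [CLS]/[SEP]
            labels.append(IGNORE)
        elif wid != prev:          # first subword of a word
            labels.append(word_labels[wid])
        else:                      # continuation subword (##...)
            labels.append(IGNORE)
        prev = wid
    return labels

# Suppose "kicked" is split into "kick", "##ed"; word_ids would come
# from tokenizer(..., is_split_into_words=True).word_ids().
word_ids = [None, 0, 1, 1, 2, 3, None]
word_labels = [0, 1, 2, 2]         # O, B-IDIOM, I-IDIOM, I-IDIOM
print(align_labels(word_ids, word_labels))
# → [-100, 0, 1, -100, 2, 2, -100]
```

Another option is to copy the word's label onto every subword; masking continuations is the simpler choice since only one prediction per word is needed at inference time.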

eubinecto added a commit that referenced this issue Apr 10, 2022