Wl coref chains #1309

Merged
merged 39 commits into from Dec 1, 2023


AngledLuffa
Collaborator

WIP: add English coref to the pipeline

  • still need a script to rebuild data files
  • also need to turn the coref results in the pipeline into objects of some sort

AngledLuffa and others added 19 commits November 30, 2023 19:59
Loads the Coref model from a version downloaded via HuggingFace

Uses a rough draft of namedtuples for coref chains, connects the chains to the document

Could possibly make the chains have the text of the mentions.
Adds backlinks from words to the coref chains
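The chain and backlink structure described in these commits might look roughly like the sketch below. Every name here (`CorefMention`, `CorefChain`, `Word`, `attach_chains`, and the field names) is hypothetical and chosen for illustration; the PR's actual classes may be laid out differently.

```python
from collections import namedtuple

# Hypothetical shapes for illustration only, not the classes from this PR.
# end_word is exclusive, so (0, 0, 1) covers a single word.
CorefMention = namedtuple("CorefMention", ["sentence", "start_word", "end_word"])
CorefChain = namedtuple("CorefChain", ["index", "mentions"])


class Word:
    """Stand-in for a pipeline word that can point back at its chains."""

    def __init__(self, text):
        self.text = text
        self.coref_chains = []  # backlinks, filled in by attach_chains


def attach_chains(sentences, chains):
    """Add a backlink from every word inside a mention to the mention's chain."""
    for chain in chains:
        for mention in chain.mentions:
            words = sentences[mention.sentence]
            for word in words[mention.start_word:mention.end_word]:
                word.coref_chains.append(chain)


sentences = [[Word(t) for t in "John saw his dog".split()]]
chain = CorefChain(index=0, mentions=[CorefMention(0, 0, 1), CorefMention(0, 2, 3)])
attach_chains(sentences, [chain])
```

With this layout, the document owns the list of chains and each word can answer "which chains am I part of" without searching every mention.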

Assigns the model to the proper device when building the coref processor

Hide the CorefModel import in coref_processor to avoid importing transformers when not available
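Deferring a heavy import until it is needed can be sketched as follows. This is a generic illustration using `importlib` so it runs without any optional dependency; `LazyProcessor` and the module name `no_such_heavy_dependency` are made up, and the commit itself most likely just moves an ordinary `import` statement inside a function or method.

```python
import importlib


class LazyProcessor:
    """Hypothetical processor that defers a heavy import until first use."""

    def __init__(self, module_name):
        # Nothing is imported here, so constructing the object is always safe.
        self.module_name = module_name
        self._module = None

    def backend(self):
        # The import happens on first use and the module is cached afterwards.
        if self._module is None:
            self._module = importlib.import_module(self.module_name)
        return self._module


# A missing dependency only raises when the backend is actually requested.
proc = LazyProcessor("no_such_heavy_dependency")
try:
    proc.backend()
    failed = False
except ImportError:
    failed = True
```

The effect is that users who never request the processor never pay for, or trip over, the optional dependency.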
Don't add a default coref model to the default packages yet, as it is quite expensive

Instead, add an optional coref model for EN to the resources
Specifying "tokenize,coref" when creating a pipeline will find the optional EN model
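Requesting the optional model might look like the sketch below. It assumes this PR targets Stanza and that its usual `stanza.Pipeline(lang, processors=...)` constructor applies; the helper names `coref_pipeline_kwargs` and `build_coref_pipeline` are hypothetical. Building the pipeline requires the English coref model to have been downloaded, and how results are read back is not shown because the commits here do not pin that down.

```python
def coref_pipeline_kwargs():
    """Arguments that request only the tokenizer and the optional coref model."""
    return {"lang": "en", "processors": "tokenize,coref"}


def build_coref_pipeline():
    # Imported inside the function so this sketch loads without stanza installed.
    import stanza

    return stanza.Pipeline(**coref_pipeline_kwargs())
```

Because coref is not part of the default package, a pipeline created without naming it in `processors` would not load the model.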
…ber of steps. Helps but doesn't completely fix the problem of the models going to 0 F1
…hment into an object rather than a namedtuple, since the default encoder doesn't appear to have a way to override the decoding of a namedtuple. Only output a couple items from the coref chain as part of the json output
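The namedtuple limitation mentioned in this commit can be reproduced with the standard `json` module. In this minimal sketch, `MentionTuple`, `MentionObject`, and `encode` are hypothetical names: the `default` hook is only consulted for types the encoder cannot already serialize, and a namedtuple is a tuple, so it is written as a plain list without the hook ever being called.

```python
import json
from collections import namedtuple

# Hypothetical types for illustration only.
MentionTuple = namedtuple("MentionTuple", ["start_word", "end_word"])


class MentionObject:
    def __init__(self, start_word, end_word):
        self.start_word = start_word
        self.end_word = end_word


calls = []


def encode(obj):
    """default= hook: records each call and emits only selected fields."""
    calls.append(type(obj).__name__)
    if isinstance(obj, MentionObject):
        return {"start_word": obj.start_word}
    raise TypeError("cannot serialize %r" % (obj,))


# The namedtuple is serialized as a list and the hook is skipped entirely.
tuple_json = json.dumps(MentionTuple(0, 2), default=encode)
# The plain object reaches the hook, so its output can be customized.
object_json = json.dumps(MentionObject(0, 2), default=encode)
```

Switching to a regular class is what makes it possible to output only a few selected items per chain.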

Add a potential conllu format as well
Uses unit- for singletons, start-, end-, middle- otherwise
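The prefix scheme could be sketched as below, assuming "singleton" means a mention that covers exactly one word and that each prefix is followed by the chain's index. Both are assumptions, as are the function name `mention_labels` and the exclusive `end_word` convention; the real conllu output may carry more information per label.

```python
def mention_labels(start_word, end_word, chain_index):
    """Return one label per word of a mention covering [start_word, end_word)."""
    length = end_word - start_word
    if length < 1:
        raise ValueError("a mention must cover at least one word")
    if length == 1:
        # One-word mention: a single unit- label.
        return ["unit-%d" % chain_index]
    # Longer mention: start-, then any number of middle-, then end-.
    middle = ["middle-%d" % chain_index] * (length - 2)
    return ["start-%d" % chain_index] + middle + ["end-%d" % chain_index]
```

For example, a three-word mention in chain 2 would be labeled `start-2`, `middle-2`, `end-2`, while a one-word mention in chain 0 would get `unit-0`.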

Add an index to the coref chains
…g data) if only doing eval mode. Also, turn off the noisy output by default
AngledLuffa merged commit 5d0aa45 into dev on Dec 1, 2023
1 check passed
AngledLuffa deleted the wl_coref_chains branch on December 1, 2023 05:03

2 participants