Score alignments #6

Merged
merged 6 commits into from
Apr 21, 2021

Conversation

JohnGiorgi
Owner

@JohnGiorgi JohnGiorgi commented Apr 21, 2021

This PR produces a score for each alignment (between 0 and 1). The score is simply the fraction of interactions in BioGRID that we were able to find in the raw text using PubTator annotations. The purpose of this score is two-fold:

  1. We can eventually use it to filter out low-quality alignments.
  2. More immediately, we can use it to create high-quality validation and test sets by building them only from examples with a score of 1.0.

This scoring system is not perfect. In particular, it will not catch missed coreferent mentions (#7). That is OK for now, but probably something we should address down the line.
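
For illustration, the scoring and filtering described above can be sketched roughly as follows. This is a minimal sketch and not the code in this PR: the names `score_alignment` and `filter_by_score`, the dictionary layout, and the identifiers are all hypothetical. It also assumes that an interaction counts as "found" when both of its interactors appear among the PubTator-annotated entities for that text.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical representation: an interaction is a pair of gene identifiers.
Interaction = Tuple[str, str]


def score_alignment(interactions: Iterable[Interaction], annotated_ids: Set[str]) -> float:
    """Return the fraction of BioGRID interactions found in the text.

    Assumption: an interaction is "found" when both interactors are among
    the entity identifiers annotated by PubTator for this text.
    """
    interactions = list(interactions)
    if not interactions:
        # Nothing to align, so there is nothing to score; treat as 0.0.
        return 0.0
    found = sum(1 for a, b in interactions if a in annotated_ids and b in annotated_ids)
    return found / len(interactions)


def filter_by_score(examples: List[Dict], min_score: float = 1.0) -> List[Dict]:
    """Keep only examples whose alignment score is at least min_score."""
    return [ex for ex in examples if ex["score"] >= min_score]


# Toy data with made-up identifiers, for demonstration only.
examples = [
    {"pmid": "A", "interactions": [("1", "2"), ("3", "4")], "annotated_ids": {"1", "2", "3"}},
    {"pmid": "B", "interactions": [("5", "6")], "annotated_ids": {"5", "6"}},
]
for ex in examples:
    ex["score"] = score_alignment(ex["interactions"], ex["annotated_ids"])

# Only fully aligned examples would go into the validation and test sets.
perfect = filter_by_score(examples)
```

With the default threshold of 1.0, only fully aligned examples are kept; a lower `min_score` would correspond to the more general filtering use case.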

@JohnGiorgi JohnGiorgi merged commit d647bba into main Apr 21, 2021
@JohnGiorgi JohnGiorgi deleted the score-alignments branch April 21, 2021 19:19