Superscript feature in fulltext #573

Merged 5 commits on Aug 11, 2020

kermitt2
Owner

Add a feature to the full text model indicating whether a token is a superscript.
Motivation: better learning of citation callouts expressed as superscripts.
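
To illustrate the idea of a per-token superscript feature, here is a minimal sketch. It is not the code from this PR (GROBID itself is written in Java); the `Token` class, the `superscript_flags` function, and the two thresholds are all hypothetical, and the heuristic (smaller font plus raised baseline relative to the rest of the line) is only one plausible way to detect superscripts from layout information.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Token:
    """Hypothetical layout token; field names are assumptions for this sketch."""
    text: str
    font_size: float
    baseline_y: float  # distance from the top of the page; smaller means higher


def superscript_flags(line, size_ratio=0.85, min_raise=1.0):
    """Return one boolean per token: True if the token looks like a superscript.

    A token is flagged when its font is clearly smaller than the line's median
    font size AND its baseline sits above the line's median baseline.
    Both thresholds are illustrative assumptions, not tuned values.
    """
    if not line:
        return []
    ref_size = median(t.font_size for t in line)
    ref_base = median(t.baseline_y for t in line)
    return [
        t.font_size <= size_ratio * ref_size
        and (ref_base - t.baseline_y) >= min_raise
        for t in line
    ]


line = [
    Token("shown", 10.0, 100.0),
    Token("previously", 10.0, 100.0),
    Token("12", 6.5, 96.0),   # small and raised: a citation callout
    Token(".", 10.0, 100.0),
]
flags = superscript_flags(line)
# One binary feature value per token, as it could be appended to a feature row.
features = [(t.text, int(f)) for t, f in zip(line, flags)]
print(features)
```

With such a binary value attached to each token, a sequence labelling model can learn that small raised numbers in running text are likely citation callouts.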

Evaluated on 139,835 citation contexts, it improves citation context resolution as follows:
from

Precision citation contexts:     81.36
Recall citation contexts:        68.73
fscore citation contexts:        74.51

to

Precision citation contexts:     81.45
Recall citation contexts:        69.58
fscore citation contexts:        75.05
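
As a quick sanity check on the figures above, the reported f-scores are consistent with the usual harmonic mean of precision and recall. The snippet below is only a verification sketch; the function name `f_score` is an assumption and is not part of the evaluation tooling used in this PR.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


before = f_score(81.36, 68.73)
after = f_score(81.45, 69.58)
print(f"{before:.2f} -> {after:.2f}")  # prints "74.51 -> 75.05"
```

Most of the gain comes from recall (+0.85 points), with precision nearly unchanged (+0.09).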

Note: for some obscure reason, I mixed this PR with PR #572, so it will have to be merged after PR #572.

@kermitt2 kermitt2 added this to the 0.6.1 milestone Apr 21, 2020
@coveralls

Coverage Status

Coverage decreased (-0.08%) to 37.93% when pulling d6d6a8f on supercript-feature-in-fulltext into 206583a on master.

@kermitt2 kermitt2 merged commit f47dcc8 into master Aug 11, 2020