Ner model #1706
Merged
…uage specific. This greatly simplifies the input parameters and generally makes more sense.
…gnizer to help abstract this functionality and make it more reusable.
This case seems to be either impossible or very unlikely. Throwing an error for now unless we determine we want to handle this case.
It seems that the start indices from the mapping are already off by one, so the end index doesn't need the -1 adjustment.
…n mapping main span and part spans.
…anges are dealt with in NormalizerComposer
…he search by at least one word.
edamboritz
previously approved these changes
Dec 19, 2023
This will remove the extra www. if it appears.
… function is more reusable
Make sure not to remove duplicate refs that matched different parts.
…o figure out how to do subref on parsha.
English v3 results are still not on par with v2.
edamboritz
approved these changes
Jun 26, 2024
Overview
This PR adds functionality to the linker to (1) find named entities in addition to citations and (2) run the models in English.
The code needed to be refactored to make these changes cleanly. Bugs encountered in the existing code along the way were fixed, as explained below.
Below is an overview of the major components that were changed.
Linker API
Linker Python class
RefResolver
NamedEntityRecognizer
NamedEntityResolver
MapReferenceableBookNode
AltStructNode
CORSDebugMiddleware
Linker JS
Linker Index Converter
Normalization
norm_to_unnorm_indices(), which combines two commonly used functions into one.
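As a rough illustration of what mapping indices in normalized text back to the unnormalized text involves, here is a minimal sketch. It is not the implementation from this PR: the helper names `normalize_with_mapping` and `norm_span_to_unnorm`, the tag-stripping regex, and the sample string are all assumptions made for the example, and normalization here is limited to deleting regex matches.

```python
import re


def normalize_with_mapping(text, pattern):
    """Delete every match of `pattern` from `text`.

    Returns the normalized text plus a list giving, for each character kept,
    its index in the original (unnormalized) text.
    Hypothetical helper, not part of the PR.
    """
    kept_chars = []
    kept_positions = []
    last = 0
    for match in re.finditer(pattern, text):
        # Keep everything between the previous match and this one.
        for i in range(last, match.start()):
            kept_chars.append(text[i])
            kept_positions.append(i)
        last = match.end()
    # Keep the tail after the final match.
    for i in range(last, len(text)):
        kept_chars.append(text[i])
        kept_positions.append(i)
    return "".join(kept_chars), kept_positions


def norm_span_to_unnorm(span, kept_positions):
    """Convert a (start, end) span over normalized text, end exclusive,
    into the corresponding span over the original text.
    Hypothetical helper, not part of the PR.
    """
    start, end = span
    if not 0 <= start < end <= len(kept_positions):
        raise ValueError(f"invalid span: {span}")
    # Map the last included character, then add 1 to get an exclusive end.
    return kept_positions[start], kept_positions[end - 1] + 1


text = "See <b>Genesis 1:1</b> for details"
norm, kept = normalize_with_mapping(text, r"<[^>]+>")
start = norm.index("Genesis 1:1")
unnorm_span = norm_span_to_unnorm((start, start + len("Genesis 1:1")), kept)
print(text[unnorm_span[0]:unnorm_span[1]])
```

The exclusive end is computed from the last character inside the span rather than from the character after it, so removed text directly following the span is not pulled into the result.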