When only TextQuoteSelector is used, anchoring should use suffix to differentiate when prefix and exact are the same #1022

Closed
judell opened this issue Jun 4, 2019 · 2 comments

judell commented Jun 4, 2019

When the Hypothesis client creates two annotations, as shown here, http://jonudell.net/h/same-prefix-diff-suffix.html, both anchor, because the client uses the full complement of selectors.

But when Scibot creates the same two annotations, the anchors pile up on one occurrence of the exact like so:

<hypothesis-highlight class="annotator-hl">
   <hypothesis-highlight class="annotator-hl">WB-STRAIN</hypothesis-highlight>
</hypothesis-highlight>

Now, granted, Scibot would ideally post annotations using the full complement of selectors. It doesn't operate in a DOM context, so it would have a hard time computing a RangeSelector. Scibot probably could and should at least compute a TextPositionSelector as well as a TextQuoteSelector, which might be enough to differentiate. Still, one would expect the anchoring system to handle annotations whose TextQuoteSelectors differ by suffix alone. Evidently it doesn't.

/cc @tgbugs, see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6522220/

judell commented Jun 12, 2019

Here's why this happens:

https://github.com/tilgovi/dom-anchor-text-quote/blob/master/src/index.js#L109

We only search for the suffix if the prefix isn't found. Which makes sense: always searching would be expensive. If we kept track of anchors where prefixes and exacts match, we could maybe search for the suffix in only those cases.
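To make the idea concrete, here is a minimal sketch of consulting the suffix only when several occurrences share the same prefix and exact. It works on a plain string rather than a DOM, and `findAll` and `anchorQuote` are hypothetical names for illustration, not part of dom-anchor-text-quote.

```javascript
// Hypothetical helper (not the dom-anchor-text-quote API).
// Returns the start offset of every occurrence of `exact` in `text`.
function findAll(text, exact) {
  const offsets = [];
  if (exact === "") return offsets; // avoid looping forever on an empty quote
  let i = text.indexOf(exact);
  while (i !== -1) {
    offsets.push(i);
    i = text.indexOf(exact, i + 1);
  }
  return offsets;
}

// Returns the start offset chosen for the quote, or -1 if `exact` is absent.
// The suffix is checked only when more than one candidate remains after
// prefix filtering, so the common case pays no extra cost.
function anchorQuote(text, { exact, prefix = "", suffix = "" }) {
  const all = findAll(text, exact);
  if (all.length === 0) return -1;

  // Keep the occurrences whose preceding text ends with the prefix.
  const withPrefix = all.filter((start) =>
    text.slice(0, start).endsWith(prefix)
  );
  const candidates = withPrefix.length > 0 ? withPrefix : all;
  if (candidates.length === 1) return candidates[0];

  // Ambiguous: several occurrences share prefix and exact, so use the suffix.
  const withSuffix = candidates.filter((start) =>
    text.startsWith(suffix, start + exact.length)
  );
  return withSuffix.length > 0 ? withSuffix[0] : candidates[0];
}

// Two quotes with the same prefix and exact that differ only by suffix.
const text = "see WB-STRAIN:AA1 and see WB-STRAIN:BB2";
const first = anchorQuote(text, { prefix: "see ", exact: "WB-STRAIN", suffix: ":AA1" });
const second = anchorQuote(text, { prefix: "see ", exact: "WB-STRAIN", suffix: ":BB2" });
console.log(first, second);
```

With this approach the two annotations land on different occurrences instead of piling up on the first one.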

@robertknight
Member

Fixed by hypothesis/client#2814. When there are multiple matches for a given TextQuoteSelector they are now ranked by a weighted score based on quote similarity, suffix similarity, prefix similarity and distance from expected text position.
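A rough sketch of what ranking by weighted score can look like follows. It is not the code from hypothesis/client#2814: the weights, the `similarity` measure, and the names `rankMatches` and `WEIGHTS` are all assumptions for illustration. It also considers only exact occurrences of the quote, so the quote-similarity term is always 1 here.

```javascript
// Illustrative weights only; not taken from the actual fix.
const WEIGHTS = { quote: 50, prefix: 20, suffix: 20, position: 2 };

const reverse = (s) => [...s].reverse().join("");

// Crude similarity in [0, 1]: fraction of aligned characters that agree.
// An assumption for this sketch; an edit-distance measure would be more robust.
function similarity(a, b) {
  const len = Math.max(a.length, b.length);
  if (len === 0) return 1;
  let same = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] === b[i]) same++;
  }
  return same / len;
}

// Scores every exact occurrence of the quote and returns them best-first.
// `positionHint` is the expected start offset, if one is known.
function rankMatches(text, { exact, prefix = "", suffix = "" }, positionHint) {
  const scored = [];
  if (exact === "") return scored;
  for (
    let start = text.indexOf(exact);
    start !== -1;
    start = text.indexOf(exact, start + 1)
  ) {
    const end = start + exact.length;
    const before = text.slice(Math.max(0, start - prefix.length), start);
    const after = text.slice(end, end + suffix.length);
    // Prefixes are compared from the end, so reverse both strings first.
    const prefixScore = similarity(reverse(before), reverse(prefix));
    const suffixScore = similarity(after, suffix);
    const positionScore =
      positionHint === undefined
        ? 0
        : 1 - Math.abs(start - positionHint) / text.length;
    const score =
      WEIGHTS.quote * 1 + // exact occurrences only, so quote similarity is 1
      WEIGHTS.prefix * prefixScore +
      WEIGHTS.suffix * suffixScore +
      WEIGHTS.position * positionScore;
    scored.push({ start, end, score });
  }
  return scored.sort((a, b) => b.score - a.score);
}

const page = "see WB-STRAIN:AA1 and see WB-STRAIN:BB2";
const ranked = rankMatches(page, { prefix: "see ", exact: "WB-STRAIN", suffix: ":BB2" });
console.log(ranked[0]);
```

Because every candidate is scored rather than filtered, a selector that differs from its neighbour only by suffix still gets a distinct best match, and a position hint can break ties when prefix and suffix are both identical.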
