Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

why so many orphans at https://www.washingtonpost.com/news/energy-environment/wp/2017/01/18/u-s-scientists-officially-declare-2016-the-hottest-year-on-record-that-makes-three-in-a-row ? #143

Closed
judell opened this issue Jan 19, 2017 · 11 comments · Fixed by hypothesis/client#226
Assignees
Labels

Comments

@judell
Copy link

judell commented Jan 19, 2017

https://www.washingtonpost.com/news/energy-environment/wp/2017/01/18/u-s-scientists-officially-declare-2016-the-hottest-year-on-record-that-makes-three-in-a-row/

The orphan text (copied from an annotation):

"NASA concurred with NOAA, also declaring 2016 the warmest year on record in its own dataset that tracks the temperatures at the surface of the planet’s land and oceans, and expressing “greater than 95 percent certainty” in that conclusion. (In contrast, NOAA gave a 62 percent confidence in the broken record.)"

The text in the story (copied from the story):

"NASA concurred with NOAA, also declaring 2016 the warmest year on record in its own data set that tracks the temperatures at the surface of the planet’s land and oceans, and expressing “greater than 95 percent certainty” in that conclusion. (In contrast, NOAA gave a 62 percent confidence in the broken record.)"

Why the orphans on this page?

We'll definitely hear from Climate Feedback about this, so would be good to have an answer in hand before turning on the orphans tab.

@ajpeddakotla
Copy link

@robertknight can you comment on what might be happening (or happened here)?

If you look at the annotations in the orphans tab, the text that many of the orphans originally anchored to, is still present. And it's not clear why they would show up in the annotations tab.

@judell thanks for finding this.

@robertknight
Copy link
Member

@robertknight can you comment on what might be happening (or happened here)?

I'm going to be looking into this issue this afternoon and updating this with my findings.

@judell
Copy link
Author

judell commented Jan 20, 2017

Can you check the aliases for that URL in the db? Sometimes ClimateFeedback annotate a copy of an article that's not openly available to the team, and I'm wondering if that may be a factor.

@robertknight
Copy link
Member

Couple of observations so far:

  1. The page polyfills window.console with a replacement that doesn't actually write to the console. FFS
  2. If I load the page with ads blocked using AdBlock Plus, all annotations anchor. If I load without ABP I see 18 annotations anchor and 15 fail to anchor.

@robertknight
Copy link
Member

Found the issue. The problem is as follows:

  1. When ads are enabled on the page, a script is loaded which adds an enumerable containsArray method to Array.prototype
  2. dom-anchor-text-quote uses for (var .. in ..) { ... } to loop over items in an array at one point. This is the wrong way to loop over an array because it includes enumerable properties on the prototype, including containsArray in this case.
  3. If an exact match search for the quote fails, the code in diff-match-patch blows up when it tries to process the containsArray function where it is expecting a string

I'll get this fixed upstream in the dom-anchor-text-quote library.

The more general solution to this category of problem is to find ways to isolate Hypothesis from the page's JavaScript.

robertknight added a commit to hypothesis/dom-anchor-text-quote that referenced this issue Jan 20, 2017
for .. in loops iterate over all enumerable properties of an object,
including those on the prototype. This means that if the code runs in an
environment where Array.prototype contains enumerable properties, it
will loop over those too.

Use Array.prototype.reduce instead as a natural companion to a fold
function.

See hypothesis/product-backlog#143
@judell
Copy link
Author

judell commented Jan 20, 2017

@ajpeddakotla
Copy link

@robertknight
Copy link
Member

In the case of http://www.newyorker.com/science/maria-konnikova/being-a-better-online-reader , all annotations anchor correctly when using the extension but a significant number (41 at the time of writing) fail to anchor when the page is loaded in Via.

@ajpeddakotla ajpeddakotla added this to Discovery/Spec in Feature Inbox Jan 30, 2017
@robertknight
Copy link
Member

Possibly related #146

@robertknight
Copy link
Member

robertknight added a commit to hypothesis/client that referenced this issue Feb 11, 2017
…otype

Update dom-anchor-text-quote to incorporate
tilgovi/dom-anchor-text-quote#11 which fixes quote anchoring on pages
which include JS that adds enumerable properties to `Array.prototype`.

Fixes hypothesis/product-backlog#143
@judell
Copy link
Author

judell commented Feb 11, 2017

Excellent fix, thanks @robertknight!

@ajpeddakotla ajpeddakotla removed this from Discovery/Spec in Feature Inbox Mar 20, 2017
robertknight added a commit to hypothesis/dom-anchor-text-quote that referenced this issue Jun 20, 2017
* Avoid using `for ( .. in .. )` to iterate over arrays

for .. in loops iterate over all enumerable properties of an object,
including those on the prototype. This means that if the code runs in an
environment where Array.prototype contains enumerable properties, it
will loop over those too.

Use Array.prototype.reduce instead as a natural companion to a fold
function.

See hypothesis/product-backlog#143

* Correct test case description to match assertion

* Add a couple of additional test cases for long quotes

Given that long quotes are processed in chunks, add tests that check for
matches and non-matches in different 32-char chunks within a string.

* Refactor check for `acc` being null

As suggested in tilgovi#11 (review)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants