Skip to content
This repository has been archived by the owner on Sep 2, 2020. It is now read-only.

New: reference collection #140

Merged
merged 21 commits into from Jun 29, 2018
Merged

New: reference collection #140

merged 21 commits into from Jun 29, 2018

Conversation

montehurd
Copy link
Collaborator

@montehurd montehurd commented May 9, 2018

Used to make it easier to show native adjacent reference panel carousel:

dino mov

Usage details:

Used on iOS when a reference is tapped to show a native panel which lets user drag side to side to see any info for references which were adjacent to the tapped reference. ie: if a lone reference is tapped ie [4] we show a panel with its reference text, but if that reference has adjacent references ie [3][4][5] the panel we show can be dragged side to side to see the references text for [3] and [5].

To use:
Pass isCitation the href string of the tapped reference anchor, if it returns true (on iOS we use it to determine if the type of the clicked item is reference) call collectNearbyReferences with the document and the tapped anchor element to extract relevant references info (iOS example) including a referencesGroup array and the index of the selected element selectedIndex from that group.

The each item in referencesGroup will contain:

  • id - id of the reference
  • rect - ClientRect of the reference (so you can do a native overlay dimming everthing but the tapped reference link ie [4])
  • text - the text of the tapped reference link ie [4]
  • html - the html/text of the reference from the actual reference at the bottom of the page (in the animation above this is everything shown below the horizontal line in the panels)

iOS native interface details:

  • On iOS we use it as outlined above to extract adjacent reference info, and we send that info across the iOS version of the JS/Native bridge.
  • We then present a native interface showing the adjacent reference group, with one panel per reference (the bottom of the panel has a web view for showing the reference html, but it just looks like part of the panel, as you can see in the animation).
  • The native code also applies an overlay with a cutout around the tapped ref.
  • It also scrolls the web view up if the native panel it presents would cover the tapped reference, as you can see in the animation when I tap reference [11].

Notes:

  • I also moved some utilities to NodeUtilities.js - apologies that this wasn't as separate commit.
  • I had to fairly heavily refactor the iOS implementation to modernize it a bit and make it more testable.
  • On iOS on tablet we only show the single tapped reference (no side-to-side sliding panels or dark overlay):

screen shot 2018-05-09 at 4 33 00 pm

…ferences.

Used on iOS when a reference is tapped to show a panel which lets user drag side to side to see any info for references which were adjacent to the tapped reference. ie: if a lone reference is tapped "[4]" we show a panel with its reference text, but if that reference has adjacent references ie "[3][4][5]" the panel we show can be dragged side to side to see the references text for "[3]" and "[5]".

To use:
Pass `isCitation` the href string of the tapped reference anchor, if it returns true call `collectNearbyReferences` with the document and the tapped anchor element to get extracted references info including a `referencesGroup` array and the index of the selected element `selectedIndex` from that group.

The each item in `referencesGroup` will contain:
 `id` - id of the reference
 `rect` - ClientRect of the reference (so you can do a native overlay dimming everthing but the tapped reference link ie "[4]")
 `text` - the text of the tapped reference link ie "[4]"
 `html` - the html/text of the reference from the actual reference at the bottom of the page
@sharvaniharan
Copy link
Collaborator

Thankyou @montehurd :) yay! i'm looking forward to use this!

@montehurd
Copy link
Collaborator Author

@sharvaniharan You're welcome! I just edited the description with a few more details of how we use this on iOS. Let me know if you have any questions or issues :)

montehurd added a commit to wikimedia/wikipedia-ios that referenced this pull request May 9, 2018
This doesn't include the pagelib version bump because the PR moving those bits to the page lib isn't merged at the moment. As soon as wikimedia/wikimedia-page-library#140 is merged we can bump the pagelib version (as a new commit on this PR) and it should work.

/**
* Checks if element has a child anchor with a citation link.
* @param {!HTMLElement} element
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra space between type and name.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

* @return {!boolean}
*/
const isWhitespaceTextNode = node =>
!(!node || node.nodeType !== Node.TEXT_NODE || !node.textContent.match(/^\s+$/))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion, this may run faster but it would read more easily if the negation was pushed inward:

node && node.nodeType === Node.TEXT_NODE && node.textContent.match(/^\s+$/))

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

const hasCitationLink = element => {
try {
return isCitation(getFirstChildAnchor(element).getAttribute('href'))
} catch (e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was wrapped previously to prevent catch on empty links. I feel that needless try / catch blocks obscure how easy it is to reason about the code. What do you think of:

const anchor = element.querySelector('a')
return anchor && anchor.getAttribute('href')

Instead of

try {
  return isCitation(getFirstChildAnchor(element).getAttribute('href'))
} catch (e) {
  return false
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I don't recall the exact reason for the try/catch - been a while. I played with your alternative for a bit (adding call to isCitation where needed) but didn't get it to work (tests failed). Would be ok to leave this as-is for now?

Copy link
Collaborator Author

@montehurd montehurd Jun 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nevermind I think I fixed it...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect! I'm glad we have a test case that fails! Now we can easily make the improvement right now and be confident in our code instead of punting it down the line and be confused later.

I looked into this myself. This try / catch block was obscuring the real underlying issue which was that adjacentNonWhitespaceNode() can return null and getFirstChildAnchor() expects nonnull.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heya just a heads-up I reverted your change - see my comment here #140 (comment) for details.

* @param {!string} href
* @return {!boolean}
*/
const isCitation = href => href.indexOf('#cite_note') > -1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be more more precise to say anchor.hash.startsWith('#cite_note'). You may consider also moving the selector to the top of the file, as CITE_SELECTOR_PREFIX, as you have done with REFERENCE_SELECTOR.

Copy link
Collaborator Author

@montehurd montehurd Jun 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer to keep the method narrowly scoped vs passing it the entire anchor element. Open to refactoring this later if you feel strongly about it though :)

I added and used a constant per your suggestion.


/**
* Determines if node is a text node containing only whitespace.
* @param {!Node} node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra space between type and name.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

const getRefTextContainer = (document, sourceNode) => {
const refTextContainerID = getFirstChildAnchor(sourceNode).getAttribute('href').slice(1)
const refTextContainer = document.getElementById(refTextContainerID)
|| document.getElementById(decodeURIComponent(refTextContainerID))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to check both decoded and undecoded?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps no longer, but I do recall some of these had funky encoding at one time...

/**
* Reference item model.
*/
class ReferenceItem {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about using a JSDoc for typing instead of a class? Then creating a ReferenceItem is just matching the shape which requires explicit labeling of properties. For example, before:

class ReferenceItem {
  constructor(id, rect, text, html) {
    this.id = id
    this.rect = rect
    this.text = text
    this.html = html
  }
}
const reference = new ReferenceItem('a', {width, height}, 'foo', '<b>bar</b>')

With labeled properties:

/**
 * @typedef ReferenceItem
 * @prop {string} id
 * @prop {DOMRect} rect
 * @prop {?string} text
 * @prop {?string} html
 */
const reference = {id: 'a', rect: {width, height}, text: 'foo', html: '<b>bar</b>'}

Same thing for NearbyReferences.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer using explicit language constructs over JSDoc shape conformance, though I'd be interested in chatting about this in the future. I think there are some benefits you get with classes, but there may be details I'm missing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usage of the class keyword exposes us to misusage of extends and we're not using any extra functionality here. It's just plain data. I think it's easier to reason about a simple JSON object than a class. What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usage of the class keyword exposes us to misusage of extends

Using classes to model data is pretty common. How is classes being extendable practically risky here?

beforeEach(() => {
document.documentElement.innerHTML = `
<stuff>
\t<b id=one>one</b>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry but why an escaped tab here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is for adjacentNonWhitespaceNode(), so iirc I separated the elements with newlines and leading space "whitespace", so I thought it would be good to throw tabs in there too.

currentNode = adjacentNonWhitespaceNode(currentNode, siblingGetter)
if (hasCitationLink(currentNode)) {
nodeCollector(currentNode)
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you invert this condition, you can replace the continue here with the break below.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh! Good catch! Fixed

const collectAdjacentReferenceNodes = (node, siblingGetter, nodeCollector) => {
let currentNode = node
while (true) {
currentNode = adjacentNonWhitespaceNode(currentNode, siblingGetter)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may wish to explicate in the JSDocs that the starting node is never collected.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good point. Added

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And that when there is only one node, the passed in collector is never called.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't it go without saying that if there no adjacent reference nodes, none will be collected?

@sharvaniharan
Copy link
Collaborator

@montehurd works on Android now! @niedzielski will let you merge once the changes you have requested are in!

@niedzielski
Copy link
Contributor

Re-review pending follow-ups from @montehurd.

@montehurd
Copy link
Collaborator Author

@niedzielski I think I replied to and/or pushed commits to all your comments. Should be good for a re-review when you have time. Thanks again for the great feedback!

@sharvaniharan
Copy link
Collaborator

@niedzielski @montehurd can we merge this?

Many JavaScript developers agree to favor strict equality (===) to loose
equality (==). This refactor makes the same argument for tests that
strict is generally safer.
/**
* Reference item model.
*/
class ReferenceItem {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usage of the class keyword exposes us to misusage of extends and we're not using any extra functionality here. It's just plain data. I think it's easier to reason about a simple JSON object than a class. What do you think?

const ReferenceCollection = pagelib.ReferenceCollection

const referenceGroupHTML = `
<sup id="cite_ref-a" class="reference"><a id='a1' href="#cite_note-a">[4]</a></sup>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's mixed single and double quotes in this string and others. Can we use single?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

* @param {!HTMLElement} element
* @return {?HTMLAnchorElement}
*/
const getFirstChildAnchor = element => element.querySelector('A')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heya!

I wanted to share my perspective on replacing comments with functions as you've done here.

In general, I think it's great practice to use good naming on our symbols and, of course, abstraction for the win. However, I probably wouldn't write a comment for querying the first anchor of a DOM subtree since it's so commonplace as to be un-noteworthy and the abstraction here does nothing. Function calls may now read more like English but they also obscure the underlying simplicity of what's actually happening here with a new custom DSL and greatly inflate the verbosity of the code which is more of a decrease to readability than an increase.

Consider what happens also when the second child anchor is needed in some new patch. Do we make a new function that sits alongside getFirstChildAnchor() called getSecondChildAnchor()? That seems bad enough but when the third anchor is needed, suddenly we might wonder if getNChildAnchor() should be added which takes an argument, n. Now we have three functions that all boil down to querySelector / querySelectorAll and we still need to pass an argument.

I think we both feel very comfortable working with querySelector / querySelectorAll so my opinion is that we should favor their direct usage unless we're doing something extra that we want to capture in a function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya this was overly abstracted. I removed it. Good catch :)

})
it('correctly affirms child anchor does not have citation link', () => {
const element = document.querySelector('#cite_ref-b')
assert.equal(hasCitationLink(element), false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced all these calls to assert.equal with assert.strictEqual since for the same reason we favor === to ==. 👍

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coolio! Thanks!

const hasCitationLink = element => {
try {
return isCitation(getFirstChildAnchor(element).getAttribute('href'))
} catch (e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect! I'm glad we have a test case that fails! Now we can easily make the improvement right now and be confident in our code instead of punting it down the line and be confused later.

I looked into this myself. This try / catch block was obscuring the real underlying issue which was that adjacentNonWhitespaceNode() can return null and getFirstChildAnchor() expects nonnull.

const collectAdjacentReferenceNodes = (node, siblingGetter, nodeCollector) => {
let currentNode = node
while (true) {
currentNode = adjacentNonWhitespaceNode(currentNode, siblingGetter)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And that when there is only one node, the passed in collector is never called.

@montehurd
Copy link
Collaborator Author

montehurd commented Jun 12, 2018

@niedzielski Heya! Just a head's-up I reverted 22bfca3 as it didn't work. I'll take a peek at the cause and undo the revert if I can suss it out :)

Edit: So it looks like it's not working because you're trying to call querySelector on a text node. btw sorry if some of these bits were a little hard to follow - I've tried to make it more grok-able than the original but there's certainly room for further improvement :)

@niedzielski
Copy link
Contributor

niedzielski commented Jun 22, 2018

Edit: So it looks like it's not working because you're trying to call querySelector on a text node. btw sorry if some of these bits were a little hard to follow - I've tried to make it more grok-able than the original but there's certainly room for further improvement :)

Great! Thanks for diagnosing. I've added a test case for the scenario you've described and as I understand the desired behavior, and made the fix. Please be sure to check it out!

@montehurd
Copy link
Collaborator Author

@niedzielski Your 5ae2c77 commit worked! Thanks!

If you have a sec there are a couple questions above:
#140 (comment)
#140 (comment)

@montehurd
Copy link
Collaborator Author

@niedzielski btw thanks again for all the great comments and CR!!!

@sharvaniharan
Copy link
Collaborator

@niedzielski @montehurd merging this! thanks for all the review work you guys did!! 💯

@sharvaniharan sharvaniharan merged commit 2e7d7d0 into master Jun 29, 2018
@sharvaniharan sharvaniharan deleted the new/reference-collection branch June 29, 2018 12:49
wmfgerrit pushed a commit to wikimedia/apps-android-wikipedia that referenced this pull request Aug 2, 2018
- Added a ViewPager to view grouped citations
- integrated with the PR for grouped citations : wikimedia/wikimedia-page-library#140

Bug: T172283
Change-Id: Id1302b4a4ee0ce5553d397301ce17c528c08d07d
dbrant pushed a commit to dbrant/apps-android-wikipedia that referenced this pull request Aug 23, 2018
- Added a ViewPager to view grouped citations
- integrated with the PR for grouped citations : wikimedia/wikimedia-page-library#140

Bug: T172283
Change-Id: Id1302b4a4ee0ce5553d397301ce17c528c08d07d
@montehurd
Copy link
Collaborator Author

montehurd commented Feb 11, 2019

I mentioned this PR on...
https://phabricator.wikimedia.org/T164312

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
3 participants