This repository was archived by the owner on Dec 8, 2017. It is now read-only.
XPath rewrite #20
Open
divergentdave wants to merge 9 commits into
jsdom doesn't play well with browserify/uglifyjs
Note that some links went away because GPO FDsys doesn't have documents beyond certain date ranges. These limitations are reflected in the link handling logic of the citation library, but were not reflected previously in this repository.
Speaking of breaking, I have pushed more commits to fix test breakage, etc.
I noticed this library works by modifying document.innerHTML. This approach can have a lot of downsides, especially when used on sites with dynamic content.
This PR is a rewrite of the script, so that it modifies the document by only touching the text nodes that contain citations. First, using XPath, it takes a snapshot of all text nodes that aren't direct children of <script>, <style>, or <a> tags. Each text node is scanned individually for citations, and there are early outs for all-whitespace text nodes or for text nodes with no citations. Then, using the matches found by the citation library, the original text is sliced up, new text nodes and link elements are inserted in its place, and the old text node is removed. As before, this gets run on DOMContentLoaded in browser contexts, and the functions are exported for use elsewhere in non-browser contexts. I also upgraded the version of the citation library, and used the built-in links feature to get GPO links for each citation.
I did a quick, unscientific speed comparison of the code before and after on a CRS report with plenty of citations, and I measured the runtime for the relevant JS method to be ~6s before and ~70ms after.
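For anyone skimming, here is a rough sketch of the approach described above. It is not the code in this PR. The function names (splitAtMatches, snapshotTextNodes, linkifyTextNode), the findMatches callback, and the match shape { index, text, url } are all made up for illustration and are not the citation library's API. The sketch also swaps the node in one step with replaceChild and a document fragment, rather than inserting new nodes and then removing the old one.

```javascript
// Hypothetical match shape (an assumption, not the citation library's output):
//   { index: number, text: string, url: string }

// Pure helper: cut `text` into plain-text and link segments around each match.
function splitAtMatches(text, matches) {
  const segments = [];
  let cursor = 0;
  const sorted = [...matches].sort((a, b) => a.index - b.index);
  for (const m of sorted) {
    if (m.index < cursor) continue; // skip a match that overlaps the previous one
    if (m.index > cursor) {
      segments.push({ type: "text", value: text.slice(cursor, m.index) });
    }
    segments.push({ type: "link", value: m.text, url: m.url });
    cursor = m.index + m.text.length;
  }
  if (cursor < text.length) {
    segments.push({ type: "text", value: text.slice(cursor) });
  }
  return segments;
}

// Snapshot every text node whose parent is not <script>, <style>, or <a>.
// A snapshot is used so that later DOM edits do not invalidate the result.
function snapshotTextNodes(doc) {
  const ORDERED_NODE_SNAPSHOT_TYPE = 7; // XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
  const result = doc.evaluate(
    "//text()[not(parent::script) and not(parent::style) and not(parent::a)]",
    doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  const nodes = [];
  for (let i = 0; i < result.snapshotLength; i++) {
    nodes.push(result.snapshotItem(i));
  }
  return nodes;
}

// Replace one text node with a mix of text nodes and <a> elements.
// `findMatches` is a hypothetical callback: text -> array of matches.
function linkifyTextNode(doc, node, findMatches) {
  const text = node.nodeValue;
  if (!/\S/.test(text)) return;       // early out: all-whitespace node
  const matches = findMatches(text);
  if (matches.length === 0) return;   // early out: no citations found
  const fragment = doc.createDocumentFragment();
  for (const seg of splitAtMatches(text, matches)) {
    if (seg.type === "link") {
      const a = doc.createElement("a");
      a.href = seg.url;
      a.textContent = seg.value;
      fragment.appendChild(a);
    } else {
      fragment.appendChild(doc.createTextNode(seg.value));
    }
  }
  node.parentNode.replaceChild(fragment, node);
}
```

In a browser, something like this would presumably be driven from a DOMContentLoaded handler that calls snapshotTextNodes(document) and passes each node to linkifyTextNode. Keeping the slicing step as a pure function means it can be tested without a DOM.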
Let me know if you have any questions. Of course, this is a breaking change since the function now operates in-place on a DOM, rather than returning an HTML string. I have also changed the main function name, along with its signature.