Inserting <plaintext> means mobify.js AND main.js are single-points-of-failure #127

Closed
shogun70 opened this Issue · 14 comments

3 participants

@shogun70

Once the <plaintext> tag has been inserted the page is blank until it is rewritten.
This turns into a completely unusable page / site if anything prevents this rewrite from happening, e.g.

  • mobify.js fails to load because it is on an external server (CDN perhaps) which is down or blocked
  • mobify.js throws an error, because it encounters an unexpected browser behavior
  • main.js throws an error because it wasn't tested cross-browser

Ideally the initialization script provides a fallback to re-make the page as though there were no <plaintext> inserted.

@johnboxall
Owner

@shogun70 there are a couple of approaches to avoid this - assuming you have a modern browser, you could attach an error listener to the inserted script - on error you could set an "opt-out" cookie and reload the page.
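A minimal sketch of the opt-out idea, for illustration only. The cookie name `mobify-optout` and the helpers `loadOrOptOut` and `hasOptedOut` are hypothetical and not part of Mobify.js. The `doc` and `win` parameters stand in for `document` and `window` so the logic can be exercised outside a browser.

```javascript
// Hypothetical helper: load a script and opt out of capturing if it fails.
// `doc` and `win` are normally `document` and `window`.
function loadOrOptOut(doc, win, src) {
  var script = doc.createElement('script');
  script.src = src;
  script.onerror = function () {
    // Assumed cookie name; the bootstrap tag would have to check for it
    // before inserting <plaintext>.
    doc.cookie = 'mobify-optout=1; path=/';
    win.location.reload();
  };
  doc.getElementsByTagName('head')[0].appendChild(script);
  return script;
}

// Hypothetical check the bootstrap would run before inserting <plaintext>.
function hasOptedOut(doc) {
  return /(^|;\s*)mobify-optout=1(;|$)/.test(doc.cookie);
}
```

In a page this might be called as `loadOrOptOut(document, window, '/mobify.js')`, with the bootstrap skipping the `<plaintext>` insertion whenever `hasOptedOut(document)` is true. Note that an `error` listener only catches load failures, not exceptions thrown while the script runs.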

@shogun70

An opt-out cookie throws the complexity back onto the site author. I think an option using the already loaded content would be more appropriate.

Assuming the init script is the first script in the document, you could:

  1. wait for DOMContentLoaded or equivalent
  2. remove all script elements from document
  3. document.write the <html> and <head> opening tags followed by head.innerHTML followed by the text-content of <plaintext>.

Something like:

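// Fallback: rebuild the page from the markup captured inside <plaintext>.
var onError = function() {
  // getElementsByTagName returns a live collection, so remove from the end.
  var scripts = $$('script');
  for (var i = scripts.length - 1; i >= 0; i--) {
    scripts[i].parentNode.removeChild(scripts[i]);
  }

  var head = $$('head')[0];
  var plaintext = $$('plaintext')[0];
  // Build the markup before document.open(), which discards the current DOM
  // (document.documentElement would be null afterwards).
  var html = openingTag(document.documentElement) +
    openingTag(head) +
    head.innerHTML +
    plaintext.innerHTML;
  document.open();
  document.write(html);
  document.close();
};

// Returns the opening tag of el, including its attributes.
function openingTag(el) {
  var tagName = el.tagName;
  return el.cloneNode(false).outerHTML.replace(RegExp('</' + tagName + '>$', 'i'), '\n');
}

// Shorthand for document.getElementsByTagName.
function $$(tag) {
  return document.getElementsByTagName(tag);
}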

Probably a script loading timeout should also trigger this error recovery, so if mobify.js is taking a long time to download (because it is blocked) then it is abandoned and the unprocessed page is restored.
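The timeout idea could be sketched roughly as follows. The helper name `armFailsafe` and its parameters are hypothetical; `setTimer` exists only so the timer can be replaced in tests and defaults to `setTimeout`.

```javascript
// Hypothetical helper: run `recover` at most once if `isLoaded()` is still
// false when the timer fires.
function armFailsafe(isLoaded, recover, timeoutMs, setTimer) {
  var fired = false;
  function trigger() {
    if (fired || isLoaded()) return;
    fired = true;
    recover();
  }
  (setTimer || setTimeout)(trigger, timeoutMs);
  // The same function can also be assigned to script.onerror, so a load
  // error and a timeout share one recovery path.
  return trigger;
}
```

A page might arm it with something like `armFailsafe(function () { return document.getElementsByTagName('plaintext').length === 0; }, onError, 5000)`, treating the page as loaded once the `<plaintext>` element is gone. The 5000 ms value is an arbitrary placeholder, not a recommendation.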

@johnboxall
Owner

@shogun70 Assuming the document is well formed and you aren't worried about losing content above the bootstrap tag, your solution should work!

https://gist.github.com/johnboxall/5355380

Some older browsers don't support el.outerHTML - depending on what browsers we're supporting, we may need to polyfill that.

Our philosophy behind the included bootstrap tag (https://github.com/mobify/mobifyjs/blob/v2.0/tag/bootstrap.html) was that it should be as small as possible - but I can see how this would be useful in a production deployment.

Perhaps we can provide a second "production" bootstrap tag that uses a combined library / adaptation and includes this recovery mechanism?

@shogun70

@johnboxall I've done a bit of experimenting and came up with something

https://gist.github.com/shogun70/5362893

The boot-function in that gist has this signature:

function(window, document, detector, timeout, scripts) {}

where

  • detector is the same as the current mobify boot-function. Internally the function also rejects browsers that don't support script.onload.

  • timeout is for a failsafe timer which rewrites the document if the <plaintext> is still in the document

  • scripts is an array of scripts, presumably mobify.js and main.js

Probably the most significant result is that it seems perfectly viable to put the boot-script before the <html> tag, and in fact I place it before the <!DOCTYPE html> and then reproduce the doctype again at the top.

This positioning is so advantageous that I recommend adopting it unless there are insurmountable obstacles.

  • on the server it allows adding mobify to pages by simple concatenation of doctype, boot-script and the default HTML file.
  • in the browser it simplifies the recovery code, and would have the same benefit for mobify.js.

I've tested on IE9, IE10, Firefox and Chrome, but there are no guarantees as to its robustness. It is definitely for experimenting, not merging.

@shogun70

Another thought...

I'm dubious of the benefits of this inline boot-script. If the top of the document looked like

<!DOCTYPE html>
<script src="/boot.js"></script>
<!DOCTYPE html>
<html>

then the boot-script can be modified without updating the document.

boot.js could even be designed to contain the site-specific preprocessing, in which case there are still only two external scripts: boot.js and mobify.js. Alternatively, the boot-script could load mobify.js and main.js in parallel.

The only time this would be slower than the current solution is when the boot-script isn't in cache and the browser won't be using mobify.

@johnboxall
Owner

@shogun70 when modern browsers hit a blocking resource the preparser fetches resources past the blocking script:

https://gist.github.com/johnboxall/5364132

We've made a philosophical decision to block the preparser until we've finished capturing the document. That means the bootstrap logic that writes out the <plaintext> must be inline.

Regarding your updated bootstrap - moving the bootstrap above the HTML does simplify recovery.

I'd be somewhat concerned about moving the script outside of the head. Are you sure that there are no ill effects on older browsers, e.g. prematurely opening the <head> in response to seeing a script?

@johnboxall
Owner

@shogun70 Also - in a production scenario I'd expect you to concatenate all capturing related scripts - this would allow us to remove the queuing logic from your bootstrap.

@shogun70

@johnboxall Thanks for those responses.

  1. I forgot about look-ahead parsing and speculative downloading in modern browsers - Mobify's goals do seem to necessitate an inline boot-script.

  2. I believe <script> before <html> is safe in standards mode but I haven't checked in all browsers.

    I think the browser does create <html> and <head> placeholders which are then replaced if they are found in the document markup. This means that the boot-script is actually a child of <head> (not that it matters).

    I have checked IE6, for instance, and all tags received attributes from the markup.

  3. I think the queueing logic is about 100 chars if minified. It may not be robust, but I don't think the size is an issue.

@shogun70

@johnboxall Although, regarding the boot-script needing to be inline in order to prevent resource prefetching, this is only the case when the boot-script is not in the browser cache.

@shogun70

Reflecting on this a bit more...

a) The boot-script needs to be able to read and write the real-document-markup (for the fail-safe rewrite)
b) Placing the boot-script before the real-document-markup means the page markup can be trivially constructed.
c) The boot-script needs to be able to prevent the mobify-script from rewriting the page (after the fail-safe timeout)

I think the boot-script could and should provide exclusive read and write access to the page markup.
I've updated my gist

https://gist.github.com/shogun70/5362893

In addition to the previous functionality, the boot-script creates a global docProxy which is deleted when the page is rewritten and provides this API

  • state: Initially loading, then set to loaded by window.onload

  • onload: an externally provided function triggered by window.onload

  • getHTML(): retrieve the innerText / textContent of the <plaintext> element, i.e. the real doc markup

  • setHTML(html): document.write(html), deleting docProxy in the process

  • restore(): rewrite the page with the real doc markup
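The API listed above could look roughly like this. It is a sketch under stated assumptions, not the code from the gist: the factory name `createDocProxy` is hypothetical, `win` and `doc` stand in for `window` and `document`, and the boot-script is assumed to precede all real markup so that the `<plaintext>` text is the complete document.

```javascript
// Sketch of the docProxy API; not the gist's actual implementation.
function createDocProxy(win, doc) {
  var proxy = {
    state: 'loading',
    onload: null,
    // Read lazily: <plaintext> is still empty when the boot-script runs.
    getHTML: function () {
      var pt = doc.getElementsByTagName('plaintext')[0];
      return pt ? (pt.textContent || pt.innerText || '') : '';
    },
    setHTML: function (html) {
      delete win.docProxy; // the proxy is single-use
      doc.open();
      doc.write(html);
      doc.close();
    },
    restore: function () {
      // getHTML() is evaluated before setHTML() discards the document.
      proxy.setHTML(proxy.getHTML());
    }
  };
  win.onload = function () {
    proxy.state = 'loaded';
    if (typeof proxy.onload === 'function') proxy.onload();
  };
  win.docProxy = proxy;
  return proxy;
}
```

Because `setHTML` deletes the global before writing, a late-arriving mobify.js that looks for `docProxy` after the fail-safe has fired would find nothing to rewrite, which is one way to satisfy point (c).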

@johnboxall
Owner

Hey @shogun70,

We'd be willing to accept a Pull Request for a new tag that eliminates the capturing library as a SPOF.

Investigating SPOF has led us to other questions:

  1. Should the bootstrap logic be moved to an external script, given that the pre-parser behaviour is only present when the script is not cached? (This needs to be verified.)
  2. Can the tag be repositioned to reduce the complexity of eliminating SPOF?

I think these are both good discussions to have but our intention is to be conservative with fundamental changes. For instance, we can speculate that window.onload works similarly on a captured document to document.readyState - but we'd rather be consistent:

29e3a81#L10R107

I believe this tag could be included in the library as an "experimental" new tag with the goal of eventually replacing the existing tag.

If you'd like to issue a PR with a testcase that shows the new tag surviving SPOF, I'd be happy to review it.

@shogun70

As an aside, /complete|interactive/.test(document.readyState) and document.onreadystatechange are not reliable as tests for DOMContentLoaded. All versions of IE can have interactive state before the DOM is fully loaded, or even when the first script runs. Typically this wouldn't matter because the document will have finished loading before the mobify script has.

Here's a test page

https://gist.github.com/shogun70/5388420
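Following that caveat, a more conservative readiness check might trust only the `complete` state and otherwise wait for an event. This is a sketch; the helper name `whenDomReady` is hypothetical, and `doc` and `win` stand in for `document` and `window`.

```javascript
// Hypothetical helper: call `cb` once when the DOM has been parsed.
// Only 'complete' is trusted as "already parsed"; 'interactive' is not.
function whenDomReady(doc, win, cb) {
  var done = false;
  function once() {
    if (done) return;
    done = true;
    cb();
  }
  if (doc.readyState === 'complete') {
    once();
    return;
  }
  doc.addEventListener('DOMContentLoaded', once, false);
  // Fallback in case DOMContentLoaded fired before the listener was added.
  win.addEventListener('load', once, false);
}
```

The trade-off is that a script attached after `DOMContentLoaded` has already fired waits until `load`, which is later than strictly necessary but never too early.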

@jansepar
Owner

@shogun70 thanks for the test! I will have to look into verifying this on IE10 (my VM is not currently installed). That said, we have set up unit tests to verify our solution, and we have run them on IE10, where they passed: #130

Just a note, Mobify.js only supports IE10+

@shogun70

@jansepar I'm not familiar with Node.js, but I would assume output buffering on the server is preventing the write() in the unit test from really flushing to the client. When I made an equivalent test (using PHP sleep() to delay the last part of the response) I had to turn off gzip-encoding to get a proper flush.

@jansepar jansepar closed this