This repository has been archived by the owner on Jan 17, 2019. It is now read-only.

Notes on Web and Service Workers

Ivan Herman edited this page Jun 30, 2015 · 4 revisions

These notes just jot down my (very limited) understanding of Web Workers and Service Workers and how they would/could influence the future of EPUB-WEB. Very drafty; any comments/updates are welcome!

Note that this is also relevant to issue #29.

Web Workers

A Web Worker, from the editor's draft:

defines an API that allows Web application authors to spawn background workers running scripts in parallel to their main page. This allows for thread-like operation with message-passing as the coordination mechanism.

(See, e.g., the blog of Eric Bidelman, the Wikipedia page, the Mozilla developers' network page, or the latest editor's draft for further details.)

The important point is that JavaScript code in a Web Worker runs in a thread or process concurrent with the main JavaScript thread of a Web page, instead of being scheduled by that page's event loop (which, though it provides asynchronous access to resources, always runs in a single thread). This means that a page can spawn a Web Worker to perform a potentially costly operation without significantly affecting the "main" thread of the page. Workers and the main thread of the page communicate with one another using a simple message-passing protocol; a worker is (as usual in JavaScript land) usually event-driven: it waits for messages to arrive and reacts to them.
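To make the message-passing model concrete, here is a minimal sketch of an event-driven worker. The message shapes (`CountJob`, `CountResult`), the word-count stand-in for the "costly operation", and the `WorkerScopeLike` interface are all assumptions made for illustration; only `onmessage`, `postMessage`, and `new Worker(...)` come from the Web Worker API itself.

```typescript
// Hypothetical message shapes exchanged between the page and the worker.
interface CountJob { id: number; text: string; }
interface CountResult { id: number; wordCount: number; }

// Stand-in for a potentially costly operation.
function countWords(job: CountJob): CountResult {
  const words = job.text.split(/\s+/).filter((w) => w.length > 0);
  return { id: job.id, wordCount: words.length };
}

// The small part of a worker's global scope that this sketch relies on.
interface WorkerScopeLike {
  onmessage: ((event: { data: CountJob }) => void) | null;
  postMessage(message: CountResult): void;
}

// Event-driven worker: wait for a message, react, post the answer back.
function installCounter(scope: WorkerScopeLike): void {
  scope.onmessage = (event) => {
    scope.postMessage(countWords(event.data));
  };
}

// Inside a real worker script (say, a hypothetical counter.js):
//   installCounter(self);
// On the page:
//   const worker = new Worker("counter.js");
//   worker.onmessage = (event) => console.log(event.data.wordCount);
//   worker.postMessage({ id: 1, text: "some long text" });
```

Passing the scope in as a parameter is only a convenience for trying the handler outside a browser; a real worker would use its global `self` directly.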

A Web Worker has access to operations like Web Sockets, HTTP access, or file system access, but it does not have access to the originating page's DOM. For EPUB-WEB this means that a Web Worker can be used to unpack a whole package in the background, possibly unpacking or uncompressing its constituent parts (note that a Web Worker may also spawn other Web Workers).

Service Workers

A Service Worker is a special type of Web Worker, but with additional features:

  • A Service Worker is a programmable network proxy: it lets the application control how network requests from its pages are handled. In particular, its specification includes an interface for managing a local cache of networked data.
  • A Service Worker is registered in a browser; this means the registration persists even if the user moves away from the main page, and the worker can serve that page again when the user returns to it. (The browser may stop and restart the worker itself in between.)
  • At the time of registration the caller hands over a "scope", a URL prefix that determines which pages the worker is in control of; the worker can keep the content of those pages in its cache. This also means that once the content is in the cache, the browser can operate with the same content regardless of whether it is on-line or off-line.
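The interplay of scope and cache can be sketched as a small synchronous model. This is not the Service Worker API itself, which is asynchronous and only available in a browser; `BookCache`, `isControlled`, and `answerRequest` are hypothetical names, and the scope is treated as a plain URL prefix. The comments at the end show the corresponding browser calls.

```typescript
// Hypothetical in-memory stand-in for the worker's cache: URL -> content.
type BookCache = Map<string, string>;

// A worker controls the pages whose URL starts with its scope.
function isControlled(scopePrefix: string, pageUrl: string): boolean {
  return pageUrl.startsWith(scopePrefix);
}

type Answer =
  | { from: "cache"; body: string }
  | { from: "network" }
  | { from: "unavailable" };

// Cache-first: answer from the cache when possible, and go to the
// network only when the content is missing and the browser is on-line.
function answerRequest(url: string, cache: BookCache, online: boolean): Answer {
  const body = cache.get(url);
  if (body !== undefined) return { from: "cache", body };
  return online ? { from: "network" } : { from: "unavailable" };
}

// Browser counterparts (asynchronous, not executed here):
//   page:   navigator.serviceWorker.register("/sw.js", { scope: "/book/" });
//   worker: self.addEventListener("fetch", (event) => event.respondWith(
//             caches.match(event.request).then((hit) => hit || fetch(event.request))));
```

With cached content the answer is the same whether `online` is true or false, which is the property the last bullet relies on.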

There are further features, but those are not directly relevant to the model related to EPUB-WEB.

(See, e.g., the blog of Matt Gaunt, the Mozilla developers' network page, or the editor's draft.)

It is conceivable that a Service Worker, combined with the OCF File System Container (which may be part of EPUB3.1) or something similar, could play the role of an ebook reading system without any kind of packaging (although some files may be compressed). The Service Worker's "scope" would be the ebook content (i.e., the whole book would be loaded into the cache); the main "reading" thread would handle the content as any other Web page, regardless of whether the page would have to be downloaded or not. In other words, the off-line vs. on-line distinction can be handled by a Service Worker.

See also Brady's email on that.

Note that, in that package-less approach, the manifest file is still important. It would determine, for example, what should be in the scope and what should not…
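As a rough illustration of how a manifest could drive what ends up in the cache, the sketch below turns a list of manifest entries into the list of URLs a Service Worker might precache. The `ManifestItem` shape is a simplified, hypothetical stand-in for a real manifest entry, and leaving out remote resources is an assumption of this sketch, not something the EPUB documents prescribe.

```typescript
// Hypothetical, simplified manifest entry; a real manifest item carries more fields.
interface ManifestItem { href: string; mediaType: string; }

// Build the list of URLs under the scope that should be cached for off-line use.
function urlsToCache(scopePrefix: string, items: ManifestItem[]): string[] {
  const urls: string[] = [];
  for (const item of items) {
    // Assumption: an href with a scheme (such as "https:") is remote and left out.
    if (/^[a-z][a-z0-9+.-]*:/i.test(item.href)) continue;
    // Assumption: all other hrefs are simple paths relative to the scope.
    const url = scopePrefix + item.href.replace(/^\.\//, "");
    // Skip duplicates so each resource is listed once.
    if (urls.indexOf(url) === -1) urls.push(url);
  }
  return urls;
}
```

In a browser, the resulting list is the kind of argument one would hand to `cache.addAll(...)` while the worker is being installed.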

Both Web Workers and Service Workers are at the bleeding edge of current Web development, and they are still rough around the edges. I have no idea how long it will take for them to become stable enough to rely upon for something like EPUB-WEB…

Random comments & Notes

Service Workers have additional security concerns.

  • Service Workers, if my understanding is correct, must be served over HTTPS (and not HTTP).
  • There is a same-origin restriction. The main JavaScript running in a page and the (registered) Service Workers should have the same origin. This may have some consequences for how a full reading system is deployed (if it is deployed, at the start at least, through some JavaScript add-ons).
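The two restrictions above can be expressed as a small pre-flight check. This is only a sketch: `registrationProblems` is a hypothetical helper, the example URLs are made up, and the browser's actual rule (a "secure context") is more detailed than the HTTPS-or-localhost test used here.

```typescript
// Returns the reasons why a Service Worker registration would be refused,
// or an empty list if neither of the two restrictions is violated.
function registrationProblems(pageUrl: string, workerScript: string): string[] {
  const page = new URL(pageUrl);
  // The worker script URL is resolved relative to the page, as a browser would do.
  const script = new URL(workerScript, pageUrl);
  const problems: string[] = [];

  // Simplification: browsers also accept localhost during development.
  const isLocal = page.hostname === "localhost" || page.hostname === "127.0.0.1";
  if (page.protocol !== "https:" && !isLocal) {
    problems.push("page is not served over HTTPS");
  }

  // Same-origin restriction: scheme, host, and port must all match.
  if (script.origin !== page.origin) {
    problems.push("worker script is on a different origin");
  }
  return problems;
}
```

For a reading system delivered as a script from another server, the second check is the one that matters: the worker script itself has to be served from the origin of the book's pages.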