How SWWebView bridges between the webview and native

WKWebView brings a lot of performance improvements over UIWebView, but at one very important cost: it runs in a separate process. That makes implementing SWWebView more of a complicated mess than it might otherwise be (but the performance is still worth it).

Warning: here be caveats

Intercepting webview requests

One of the core components of a Service Worker is the ability to send custom responses back to the webview - cached files, or even manually constructed pages. Luckily, iOS 11 introduced a new addition to WKWebView called WKURLSchemeHandler that lets you intercept all requests sent to a custom protocol (URL scheme) you specify. Unfortunately, you can't specify HTTPS as that protocol because it's reserved, so you have to use a different one. We use sw://. When a request is received to load a URL that falls under a service worker-enabled domain, SWWebView cancels that load request and immediately fires another request on the sw:// protocol.
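As a rough illustration of that redirect, the sketch below maps an https:// URL onto the sw:// protocol and back again. The function names and the one-to-one mapping are assumptions made for illustration; how SWWebView actually encodes the original URL isn't spelled out here.

```javascript
// Hypothetical helpers: swap the scheme while leaving host, path and query untouched.
// String slicing is used because the URL API refuses to change a special scheme
// such as https: into a custom one through the `protocol` setter.
const SW_SCHEME = "sw:";

function toServiceWorkerURL(href) {
  const url = new URL(href); // throws on malformed input
  if (url.protocol !== "https:") {
    throw new Error(`expected an https: URL, got ${url.protocol}`);
  }
  return SW_SCHEME + url.href.slice(url.protocol.length);
}

function toOriginalURL(href) {
  const url = new URL(href);
  if (url.protocol !== SW_SCHEME) {
    throw new Error(`expected an ${SW_SCHEME} URL, got ${url.protocol}`);
  }
  return "https:" + url.href.slice(url.protocol.length);
}
```

The reverse mapping is what the native side would need in order to fetch or look up the real resource once the sw:// request arrives.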

Caveat: if your page tries to load a resource through an absolute URL that includes the protocol, SWWebView will not intercept it. It can only rewrite page requests, not script resources. So, if you have something on a different domain, use a protocol-relative URL, like so:

<script src="//example.com/test.js"></script>
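The reason this works follows from standard URL resolution: a protocol-relative reference inherits the protocol of the page that contains it, so on a page loaded over sw:// it stays on sw://, while an absolute https:// URL does not. A small sketch, with made-up host names and a hypothetical helper:

```javascript
// Hypothetical page URL, as the webview would see it after the redirect to sw://.
const pageURL = "sw://app.example.org/index.html";

// Resolve a script's src attribute relative to the page, as URL resolution does.
function resolveScriptSrc(src, base = pageURL) {
  return new URL(src, base).href;
}

const protocolRelative = resolveScriptSrc("//example.com/test.js");
const absolute = resolveScriptSrc("https://example.com/test.js");

console.log(protocolRelative); // protocol inherited from the page
console.log(absolute);         // protocol fixed by the author, so it bypasses sw://
```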

Injecting the API into pages

WKWebView allows you to inject JavaScript into a page via WKUserScript. It contains the option to inject it at document start, so by the time any of the remote JS loads, your injected script has already run. This allows us to define the serviceWorker property of navigator and leave client code none the wiser that it isn't using a "real" API.
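A minimal sketch of what such an injected script might do. The function name and the stand-in objects are hypothetical, and the real injected API is certainly more involved; the point is only that a getter defined at document start makes feature detection in page code succeed.

```javascript
// Hypothetical shim installer. `target` stands in for window.navigator and
// `container` for the injected ServiceWorkerContainer replacement.
function installServiceWorkerShim(target, container) {
  Object.defineProperty(target, "serviceWorker", {
    configurable: false, // page scripts cannot redefine or delete it
    enumerable: true,
    get: () => container,
  });
}

// Stand-in objects so the sketch can run outside a webview.
const fakeNavigator = {};
const fakeContainer = {
  register: (scriptURL) => Promise.resolve({ scriptURL }),
};

installServiceWorkerShim(fakeNavigator, fakeContainer);

// Typical page-side feature detection now passes.
console.log("serviceWorker" in fakeNavigator);
```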

Getting commands out of WKWebView

WKWebView also has WKScriptMessageHandler to facilitate making calls from JavaScript code out into your native environment. It installs a new object at window.webkit.messageHandlers["YOUR_HANDLER_NAME"], which has a function named postMessage allowing you to do, well, exactly that. You can only send JSON-style values (arrays, plain objects, numbers, strings and the like), but it's more than enough to send a structured request to your native code. There's one major problem, though: you can't send anything back through the message handler.
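A sketch of what a structured request could look like from the page side. The handler name "SWWebView" and the { command, args } message shape are assumptions for illustration, not the names the project necessarily uses; the `root` parameter exists only so the code can be exercised outside a webview.

```javascript
// Hypothetical handler name; the real one is whatever the native side registered.
const HANDLER_NAME = "SWWebView";

// Look up the native handler on `root` (window in a real page).
function getHandler(root = globalThis) {
  const handler = root.webkit?.messageHandlers?.[HANDLER_NAME];
  if (!handler) throw new Error(`no message handler named ${HANDLER_NAME}`);
  return handler;
}

// Send a structured, JSON-compatible command to native code. Nothing comes back.
function sendCommand(command, args, root = globalThis) {
  // Round-trip through JSON to strip functions and other values that cannot cross.
  const message = JSON.parse(JSON.stringify({ command, args }));
  getHandler(root).postMessage(message);
}
```

Note that sendCommand has no return value to speak of: whatever the native code wants to say in reply has to travel by some other route.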

It starts getting messy when you want to reply

At first look, there's an easy answer here - WKWebView, much like UIWebView, has an evaluateJavaScript() function, which allows you to run arbitrary JavaScript in your webview. But there is one glaring issue: you can only send it into the top-most frame. For many applications that might not be a problem, but any app that uses <iframe>s to embed content (or ads!) will find that the Service Worker API isn't available there. Or rather, it is there, and doesn't respond to anything.

So, what I've settled on (for now at least) is something very different. The JS API (that gets injected into the page, outlined above) immediately opens a request (on the sw:// protocol, so handled locally) as an EventSource. All command responses, as well as property changes (like when a Service Worker's status changes), are sent down this event stream and picked up by the JS code. That request also serves as a "hello" to the native code, telling it that a new ServiceWorkerContainer needs to be made to serve that URL.
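One way to pair outgoing commands with replies arriving on the event stream is to tag each command with an id and keep the pending promises in a map. The sketch below assumes a reply format of { id, result } or { id, error } encoded as JSON; that format, and the createBridge name, are illustrative assumptions rather than the wire format SWWebView actually uses.

```javascript
// Hypothetical bridge: `post` sends a message to native code, `stream` is an
// EventSource-like object that delivers native replies as "message" events.
function createBridge(post, stream) {
  let nextId = 0;
  const pending = new Map(); // request id -> { resolve, reject }

  stream.addEventListener("message", (event) => {
    const reply = JSON.parse(event.data);
    const waiter = pending.get(reply.id);
    if (!waiter) return; // not a reply to one of our commands
    pending.delete(reply.id);
    if (reply.error) waiter.reject(new Error(reply.error));
    else waiter.resolve(reply.result);
  });

  // Returns a promise that settles when the matching reply arrives on the stream.
  return function call(command, args) {
    const id = nextId++;
    return new Promise((resolve, reject) => {
      pending.set(id, { resolve, reject });
      post({ id, command, args });
    });
  };
}
```

In a page, `post` would wrap the message handler's postMessage and `stream` would be the EventSource opened on the sw:// protocol. Because both are passed in, the same code can be driven by fakes in a test.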

Caveat: that EventSource request does not load immediately. It does load nearly immediately, but there is a small period of time where the Service Worker API is not yet initialised in the webview. The good news is that 99% of the API is promise-based, so we can just wait for the stream to initialise as part of the promise resolution. The one exception is navigator.serviceWorker.controller, which is not asynchronous. If you try to use it too early, it'll throw an error. Just use navigator.serviceWorker.ready first.
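In practice that can be as small as the helper below. The getController name is made up for this example, and `container` stands in for navigator.serviceWorker so the sketch can run outside a webview.

```javascript
// Read `controller` only after `ready` has resolved, so the synchronous
// property is never touched before the API has finished initialising.
async function getController(container) {
  await container.ready;
  return container.controller; // may still be null if no worker controls the page
}
```

In a page this would be called as getController(navigator.serviceWorker).then(...), instead of reading navigator.serviceWorker.controller directly at startup.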

Is all of this secure?

It certainly doesn't feel like it. The good news is that the JS API we inject isn't doing any security work - it's just taking arguments, wrapping them up as a promise and sending them over the wall. So someone nefarious wouldn't be able to, say, register a worker on a different domain. They would be able to register one on your domain, but so would nefarious JS code in any browser.

The EventSource request gives me some pause, though. You could easily spin up dozens of event stream feeds if you know the URL (which is hard-coded). I'm not entirely sure what benefit that gives you, but if you did it enough times it would certainly be a good way to crash the app.
