Skip to content
This repository has been archived by the owner on Oct 4, 2022. It is now read-only.

Implement a web worker that manages analyses. #1571

Closed
2 tasks
atimmer opened this issue Jul 2, 2018 · 5 comments
Closed
2 tasks

Implement a web worker that manages analyses. #1571

atimmer opened this issue Jul 2, 2018 · 5 comments
Labels

Comments

@atimmer
Copy link
Contributor

atimmer commented Jul 2, 2018

For Yoast/wordpress-seo#5393 we need to create an implementation of YoastSEO.js that works completely inside a web worker. This is a specification&roadmap about how it should work.

Why do we want to use web workers?

We want to keep the main thread free of any non-UI work. For a good primer on why we want this you can ready this great article on the subject of 120 fps. Our analysis is definitely non-UI work.

How to implement them?

Web workers are themselves fairly easy to implement. By reading the documentation about web workers on MDN you can understand how they work. I will recap this documentation shortly and highlight implementation details specific to us.

For our purposes I think we should have one 'simple' web worker. So everything about SharedWorker can be ignored. Shared workers are used if you want to share the same worker in multiple tabs/windows. I don't think we need that, at least for now.

In YoastSEO.js we want to implement code for both the complete web worker and code for the main thread that communicates with the web worker.

What do we need to change to make it fit.

A web worker is basically an extra piece of code that runs in the background that you can communicate with. To start a worker you need a standalone script. I've done a bit of test setup in Yoast SEO on the stories/web-worker-test branch. This contains a standalone script that can accept messages.

This means that web workers are by their nature asynchronous. You send them a message and a result is returned. Our architecture is not suited for this yet. So we need to fix that. Logic inside the worker can be synchronous, so that gives us some leeway.

The worker should contain all the logic to analyze any Paper object.

Message format

I would like the following general format

{
	type: "[type of the message]",
	payload: { /* [all-the-necessary-for-the-command */ },
}

This also makes organization natural and easy, for every message we can have a separate file in a .../worker folder.

Available commands

To the worker: initialize

The payload should contain all the configuration that is necessary for executing the analysis. For example whether the content and keyword analyses are active. If one of them isn't active that work is not necessary.

To the worker: analyze

This command is the main command. This will analyze a given Paper. The payload looks like this:

{
	id: /* An auto incrementing numeric ID. */
	paper: /* The Paper instance */,
	configuration: {
		useCornerstone: false, // Whether cornerstone analysis should be enabled.
	}
}

The configuration should be send with every message, because that way the analysis will always be correct for the given Paper + configuration.

To the worker: registerCustomScript

Other plugins need to be able to add custom research and assessments. Because we want all analysis to run in the worker, this can only be done by specifying the name of a script to import inside the worker. This script will be imported using importScripts. That script then has access to the self scope of the worker.

The self scope of the worker has these methods that can be used to register custom research and assessments:

/**
 * `research` should be a function. It will receive the paper as its first argument. The second argument is the `researcher`. That can be used to get at the other available research.
 */
registerCustomResearch( research )

/**
 * The `assessment` should be an object that has a `getResult` method. The `getResult` method will receive the paper as its first argument and the research as its second.
 */
registerCustomAssessment( assessment );

The payload to register a custom script looks like this:

{
	name: "[name-of-the-custom-script]",
	script: "[url-to-a-custom-script]",
	data: { /* Custom data available to the custom script */ }
}

The following methods will also be available to the custom script:

/**
 * Returns the custom data that has been passed when sending the message to register the custom script.
 */
getCustomData( name )

From the worker: registeredCustomScript

A simple message to let the code outside the worker know that the custom script has been registered and the all analyses including the given ID and any analysis afterwards will be run including the custom script.

{
	id: /* Numeric ID */
}

From the worker: results

This will be send as the result of an analyze command. The id should match the id passed in the analyze message. This way the receiving page can correlate the analysis with the result. It also makes it possible to discard results if the id does not correspond to the id of the last analyze request that was send to the worker.

The message looks like this:

{
	id: /* ID of the analyze request */,
	category: /* Either keyword or readability. */,
	results: [] // The analysis results.,
}

Temporary screenshot of the diagram

img_1885

Todos

  • Move pluggable to Yoast SEO. Pluggable is synchronous, so it cannot work in the worker context. Because all the things that plug-in to YoastSEO.js is not in the worker context.
  • Implement the messaging infrastructure.

Gotchas

Because workers only work on the same domain as the main page, you cannot use the web pack dev server. I've included a piece of code that works around this. This should not be used in production. But it is incredibly useful to be able to test workers locally.

function createWorker (workerUrl) {
	var worker = null;
	try {
		worker = new Worker(workerUrl);
	} catch (e) {
		try {
			var blob;
			try {
				blob = new Blob(["importScripts('" + workerUrl + "');"], { "type": 'application/javascript' });
			} catch (e1) {
				var blobBuilder = new (window.BlobBuilder || window.WebKitBlobBuilder || window.MozBlobBuilder)();
				blobBuilder.append("importScripts('" + workerUrl + "');");
				blob = blobBuilder.getBlob('application/javascript');
			}
			var url = window.URL || window.webkitURL;
			var blobUrl = url.createObjectURL(blob);
			worker = new Worker(blobUrl);
		} catch (e2) {
			//if it still fails, there is nothing much we can do
		}
	}
	return worker;
}
@atimmer atimmer added enhancement technical debt responsiveness innovation Innovative issue. Relating to performance, memory or data-flow. needs-spec labels Jul 2, 2018
@herregroen
Copy link
Contributor

In order to keep our code clean, easily understood and up to standards I think we need a WebWorker class to abstract this logic.

Outside this class no message should ever be send or received. This class should implement each of the commands listed here and return a Promise to them that resolves in the result. This class should be fully responsible for the ids passed along and these should never be exposed. An internal object to map these to promises which can be resolved when a results message is received would likely be the way to go.

An instance of this class can be exposed globally to allow other plugins to call it's methods.

In addition to that I think we should also look at supporting custom assessments and custom research using importScripts. Allowing other plugins to extract their researches into a separate file that's then imported into our worker so that it can also run asynchronously.

I also don't think we should support cancel. Implementing logic to keep track of cancellations and to stop analysis halfway could easily become a mess. ES6 Promises don't support it either so I don't see why we should.

@atimmer
Copy link
Contributor Author

atimmer commented Jul 3, 2018

Discussed during the architecture standup:

WebWorker 👍.

importScripts 👍.

cancel: Every analysis should run automatically. And the analyze message puts a new one in the queue. If there are 5 queued, once we are done with the current analysis we can drop 3 of them and do the most recent one.

@atimmer
Copy link
Contributor Author

atimmer commented Jul 9, 2018

A bit of thinking about the abstraction:

API around web worker

The code that implements YoastSEO.js should not have to know about the web worker at all. To implement this we need a wrapper around the worker. In browsers that don't support web workers it should all still work.

It should look a bit like this:

class AnalysisWorker {

	/**
	 * Analysis a paper. The results will be in the promise.
	 *
	 * This method/object should handle all the complexities of keeping track of the IDs and the responses from the worker. The code that is calling this method doesn't need to know about any of these details.
	 */
	analyze( Paper paper ): Promise {
		
	}
	
	/**
	 * Registers a custom script.
	 */
	registerCustomScript( name, script, data ) {
		
	}
}

@manuelaugustin
Copy link
Contributor

manuelaugustin commented Jul 10, 2018

Notes from the kick-off meeting:

  • Some questions that we need to think about (might already be fully or partly answered):
    • What happens when the paper changes?
    • What happens when the keyword changes?
  • The keywords should be in the config, because it's independent from the paper that needs to be analyzed.
  • When changing a keyword, there should be a special request that tells the web worker to use the same paper again to perform a new analysis.
  • One analysis request should send back the results for all keywords. There should be a way of prioritizing results, so that the keyword analysis for the currently active tab gets returned first.
  • The web worker should just return changed results.

img_20180710_095516480

@igorschoester igorschoester mentioned this issue Jul 23, 2018
28 tasks
@moorscode moorscode added innovation Innovative issue. Relating to performance, memory or data-flow. and removed innovation Innovative issue. Relating to performance, memory or data-flow. labels Feb 23, 2019
@atimmer
Copy link
Contributor Author

atimmer commented Oct 21, 2019

This has been implemented

@atimmer atimmer closed this as completed Oct 21, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants