Design
This document details the design of the WAMS project. It covers fundamental architectural design decisions, the purpose of each of the modules, and briefly discusses each of the classes. For more in-depth information about any particular class, method, or chunk of code, see the documentation.
This project has four runtime dependencies, as listed under the "dependencies" tag in package.json:
- The canvas-sequencer package allows end users to define custom rendering sequences for the canvas. These sequences can be transmitted over the network and executed on the client without having to use the eval() function and all its attendant issues. The downside to this approach is that the sequences must be declarative: there is no way to retrieve the return value from a call to a canvas context method or to conditionally execute parts of the sequence. This package was written and published as part of this project.
- The westures package is a gesture library that provides a normalization of interaction across browsers and devices, and makes the following kinds of gestures possible:
- Tap
- Pan
- Pinch
- Rotate
- Swipe
- Swivel
It also provides tracking abilities (i.e. simple updates of input state at every change) and gesture customization options, including the ability to plug in entire custom gestures. This ability was used to package the Pan, Pinch, and Rotate gestures into a single Transform gesture, so that all three updates can be transmitted over the network simultaneously. This reduces the volume of traffic and eliminates the render jitter that was caused by the updates to these three gestures being split across render frames.
This package was written and published as part of this project. Note that it is a fork of ZingTouch. ZingTouch and other existing gesture recognition libraries for JavaScript were found to be insufficient for the demands of this project, hence the creation of this package.
- The express package provides a simple way of establishing routes for the node.js server. The express router used by a WAMS app can be exposed to the end user, allowing them to define custom routes. Using a popular library for this feature enhances the ease of use for end users.
- The socket.io package is used on both client and server side behind the scenes to maintain an open, real-time connection between the server and client. Each client is on its own socket connection, but all clients share a namespace. Therefore messages are emitted as follows:
- To a single client: on the client's socket.
- To everyone except a single client: on the client's socket's broadcast channel.
- To everyone: on the namespace.
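The three emission targets above can be sketched as follows. This is a minimal illustration rather than the project's code: `emitTo` and its `audience` argument are hypothetical names, while `socket.emit`, `socket.broadcast.emit`, and `namespace.emit` are the standard socket.io server-side calls.

```javascript
// Hypothetical helper (not part of WAMS): routes a message to one of the
// three audiences. `socket` is a connected socket.io socket on the server
// and `namespace` is the namespace shared by all clients.
function emitTo(audience, socket, namespace, event, payload) {
  switch (audience) {
    case 'client':
      // Only the client that owns this socket receives the message.
      socket.emit(event, payload);
      break;
    case 'others':
      // Everyone in the namespace except the owner of this socket.
      socket.broadcast.emit(event, payload);
      break;
    case 'everyone':
      // Every client connected to the namespace.
      namespace.emit(event, payload);
      break;
    default:
      throw new Error(`Unknown audience: ${audience}`);
  }
}
```

Because the helper only relies on objects that have an `emit` function, it can be exercised with simple stand-ins as well as real sockets.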
The build tools are mostly listed under devDependencies in package.json. The key exception is make, which is used for running tasks.
The tools used and their rationale are as follows:
- The arkit package builds dependency graphs out of JavaScript source code. These graphs are not full UML, but rather simply show which files are connected via explicit require() statements. Although somewhat limited, this is still very useful, and helps a great deal in terms of keeping the code organized. All the architecture graphs present in this design document were generated using arkit.
- The babel suite of packages provides transpilation, allowing the use of convenient new features of the JavaScript language without breaking browser support. Note however that only relatively recent browsers are supported. Babel is used via the babelify transform for browserify.
- The browserify package bundles JavaScript code together for delivery to clients. It may not be the most feature-rich bundler, but the basic functionality simply works, and works well.
- eslint is akin to an early-warning system. It parses the code and checks for style and formatting errors, so that these can be fixed before trying to run the code. It works very well and is fully customizable in terms of which of its style and format rules to apply. It can also fix some simple style errors on its own.
- jest is a testing framework for JavaScript. The tests are all written to be run using the command npx jest. (npx is a command included when node.js is installed. It runs scripts and/or binaries from project dependencies in the node_modules directory.)
- jsdoc generates documentation from internal comments, akin to javadocs.
- terser is a JavaScript code minifier that supports newer JavaScript syntax. This allows the size of the source code bundle sent when a client connects to be shrunk.
- The tui-jsdoc-template package is a template for the HTML pages produced by jsdoc.
- make is wonderfully flexible, so here it is used as a simple task runner, at which it is quite adept. It also interfaces nicely with vim, even if the JavaScript build tools don't. Simply running make from the main directory of the project will run eslint, browserify, and jsdoc on the code, keeping everything up to date at once. See the Makefile to see the targets.
- The exuberant-ctags package should be available from standard Linux repositories with the common package managers (e.g. apt). It parses source code and generates a map of important names in the code (for example, functions and classes) to their definition location. It works excellently with vim and can make navigating the source code a lot easier.
- The ctags-patterns-for-javascript package provides the necessary plugins to enable 'exuberant-ctags' for JavaScript.
Testing is done with the jest framework. The test suites can be found in tests/.
To run all the tests:
npx jest
To test, for example, only the client-side code:
npx jest client
To test, for example, only the WorkSpace class from the server-side code:
npx jest WorkSpace
Extra configuration can be found and placed in the jest field of package.json. The tests are incomplete, owing to the rapid pace of development and refactors since January.
- Message / Reporter Protocol
- Unique Identification
- Model-View-Controller
- Mixin Pattern
- Smooth and Responsive Interaction
One of the early challenges encountered was to ensure that only the correct data is transferred over the network, and that when a message is received over the socket, it would have the expected data. The Message / Reporter protocol was developed to solve this issue, providing a funnel through which to pass all data.
The first step was to ensure that only the correct data gets transmitted. This is where the ReporterFactory comes in. By calling this factory with an object consisting of key-value pairs describing a set of core properties and their default values, the factory will return a class which extends the Reporter class. An instance object of this class has a method, report(), which returns an object consisting only of the core properties and their current values.
This ensures that even though arbitrary additional properties may exist on the object (either through further class extensions or direct additions by a user) only the core properties will be sent if whatever routine sends the data calls this report() method.
Also available on all Reporter classes is an assign(data) method. All properties immediately on the data object will be assigned to the Reporter instance, allowing arbitrary data to be stored. A deeper search is done for the core properties of the Reporter instance, checking the entire prototype chain of the data object. (For information on the prototype chain, see Kyle Simpson's book series, You Don't Know JavaScript.)
The second step was to create a Message class with a static list of acceptable message types. A Message is constructed with one of these message types and an instance of a Reporter. It can then emit a report() of the instance.
If this protocol is followed strictly, then only the critical pieces of data will get transmitted, and they will be associated with the expected Message type when they get there. This requires discipline: it is obviously still possible to directly emit messages over socket connections. Programmer discipline is also required to make sure that the Message type selected is appropriate for the occasion and the associated Reporter instance, though it may be possible to enforce this restriction if this proves difficult.
There is a work-around for cases where lots of different types of little pieces of data need to be transmitted. See the DataReporter class in the documentation.
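The two steps of the protocol can be sketched roughly as follows. This is an illustrative reconstruction from the description above, not the project's implementation: the message type names, the `emitWith` method name, and the constructor signatures are all assumptions.

```javascript
// Base class; the factory below generates subclasses of it.
class Reporter {}

// Sketch of a reporter factory. `coreProperties` maps the names of the
// core properties to their default values.
function ReporterFactory(coreProperties) {
  const KEYS = Object.freeze(Object.keys(coreProperties));

  return class extends Reporter {
    constructor(data = {}) {
      super();
      Object.assign(this, coreProperties); // start from the defaults
      this.assign(data);
    }

    // Own properties of `data` are stored as-is. Core properties are also
    // looked up through the prototype chain of `data` (via `in`).
    assign(data = {}) {
      Object.assign(this, data);
      KEYS.forEach((key) => {
        if (key in data) this[key] = data[key];
      });
    }

    // Return only the core properties and their current values.
    report() {
      const result = {};
      KEYS.forEach((key) => { result[key] = this[key]; });
      return result;
    }
  };
}

class Message {
  constructor(type, reporter) {
    if (!Message.TYPES.includes(type)) {
      throw new TypeError(`Invalid message type: ${type}`);
    }
    if (!(reporter instanceof Reporter)) {
      throw new TypeError('Message data must be a Reporter');
    }
    this.type = type;
    this.reporter = reporter;
  }

  // `emitter` is anything with an emit(type, data) function, e.g. a socket.
  // The method name `emitWith` is an assumption made for this sketch.
  emitWith(emitter) {
    emitter.emit(this.type, this.reporter.report());
  }
}

// Illustrative type names only; the real list lives in the Message class.
Message.TYPES = Object.freeze(['add-item', 'update-item']);
```

In this sketch, an `Item` class created with `ReporterFactory({ x: 0, y: 0 })` can hold any extra properties it likes, but `report()` (and therefore any `Message` built from it) will only ever contain `x` and `y`.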
In a large system like this, where it is important to keep track of and uniquely identify lots of different kinds of objects correctly on both the client and the server, it is very useful to centralize the identification technique. This is where the IdStamper class comes in. It provides a common structure by which unique IDs can be assigned and copied.
Note that uniqueness is generally on a per-class level. There is a mixin, Identifiable, which uses an IdStamper to provide unique IDs to any class which mixes it in. See the IdStamper class in the documentation for more information.
Both the client and the server implement their own version of the MVC pattern, ultimately operating together as a larger MVC pattern.
The client side version is the most straightforward, and looks a lot like simple classical MVC. The catch of course is that the 'ClientController' sends user events to the server, and only interacts with the model or view when it receives instructions from the server. The other catch is that, as the only thing objects in the model need to do is draw themselves, they each implement a draw() method for the ClientView to use.
The server side is more complicated. The most obvious reason for this is that the server is an API, so its users need to be able to attach their own controller code. The one big simplification is that there's no view, as nothing needs to be rendered.
The approach taken is to split the model in two. One of these is the WorkSpace, which holds all the actual objects in the model that will need to be rendered. Specifically, these are the objects which are explicitly spawned into the model by the programmer.
The other is the ServerViewGroup, which holds the server's representations of the clients' views (that is, what the clients can see). The programmer does not have control over spawning and removing the views; they are spawned when a user connects and removed when a user disconnects.
The ServerController instances are spawned and maintained by the Switchboard, and these controllers maintain a link to their associated ServerView and its physical Device.
One other wrinkle is that, in order to support multi-device gestures, the ServerViewGroup has a single GestureController which is responsible only for maintaining the state of active pointers and calculating whether gestures have occurred. Storing the gesture controller in the view group opens up the possibility of creating multiple groups of devices, with each group capable of recognizing its own multi-device gestures.
Taken together, the client and the server form a larger MVC pattern, with the client representing the view and part of the controller, and the server representing the other part of the controller as well as the model.
The complexity of the code, particularly on the server, would be significantly higher were it not for the mixin pattern. The short and simple version for those more accustomed to software engineering with Java is that a mixin is an interface whose methods are already implemented.
More precisely, a mixin "mixes" functionality into an already existing class to form a new subclass. This allows the programmer to bundle related pieces of functionality together into a mixin, and then attach those bundles to classes as they see fit.
This pattern fits neatly on top of the Message / Reporter protocol. This protocol requires that Views and Items and their related classes be distinct, yet functionally these two distinct types of classes ultimately need to perform a lot of similar actions. Mixins solve this problem beautifully, making the whole system more succinct and easier to maintain in the process.
A more in-depth discussion of mixins and the inspiration for the specific implementation approach used can be found here.
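As a rough illustration of the pattern, a mixin in JavaScript can be written as a function that takes a base class and returns a new subclass of it. The sketch below uses this common "subclass factory" form; the class names ending in `Sketch`, the `Movable` mixin, and the method names are hypothetical and are not taken from the project's code.

```javascript
// A mixin is a function from a base class to a new subclass of that class.
const Lockable = (Base) => class extends Base {
  constructor(...args) {
    super(...args);
    this.locked = false; // unlocked by default
  }
  lock() { this.locked = true; }
  unlock() { this.locked = false; }
};

// Hypothetical mixin for anything with 'x' and 'y' properties.
const Movable = (Base) => class extends Base {
  moveBy(dx, dy) {
    this.x += dx;
    this.y += dy;
  }
};

// Two distinct base classes, standing in for the shared Item and View.
class ItemSketch {
  constructor(x = 0, y = 0) { this.x = x; this.y = y; }
}
class ViewSketch {
  constructor(x = 0, y = 0) { this.x = x; this.y = y; }
}

// The same bundles of functionality attach to either hierarchy.
class ServerItemSketch extends Lockable(Movable(ItemSketch)) {}
class ServerViewSketch extends Movable(ViewSketch) {}
```

The two resulting classes remain distinct types (an instance of one is not an instance of the other), yet both share a single implementation of `moveBy`.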
In order to ensure a smooth and responsive experience for the user, several issues must be taken into consideration.
The overall amount of traffic over the network must be kept to a minimum. Similarly, the size of packets sent over the network must also be kept down. The Message / Reporter protocol handles most of this work, by stripping out all but the core properties of any object transmitted, but care must still be taken to ensure that these core properties really are only those that are needed at every update, or at least are small enough to be negligible.
Therefore properties such as attribute lists for HTML elements, which could contain huge strings representing entire webpages, should only be sent occasionally as a special event, and not included in the core properties. Additionally, the only data sent during an update should be the data pertaining to the specific object that has been updated. This is as opposed to sending a full state packet representing all objects in the model.
The approach taken to maintain consistency with the server is for every update to consist of state packets, rather than sending transformation commands. The client therefore simply copies the new state information for each updated object.
The three core transformations are translation, scale, and rotation. Each of these is likely to happen every time that a user moves any of the active pointers. If the updates for each are not bundled together into the next state packet, the user will experience a discomforting jitter effect. Solving this issue requires careful attention on both the client and the server.
For the client side, the updates from the gestures corresponding to all three transformations need to be bundled together into a single event before being sent to the server. This way, the server can perform all the necessary transformations before publishing any new state packets that arise.
For the server side, the question becomes when to publish updates for model objects. Programmers should be allowed to update model objects whenever they like (i.e. not necessarily in response to a user event), but still have confidence that their changes will be published. In either this case or the case of responding to user gestures, if transformations are split across publications the jitter issue will surface.
To solve this issue, the Publishable mixin is used for all model objects. It uses the node.js setImmediate() callback timer to schedule a single publication to occur once all code arising from the current event in the event loop has executed. Therefore all transformations responding to a user gesture or some other programmer defined event will get bundled together into a single update.
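The scheduling idea can be sketched as below. This is a simplified illustration under stated assumptions, not the project's Publishable mixin: the names `publish`, `emitPublication`, `PublishableSketch`, and `MovingItemSketch` are hypothetical. Only `setImmediate()` is the real node.js API.

```javascript
// Sketch of a mixin that coalesces many changes into one publication.
const PublishableSketch = (Base) => class extends Base {
  constructor(...args) {
    super(...args);
    this.publicationScheduled = false;
  }

  // Call after every change. Only the first call in the current turn of
  // the event loop schedules anything; later calls are no-ops.
  publish() {
    if (this.publicationScheduled) return;
    this.publicationScheduled = true;
    setImmediate(() => {
      this.publicationScheduled = false;
      this.emitPublication(); // supplied by the class using the mixin
    });
  }
};

class Plain {}

// Hypothetical model object: `send` stands in for emitting a state packet.
class MovingItemSketch extends PublishableSketch(Plain) {
  constructor(send) {
    super();
    this.x = 0;
    this.y = 0;
    this.rotation = 0;
    this.send = send;
  }
  moveBy(dx, dy) { this.x += dx; this.y += dy; this.publish(); }
  rotateBy(radians) { this.rotation += radians; this.publish(); }
  emitPublication() {
    this.send({ x: this.x, y: this.y, rotation: this.rotation });
  }
}
```

Calling `moveBy()` and `rotateBy()` several times in response to one event results in a single call to `send`, carrying the final state, once the current event's code has finished running.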
For server-side gestures, another issue along these lines arises. With user input events being sent to the server from each device at rates of up to 60 updates per second, it only takes a few devices for the update rate to regularly balloon into the hundreds. Therefore user input updates are bundled together. This is done by simply updating the input state in response to user input events, then only evaluating gestures at a rate of up to 60 times per second by using a callback interval that checks whether input updates have occurred since the last evaluation.
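A minimal sketch of this bundling is shown below, assuming a simple "dirty flag" checked on an interval. The class and method names are hypothetical, and the real GestureController is more involved than this.

```javascript
// Sketch: record input state immediately, evaluate gestures at most
// `ratePerSecond` times per second, and only if something changed.
class InputBundlerSketch {
  constructor(evaluate) {
    this.evaluate = evaluate; // called with the current input state
    this.inputs = new Map();  // pointer id -> latest input data
    this.dirty = false;       // have inputs changed since last evaluation?
    this.timer = null;
  }

  // Called for every incoming input event. Cheap: just records state.
  update(id, data) {
    this.inputs.set(id, data);
    this.dirty = true;
  }

  // Called on the interval. Returns true if an evaluation was performed.
  tick() {
    if (!this.dirty) return false;
    this.dirty = false;
    this.evaluate(this.inputs);
    return true;
  }

  start(ratePerSecond = 60) {
    this.timer = setInterval(() => this.tick(), 1000 / ratePerSecond);
  }

  stop() {
    clearInterval(this.timer);
    this.timer = null;
  }
}
```

However many input events arrive between two ticks, the gesture evaluation runs once, using the most recent state of each pointer.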
A subtle issue with modern touch interfaces is that contact points, and fingers in particular, typically aren't points but rather areas that are resolved down to points. These points tend to shift around relative to the area while a user is interacting with the surface, as the area itself fluctuates in shape and size. This can be due to slight adjustments in the distribution of pressure onto the contact surface, or else because humans are in constant motion, especially on the miniature scales measured by touch surfaces.
Although not immediately obvious, this effect can cause gestures to behave in a jumpy way, characterized by alternating relatively large and small updates, or else updates in alternating directions. This is perceived by the user as jitter or else as a less than smooth interaction experience. The solution applied by this project is to use a cascading average for the update values. Note that the implementation exists inside the westures gesture recognition library that was written as a part of this project.
This cascading average is defined, generally, by replacing each update with the average of the update and the cascade. The cascade is likewise updated to this average. The result is a practical application of Zeno's Dichotomy, as half of the remaining value is theoretically applied at each subsequent update until the user ends the gesture.
Practically, this means that the emitted values have some inertia and thus are significantly less prone to the jumpiness that is otherwise observed. Also, because the contribution of each update is halved at every subsequent update, it becomes negligible after a dozen or so updates (less than 0.03% of it remains), and the finite precision of floating point numbers eventually wipes out any remaining value entirely.
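The definition above translates almost directly into code. The sketch below is an illustration only: the class name is hypothetical, and starting the cascade at zero is an assumption (the implementation inside westures may initialize and reset it differently).

```javascript
// Sketch of a cascading average: each update is replaced by the average of
// the update and the cascade, and the cascade becomes that average.
class CascadingAverageSketch {
  constructor(initial = 0) {
    this.cascade = initial; // assumption: the cascade starts at zero
  }

  next(update) {
    this.cascade = (this.cascade + update) / 2;
    return this.cascade;
  }
}

// Jumpy input, alternating between large and small updates.
const smoother = new CascadingAverageSketch();
const smoothed = [10, 0, 10, 0].map((value) => smoother.next(value));
// smoothed is [5, 2.5, 6.25, 3.125]
```

The raw updates swing by 10 units each time, while the smoothed values swing by less than 4, which is the inertia described above.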
Note that this graph (and all that follow) merely shows explicit file associations via a require() statement (similar to an import or #include).
Also note that the above graph does not show the shared module, as it provides base classes and routines that are used throughout the code and would simply clutter the graph without revealing any structure.
All graphs were generated using arkit, as discussed in the build tools section above.
To coordinate activity between the client and server, a shared set of resources is exposed by shared.js.
Exported by the utilities module are a few quality-of-life functions intended to be used throughout the codebase. They are there to make writing other code easier, and to reduce repetition.
The IdStamper class controls ID generation. The class has access to a private generator function for IDs and exposes a pair of methods for stamping new IDs onto objects and cloning previously existing IDs onto objects.
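One way such a class could look is sketched below. The method names `stampNewId` and `cloneId`, the `id` property name, and the use of sequential integers are assumptions made for illustration. See the IdStamper documentation for the actual interface.

```javascript
// Private to this module: yields 1, 2, 3, ...
function* idGenerator() {
  let next = 1;
  while (true) yield next++;
}

// Define a read-only 'id' property that cannot be changed or redefined.
function defineId(obj, id) {
  Object.defineProperty(obj, 'id', {
    value: id,
    writable: false,
    enumerable: true,
    configurable: false,
  });
}

class IdStamperSketch {
  // Each stamper owns its own generator, so IDs are unique per stamper.
  #generator = idGenerator();

  // Stamp a freshly generated ID onto the object.
  stampNewId(obj) {
    defineId(obj, this.#generator.next().value);
  }

  // Copy an existing ID (e.g. one received from the server) onto the object.
  cloneId(obj, id) {
    defineId(obj, id);
  }
}
```

Giving each class its own stamper yields the per-class uniqueness mentioned in the core concepts section, while `cloneId` lets the client mirror the IDs assigned by the server.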
The ReporterFactory takes a dictionary of default values and returns a Reporter class definition.
Runtime definition of classes is possible due to the nature of JavaScript, wherein classes are really just functions that can be "constructed" using the keyword new. Therefore as functions can be treated like variables, so too can classes.
All the reporters provided by this module share a common interface, as they are all generated by the same class factory.
Specifically, each reporter exposes two methods for getting and setting a set of core properties: assign(data) and report().
As discussed in the core concepts section above, the motivation for this design was to provide some semblance of confidence about the data that will be passed between the client and server. With the core properties defined in a shared module, the chance of data being sent back and forth in a format that one end or the other does not recognize is greatly reduced. This was taken even further with the Message class, which uses this reporter interface for the data it transmits.
Crucially, the set of Reporters includes the common Item and View definitions that both client and server extend. Think of these common reporter definitions as pipes that both client and server must push their data through if they wish to share it.
The Message class takes the notion of reporters one step further by centralizing the method of data transmission between the client and server. It does this by explicitly requiring that any data objects it receives for transmission be reporters.
Messages can be transmitted by any object with an emit function.
JavaScript lacks a standard library, and no third party standalone module stood out. The Point2D class therefore provides the necessary two-dimensional point operations.
The Polygon2D class defines a two dimensional polygon class, capable of hit detection. Complex polygons are supported by the hit detection routine as well as simple polygons. A discussion of the algorithm used can be found here.
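For readers unfamiliar with polygonal hit detection, one standard technique that handles both simple and complex (self-intersecting) polygons is the winding number test, sketched below. This text does not state which algorithm Polygon2D actually uses, so treat this as an illustration of the general idea rather than a description of the project's routine. The function names are hypothetical.

```javascript
// Positive if `p` is left of the directed line a -> b, negative if right,
// zero if the three points are collinear.
function isLeft(a, b, p) {
  return (b.x - a.x) * (p.y - a.y) - (p.x - a.x) * (b.y - a.y);
}

// Counts how many times the polygon winds around the point.
// `vertices` is an array of { x, y } objects in order around the polygon.
function windingNumber(point, vertices) {
  let wn = 0;
  for (let i = 0; i < vertices.length; i += 1) {
    const a = vertices[i];
    const b = vertices[(i + 1) % vertices.length];
    if (a.y <= point.y) {
      // Upward crossing with the point strictly to the left of the edge.
      if (b.y > point.y && isLeft(a, b, point) > 0) wn += 1;
    } else if (b.y <= point.y && isLeft(a, b, point) < 0) {
      // Downward crossing with the point strictly to the right of the edge.
      wn -= 1;
    }
  }
  return wn;
}

// A point is inside the polygon if the winding number is non-zero.
function polygonContains(point, vertices) {
  return windingNumber(point, vertices) !== 0;
}
```

For example, with the square `[{x:0,y:0}, {x:4,y:0}, {x:4,y:4}, {x:0,y:4}]`, the point `{x:2,y:2}` is reported as inside and `{x:5,y:2}` as outside.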
The Rectangle class provides a two dimensional rectangle class with support for hit detection.
- ClientController
- ClientModel
- ClientView
- ShadowView
- ClientItem
- ClientImage
- ClientElement
- Interactor
- Transform
The ClientController is the bridge between client and server. To do this, it maintains the socket.io connection to the server. User interaction is forwarded to the server, while model updates from the server are forwarded to the model.
The ClientModel is a full copy of the server model, but with only the data necessary to render each object.
The ClientView class is responsible for holding onto the canvas context and running the principal draw() sequence. It also aligns the canvas context to reflect the transformed state of the client's view within the workspace.
The ShadowView class is a simple extension of the View class that is used for rendering the outlines of other views onto the canvas, along with a triangle marker indicating the orientation of the view. The triangle appears in what is the view's top left corner.
The ClientItem class is an extension of the Item class that is aware of and able to make use of the CanvasSequence class from the canvas-sequencer package for rendering custom sequences to the canvas. It is therefore intended for immediate mode renderable items that don't require additional data beyond the render sequence.
The ClientImage class enables loading and rendering of images.
The ClientElement class enables the use of HTMLElements as workspace objects. It generates the elements and attaches the provided attributes. Transformations are handled by CSS methods instead of canvas context transforms, as the elements are independent of the canvas.
The Interactor class provides a layer of abstraction between the controller and the interaction / gesture library being used. This should make it relatively easy, in the long run, to swap out interaction libraries if need be.
When a user interacts with the application in a way that is supported, the Interactor tells the ClientController the necessary details so the ClientController can forward those details to the server.
The Transform class bundles together the Pan, Pinch, and Rotate gestures so that all three updates will occur simultaneously, reducing jitter.
- ServerController
- GestureController
- SwitchBoard
- WorkSpace
- ServerViewGroup
- ServerView
- Device
- ServerItem
- ServerImage
- ServerElement
- MessageHandler
- Router
- Application
The ServerController class acts as a bridge between a client and the server. To do this it maintains a socket.io connection with a client. It keeps track of the ServerView corresponding to that client, as well as the ServerViewGroup to which it belongs, and its physical Device.
User interaction events are forwarded either directly to the MessageHandler or to the view group's GestureController, depending on whether server-side or the traditional client-side gestures are in use.
Outgoing messages will be handled directly by the view or by items, via their 'publish' mechanism. This mechanism ensures that updates will be sent to clients directly, without any special care required on the part of the programmer (save that they use the transformation methods provided, instead of modifying properties directly).
The GestureController class is in charge of processing server-side gestures for the purpose of enabling multi-device gestures. It accomplishes this by interfacing with the gestures module.
The SwitchBoard controls connection establishment, as well as disconnection. It hooks up all the necessary components when a client connects to a WAMS app.
The WorkSpace is the model for all items that are programmatically added or removed. That is, for ServerItems, ServerImages, and ServerElements.
The ServerViewGroup class is the model for ServerViews. It has an associated GestureController to enable server-side gestures. Transformations applied to a group are applied to each view in the group.
Mixins used by this class: Locker, Lockable, Transformable2D.
The ServerView represents a client's logical view within the workspace.
Mixins used by this class: Locker, Interactable.
The Device class represents a client's physical device. It is used for transforming input point coordinates when server-side gestures are in use.
Mixins used by this class: Transformable2D.
The ServerItem maintains the model of an Item. It allows for transformations and hit detection. Transformations are published automatically to the clients.
Mixins used by this class: Identifiable, Hittable.
The ServerImage is similar to the ServerItem class, but with methods and properties specific to images.
Mixins used by this class: Identifiable, Hittable.
The ServerElement class is similar to the ServerItem class, but with methods and properties specific to HTML elements.
Mixins used by this class: Identifiable, Hittable.
The MessageHandler is the interface between the WAMS system and the programmer. All recognized user interactions ultimately end up being transmitted to the MessageHandler, which will call the appropriate listener, if the programmer has attached one.
The Router provides a layer of abstraction between the server and the request handling library and its configuration.
The Application is the API endpoint of the WAMS system.
The Lockable mixin allows a class to enable itself to be locked and unlocked, with the default being unlocked.
The Locker mixin allows a class to obtain and release a lock on an item.
The Publishable mixin provides a basis for types that can be published. It ensures that publications will not be sent until all transformations relating to an event have been applied.
The Transformable2D mixin provides 2D transformation operations for classes with 'x', 'y', 'scale' and 'rotation' properties.
The Interactable mixin combines the Transformable2D, Lockable, and Publishable mixins to produce an object that can be interacted with by a WAMS application.
The Hittable mixin extends the Interactable mixin by allowing hit detection.
The Identifiable mixin labels each instantiated object with a unique, immutable ID. All classes that use this mixin will share the same pool of IDs.
A Binding associates a gesture with a handler function that will be called when the gesture is recognized.
An Input tracks a single input and contains information about the current and initial events. It also tracks the client from whom the input originates.
The PHASE object normalizes input events to the phases start, move, end, or cancel.
The PointerData class provides low-level storage of pointer data based on incoming data from an interaction event. Specifically, it stores the (x,y) coordinates of the pointer, the time of interaction, and the phase.
The Region class is the entry point into the gestures module. It maintains the list of active gestures and acts as a supervisor for all gesture processes.
The State class maintains the list of input points.
The items namespace is a collection of factories for predefined item types.
The layouts namespace is a collection of factories for predefined layout handlers.
The utilities namespace is an assortment of predefined helper functions.
When a user visits the IP address and port where the app is hosted, the following sequence of events occurs:
- HTML and client JavaScript code are delivered.
- When the page is loaded, the client's ClientModel, ClientView, and ClientController are instantiated and hooked up.
- The ClientController resizes the canvas to fill the client's browser window.
- The ClientController registers socket.io message listeners and other assorted non-gesture-related listeners for maintaining the system.
- The ClientController initiates the render loop.
- The ClientController attempts to establish a socket connection with the server.
- The Switchboard receives the 'connect' request. If the client limit has been reached, it rejects the connection. The user is informed of this rejection, and all functionality stops. Otherwise, it accepts the connection.
- When the connection is accepted, a ServerController is instantiated and slotted into the collection of active connections.
- The ServerController asks the ServerViewGroup to spawn a view for it, and spawns a Device to store the representation of the client's physical device.
- The ServerController attaches socket.io message listeners and issues a "full state report" to the client, detailing the current state of the model so that the client can render the model, as well as options specified by the programmer such as whether to use client or server-side gestures.
- The ClientController informs the ClientModel of this data and registers user event listeners, either in the form of an Interactor for client-side gestures or by directly forwarding input events for server-side gestures.
- The ClientController emits a layout message to the server, detailing the size of the view.
- The ServerController receives this message, and records the size of the view in the model.
- If a layout handler has been registered for the application, it is called for the new view.
- The view is updated with the new parameters from the layout, and all the other views are now informed of the view, adding it as a "shadow".
- The connection is now fully established, and normal operation proceeds.
Listed here are references to all external sources, be they code, books, algorithms, tutorials, or other articles.
- canvas-sequencer
- westures
- zingtouch
- express
- socket.io
- arkit
- babel
- browserify
- eslint
- jest
- jsdoc
- terser
- tui-jsdoc-template
- make
- exuberant-ctags
- ctags-patterns-for-javascript
- You Don't Know JavaScript
- Mixins
- Zeno's Dichotomy
- Polygonal Hit Detection
- Open Source Assets (Used for image assets in examples).