This repository has been archived by the owner on Apr 5, 2022. It is now read-only.

Architecture

aeisenberg edited this page Oct 3, 2012 · 13 revisions

There are two components to Scripted. Can you guess what they are? Yes, the client and the server.

The Server

The server is a node application. It is small and straightforward. It is responsible for serving the client side code to the browser and answering requests from the client:

  • give me the contents of a file
  • save these contents into this file
  • can you tell me what files/directories are on the disk
  • can you find me a file that has this regex in the name
  • can you tell me the dependencies of this JavaScript file?
  • can you execute this command for me and send me the results

To achieve this it uses node modules like static, formidable, amdefine and htmlparser. For some of the more asynchronous behavior in dialogs like 'open file' it uses sockjs.
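The request kinds listed above can be pictured as a small dispatch table. The following is only a sketch under assumptions: the request shape (`kind`, `file`, `dir`, `regex`, `contents`) and the handler names are hypothetical and are not Scripted's actual routes. It uses synchronous `fs` calls for brevity, where a real server would normally work asynchronously, and it leaves out dependency lookup and command execution.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical request kinds; the real Scripted server may name and shape these differently.
var handlers = {
  // "give me the contents of a file"
  get: function (req) {
    return fs.readFileSync(req.file, 'utf8');
  },
  // "save these contents into this file"
  put: function (req) {
    fs.writeFileSync(req.file, req.contents, 'utf8');
    return 'ok';
  },
  // "what files/directories are on the disk" (one directory level only)
  list: function (req) {
    return fs.readdirSync(req.dir).sort();
  },
  // "find me a file that has this regex in the name" (one directory level only)
  search: function (req) {
    var pattern = new RegExp(req.regex);
    return fs.readdirSync(req.dir).filter(function (name) {
      return pattern.test(path.basename(name));
    }).sort();
  }
};

// Looks up the handler for a request and runs it.
function handleRequest(req) {
  if (!Object.prototype.hasOwnProperty.call(handlers, req.kind)) {
    throw new Error('Unknown request kind: ' + req.kind);
  }
  return handlers[req.kind](req);
}
```

A call such as `handleRequest({ kind: 'get', file: 'some/file.js' })` would return that file's text. Keeping the handlers in one table makes it easy to see which operations the client can ask for.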

It would be relatively easy to switch it out for a server implemented in another technology, but node meets our needs right now and is very fast to start up. Right now the server is basically hardwired to run locally, so you cannot host it somewhere other than the machine running the browser without making code changes. One reason for this is the ability to execute commands: without a security mechanism in place, we want to limit command execution to just the local user.

The Client

The client is JavaScript/HTML/CSS code organized in an AMD module structure. It uses several libraries:

  • jquery: for DOM manipulation and querying
  • dojo: for some UI components (we would like to move away from dojo)
  • jsbeautify: for formatting in the editor
  • requirejs: to load the client side AMD modules
  • jslint: for code style checking in the editor
  • esprima: for parsing JavaScript code before performing type inferencing
  • doctrine: to parse JSDoc comments
  • qunit: for unit testing of Scripted
  • sockjs: for web sockets
  • Orion: for the editor component and some other internal libraries

Interesting client side features

Parser Recovery

In an editor the code is usually a work in progress - perhaps incomplete or just plain broken - and yet it is often necessary to parse it. If the user has just typed '.' and invoked content assist, the code is very unlikely to be well formed and yet a parse is required so that the context of where the user has invoked content assist can be determined and used to ensure appropriate proposals are suggested. For this reason a recoverable parser is required. In Scripted a modified esprima parser is used that can cope with bad syntax and still return a usable AST.
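To make the idea of recovery concrete, here is a toy parser for dotted expressions only. It is not the modified esprima that Scripted uses. The function name `parseMemberChain` and the `recovered` flag are assumptions made for this illustration. When a property name is missing after a '.', the parser records an error and inserts an empty placeholder identifier instead of giving up, so a usable tree still comes back.

```javascript
// Parses a dotted expression such as "foo.bar" into nested MemberExpression
// nodes. Only identifiers and dots are recognized; everything else is ignored.
function parseMemberChain(source) {
  var tokens = source.match(/[A-Za-z_$][\w$]*|\./g) || [];
  var errors = [];
  var pos = 0;

  function identifier() {
    var tok = tokens[pos];
    if (tok !== undefined && tok !== '.') {
      pos++;
      return { type: 'Identifier', name: tok };
    }
    // Recovery: consume nothing, note the problem, and synthesize an
    // empty identifier so the surrounding node can still be built.
    errors.push('Expected identifier at token ' + pos);
    return { type: 'Identifier', name: '', recovered: true };
  }

  var node = identifier();
  while (tokens[pos] === '.') {
    pos++; // consume the "."
    node = { type: 'MemberExpression', object: node, property: identifier() };
  }
  return { ast: node, errors: errors };
}

// The user has just typed "x." and asked for content assist.
var result = parseMemberChain('x.');
// result.ast is a MemberExpression whose object is x and whose property is
// the placeholder, which tells the editor to propose properties of x.
```

The placeholder node is what lets later stages find out where the caret sits in the tree even though the source is not valid JavaScript.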

Inferencing

JavaScript and other dynamic languages pose particular problems for tool developers since the semantics of a program are not known until runtime. If the goal is to provide typical editor features like semantically aware content assist, hovers, and navigation, extra work needs to be done to guess this information from the source.

In Scripted, we use a simple control flow analysis to determine the types of variables and what properties they are likely to have. For example, in the following snippet, we know that the type of x is Number and performing content assist at the | should propose all properties available on Numbers:

function doSumpin() {
  return { val : 9 };
}
var x = doSumpin().val;
x.|

Type inferencing is implemented by walking an abstract syntax tree (AST) representation of the source code. As each AST node is visited, it is decorated with its inferred type as well as any other relevant information about it (such as whether or not a function is a constructor, etc).
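A heavily simplified walker for the doSumpin snippet might look like the sketch below. It is an illustration, not Scripted's inferencer. The `inferredType` property, the `scope` object, and the type representation (a name such as 'Number', `{ props: ... }` for object literals, `{ returns: ... }` for functions) are all assumptions. The AST is written by hand in the ESTree-like shape that esprima produces, and only the node types needed for this example are handled.

```javascript
// Visits a node, decorates it with node.inferredType, and returns that type.
function infer(node, scope) {
  var type;
  switch (node.type) {
    case 'Program':
    case 'BlockStatement':
      node.body.forEach(function (stmt) { infer(stmt, scope); });
      break;
    case 'FunctionDeclaration':
      type = { returns: 'undefined' };
      scope.vars[node.id.name] = type;
      // The body gets a nested scope that can still see outer variables.
      infer(node.body, { vars: Object.create(scope.vars), fn: type });
      break;
    case 'ReturnStatement':
      scope.fn.returns = node.argument ? infer(node.argument, scope) : 'undefined';
      break;
    case 'VariableDeclaration':
      node.declarations.forEach(function (d) {
        scope.vars[d.id.name] = d.init ? infer(d.init, scope) : 'undefined';
      });
      break;
    case 'Literal':
      type = { number: 'Number', string: 'String', boolean: 'Boolean' }[typeof node.value] || 'Object';
      break;
    case 'ObjectExpression':
      type = { props: {} };
      node.properties.forEach(function (p) {
        type.props[p.key.name] = infer(p.value, scope);
      });
      break;
    case 'Identifier':
      type = scope.vars[node.name] || 'Object';
      break;
    case 'CallExpression':
      type = (infer(node.callee, scope) || {}).returns || 'Object';
      break;
    case 'MemberExpression':
      var objType = infer(node.object, scope);
      type = (objType && objType.props && objType.props[node.property.name]) || 'Object';
      break;
  }
  node.inferredType = type;
  return type;
}

// Hand-built AST for:
//   function doSumpin() { return { val : 9 }; }
//   var x = doSumpin().val;
var ast = { type: 'Program', body: [
  { type: 'FunctionDeclaration', id: { type: 'Identifier', name: 'doSumpin' }, params: [],
    body: { type: 'BlockStatement', body: [
      { type: 'ReturnStatement', argument: { type: 'ObjectExpression', properties: [
        { type: 'Property', key: { type: 'Identifier', name: 'val' },
          value: { type: 'Literal', value: 9 } } ] } } ] } },
  { type: 'VariableDeclaration', kind: 'var', declarations: [
    { type: 'VariableDeclarator', id: { type: 'Identifier', name: 'x' },
      init: { type: 'MemberExpression', computed: false,
        object: { type: 'CallExpression', arguments: [],
          callee: { type: 'Identifier', name: 'doSumpin' } },
        property: { type: 'Identifier', name: 'val' } } } ] }
] };

var globalScope = { vars: Object.create(null), fn: null };
infer(ast, globalScope);
// globalScope.vars.x is now 'Number'
```

Because each statement is visited once, in source order, the walker only knows what earlier statements have already told it.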

This sort of type inferencing is relatively fast because it is single-pass. And the inferencing must be fast since it must happen in user-time on page load, on hovers, and before the content assist pane appears. However, there are some ramifications of this style of inferencing.

First, we can never be certain about what properties are available on any given object; we can only make guesses. No matter how sophisticated a static inferencer is, there will always be runtime properties that are invisible to it.

Next, using a variable or property before it has been initialized means that the inferencer cannot use information about it, even if the code works at runtime. For example:

var x;
function getX() {
  return x;
}
x = 0;

Since the inferencer runs in lexical order, hovering over the x in the return statement will show that x is undefined, even though later on the value is set to type Number.

Finally, parameter types cannot be inferred. For example:

function myFun(num) {  return num+1; }
myFun(7);

Scripted has no information about the expected type of num in this case, which can be fairly frustrating if the type is obvious to the programmer.

JSDoc support

To partially solve the lack of parameter typing, Scripted recognizes JSDoc-style comments. All JSDoc comments are parsed and attached to the appropriate function and variable declarations, and this information feeds into the inferencer.

There are many flavors of JSDoc, and we have chosen to use the one described by the Google Closure Compiler since it is the most sound and complete. See the Google Closure Compiler documentation for a complete description of how to specify the comments. All JSDoc comments are parsed by doctrine.
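As an illustration, the myFun example from the Inferencing section could be annotated in the Closure Compiler style as follows. The annotation syntax is the standard `@param` / `@return` form. Exactly what Scripted displays for it is not shown here.

```javascript
/**
 * Adds one to its argument.
 * @param {number} num the value to increment
 * @return {number} the incremented value
 */
function myFun(num) {
  return num + 1;
}

myFun(7);
```

With the comment in place, a tool no longer has to guess the type of num from its call sites.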

AMD/CommonJS module awareness

One of the more powerful features of Scripted is its awareness of JavaScript modules. This means that editing features like content assist can leverage information across multiple files. Scripted understands AMD style modules (used by requirejs) and CommonJS modules (used by node). We may add support for other kinds of modules later.

After loading a JavaScript file, the Scripted client asks the server for all transitive dependencies of that file. Scripted summarizes each transitive dependency and stores the summary in the browser's local storage. A file summary consists of all exported properties, their types, and any type reachable through any of the exported properties. Dependencies are re-summarized on file save.

Here is a simple example:

  • foo.js:
define('foo', [], { val : 9 });
  • bar.js:
define('bar', ['foo'], function(foo) {  return foo;  });
  • baz.js:
define('baz', ['bar'], function(bar) {  console.log(bar.val);  });

Hover over val in baz.js and you will see that val is resolved to Number. Press F8 and the caret jumps to the definition in foo.js.
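The "transitive dependencies" of baz.js in this example are bar and foo. A minimal sketch of how such a list could be computed from direct dependencies is shown below. The `graph` object and the function name are assumptions made for illustration, and the real server derives dependencies by analyzing the files themselves.

```javascript
// Direct dependencies of each module in the example (hypothetical shape).
var graph = { foo: [], bar: ['foo'], baz: ['bar'] };

// Returns every module reachable from root, excluding root itself.
function transitiveDependencies(graph, root) {
  var seen = Object.create(null);
  var order = [];
  seen[root] = true; // guards against cycles that lead back to the root

  (function visit(name) {
    (graph[name] || []).forEach(function (dep) {
      if (!seen[dep]) {
        seen[dep] = true;
        order.push(dep);
        visit(dep);
      }
    });
  })(root);

  return order;
}

var deps = transitiveDependencies(graph, 'baz'); // ['bar', 'foo']
```

Each module in the resulting list is one that would need a summary before val in baz.js can be resolved.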

Root files of a project may have many dependencies (our scriptedSetup.js file has over 100), and these can take a significant amount of time to parse fully. For this reason, all summarizing occurs inside a web worker on a background thread, so the work is not noticeable to the user. If a user invokes content assist or some other operation that requires inferencing before summarizing is complete, partial results will still be available.

There are many ways to improve the performance of summarizing. For example, on the client side, we could do more sophisticated caching of summaries and skip re-summarizing when neither the file nor its dependencies have changed since the last summary. So far, though, we are finding the performance perfectly fine without this enhancement.
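The caching idea could be sketched roughly as below. This is a hypothetical design, not something Scripted implements. `SummaryCache`, the `summarize` callback, and the `fingerprint` string are all assumed names. The fingerprint stands for any value the caller computes that changes whenever the file or one of its dependencies changes, for example their modification times joined together.

```javascript
// Hypothetical cache that re-summarizes only when the fingerprint changes.
function SummaryCache(summarize) {
  this.summarize = summarize;          // function(file) -> summary
  this.entries = Object.create(null);  // file -> { fingerprint, summary }
}

SummaryCache.prototype.get = function (file, fingerprint) {
  var entry = this.entries[file];
  if (entry && entry.fingerprint === fingerprint) {
    return entry.summary; // nothing changed, reuse the stored summary
  }
  var summary = this.summarize(file);
  this.entries[file] = { fingerprint: fingerprint, summary: summary };
  return summary;
};

// Usage with a stand-in summarizer that counts how often it runs.
var runs = 0;
var cache = new SummaryCache(function (file) {
  runs++;
  return { file: file, exports: {} };
});
cache.get('foo.js', 'v1'); // summarizes
cache.get('foo.js', 'v1'); // served from the cache
cache.get('foo.js', 'v2'); // fingerprint changed, summarizes again
```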