Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Great Refactoring™️ #165

Merged
merged 202 commits into from
Jan 23, 2020
Merged

The Great Refactoring™️ #165

merged 202 commits into from
Jan 23, 2020

Conversation

kasperisager
Copy link
Contributor

@kasperisager kasperisager commented Dec 17, 2019

This pull request will chronicle the rather large refactor that I've been working on for the past two months. Many of the changes mentioned here will also warrant a separate ADR; one purpose of this pull request is identifying exactly which changes do so.

Let me start by addressing the most obvious question...

Why on earth?

More than two years ago, Alfa was born on a simple premise as outlined in ADR 2: Work with static data to provide reliable and reproducible accessibility audits. The way this was implemented in practice was by operating on a strict subset of DOM and CSSOM and to ensure that this subset was directly serialisable to JSON. By keeping the core data format simple to produce and pass around, we however had to introduce a bit of complexity on the consuming side:

  • As a JSON serialisable format meant no circular references, the DOM subset could not include parent pointers. This meant that every function that needed access to parents had to be passed a top level "context" node from which parent pointers could be inferred. Among other things, this lead to fairly complex traversal APIs as evidenced by Turn visitor functions into generators #147.

  • Many DOM and CSSOM properties are optional and can therefore be set to null and/or undefined. Per ADR 3, this was perfectly "safe" as the TypeScript compiler would ensure that we guarded against dereferencing something nullable, but it littered our code with nullable checks. Even worse, nullable values are so ingrained in the JavaScript ecosystem that the TypeScript compiler will actually not complain about excessive checking.

  • As the data format only contained the bare minimum needed to represent DOM and CSSOM, a lot of additional functionality had to be provided on top. For the most part this was fine, but we often ran into situations where the functionality would be rather expensive, such as constructing parent pointers, and therefore had to be implemented using a cache.

While these complexities initially amounted to only a small overhead, the overhead quickly increased as the code base grew. After working on #147, it became clear to me that something had to be done. The following sections will outline all the major changes that were made to reduce the overall complexity of Alfa to a point where I can once again consider its future development sustainable. While all of the above points will be addressed, I'll also go into detail about some other long-needed changes that made their way in.

Major changes

Parent pointers and JSON serialisable DOM

In order to provide both parent pointers and JSON serialisable nodes, DOM and CSSOM types are now implemented as classes that all implement the toJSON() method. This method returns a canonical JSON representation of nodes that can also be used to translate back into concrete nodes.

This also means that Alfa no longer works on a strict subset of DOM and CSSOM, which has afforded us the possibility to fix some of the "quirks" in its design. For example, many properties now have shorter names and are implemented as methods, such as node.parent() instead of of node.parentNode, and many of these methods optionally allow specifying traversal parameters, such as node.parent({ flattened: true }).

To actually construct these DOM and CSSOM types, constructor functions are used:

Element.of(
  Option.of(Namespace.HTML), // Namespace
  None,                      // Prefix
  "div",                     // Name
  self => [                  // Attributes
    Attribute.of(
      None,                  // Namespace
      None,                  // Prefix
      "title",               // Name
      "Hello world",         // Value
      Option.of(self)        // Owner
    )
  ],
  self => [                  // Children
    Text.of(
      "Hello world",         // Data
      Option.of(self)        // Parent
    )
  ]
);

When using JSX, each JSX element is translated to the canonical JSON representation of an element, which can then be further translated to an actual element:

Element.fromElement(
  <div title="Hello world">
    Hello world
  </div>
);

With this new design, traversal is also trivial to implement without exposing any additional APIs for doing so. For example, we can satisfy all traversal requirements laid out in #147 using a simple generator function:

function *visit(node: Node): Generator<void, boolean> {
  if (shouldSkipSubtree) {
    return true;
  }

  if (shouldExit) {
    return false;
  }

  // Pre-order code

  for (const child of node.children()) {
    const shouldContinue = yield* visit(child);

    if (!shouldContinue) {
      return false;
    }
  }

  // Post-order code

  return true;
}

Replacing nullables with the Option<T> type

In the previous section, you might have noticed the use of Option.of(...) and None. These are examples of the Option<T> type in use, which will eventually replace all uses of nullable values. The Option<T> type represents an optional value T which can be either Some<T> if the value exists and None if it doesn't. You might have already stumbled upon a similar concept in functional programming languages such as F# or Scala, or perhaps even in Rust.

Other than of course making it very explicit that a type T is optional, rather than use "magic" values such as null or undefined, the primary benefit of the Option<T> type is its chained API. For example, given something like this...

const a: A | null = getA();

if (a !== null) {
  const b: B | null = getB(a);

  if (b !== null) {
    const c: C | null = getC(b);
  }
}

...we can instead write the following using the Option<T> type:

const c: Option<C> = getA().flatMap(getB).flatMap(getC);

Put another way, Option<T> is to nullable checking madness what Promise<T> is to callback hell.

Side-effect free questions in non-automated rules

In order to allow callers to provide input to non-automated rules, Alfa employs a question/answer mechanism by which rules pose questions to be answered by the caller. Previously, this mechanism was implemented by means of side-effects; a question() function would be invoked with a question after which an answer to the question would be looked up and, if found, returned from the question() function. If no answer was found, a question would be added to a list of questions to be returned to the caller after finishing the audit. The caller would then need to provide answers to these questions by making a new audit with the answers provided.

However, the above flow often lead to some unfortunate situations when rules posed multiple questions, such as this:

const q1 = question("q1", ...);
const q2 = question("q2", ...);

return {
  1: { holds: q1 === true || q2 === false }
}

The intent of the code above is to first ask q1 and then, if q1 was not answered with "yes", ask q2. If q1 is indeed answered with "yes", q2 would however still end up in the list of questions returned to the caller even though it's not needed. To avoid this, we would have to write the code as such:

const q1 = question("q1", ...);

if (q1 === true) {
  return {
    1: { holds: true }
  };
};

const q2 = question("q2", ...);

return {
  1: { holds: q2 === false }
}

This way, the side-effect of the second question never occurs and the code behaves like it's supposed to. You can however imagine how even slightly more complex cases quickly turn into a tangled mess if we constantly have to keep track of which side-effects should and shouldn't occur and when.

To untangle this questionnaire mechanism, questions are now returned as values and therefore no longer rely on side effects. To illustrate what this means, let's rewrite the previous example:

const q1 = Question.of("q1", ...);
const q2 = Question.of("q2", ...);

return {
  1: q1.map(q1 => q1 === true 
      ? /* pass */
      : q2.map(q2 => q2 === false 
          ? /* pass */ 
          : /* fail */
        )
  )
};

In effect, questions only ever reach callers when they're returned from an expectation, meaning that an answer is actually needed.

Optionally asynchronous audits using the Future<T> type

Previously, audits were required to be fully synchronous is order to facilitate integration to projects without support for asynchronicity, such as ESLint. While rules themselves have no use for asynchronous behaviour per ADR 2, other processes, such as user input, do. This restriction lead to an unfortunate user input flow, as outlined in the previous section, and roughly looked like this:

  1. Run an audit.
  2. For all questions returned, provide user input.
  3. Run a new audit with provided user input.
  4. If new questions are returned, go to step 2.

The flow we really want is this:

  1. Run an audit.
  2. When requested, provide user input directly to the running audit.

In order to accomplish this flow, we would need the ability to asynchronously provide user input to a running audit. However, if we wanted to inject any kind of asynchronicity anywhere within the audit flow, we would be forced to make the entire flow asynchronous assuming we had chosen to rely on the standard Promise<T> type. The reason for this is that the Promise<T> type by design cannot be resolved synchronously.

So, to circumvent this design caveat of the standard Promise<T> type, we opted for the simpler Future<T> type, which can optionally be resolved synchronously by means of the Future.now() and Future.delay() constructor functions. As such, the user input flow can default to a synchronous "no answer available" mode, with the ability for callers to also provide an asynchronous user input flow.

While the Promise<T> type provides functionality for dealing with computations that might throw exceptions, the Future<T> type leaves it up to the caller to deal with errors and provides no functionality for handling exceptions. This is very much by design as the Future<T> type is meant as a fairly low-level primitive for working with possibly asynchronous computations. As the next section will make apparent, there are however ways of building error handling on top of the Future<T> type in a more type-safe manner than what is possible with the Promise<T> type.

Typed errors using the Result<T, E> type

JavaScript somewhat suffers from the fact that the common way of dealing with even non-exceptional errors is through exceptions. While I won't go into the details of why exceptions are a poor primitive for error handling (go read something like https://www.atlassian.com/blog/archives/exceptions_are_bad for more on that), I'll mention two things that are fairly critical when working with a code base of a non-trivial size: Control flow and type safety. With exceptions, control flow can be difficult to reason about as exceptions break the normal flow of execution, and type safety surrounding errors is non-existent as the TypeScript compiler can provide no type information on errors in catch (err) {} blocks as anything can be thrown as an exception in JavaScript.

Fortunately for us, Alfa does not need to deal with non-exceptional errors in that many places. The most common use of errors are in its parsers and the JSON-LD expansion algorithm. However, these places are also prime candidates for type safe error handling: Why and where did the parsing fail, and why and where did the JSON-LD document not expand correctly? When we rely on exceptions to inform the caller about such errors, we risk two things:

  1. The caller forgetting to handle potential errors altogether by not wrapping their call in a try {} block.
  2. The caller forgetting to handle a specific error properly due to them having to manually narrow the any error caught by a catch (err) {} block.

To avoid these risks, the Result<T, E> type is now used instead. The type parameter T represents the type of a "successful" value whereas the type parameter E represents a "failed" value, referred to as an error. Similar to the Option<T> type, you might already be familiar with the Result<T, E> type from F# or Rust. Also similar to the Option<T> type, the Result<T, E> provides a chained API that allows us to turns this...

try {
  const a: A = getA();
  const b: B = getB(a);
  const c: C = getC(b);
} catch (err) {
  if (err instanceof ErrorA || err instanceof ErrorB || err instanceof ErrorC) {
    throw log(err);
  } else {
    throw err;
  }
}

...into this:

const c: Result<C, ErrorA | ErrorB | ErrorC> = getA().map(getB).map(getC).mapErr(log);

We can also compose the Result<T, E> type and the Future<T> type in order to model asynchronous computations that might fail: Future<Result<T, E>>. Turning this into the standard Promise<T> type which is rejected on errors is then as simple as:

const task: Future<Result<T, E>> = ...;

const promise: Promise<T> = new Promise((resolve, reject) => 
  task.then(result => result.map(resolve).mapErr(reject))
);

Do notice that by doing so we've however lost the error type E.

* master:
  Remove contributors from `package.json`
  Implement `Reducer` type
  Implement `Monad` type
  Implement `Iterable` type
  Implement `Generator` type
  Implement `Functor` type
  Implement `Mapper` type
  Remove some comments
  Implement `Result` type
  Implement `Predicate` type
* master:
  Implement `Branched` traversal and sequencing
  Don't consider empty IDs in SIA-R3
* master:
  Add missing project reference
  Remove `flatten()` methods and implement `Equality` for some types
  Implement `Equality` type
* master:
  Wrap up some commentary
  Implement `List#pop` and fix `List#push`
  Add missing project reference, again
  Implement `Foldable` type
  Implement `Equality` for `Branched` type and rename a package
  Implement `List` type
  Add additional `Generator` and `Iterable` functions
  Add missing project reference
* compat:
  A couple of fixes
  Remove `Result.Maybe`
  Rework compatibility package
* master:
  Various fixes to `Iterable`, `List`, `Option`, and `Predicate` types
* master:
  Roll back to TypeScript 3.6 to get things building
* master:
  Implement `Cache#merge()`
* master:
  Remove crypto package, take 2
  Remove crypto package
* master:
  Update lockfile
* master:
  Refine `Equality` type
* master:
  Various little refactors
* master:
  Make `Option` and `Result` iterable
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants