Diagnostic Compiler Plugin Proposal #45886

Open
5 tasks done
uraj opened this issue Sep 15, 2021 · 0 comments
Labels
In Discussion Not yet reached consensus Suggestion An idea for TypeScript

Suggestion

🔍 Search Terms

Plugins, Extensions, Custom Diagnostics

✅ Viability Checklist

My suggestion meets these guidelines:

  • This wouldn't be a breaking change in existing TypeScript/JavaScript code
  • This wouldn't change the runtime behavior of existing JavaScript code
  • This could be implemented without emitting different JS based on the types of the expressions
  • This isn't a runtime feature (e.g. library functionality, non-ECMAScript syntax with JavaScript output, new syntax sugar for JS, etc.)
  • This feature would agree with the rest of TypeScript's Design Goals.

⭐ Suggestion

There are many cases where TypeScript users want to extend the language with additional diagnostic rules, but tsc has very limited extensibility. This proposal suggests a new plugin mechanism for tsc to allow TypeScript users to develop and deploy add-ons that produce custom semantic diagnostics.

There are other proposals (#16607 and #38736) that seek to extend the behavior of the TypeScript compiler. It seems that a major concern that blocks those proposals from moving forward is that these proposals seek to allow features like extending the language syntax and applying additional transformations that affect JS code emission. This proposal should not raise such a concern, since the plugins we want are merely for emitting new semantic diagnostics. Other functionalities, such as extending language syntax or applying transformations, are out of the scope of this proposal.

📃 Motivation

One of Google’s recent major objectives of enhancing the security of the web is to roll out Trusted Types, a new Web API proposal that provides in-depth defenses against XSS. If Trusted Types are enabled, the browser will monitor security-sensitive DOM APIs like element.innerHTML and script.src at run time, making sure these APIs only accept non-spoofable, typed values instead of strings. To make existing applications compatible with Trusted Types, developers have to identify and review the uses of those XSS-prone APIs in their code.

We found that it would be extremely helpful to make the TypeScript compiler able to detect those violations during compilation, since those code locations will lead to run-time errors when turning on Trusted Types. This led to tsec, a tool that mimics the behavior of tsc but additionally reports Trusted Types violations as semantic errors.

We started building tsec to support the migration of several open source projects to Trusted Types, including Visual Studio Code. Along this journey, we found that existing compiler APIs are not convenient for creating a tool that aims to augment the functionality of tsc, for the reasons listed in “Current Solutions”.

We would like to have a new set of compiler APIs that allow developers to install static analysis plugins directly into tsc, instead of calling existing compiler APIs to mimic how tsc behaves. The new APIs should handle all the common setups for the plugins so that plugins work inside tsc out of the box and plugin writers can focus on core functionalities. We list our major requirements below, all of which are derived from the lessons learned from developing tsec:

  • One plugin works for all compilation modes, e.g., build mode, watch mode.
    • Each mode requires different APIs. For the purpose of enforcing additional diagnostic rules, we found those APIs overly complex, leading to a less readable and hard-to-maintain code base.
  • Configurable through tsconfig.json with a certain degree of standard option parsing support.
    • Sometimes users of tsec may want to tune the sensitivity of the security analysis, e.g., turning off certain rules or enabling loose type matching. These require plugins to accept options through tsconfig.json. For now we need to parse and validate those options inside the plugin. It will be much more convenient for tsc to automatically parse the options based on plugin-provided argument specs.
  • Changing plugin options in tsconfig.json should invalidate the build cache.
    • Diagnostics emitted by plugins can change when plugin options change, even if the source code is not edited. With the current compiler APIs, we need to parse and manipulate the .tsbuildinfo cache by ourselves.
  • The plugin can refer to other files for additional configuration. The freshness of those files should be considered by tsc in incremental compilation since their contents can affect how the plugin works.
    • This may be a requirement very specific to tsec. It’s likely that developers cannot address all compiler-reported violations at once, and there can be false positives. Therefore, tsec allows project maintainers to keep a centralized exemption list for suppressing certain diagnostic rules for certain files. The path to this exemption list should be provided to tsec, and its content should be considered as part of the plugin configuration. Whenever an entry is added to or removed from the list, the build cache should be invalidated so that tsec has the chance to recheck everything.
  • Compatible with the existing language service plugin mechanism.
    • It’s very desirable to automatically manifest all compiler errors in IDEs.

💻 Use Cases

It may seem unnecessary to customize tsc for new coding rules, considering that it can be done via linting tools such as TSLint and ESLint. However, we believe there are many cases that can benefit from bundling this functionality directly into tsc instead of relying on a linter.

We want to promote some of the diagnostic rules to “first-class” compiler checks, since compiler errors produce a more immediate response from developers (backed by research based on industrial data [1, 2]). Google has verified this idea by developing and deploying a tool called Error-Prone, which can customize a Java compiler for more sophisticated error checking. As for TypeScript, Google has developed a “safe coding” practice that forces developers to write secure code through a set of in-compiler security checks that forbid the use of XSS-prone DOM APIs (see this paper for details).

The lack of an officially supported plugin system has led to ecosystem fragmentation. There have been different TypeScript language tools that support plugins, e.g., ttypescript (Transformer TypeScript), TSLint and ESLint. Some of them have their own AST definitions, making plugins not portable. Migrating from one tool to another can introduce prohibitively expensive engineering costs. With the deprecation of TSLint, these costs are not speculative but have real impact on many developers and organizations, including Google.

TypeScript already supports a plugin mechanism, but it is only available in the language service for improving code editing experience in IDEs. However, diagnostics emitted in this way do not manifest when the same code is directly compiled by tsc, even though the language service plugins are configured through a compiler option in tsconfig.json which is entirely visible to tsc. This leads to inconsistent development experience, affected by whether developers are using an IDE or not.

Current Solutions

Several projects (such as tsec and Bazel) have been maintaining “wrapped” versions of tsc to emit customized diagnostics. These wrappers accept the same command line flags as tsc does. Internally, they call the TypeScript compiler APIs to perform standard type checking and code emission.

This methodology has some notable inconveniences:

  • A fairly large amount of code repetition with tsc, basically implementing the same functionality, e.g. parsing command line flags, parsing tsconfig.json, supporting build mode and watch mode.
  • It's hard to "teach" other compiler APIs to treat the additional diagnostics like native diagnostics. For example, when the wrapper decides there should be semantic errors in a source file, it’s difficult to stop the compiler from emitting JS code for that file.
  • With the current compiler APIs, there is no easy way to invalidate the build cache when plugin versions or configurations change. This is a problem because different versions and configurations can cause different diagnostics to be emitted.

Some other projects rely on solutions based on ttypescript, a tool that allows project owners to dynamically patch the TypeScript compiler with custom transformers specified in tsconfig.json, which avoids some of the aforementioned problems. Although ttypescript is designed for purposes that do not quite align with our use cases (it focuses on transformation rather than analysis), the same idea can still be applied to implementing diagnostic plugins. Nevertheless, the ttypescript implementation depends on knowledge about the TypeScript compiler internals which are subject to change across version updates. Therefore, this solution can be brittle. There are many reports from users of ttypescript stating that certain functionalities ceased to work properly after TypeScript version upgrades.

Proposal

Plugin Configuration

Diagnostic plugins are configured through tsconfig.json, with the same option currently used to configure language service plugins. To distinguish the new plugins from language service plugins, a new “type” property is introduced.

{
  "compilerOptions": {
    "plugins": [
      {"name": "language-service-plugin-a" }, // Without "type", the plugin is for language service by default
      {"name": "language-service-plugin-b", "type": "language-service" },
      {"name": "diagnostic-plugin", "type": "diagnostic"},
      // A diagnostic plugin that wants the diagnostics to also surface in IDEs; it shouldn't need
      // to implement the LSP interface separately, as the diagnostics are treated like first-class
      // compiler messages that ought to automatically work in IDEs.
      {"name": "language-service-and-diagnostic-plugin", "type": "diagnostic|language-service"}
    ]
  }
}
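As a rough illustration of how the “type” property above might be interpreted, here is a minimal sketch. The names PluginKind, PluginEntry and parsePluginKinds are hypothetical and not part of tsc; the sketch assumes that a missing “type” means language service only and that “|” separates multiple kinds, as in the example configuration.

```typescript
// Hypothetical names throughout; none of this is an existing tsc API.
type PluginKind = "language-service" | "diagnostic";

interface PluginEntry {
  name: string;
  type?: string;
}

const KNOWN_KINDS: readonly PluginKind[] = ["language-service", "diagnostic"];

function isPluginKind(value: string): value is PluginKind {
  return (KNOWN_KINDS as readonly string[]).indexOf(value) !== -1;
}

// Returns the kinds a plugin entry opts into; throws on an unknown kind.
function parsePluginKinds(entry: PluginEntry): PluginKind[] {
  // Assumption: without "type", the entry keeps today's behavior
  // and is treated as a language service plugin only.
  if (entry.type === undefined) {
    return ["language-service"];
  }
  const kinds: PluginKind[] = [];
  for (const part of entry.type.split("|")) {
    const kind = part.trim();
    if (!isPluginKind(kind)) {
      throw new Error(`Plugin "${entry.name}": unknown type "${kind}"`);
    }
    if (kinds.indexOf(kind) === -1) {
      kinds.push(kind);
    }
  }
  return kinds;
}
```

Rejecting unknown kinds instead of ignoring them is a design choice of this sketch; it keeps a typo in “type” from silently disabling a plugin.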

Plugin Definition

A plugin needs to define the following three components.

Plugin Option Spec

Similar to other compiler options, each diagnostic plugin option is defined by a name and a type. After loading the plugin, tsc will automatically parse the additional plugin options as arguments and emit errors or warnings if the arguments provided are malformed. The compiler should also automatically resolve path-type arguments to absolute paths. Plugin arguments can be defined by an interface similar to the internal API CommandLineOptionBase.

interface DiagnosticPluginOption {
  name: string;
  type: "string"|"number"|"boolean"|"object"|"list"|Map<string, string>;
  required?: boolean;
  isFilePath?: boolean;
  extraValidation?: (value: CompilerOptionsValue) => DiagnosticMessage|undefined;
}

All plugin options affect the semantic diagnostics during the compilation, so tsc should be able to smartly handle option changes by recording plugin options inside .tsbuildinfo and invalidating build cache whenever different plugin options are used for compilation. In particular, if the option is a file path, modifications to that file should be tracked as well. A real-world use case of file path options is that a plugin wants to respect an exemption list indicating that developers are aware of certain diagnostics and would like to suppress them in future compilations. Changes to that exemption list will affect diagnostics emitted in future compilations.
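To make the option parsing described above more concrete, the following sketch validates raw tsconfig values against a simplified option spec and resolves file-path options against the directory of tsconfig.json. OptionSpec is a reduced stand-in for DiagnosticPluginOption; parsePluginOptions and resolveAgainst are hypothetical helpers. The path handling is POSIX-only and does not normalize “..” segments.

```typescript
// Reduced stand-in for the proposal's DiagnosticPluginOption (assumption).
interface OptionSpec {
  name: string;
  type: "string" | "number" | "boolean";
  required?: boolean;
  isFilePath?: boolean;
}

type OptionValue = string | number | boolean;

interface ParsedPluginOptions {
  values: Record<string, OptionValue>;
  errors: string[];
}

// Simplified, POSIX-only path join; a real implementation would normalize paths.
function resolveAgainst(baseDir: string, filePath: string): string {
  if (filePath.charAt(0) === "/") {
    return filePath;
  }
  const relative = filePath.indexOf("./") === 0 ? filePath.slice(2) : filePath;
  return baseDir.replace(/\/+$/, "") + "/" + relative;
}

// Hypothetical helper: checks each declared option and collects errors
// instead of throwing, so that all problems can be reported at once.
function parsePluginOptions(
  specs: readonly OptionSpec[],
  raw: Record<string, unknown>,
  configDir: string,
): ParsedPluginOptions {
  const values: Record<string, OptionValue> = {};
  const errors: string[] = [];
  for (const spec of specs) {
    const value = raw[spec.name];
    if (value === undefined) {
      if (spec.required) {
        errors.push(`Missing required option "${spec.name}".`);
      }
      continue;
    }
    if (typeof value !== spec.type) {
      errors.push(`Option "${spec.name}" must be of type ${spec.type}.`);
      continue;
    }
    const typed = value as OptionValue;
    values[spec.name] =
      spec.isFilePath && typeof typed === "string"
        ? resolveAgainst(configDir, typed)
        : typed;
  }
  return { values, errors };
}
```

For example, a spec with a required file-path option named exemptionList (a made-up option name) and a configDir of /root/workspace/project would turn "./exemptions.json" into /root/workspace/project/exemptions.json.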

To ensure compilation with diagnostic plugins is hermetic, plugins should

  • Only read additional files from tsconfig options.
  • Be deterministic.

Initializer

A callback for the plugin to initialize itself when loaded by the compiler for a project. The initializer takes the parsed compiler options as input. With those options (including those for the plugin itself), the initializer should be able to construct the necessary data structures for the plugin to perform analysis later.

// The initializer is called whenever the program or plugin configurations change.
type DiagnosticPluginInitializer =
    (program: Program, pluginOptions: Record<string, CompilerOptionsValue>) => void;

Analyzer

The analyzer is the main component of a plugin. It is a callback that takes a type checked source file and returns a list of semantic diagnostics that the plugin wants to report. When the plugin is being dispatched, tsc will ensure that the analyzer is called on each source file that has been compiled.

type DiagnosticPluginAnalyzer =
    // If no source file is provided, the whole program should be checked.
    (source: SourceFile|undefined, cancellationToken?: CancellationToken) => Diagnostic[];
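The shape of an analyzer can be illustrated with a toy example. The sketch below uses minimal stand-in types instead of the compiler's SourceFile and Diagnostic, and it scans the file text with a regular expression. A real analyzer such as the one in tsec would walk the type-checked AST instead; the text scan is only there to keep the example self-contained.

```typescript
// Minimal stand-ins for the compiler's SourceFile and Diagnostic (assumptions).
interface SketchSourceFile {
  fileName: string;
  text: string;
}

interface SketchDiagnostic {
  fileName: string;
  start: number;
  length: number;
  messageText: string;
}

// Toy analyzer: flags plain assignments to `innerHTML`.
// Comparisons (`==`, `===`) are skipped by the negative lookahead.
function analyzeInnerHtml(source: SketchSourceFile): SketchDiagnostic[] {
  const diagnostics: SketchDiagnostic[] = [];
  const pattern = /\.innerHTML\s*=(?!=)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source.text)) !== null) {
    diagnostics.push({
      fileName: source.fileName,
      start: match.index + 1, // skip the leading dot
      length: "innerHTML".length,
      messageText: "Assignment to innerHTML is not compatible with Trusted Types.",
    });
  }
  return diagnostics;
}
```

The start and length fields point at the property name itself, which is the span a compiler-style error message would typically underline.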

Complete Plugin Definition

A plugin may implement an interface like this:

interface DiagnosticPlugin {
  init: DiagnosticPluginInitializer;
  dispatch: DiagnosticPluginAnalyzer;
}

The module that defines the plugin should export an object (provisionally named as “builder”) that implements the following interface:

interface DiagnosticPluginBuilder {
  optionsSpec: readonly DiagnosticPluginOption[];
  // Need a constructor since tsc needs to create a plugin instance for each
  // project being compiled.
  new(): DiagnosticPlugin;
}

class MyPlugin implements DiagnosticPlugin {
  static optionsSpec: readonly DiagnosticPluginOption[] = [...];
  private program: Program|undefined;
  init(program: Program, pluginOptions: Record<string, CompilerOptionsValue>): Promise<void>|void {
    // In watch mode or incremental build, `init` can be called multiple times.
    if (this.program === undefined) {
      this.program = program; // First call; keep a reference to the program
    } else {
      // perform some incremental changes.
    }
  }
  dispatch(source: SourceFile, cancellationToken?: CancellationToken):
      Diagnostic[] {
    ...
  }
}

export const builder: DiagnosticPluginBuilder = MyPlugin;

Plugin Loading

Diagnostic plugins are specified through the “name” property of each entry under “compilerOptions” → “plugins” in tsconfig.json. Consistent with how language service plugins are loaded, “by default, this name is interpreted as an NPM package name; that is, it can be either the name of an NPM package with an index.js file, or it can be a path to a directory with an index.js file.” Plugin authors need to make sure the plugin package provides a module of that name whose path can be resolved by tsc, using tsc’s Node resolution strategy.

There may be security concerns that a malicious plugin gets loaded by tsc and therefore runs arbitrary code on the development machine. To alleviate this threat, we can add an optional field “path”. When “path” is present, tsc will not try to automatically resolve the plugin location. Instead, it explicitly loads the plugin from ${path}/${name}/index.js.

To be extra cautious, cross-project plugin loading is not allowed, i.e., tsc may only load plugins from code locations that share a common project root with tsc itself. That means:

  • The globally installed tsc should not load plugins from local project locations.
  • A tsc installed for project A (/project-a/node_modules/typescript/bin/tsc) should not load plugins downloaded for project B (/project-b/node_modules/my-plugin/index.js).

This segregation rule applies regardless of the value of “path”. Whenever a project configuration violates the security rules, tsc aborts compilation and emits an error.
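One possible reading of the segregation rule is sketched below. It assumes that the “project root” of a tsc installation is the directory containing the node_modules folder that holds the typescript package, and that a plugin is acceptable only if it lives somewhere under that directory. Both the interpretation and the helper names (tscProjectRoot, mayLoadPlugin) are assumptions, as is the /usr/lib location used for a global install in the usage note.

```typescript
// Assumption: tsc lives at <root>/node_modules/typescript/..., and <root>
// is what the rule calls the project root.
function tscProjectRoot(tscPath: string): string | undefined {
  const index = tscPath.lastIndexOf("/node_modules/typescript/");
  return index === -1 ? undefined : tscPath.slice(0, index);
}

// Returns true only if the plugin file is located under tsc's project root.
function mayLoadPlugin(tscPath: string, pluginPath: string): boolean {
  const root = tscProjectRoot(tscPath);
  if (root === undefined) {
    return false; // cannot determine a root, so refuse to load
  }
  return pluginPath.indexOf(root + "/") === 0;
}
```

Under these assumptions, /project-a/node_modules/typescript/bin/tsc may load /project-a/node_modules/my-plugin/index.js, but not /project-b/node_modules/my-plugin/index.js, and a global installation such as /usr/lib/node_modules/typescript/bin/tsc may not load a plugin from /project-a.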

The starting point of the plugin search is the directory in which the tsconfig.json is located. For example, given /root/workspace/project/tsconfig.json with the following content:

{
  "compilerOptions": {
    "plugins": [
      {
        "name": "my-plugin",
        "type": "diagnostic",
        // If `"path"` is provided, tsc only loads node modules from the specified location.
        //"path": "./trusted_node_modules", // Optional, for additional security
      }
    ]
  }
}

The compiler will try to load the plugin from the following locations in order:

  1. /root/workspace/project/node_modules/my-plugin/index.js
  2. /root/workspace/node_modules/my-plugin/index.js
  3. /root/node_modules/my-plugin/index.js
  4. /node_modules/my-plugin/index.js
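The search order above can be reproduced with a short sketch. candidatePluginPaths is a hypothetical helper, not an existing compiler function; it assumes POSIX-style absolute paths and ignores the optional “path” setting.

```typescript
// Lists the locations that would be probed, walking up from the directory
// that contains tsconfig.json to the file system root.
function candidatePluginPaths(configDir: string, pluginName: string): string[] {
  const segments = configDir.split("/").filter((segment) => segment.length > 0);
  const candidates: string[] = [];
  for (let depth = segments.length; depth >= 0; depth--) {
    const dir = "/" + segments.slice(0, depth).join("/");
    const prefix = dir === "/" ? "" : dir; // avoid a double slash at the root
    candidates.push(`${prefix}/node_modules/${pluginName}/index.js`);
  }
  return candidates;
}
```

Calling candidatePluginPaths("/root/workspace/project", "my-plugin") yields the four locations listed above, in the same order.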

If the plugin configuration is inherited from another JSON file, the starting point of the search should be relative to the configuration file being extended.

If any of the configured diagnostic plugins cannot be found by tsc, it aborts the compilation. In verbose mode, tsc can additionally report the paths it searched to ease debugging.

Diagnostic plugins will always be loaded by the tsc invoked for compilation, regardless of whether that tsc is local to the workspace, a global installation, or bundled with an IDE. Plugin authors can assume that their plugins will be loaded by a compatible tsc, even though this may not always be true, e.g., the tsc invoked may be from a system-wide installation or VS Code. There can be run-time errors in those cases, as the plugins may be using a newer version of the TS compiler APIs. This proposal does not consider any error reporting or recovery strategy for those situations.

Plugin Initialization

When compiling a project, tsc will first parse tsconfig.json and gather all compiler options. It then examines each entry in “plugins”. For each diagnostic plugin, tsc will dynamically load the module with the specified name. The compiler can then get a reference to the exported plugin builder. Using builder.optionsSpec, tsc will parse other properties of the plugin configuration JSON object and report errors if there are malformed options.

If option parsing is successful, tsc will construct a plugin instance with the plugin builder and pass the options to the plugin initializer. It can then continue the compilation, until it is about to report semantic diagnostics. At that point, tsc will dispatch the plugin analyzer on each source file being compiled in the project. The compiler should be able to dispatch plugins in all relevant compilation modes.
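The sequence described above (check options, construct the plugin, initialize it, then dispatch it on every file) might look roughly like the following from the compiler's side. All types here are simplified stand-ins with hypothetical names: the real proposal passes a Program to init and uses a full option spec, while this sketch passes a list of files and only checks option names. TodoPlugin is a made-up example plugin.

```typescript
// Simplified stand-ins (assumptions), not compiler types.
interface FlowFile {
  fileName: string;
  text: string;
}

interface FlowDiagnostic {
  fileName: string;
  messageText: string;
}

interface FlowPlugin {
  init(files: readonly FlowFile[], options: Record<string, unknown>): void;
  dispatch(file: FlowFile): FlowDiagnostic[];
}

interface FlowPluginBuilder {
  optionNames: readonly string[];
  new (): FlowPlugin;
}

function runDiagnosticPlugin(
  builder: FlowPluginBuilder,
  options: Record<string, unknown>,
  files: readonly FlowFile[],
): FlowDiagnostic[] {
  // Step 1: report malformed configuration before constructing the plugin.
  const unknownNames = Object.keys(options).filter(
    (key) => builder.optionNames.indexOf(key) === -1,
  );
  if (unknownNames.length > 0) {
    return [{
      fileName: "tsconfig.json",
      messageText: `Unknown plugin option(s): ${unknownNames.join(", ")}`,
    }];
  }
  // Step 2: one plugin instance per project, initialized with the options.
  const plugin = new builder();
  plugin.init(files, options);
  // Step 3: dispatch the analyzer on each file and collect the results.
  const diagnostics: FlowDiagnostic[] = [];
  for (const file of files) {
    diagnostics.push(...plugin.dispatch(file));
  }
  return diagnostics;
}

// Made-up example plugin: flags files that contain a configurable marker.
class TodoPlugin implements FlowPlugin {
  static optionNames: readonly string[] = ["marker"];
  private marker = "TODO";

  init(_files: readonly FlowFile[], options: Record<string, unknown>): void {
    const marker = options["marker"];
    if (typeof marker === "string") {
      this.marker = marker;
    }
  }

  dispatch(file: FlowFile): FlowDiagnostic[] {
    return file.text.indexOf(this.marker) === -1
      ? []
      : [{ fileName: file.fileName, messageText: `Found ${this.marker}.` }];
  }
}
```

Because the class carries a static optionNames member and a parameterless constructor, the class itself can be passed as the builder, mirroring the `builder = MyPlugin` pattern in the proposal.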

Diagnostic Reporting

All diagnostic reporting should be taken care of by the compiler. We don’t expect plugins to customize that behavior, so tsc will treat diagnostics from plugins in the same way as it reports first-class compiler diagnostics, except that tsc will tag the additional diagnostics with plugin names so that developers know where the new errors and warnings come from.
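Tagging could be as simple as stamping each plugin diagnostic with the plugin name before it is reported. The sketch below uses a stand-in type whose optional source field is modeled on the field of the same name on the compiler's Diagnostic type; tagWithPlugin, formatReported and the bracketed output format are assumptions made for illustration.

```typescript
// Stand-in for a reported diagnostic (assumption).
interface ReportedDiagnostic {
  messageText: string;
  source?: string;
}

// Returns copies of the diagnostics, each tagged with the plugin name.
function tagWithPlugin(
  pluginName: string,
  diagnostics: readonly ReportedDiagnostic[],
): ReportedDiagnostic[] {
  return diagnostics.map((diagnostic) => ({ ...diagnostic, source: pluginName }));
}

// Hypothetical output format: plugin diagnostics get a "[plugin-name]" prefix.
function formatReported(diagnostic: ReportedDiagnostic): string {
  return diagnostic.source === undefined
    ? diagnostic.messageText
    : `[${diagnostic.source}] ${diagnostic.messageText}`;
}
```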

Build Cache Invalidation

When performing an incremental build, tsc should be able to decide which files should be re-compiled based on both first-class diagnostics and plugin diagnostics. Also, if a file contains plugin-emitted errors, the same errors should be displayed when the project is recompiled in the next incremental build, even if there are no “actual” compiler errors that prevent JS code emission. Additionally, changes in plugin configurations should also invalidate existing build cache.
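One way to detect configuration changes is to store a stable fingerprint of everything that can influence plugin diagnostics and compare it on the next build. The sketch below is an assumption about how that could work, not a description of the .tsbuildinfo format: PluginCacheInput, stableStringify and needsRecheck are hypothetical, and the content hashes of referenced files (such as an exemption list) are assumed to be computed elsewhere.

```typescript
// Everything that can change plugin diagnostics without a source edit (assumption).
interface PluginCacheInput {
  pluginVersion: string;
  options: Record<string, unknown>;
  referencedFileHashes: Record<string, string>;
}

// Serializes JSON-like values with sorted object keys, so that two equal
// configurations always produce the same string.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => stableStringify(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const keys = Object.keys(record).sort();
    const parts = keys.map(
      (key) => JSON.stringify(key) + ":" + stableStringify(record[key]),
    );
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value);
}

// True when the stored fingerprint is missing or differs from the current one.
function needsRecheck(
  storedFingerprint: string | undefined,
  current: PluginCacheInput,
): boolean {
  return storedFingerprint !== stableStringify(current);
}
```

Sorting keys matters because reordering properties in tsconfig.json should not, by itself, force a full recheck.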

Matching Requirements with Existing Prototype

There was a work-in-progress prototype for tsc plugins, authored by @rbuckton. The relevant artifacts are:

We have evaluated this prototype against our requirements. The conclusion is that it can fulfill most of our needs, mostly through the preEmit instrumentation API. Other requirements not covered by this prototype can be met with minor additions to the prototype.
