Overview

NOTE: This document is out of date, but it still outlines the basic idea of how formatting works.

  1. Source code is parsed to an AST (recommended, but not required).
  2. AST is traversed and IR is generated.
  3. IR is printed by printer.

IR Generation

The intermediate representation (IR) describes how the nodes should be formatted. It consists of:

  1. Texts
  2. Infos
  3. Conditions
  4. Signals
  5. Anchors

These are referred to as "print items" in the code.

Texts

Strings that the printer should print. For example "async".

Infos

These objects are invisible in the output. They may be placed into the IR and, once resolved by the printer, report the following information about where the info ended up:

  • lineNumber
  • columnNumber
  • indentLevel
  • lineStartIndentLevel
  • lineStartColumnNumber
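As a rough illustration of how an info gets resolved, here is a minimal, self-contained sketch. It is a toy model and not dprint-core's API: `InfoItem`, `ResolvedInfo`, and `resolve_infos` are hypothetical names, only the line and column are tracked, and both are assumed to be zero-based.

```rust
use std::collections::HashMap;

/// Toy print items (hypothetical, not dprint-core's types).
enum InfoItem {
    Text(&'static str),
    NewLine,
    /// Invisible marker; the name is only used to look up the result.
    Info(&'static str),
}

/// Where an info ended up (zero-based, an assumption of this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
struct ResolvedInfo {
    line_number: u32,
    column_number: u32,
}

/// Prints the items and records the position of every info it passes.
fn resolve_infos(items: &[InfoItem]) -> (String, HashMap<&'static str, ResolvedInfo>) {
    let mut out = String::new();
    let mut infos = HashMap::new();
    let (mut line, mut column) = (0u32, 0u32);

    for item in items {
        match item {
            InfoItem::Text(text) => {
                out.push_str(text);
                column += text.chars().count() as u32;
            }
            InfoItem::NewLine => {
                out.push('\n');
                line += 1;
                column = 0;
            }
            // Infos print nothing; they only capture the current position.
            InfoItem::Info(name) => {
                infos.insert(
                    *name,
                    ResolvedInfo { line_number: line, column_number: column },
                );
            }
        }
    }

    (out, infos)
}
```

Placing one info before an item and another after it is what lets a later check ask whether the item ended up spanning more than one line.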

Conditions

Conditions have three main properties:

  • Optional true path - Print items to use when the condition is resolved as true.
  • Optional false path - Print items to use when the condition is resolved as false.
  • Condition resolver - Function or condition that the printer uses to resolve the condition as true or false.

Condition Resolver

Conditions are usually resolved by looking at the value of a resolved info, at another condition, or at the original AST node.

The infos & conditions that are inspected may appear before or even after the condition.
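The idea can be sketched with a small stand-alone model. Everything here is hypothetical (`Context`, `Resolver`, `Condition`, `choose_path`) and is not dprint-core's API. The resolver returns `Option<bool>` so that it can answer "not known yet" when the infos it inspects have not been resolved. Falling back to the false path in that case is an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Toy context: maps info names to their resolved line numbers.
struct Context {
    resolved_lines: HashMap<&'static str, u32>,
}

/// Some(answer) when the resolver can decide, None when it cannot yet.
type Resolver = Box<dyn Fn(&Context) -> Option<bool>>;

/// Toy condition: a resolver plus optional true and false paths.
struct Condition {
    resolver: Resolver,
    true_path: Option<&'static str>,
    false_path: Option<&'static str>,
}

/// Builds a resolver that is true when `end` is on a later line than `start`.
fn is_multiple_lines(start: &'static str, end: &'static str) -> Resolver {
    Box::new(move |ctx: &Context| {
        // `?` returns None while either info is still unresolved.
        let start_line = ctx.resolved_lines.get(start)?;
        let end_line = ctx.resolved_lines.get(end)?;
        Some(end_line > start_line)
    })
}

/// Picks the path to print. Treating None as false is this sketch's choice.
fn choose_path(condition: &Condition, ctx: &Context) -> Option<&'static str> {
    match (condition.resolver)(ctx) {
        Some(true) => condition.true_path,
        Some(false) | None => condition.false_path,
    }
}
```

Because the `end` info usually sits after the condition, the first evaluation typically sees it as unresolved, so a later pass may produce a different answer.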

Condition Re-evaluations

Condition re-evaluations can be added to the graph to force a condition to be re-evaluated once the printer reaches that point. If the condition has changed then the printer will jump back to that condition.

let condition: Condition = ...;
let condition_reevaluation = condition.create_reevaluation();

let mut items = PrintItems::new();
items.push_condition(condition);
// ...more items added here...

// forces the condition that this re-evaluation was created from
// to be re-evaluated at this point
items.push_reevaluation(condition_reevaluation);

Signals

This is an enum that signals information to the printer.

  • NewLine - Signal that a new line should occur based on the printer settings.
  • Tab - Signal that a tab should occur based on the printer settings (ex. if indent width is 4 it will increase the column width by 4 for each tab).
  • PossibleNewLine - Signal that the current location could be a newline when exceeding the line width.
  • SpaceOrNewLine - Signal that the current location should be a space, but could be a newline if exceeding the line width.
  • ExpectNewLine - Expect the next character to be a newline. If it's not, force a newline. This is useful to use at the end of single line comments in JS, for example.
  • StartIndent - Signal the start of a section that should be indented.
  • FinishIndent - Signal the end of a section that should be indented.
  • StartNewLineGroup - Signal the start of a group of print items that have a lower precedence for being broken up with a newline for exceeding the line width.
  • FinishNewLineGroup - Signal the end of a newline group.
  • SingleIndent - Signal that a single indent should occur based on the printer settings (ex. prints a tab when using tabs).
  • StartIgnoringIndent - Signal to the printer that it should stop using indentation.
  • FinishIgnoringIndent - Signal to the printer that it should start using indentation again.
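To make a few of these signals concrete, the following is a deliberately simplified sketch of a printer loop. It is not how dprint-core's printer is implemented: `SignalItem` and `print_signals` are hypothetical, only four signals are modelled, and the `SpaceOrNewLine` decision is made greedily when the next text arrives rather than by looking ahead or backtracking.

```rust
/// Toy items, loosely modelled on some of the signals listed above.
enum SignalItem {
    Text(&'static str),
    NewLine,
    SpaceOrNewLine,
    StartIndent,
    FinishIndent,
}

/// Prints items using spaces for indentation and a greedy width check.
fn print_signals(items: &[SignalItem], indent_width: usize, max_width: usize) -> String {
    let mut out = String::new();
    let mut indent_level = 0usize;
    let mut column = 0usize;
    // Set by SpaceOrNewLine; the choice is deferred until the next text.
    let mut pending_space = false;

    for item in items {
        match item {
            SignalItem::Text(text) => {
                let len = text.chars().count();
                if pending_space {
                    pending_space = false;
                    if column + 1 + len > max_width {
                        // The text would not fit after a space, so break the line.
                        out.push('\n');
                        column = 0;
                    } else {
                        out.push(' ');
                        column += 1;
                    }
                }
                if column == 0 {
                    // Indentation is written lazily, at the first text on a line.
                    let indent = indent_level * indent_width;
                    out.push_str(&" ".repeat(indent));
                    column = indent;
                }
                out.push_str(text);
                column += len;
            }
            SignalItem::NewLine => {
                pending_space = false;
                out.push('\n');
                column = 0;
            }
            SignalItem::SpaceOrNewLine => pending_space = true,
            SignalItem::StartIndent => indent_level += 1,
            SignalItem::FinishIndent => indent_level = indent_level.saturating_sub(1),
        }
    }

    out
}
```

Note that the loop only ever measures text lengths, which is why tabs and newlines have to arrive as signals rather than inside strings.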

Anchors

Anchors provide a way to update infos in the future when an anchor's line number changes.

For example, given the following code...

let mut items = PrintItems::new();
let end_ln = LineNumber::new("end");

// ...lots of code here...
items.push_anchor(LineNumberAnchor::new(end_ln));
items.extend(actions::action("outputEndLine", |context| {
  eprintln!("{:?}", context.resolved_line_number(end_ln));
}));
// ...lots of code here...
items.push_signal(Signal::NewLine);
items.push_info(end_ln);

...the anchor at the start of the code will update the end_ln line number based on how the anchor's line number has changed since it was last evaluated. This can be useful when evaluating conditions that "look into the future".

Printer

The printer takes the IR and outputs the final code. Its main responsibilities are:

  1. Resolving infos and conditions in the IR.
  2. Printing out the text with the correct indentation and newline kind.
  3. Seeing where lines exceed the maximum line width and breaking up the line as specified in the IR.

Rules

The printer never checks the contents of the provided strings—it only looks at the length of the strings. For that reason there are certain rules:

  1. Never use a tab in a string. Instead, use Signal::Tab (see Signals above). Tabs increase the column width based on the indent width and need to be treated differently.
  2. Never use a newline in a string. Instead, use Signal::NewLine.

Strings that include newlines or tabs should be broken up when parsed (ex. template literals in JavaScript may contain those characters).

The printer enforces these rules in non-release (debug) builds.
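One way to follow these rules is to split raw text before it is pushed into the IR. The sketch below is illustrative only: `Piece` and `split_raw_text` are hypothetical names, not part of dprint-core, and a real plugin would push the resulting pieces as texts and signals.

```rust
/// Toy output of the split: plain text runs plus tab and newline markers.
#[derive(Debug, PartialEq)]
enum Piece {
    Text(String),
    Tab,
    NewLine,
}

/// Splits raw text so that no `Piece::Text` contains a tab or a newline.
fn split_raw_text(raw: &str) -> Vec<Piece> {
    // Collapse Windows line endings so "\r\n" becomes a single NewLine.
    let normalized = raw.replace("\r\n", "\n");
    let mut pieces = Vec::new();
    let mut current = String::new();

    for ch in normalized.chars() {
        match ch {
            '\n' | '\t' => {
                // Flush the text collected so far, if any.
                if !current.is_empty() {
                    pieces.push(Piece::Text(std::mem::take(&mut current)));
                }
                pieces.push(if ch == '\n' { Piece::NewLine } else { Piece::Tab });
            }
            _ => current.push(ch),
        }
    }

    if !current.is_empty() {
        pieces.push(Piece::Text(current));
    }

    pieces
}
```

For instance, `split_raw_text("a\tb")` yields a text `a`, a tab marker, and a text `b`, so the printer can account for the tab's width separately.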

Example IR Generation

Given the following AST nodes:

enum Node<'a> {
  ArrayLiteralExpression(&'a ArrayLiteralExpression),
  ArrayElement(&'a ArrayElement),
}

#[derive(Clone)]
struct Position {
  /// Line number in the original source code.
  pub line_number: u32,
  /// Column number in the original source code.
  pub column_number: u32,
}

#[derive(Clone)]
struct ArrayLiteralExpression {
  pub position: Position,
  pub elements: Vec<ArrayElement>,
}

#[derive(Clone)]
struct ArrayElement {
  pub position: Position,
  pub text: String,
}

With the following expected outputs (when the maximum line width configured in the printer is 10):

// input
[a   ,   b
    , c
   ]
// output
[a, b, c]

// input
[four, four, four]
// output (since it exceeds the line width of 10)
[
    four,
    four,
    four
]

// input
[
four]
// output (since first element was placed on a different line)
[
    four
]

Here's some example IR generation:

use std::rc::Rc;

use dprint_core::formatting::*;

pub fn format(expr: &ArrayLiteralExpression) -> String {
  dprint_core::formatting::format(
    || gen_node(Node::ArrayLiteralExpression(expr)),
    PrintOptions {
      indent_width: 4,
      max_width: 10,
      use_tabs: false,
      newline_kind: "\n",
    },
  )
}

// IR generation functions

fn gen_node(node: Node) -> PrintItems {
  // in a real implementation this function would deal with surrounding comments

  match node {
    Node::ArrayLiteralExpression(expr) => gen_array_literal_expression(expr),
    Node::ArrayElement(array_element) => gen_array_element(array_element),
  }
}

fn gen_array_literal_expression(expr: &ArrayLiteralExpression) -> PrintItems {
  let mut items = PrintItems::new();
  let start_ln = LineNumber::new("start");
  let end_ln = LineNumber::new("end");
  let is_multiple_lines = create_is_multiple_lines_resolver(
    expr.position.clone(),
    expr.elements.iter().map(|e| e.position.clone()).collect(),
    start_ln,
    end_ln,
  );

  // actions::if_column_number_changes is a helper that uses lower level IR to tell when the column number
  // changes at this point
  items.extend(actions::if_column_number_changes(move |context| {
    context.clear_info(end_ln);
  }));

  items.push_info(start_ln);
  items.push_anchor(LineNumberAnchor::new(end_ln)); // updates the line number of end_ln when this changes

  items.push_str("[");
  items.push_condition(conditions::if_true("arrayStartNewLine", is_multiple_lines.clone(), Signal::NewLine.into()));

  let generated_elements = gen_elements(&expr.elements, &is_multiple_lines).into_rc_path();
  items.push_condition(conditions::if_true_or(
    "indentIfMultipleLines",
    is_multiple_lines.clone(),
    ir_helpers::with_indent(generated_elements.into()),
    generated_elements.into(),
  ));

  items.push_condition(conditions::if_true("arrayEndNewLine", is_multiple_lines, Signal::NewLine.into()));
  items.push_str("]");

  items.push_info(end_ln);

  return items;

  fn gen_elements(elements: &[ArrayElement], is_multiple_lines: &ConditionResolver) -> PrintItems {
    let mut items = PrintItems::new();
    let elements_len = elements.len();

    for (i, elem) in elements.iter().enumerate() {
      items.extend(gen_node(Node::ArrayElement(elem)));

      if i < elements_len - 1 {
        items.push_str(",");
        items.push_condition(conditions::if_true_or(
          "afterCommaSeparator",
          is_multiple_lines.clone(),
          Signal::NewLine.into(),
          Signal::SpaceOrNewLine.into(),
        ));
      }
    }

    items
  }
}

fn gen_array_element(element: &ArrayElement) -> PrintItems {
  element.text.to_string().into()
}

// helper functions

fn create_is_multiple_lines_resolver(parent_position: Position, child_positions: Vec<Position>, start_ln: LineNumber, end_ln: LineNumber) -> ConditionResolver {
  Rc::new(move |condition_context: &mut ConditionResolverContext| {
    // no items, so format on the same line
    if child_positions.is_empty() {
      return Some(false);
    }
    // first child is on a different line than the start of the parent
    // so format all the children as multi-line
    if parent_position.line_number < child_positions[0].line_number {
      return Some(true);
    }

    // check if it spans multiple lines, and if it does then make it multi-line
    condition_helpers::is_multiple_lines(condition_context, start_ln, end_ln)
  })
}