Skip to content
This repository has been archived by the owner on Aug 31, 2023. It is now read-only.

refactor(formatter): Introduce write, format, and format_args macros #2634

Merged
merged 38 commits into from Jun 6, 2022

Conversation

MichaReiser
Copy link
Contributor

@MichaReiser MichaReiser commented May 31, 2022

Summary

tldr: Rust's allocation free format, write and format_args for Rome formatter!

First of all, I'm sorry for this massive PR.

The motivation behind this PR is to change our formatting to work from left to right. Something that may become useful when formatting comments. The current formatter formats the most-grouped elements first rather than left to right, mainly because all IR elements like group_elements, fill etc. accept a FormatElement as a parameter and not a Format implementation. The other motivation behind this PR is to make all formatting macros allocation free compared to the current formatted! and format_elements macro that requires at least one allocation (except the compiler optimises it away).

This PR enforces left to right formatting by changing Format's signature from:

fn format(&self, f: &Formatter<JsFormatOptions>) -> FormatResult<FormatElement>

to

fn format(&self, f: &mut Formatter<JsFormatOptions>) -> FormatResult()

The main change is that format no longer returns the formatted result but instead writes it into the Formatter, similar to Rust's Debug and Display trait with the write and format_args macros. The fact that format now writes to a shared FormatElement buffer enforces format rules to write the element in order or they will appear out of order in the output.

Second, the PR changes all builders (group_elements, fill, space) to return objects that implement Format rather than FormatElements directly. This decouples the formatting DSL from the IR used by the printer. The other added benefit is that builders that accept some inner content no longer accept FormatElement but instead accept any object implementing Format. This should remove the need for many formatted! calls that were necessary just because some helper needs a FormatElement because it's the least common denominator.

OK, but how do I write my formatting logic now:

impl FormatNodeFields<JsFunctionBody> for FormatNodeRule<JsFunctionBody> {
    fn format_fields(
        node: &JsFunctionBody,
        f: &mut Formatter<JsFormatOptions>,
    ) -> FormatResult<()> {
        let JsFunctionBodyFields {
            l_curly_token,
            directives,
            statements,
            r_curly_token,
        } = node.as_fields();

        let format_statements = format_with(|f| {
            let mut join = f.join_nodes_with_hardline();

            for stmt in &statements {
                join.entry(stmt.syntax(), &stmt.format_or_verbatim());
            }

            join.finish()
        });

        write!(
            f,
            [f.delimited(
                &l_curly_token?,
                &format_args![directives.format(), format_statements],
                &r_curly_token?,
            )
            .block_indent()]
        )
    }
}

The main differences are

  • You call write!(f, []) instead of formatted if you want to write something to the document.
  • You use format_args if you want to pass multiple Format objects to a helper like group_elements.
  • The formatter now exposes helpers like f.join(), f.join_with(), f.fill() and f.join_nodes_with_softline_break etc to format a sequence of objects

Part of #2571

Drawbacks

The main drawback in my view is that it becomes harder to debug formatter code because it's no longer possible to dbg print some sub-result. This has become less of a concern in my view with the new playground but can still make it difficult to debug some formatting if things go amiss.

What would be nice at least is if all built in Format implementations could implement Debug so that it becomes possible to write write!(f, [dbg!(group_elements(...))]). However, the only way I can think of doing this is by calling Format inside of Debug with the default formating options which may be misleading.

SOLVED: I added a dbg_write! macro, see this comment

Benchmark

group                                    main                                   write
-----                                    ----                                   -----
formatter/checker.ts                     1.14    222.6±3.92ms    11.7 MB/sec    1.00    196.0±2.61ms    13.3 MB/sec
formatter/compiler.js                    1.17    127.1±1.36ms     8.2 MB/sec    1.00    109.0±1.02ms     9.6 MB/sec
formatter/d3.min.js                      1.17     94.6±1.60ms     2.8 MB/sec    1.00     80.8±0.61ms     3.2 MB/sec
formatter/dojo.js                        1.16      7.1±0.02ms     9.7 MB/sec    1.00      6.1±0.02ms    11.2 MB/sec
formatter/ios.d.ts                       1.22   164.4±11.68ms    11.3 MB/sec    1.00    135.0±2.86ms    13.8 MB/sec
formatter/jquery.min.js                  1.19     28.2±0.11ms     2.9 MB/sec    1.00     23.7±0.14ms     3.5 MB/sec
formatter/math.js                        1.12    206.9±2.75ms     3.1 MB/sec    1.00    184.1±2.40ms     3.5 MB/sec
formatter/parser.ts                      1.20      5.1±0.05ms     9.4 MB/sec    1.00      4.3±0.07ms    11.4 MB/sec
formatter/pixi.min.js                    1.13    109.5±0.96ms     4.0 MB/sec    1.00     96.6±7.00ms     4.5 MB/sec
formatter/react-dom.production.min.js    1.20     33.9±0.61ms     3.4 MB/sec    1.00     28.3±0.54ms     4.1 MB/sec
formatter/react.production.min.js        1.23  1675.2±27.82µs     3.7 MB/sec    1.00   1366.2±9.63µs     4.5 MB/sec
formatter/router.ts                      1.17      3.8±0.19ms    16.2 MB/sec    1.00      3.2±0.05ms    18.9 MB/sec
formatter/tex-chtml-full.js              1.13    259.6±1.60ms     3.5 MB/sec    1.00    230.2±5.77ms     4.0 MB/sec
formatter/three.min.js                   1.16    126.6±1.49ms     4.6 MB/sec    1.00    109.6±2.92ms     5.4 MB/sec
formatter/typescript.js                  1.14    871.7±3.93ms    10.9 MB/sec    1.00    766.9±4.23ms    12.4 MB/sec
formatter/vue.global.prod.js             1.21     44.1±0.14ms     2.7 MB/sec    1.00     36.3±0.91ms     3.3 MB/sec

Build time

Reduces the release build time of rome_js_formatteralmost by 40% from 10s to 6.4s for a clean build.

Crate Size

Reduces the rome_formatter and rome_js_formatter crate by 287.8Kib (-40%)

Before

 File  .text     Size Crate
 5.7%  17.1% 710.9KiB rome_js_formatter
 0.2%   0.7%  29.6KiB rome_formatter
Total        739.5Kib

After

 File  .text     Size Crate
 3.5%  11.1% 428.7KiB rome_js_formatter
 0.2%   0.6%  23.0KiB rome_formatter
Total        451.7Kib

Allocations

Screenshot 2022-05-31 at 15 05 09

Left: Main
Right: This PR

It mainly reduces the number of small transient object (~128kb) from 200k to 80k. My interpretation is that this is because it's no longer necessary to allocate a vector for every format functions call that return more than one element.

Future Improvements

  • All the "nesting" IR elements still allocate a vector internally or Box a FormatElement. It should be possible to re-write our IR to a flat IR that uses Start and End markers and everything in between would be the content of that element. It should then become possible to only do a single allocation for formatting the whole document which should reduce another massive amount of allocations.
  • Refactor member chain, conditional, and binary expression formatting to not use VecBuffer internally anymore

Open Question

  • Macro names: The format, format_args and write macros clash with Rust's std::fmt macros. It's, therefore, required to have an explicit use at the top of every file and use std::format in places where you want to format a string. Do we have better alternative names for these macros?
  • Macro signature: format, and write have the signature format![options, [...formattable]] and require the explicit [] to separate the formattable. I like the distinction to Rust's built-in macros but it often is difficult to figure out if I've added the right amount of parentheses (and rustfmt formats the arguments a little less nice). Should we remove the [] tokens?

Test Plan

cargo test

TODO

  • Documentation
  • Merge (uff)
  • Placement of helper structs inside rome_formatter and rome_js_formatter.
  • Support BestFitting
  • Code gen

@MichaReiser MichaReiser changed the title Refactor/format and write macros refactor(formatter): Introduce write, format, and format_args macros May 31, 2022
@MichaReiser
Copy link
Contributor Author

!bench_formatter

@cloudflare-pages
Copy link

cloudflare-pages bot commented May 31, 2022

Deploying with  Cloudflare Pages  Cloudflare Pages

Latest commit: 00e8740
Status: ✅  Deploy successful!
Preview URL: https://638ef447.tools-8rn.pages.dev

View logs

@github-actions
Copy link

Formatter Benchmark Results

group                                    main                                   pr
-----                                    ----                                   --
formatter/checker.ts                     1.13    362.4±2.68ms     7.2 MB/sec    1.00    320.7±2.49ms     8.1 MB/sec
formatter/compiler.js                    1.14    218.4±2.60ms     4.8 MB/sec    1.00    192.4±2.33ms     5.4 MB/sec
formatter/d3.min.js                      1.16    169.7±0.82ms  1581.8 KB/sec    1.00    146.8±0.91ms  1828.8 KB/sec
formatter/dojo.js                        1.14     12.3±0.07ms     5.6 MB/sec    1.00     10.8±0.08ms     6.4 MB/sec
formatter/ios.d.ts                       1.17    276.6±0.92ms     6.7 MB/sec    1.00    236.2±0.99ms     7.9 MB/sec
formatter/jquery.min.js                  1.15     46.6±0.35ms  1815.4 KB/sec    1.00     40.4±0.49ms     2.0 MB/sec
formatter/math.js                        1.17    347.1±3.73ms  1910.5 KB/sec    1.00    295.8±4.63ms     2.2 MB/sec
formatter/parser.ts                      1.14      8.6±0.19ms     5.6 MB/sec    1.00      7.5±0.03ms     6.4 MB/sec
formatter/pixi.min.js                    1.15    193.0±0.94ms     2.3 MB/sec    1.00    167.3±1.69ms     2.6 MB/sec
formatter/react-dom.production.min.js    1.16     59.0±0.40ms  1996.1 KB/sec    1.00     50.7±0.57ms     2.3 MB/sec
formatter/react.production.min.js        1.18      2.9±0.23ms     2.1 MB/sec    1.00      2.5±0.04ms     2.5 MB/sec
formatter/router.ts                      1.13      6.3±0.02ms     9.6 MB/sec    1.00      5.6±0.03ms    10.9 MB/sec
formatter/tex-chtml-full.js              1.16    433.8±3.04ms     2.1 MB/sec    1.00    374.4±3.34ms     2.4 MB/sec
formatter/three.min.js                   1.15    221.2±2.65ms     2.7 MB/sec    1.00    191.7±1.64ms     3.1 MB/sec
formatter/typescript.js                  1.13   1375.4±4.87ms     6.9 MB/sec    1.00  1213.7±14.45ms     7.8 MB/sec
formatter/vue.global.prod.js             1.16     77.2±1.47ms  1598.1 KB/sec    1.00     66.5±1.32ms  1856.1 KB/sec

crates/rome_formatter/src/arguments.rs Outdated Show resolved Hide resolved
crates/rome_formatter/src/arguments.rs Outdated Show resolved Hide resolved
@IWANABETHATGUY
Copy link
Contributor

IWANABETHATGUY commented Jun 1, 2022

Hi, would you mind telling me what software did you use in Allocations graph?

@MichaReiser MichaReiser force-pushed the refactor/format-and-write-macros branch from ce4c4d1 to 0b7fc35 Compare June 1, 2022 10:36
@MichaReiser MichaReiser temporarily deployed to aws June 1, 2022 10:36 Inactive
@github-actions
Copy link

github-actions bot commented Jun 1, 2022

@github-actions
Copy link

github-actions bot commented Jun 1, 2022

Parser conformance results on ubuntu-latest

js/262

Test result main count This PR count Difference
Total 45878 45878 0
Passed 44938 44938 0
Failed 940 940 0
Panics 0 0 0
Coverage 97.95% 97.95% 0.00%

jsx/babel

Test result main count This PR count Difference
Total 39 39 0
Passed 36 36 0
Failed 3 3 0
Panics 0 0 0
Coverage 92.31% 92.31% 0.00%

symbols/microsoft

Test result main count This PR count Difference
Total 5946 5946 0
Passed 350 350 0
Failed 5589 5588 ✅ ⏬ -1
Panics 7 8 ❌ ⏫ +1
Coverage 5.89% 5.89% 0.00%
💥 Failed to Panic (3):
moduleAugmentationExtendFileModule1.symbols
moduleAugmentationNoNewNames.symbols
sourceMapValidationDestructuringForArrayBindingPatternDefaultValues2.symbols
⁉️ Panic To Failed (2):
isomorphicMappedTypeInference.symbols
typeGuardsAsAssertions.symbols

ts/babel

Test result main count This PR count Difference
Total 588 588 0
Passed 519 519 0
Failed 69 69 0
Panics 0 0 0
Coverage 88.27% 88.27% 0.00%

ts/microsoft

Test result main count This PR count Difference
Total 16257 16257 0
Passed 12391 12391 0
Failed 3866 3866 0
Panics 0 0 0
Coverage 76.22% 76.22% 0.00%

@MichaReiser
Copy link
Contributor Author

Hi, would you mind telling me what software did you use in Allocations graph?

I'm using cargo instruments on mac and the built in CLion tools on Linux

@MichaReiser MichaReiser temporarily deployed to aws June 1, 2022 10:57 Inactive
@MichaReiser
Copy link
Contributor Author

Interesting how this change is close to zero added lines, considering that it adds quiet a bit of documentation.

@MichaReiser MichaReiser temporarily deployed to aws June 3, 2022 15:15 Inactive
@MichaReiser MichaReiser temporarily deployed to aws June 3, 2022 15:40 Inactive
@MichaReiser
Copy link
Contributor Author

I would add some documentation in the docs of the formatter to explain how to debug the IR. Now we can't use dbg so it's better to have a paragraph to link to the contributors.

Good idea, I added it to the rome_js_formatter README

Copy link
Contributor

@yassere yassere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still plan to read through this in more depth later, but I have no objection to merging it as soon as you feel there's enough consensus (to avoid having to do another tricky rebase).

Assuming people are happy with the basic idea, we can iterate later on the implementation.

@MichaReiser
Copy link
Contributor Author

I still plan to read through this in more depth later, but I have no objection to merging it as soon as you feel there's enough consensus (to avoid having to do another tricky rebase).

Assuming people are happy with the basic idea, we can iterate later on the implementation.

Sounds good. I don't plan to merge this before Monday or Tuesday. I want to spend some time to see if I can reduce the amount of code that depends on generic type parameters

@ematipico
Copy link
Contributor

I still plan to read through this in more depth later, but I have no objection to merging it as soon as you feel there's enough consensus (to avoid having to do another tricky rebase).
Assuming people are happy with the basic idea, we can iterate later on the implementation.

Sounds good. I don't plan to merge this before Monday or Tuesday. I want to spend some time to see if I can reduce the amount of code that depends on generic type parameters

Please let's make Monday, not Tuesday, if possible. I will need to prepare the PR for the release I would like to have some time to prep things. And I would like to not delay the release again.

builder,
result: Ok(()),
options: PhantomData,
unsafe {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would it be possible to also create a builder called best_fit?

While trying this API I was in need to the following case:

let mut list = f.join_with(soft_line_break_or_space());

if something == true {
	let best_fit = best_fit([]);
	list.entry(best_fit);
}

Unless there's a better way to do it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I understand the example. You initalise best fit with empty?

There's no technical reason not to have it. The builder could accept the most flat version when constructing the builder and the most expanded when finishing it.

The reason I'm hesitant from adding it is that the api should encourage devs to add as little variants as possible, ideally all variants should be known at compile time which is what the macro enforces. The reason for this is that best fitting has a quadratic complexity

One way to work around this is by having multiple format_with functions that you then pass to best fitting.

However, I haven't used best fitting myself yet. So there are probably some patterns that we first need to find that work best

Copy link
Contributor

@ematipico ematipico Jun 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I am realizing that now. I am trying to see what we can do. There are also other patterns around separated lists that I am exploring, because there are cases that we have never encountered before (in call arguments for example)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to hear about those!

@MichaReiser MichaReiser temporarily deployed to aws June 4, 2022 07:47 Inactive
@MichaReiser MichaReiser temporarily deployed to aws June 4, 2022 08:18 Inactive
@MichaReiser
Copy link
Contributor Author

!bench_formatter

@github-actions
Copy link

github-actions bot commented Jun 4, 2022

Formatter Benchmark Results

group                                    main                                   pr
-----                                    ----                                   --
formatter/checker.ts                     1.16   446.4±12.99ms     5.8 MB/sec    1.00   383.6±15.85ms     6.8 MB/sec
formatter/compiler.js                    1.16    272.6±6.68ms     3.8 MB/sec    1.00    235.1±8.43ms     4.5 MB/sec
formatter/d3.min.js                      1.10    194.8±7.52ms  1377.6 KB/sec    1.00    177.6±6.42ms  1511.4 KB/sec
formatter/dojo.js                        1.06     13.7±0.61ms     5.0 MB/sec    1.00     13.0±0.47ms     5.3 MB/sec
formatter/ios.d.ts                       1.20   318.7±10.46ms     5.9 MB/sec    1.00    266.3±9.51ms     7.0 MB/sec
formatter/jquery.min.js                  1.13     55.5±3.18ms  1524.1 KB/sec    1.00     49.2±2.03ms  1720.0 KB/sec
formatter/math.js                        1.09   391.3±13.42ms  1694.5 KB/sec    1.00   358.3±14.69ms  1850.7 KB/sec
formatter/parser.ts                      1.20     10.0±0.42ms     4.8 MB/sec    1.00      8.3±0.46ms     5.8 MB/sec
formatter/pixi.min.js                    1.09   226.3±11.22ms  1985.7 KB/sec    1.00    206.8±8.00ms     2.1 MB/sec
formatter/react-dom.production.min.js    1.18     72.7±2.32ms  1620.1 KB/sec    1.00     61.4±2.17ms  1919.5 KB/sec
formatter/react.production.min.js        1.15      3.4±0.12ms  1856.9 KB/sec    1.00      3.0±0.13ms     2.1 MB/sec
formatter/router.ts                      1.16      7.5±0.39ms     8.2 MB/sec    1.00      6.4±0.31ms     9.5 MB/sec
formatter/tex-chtml-full.js              1.04   497.3±18.36ms  1876.5 KB/sec    1.00   477.5±12.54ms  1954.3 KB/sec
formatter/three.min.js                   1.12    261.1±9.95ms     2.2 MB/sec    1.00    233.4±8.74ms     2.5 MB/sec
formatter/typescript.js                  1.13  1609.9±60.97ms     5.9 MB/sec    1.00  1428.6±48.09ms     6.6 MB/sec
formatter/vue.global.prod.js             1.18     95.0±3.99ms  1299.0 KB/sec    1.00     80.5±3.65ms  1533.3 KB/sec

…nting the list formating in the parent node.
@MichaReiser MichaReiser temporarily deployed to aws June 4, 2022 09:59 Inactive
@MichaReiser MichaReiser temporarily deployed to aws June 4, 2022 10:11 Inactive
@MichaReiser MichaReiser force-pushed the refactor/format-and-write-macros branch from 2fee7bf to 1196545 Compare June 4, 2022 10:12
@MichaReiser MichaReiser temporarily deployed to aws June 4, 2022 10:12 Inactive
@MichaReiser MichaReiser temporarily deployed to aws June 6, 2022 10:39 Inactive
@MichaReiser MichaReiser temporarily deployed to aws June 6, 2022 10:44 Inactive
@MichaReiser MichaReiser merged commit 941faf9 into main Jun 6, 2022
@MichaReiser MichaReiser deleted the refactor/format-and-write-macros branch June 6, 2022 10:58
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants