Take advantage of the formatter in Macro.to_string/1,2 #9291

@whatyouhide

Continuing the discussion started in #9266.

Formatting AST is currently a bit painful. Macro.to_string/1 is the only function we have that takes AST and returns a somewhat "formatted" string, and Code.format_string!/2 is the only function that properly formats (according to the formatter) a string of code. We're missing a piece of the puzzle: something that takes AST and returns properly formatted code. The problem is that the formatter needs a string as input in order to gather information about the original shape of the code. For example, the formatter needs to know whether an integer was written as 0x00 or as 0, because the AST for those two is the same (0). In fact, the formatter operates on a slightly modified (but still valid) AST that contains no bare literals: every literal is wrapped in a {:__block__, meta, [literal]} tuple so that it can have metadata attached. The metadata stores things like the original representation of a literal, whether a block was written as do: or do/end, and so on.
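
To make the loss of information concrete, here is a minimal sketch. The first half uses only Code.string_to_quoted!/1, Macro.to_string/1 and Code.format_string!/1. The second half assumes an Elixir version that exposes the literal_encoder and token_metadata parser options publicly; these may not be the options the formatter used internally when this issue was written.

```elixir
# Both spellings parse to the same AST: the bare integer 0.
hex_ast = Code.string_to_quoted!("0x00")
dec_ast = Code.string_to_quoted!("0")

# Macro.to_string/1 only sees the AST, so the hex spelling is gone.
from_ast = Macro.to_string(hex_ast)

# Code.format_string!/1 starts from the source string, so it can keep
# the original spelling. It returns iodata, hence the conversion.
from_string = "0x00" |> Code.format_string!() |> IO.iodata_to_binary()

# Assumption: the parser accepts :literal_encoder and :token_metadata.
# The encoder wraps every literal in a :__block__ tuple, and the metadata
# is where the parser can record the original token for the formatter.
wrapped =
  Code.string_to_quoted!("0x00",
    literal_encoder: &{:ok, {:__block__, &2, [&1]}},
    token_metadata: true
  )
```

Here hex_ast and dec_ast are indistinguishable, while wrapped has the {:__block__, meta, [0]} shape described above.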

With this in mind, there are two things we can do to add the missing piece of the puzzle.

  1. allow the formatter to work on normal AST: the first thing that @josevalim and I discussed was allowing the formatter to work on normal AST. This would mean leaving the formatter as it is today for all "formatter AST", but adding handling of normal AST with sensible defaults. For example, an integer like 0x00 would be fed to the formatter as its AST (0) and hence formatted as 0. This could work, but the problem is that if a tool wants to read code, turn it into AST, modify the AST, and then call the formatter on the AST, it would still lose a lot of the formatting the user originally intended. This option could allow us to completely get rid of Macro.to_string/1 and only have Code.format_ast/2, but we're not sure yet because of the callback supported by Macro.to_string/2.

  2. inject formatter metadata from Macro.to_string/1: the alternative is to remove the "formatting" from Macro.to_string/1 itself and have it only take AST and decorate it with everything the formatter needs, leaving the actual formatting to the formatter. This would mean turning every literal into a :__block__, adding the necessary metadata, and so on.
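
As a rough illustration of the "decorate" step in option 2, the sketch below wraps numeric literals in :__block__ tuples. LiteralWrapper is a hypothetical module, not part of Elixir, and the token metadata key it writes is an assumption about what the formatter would want. It handles numbers only: strings, atoms, lists, keyword syntax and forms where an integer is not a standalone literal (such as &1) would all need dedicated handling in a real implementation.

```elixir
defmodule LiteralWrapper do
  # Hypothetical helper for illustration only.
  def wrap(ast) do
    # postwalk visits children before the node itself, so a freshly
    # created wrapper is never walked (and wrapped) a second time.
    Macro.postwalk(ast, fn
      number when is_number(number) ->
        # Assumption: with no source string available, the default
        # representation from inspect/1 stands in for the original token.
        {:__block__, [token: inspect(number)], [number]}

      other ->
        other
    end)
  end
end

wrapped_sum = LiteralWrapper.wrap(quote(do: 1 + 2))
```

Macro.prewalk/2 would not work here, because it descends into the wrapper it has just produced and wraps the literal again without end.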

I think we're leaning toward option 2 because it's likely simpler to implement. Both options would still have to deal with the callback in Macro.to_string/2, which might be a bit of a pain (but it can't be removed, for backwards compatibility).
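
For context, this is roughly how the Macro.to_string/2 callback is used: it receives each AST node together with the string already produced for it, and returns the string to use instead. The sketch assumes an Elixir version where Macro.to_string/2 is still available; later releases deprecate it, so it may emit a warning.

```elixir
ast = quote(do: 1 + 2)

# The callback runs for every node; returning the given string unchanged
# keeps the default rendering for that node.
rendered =
  Macro.to_string(ast, fn
    1, _string -> "one"
    2, _string -> "two"
    _node, string -> string
  end)
```

Because the callback works on per-node strings, any approach that hands the whole AST to the formatter has to decide where these replacement strings fit in.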

One other thing to consider is performance. In the formatter, we don't really focus on performance since it's a "static" tool that runs outside of the runtime of an application. On the other hand, Macro.to_string/1,2 could be used in production code and is used internally by Elixir itself as well.
