Continuing the discussion started in #9266.
Formatting AST is currently a bit painful. `Macro.to_string/1` is the only function we have that takes AST and returns a somewhat "formatted" string, and `Code.format_string!/2` is the only function that properly formats a string of code according to the formatter. We're missing a piece of the puzzle: something that takes AST and returns properly formatted code. The problem is that the formatter needs a string input in order to gather information about the original shape of the code. For example, the formatter needs to know whether an integer was written as `0x00` or as `0`, because the AST for those two is the same (`0`). In fact, the formatter operates on a slightly modified (but still valid) AST that contains no bare literals. All literals are wrapped inside `{:__block__, meta, [literal]}` tuples so that they can have metadata attached to them. The metadata stores things like the original representation of a literal, whether a block was written as `do:` or `do`/`end`, and so on.
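The gap can be seen in a few lines. This is a minimal sketch that assumes an Elixir release where `Code.string_to_quoted!/2` accepts the `:literal_encoder` option (it postdates this discussion); the `encoder` name is only illustrative, and the formatter's internal pipeline may differ from what is shown here.

```elixir
source = "0x00"

# The plain AST of a literal is the literal itself, so the hex spelling is lost.
plain_ast = Code.string_to_quoted!(source)
from_ast = Macro.to_string(plain_ast)

# Formatting the string directly keeps the original spelling.
from_string = source |> Code.format_string!() |> IO.iodata_to_binary()

# Wrapping every literal in a :__block__ node gives it a place to carry
# metadata, which is roughly the shape of AST the formatter works on.
# `encoder` is an illustrative name, not something Elixir defines.
encoder = fn literal, meta -> {:ok, {:__block__, meta, [literal]}} end
wrapped_ast = Code.string_to_quoted!(source, literal_encoder: encoder)

IO.inspect(plain_ast, label: "plain AST")
IO.inspect(from_ast, label: "Macro.to_string/1")
IO.inspect(from_string, label: "Code.format_string!/2")
IO.inspect(wrapped_ast, label: "wrapped AST")
```

On releases that support the option, the metadata of the wrapped node should carry the original spelling (under a `:token` key), which is the kind of information that lets the formatter print `0x00` rather than `0`.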
With this information, there are two things we can do to add the missing piece of the puzzle.
- allow the formatter to work on normal AST: the first thing that @josevalim and I discussed was allowing the formatter to work on normal AST. This would mean leaving the formatter as it is today for all "formatter AST", but adding handling of normal AST with sane defaults. For example, an integer like `0x00` would be fed to the formatter as its AST (`0`) and hence formatted as `0`. This could work, but the problem is that if a tool wants to read code, turn it into AST, modify the AST, and then call the formatter on the AST, it would still lose a lot of the original formatting intended by the user. This could allow us to completely get rid of `Macro.to_string/1` and only have `Code.format_ast/2`, but we're not sure yet because of the callback supported by `Macro.to_string/2`.
- inject formatter metadata from `Macro.to_string/1`: the alternative we have is to remove the "formatting" from `Macro.to_string/1` and only have `Macro.to_string/1` take AST and decorate it with all the stuff that the formatter needs. This would mean turning every literal into a `:__block__`, adding necessary metadata, and so on.
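To make the "decorate the AST" idea concrete, here is a rough sketch that wraps only integer literals. `DecorateSketch` and `wrap_integers/1` are hypothetical names, and the `:token` metadata key is an assumption about what the formatter would want to read; a real implementation would have to cover every literal type and the `do` block metadata as well.

```elixir
defmodule DecorateSketch do
  # Hypothetical helper, not part of Elixir: turns every bare integer in a
  # plain AST into a {:__block__, meta, [integer]} node.
  def wrap_integers(ast) do
    # postwalk visits children before parents and does not re-traverse the
    # value returned by the callback, so the wrapped integer is not wrapped
    # a second time.
    Macro.postwalk(ast, fn
      int when is_integer(int) ->
        # With no source text available, fall back to the default decimal
        # spelling as the "original" token.
        {:__block__, [token: Integer.to_string(int)], [int]}

      other ->
        other
    end)
  end
end

plain = quote(do: add(1, 2))
decorated = DecorateSketch.wrap_integers(plain)

IO.inspect(plain, label: "plain")
IO.inspect(decorated, label: "decorated")
```

Because the token is derived from the integer itself, this sketch has the same limitation as the first option: an integer originally written as `0x00` still comes out as `0` unless the caller supplies the original spelling.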
I think we're leaning toward the second option because it's likely simpler to implement. Both options would still have to deal with the callback in `Macro.to_string/2`, which might be a bit of a pain (but can't be removed, for backwards compatibility).
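For context, the callback in question receives each AST node together with the string already rendered for it, and returns the string to use instead. The sketch below mirrors the kind of usage shown in the `Macro.to_string/2` documentation; newer Elixir releases deprecate the two-argument form, so treat it as an illustration of the behavior that has to be preserved rather than as recommended usage.

```elixir
ast = quote(do: 1 + 2)

# The callback sees every node and its default rendering, and may replace it.
rendered =
  Macro.to_string(ast, fn
    1, _string -> "one"
    2, _string -> "two"
    _node, string -> string
  end)

IO.puts(rendered)
```

Any formatter-backed replacement would need a way to let a callback like this override the rendering of individual nodes, which is why it complicates both options.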
One other thing to consider is performance. In the formatter, we don't really focus on performance since it's a "static" tool that runs outside of the runtime of an application. On the other hand, `Macro.to_string/1` and `Macro.to_string/2` could be used in production code and are used internally by Elixir itself as well.