Automated code formatter #59

Open
bbarenblat opened this issue Nov 6, 2016 · 9 comments

Comments

@bbarenblat
Member

bbarenblat commented Nov 6, 2016

Ur/Web should have an automated code formatter à la clang-format or YAPF. That is, we should have a tool which parses Ur/Web code like the compiler and then pretty-prints it.

Most of the framework for this already exists – we have a parser and a pretty-printer, and you can invoke only those parts of the compiler with urweb -stop parse. However, urweb does quite a bit of desugaring in the parse stage, so the earliest parse tree looks rather different from the original source file. For example, running urweb -stop parse stuff, where stuff.ur contains

val main : transaction page = return <xml><body>Hello, world!</body></xml>

gives

structure Stuff =
 struct
  val
   main : transaction page =
    return
     (Basis.tag Basis.null Basis.None Basis.noStyle Basis.None {}
       (body {}) (Basis.cdata "Hello, world!"))
  end
 export Stuff

which is an equivalent program, but not any easier to read.

I’d implement this by moving desugaring into its own phase so that the immediate parse tree has a one-to-one correspondence with the original code. This will require either introducing a new ‘pre-parse tree’ data structure or expanding the parse tree type, since the parser currently throws away a good amount of information about concrete syntax during the parse step. Then, we’d need to tweak the pretty-printer until it emits a nice-looking parse tree.
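To make the proposed split concrete, here is a minimal sketch in Python rather than SML. The node types (`Text`, `Element`) and both functions are hypothetical and are not Ur/Web's actual data structures; the desugared output format is a simplified stand-in for what `urweb -stop parse` really prints. The point is only that one concrete tree can feed both a formatter and a separate desugaring phase.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Text:
    """Literal character data inside an XML element (hypothetical node)."""
    value: str


@dataclass
class Element:
    """An XML element as written: tag name plus ordered children (hypothetical node)."""
    tag: str
    children: List[Union["Element", Text]] = field(default_factory=list)


Node = Union[Element, Text]


def pretty(node: Node) -> str:
    """Formatter path: print the concrete tree back out as surface syntax."""
    if isinstance(node, Text):
        return node.value
    inner = "".join(pretty(child) for child in node.children)
    return f"<{node.tag}>{inner}</{node.tag}>"


def desugar(node: Node) -> str:
    """Compiler path: lower the concrete tree to combinator-style calls.

    The output shape is an assumption made for illustration only.
    """
    if isinstance(node, Text):
        return f'(cdata "{node.value}")'
    args = " ".join(desugar(child) for child in node.children)
    return f"(tag {node.tag} {args})" if args else f"(tag {node.tag})"


# The same tree serves both consumers.
example = Element("body", [Text("Hello, world!")])
formatted = pretty(example)
lowered = desugar(example)
```

Because `pretty` never sees the lowered form, the formatter's output stays close to what the programmer wrote, while `desugar` remains free to produce whatever the later compiler phases expect.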

@achlipala
Contributor

This sounds like a value-added feature, but I'm not planning to tackle it myself in the foreseeable future.

@ashalkhakov
Contributor

What if we implement a separate parser for this task? How hard could that be?

Some day a separate parser might be useful regardless of this use-case, e.g. if you want to plug into an IDE.

@achlipala
Contributor

Yeah, building a separate parser would probably be a pretty reasonable job. It would probably take me about a week... but I don't see this rising to the top of my personal to-do list soon.

@ashalkhakov
Contributor

If you can provide some details, then I guess it shouldn't be too hard to tease out the lexer/parser from the rest of Ur/Web. The first version would use SML, but we could rewrite it in Ur/Web too, I guess (to make it readily available in the browser, for instance).

@achlipala
Contributor

Files src/urweb.grm and src/urweb.lex are already the freestanding parser and lexer. No teasing-apart required!

@bbarenblat
Member Author

I think a bit of teasing-apart may still need to happen – the parser throws away information about the concrete syntax of the source files. For instance, if you specify classes with class="foo bar baz", the parser desugars that to class={classes (classes foo bar) baz} or something like that.

The first step really is to split out the desugaring code from the parser code. Instead of having the parser directly produce an AST, we should have the parser produce, well, a parse tree – a data structure that has a one-to-one correspondence with the concrete source syntax but is more easily manipulated to do things like pretty-printing. Then, desugaring should be its own compiler phase.
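The `class` attribute case can be sketched as follows. This is Python rather than SML, `ClassAttr` and both functions are hypothetical names, and the desugared shape simply follows the "or something like that" example above rather than the compiler's verified output.

```python
from dataclasses import dataclass


@dataclass
class ClassAttr:
    """Concrete-syntax node: keeps the attribute text exactly as written."""
    raw: str  # e.g. "foo bar baz"


def print_attr(attr: ClassAttr) -> str:
    """Formatter path: the original spelling survives untouched."""
    return f'class="{attr.raw}"'


def desugar_attr(attr: ClassAttr) -> str:
    """Separate phase: fold the class names left-to-right into nested calls."""
    names = attr.raw.split()
    if not names:
        raise ValueError("an empty class list is out of scope for this sketch")
    expr = names[0]
    for name in names[1:]:
        # Parenthesize the accumulated expression once it is itself a call.
        left = f"({expr})" if " " in expr else expr
        expr = f"classes {left} {name}"
    return f"class={{{expr}}}"


attr = ClassAttr("foo bar baz")
as_written = print_attr(attr)
as_lowered = desugar_attr(attr)
```

If the parser emitted only the lowered form, a formatter would have no way to recover the original quoted string, which is the information loss described above.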

@achlipala
Contributor

I agree that all that sounds useful, but I don't expect to spend time building that alternative parser myself. Presumably the literal ml-yacc grammar in Ur/Web's source can be used, with different semantic actions that avoid desugaring.
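A rough sketch of the "same grammar, different semantic actions" idea, in Python rather than ml-yacc: a single regular expression stands in for one grammar production, and the action is passed in as a callback. All names here (`parse_attr`, `keep_concrete`, `desugar_now`) are hypothetical, and the desugared result is a simplified stand-in.

```python
import re
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")

# Stand-in for a single grammar production: name="value"
ATTR = re.compile(r'\s*([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"\s*')


def parse_attr(source: str, on_attr: Callable[[str, str], T]) -> T:
    """Recognize one attribute and hand the pieces to the supplied action."""
    match = ATTR.fullmatch(source)
    if match is None:
        raise SyntaxError(f"not an attribute: {source!r}")
    return on_attr(match.group(1), match.group(2))


def keep_concrete(name: str, value: str) -> Tuple[str, str]:
    """Action for a formatter: keep the value text exactly as written."""
    return (name, value)


def desugar_now(name: str, value: str) -> Tuple[str, List[str]]:
    """Action for a compiler: split the value eagerly, losing its spelling."""
    return (name, value.split())


concrete = parse_attr('class="foo  bar baz"', keep_concrete)
lowered = parse_attr('class="foo  bar baz"', desugar_now)
```

The recognizer is shared; only the action decides whether details such as the double space between `foo` and `bar` are preserved or discarded.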

@ashalkhakov
Contributor

I've been working on that tool: https://github.com/ashalkhakov/urwebfmt (because urweb-mode formatting is broken for me and I couldn't figure out how to fix it).

@achlipala
Contributor

Regrettably, there are plenty of known issues with urweb-mode that I'm used to working around with manual character-by-character editing. That's what happens when I copy another mode whose code I don't understand well, to use as the base for mine!


3 participants