Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow outputting declaration-level comments in CoreFn #4442

Open
JordanMartinez opened this issue Feb 28, 2023 · 9 comments
Open

Allow outputting declaration-level comments in CoreFn #4442

JordanMartinez opened this issue Feb 28, 2023 · 9 comments

Comments

@JordanMartinez
Copy link
Contributor

Summary

The compiler only outputs module-level comments for CoreFn. I propose all comments are included.

Motivation

The primary use case I have in mind is better supporting the purs-backend-es inline directives. When one wishes to put inline directives into the source code file (as opposed to some external file), all such directives must be appear in the module-level comment because that's the only place where such comments will appear in the outputted CoreFn. Ideally, at least declaration-level comments would be outputted as well, though I don't see why all comments wouldn't be outputted.

What I have to use right now

-- @inline export add arity=2
module Foo where

add :: Int -> Int -> Int
add a b = a + b

What I want:

module Foo where

-- @inline export arity=2
add :: Int -> Int -> Int
add a b = a + b

I'm not sure why non-module comments aren't already outputted. Perhaps this was actually an oversight? AFAICT, including --comments will output such comments on the JS output, but not the CoreFn one.

Proposal

We update the coreFn target to always output comments, regardless of whether the --comments flag is passed. I argue for this for two reasons:

  1. Since corefn is consumed by those outputting some other target than the default js one, consumers will likely include their own options for how to handle comments.
  2. If this isn't the default, I imagine it will produce more harm than good, especially when tools like purs-backend-optimizer are used.
  3. It seems that module-level comments are already outputted, whether the --comments flag is passed or not.

A counterargument is that emitting comments makes the computer do extra work when there are tools that don't use them. Moreover, the resulting corefn files are unnecessarily larger when files have large or many comments.

@rhendric
Copy link
Member

This (in its maximal version, at least) would make comments more than simply lexical whitespace; they would become part of the grammar of the language if they can be associated with other syntax elements in a way that downstream tools might rely upon.

I wonder whether we should consider a proper annotation syntax instead.

@natefaubion
Copy link
Contributor

natefaubion commented Feb 28, 2023

I wrote up this on discourse a while back, but I don't have the time to see it through. It may be useful as a starting point:
https://discourse.purescript.org/t/adding-syntax-for-annotations-on-declarations/988

@natefaubion
Copy link
Contributor

natefaubion commented Feb 28, 2023

Regarding this issue specifically, new annotation syntax and semantics is a significant amount of work involving quite a bit of bikeshedding, whereas including more comments in corefn seems fairly uncontroversial. I believe the main compiler itself emits more comments in it's own output to JS, but just strips them from corefn. Is using comments for tooling annotations kind of janky? Sure, but the compiler doesn't have to know or care about it.

@JordanMartinez
Copy link
Contributor Author

Another argument for doing this is that it allows us to see how others use this to develop new features and tools without supporting it initially. After a while, we have more use cases to help guide what kinds of annotation syntax we might want to provide.

@rhendric
Copy link
Member

I wrote up this on discourse a while back

I very weakly think annotations make more sense as an extension point for external tools than as a universal framework for inelegant core language features like most of the cases you called out in that thread.

I believe the main compiler itself emits more comments in it's own output to JS, but just strips them from corefn.

But that's a best-effort convenience, not something we would specify in a language spec. We emit some, but not all, of those comments, and sometimes they're in awkward places, and too bad if you rely on that downstream and we change things on you. If we instead started offering guarantees about which comments in which locations would be associated with which AST nodes, that opens up its own can of bikeshed-painting worms. Answering those questions—even with a ‘inline annotations aren't allowed here’ response as a default—is isomorphic to the process of defining a new annotation syntax that we choose to make backward-compatible with comments. I'm saying let's be explicit about that choice, because neutrality isn't an option—either way, we are adding annotations as a language feature and we'll have to pay the price of changing a language feature if we want to revise the design later.

@rhendric
Copy link
Member

rhendric commented Mar 1, 2023

A relatively simple example:

x =
  -- @allow-overflow
  a * b + c

We need to decide:

  • Will the CoreFn include this comment? i.e., can a backend rely on it being available?
  • If so, which CoreFn node holds it:
    • a?
    • App mul a?
    • App (App mul a) b?
    • App add (App (App mul a) b)?
    • App (App add (App (App mul a) b)) c?
    • NonRec x (App (App add (...)) c)?

There is no leaving this bikeshed unpainted. It must have a color, even if that color is ‘this comment is not included because it isn't in a valid position for inclusion’. Defining the set of valid positions relative to the language grammar is defining an annotation syntax. Simply turning off the comment filtering and letting the chips fall as they may is abdicating responsibility for the decision, but the decision is still being made and its consequences will still have effect.

@natefaubion
Copy link
Contributor

natefaubion commented Mar 1, 2023

If we instead started offering guarantees about which comments in which locations would be associated with which AST nodes.

Should it be done either way? As you said, it's already hit and miss, and sometimes in awkward locations. I guess it's not clear to me why the compiler is doing this at all then. Regardless, I don't really agree that such bikeshedding is equivalent to annotations since you first have to agree on annotations syntax, declarations, and semantics.

@rhendric
Copy link
Member

rhendric commented Mar 1, 2023

I guess it's not clear to me why the compiler is doing this at all then.

I see it as part of the first bullet point on https://www.purescript.org/: we try to emit readable JavaScript, and some best-effort copying of comments to output probably helps that goal more than it hurts it when we miss.

I don't really agree that such bikeshedding is equivalent to annotations since you first have to agree on annotations syntax, declarations, and semantics.

Am I being too pedantic? I'm saying that going with comments as annotations smuggles agreement on those questions along with it: syntax is the syntax of comments, we don't require them to be declared anywhere, and semantics are that annotations are free-text strings to be interpreted entirely by the external tool consuming them.

Steel-manning your position, I think you're trying to point out that an annotation proposal should be expected to look more like the one in your Discourse thread, which would probably have more involved answers to those questions, and thus would take more time to design, but that's not necessarily true. Look at docstrings in Python, for example—and yes, culturally there's a rich ecosystem of rules for how to write docstrings that has evolved over years but as far as the core language is concerned, they're just string literals used as the first statement in a function or class declaration, and the language only specifies that when such a literal is present it's exposed as the __doc__ attribute on the relevant object.

What I'm saying is that comments-as-annotations should be considered one possible design for annotations rather than a free option that doesn't require us to make design decisions. It is certainly cheaper than some other designs for annotations I could imagine, but it still involves some decisions (including the positioning questions I'm raising) and some costs, and I think we should consider alternatives before jumping to implement it.

@natefaubion
Copy link
Contributor

natefaubion commented Mar 1, 2023

I'm suggesting that:

  • The compiler already guarantees comments for the module header (It's not actually clear at all if this is the case, since the annotations type it emits also contains "comments", it just always emits the empty list).
  • The compiler already emits additional comment syntax for it's own backend, but doesn't let any other backend do that.
  • The compiler can just not filter comments, or it can define a set of comments that it will emit, and that have nothing to do with "annotations."

Yes, any and all comments can be used for annotations. And yes, I'm using them right now for the module header even though I don't believe there is any guarantee from the compiler that these comments will be there in the future (based upon it's other treatment of comments).

My argument for annotations bikeshedding comes from the assumption that users will want some sort of structure, as that is the expectation for this feature in other languages that support something officially called "annotations" (Java, C#, Rust, ..). Python doc comments are the equivalent of specifying which comments we will emit in CoreFn. You can put "strings" anywhere in python, but only strings in certain positions are preserved as metadata on declarations. It's not obvious to me why we would go through the effort of bikeshedding a completely new language feature for something that primitive.

Edit: I was mistaken about the comments in annotations. It does not emit that field.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants