Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bleep's build normalize command removes comments from build file #350

Open
carlosedp opened this issue Aug 23, 2023 · 6 comments
Open

Bleep's build normalize command removes comments from build file #350

carlosedp opened this issue Aug 23, 2023 · 6 comments

Comments

@carlosedp
Copy link

I have some comments in my build file, like:

  httpserver-agent:
    # Use this project to generate graalvm native image metadata
    dependsOn: httpserver
    extends: template-common
    platform:
      jvmOptions: -agentlib:native-image-agent=config-merge-dir=httpserver/src/resources/META-INF/native-image
      mainClass: ZioHttpApp

But when I run bleep build normalize, the comment is removed. I also tried with the comment inlined with the project name but it also vanishes.

Would be nice if they could be kept as a kind of self-documenting feature.

@oyvindberg
Copy link
Owner

Yeah, it's unfortunate.

The reason why the comments disappear is that the YAML file is immediately read into a JSON AST when it's parsed, so comments are discarded right away.

Either we need to re-implement the handling of YAML to use a comment-aware library, or we need to come up with a hack. A hack could be to parse out the comments ourselves and patch then into the newly generated YAML file ourselves

@sideeffffect
Copy link
Contributor

Or introduce custom "comment" field(s).

@oyvindberg
Copy link
Owner

that's a pretty good idea actually!

@carlosedp
Copy link
Author

That would be great and waaaay simpler :)

@SlowBrainDude
Copy link
Contributor

SlowBrainDude commented Sep 17, 2023

I've also run into this. Very unfortunate, as it deleted commented out dependencies while I was experimenting with different versions of stuff.

I don't think including comments as regular part of the build model is a good solution.

There is no value in having comments on the model as comments have no semantics in the build…

Also: Pretending to support YAML configs but than not supporting one of the distinctive features that make YAML a slightly better choice for configs than JSON, namely being able to put comments in your config, is not only a clear volition of the principle of least surprise but actually leaks implementation details straight into the face of the user.

I was at first not even sure I like the idea this tool uses YAML for its config.

YAML is one of the worst data formats that ever "happened", and one should usually avoid it like the plague, as YAML is straight from hell.

But I came to the conclusion that YAML is actually a good choice for this tool here. You don't invent any programming language embedded in YAML, and the build model is supposed to be simple out of principle. For such a "simple" use-case YAML is actually good, as it's really noise free, and has convenient features like schema support.

But when you choose YAML, it should actually work properly. YAML supports comments, so comments need to be supported by the tool… (Everything else is unexpected for the user, and defeats to some degree the point of using YAML in the first place.)

So I've investigated the issue a little bit.

First conclusion: The situation is frankly dire…

First of all there are not much YAML processing libs that support YAML comments at all.

Part of the reason is likely that the YAML "spec" is just nuts!

Even there are hundreds of lines of spec that specify all the crazy YAML stuff around comments (yes, YAML is so badly messed up that it needs hundreds of lines to specify comments!) it also says at least a few times that you should not process the comments… Comments are "presentation details" and processing tools should "ignore" presentation details; even the spec also says a few times how important comments are for humans; and the word "comment" appears dozens of times even on the formal spec parts… Self contradicting nonsense… But who would expect otherwise, that's YAML…

There is likely one pure Scala lib which would currently even make it possible to parse comments out of YAML:

https://github.com/aml-org/syaml

But that's a very low level lib… (Also not broadly know. I've never heard of it before.)

The most popular pure Scala YAML lib at the moment seems to be:

https://github.com/VirtusLab/scala-yaml

But they completely ignore comments. Would be likely a middle sized project to change that.

They follow the YAML spec quite closely, and the YAML spec says you should ignore comments. So you would likely need to hack something on the side in direct opposition to the words of the spec… (OK, maybe not, as the spec also has some notion of "YAML Information Model" which actually includes Comments; just that's not what scala-yaml implements; it seems they implement the "Representation Model". But who knows, YAML is self contradictory trash!)

https://yaml.org/spec/1.2.2/#32-information-models

The other Scala libs doing YAML are the same as the Circe addition: They just fake YAML on top of JSON, which comes with the know limitations…

The other option would be to use some Java YAML lib. There is actually something that looks promising, at least regarding the features:

https://github.com/decorators-squad/eo-yaml

But of course, it's Java; and it doesn't support codec derivation…

One would need to rework the bleep model to base it on that. This looks like a lot of work (even this would be at least possible).

Imho the best approach here would be to extend scala-yaml so it supports comments, use this support in something like https://github.com/armanbilge/circe-scala-yaml and extend the bleep models to support comments on top of that. This is a lot of work… (And one would need to figure out how to actually support comments at all… Especially when it comes to codec derivation.)

Just a very basic and rough idea I had yesterday: Maybe you could add a "presentation details" field on the YAML Node type, and put comments (and their syntactic properties like the indentation level) there. Then one could have some codec derivation option which would require your model types to include Option-typed _commentBefore and _comment fields to which "maybe comments" could be copied from the Node. (The naming is "stolen" from the YAML AST found in https://eemeli.org/yaml/#yaml).

As the underscores in the field names suggest this wouldn't be part of the actual bleep build model, but a kind of internal machinery to carry comments though transformations. It would be likely more clean to have this info somehow separated (as the actual YAML spec would in some kind suggest), but than it would be really really hard to come with some automatic codec derivation. So the _commentBefore and _comment fields are a kind of hack, I admit, but the best I came up so far. The issue at the core is of course that YAML doesn't specify how to do it "correctly"…

@oyvindberg
Copy link
Owner

Thanks for researching. I agree that it looks basically impossible right now to get the comments all the way through yaml -> scala -> yaml. It would also heavily affect the bleep model classes, at that point you need to extend all the structures to be able to carry comments as well. I did this in ScalablyTyped and it's infuriating to do well.

So this leaves us with two options:

  • The comment field suggested above. An easy to implement solution, and also useful because we can render it in the user interface. for instance if a project has a description we can show it in the project list or in zsh auto-completions.
  • Patching. in theory there is nothing stopping us from writing some code which, at the textual level, picks out all comments from the old yaml file along with the yaml line they are attached to, and after we generate the new yaml file to patch them back in. this could be a fun project I think, as long as we accept that some low percentage of comments will disappear if the anchor yaml fragments change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants