Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support (un)marshaling of comments #132

Closed
DirectXMan12 opened this issue Oct 7, 2015 · 22 comments
Closed

Support (un)marshaling of comments #132

DirectXMan12 opened this issue Oct 7, 2015 · 22 comments

Comments

@DirectXMan12
Copy link

Reopening from #129

It would be useful if go-yaml supported outputting comments when marshaling an object into YAML.

The relevant bits of the YAML specification:

Comments can be useful to provide human-readable information about the contents of a YAML document without affecting the actual content of the YAML itself. For instance, comments could be used to provide documentation about fields in generated configuration information, or to point out errors in the configuration (e.g. "key A may only have values X, Y, or Z").

@DirectXMan12
Copy link
Author

We can certainly support comment manipulation, but we'll need to design this a little bit better first.

Namingly, we need to ignore the internal implementation completely and look from the perspective of someone using the feature both to consume comments and to produce them back into the expected locations.

According to the YAML 1.2 specification, comments are not supposed to live anywhere but the presentation form (see http://yaml.org/spec/1.2/spec.html#id2762140 , http://yaml.org/spec/1.2/spec.html#id2767100 , http://yaml.org/spec/1.2/spec.html#id2766150), and are not supposed to be "attached" to any particular node.

Since comments are not supposed to survive into the unmarshaled data structure, and other presentation details (e.g. style) don't survive a round trip of unmarshaling into interface{} and then remarshaling, IMO we don't need to worry about presenting deserialized comments to the user or having the "survive" a round trip.

Currently, it appears that go-yaml supports three ways of influencing the marshaling process:

  • Using MarshalYAML (and/or MarshalText), the user can control the representation stage
  • Using struct tags, the user can affect the representation stage (inline and omitempty) and the presentation stage (flow).
  • Using MapItem and MapSlice, the user can affect the serialization phase (specifically the key ordering part)

In my opinion, the ability to have "dynamic" (i.e. not hard-coded) comments is required (e.g. in the case of indicating validation errors, as I described above). Ergo, while the struct tag method could be an option for adding comments, it cannot be the only option. If we wish to utilize the existing machinery, this means we are left with two options -- either an extension to MarshalYAML that somehow adds in comment comments (this allows for some degree of dynamic comments but makes certain things hard), or attaching comments to MapItem values (as I implemented in #129).

All three of the options presented above, however, suffer from the fact that they necessarily "attach" comments to nodes. According to the specification, this is not fully correct behavior. One solution to this issue would be to implement some sort of callback or filter mechanism that would allow users to influence the low-level event stream used to produce the output. This has a couple benefits: it allows comments to be inserted at arbitrary points in the documents and it would allow users to control other style decisions on the fly. The downsides of this approach is that extra validation would be needed to ensure that the user did not modify the event stream in an invalid way. It's also a bit more complicated for users who have the "annotate fields in their YAML document" use case.

We also need to outline which cases we'll try to support when writing, and which cases are going to cause clear havoc in the output.

The "simplest" case is the case where a comment is on a new line. The slightly more complicated case is for an "end-of-line" comment (scalar values and flow-style values would have to be shifted to the next line if it appeared between a key and a scalar value or flow-style value etc), and the indentation should match up for comment lines after the end-of-line comment. The former case is a must-have, IMO. The latter case would be nice to have.

@niemeyer
Copy link
Contributor

niemeyer commented Oct 7, 2015

Without comments surviving a roundtrip it sounds like a very bad idea to support them at all.

In terms of the specification recommending against attributing meaning to them, that doesn't mean we shouldn't attempt to preserve comments. That's equivalent to how Go itself handles comments in its specification and its standard parser.

Regarding handling end of line comments, this sounds like the easiest to handle, semantically, as it's very clearly assigned to the value on that line.

@DirectXMan12
Copy link
Author

Without comments surviving a roundtrip it sounds like a very bad idea to support them at all.

While I'm certainly willing to implement this, it seems strange, IMO to preserve some presentation details and not others -- why preserve comments and not style, for instance?

Did you have any ideas about how to allow users to specify comments (such as the ideas I presented above?)

@niemeyer
Copy link
Contributor

niemeyer commented Oct 8, 2015

Either we ignore comments, or we don't. If comments are not relevant to be marshaled and unmarshaled, then we ignore them completely which is what we do today. If we decide to not ignore them, then we can't pretend they are irrelevant.

No, I haven't had a chance to stop and design it considering all the problems that may surface.

@niemeyer niemeyer changed the title Support Comments Support (un)marshaling of comments Oct 8, 2015
@yobert
Copy link

yobert commented Oct 14, 2015

So, this tells me we need to do everything through an AST first. Then the AST nodes can keep full roundtrip information for comments, line numbers from the source file, etc.

Most people who don't need that could use the regular API, but other people could do something to get a parse tree before unmarshal?

That sounds like a giant rewrite though... :(

@DirectXMan12
Copy link
Author

@yobert I've been hacking around at exposing a "user-friendly" version of the underlying events-based model (then you could just emit or receive a "comment" event). This way, users that need/want to can build stuff using the event-based model, or by mixing it in with the traditional "use struct tags and/or dicts and slices" model for specific objects. I'll post some code when it's a bit more fleshed out.

The main issue that I've encountered so far is that certain design decisions of the underlying model are not conducive to comments in certain places (e.g. the ':' gets written by the value in a map, so the following is not immediately doable:

foo: # some comment
     # continues here
  bar

(as shown in 6.11)

@jvehent
Copy link

jvehent commented May 27, 2016

Has there been more interest in implementing comment encoding/decoding support? I found this patch from @DirectXMan12 which doesn't appear to work yet, but is an interesting starting point. I'm looking at porting sops over to Go and comment preservation is something we care a lot about.

@DirectXMan12
Copy link
Author

@jvehent I have a more complete patch around somewhere, but I didn't quite get time to finish it. I'll see if I can dig it up.

@alexbrand
Copy link

@DirectXMan12 what's the status of this issue? It would be great to be able to (un)marshal comments.

@jmreicha
Copy link

Any update on this? I also have a use case where I'd like to preserve comments.

@DirectXMan12
Copy link
Author

I don't think I ever quite finished the full round-trip patch, but I'll see if I can dig it up (it had mostly slipped my mind at this point).

@lox
Copy link

lox commented Mar 12, 2018

@rogpeppe will v3 bring anything to the table for something like this? Otherwise it sounds like it would be a massive change.

@niemeyer
Copy link
Contributor

It will bring us closer to it, by having a representation that preserves more of the syntax tree.

@IED8D
Copy link

IED8D commented Jun 20, 2018

@niemeyer can you share more about the timeline of v3. And is there any chance that #219 will be merged to master given the test are improved and conflicts resolved ?

@khimaros
Copy link

khimaros commented Jul 6, 2018

It appears that Mozilla has provided a fork of go-yaml which supports comment parsing: v2...mozilla-services:v2

@rfay
Copy link

rfay commented Jun 4, 2019

This has apparently been fixed in yaml.v3? (explained in announcement. However, the announcement doesn't show how you can actually use or preserve comments, or how you to navigate the new Node structure. Stack Overflow question on this

@martindekov
Copy link

So from documentation, I saw that node can carry itself around with its comments. So this code should return me the comments preceding, inline, and after the (augmented from the stack overflow) yaml:

func main() {
	type Person struct {
		Name    string    `yaml:"name"`
		Address yaml.Node `yaml:"address"`
	}

	data := `
name: John Doe
#comment here
address: 
    street: 123 E 3rd St
    city: Denver
    state: CO
    zip: 81526
`

	var person Person
	err := yaml.Unmarshal([]byte(data), &person)
	if err != nil {
		fmt.Printf("Failed to unmarshall: %v", err)
		os.Exit(1)
	}
	address := &yaml.Node{}
	_ = person.Address.Decode(address)
	fmt.Printf("%v", address.FootComment)
}

Alternated version from the stack overflow, though this should give the comment above the address part, considered as node, I can access Content field though. I hope I am doing something wrong.

@DirectXMan12
Copy link
Author

you can also just deserialize directly to a node tree, instead of having the Person struct in the middle. I can confirm that that works for me.

@jpninanjohn
Copy link

Is there a way I can add comments directly from the struct? Consider I dont yet have a yaml, but I want unmarshal a struct into the yaml with every field having an optional comment. Is this achievable now? I am not able to find any example of this.

@rockholla
Copy link

I've waited for quite a while to see what might happen with comment support in go-yaml or elsewhere. In case it's useful for anyone else, I've been using this for a while as an alternative for being able to marshal yaml with some comments based on a field tag comment: https://github.com/rockholla/go-lib/tree/main/marshal. It's not perfect, but has worked pretty well for me in a lot of cases, e.g. generating template/starter config YAML files from structs.

@jsumners
Copy link

Is there a way I can add comments directly from the struct? Consider I dont yet have a yaml, but I want unmarshal a struct into the yaml with every field having an optional comment. Is this achievable now? I am not able to find any example of this.

This is exactly what I'm looking to do. Is there some sort of recommended practice to solve this? I'd rather not have to maintain a static text file, and instead generate it from the defined structs.

@jroper
Copy link

jroper commented Aug 11, 2023

I'd like to second adding comments directly from the struct.

Round tripping comments/style is a completely separate issue for completely different use cases. As noted earlier, YAML comments are not intended to be used anywhere but in the presentation form. This I believe could be used as an argument for why it doesn't make sense to round trip comments. However, if you are marshalling YAML for the purposes of presenting it to a human - well there's a place where being able to marshal comments is in line with comments intended purpose.

A common use case for this would be in commands similar to kubectl edit, where you are dropped into an editor with a YAML representation of the resource presented to you. If that presentation of the YAML included some documentation for each field in comments, that would greatly improve the user experience for such tools.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests