Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use yaml.v3 instead of modified yaml.v2 for handling YAML files #791

Merged
merged 15 commits into from
Feb 21, 2021

Conversation

felixfontein
Copy link
Contributor

This is my first shot at using yaml.v3 instead of the modified version of yaml.v2 for handling YAML files.

So far, I only implemented reading YAML files. My first conclusions:

  • yaml.v3 seems to also have comment handling problems (comments placed in wrong places).
  • yaml.v3 only supports one YAML document per file.

@autrilla
Copy link
Contributor

This is my first shot at using yaml.v3 instead of the modified version of yaml.v2 for handling YAML files.

Great! Glad to see someone taking a stab at this.

So far, I only implemented reading YAML files. My first conclusions:

  • yaml.v3 seems to also have comment handling problems (comments placed in wrong places).

Well, that's very unfortunate. That's the biggest benefit I expected from migrating to v3. Even so, longer term I think it's better to migrate, since I imagine contributions to v3 will be more common than to the mozilla-services fork, and eventually it'll all be ironed out.

  • yaml.v3 only supports one YAML document per file.

From reading the documentation (probably less thoroughly than you did, though), ISTM that it's supported, but through the more cumbersome Decoder API.

@felixfontein
Copy link
Contributor Author

  • yaml.v3 seems to also have comment handling problems (comments placed in wrong places).

Well, that's very unfortunate. That's the biggest benefit I expected from migrating to v3. Even so, longer term I think it's better to migrate, since I imagine contributions to v3 will be more common than to the mozilla-services fork, and eventually it'll all be ironed out.

I hope so too!

  • yaml.v3 only supports one YAML document per file.

From reading the documentation (probably less thoroughly than you did, though), ISTM that it's supported, but through the more cumbersome Decoder API.

I might not have done that. I mainly saw the note in https://github.com/go-yaml/yaml/tree/v3#compatibility that it only supports one document. The Decoder API sounds interesting, I have to take a closer look at how it works.

For now I implemented a hack which simply splits YAML files by "\n---\n" and parses every part independently.

@felixfontein
Copy link
Contributor Author

Decoder was a good idea, df78351 implements it instead of the hack.

@felixfontein
Copy link
Contributor Author

Ok, I think this is a first working version.

It unfortunately fails tests due to a problem with comments that was fixed with the modified yaml.v2 copy, namely TestComment3 fails with difference:

@@ -4,5 +4,5 @@
     podLabels:
+        jobLabel: node-exporter
         ## Add the 'node-exporter' label to be used by serviceMonitor to match standard common usage in rules and grafana dashboards
         ##
-        jobLabel: node-exporter
     extraArgs:

@autrilla
Copy link
Contributor

This is great to see! Have you found any instances that the previous version did not handle well, but this one does?

@felixfontein
Copy link
Contributor Author

I've cherry-picked #793 into this PR so I can fully compile sops from this PR. So far I've been mostly sticking to the YAML store tests, since they work without the INI store that doesn't compile. I'll now do some more tests...

@felixfontein
Copy link
Contributor Author

One improvement of this branch I found: if encrypting something like

a:
  b:
    c: d

# comment
e: f

the comment was moved to the c: d level with the old version. The new version keeps it at the correct level.

Also, indentation of list elements differs (IMO it is better): instead of

list:
-   item:
    -   another item

you get

list:
    - item:
        - another_item

@autrilla
Copy link
Contributor

Ah, good, so at least we're also getting some improvement. Because of the indentation change, we should recommend to our users to "rotate" all their files with the new release, to avoid difficult to read diffs in the future.

@felixfontein
Copy link
Contributor Author

Yes, definitely. This should also be more than a bugfix release, since this is potentially breaking (especially since comments might "jump" to new places, I guess this already was the cause for #695 (comment), and this PR will have the same problem).

I've tried fixing some of the YAML problems in yaml.v3 in go-yaml/yaml#684.

@felixfontein
Copy link
Contributor Author

I think I'll comment out the tests that wait for the yaml.v3 fix, so maybe we can get this merged already (and update again later once the yaml.v3 fix lands).

@felixfontein
Copy link
Contributor Author

@autrilla I cannot reproduce the functional test errors locally:

$ ./sops -d res/comments.enc.yaml
# first comment in file
lorem: ipsum
# this-is-a-comment
dolor: sit
$ ./sops -d res/comments_unencrypted_comments.yaml 
[SOPS]	 WARN[0000] Found possibly unencrypted comment in file. This is to be expected if the file being decrypted was created with an older version of SOPS.  comment=" first comment in file"
[SOPS]	 WARN[0000] Found possibly unencrypted comment in file. This is to be expected if the file being decrypted was created with an older version of SOPS.  comment=" this-is-a-comment"
# first comment in file
lorem: ipsum
# this-is-a-comment
dolor: sit

From how I interpret functional-tests/src/lib.rs, it executes the two above commands and expects first comment in file and this-is-a-comment to show up in stdout. Which they do in the above runs.

@felixfontein
Copy link
Contributor Author

@autrilla any idea what could be the problem here?

@autrilla
Copy link
Contributor

autrilla commented Feb 9, 2021

No idea. It passes locally. Let's try rerunning the action.

@felixfontein
Copy link
Contributor Author

Ok, so from the debug output, the first comment is really missing. So for some reason the binary built for the functional tests seems to use a different yaml.v3 version.

@autrilla
Copy link
Contributor

autrilla commented Feb 10, 2021 via email

@felixfontein
Copy link
Contributor Author

@autrilla I noticed that the build step (whose result is used in the functional tests) uses the -v argument, which prints the package names as they are being compiled. It lists two YAML packages:

gopkg.in/yaml.v2
github.com/mozilla-services/yaml

But it does not list gopkg.in/yaml.v3. So whatever it is building in that step, it is NOT the version from this branch.

Maybe the checkout command needs to be adjusted to check out the repo in the correct place ($GOPATH/src/go.mozilla.org/sops/), maybe then it will work better? Maybe similar to here: https://github.com/go-yaml/yaml/blob/v3/.github/workflows/go.yaml#L28-L31

@autrilla
Copy link
Contributor

That's weird and concerning, glad you found out. ISTM like we might be building the code from master every time in Github Actions, which is... not what we want to run PR tests against.

@felixfontein
Copy link
Contributor Author

I've squashed the commits and created a new PR for the CI fix commit: #820

@felixfontein felixfontein changed the title [WIP] Use yaml.v3 instead of modified yaml.v2 for handling YAML files Use yaml.v3 instead of modified yaml.v2 for handling YAML files Feb 21, 2021
@felixfontein
Copy link
Contributor Author

I think I'll comment out the tests that wait for the yaml.v3 fix, so maybe we can get this merged already (and update again later once the yaml.v3 fix lands).

Now that tests finally pass: @autrilla, what do you think?

Copy link
Contributor

@autrilla autrilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!

@autrilla autrilla merged commit e2d6d0f into getsops:develop Feb 21, 2021
@felixfontein felixfontein deleted the yaml-v3 branch February 21, 2021 20:04
@felixfontein
Copy link
Contributor Author

@autrilla thanks for reviewing and merging both PRs!

@ajvb ajvb mentioned this pull request Mar 24, 2021
mapping.Kind = yaml.MappingNode
// Marshal branch to global mapping node
store.appendTreeBranch(branch, &mapping)
if len(mapping.Content) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you tell me what this does?
This looks cause #907 , and removing this looks cause nothing bad (as I don't know what good this does).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #908

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is needed so that empty document works well with go-yaml/yaml#690. Unfortunately that fix for yaml.v3 isn't progressing, so I guess for now we can remove that branch. Once that fix landed in yaml.v3, this branch is definitely needed though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested the behavior of sops with your implementation of yaml.v3 by replacing in go.mod:

replace gopkg.in/yaml.v3 v3.0.0-20210107172259-749611fa9fcc => github.com/felixfontein/yaml v0.0.0-20210209202929-35d69a41298b

(35d69a41298b is the last commit of https://github.com/felixfontein/yaml/tree/improve-empty-document-handling)

It results sops confuse "no document" and "empty mapping":

# echo -e "# no document\n---\n# empty mapping\n{}\n" | sops --input-type yaml --output-type yaml -e /dev/stdin | sops --input-type yaml --output-type yaml -d /dev/stdin
# no document
---
# empty mapping

They are distinguished in yaml: http://ben-kiki.org/ypaste/data/21512/index.html

Unfortunately, the internal structure of sops (TreeBranch) cannot distinguish "no document" and "empty mapping".
I think we need to extend TreeBranch to have a flag that indicates "no document".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It results sops confuse "no document" and "empty mapping":
Unfortunately, the internal structure of sops (TreeBranch) cannot distinguish "no document" and "empty mapping".

Exactly. The need to represent "no document" seems to be more important than "empty mapping" - see #695 (comment). I haven't seen (or at least can't remember :) ) any request yet for being able to recover {}.

I think we need to extend TreeBranch to have a flag that indicates "no document".

I don't think it can be solved that simply. While this works when cycling between YAML and the internal data structures, it does not help at all when a YAML file is encrypted. Adjusting the metadata won't help since that will not work for multi-document YAML files (metadata is only there once per YAML file).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants