Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add json.marshal_with_options() builtin for indented/"pretty-printed" and/or line-prefixed JSON #6636

Conversation

sean-r-williams
Copy link
Contributor

@sean-r-williams sean-r-williams commented Mar 16, 2024

Why the changes in this PR are needed?

Currently, OPA only outputs "compressed" JSON when marshaling - that is, JSON with no newlines or indentation between elements. This is favorable for automated consumption, but hard for humans to read.

When the outputs of an OPA policy are used in a human-readable area, indented (or "pretty-printed") JSON is more preferable. A great example of this is SpaceLift notification policies - the pull_request.body output is expected to be Markdown and attached to a PR. This example in their docs shows how this works in practice.

What are the changes in this PR?

This PR adds a new json.marshal_with_options builtin, which wraps over Go's json.MarshalIndent() function. The function behaves similarly to json.marshal, but has an additional parameter for encoding options. The supported options (indent and prefix align with the superset features in MarshalIndent(), but could be extended with additional features later if needed.

If this is null, the function defaults to indenting with \t and no line prefixes.

Fixes #6630.

Notes to assist PR review:

I wrote this feature on a Windows host and had a lot of trouble getting tests to run. A lot of tests are failing with path-related errors, but FWICT these are all unrelated to this change. Let me know if there's something I missed.

I'm not sure if I need to squash my commits before or after getting the PR approved, so I left them as-is for now.

Further comments:

Some other implementations I considered:

  • Supporting optional indentation parameter As I discovered, Rego does not currently support variadic functions with returned data, so this is a non-starter.
  • Exposing full json.MarshalIndent() functionality from Go (e.g. 3rd prefix argument): I wasn't sure how frequently this is used in practice and wanted to limit the amount of boilerplate parameters users might have to provide. I've added support for the prefix parameter from MarshalIndent - see this comment for more details. I'm happy to align marshal_with_options() with its corresponding Go function if this is something others deem valuable to do this now before there's any dependency on the function's prior implementation.

Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Copy link

netlify bot commented Mar 16, 2024

Deploy Preview for openpolicyagent ready!

Name Link
🔨 Latest commit 176fc95
🔍 Latest deploy log https://app.netlify.com/sites/openpolicyagent/deploys/65f5573bdd18840008796f10
😎 Deploy Preview https://deploy-preview-6636--openpolicyagent.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Copy link

netlify bot commented Mar 16, 2024

Deploy Preview for openpolicyagent ready!

Name Link
🔨 Latest commit f70ca76
🔍 Latest deploy log https://app.netlify.com/sites/openpolicyagent/deploys/65f5583c650d120008f0745e
😎 Deploy Preview https://deploy-preview-6636--openpolicyagent.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Copy link

netlify bot commented Mar 16, 2024

Deploy Preview for openpolicyagent ready!

Name Link
🔨 Latest commit eafa8be
🔍 Latest deploy log https://app.netlify.com/sites/openpolicyagent/deploys/660f675ea715300008d872b2
😎 Deploy Preview https://deploy-preview-6636--openpolicyagent.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link
Contributor

@charlieegan3 charlieegan3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great start, thanks for persevering with the Windows Dev env. I have left a few comments to provide a few pointers, namely, adding support for the prefix arg would be a great addition here.

return builtins.NewOperandTypeErr(2, operands[1].Value, "string", "null")
}

bs, err := json.MarshalIndent(asJSON, "", indentWith)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recalling your original use case (generating JSON for markdown output), it might be helpful for other users in similar cases to be able to specify the prefix.

As I see it, this is the flexible version of json.marshal and it'd be preferred to have all the options available. Prefixing all lines with 4 spaces, might allow users to generate indented code blocks etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this would be easiest to add now (before the built-in has shipped to customers) so there's no chance of dependencies on a prefix-free function interface.

One of the reasons I shied away from adding the prefix argument was that golang's MarshalIndent (more specifically Indent, which it uses internally) has a strange "feature" in that the first line is emitted sans-prefix. This naturally makes sense for some scenarios, but specifying a prefix and then having to prefix the first line yourself anyways seems like it would violate the principle of least astonishment.

I see two options for adding a prefix parameter:

  • Do not abstract over Golang's MarshalIndent() prefix behavior, and call it out in the rego docs.
    This seems important enough to warrant a highlighted warning (or similar) with contextual links to the Golang docs, and I wasn't sure how to structure the function definition (or manually edit the docs) such that the docs were formatted appropriately.
  • Add in a copy of the prefix string before the first line of the emitted json payload. This naturally (and intentionally) deviates from Go's implementation, but I'd posit that knowledge of this implementation (or lack thereof) shouldn't be necessary for using OPA.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a prefix param that aligns with Option 2 described above. Looking forward to your thoughts on this.

@sean-r-williams
Copy link
Contributor Author

Thanks for your patience, @charlieegan3 - I realized my team hadn't actually secured legal approval for contributing to this project, so I had to pause for a bit while we got that sorted internally. Legal has signed off on us contributing back to this project, so that shouldn't be an issue in the future.

sean-r-williams and others added 6 commits March 22, 2024 23:32
Co-authored-by: Charlie Egan <git@charlieegan3.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Copy link
Contributor

@johanfylling johanfylling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks good to me 👍 🙂, thank you!

Sorry to propose a design change this late into the game, but I think we should consider using an options object argument instead of indent and prefix, for future-proofing and possibly helping readability.

ast/builtins.go Outdated
Decl: types.NewFunction(
types.Args(
types.Named("x", types.A).Description("the term to serialize"),
types.Named("indent", types.NewAny(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having two arguments, indent and prefix, might I instead suggest taking an options object as arg which contains these two string parameters (similar to providers.aws.sign_req() or http.send(), kinda).

This way, we gain a couple of benefits:

  • prefix will be truly optional, and users don't need to provide a null value for it, which will likely be the most common usage.
  • we're future-proofing this built-in, and can add new marshaling options (e.g string escaping) in the future without needing to bloat the list of built-ins.
  • users (primarily reviewers) don't need to keep track of the argument order, which is likely just gonna force them to revisit the docs a bunch.

If we make the above change, then this new built-in wouldn't be confined to indentation, so a rename would be necessary. Naming is hard, so this is where I regret making this suggestion 😅. Maybe json.marshal_with_options or simply json.marshal2 🤔 .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

☝️ I like that. Defaults can then be applied if the key prefix or indent isn't present in the opts object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can adjust the arguments, sure. A couple of questions:

  • How do we handle keys we don't know what to do with? Do we want to return an error if someone calls json.marshal_with_options(foo, {indent: " ", include_winning_lottery_numbers: true})? (What kind of error?)
    • There's two schools of thought here: Ignoring params we don't support allows easy forwards/backwards-compatibility of a given policy, but it also means we could miss an option that's otherwise "important" because it's not something the current version of OPA understands.
  • How do we adjust the documentation accordingly? The docs for those two functions show a fairly detailed description of every KVP supported in object-params, but the parameters for http.send() aren't documented in ast.go. Is there some other way to manually add docs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quite like json.marshal_with_options myself. Similar to this recently merged one.

I think that we do want an error for someone passing an unknown key, as most likely it's a typo or something from an LLM etc. I think that returning a simple error from the built-in like unexpected marshal opt "indentt" or similar would be a good course of action. Similar to:

return verifyOpt, fmt.Errorf("invalid key option")

Re documentation, I think it'd be much preferred to copy what is used for providers.aws.sign_req rather than http.send. Http send is a special case as it has just so many different options. I feel like the docs table for these is in need of some work, but for now if we follow providers.aws.sign_req that makes sense to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the typing is done correctly here where the built-in is registered, the built-in implementation shouldn't need to check any other attributes than those it's explicitly interested in. See e.g. the json.patch builtin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no reason to delay this PR because of that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should also add that if we adopt the *_with_options pattern more frequently, we should figure out how to better present the opts objects in the docs.

With marshal_with_options, the formal type of opts is so long that text appears scrunched. (It was even worse when the description was longer 😅)
image

With x509.parse_and_verify_certificates_with_options, the type declaration is shorter but the lack of formatting flexibility in AST description tags means the param description is difficult to read:

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no reason to delay this PR because of that.

@johanfylling Sorry, missed the earlier message. What's the next steps for getting this PR approved? I'm not sure who else needs to review, or what the signoff process looks like.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we need the types.NewAny wrappers on the parameters, or if it's enough to just use types.S and types.B where applicable 🤔.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any green review from a maintainer is enough for the PR to be merged. If no one beats me to it, I'll try to go through it all today or tomorrow.

Copy link
Contributor

@srenatus srenatus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing 👏 This looks good -- I was on the fence about the default mechanisms, but I think @johanfylling's proposal makes for a good compromise. What do you think?

topdown/encoding.go Outdated Show resolved Hide resolved
ast/builtins.go Outdated Show resolved Hide resolved
ast/builtins.go Outdated
Decl: types.NewFunction(
types.Args(
types.Named("x", types.A).Description("the term to serialize"),
types.Named("indent", types.NewAny(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

☝️ I like that. Defaults can then be applied if the key prefix or indent isn't present in the opts object.

…ameters to dictionary

Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
@sean-r-williams sean-r-williams changed the title Add json.marshal_indent() builtin for indented/"pretty-printed" JSON Add json.marshal_with_options() builtin for indented/"pretty-printed" and/or line-prefixed JSON Mar 28, 2024
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
@anderseknert
Copy link
Member

Re-ran the failed check, and it's now green.

Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
…dd `pretty` option to force pretty-print w/ defaults

Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Copy link
Contributor

@johanfylling johanfylling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Thank you 🙂.

Just a few suggestions.

ast/builtins.go Outdated
Comment on lines 1718 to 1728
types.Named("opts", types.NewAny(
types.NewNull(),
types.NewObject(
[]*types.StaticProperty{
types.NewStaticProperty("pretty", types.NewAny(types.B)),
types.NewStaticProperty("indent", types.NewAny(types.S, types.NewNull())),
types.NewStaticProperty("prefix", types.NewAny(types.S, types.NewNull())),
},
types.NewDynamicProperty(types.S, types.A),
),
)).Description("encoding options"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we treat null properties the same way we do missing ones, I think we can drop the optional nulls. This might also lessen the cognitive load for the user, as there are fewer options to select from.

Similarly, a null opts property will be treated the same as an empty object {}, so I think we can drop the null value option for opts too. We also have a precedence for this in crypto.x509.parse_and_verify_certificates_with_options.

This should also make the generated docs look somewhat nicer.

Suggested change
types.Named("opts", types.NewAny(
types.NewNull(),
types.NewObject(
[]*types.StaticProperty{
types.NewStaticProperty("pretty", types.NewAny(types.B)),
types.NewStaticProperty("indent", types.NewAny(types.S, types.NewNull())),
types.NewStaticProperty("prefix", types.NewAny(types.S, types.NewNull())),
},
types.NewDynamicProperty(types.S, types.A),
),
)).Description("encoding options"),
types.Named("opts", types.NewObject(
[]*types.StaticProperty{
types.NewStaticProperty("pretty", types.B),
types.NewStaticProperty("indent", types.S),
types.NewStaticProperty("prefix", types.S),
},
types.NewDynamicProperty(types.S, types.A),
)).Description("encoding options"),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good - 7990d66 removed support for null values in opts or its child properties. Let me know if there's anything else you would like to see here.

topdown/encoding.go Outdated Show resolved Hide resolved
docs/content/policy-reference.md Outdated Show resolved Hide resolved
docs/content/policy-reference.md Outdated Show resolved Hide resolved
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
… be unneeded?)

Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Copy link
Contributor

@johanfylling johanfylling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for all the hard work! 😃

@johanfylling johanfylling merged commit e0ee741 into open-policy-agent:main Apr 5, 2024
27 checks passed
tsidebottom pushed a commit to tsidebottom/opa- that referenced this pull request Apr 17, 2024
…ty-printed" and/or line-prefixed JSON (open-policy-agent#6636)

Fixes open-policy-agent#6630

Signed-off-by: Sean Williams <72675818+sean-r-williams@users.noreply.github.com>
Signed-off-by: Thomas Sidebottom <thomas.sidebottom@va.gov>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Indented/"pretty-printed" JSON marshaling
5 participants