Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: Do not modify the structure of opaque config #7935

Closed

Conversation

dnephin
Copy link
Contributor

@dnephin dnephin commented May 20, 2020

Branched from #7964
Fixes #4971

HCL and JSON config may have different structures. With this change the opaque configuration will not be modified. When the opaque config is mapped to a type the same hook is used to handle the difference in raw structure.

@@ -3223,120 +3223,6 @@ func TestConfigFlagsAndEdgecases(t *testing.T) {
}`},
err: "config_entries.bootstrap[0]: invalid name (\"invalid-name\"), only \"global\" is supported",
},
{
desc: "ConfigEntry bootstrap proxy-defaults (snake-case)",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test case is TestConfigFlagsAndEdgecases. The tests fail because the hcl and json produce different opaque config in the Config section. I think with this change these are no longer special edgecases. The standard behaviour is handled by the TestFullConfig test case.

I couldn't think of any test to replace these, but I am happy to write some more if anyone has ideas.

@@ -222,8 +222,8 @@ func TestParseConfigEntry(t *testing.T) {
Config: map[string]interface{}{
"foo": 19,
"bar": "abc",
"moreconfig": map[string]interface{}{
"moar": "config",
"moreconfig": []map[string]interface{}{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look quite right. Given the JSON and HCL above what we want is for this to result in a map not a list of maps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with the config_entry_test.go line.

Copy link
Member

@rboyer rboyer May 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HCL library always decodes interface{} within a map[string]interface{} as a []map[string]interface{} because of how HCL works it doesn't know the difference between something showing up once or multiple times. This was part of the whole issue with needing PatchSliceOfMaps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I understand what HCL is doing, and how this was solved before. I believe we were doing it wrong. I guess I did not explain the rational very well in the PR description.

Attempting to guess at what the user is expecting from an opaque config is impossible. If a user specifies some HCL, then they should get out the raw config that HCL produces, which is a slice of maps. Sometimes they are going to want a slice of maps, sometimes they are going to want a map. It is up to them to either:

  1. Use JSON, which is predictable and does what most people expect
  2. Use HCL, and handle the ambiguity it produces themselves.

In the case where the config is only partially opaque (opaque to config loading, but not to other parts of the code), we can decode properly using the same hooks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was really unsure about this change to begin with for the same reason Matt and RB bring it up - it "feels" wrong that we push this foot gun onto the user (at least the integrator who is storing config in these places and then parsing it back out from our API). We should probably at least figure out how we'd document that clearly wherever we write to or expose the config object in the API docs.

That said, I don't think there is a generally correct alternative since we by definition don't have the schema to know which way the ambiguous HCL presented should be interpreted so we will always be wrong in some cases and hence limit what can be expressed "opaquely" with HCL.

Until this PR that means it's just impossible to nest arbitrarily since we've made it wrong to ever be ambiguous and that is the whole issue we're trying to fix. The only other option I see would be to only allow exactly one off nested maps or nested array/list types in opaque config. That way we can always assume it's one or the other but I think this limits the usefulness of the syntax of HCL and would still cause confusing parse errors if you express things incorrectly.

There could be alternative UX options to consider. Although likely way more work, it's worth exploring them before we decide on the ultimate path:

  1. Disallow HCL repeated blocks in our files. To do this well we'd probably need changes to the HCL parser to be able to detect their presence unambiguously (i.e. vs explicit lists of HCL objects).
    • Pros
      1. Fixes the issue at source and makes it unambiguous on input and output
      2. Although it's possible to use repeated blocks in a few places in Consul today, it's already compromised by the plural naming convention we adopt to make the JSON format in the API sensible. e.g. you can specify multiple services as repeated blocks but you have to label each one services rather than service since the JSON equivalent array field is naturally pluralized.
    • Cons
      1. Lots of work to make the error handling right including HCL changes - might not even be feasible. Would likely be easiest to implement an optional "JSON Compatible" parsing mode.
      2. Breaks HCL expectations from other projects. Doesn't seem like a big issue given the poor ergonomics of these above.
      3. It's a breaking change. Although the ergonomics are poor with the naming thing, there will be existing HCL config files that rely on repeated blocks (I've written plenty myself although not important ones).
  2. Change "opaque" config to be a string where the actual config is either HCL or JSON depending on what the integration supports but is passed through without being parsed at all.
    • Pros
      1. Simple and unambiguous.
    • Cons
      1. Breaking change
      2. The UX of editing HCL with nested HCL or JSON strings is not great although HEREDOC syntax makes it OK. We already have examples of this in ACL policies and Envoy escape hatch though and it's workable but feels gross especially as that blob of config grows.
      3. THE UX of editing JSON with nested JSON or consuming it from the API is pretty awful though.
  3. This approach.
    • Pros
      1. Implemented and relatively simple in code.
    • Cons
      1. User is potentially left to figure out why the "same" config in HCL or JSON produces different results. Integrators have to be made aware of and handle both cases correctly or the UX of configuring via Consul is broken. This could lead to integrators documenting things like "don't use HCL as it messes stuff up, we only support JSON" which kind of undermines our slight preference towards HCL for more readable (e.g. commentable) config. An example here would be if a plugin (lets say a proxy implementation) choose to nest config like this in proxy.config:

        config {
            foo_proxy { defaults { connection_timeout_ms = 1000 } } 
        }
        {
          "config" : {
              "foo_proxy": {"defaults": {"connection_timeout_ms": 1000}}
          }
        }

        A user might reasonably expect both of those to produce the same output in the API for the proxy to decode but in fact the JSON one would look the same, while the HCL one would come back as:

        {
          "config": {
            "foo_proxy": [
              {
                "defaults": [
                  {
                    "connection_timeout_ms": 1000
                  }
                ]
              }
            ]
          }
        }
        

Copy link
Contributor Author

@dnephin dnephin May 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My take on these approaches is that all 3 of them are breaking changes, but option 3 (this PR) is the least sever breakage as it is isolated to only config-entries for external proxy integrations.

  • Option 1 can potentially break any user, even those that don't use connect.
  • Option 2 could break anyone using config entries, including users of the envoy proxy.
  • Option 3 can only break users of external proxy integrations which use nested structures (config entries for envoy proxies are safe and won't break). This should be the smallest possible group of users. We can further reduce the risk here by looking at any known integrations and confirming they do not use nested config. I've confirmed https://github.com/haproxytech/haproxy-consul-connect will not break. A very loud notice in the changelog would also be appropriate.

Backwards compatibility aside, I also think that option 3 is the best UX. HCL is unfortunately not a good option for opaque configuration that needs to be output as JSON. However, the one negative point for option 3 that you outline is real and something we will need to mitigate.

I think we would do the following:

  • add support for heredoc strings for the opaque configurations. This should be relatively easy to do with mapstructure.
  • continue to support inline opaque configuration so that we do not break any existing users
  • change all of our documentation to show either HCL+heredoc, or JSON+inline (ie: remove any inline HCL opaque config examples)
  • add a deprecation warning to our documentation about HCL+inline, and explain why it is not recommended. It will continue to work (with the caveat that nested structures will not come out as most people would expect), but we should encourage everyone to move away from it.
  • we could potentially even go as far as printing a warning to stderr when an HCL config uses inline opaque configuration.

@dnephin dnephin force-pushed the dnephin/remove-patch-slice-of-maps-opaque-config branch from 59c0d62 to 0de8e0d Compare May 20, 2020 18:51
@dnephin dnephin force-pushed the dnephin/remove-patch-slice-of-maps branch 2 times, most recently from 7f49759 to d1d9d8a Compare May 27, 2020 16:54
Currently opaque config blocks (config entries, and CA provider config) are
modified by PatchSliceOfMaps, making it impossible for these opaque
config sections to contain slices of maps.

In order to fix this problem, any lazy-decoding of these blocks needs to support
weak decoding of []map[string]interface{} to a struct type before
PatchSliceOfMaps is replaces. This is necessary because these config
blobs are persisted, and during an upgrade an older version of Consul
could read one of the new configuration values, which would cause an error.

To support the upgrade path, this commit first introduces the new hooks
for weak decoding of []map[string]interface{} and uses them only in the
lazy-decode paths. That way, in a future release, new style
configuration will be supported by the older version of Consul.

This decode hook has a number of advantages:

1. It no longer panics. It allows mapstructure to report the error
2. It no longer requires the user to declare which fields are slices of
   structs. It can deduce that information from the 'to' value.
3. It will make it possible to preserve opaque configuration, allowing
   for structured opaque config.
So that if they encounter config accepted by a newer version of Consul
they will know how to handle it.
@dnephin
Copy link
Contributor Author

dnephin commented May 27, 2020

If the changes in #7964 make it into 1.8, it will ensure forwards compatibility with this change. Consul 1.8 would be able to correctly read any configs saved by Consul 1.9 without errors.

I believe that only leaves 1 backwards compatibility concern, which is integrations using structured config. I've verified that https://github.com/haproxytech/haproxy-consul-connect only depends on scalar values, so will not be impacted by this change. I'll look into other integrations as well.

HCL and JSON config may have different structures. With this change the
opaque configuration will not be modified. When the config is
deserialized the same hook is used to handle this difference in
structure.
@dnephin dnephin force-pushed the dnephin/remove-patch-slice-of-maps-opaque-config branch from 0de8e0d to e11c27a Compare May 27, 2020 21:04
@dnephin dnephin changed the base branch from dnephin/remove-patch-slice-of-maps to dnephin/remove-patch-slice-of-maps-forward-compat May 27, 2020 21:05
@banks
Copy link
Member

banks commented May 28, 2020

which is integrations using structured config

🤔 isn't it the case that no integrations can have used structured config until now since it would cause a panic? I guess that means it's safe to assume we aren't breaking any one? Or maybe I missed something.

@dnephin
Copy link
Contributor Author

dnephin commented May 28, 2020

isn't it the case that no integrations can have used structured config until now since it would cause a panic?

Integrations can use structured config, as long as it doesn't contain lists of structures. A list of structures would cause a panic. But the top level of the config can contain other structures as long as all the fields are scalar values or other maps.

@dnephin dnephin force-pushed the dnephin/remove-patch-slice-of-maps-forward-compat branch 3 times, most recently from 9e07deb to 4ec6036 Compare May 29, 2020 19:21
@dnephin dnephin force-pushed the dnephin/remove-patch-slice-of-maps-forward-compat branch 3 times, most recently from 072211c to 75cbbe2 Compare June 8, 2020 21:05
Base automatically changed from dnephin/remove-patch-slice-of-maps-forward-compat to master June 8, 2020 23:53
@dnephin
Copy link
Contributor Author

dnephin commented Jun 11, 2020

After more reflecting, I think this breaking change is not necessary after all.

With #7964 we no longer have a panic. Single item lists are still modified and replaced with their single value, but mapstructure.WeakDecode handles turning single values into a single-item list when necessary.

This means that we can start using slices of types in our own partially opaque config. Anyone writing an integration will need to know that their decode may need to turn values into single-item lists, but that is a lot better than a panic, and roughly comparable to the workaround this PR was going to require (handling slices for every structure).

@dnephin dnephin closed this Jun 11, 2020
@kisunji kisunji deleted the dnephin/remove-patch-slice-of-maps-opaque-config branch June 2, 2022 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

config: opaque maps like service.proxy.config are mangled by patchSliceOfMaps hack
4 participants