Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI should not use non-public Cilium APIs #1896

Open
joestringer opened this issue Aug 8, 2023 · 9 comments
Open

CLI should not use non-public Cilium APIs #1896

joestringer opened this issue Aug 8, 2023 · 9 comments

Comments

@joestringer
Copy link
Member

joestringer commented Aug 8, 2023

The following code uses a non-public API from https://github.com/cilium/cilium in order to extract configurations out from an individual agent:

cfg := &option.DaemonConfig{}

There are a few issues with this:

  • This is the internal structure used by Cilium and is subject to change without warning
  • Over time, Cilium will migrate away from including all daemon configurations in this structure towards per-Cell configuration, so it will not contain all configurations.
  • This ties the Cilium CLI to a specific version of the internal agent code, which may cause compatibility issues between multiple versions.
    • Revert "Refactor hubble redact settings schema" cilium#27352 points out an example where a new flag was introduced during cilium development, then cilium-cli was synced to cilium, then the flag was changed. This caused connectivity failures because the type was made invalid:
      connectivity test failed: unmarshaling cilium runtime config json: json: cannot unmarshal bool into Go struct field DaemonConfig.HubbleRedact of type []string

This code should probably unmarshal the configuration into a more liberal type to account for differences between Cilium versions.

@joestringer
Copy link
Member Author

Would it be reasonable to do this autodetection from the helm configuration level or using cilium-config ConfigMap rather than executing into a Cilium pod to assess desired state from the agent?

@michi-covalent
Copy link
Contributor

yeah i was thinking we should just look at cilium-config config map 💡

@tklauser
Copy link
Member

tklauser commented Aug 9, 2023

FWIW, the PR adding the runtime feature detection (#1022) has some context on why this is needed and why the information cannot easily be gathered from the ConfigMap:

Previously, we have been relying on the Cilium ConfigMap to detect the
presence of features. Relying on the ConfigMap has however some
downsides, as one has to know the default value of a certain feature if
the option is absent from the ConfigMap. To make things worse, this
default value is also version dependent in some cases.

/cc @gandro

@michi-covalent
Copy link
Contributor

that's a good point. i guess helm values will have the same issue? 💭

@tklauser
Copy link
Member

tklauser commented Aug 9, 2023

Yeah, I think we'll have the same issue with helm values. IIRC we don't specify defaults for them (in all cases).

@michi-covalent
Copy link
Contributor

struggle

@3u13r
Copy link
Contributor

3u13r commented Aug 18, 2023

There was also an issue with the default used for the tunneling feature, which tried to use the config-map value if available (da5c0c9).

Therefore I tried to document the current guidelines in the code. I mostly copied the descriptions from the commits, bit I thought it makes sense to also have them in code (7c54c5f).

@gandro
Copy link
Member

gandro commented Aug 28, 2023

I agree that relying on the agent-runtime-config.json is a hack. But as already mentioned, we want to know the runtime configuration of the agent, not what has been configured via Helm or ConfigMap, since Cilium does a lot of auto-detection or runtime-dependent configuration of certain features. I opted to use the JSON file because it was already there and worked with existing versions of Cilium (something that the CLI also needs to consider).

I think one way answer here is to start exposing per-cell configurations in the Cilium API. We already have an API endpoint that exposes a subset the global runtime daemon configuration https://github.com/cilium/cilium/blob/95e25bfb132750e97304dfbe0f5770f4b6debd96/api/v1/openapi.yaml#L74-L85

The /config endpoint is lacking per-cell configurations, but that can be solved by having cells exposing their configuration via API too. It could even be semi-automated, i.e. have a registry where cells register themselves as "having some configuration" which then automatically exposes that state via an API endpoint.

@joestringer
Copy link
Member Author

I wonder if we could put some infrastructure in place so that those configuration API endpoints are somehow tied to feature stability in the code. For instance, in the case from the issue description, the feature is brand new, so could be considered alpha. Maybe for alpha options we don't expose it in the API because it's subject to change. Then we wouldn't trigger this overall issue in the first place because Cilium-CLI wouldn't be relying on an alpha setting having a specific type or format. For other settings, we could graduate the features themselves to beta/GA and somewhere later in that process we start exposing the flags as a public API from Cilium.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants