Conversation
Signed-off-by: Kit Patella <kit@defenseunicorns.com>
Signed-off-by: Kit Patella <kit@defenseunicorns.com>
Signed-off-by: Kit Patella <kit@defenseunicorns.com>
Signed-off-by: Kit Patella <kit@defenseunicorns.com>
| - Rollout considerations, like can we _backport_ flags to previous features so that they can be associated with verions | ||
| or disabled? | ||
|
|
||
| ## FIXME Design Details |
There was a problem hiding this comment.
One important note with Zarf is that anything in the schema will span to both create time and deploy time. If a feature is enabled and used at create time, do we assume that it's enabled for the deployer. I'd lean towards yes, but curious on others thoughts.
There was a problem hiding this comment.
That's a good point, yeah. This is maybe a hacky solution, but what if we retrieved the set ofAll() enabled features when creating a package and serialized their names to a build or metadata field?
There was a problem hiding this comment.
I could see that being useful in the event we are changing how a specific feature in the schema works (though maybe if we're doing that it's a sign we should be using a different key).
I was thinking of an even simpler case where we have a gated feature in the schema and someone opts into it, would that feature also be gated on deploy?
There was a problem hiding this comment.
I believe this will depend on whether a particular functionality alters the package or its deployment or not. But storing that information in package metadata is definitely good idea.
| // if len(flags) == 0 {...} | ||
| ``` | ||
|
|
||
| ### TODO DISCUSS: Global API |
There was a problem hiding this comment.
I'd prefer global. From what I can tell it looks like Kubernetes declares their feature flags globally on init() https://github.com/kubernetes/kubernetes/blob/c4beea784c4ba2ab4cdf98d13dfe094fcb746288/pkg/features/kube_features.go#L1945
| StageDeprecated Stage = "deprecated" | ||
| ) | ||
|
|
||
| type Flag struct { |
There was a problem hiding this comment.
Considering changing this to Feature to make it a bit more clear what it's describing.
There was a problem hiding this comment.
Migrating Flag type to Feature in latest
| ### Types | ||
| Below is the proposed data model for Feature Flags, including the `Flag` type and supporting fields in the `flag` pkg. | ||
| ``` | ||
| pkg flag |
There was a problem hiding this comment.
Same as the type name below, considering changing the pkg name to feature to make it a bit more clear what it's describing.
There was a problem hiding this comment.
MIgrating package flag -> feature in latest.
Signed-off-by: Kit Patella <kit@defenseunicorns.com>
brandtkeller
left a comment
There was a problem hiding this comment.
Processing my understanding - appreciative of the huge scope this is appropriately engaging.
| of release and deprecation, eg. "alpha", "beta", "GA", "deprecated". It introduces a ctx-based API, similar to previous | ||
| usages of feature flags in Zarf, a centralized location where all CLI feature flags are **declared** so that users may | ||
| easily find Flags and their documentation in code. Additionally, centralizing where Flags are declared allows us to | ||
| automate site documentation that corresponds one to one with the docs in code. Inspiration for the Flag fields is taken |
There was a problem hiding this comment.
Agree with intent - curious if there is an overlay of alternative designs that uses our already-available utilization of cobra/viper/pflag . Are there constraints present here with the lifecycle exposed by those and our current use? maybe.
pflag has a lot in the way of lifecycle management and the integration with cobra/viper already answers questions for how we consume from config / generate documentation and manage the lifecycle of active/hidden/deprecated flags.
Just want to ensure we weigh that optionality before understanding if we require a bespoke implementation.
There was a problem hiding this comment.
I believe the plag will only to handle the flag parsing part of things. This document focuses primarily on the managing the feature gates, enabled through flags. For example, given the following 4 features at various stages:
- featureA - stable, enabled by default, can't be disabled.
- featureB - beta, enabled by default, can be disabled.
- featureC - beta, disabled by default, can be enabled.
- featureD - alpha, disabled by default, can be enabled.
A zarf user, by default, has access to featureA and featureB (on by default). If a user wants to enable featureC, one needs to run zarf --feature-gate=featureC=true, if one wants both featureC and featureD - zarf --feature-gate=featureC=true,featureD=true. Alternatively if user wants to disable a feature that is on by default - zarf --feature-gate=featureB=false.
From a developer perspective you have 2 helpful utilities:
- A registry of feature gates, ideally with information about its state (alpha, beta, stable), whether it's on or off by default, and lastly if it can be turned off.
- A helper method which allows you to check if featureA is on or off, that you then use in your code base.
An example from k8s:
- Feature metadata: https://github.com/kubernetes/kubernetes/blob/2a03dd1d5ecc66ca6a21fc9f526d4eedc9fef179/pkg/features/kube_features.go#L405-L410 with a pointer to KEP, and history https://github.com/kubernetes/kubernetes/blob/2a03dd1d5ecc66ca6a21fc9f526d4eedc9fef179/pkg/features/kube_features.go#L1315-L1319
- Sample usage: https://github.com/kubernetes/kubernetes/blob/2a03dd1d5ecc66ca6a21fc9f526d4eedc9fef179/pkg/apis/batch/v1/defaults.go#L62
There was a problem hiding this comment.
On the other hand, kubectl uses env vars to achieve this:
- Feature metadata: https://github.com/kubernetes/kubernetes/blob/2a03dd1d5ecc66ca6a21fc9f526d4eedc9fef179/staging/src/k8s.io/kubectl/pkg/cmd/util/helpers.go#L434-L438
- Sample usage: https://github.com/kubernetes/kubernetes/blob/2a03dd1d5ecc66ca6a21fc9f526d4eedc9fef179/staging/src/k8s.io/kubectl/pkg/cmd/cmd.go#L362-L364
Although, I'm planning to simplify the interface for kubectl a bit, since it's a bit clunky atm. The goal is to make it closer to what --feature-gates do.
There was a problem hiding this comment.
Also, we could just re-use the k8s library for feature gates, rather than rolling yet another one https://github.com/kubernetes/component-base/blob/master/featuregate/feature_gate.go. Or use some other library that I'm pretty exist 😅
There was a problem hiding this comment.
Seconding what @soltysh is saying here where pflag works great as a kind of CLI frontend for sure, but doesn't provide a flexible enough model or API for features. Especially when it comes to users needing to query or change feature state within code.
Regarding borrowing or reusing the K8s feature-gates - I definitely took a lot of inspiration from the implementation! Reading through the type was my starting point. K8s runtime mutable features which are compelling, as well as multiple features per gate. Maybe more than we need pre-1.0 though?
If are effective with the API and configuration layer I do think we can come back to the implementation and add functionality in or swap out for more if we need it in the future. The config layer (CLI flags, env vars, zarf-config.{yaml,toml}) already does this well imo. Where we'll need to be careful are the abstractions we offer to users for the SDK.
|
|
||
| Feature flags are implemented on an opt-in basis. Upgrading to versions with feature flagging supported, and with | ||
| toggle-able feature flags ought to be simple and transparent. There are no backwards compatibility concerns for previous | ||
| versions of Zarf as feature flags deal with the code itself. |
There was a problem hiding this comment.
There are no backwards compatibility concerns for previous versions of Zarf as feature flags deal with the code itself.
Does this consider the impact of the package created and potential drift between the zarf version at creation and the version used during deploy?
There was a problem hiding this comment.
| - Also if maintainers rely on enabling feature flags at specific versions then this keeps SDK to CLI parity. | ||
| - Without global defaults we have to flip the boolean from enabled to disabled when providing a beta feature by default. | ||
| - A major drawback with ctx is that it's difficult to inspect for users. "where did this flag come from?" | ||
| - Maybe this is a good nudge to get features into GA sooner. More to discuss here. |
There was a problem hiding this comment.
++ It is a great reason for us to have a process that tracks current state and what we require for graduation?
| --> | ||
|
|
||
| - One critical pitfall to avoid with feature flags is creating _unexpected_ behavior, both for end users and maintainers. | ||
| Unclear state about which flags are enabled and not, and with and without defaults. We have some solutions below under |
There was a problem hiding this comment.
appreciate this - how can we make the propagation throughout the system more observable.
There was a problem hiding this comment.
I think that w/ the addition of separate sets for default and user features that sourcing these gets even easier.
| of release and deprecation, eg. "alpha", "beta", "GA", "deprecated". It introduces a ctx-based API, similar to previous | ||
| usages of feature flags in Zarf, a centralized location where all CLI feature flags are **declared** so that users may | ||
| easily find Flags and their documentation in code. Additionally, centralizing where Flags are declared allows us to | ||
| automate site documentation that corresponds one to one with the docs in code. Inspiration for the Flag fields is taken |
There was a problem hiding this comment.
I believe the plag will only to handle the flag parsing part of things. This document focuses primarily on the managing the feature gates, enabled through flags. For example, given the following 4 features at various stages:
- featureA - stable, enabled by default, can't be disabled.
- featureB - beta, enabled by default, can be disabled.
- featureC - beta, disabled by default, can be enabled.
- featureD - alpha, disabled by default, can be enabled.
A zarf user, by default, has access to featureA and featureB (on by default). If a user wants to enable featureC, one needs to run zarf --feature-gate=featureC=true, if one wants both featureC and featureD - zarf --feature-gate=featureC=true,featureD=true. Alternatively if user wants to disable a feature that is on by default - zarf --feature-gate=featureB=false.
From a developer perspective you have 2 helpful utilities:
- A registry of feature gates, ideally with information about its state (alpha, beta, stable), whether it's on or off by default, and lastly if it can be turned off.
- A helper method which allows you to check if featureA is on or off, that you then use in your code base.
An example from k8s:
- Feature metadata: https://github.com/kubernetes/kubernetes/blob/2a03dd1d5ecc66ca6a21fc9f526d4eedc9fef179/pkg/features/kube_features.go#L405-L410 with a pointer to KEP, and history https://github.com/kubernetes/kubernetes/blob/2a03dd1d5ecc66ca6a21fc9f526d4eedc9fef179/pkg/features/kube_features.go#L1315-L1319
- Sample usage: https://github.com/kubernetes/kubernetes/blob/2a03dd1d5ecc66ca6a21fc9f526d4eedc9fef179/pkg/apis/batch/v1/defaults.go#L62
| of release and deprecation, eg. "alpha", "beta", "GA", "deprecated". It introduces a ctx-based API, similar to previous | ||
| usages of feature flags in Zarf, a centralized location where all CLI feature flags are **declared** so that users may | ||
| easily find Flags and their documentation in code. Additionally, centralizing where Flags are declared allows us to | ||
| automate site documentation that corresponds one to one with the docs in code. Inspiration for the Flag fields is taken |
There was a problem hiding this comment.
On the other hand, kubectl uses env vars to achieve this:
- Feature metadata: https://github.com/kubernetes/kubernetes/blob/2a03dd1d5ecc66ca6a21fc9f526d4eedc9fef179/staging/src/k8s.io/kubectl/pkg/cmd/util/helpers.go#L434-L438
- Sample usage: https://github.com/kubernetes/kubernetes/blob/2a03dd1d5ecc66ca6a21fc9f526d4eedc9fef179/staging/src/k8s.io/kubectl/pkg/cmd/cmd.go#L362-L364
Although, I'm planning to simplify the interface for kubectl a bit, since it's a bit clunky atm. The goal is to make it closer to what --feature-gates do.
| ### Non-Goals | ||
|
|
||
| - Elaborate configuration nightmares, where users don't know what is enabled or multiple flags are intertwined. | ||
| - Forever features that are endless reworked and never make it to GA. |
There was a problem hiding this comment.
I'm inclined to move this to goals, to ensure we don't have forever features. Ideally you want to enforce some kind of timeline for every feature, but if you see something is stuck for too long you might want to consider to drop it, which is also easier depending on the stage the features is at. Alpha features can be dropped almost at a whim, whereas beta (especially those enabled by default) or stable require a proper deprecation period to allow users to stop using those.
There was a problem hiding this comment.
Yeah I see what you mean, having features while models all of this explicitly does help with the problem. And the problem exists whether it's modeled or not.
| - Rollout considerations, like can we _backport_ flags to previous features so that they can be associated with verions | ||
| or disabled? | ||
|
|
||
| ## FIXME Design Details |
There was a problem hiding this comment.
I believe this will depend on whether a particular functionality alters the package or its deployment or not. But storing that information in package metadata is definitely good idea.
| # Enabled is set if a feature is set. | ||
| Enabled bool | ||
| # Default is set if a feature is enabled by default, without being set by users. | ||
| Default bool |
There was a problem hiding this comment.
I'd suggest not to store Enabled next to Default, but rather keep this structure minimal such that it will only hold information about feature gate, such as: name, description, version, enabled, stage. Separately, you'll want to keep a registry of currently enabled flags (those with enabled=true out of the box and those enabled by users explicitly through flags), and devs should always use the registry helper to know if it's on or off.
This separation ensures we don't accidentally enable something that should not be enabled.
There was a problem hiding this comment.
That's a good point, and the default vs. enabled states were intended to allow for this behavior in the same set of features. Thinking through this more though, there's another complication if a feature is disabled by default then enabled by users (default=true, enabled=false) -> (default=true, enabled=true) - that effectively overwrites the flag and there's no way for us to distinguish during runtime that it was changed.
I see the advantages of removing the "default" field and instead having two different sets of features. Having a simpler API starts to outweigh the downsides of some more impl. complexity. Namely, it allows us to store multiple copies of flags and see if the state has been changed by users. e.g. psuedocode:
default={Foo=false, bar=...}
enabled={Foo=true}
This could be accomplished with another field on the struct, e.g. modified and constructors (e.g. feature.NewDefault, feature.NewEnabled, etc.) but I'd also lean towards keeping the struct minimal to keep the API simple and "just" use sets instead of boolean logic.
I'll update the API some so it can filter for default or not rather than passing the fields back and having the user do so.
| - Make new features opt-in. | ||
| - Have a clear and documented reference for users on which versions of Zarf introduce new features. | ||
| - Runtime-level feature configuration | ||
| - Stretch goal: multiple avenues of configuration (code, env vars, flags, config files) |
There was a problem hiding this comment.
Potentially, you could also have a better way to inform users about deprecated features, if they are using those.
There was a problem hiding this comment.
Definitely a benefit, going to add it to goals.
| // if len(flags) == 0 {...} | ||
| ``` | ||
|
|
||
| ### TODO DISCUSS: Global API |
|
|
||
| Feature flags are implemented on an opt-in basis. Upgrading to versions with feature flagging supported, and with | ||
| toggle-able feature flags ought to be simple and transparent. There are no backwards compatibility concerns for previous | ||
| versions of Zarf as feature flags deal with the code itself. |
There was a problem hiding this comment.
Signed-off-by: Kit Patella <kit@defenseunicorns.com>
Signed-off-by: Kit Patella <kit@defenseunicorns.com>
Signed-off-by: Kit Patella <kit@defenseunicorns.com>
|
Revision 2 of the proposal is done which instead proposes globally stored features and updates the API accordingly. I've provided some examples of how this is intended to work, but if there's any gaps please let me know! I believe this is ready to merge given final draft review and alpha impl is ready to begin. There's still some TODOs left in that I'd like to come back to after implementing alpha. Nitpicks absolutely welcome! |
Signed-off-by: Kit Patella <kit@defenseunicorns.com>
…eads better Signed-off-by: Kit Patella <kit@defenseunicorns.com>
Signed-off-by: Kit Patella <kit@defenseunicorns.com>
Signed-off-by: Kit Patella <kit@defenseunicorns.com>
…mments to slashes Signed-off-by: Kit Patella <kit@defenseunicorns.com>
AustinAbro321
left a comment
There was a problem hiding this comment.
Nice work so far. Clarifying question on CLI UX
| // My beautiful Zarf-enabled application | ||
| ``` | ||
|
|
||
| ### CLI: Enabling and Disabling Flags |
There was a problem hiding this comment.
Curious on how this would look for something like --registry-proxy in #37. I think that'd be a good candidate for feature flag.
I would expect that if users are required to run zarf init --feature-enable=registry-proxy --registry-proxy, just running zarf init --feature-enable=registry-proxy wouldn't actually do anything in that case. Also if someone runs zarf init -h before registry-proxy is enabled should they see the --registry-proxy flag in help text?
There was a problem hiding this comment.
The first form that you mentioned is also what I'd expect, that way the feature itself can be enabled but not actually run. Users could also enable the feature through ZARF_FEATURE_ENABLE="registry-proxy" zarf init --registry-proxy or through on-disk config so the main idea is keeping parity. Given that registry-proxy is an option, flagging it enables that option.
As for help text, it's a good callout. In a quick experiment, I wasn't able to get the boolean state of one rootCmd pflag to mark another as hidden (e.g. if IsColorDisabled { ...MarkHidden("--log-level") }. There may be another path though.
That said, the least clever path would be showing the flag text every time and documenting its release stage and which feature flag to use.
There was a problem hiding this comment.
Yeah worth experimenting more on that in the future, but marking it as disabled in the flag text as a backup is reasonable for now.
Signed-off-by: Kit Patella <kit@defenseunicorns.com>
Signed-off-by: Kit Patella <kit@defenseunicorns.com>
AustinAbro321
left a comment
There was a problem hiding this comment.
A few todos still, but this is certainly in a state where we can start iterating on the implementation. Nice work.
One-line PR description:
This ZEP proposes the need for, API, docs, and implementation of a Feature Flagging library in Zarf.
Issue link:
Feature Flags #35
Other comments:
Currently Revision 2, ready for pre-alpha review and potential merge.