-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework command line arguments parsing #2343
Conversation
This change solves two problems: 1. In the current implementation, command flags specific to the Splunk distro are dropped from the arguments list passed to the collector core service by their names. It means that all the flag values are unintentionally passed downstream. Also, flags with `=` are not removed from the list and cause failures downstream. For example, `./bin/otelcol --config-dir=/tmp/conf ...` fails with: ``` 2022/12/10 19:15:52 main.go:106: application run finished with error: unknown flag: --config-dir ``` And `removeFlag` function is called from methods that are not expected to be mutating by their definition like `ResolverURIs`. 2. Currently, we rely on the downstream collector core service to display `--help` flag output and get the following: ``` Usage: otelcol [flags] Flags: --config --config=file:/path/to/first --config=file:path/to/second Locations to the config file(s), note that only a single location can be set per flag entry e.g. --config=file:/path/to/first --config=file:path/to/second. (default []) --feature-gates Flag Comma-delimited list of feature gate identifiers. Prefix with '-' to disable the feature. '+' or no prefix will enable the feature. -h, --help help for otelcol --set func Set arbitrary component config property. The component has to be defined in the config file and the flag has a higher precedence. Array config properties are overridden and maps are joined. Example --set=processors.batch.timeout=2s -v, --version version for otelcol ``` It's not just incomplete but also provides a misleading description for `--config` flag: we don't support config schemas, so using any of the provided examples will fail. This change defines all the flags in our distro and makes the program use that definition in `--help` output instead of relying on the collector core service. Experimental flags like `--configd`, `--config-dir` and `--discovery` are hidden for now, but can be easily enabled. It provides the following output for `--help` flag: ``` Usage of otelcol: --config string Locations to the config file(s), note that only a single location can be set per flag entry e.g. --config=/path/to/first --config=path/to/second. (default "[]") --dry-run Don't run the service, just show the configuration --feature-gates string Comma-delimited list of feature gate identifiers. Prefix with '-' to disable the feature. '+' or no prefix will enable the feature. (default "[]") --no-convert-config Do not translate old configurations to the new format automatically. By default, old configurations are translated to the new format for backward compatibility. --set string Set arbitrary component config property. The component has to be defined in the config file and the flag has a higher precedence. Array config properties are overridden and maps are joined. Example --set=processors.batch.timeout=2s (default "[]") -v, --version Version of the collector. ``` Now we also explicitly pass all the flags that are handled by the collector core service instead of passing what we don't use. Reason for that is that there are no plans in the collector core to add any new flags. Most of the old flags were migrated to the configuration. And, even if something will be added, it potentially can be incompatible with our distro and introduce some broken flags to the `--help` output without us noticing. And, as an opportunistic simplification, this change replaces an unnecessary `Settings` interface with a struct.
c294416
to
199a7c4
Compare
@@ -60,77 +57,58 @@ const ( | |||
ConfigDScheme = "splunk.configd" | |||
) | |||
|
|||
type Settings interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, as an opportunistic simplification, this change replaces an unnecessary Settings interface with a struct.
The aim of this interface was to help clarify the intended consumer usage or available settings while hiding all the underlying flag/state management that was scattered throughout main.go
. What are we gaining by using the exported internal struct directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any IDE or editor can show all available methods and point you right to the implementation, which is much better than pointing to an interface declaration that force you to look around for the actual implementation after that. Also, all the methods will be listed the same way as they are in the go doc https://pkg.go.dev/github.com/signalfx/splunk-otel-collector@v0.66.0/internal/settings#Settings. So would disagree that using interfaces for one struct makes developer live any better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more for maintainability and trying to establish a clear convention for how future settings should be added, and nothing to do w/ IDEs. Given these changes don't deviate too far from what was there I think it's ok but may be something to revisit as we add future options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more for maintainability
IMHO adding unnecessary interfaces reduces maintainability
establish a clear convention for how future settings should be added
I think it's rather an unnecessarily complicated convention for such a simple interface. Not sure if there is even a reason to expose them as methods... If they are all just getters using data from the struct that was set during initialization, we can even expose them as precalculated struct fields. That will put them all together in one place as it was with the interface. And it'll help to avoid extra work if they would be used more than once in the future. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not to get too caught up in a dispute of personal preference, but they were interfaces because interfaces are collections of methods and not fields. The accessors do the computing and return the setting value, but could definitely be improved. They were exposed via the settings interface to encapsulate the ~legacy flag helpers while migrating to a central setting store to prevent breaking changes.
Processing them w/ initial parsing stages in New()
and using exported fields is acceptable to me and would address their idempotence issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Processing them w/ initial parsing stages in New() and using exported fields is acceptable to me and would address their idempotence issues.
Sounds good, I was thinking about doing that but didn't want to go too far. I can do it in this or separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say let's cut an issue for
Processing them w/ initial parsing stages in New() and using exported fields is acceptable to me and would address their idempotence issues.
and keep it as the struct version for now.
cpArgs = append(cpArgs, args...) | ||
// parseArgs returns new Settings instance from command line arguments. | ||
func parseArgs(args []string) (*Settings, error) { | ||
flagSet := flag.NewFlagSet("otelcol", flag.ContinueOnError) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As commented the aim here is to not have to track or manage upstream options in our init logic unless we are using them directly. Why is this no longer desired?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for that is that there are no plans in the collector core to add any new flags.
Do you have an issue for this?
it potentially can be incompatible with our distro and introduce some broken flags to the --help output without us noticing.
Do you mind clarifying why this expectation is something to avoid and why it's better to not inherit* features as upstream provides them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an issue for this?
No
Do you mind clarifying why this expectation is something to avoid and why it's better to not inherit* features as upstream provides them?
Because additional flags introduce some optional functionality. And that functionality potentially can be incompatible with our distro, which will lead to an additional broken flag in --help
output without us noticing. If a new additional flag in core, for whatever reason, is required, we will catch it during the upgrade of the core dependency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we still want to define all known flags (including core's) in our distro but pass everything that we don't know to the core just to potentially catch a new flag added to the core without documenting it, we can do that. But it will require manual argument parsing. I don't think it is worth it. I'd rather keep an eye on new flags added to the core (which is very unlikely to happen any time soon) and add them properly along with documentation in our distro, the same way as we do for components.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still under the impression we probably don't want to define any flags we aren't using and the current problems we are running into are longstanding implementation defects. Are we not already manually parsing args atm and in your changes? I can see an inverse flagSetToArgs()
that removes our distro-only flags and values as a major improvement to the current removeFlag()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it makes sense to fix the config parsing and other issues alone since we aren't in alignment on how to move forward otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixing each issue alone requires adding more hacks on top of what we have. I tried to come up with one set of related changes that doesn't require doing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For more context Dmitrii, are you saying that not doing the opportunistic change regarding Settings
makes it more difficult to deliver/fix the two problems mentioned in the issue body?
There's a lot of "issues" and decision points to be made so I want to be 100% certain on which specific issues and challenges are meant by "fixing each issue alone" and "one set of related changes" and "doing that". Want to make sure I didn't get lost in the thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still disagree that we should pass all unknown flags, but happy to discuss that separately and apply another PR if needed.
Yeah, it's harder to remove that functionality if someone ends up needing it for some esoteric triage. Better imho to go with an explicit enumerative set of flags and let the customers cut tickets if they have use cases instead of them just silently DIYing stuff unbeknownst to us. Still catching up on this though, so treat this as my bit-more-than-initial impression
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like (love?) the vision for a better future Ryan is illustrating and agree that in general this is an unfortunate place to be in, stuck between a bunch of suboptimal design decisions and context we as rotel can't fully control. I think these integration pains need to be addressed and agree that Ryan's concerns of software quality need to be better internalized in our culture and inter-team communication, and I wouldn't say this PR fixes the "root" cause for some of the aforementioned issues. The timeframe for addressing this is "the sooner the better".
All that said, I still think we should merge this as-is for now. To my understanding the interface and customer messaging won't change drastically from the implicit one we have today, and this change fixes bug(s) without (imho) introducing any design decisions that we can't easily walk back from, and overall improves the customer experience.
For that reason I'm inclined to ship as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approved so long as it's agreed that any design decisions introduced here won't be too hard to change or walk back in the future, which I believe to be the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved - will use this discussion for tech debt around this module.
This PR solves two problems:
=
are not removed from the list and cause failures downstream. For example,./bin/otelcol --config-dir=/tmp/conf ...
fails with:And
removeFlag
function is called from methods that are not expected to be mutating by their definition likeResolverURIs
.--help
flag output and get the following:It's not just incomplete but also provides a misleading description for
--config
flag: we don't support config schemas, so using any of the provided examples will fail. This change defines all the flags in our distro and makes the program use that definition in--help
output instead of relying on the collector core service. Experimental flags like--configd
,--config-dir
and--discovery
are hidden for now, but can be easily enabled. It provides the following output for--help
flag:Now we also explicitly pass all the flags that are handled by the collector core service instead of passing what we don't use. The reason for that is that there are no plans in the collector core to add any new flags. Most of the old flags were migrated to the configuration. And, even if something will be added, it potentially can be incompatible with our distro and introduce some broken flags to the
--help
output without us noticing.And, as an opportunistic simplification, this change replaces an unnecessary
Settings
interface with a struct.