Rework command line arguments parsing #2343

dmitryax · 2022-12-11T03:50:01Z

This PR solves two problems:

In the current implementation, command flags specific to the Splunk distro are dropped from the arguments list passed to the collector core service by their names. It means that all the flag values are unintentionally passed downstream. Also, flags with = are not removed from the list and cause failures downstream. For example, ./bin/otelcol --config-dir=/tmp/conf ... fails with:

2022/12/10 19:15:52 main.go:106: application run finished with error: unknown flag: --config-dir

And removeFlag function is called from methods that are not expected to be mutating by their definition like ResolverURIs.

Currently, we rely on the downstream collector core service to display --help flag output and get the following:

Usage:
  otelcol [flags]

Flags:
      --config --config=file:/path/to/first --config=file:path/to/second   Locations to the config file(s), note that only a single location can be set per flag entry e.g. --config=file:/path/to/first --config=file:path/to/second. (default [])
      --feature-gates Flag                                                 Comma-delimited list of feature gate identifiers. Prefix with '-' to disable the feature. '+' or no prefix will enable the feature.
  -h, --help                                                               help for otelcol
      --set func                                                           Set arbitrary component config property. The component has to be defined in the config file and the flag has a higher precedence. Array config properties are overridden and maps are joined. Example --set=processors.batch.timeout=2s
  -v, --version                                                            version for otelcol

It's not just incomplete but also provides a misleading description for --config flag: we don't support config schemas, so using any of the provided examples will fail. This change defines all the flags in our distro and makes the program use that definition in --help output instead of relying on the collector core service. Experimental flags like --configd, --config-dir and --discovery are hidden for now, but can be easily enabled. It provides the following output for --help flag:

Usage of otelcol:
      --config string          Locations to the config file(s), note that only a single location can be set per flag entry e.g. --config=/path/to/first --config=path/to/second. (default "[]")
      --dry-run                Don't run the service, just show the configuration
      --feature-gates string   Comma-delimited list of feature gate identifiers. Prefix with '-' to disable the feature. '+' or no prefix will enable the feature. (default "[]")
      --no-convert-config      Do not translate old configurations to the new format automatically. By default, old configurations are translated to the new format for backward compatibility.
      --set string             Set arbitrary component config property. The component has to be defined in the config file and the flag has a higher precedence. Array config properties are overridden and maps are joined. Example --set=processors.batch.timeout=2s (default "[]")
  -v, --version                Version of the collector.

Now we also explicitly pass all the flags that are handled by the collector core service instead of passing what we don't use. The reason for that is that there are no plans in the collector core to add any new flags. Most of the old flags were migrated to the configuration. And, even if something will be added, it potentially can be incompatible with our distro and introduce some broken flags to the --help output without us noticing.

And, as an opportunistic simplification, this change replaces an unnecessary Settings interface with a struct.

This change solves two problems: 1. In the current implementation, command flags specific to the Splunk distro are dropped from the arguments list passed to the collector core service by their names. It means that all the flag values are unintentionally passed downstream. Also, flags with `=` are not removed from the list and cause failures downstream. For example, `./bin/otelcol --config-dir=/tmp/conf ...` fails with: ``` 2022/12/10 19:15:52 main.go:106: application run finished with error: unknown flag: --config-dir ``` And `removeFlag` function is called from methods that are not expected to be mutating by their definition like `ResolverURIs`. 2. Currently, we rely on the downstream collector core service to display `--help` flag output and get the following: ``` Usage: otelcol [flags] Flags: --config --config=file:/path/to/first --config=file:path/to/second Locations to the config file(s), note that only a single location can be set per flag entry e.g. --config=file:/path/to/first --config=file:path/to/second. (default []) --feature-gates Flag Comma-delimited list of feature gate identifiers. Prefix with '-' to disable the feature. '+' or no prefix will enable the feature. -h, --help help for otelcol --set func Set arbitrary component config property. The component has to be defined in the config file and the flag has a higher precedence. Array config properties are overridden and maps are joined. Example --set=processors.batch.timeout=2s -v, --version version for otelcol ``` It's not just incomplete but also provides a misleading description for `--config` flag: we don't support config schemas, so using any of the provided examples will fail. This change defines all the flags in our distro and makes the program use that definition in `--help` output instead of relying on the collector core service. Experimental flags like `--configd`, `--config-dir` and `--discovery` are hidden for now, but can be easily enabled. It provides the following output for `--help` flag: ``` Usage of otelcol: --config string Locations to the config file(s), note that only a single location can be set per flag entry e.g. --config=/path/to/first --config=path/to/second. (default "[]") --dry-run Don't run the service, just show the configuration --feature-gates string Comma-delimited list of feature gate identifiers. Prefix with '-' to disable the feature. '+' or no prefix will enable the feature. (default "[]") --no-convert-config Do not translate old configurations to the new format automatically. By default, old configurations are translated to the new format for backward compatibility. --set string Set arbitrary component config property. The component has to be defined in the config file and the flag has a higher precedence. Array config properties are overridden and maps are joined. Example --set=processors.batch.timeout=2s (default "[]") -v, --version Version of the collector. ``` Now we also explicitly pass all the flags that are handled by the collector core service instead of passing what we don't use. Reason for that is that there are no plans in the collector core to add any new flags. Most of the old flags were migrated to the configuration. And, even if something will be added, it potentially can be incompatible with our distro and introduce some broken flags to the `--help` output without us noticing. And, as an opportunistic simplification, this change replaces an unnecessary `Settings` interface with a struct.

rmfitzpatrick · 2022-12-12T22:36:48Z

internal/settings/settings.go

@@ -60,77 +57,58 @@ const (
 	ConfigDScheme       = "splunk.configd"
 )

-type Settings interface {


And, as an opportunistic simplification, this change replaces an unnecessary Settings interface with a struct.

The aim of this interface was to help clarify the intended consumer usage or available settings while hiding all the underlying flag/state management that was scattered throughout main.go. What are we gaining by using the exported internal struct directly?

Any IDE or editor can show all available methods and point you right to the implementation, which is much better than pointing to an interface declaration that force you to look around for the actual implementation after that. Also, all the methods will be listed the same way as they are in the go doc https://pkg.go.dev/github.com/signalfx/splunk-otel-collector@v0.66.0/internal/settings#Settings. So would disagree that using interfaces for one struct makes developer live any better.

It's more for maintainability and trying to establish a clear convention for how future settings should be added, and nothing to do w/ IDEs. Given these changes don't deviate too far from what was there I think it's ok but may be something to revisit as we add future options.

It's more for maintainability

IMHO adding unnecessary interfaces reduces maintainability

establish a clear convention for how future settings should be added

I think it's rather an unnecessarily complicated convention for such a simple interface. Not sure if there is even a reason to expose them as methods... If they are all just getters using data from the struct that was set during initialization, we can even expose them as precalculated struct fields. That will put them all together in one place as it was with the interface. And it'll help to avoid extra work if they would be used more than once in the future. What do you think?

Not to get too caught up in a dispute of personal preference, but they were interfaces because interfaces are collections of methods and not fields. The accessors do the computing and return the setting value, but could definitely be improved. They were exposed via the settings interface to encapsulate the ~legacy flag helpers while migrating to a central setting store to prevent breaking changes.

Processing them w/ initial parsing stages in New() and using exported fields is acceptable to me and would address their idempotence issues.

Processing them w/ initial parsing stages in New() and using exported fields is acceptable to me and would address their idempotence issues.

Sounds good, I was thinking about doing that but didn't want to go too far. I can do it in this or separate PR

I'd say let's cut an issue for

Processing them w/ initial parsing stages in New() and using exported fields is acceptable to me and would address their idempotence issues.

and keep it as the struct version for now.

rmfitzpatrick · 2022-12-13T14:41:38Z

internal/settings/settings.go

-	cpArgs = append(cpArgs, args...)
+// parseArgs returns new Settings instance from command line arguments.
+func parseArgs(args []string) (*Settings, error) {
+	flagSet := flag.NewFlagSet("otelcol", flag.ContinueOnError)


As commented the aim here is to not have to track or manage upstream options in our init logic unless we are using them directly. Why is this no longer desired?

The reason for that is that there are no plans in the collector core to add any new flags.

Do you have an issue for this?

it potentially can be incompatible with our distro and introduce some broken flags to the --help output without us noticing.

Do you mind clarifying why this expectation is something to avoid and why it's better to not inherit* features as upstream provides them?

Do you have an issue for this?

No

Do you mind clarifying why this expectation is something to avoid and why it's better to not inherit* features as upstream provides them?

Because additional flags introduce some optional functionality. And that functionality potentially can be incompatible with our distro, which will lead to an additional broken flag in --help output without us noticing. If a new additional flag in core, for whatever reason, is required, we will catch it during the upgrade of the core dependency

If we still want to define all known flags (including core's) in our distro but pass everything that we don't know to the core just to potentially catch a new flag added to the core without documenting it, we can do that. But it will require manual argument parsing. I don't think it is worth it. I'd rather keep an eye on new flags added to the core (which is very unlikely to happen any time soon) and add them properly along with documentation in our distro, the same way as we do for components.

I'm still under the impression we probably don't want to define any flags we aren't using and the current problems we are running into are longstanding implementation defects. Are we not already manually parsing args atm and in your changes? I can see an inverse flagSetToArgs() that removes our distro-only flags and values as a major improvement to the current removeFlag().

maybe it makes sense to fix the config parsing and other issues alone since we aren't in alignment on how to move forward otherwise.

Fixing each issue alone requires adding more hacks on top of what we have. I tried to come up with one set of related changes that doesn't require doing that.

For more context Dmitrii, are you saying that not doing the opportunistic change regarding Settings makes it more difficult to deliver/fix the two problems mentioned in the issue body?

There's a lot of "issues" and decision points to be made so I want to be 100% certain on which specific issues and challenges are meant by "fixing each issue alone" and "one set of related changes" and "doing that". Want to make sure I didn't get lost in the thread.

I still disagree that we should pass all unknown flags, but happy to discuss that separately and apply another PR if needed.

Yeah, it's harder to remove that functionality if someone ends up needing it for some esoteric triage. Better imho to go with an explicit enumerative set of flags and let the customers cut tickets if they have use cases instead of them just silently DIYing stuff unbeknownst to us. Still catching up on this though, so treat this as my bit-more-than-initial impression

I like (love?) the vision for a better future Ryan is illustrating and agree that in general this is an unfortunate place to be in, stuck between a bunch of suboptimal design decisions and context we as rotel can't fully control. I think these integration pains need to be addressed and agree that Ryan's concerns of software quality need to be better internalized in our culture and inter-team communication, and I wouldn't say this PR fixes the "root" cause for some of the aforementioned issues. The timeframe for addressing this is "the sooner the better".

All that said, I still think we should merge this as-is for now. To my understanding the interface and customer messaging won't change drastically from the implicit one we have today, and this change fixes bug(s) without (imho) introducing any design decisions that we can't easily walk back from, and overall improves the customer experience.

For that reason I'm inclined to ship as-is.

hughesjj

approved so long as it's agreed that any design decisions introduced here won't be too hard to change or walk back in the future, which I believe to be the case.

atoulme

Approved - will use this discussion for tech debt around this module.

dmitryax requested review from a team as code owners December 11, 2022 03:50

dmitryax force-pushed the exclude-flags-explicitly branch from c294416 to 199a7c4 Compare December 11, 2022 04:08

rmfitzpatrick reviewed Dec 12, 2022

View reviewed changes

rmfitzpatrick reviewed Dec 13, 2022

View reviewed changes

hughesjj reviewed Dec 14, 2022

View reviewed changes

hughesjj approved these changes Dec 14, 2022

View reviewed changes

atoulme approved these changes Dec 14, 2022

View reviewed changes

dmitryax merged commit 3b9164c into main Dec 15, 2022

delete-merged-branch bot deleted the exclude-flags-explicitly branch December 15, 2022 00:07

rmfitzpatrick mentioned this pull request Jan 18, 2024

support core service validate command #4175

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rework command line arguments parsing #2343

Rework command line arguments parsing #2343

dmitryax commented Dec 11, 2022

rmfitzpatrick Dec 12, 2022

dmitryax Dec 12, 2022

rmfitzpatrick Dec 13, 2022

dmitryax Dec 13, 2022

rmfitzpatrick Dec 13, 2022

dmitryax Dec 13, 2022

hughesjj Dec 14, 2022

rmfitzpatrick Dec 13, 2022

rmfitzpatrick Dec 13, 2022 •

edited

dmitryax Dec 13, 2022 •

edited

dmitryax Dec 13, 2022

rmfitzpatrick Dec 13, 2022

rmfitzpatrick Dec 14, 2022

dmitryax Dec 14, 2022

hughesjj Dec 14, 2022

hughesjj Dec 14, 2022

hughesjj Dec 14, 2022

hughesjj left a comment

atoulme left a comment

Rework command line arguments parsing #2343

Rework command line arguments parsing #2343

Conversation

dmitryax commented Dec 11, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rmfitzpatrick Dec 13, 2022 • edited

Choose a reason for hiding this comment

dmitryax Dec 13, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hughesjj left a comment

Choose a reason for hiding this comment

atoulme left a comment

Choose a reason for hiding this comment

rmfitzpatrick Dec 13, 2022 •

edited

dmitryax Dec 13, 2022 •

edited