Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move Output options in lib.Options #587

Open
na-- opened this issue Apr 19, 2018 · 8 comments
Open

Move Output options in lib.Options #587

na-- opened this issue Apr 19, 2018 · 8 comments
Labels
evaluation needed proposal needs to be validated or tested before fully implementing it in k6 refactor ux

Comments

@na--
Copy link
Member

na-- commented Apr 19, 2018

The configuration for different collectors is currently in the Config struct, which contains lib.Options. This allows collectors to be configured via environment variables or the global k6 config, but not by exporting the settings via the options object from scripts. The only exception to this is configuring the cloud collector via the lib.Options.External property (setting options['ext']['loadimpact'] in the script), which is suboptimal (configuring the same thing from 2 different places in 2 different ways, works for cloud collector but not influxdb or others in the future, etc.)

Moving the collector options to lib.Options would alleviate all of those issues. The lib.options.External configuration would still remain, but it will be used for other loadimpact cloud specific settings.

@na--
Copy link
Member Author

na-- commented Aug 27, 2018

Started on this last Friday, but then I realized there are some things we need to consider:

  • Since we support multiple output collectors now (allow specifying multiple collectors #624), we should make this collectors option an array rather than a map. Or if it's a map, the key shouldn't be the collector type, rather it should be some sort of user-defined ID.
  • That way users can specify multiple instance of a single collector type (2 influxdb servers or json files for example). That would be very useful when metric filtering is implemented, allowing users to send different metrics to different outputs. And it could also be very useful if we decide to expand the aggregation capabilities to different collectors and metric types, for example giving us the option to send aggregated data to one collector for long-term storage and unaggregated data to another temporary one that's purged regularly.
  • Also, we probably want to restrict some collectors (i.e. the cloud one) to just one instance. We can easily do that with a per-module counter so those collectors, giving an error when a second instance is created.
  • This would be good enough for configuring stuff from the CLI or from the exported in-script options. We can still try to do some sort of configuration with environment variables, for example K6_OUTPUT_1_TYPE=influxdb or something like that. And if we use a map with ids as key, something like K6_OUTPUT_MY_ID_KEY_TYPE=influxdb. The trouble with environment variables is that we probably shouldn't allow them to be used for creating new collectors, just for configuring already existing ones, for both security and usability.
  • And finally, what should the name for the whole collector/output config be? collectors or outputs or output? We're currently calling them outputs in the documentation and in the CLI flag, but use collectors in the code and as the config json key...

@na--
Copy link
Member Author

na-- commented Aug 27, 2018

Another thing that we should consider. Generally in k6, options with more precedence will override options with less (defaults < config file < exported script options < environment variables < command-line flags). For example, if we have defined export let options = { vus: 10 }; in the script, but run it with --vus 20, the script will run with 20 VUs. Even somewhat compound values like tags, systemTags and stages behave like that.

In the case of outputs/collectors or whatever we want to call it, it makes more sense to be able to specify them in the different configuration sources in a way that they don't (always) cancel each other, so different collectors could be configured in different places. That gives some weight to allowing them to be configured as a map with user-defined IDs:

export let options = {
  collectors: {
    mydebug: {
      type: "json",
      dest: "test.json",
      // ...
    },
    db1: {
      type: "influxdb",
      url: "https://localhost:8086",
      tagsAsFields: ["vu", "iter"],
      // ...
    },
    db2: {
      type: "influxdb",
      url: "http://someinfluxdbhost.lan:8086",
      tagsAsFields: ["vu", "iter", "url"],
      // ...
    },
  }.
};

I think that approach gives us the greatest amount of flexibility:

  • Different collectors can be specified from different config layers without interference. If however we want to override something, just use the same collector ID in the upper layer as you used in the lower one
  • Configuring stuff via environment variables becomes a usable feature again... For example, setting the password of the second DB in the above example could be done via the K6_COLLECTOR_DB2_PASSWORD env var. This would be especially useful for using k6 in containerized environments where secrets often are passed as env vars...
  • Backwards compatibility - currently the collectors in the config json are specified as a map, only it uses the collector type as a key. We can still support that - if there's no type key in the collector entry, we can try using its key in the collectors map as the type, probably while emitting a warning if it works or an error if it doesn't. That easy backwards compatibility is why I also slightly prefer the use of the collectors key as opposed to the outputs or something else.
  • Gives us more flexibility in the future if for some reason we have to reference collectors in some way. For example, metric filtering/aggregation/transformations/etc.

I'm not sure how easy it would be to specify collector IDs via the CLI flags, but we can probably allow users to elide IDs and reuse the type as an ID unless they they try to use the same collector type multiple times.

@robingustafsson
Copy link
Member

@na-- I think your second proposal allowing the outputs/collectors to be configured as a map with user-defined IDs sounds like a good way forward given the factors you list.

In terms of the naming, I usually refer to the "collectors" as "outputs" which I think is a bit easier to interpret than the former, so my vote would be on using "outputs" 😄...as you say it's also what we've used in the user-facing parts: the CLI (-o/--out) and the docs (the config file feels like less of an issue to me as we could deprecate its use and then support converting collectors -> outputs for a version or two before completely removing it).

@na--
Copy link
Member Author

na-- commented Aug 27, 2018

Hmm good point, the json config is probably way less used than the CLI flags, so outputs it is 😄

@na--
Copy link
Member Author

na-- commented Aug 28, 2018

Something else occurred to me while dealing with other stuff, so note to future self: since this refactoring is going to involve obsoleting a lot of collector config avenues:

  • options['ext']['loadimpact'] (only for cloud collector stuff, cloud execution options like name and distribution would still be there)
  • output environment variables without any ID prefix, like K6_INFLUXDB_USERNAME
  • the collectors map in the JSON config

... and since we'd still need to support those config avenues for a couple of k6 versions (while emitting a warning), we need some way to detect if a user actually uses any of them. That could be troublesome since those configs are mostly just plain Go structs that we unmarshal stuff into and call their Apply() method... Not sure if there's a better solution for checking if any of the obsolete config options are used, but my best idea for the moment is to create an empty struct of the particular type and use reflect.DeepEqual() to compare it with the unmarshalled one.

@na--
Copy link
Member Author

na-- commented Sep 21, 2018

During a discussion with @Griatch I noticed that there are some issues with handling environment variables that current are supposed to overwrite some cloud config options, i.e. K6_CLOUD_PROJECT_ID, so something else to look into. And another point that was raised in that discussion is that, even if the option overriding was working perfectly, we'd still have to inject those updated values back into options['ext']['loadimpact'] map, since there shouldn't be discrepancies between that map and the new outputs configuration ...

@na-- na-- added the evaluation needed proposal needs to be validated or tested before fully implementing it in k6 label Aug 27, 2019
@na-- na-- removed their assignment Aug 27, 2019
@na-- na-- removed this from the v1.0.0 milestone Aug 27, 2019
@mstoykov mstoykov changed the title Move collector options in lib.Options Move Output options in lib.Options Oct 2, 2023
@mstoykov
Copy link
Contributor

I have made a a very small PoC commit (after some particular problems) that makes this possible. And due to some other changes through out k6 over the years let a script user get output configuration which turns out to be a thing few users want.

import execution from "k6/execution";

export default function() {
	console.log(execution.test.options.cloud)
	console.log(execution.test.options.outputs["cloud"])
}

will output
image
which for example let you get the test run id and use it to correlate results with it.

There is plethora of problems:

  1. PushRefID ... might be better off as TestRunId
  2. unfortunately a bunch of the options are null which likely won't be as useful - so they should be fixed ( I needed to fix the PushRefID for example)
  3. this requires we reset the options one more time - not certain this is such a great idea - so this needs more investigation
  4. merging options for outputs also should be investigated as now them being settable in script adds one more layer.
  5. all of points in the previous discussions here.

But this do work with -o cloud as well.

@mstoykov
Copy link
Contributor

cc @grafana/k6-browser I know that you are currently somewhat hackishly accessing test run id - does this also work for you?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
evaluation needed proposal needs to be validated or tested before fully implementing it in k6 refactor ux
Projects
None yet
Development

No branches or pull requests

3 participants