You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Originally created by @pnadolny13 on 2022-01-20 19:40:37
I've had a couple use cases where I'd like to set different environment variables at different levels. For sensitive credentials I set them in the .env while non-sensitive environment specific settings (like schema_name) get set at the meltano.yml environment level.
The functionality I want is to be able to combine both sensitive and non-sensitive variables for use in a plugin as an environment variable. A use case is in the squared repo where we have an Athena connection string used in Great Expectations and Superset:
I want to be able to set the secret keys in my .env and the path and schema in my meltano.yml environment. Right now templating is supported in meltano.yml for config values using environment variables but for some plugins, especially utilities, configs arent useful and we need to be building environment variables that the utility can access. Templating doesnt seem to be supported at the env level underneath the environment key. So right now I can set secrets in the .env and non-sensitive variables at the environment.env level in meltano.yml but I have no way of blending the two to pass along to a plugin, so I end up building the connection string multiple times in the plugins vs defining it once.
Ideas: Add an env key to plugin configs that allows you to use templating to build a new environment variable at the plugin/environment level. Or...allow templating at the current env key level based on whats in your .env file.
Currently, env variables at different levels of the code cannot necessarily rely on each other or build upon each other.
Proposed solution:
We would define environment variable inheritance to be valid across these layers, in this strict order:
Level 1: terminal context (inherited either via .env or via shell env vars)
Level 2: top-level env: declarations within meltano.yml
Level 3: environment-level env: declarations.
Level 4: env: declarations within top-level plugin definitions.
Level 5: env: declarations within environment-scoped plugin definitions.
Example
This shows how inheritance would be expected to function across the 5 named levels. (Could be a unit test during development.)
.env
# Level 1: terminal context# Inherits from none
LEVEL_NUM="1"
STACKED="1"
env:
# Level 2: top-level `env:`# Inherits from terminal contextLEVEL_NUM: "2"# '2'STACKED: "${STACKED}2"# '12'plugins:
extractors:
tap-foobar:
env:
# Level 4: plugin-level `env:`# Inherits from a environment-level `env:` if an environment is active# Inherits directly from top-level `env:` if no environment is activeLEVEL_NUM: "4"# '4'STACKED: "${STACKED}4"# '1234'environments:
prod:
env:
# Level 3: environment-level `env:`# Inherits from top-level `env:`LEVEL_NUM: "3"# '3'STACKED: "${STACKED}3"# '123'plugins:
extractors:
tap-foobar:
env:
# Level 5: environment-level plugin `env:`# Inherits from (global) plugin-level `env:`LEVEL_NUM: "5"# '5'STACKED: "${STACKED}5"# '12345'
FAQ
Q. Why does the global plugin level config (level 4) inherit from the environment-level config?
A: This is counterintuitive at first but actually quite necessary.
Imagining that the plugin settings globally include a call to $SNOWFLAKE_USERNAME and imagine that each environment has its own value for SNOWFLAKE_USERNAME: ___ in the respective env:. We don't want each environment to have to redundantly declare the plugin properties but we do want to use the $SNOWFLAKE_USERNAME value that is most appropriate from the given execution context. Hence, the global-level plugin can be defined once to leverage whatever value is in $SNOWFLAKE_USERNAME and the plugin's configuration will still be seeded properly according to which environment is active.
Providing level info for research/debugging
Perhaps in the dry-run proposal in #3146, it would be good to print where each environment variables are sourced. We will need to do this without exposing sensitive values.
So, if the example above were real, we might get debug info such as:
meltano run --dry tap-foobar target-snowflake
Printing debug info in response to `--dry` CLI argument. No actions will be executed.
The plugin `tap-foobar` was declared with the following context:
- env:LEVEL_NUM - derived from the following, in order of precedence:
- environments[prod].plugins[tap-foobar].env[LEVEL_NUM]
- plugins[tap-foobar].env[LEVEL_NUM]
- environments[prod].env[LEVEL_NUM]
- env[LEVEL_NUM]
- shell.env[LEVEL_NUM]
...
Or perhaps a more realistic example:
meltano run --dry tap-foobar target-snowflake
Printing debug info in response to `--dry` CLI argument. No actions will be executed.
The plugin `tap-foobar` was declared with the following context:
- 'tap-foobar.config[username]' was derived from the following, in order of precedence:
- environments[prod].env[FOOBAR_USERNAME]
- env[FOOBAR_USERNAME]
...
In the above example, the 'prod' declaration of FOOBAR_USERNAME clobbers the global one. The user can infer from this printout that their prod-level config is working correctly, and if they were to remove the prod-level config, they'd get the value of the top-level env: declaration.
Related Topics
More discussion on this can and probably should be moved to a follow on issue, but for completeness:
Further detail and settings-level env: declarations (level 6)
Not listed above, but a plugin setting may have env: declared, which operates differently from the other layers. This plugin.settings[].env: declaration simply accepts a string as env variable name, and the specified setting value will get injected and sent into the plugin's subprocess.
This functionality is required in order to have plugins behave as expected. Non-singer plugins don't receive a config.json so the env variable injection is the substitute.
I think the env: function at the setting layer should always be a pure child process injection (clobbering any existing env var but not reading from it).
We should probably stop this layer 6 from also sourcing data from layers 1-5, and instead make it a pure one-way injection into the plugin context. As described in #3447
I do not know if the current status quo for an existing env variable value will clobber an explicitly defined config value, or if the config value would clobber the env variable value, essentially replacing the env var value within the plugin's child process.
The text was updated successfully, but these errors were encountered:
Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/3173
Originally created by @pnadolny13 on 2022-01-20 19:40:37
I've had a couple use cases where I'd like to set different environment variables at different levels. For sensitive credentials I set them in the .env while non-sensitive environment specific settings (like schema_name) get set at the meltano.yml environment level.
The functionality I want is to be able to combine both sensitive and non-sensitive variables for use in a plugin as an environment variable. A use case is in the squared repo where we have an Athena connection string used in Great Expectations and Superset:
awsathena+rest://[YOUR_AWS_KEY]:[YOUR_AWS_SECRET]@athena.us-east-2.amazonaws.com/[YOUR_SCHEMA_NAME]?s3_staging_dir=s3://[YOUR_S3_PATH]
I want to be able to set the secret keys in my .env and the path and schema in my meltano.yml environment. Right now templating is supported in meltano.yml for config values using environment variables but for some plugins, especially utilities, configs arent useful and we need to be building environment variables that the utility can access. Templating doesnt seem to be supported at the
env
level underneath theenvironment
key. So right now I can set secrets in the .env and non-sensitive variables at theenvironment.env
level in meltano.yml but I have no way of blending the two to pass along to a plugin, so I end up building the connection string multiple times in the plugins vs defining it once.Ideas: Add an
env
key to plugin configs that allows you to use templating to build a new environment variable at the plugin/environment level. Or...allow templating at the currentenv
key level based on whats in your .env file.Update (2022-05-05)
Scope of this issue is now to:
Update (2022-05-03)
Relevant Miro Board where we're whiteboarding all of this
Spec Update (2022-04-08)
Note: this updated spec follows from my comment here: https://gitlab.com/meltano/meltano/-/issues/3173#note_817409586
Problem statement:
Currently, env variables at different levels of the code cannot necessarily rely on each other or build upon each other.
Proposed solution:
We would define environment variable inheritance to be valid across these layers, in this strict order:
.env
or via shell env vars)env:
declarations withinmeltano.yml
env:
declarations.env:
declarations within top-level plugin definitions.env:
declarations within environment-scoped plugin definitions.Example
This shows how inheritance would be expected to function across the 5 named levels. (Could be a unit test during development.)
.env
FAQ
Q. Why does the global plugin level config (level 4) inherit from the environment-level config?
A: This is counterintuitive at first but actually quite necessary.
Imagining that the plugin settings globally include a call to
$SNOWFLAKE_USERNAME
and imagine that each environment has its own value forSNOWFLAKE_USERNAME: ___
in the respectiveenv:
. We don't want each environment to have to redundantly declare the plugin properties but we do want to use the$SNOWFLAKE_USERNAME
value that is most appropriate from the given execution context. Hence, the global-level plugin can be defined once to leverage whatever value is in$SNOWFLAKE_USERNAME
and the plugin's configuration will still be seeded properly according to which environment is active.Providing level info for research/debugging
Perhaps in the
dry-run
proposal in #3146, it would be good to print where each environment variables are sourced. We will need to do this without exposing sensitive values.So, if the example above were real, we might get debug info such as:
Or perhaps a more realistic example:
In the above example, the 'prod' declaration of
FOOBAR_USERNAME
clobbers the global one. The user can infer from this printout that theirprod
-level config is working correctly, and if they were to remove the prod-level config, they'd get the value of the top-levelenv:
declaration.Related Topics
More discussion on this can and probably should be moved to a follow on issue, but for completeness:
Further detail and settings-level
env:
declarations (level 6
)Not listed above, but a plugin setting may have
env:
declared, which operates differently from the other layers. Thisplugin.settings[].env:
declaration simply accepts a string as env variable name, and the specified setting value will get injected and sent into the plugin's subprocess.config.json
so the env variable injection is the substitute.env:
function at the setting layer should always be a pure child process injection (clobbering any existing env var but not reading from it).env
variable value will clobber an explicitly defined config value, or if the config value would clobber the env variable value, essentially replacing the env var value within the plugin's child process.The text was updated successfully, but these errors were encountered: