Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request]: An improved method to resolve variable expressions #3454

Open
mattcasters opened this issue Nov 24, 2023 · 18 comments
Open
Assignees
Labels
API new feature P2 Default Priority

Comments

@mattcasters
Copy link
Contributor

What would you like to happen?

Right now, if we have a variable expression, for example ${MY_VARIABLE}, any value that we have on the books for MY_VARIABLE in the local IVariables instance will be substituted and returned during IVariables.resolve().

The API that is used StringUtil.substitute() already supports recursive or nested variable resolution. However, this method is not used anywhere right now.

It would be great if we could give Hop users a way to signal that recursive or nested substitution of variables should take place. Some suggestions come to mind:

  • A new way to specify variables, for example with round brackes $(MY_VARIABLE)
  • A prefix system to the variables: ${NESTED(2):MY_VARIABLE}, for example to resolve the variable 2 levels deep.

If we think about the second prefixing system we can even think of a pluggable, configurable, system which would allow us to do more than just nested variables. For example we could build a ${ENCODED:DATABASE_USER}, for example if you want to obfuscate or encrypt the username of a database (not just its password) in an environment properties file.
We could combine prefixes, for example: ${ENCODED(EAS2):NESTED(1):MY_VARIABLE}.
Other ideas are ways to get values from a secrets vault or resolve a value from a certain web-service.

The counter argument to all this is that it's a very cryptic way of doing things. Even as I'm typing this I'm not liking it.
Perhaps we can think of a pluggable system attached to the lifecycle environment in which we're allowing these options for variables. That way we can build proper user interfaces reflected in proper configuration files and metadata.

Issue Priority

Priority: 3

Issue Component

Component: API

@hansva
Copy link
Contributor

hansva commented Nov 24, 2023

+1 on prefixing, at least for the provider (AES/ENCRYPTED/MYVAULT).

As for the resolving of nested variables, I lean towards making the default way to handle variables and keep the syntax simple/optional. I see something like this working (just thinking out loud):

${MY_VARIALBE} -> full nested lookup
${::MY_VARIALBE} -> full nested lookup -> level 0
${:1:MY_VARIALBE} -> full nested lookup -> level 1

${AES:MY_VARIABLE} -> full nested with AES provider
${AES::MY_VARIABLE} -> level 0
${AES:1:MY_VARIABLE} -> level 1 deep

@mattcasters
Copy link
Contributor Author

The counter argument that I saw was that in development you might just want to use clear-text passwords and in production secrets are stored in a vault. Mind you, I'm not opposed to the prefixes. I just think that having this 'symbolic link' functionality could be cool to have as well.

@mattcasters
Copy link
Contributor Author

mattcasters commented Nov 24, 2023

For any environment you're in it would be cool to have a 'symbolic link' feature for variables.
You would be able to map a variable name to one or more operations or even the execution of a pipeline.

  • Nested resolution
  • Decode/decrypt a value
  • Look up in secrets vault
  • Execute a Hop pipeline to retrieve a value (ex: get a token from EntraID to use in the whole environment).
  • ...

@hansva
Copy link
Contributor

hansva commented Nov 24, 2023

If we have variable nesting working your variable in the pipeline/database metadata can be ${MY_VARIABLE} and in the environment in development it would be "my cleartext password" in production it could be ${vault:secret}. This way you can hit both birds with one stone :)

@mattcasters
Copy link
Contributor Author

Yes indeed. One the one hand you want some sane defaults, nothing too much to configure, just to get some basic functionality like nested resolution. On the other hand we'll have users who need a bit of extra hand-holding to configure vault access or which pipeline to execute to get a value.
Is there a risk to enabling nested resolution one level deep by default, besides performance?
Perhaps we can have a system variable like HOP_VARIABLE_RESOLUTION_DEPTH with default set to 0 (used now).

@mattcasters
Copy link
Contributor Author

Question remains that we need a place to store the environment configuration metadata.
The project metadata isn't really a suited location for it. We probably need a separate metadata location, alongside the environment configuration JSON file(s).

@sramazzina
Copy link
Contributor

Sorry for the stupid question but what do we intend by variable's depth? That is not clear to me

@hansva
Copy link
Contributor

hansva commented Nov 24, 2023

for example you can have a variable that is ${MY_VARIABLE} but that variable could be configured in the environment with a value of ${MY_OTHER_VARIABLE} which could then be specified in hop-config as ${HOP_CONFIG_FOLDER} so then it goed 2 variables deep

@sramazzina
Copy link
Contributor

Cristal clear thanks for the explanation

@mattcasters
Copy link
Contributor Author

I was thinking about a simple generic plugin type to do a value lookup. The lookup could apply to one or more types (variables in our case) and have to be ability to return either a single value or a list.
We could use it in various places to give back all sorts of values, in the GUI mostly, but also in things like variable replacement.

@vdwals
Copy link
Contributor

vdwals commented Nov 27, 2023

Sharing my perspective as a devoted user with a deep appreciation for your work on Hop:

  1. The notion of utilizing a keyword for decrypting variables resonates with me. Presently, the support for variable decryption seems a bit arbitrary, introducing uncertainty about which values will undergo decryption.

  2. In a recent project that involved recursive variable replacement Olymp - SnippetService, I applied a methodology where the replacement process persists until no more variables can be identified in the string. This recursive approach naturally extends to the deepest level by default. This leads me to consider whether it is essential to introduce a specific keyword for nesting in Hop, or if the tool could seamlessly replace variables as long as they persist in the result (provided that the result is not identical to the input to prevent infinite loop, of course).

for example you can have a variable that is ${MY_VARIABLE} but that variable could be configured in the environment with a value of ${MY_OTHER_VARIABLE} which could then be specified in hop-config as ${HOP_CONFIG_FOLDER} so then it goed 2 variables deep

@mattcasters
Copy link
Contributor Author

Great points Dennis. I guess I'm more concerned with backward compatibility. I'm thinking specifically about the people that built around the current limitations of variable replacement.
Depth indeed doesn't mean all that much. A limit was put on it to avoid some infinite loop in the current code. That itself is asinine since you can simply see that the output is the same as the input and just give up right there and then.

@mattcasters
Copy link
Contributor Author

So the plot thickens. I was writing a few extra unit tests for StringUtil, the place where the action is taking place.
It turns out we already do recursive variable replacement.

@hansva
Copy link
Contributor

hansva commented Nov 29, 2023

Then in this case it might be an error I introduced in the PROJECT_HOME extension point. I will take a look

@mattcasters
Copy link
Contributor Author

That doesn't mean the other ideas aren't valid. Working with a prefix does show the intent of the variable replacement clearly. ${AES:SOME_SECRET} is pretty clear once you know what's going on.

@mattcasters
Copy link
Contributor Author

The idea to have a new 'Lookup' plugin type in core is meant to allow all plugins to contribute and use without too many class loader issues. It also allows us to integrate with metadata.
For example, we might have the "Projects" plugin that knows about the way to access a secrets vault, stored in some metadata folder somewhere. It would expose a lookup plugin with prefix VAULT. The StringUtil code would simply need to look in the core lookup plugins which lookup this represents. We wouldn't have to refactor thousands of resolve() calls across the codebase.

@usbrandon
Copy link
Contributor

Does this mean if we are setting a variable in a transform "Set Variables", if we put ${AES:MY_STRING} that the step would interpret that to be "Use AES to encrypt the string when setting the value from the stream." If on the other hand we were using a Get Variables transform and used ${AES:MY_STRING}, should we expect clear text to come out from an encrypted value? Is this just metadata? (encrypted_yn, and encryption_used = 'AES')?

@mattcasters
Copy link
Contributor Author

mattcasters commented Dec 7, 2023

@usbrandon Variable expressions are basically used to look up a value somewhere. Setting a value in a Hop context can either be a clear text value or indeed another variable expression like the one you mentioned. No encoding or decoding is done at the time of setting the variable, only when the variable expression is decoded and the actual value is looked up. The results of the lookup are in most cases indeed a clear text value since that's what the transform, action or metadata element needs at runtime.
I guess the main point of keeping the expression intact as long as possible is to avoid keeping the clear text secret in memory , to have it available only locally where needed, and not in the whole Hop runtime context. When the secret is needed a lookup is done in a config file, an asymmetric decryption is done, a secrets vault is queried, a token is requested somewhere, and so on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API new feature P2 Default Priority
Projects
None yet
Development

No branches or pull requests

5 participants