-
Notifications
You must be signed in to change notification settings - Fork 161
Replacing jsonpath with jq and adding 'expressionLang' top level property #249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -286,12 +286,12 @@ Our expression function definitions can now be referenced by workflow states whe | |||
"dataConditions": [ | |||
{ | |||
"name": "Applicant is adult", | |||
"condition": "{{ fn(isAdult) }}", | |||
"condition": "${ fn:isAdult }", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intended to be a function call to an isAdult expression function declared in the function definitions?
In jq, I think a function call would look like isAdult(.)
, should we support something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, we need a language specific mapping to tell parsers/validators that its a call to an expression functions so they substitute its "operation" value that includes the actual expression that needs to be evaluated. "fn" is just a namespace in this case.
the data the expression is evaluated against depends on if its used inside a certain filter, in a state property or in a top-level workflow property.
I will look at possibility to pass in the input specifically, for example "fn:isAdult(.customer)". Will test it out and do that in a separate pr if possible. Right now its assumed its always "fn:expressionFunctionName(.)" but users dont have to do the "(.)" part - is assumed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what bothers me about this is we need to inject special behavior into the middle of the jq expressions in order to call these functions, it's no longer JQ at this point but some strange hybrid language. This puts a burden on users of the workflow language because they have to learn this new expression language special to calling workflow functions, and I worry that it actually makes the runtimes more difficult to implement.
It seems like if we have the expectation that all of the workflows expression 'functions' are automatically exposed as jq functions by the runtime, then we shouldn't need a special syntax to tell the runtime to 'substitute' the call. The caveat would be that, if some of the expression functions aren't jq then they would not be accessible from jq expressions (which imho, is a reasonable approach).
Given the example above, it seems like the runtime under-the-covers could define the expression functions as jq functions before evaluating a condition.
Given this alternative way to define this line:
"condition": "${isAdult(.)}"
Then runtime could do something functionally equivalent to the follow to evaluate the expression:
def isAdult: .applicant | .age >= 18;
def isMinor: in.applicant | .age < 18;
isAdult(.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ugh sorry i went ahead and merged before realizing to see this comment
i will open an issue on this now and work on it for next week
or a reference to a defined [expression function definition](#Using-Functions-For-Expression-Evaluation). | ||
|
||
To reference a defined [expression function definition](#Using-Functions-For-Expression-Evaluation) | ||
the expression must have the following format, for example: | ||
|
||
```text | ||
{{ fn(myExprName) }} | ||
${ fn:myExprFuncName } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use jq's function call notation instead? ie; it seems like the runtime could expose expression functions as jq functions which can be accessed by jq expressions in the workflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i dont think its something we should enforce (exposing function defs with type "expression" as jq functions). Some runtimes just might want to just substitute the operation value of the function def that includes the expression and just eval it.
specification.md
Outdated
@@ -1942,7 +1960,7 @@ If events defined in event-based conditions do not arrive before the states `eve | |||
| Parameter | Description | Type | Required | | |||
| --- | --- | --- | --- | | |||
| name | Data condition name | string | no | | |||
| condition | JsonPath expression evaluated against state data. True if results are not empty | string | yes | | |||
| condition | jq expression evaluated against state data. Must evaluate to true or false | string | yes | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the jq docs:
false and null are considered "false values", and anything else is a "true value".
Should this follow that standard? If not, what should happen to a workflow where some non-boolean value is returned by the filter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I think it would make more sense then to say
"jq expression evaluated against state data. Condition is considered as true if the result of the expression is true. If the result of the expression is anything else, the condition is considered false."
specification.md
Outdated
The `inputCollection` property is a JsonPath expression which selects an array in the states data. All iterations | ||
are done against data elements of this array. This array must exist. | ||
The `inputCollection` property is a workflow expression which selects an array in the states data. All iterations | ||
are performed against data elements of this array. This array must exist. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the workflow fail or throw an exception if the array does not exist? Or can we say the runtime should assume an empty array if the inputCollection would return a non-array value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if inputcollection is empty or does not exist there is nothing to loop over. i think this should be a runtime error and should be able to be handled in the onErrors def. will update to add that part
@jorgenj @ricardozanini |
Adding such |
not sure extension is needed as all expressions are inside workflow definition. we had this before and removed it to try to enforce portability as expression syntax has affects on portability..but it seems that pushing a single language only does significantly lower the adoption possibilities, as many runtimes have hard requirements to use a specific syntax for expressiosn which might not our "default" one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not actually asking for changes, but a couple of questions on the PR to continue the discussion
@@ -319,17 +319,28 @@ Note that different data filters play a big role as to which parts of the states | |||
evaluated. Reference the | |||
[State Data Filtering](#State-Data-Filtering) section for more information about state data filters. | |||
|
|||
All expressions must follow the [jq](https://stedolan.github.io/jq/) syntax. Jq is a very powerful JSON processor. You | |||
can find more information in the [jq manual](https://stedolan.github.io/jq/manual/). | |||
By default, all workflow expressions should be defined using the [jq](https://stedolan.github.io/jq/) syntax. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking for this sentence in the call for the discussion around whether we enforce jq support (MUST) or just recommend it (SHOULD). This can stay as is then.
Serverless Workflow does not mandate the use of jq and it's possible to use an expression language | ||
of your choice with the restriction that a single one must be used for all expressions | ||
in a workflow definition. If a different expression language needs to be used, make sure to set the workflow | ||
`expressionLang` property to identify it to runtime implementations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call on 15/02: Do we need to distinguish transformationLang from conditionLang?
jq can do transformation (outputPath, etc.), but when we looked at JSONPath and JSONPatch, path was able to list found elements in an array while JSONPatch was able to make modifications (add/replace/remove/move/copy/...) and we didn't discuss languages that are just for writing conditions, but there are some. We need conditions for transitions, switch statements, etc and we need transformations for input/output path. If I specify expressionLang to be SQL, I wouldn't be able to do any transformations. And for conditions, it needs to be clear how to interpret the result (e.g. if an empty/nil result means false or if it's an error).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we can tell users which parts of jq to use at what places, not sure how to honestly enforce that (even on the runtime level).
a selection in case of our data filters is really also a modification, for example a state filter data output becomes the final output of that state, so by selection you are really modifying the state data that for example is passed to the next state.
i think for conditions we should just say they are boolean conditions and leave it at that
Where `expression` can be either an in-line expression, or a reference to a | ||
defined [expression function definition](#Using-Functions-For-Expression-Evaluation). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to add how to escape in case "${ notanexpression }"
is supposed to not be an expression, but a string. Suggesting to escape the dollar sign \$
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative could be to use an object instead of a string, i.e. leave string as string and have a type StringOrExpression
for e.g. operation parameters, that allows either a string or an object with the parameter expression
and an optional language:
parameters:
firstParam: "StaticText"
secondParam:
expression: "${ .message }"
thirdParam:
language: "JSONPath"
expression: "{{ .message }}"
or in designing the object, we could allow the language to be the parameter name:
parameters:
firstParam: "StaticText"
secondParam:
jq: "${ .message }"
thirdParam:
jsonpath: "{{ .message }}"
I'd like the last one most. But feel free to drop it now and we can add this in a follow-up PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in order to model the second scenario we would have to have a defined enum of "possible" expression languages in our schema. don't think we really wanna go that direction as we will never be able to keep up with it and support all adopters
i would like to see the case where atm defining more than one expression language would be needed. i think best maybe is for now to have the top-level expressionLang and to keep enforcing a single expression lang throughout the workflow definition until we have someone bring up a good use case where multiple ones would be needed. wdyt
@@ -1960,7 +1976,7 @@ If events defined in event-based conditions do not arrive before the states `eve | |||
| Parameter | Description | Type | Required | | |||
| --- | --- | --- | --- | | |||
| name | Data condition name | string | no | | |||
| condition | jq expression evaluated against state data. Must evaluate to true or false | string | yes | | |||
| [condition](#Workflow-Expressions) | Workflow expression evaluated against state data. Must evaluate to true or false | string | yes | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For conditions, we need to have bindings on expression languages, meaning we need to specify how to use them. Any third-party extension is fine, but we need to say for jq that we want the result of the expression to be either JSON true
or JSON false
and that nil causes a runtime error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also feel free to postpone this in an issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jq can return true/false i would assume that anything else than true could be interpreted as false. so imo null/nil would be false in evaluation. i think other types of exceptions during expression evaluation should also just be handled as false in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the JQ docs:
false and null are considered "false values", and anything else is a "true value".
It seems like this would be really confusing for users of the workflow language to have to bridge that gap. For example, they'd have to take extra special care for a pretty common/basic scenario like the following:
Input data:
{ "aNullableAgeField": null }
Given input like the above, a workflow might want to do something special if the user is a minor (or didn't provide their age). The natural way to specify this in JQ would be:
"functions": [
{
"name": "isMinorOrNoAgeProvided",
"operation": "${.aNullableAgeField < 18}",
"type": "expression"
}
]
Unfortunately that would return null for the sample input above, which is suggested to cause a runtime exception.
Workflow users would have to add a lot of boiler-plate to their expressions to guard against this, like so:
"functions": [
{
"name": "isMinorOrNoAgeProvided",
"operation": "${.aNullableAgeField == null OR .aNullableAgeField < 18}",
"type": "expression"
}
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what would be the suggestion to fix this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're targeting support for many expression lang choices, perhaps we can just mandate that the runtime is responsible for translating any potential value to a concrete true or false based on the standard behavior of that expression lang? For the JQ example, this means that the runtime should translate according to JQ behavior? Basically, we set the expectation that for any boolean expression, it's important that the runtime ensures that they properly translate any potential value from the user-defined expression to return exactly true or false and never any other value?
I think it's a subtle difference from what the doc says now, it's just that the current wording implies that it's on the writer of the workflow to handle these cases, but I think we should say that the runtime should handle it for them instead, that way users can write idiomatic JQ if they're using JQ.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@jorgenj could you rewrite the current text in the spec to express what you stated here please?
actions: | ||
- name: eventInfoAction | ||
functionRef: | ||
refName: consoleFunction | ||
parameters: | ||
log: ">>> event $event.type caused by $.event.data.provider" | ||
log: ">>> event ${ .event.type } caused by ${ .event.data.provider }" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String formatting seems like a third kind of use for expression languages. It looks familiar, is there some common string formatting in e.g. helm templating that can be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry idk much about helm. could you show example so i understand what this means?
rebased |
need to rebase :) |
Signed-off-by: Tihomir Surdilovic <tsurdilo@redhat.com>
Signed-off-by: Tihomir Surdilovic <tsurdilo@redhat.com>
Signed-off-by: Tihomir Surdilovic <tsurdilo@redhat.com>
will go ahead and merge - @jorgenj if you find any issues w/ this please raise them and i'll fix asap |
Signed-off-by: Tihomir Surdilovic tsurdilo@redhat.com
Many thanks for submitting your Pull Request ❤️!
Please specify parts this PR updates:
What this PR does / why we need it:
Special notes for reviewers:
Additional information (if needed):
Fixes #216