Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Short-circuit behavior for the && and || operators #538

Closed
wants to merge 3 commits into from

Conversation

apparentlymart
Copy link
Contributor

@apparentlymart apparentlymart commented Jun 14, 2022

Previously we just always evaluated both operands of any binary operator and then executed the operator.

For the logical operators that was inconsistent with their treatment in several other languages with a similar grammar to ours, leading to users being surprised that HCL didn't short-circuit the same way as other languages they had prior experience with.

For our purposes here, "short-circuit" is a colloquial term for ignoring later operands of an operation if earlier operands are sufficient to decide the result, as discussed in Short-circuit evaluation on Wikipedia.

Since HCL expressions are assumed to be side-effect-free, the primary motivation for short-circuit evaluation in this language is to permit using the LHS of a conditional operator as a "guard" to avoid certain kinds of error on the RHS which are "cancelled out" by the condition in LHS. A secondary motivation is that if LHS is known and activates short-circuit, the "known-ness" of RHS is not considered when deciding the result, and so e.g. true && unknown(bool) can return true instead of unknown(bool).

Our initial exclusion of short-circuiting here was, as with various other parts of HCL's design, motivated by the goals of producing consistent partial results even when unknown values are present, and of proactively returning as many errors as possible in order to give better context when debugging.

However, in acknowledgement of the fact that logical operator short-circuiting has considerable ergonomic benefits when writing out compound conditional expressions where later terms rely on the guarantees of earlier terms, this commit implements a compromise design where we can get the short-circuit benefits for dynamic-value-related errors without losing our ability to proactively detect type-related errors even when short-circuiting, and with a suitable approximation for situations where the LHS value is unknown.

Specifically, if a particular operator participates in short-circuiting then it gets an opportunity to decide for a particular known LHS value whether to short-circuit. If it decides to do so then HCL will evaluate the RHS in a type-checking-only mode, where we ignore any specific values in the variable scope but will still raise errors if the RHS expression tries to do anything to them that is inconsistent with their type.

If the LHS of a short-circuit-capable operator turns out to be an unknown value of a suitable type, we'll pessimistically treat it as a short-circuit and defer full evaluation of the RHS until we have more information, so as to avoid raising errors that would be guarded away once the LHS becomes known.

The effect of this compromise is that it's possible to use the short-circuit operators in common value-related guard situations, like checking whether a value is null or is a number that would be valid to index a particular list:

  • foo != null && foo.something
  • idx < 3 && foo[idx]

On the other hand though, it is not possible to use the behavior to guard against problems that are related to types rather than to values, which allows us to still proactively detect various kinds of errors that are likely to be mistakes:

  • foo != null && fo.something # Typoed "foo" as "fo"
  • foo != null && foo.smething # Typoed "something" as "smething"
  • num < 3 && num.something # Numbers never have attributes

Those coming from more dynamic languages will probably still find these errors surprising, but HCL's other features follow a similar sort of hybrid model of statically checking types where possible and that tradeoff seems to have worked well to make it possible to express more complicated situations while still providing some of the help typically expected from a static type system for "obviously wrong" situations.

It should typically be possible to adjust an expression to make it friendly to short-circuit guard style by being more precise about the types being used. If all else fails, the existing conditional operator ... ? ... : ... offers a way to be more explicit about the intended sequence of evaluation and thereby "get away with" more kinds of errors on its unvisited arm.


A change like this could, if designed or implemented incorrectly, cause regressions for existing configuration, and so we first need to convince ourselves that this change will not violate the guarantees we typically consider important for backward-compatibility, including:

  • Not returning errors in any new situations that were valid before. (It's okay to return errors in fewer situations.)
  • Not producing unknown values in any context that would previously have yielded a known value. (An unknown value becoming known is okay, though.)
  • For any expression that can be evaluated without error, it produces the same value it produced before even though it might reach that result in a different way.

I believe that this compromise meets all of those constraints, but given the potential scope of disruption of an incorrect design or implementation here I would like some second opinions. Therefore I would appreciate both design and early implementation review even though I currently have this marked as Draft.

If this design seems plausible (even if we identify some bugs in the implementation of that design) then another important step here will be to describe the new behavior more completely in the specification. The current native syntax specification actually lacks any specific language about the behaviors of the individual binary and unary operators at all, so that may also be a good opportunity to fill that gap and describe the intended correct behavior of all of them.

The main motivation here is that we were apparently using bit shifting
with signed shift amounts even though modern Go toolchains seem to raise
that as an error if the go.mod file selects a language version less than
Go 1.13.

Go 1.17 is therefore newer than we strictly need to solve that particular
problem, but gets us caught up with a relatively release of the language.
This alone doesn't break compatibility for anyone using older versions of
Go with HCL, since the Go toolchain will still attempt to compile modules
targeting later versions and will only mention the newer language version
if compilation would already have failed for some other reason.
Previously we just treated all binary operators as being left-associative,
which for many of them didn't make a lot of difference anyway because they
were associative operators.

The logical AND and OR operators are also effectively associative the way
we have them implemented right now, because HCL always evaluates both
sides of a binary operator anyway and so there's no way to depend on
the associativity for either of these operations.

However, we'd like to implement a similar short-circuit behavior as we see
in many other similar languages, at which point the evaluation order of
the operations would become important in order to constrain which
expressions get fully evaluated and which are "short-circuited". Treating
these operators as right-associative instead of left-associative will
make the subsequent implementation more intuitive, since the LHS of the
outermost operation will decide whether to evaluate the RHS, rather than
recursing into the LHS completely first and then unwinding if the innermost
term causes the short-circuit.

Since we expect HCL expression evaluation to always be side-effect-free
it doesn't technically matter what associativity we use -- the result would
be the same either way -- but this approach will make the control flow
during evaluation match intuition about how a short-circuit behaves.

This commit does not yet actually implement the short-circuit behavior,
which will hopefully follow in a subsequent commit.
Previously we just always evaluated both operands of any binary operator
and then executed the operator. For the logical operators that was
inconsistent with their treatment in several other languages with a
similar grammar to ours, leading to users being surprised that HCL didn't
short circuit the same way as their favorite other languages did.

Our initial exclusion of short-circuiting here was, as with various other
parts of HCL's design, motivated by the goals of producing consistent
results even when unknown values are present and of proactively returning
as many errors as possible in order to give better context when debugging.

However, in acknowledgement of the fact that logical operator
short-circuiting has considerable ergonomic benefits when writing out
compound conditional expressions where later terms rely on the guarantees
of earlier terms, this commit implements a compromise design where we
can get the short-circuit benefits for dynamic-value-related errors
without losing our ability to proactively detect type-related errors even
when short-circuiting.

Specifically, if a particular operator participates in short-circuiting
then it gets an opportunity to decide for a particular known LHS value
whether to short-circuit. If it decides to do so then HCL will evaluate
the RHS in a type-checking-only mode, where we ignore any specific values
in the variable scope but will still raise errors if the RHS expression
tries to do anything to them that is inconsistent with their type.

If the LHS of a short-circuit-capable operator turns out to be an unknown
value of a suitable type, we'll pessimistically treat it as a short-circuit
and defer full evaluation of the RHS until we have more information, so
as to avoid raising errors that would be guarded away once the LHS becomes
known.

The effect of this compromise is that it's possible to use the
short-circuit operators in common value-related guard situations, like
checking whether a value is null or is a number that would be valid
to index a particular list:
    foo != null && foo.something
    idx < 3 && foo[idx]

On the other hand though, it is _not_ possible to use the behavior to
guard against problems that are related to types rather than to values,
which allows us to still proactively detect various kinds of errors that
are likely to be mistakes:
    foo != null && fo.something   # Typoed "foo" as "fo"
    foo != null && foo.smething   # Typoed "something" as "smething"
    num < 3 && num.something      # Numbers never have attributes

Those coming from dynamic languages will probably still find these errors
surprising, but HCL's other features follow a similar sort of hybrid
model of statically checking types where possible and that tradeoff seems
to have worked well to make it possible to express more complicated
situations while still providing some of the help typically expected from
a static type system for "obviously wrong" situations. It should typically
be possible to adjust an expression to make it friendly to short-circuit
guard style by being more precise about the types being used.
Copy link
Contributor

@alisdair alisdair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design and implementation makes sense to me. I had a concern about the asymmetry of handling LHS vs RHS unknown values, but I don't see any way to support the goals of short circuiting as flow control alongside the maximally flexible promotion of unknown to bool suggested in this downstream Terraform issue.

That is, I think to fully implement the truth tables in the linked issue, we would always need to evaluate both sides of the expression, which would result in errors for many of the use cases this change aims to permit. Hopefully documenting the evaluation order as part of this change will clarify the situation for anyone expecting unknown || true to be promoted to true.

// The child does not initially have any of its own functions defined, and so
// it can still inherit any defined functions from the reciever.
//
// Because is function effectively takes a snapshot of the variables as they
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Because is function effectively takes a snapshot of the variables as they
// Because this function effectively takes a snapshot of the variables as they

@apparentlymart
Copy link
Contributor Author

Thanks for linking to that Terraform issue, which I hadn't considered looking for because my focus here was on the error handling.

I think it would be worth exploring a more symmetrical model as you are hinting at here. We can in principle evaluate both sides in normal mode first and then discard the diagnostics from "the other side" if one of them turns out to be a short-circuit; that would involve running the other side logic a second time if the first run had encountered an error, but we don't typically try to juice the performance of the error return path (and have plenty precedent of doing extra work there to give better feedback) so I think it's promising.

This also prompted me to ponder if there is any reason why the value error discarding couldn't be symmetrical for consistency with unknown values being handled symmetrically. I've not yet thought deeply about it, but I suspect that the reason why left-to-right asymmetrical short circuit is the prominent design in familiar languages is because of the need to control for side-effects; in a language like HCL which assumes side-effect-free evaluation, perhaps symmetrical short-circuiting is reasonable and more helpful. 🤔

(There are some notable ways that applications push the bounds of this side-effect-free assumption, such as Terraform's file function that will re-read the designated file for each call and can therefore expose the side-effect if the file is large (so the runtime increases) or if something concurrently modifies the file. It might therefore be risky in practice to rely too much on that assumption. I think we'll need to do a survey of common HCL-based languages to see if there are any other similar gotchas.)

@apparentlymart
Copy link
Contributor Author

I'm no longer working at HashiCorp after today, so I won't be able to interact with the branch of this PR anymore and so I'm going to close it but hope it'll still be a useful starting point for a future PR with a similar goal!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants