Proposal: core builtin extensions #943

cueckoo · 2021-07-03T10:36:22Z

Originally opened by @mpvl in cuelang/cue#943

Definitions

Before we introduce the proposed builtins, we formally introduce some as-yet undocumented language features.

Functions

We propose that CUE support named-argument functions and calls to “structs” as a shorthand for the common macro pattern (e.g. (s & { _, a: x}).out).

A function argument is now defined as:

	Argument       = [ identifier ":" ] Expression .

Once a named argument appears, all arguments that follow it must also be named.

The expression s(a: x, b: y), where s is a struct, is now a shorthand for s & {_, a: x, b: y}.
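
For reference, the macro pattern that this shorthand abbreviates can already be written in CUE today. The following is a minimal sketch, not part of the proposal itself; the names #Add, a, b, out, and sum are hypothetical and chosen only for illustration.

```cue
// Hypothetical "function" struct: inputs a and b, result in out.
#Add: {
	a:   int
	b:   int
	out: a + b
}

// Today's macro pattern: unify the inputs in, then select out.
sum: (#Add & {a: 1, b: 2}).out // 3
```

Under the proposed shorthand, a call such as #Add(a: 1, b: 2) would stand for the unification step of this pattern.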

Validator

A validator is a special builtin that is evaluated by unifying it with other values whereby the result is one of a few outcomes:

pass: returns _ if the validation is successful and making the value with which it was unified more specific does not change this result (or it is a final evaluation).
incomplete error: the validation failed, but making the value with which it was unified more specific could change this result
fatal error: the validation failed and making the value more specific cannot change this result.

A validator must be run at the last stage of evaluating a node, after a fixed point is reached evaluating all non-validator values, in which case any error is considered a fatal error. A validator may be run at earlier stages of the evaluation of a node, in which case an incomplete error signifies that the decision on validity must be postponed.

An example of a language-level validator is <10. struct.MinFields and struct.MaxFields are examples of validators of builtin packages.

Validators can be thought of as a Go function that has an error return signature.

Inferred validators

Optional: Builtin functions that have the signature foo(x1, x2, …, xn) bool may be implicitly interpreted as validators of the signature foo(x2, …, xn) error.

The CUE function notation

We define the following signature format for cue functions:

FunctionDecl   = identifier Arguments "::" Expression .   
Arguments      = "(" [ Argument { "," Argument } [ "," ] ] ")" .
Argument       = [ identifier ":" ] Expression .

Either all or none of the arguments should be named.

The following rules apply for calling functions with this signature:

An argument with a default value in its expression may be omitted in a call. All other arguments must be present in a call.
A call must either have all named, or all unnamed arguments.

These rules could be relaxed later.

Proposed builtins

builtins to replace `_|_` (bottom)

Although _|_ is part of the standard CUE idiom, it has several issues:

no ability to associate user-defined message to bottom
meaning of comparison against bottom is unclear
the symbol looks offensive to some

We intend to deprecate the bottom symbol (keeping it around for backwards compatibility) and replace it with builtins that more clearly convey the intent of its usage.

Comparison is not supported by the spec (arguably), but it is a crucial piece of functionality for many CUE configurations. The meaning of it is unclear, however. In many cases, it is used to check whether a reference exists. In some cases, however, the intended meaning is to check that a value is valid. In reality, CUE implements a semantic that is somewhere in between the two cases: it checks the validity of a value, but not recursively.

Note that if any of these builtins return false, they may still be satisfied at a later point in time. Evaluation should take this into account, as usual.

`_|_` replacement: `error(msg: string | *null) :: _|_`

The use of error(msg) replaces the common use of _|_ with the added ability to associate a user message with an error. When used within a disjunction, the error will get eliminated as usual, but upon failure of the disjunction, the user-supplied error is used as an alternative error message.

Comparison to bottom

Uses of comparison against bottom will need to be replaced with one of the following builtins.

`isconcrete(expr) :: bool`

isconcrete reports whether expr resolves to a concrete value, returning true if it does and false otherwise. It is a fatal error if an expression can never evaluate to true.

Example:

a: {}
b: int

c: isconcrete(a)   // true
d: isconcrete(b)   // false
e: isconcrete(a.b) // false (b could still be defined)
f: isconcrete(b.c) // fatal error (b.c can never be satisfied)

Purpose: replaces if a.foo != _|_ {, where it is checked whether a.foo exists with the purpose of determining whether it is a concrete value.
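
As a point of comparison, the idiom being replaced looks like this in current CUE. This is only a sketch of the existing pattern; the field name fooSeen is hypothetical.

```cue
a: {}

// Today's idiom: the guarded field is emitted only if a.foo resolves.
// Whether the author means "exists" or "is concrete" is not visible here,
// which is the ambiguity that isconcrete and exists are meant to remove.
if a.foo != _|_ {
	fooSeen: true
}
```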

`exists(expr) :: bool` (optional)

exists reports whether expr resolves to any value.

Example:

a: {}
b: int

c: exists(a)   // true
d: exists(b)   // true
e: exists(a.b) // false (b could still be defined)
f: exists(b.c) // fatal error (b.c can never be satisfied)

opt?: int
ref:  exists(opt)  // false: considered to be non-existing

req!: int
ref:  exists(req)  // false

Purpose: replaces if a.foo != _|_ {, where it is checked whether a.foo exists regardless of concreteness.

validator builtins

`must(expr: _, msg: string | *null) :: _`

must(expr) passes if expr evaluates to true and fails otherwise.

must can be used to turn arbitrary expressions into constraints. For instance, a: <10 can be written as a: must(a < 10). See Issue #575 for details.

`not(expr) :: _`

not(expr) passes if it is unified with a value x for which expr & x fails, and fails otherwise.
See #571 for details.

Examples:

a: not(string) // number | bytes | {...} | [...] | bool | null

`numexist(count, ...expr) :: _`

numexist(count, ...expr) passes if the number of expressions for which exists(x) evaluates to true unifies with count.

The main purpose of numexist is to indicate mutual exclusivity of fields.

#X: {
    // either foo or bar may be specified by the user
    numexist(<=1, foo, bar)
    foo?: int
    bar?: int
}

`numconcrete(count, ...expr) :: _` (optional)

numconcrete(count, ...expr) passes if the number of expressions for which isconcrete(x) evaluates to true unifies with count.

`numvalid(count, ...expr) :: _` (optional)

numvalid(count, ...expr) passes if the number of expressions for which isvalid(x) evaluates to true unifies with count.

Builtins related to concrete values

Purpose: combine schema of different instances of the same package that would otherwise fail because there are conflicting definitions.

`manifest(x) :: _`

manifest evaluates x stripping it of any optional fields and definitions and disambiguating disjunctions after their removal.

Use cases:

combine instances that only differ in templates.

Defining ranges

Looking around at other languages, defining range numbers clearly is a hard problem, as it is often not clear from just looking at the syntax, or even wording, whether or not ranges are inclusive.

CUE’s unary comparators provide a possible solution to this issue.

`range(from: int, to: int, by: int | *1) :: [...int]`

Builtin range returns a stream of values, starting from from (which must be concrete) and adding by (defaults to 1) as long as unification with to succeeds. It is an error to define a range that never terminates.

Examples:

range(from: 1, to: <10)              // [1, ..., 9]
range(from: 1, to: >=0.5, by: -0.1)  // [1, 0.9, ..., 0.5]
range(from: 1, to: <1)               // []
range(from: 1, to: >=1)              // error("infinite range")
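
For comparison, the existing standard library already offers list.Range(start, limit, step), where the limit is an exclusive plain number rather than a constraint. A minimal sketch of the first example above using that existing function; the field name r is illustrative only.

```cue
import "list"

// Existing stdlib counterpart of range(from: 1, to: <10):
// the limit 10 is exclusive, so the last element is 9.
r: list.Range(1, 10, 1) // [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The proposed form spells out inclusiveness in the call itself (<10 versus <=10), whereas with list.Range the reader has to know that the limit is exclusive.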

Switching

CUE’s if is not paired with an else. This is partly because if really is a comprehension. But another reason is that the use of else quickly leads to nested conditions. A switch statement is generally more conducive to readability in this case.

A switch statement can be simulated in CUE using lists:

choice: [
    if a { x },
    if b { y },
    z,
][0]

is equivalent to the hypothetical

choice: if a { x } else { if b { y } else { z } }

The issue is that the hidden [0] at the end of the switch is impairing readability.
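
A concrete instance of the list-based switch, which works in current CUE, may make the pattern easier to see. The names n and size and the thresholds are hypothetical.

```cue
n: 15

// Each guard contributes an element only when it holds;
// the unguarded last element acts as the default.
size: [
	if n < 10 {"small"},
	if n < 100 {"medium"},
	"large",
][0] // "medium"
```

With n: 15 the first guard contributes nothing, so the list is ["medium", "large"] and index 0 selects "medium".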

head

A head builtin could make the above more readable. It would do nothing more than select the first element in a list, but doing so by more clearly signaling the intention at the start of the list.

choice: head([
    if a { x },
    if b { y },
    z, // default
])

Package `std`

We’re considering making all core builtins available under the package std, so that they can be referenced unambiguously and more clearly than using the __ prefix.

import "std"

a: std.range(from: 3, by: -1, to: >0) // 3, 2, 1

cueckoo · 2021-07-03T11:02:50Z

Original reply by @seh in cuelang/cue#943 (comment)

This is so good to see.

One problem to consider with the "Switch" section: You write, more or less, if a {} else if b {} ..., but quite frequently b is !a or not a, which requires restating a. Could let help here to define the result of a once, and express it being both true for the consequent branch and its negation for the alternate branch?

cueckoo · 2021-07-03T11:02:51Z

Original reply by @seh in cuelang/cue#943 (comment)

Also, while head is evocative, it does so little that it barely justifies its inclusion. I thought of coalesce as a good name for picking the first suitable item in a sequence that can accommodate "null" or disqualified values. Against that, though, in your "Switch" example, I suppose the list should never wind up with more than one value, as opposed to it being prefixed by any number of "null" values.

cueckoo · 2021-07-03T11:02:52Z

Original reply by @mpvl in cuelang/cue#943 (comment)

@seh: yes, let could be used here that way, though outside the list. We could perhaps consider allowing let in lists.
Also, one could mimic this behavior with: head([if a {}, {}]), where the second element is the "default", and thus !a.

Regarding head: I agree its utility is a bit meager. We did consider a select builtin which I think is close to what you're proposing, where it would pick the first of any valid entry. The main problem with this pattern seems that it will be too easy to ignore potential errors, so it may be a less safe approach. Having said that, it reads quite nice and we have seen configurations where this would have merit. So it is something to consider. It just seemed safer to see how far one would get with this seemingly safer approach.

I'm not sure I understand the point with the null values, but maybe this answers your question.

Do you think adding head is not warranted and using a [...][0] pattern is sufficient?

cueckoo · 2021-07-03T11:02:53Z

Original reply by @seh in cuelang/cue#943 (comment)

I was not sure that CUE has the same notion of "null" values that SQL, HCL, Jsonnet, and other languages have, so the semantics of a hypothetical coalesce function might not apply.

I don't think head is warranted without tail (or rest), and perhaps nth. My Lisp is showing. I haven't yet reached for any functions like that, though. I'd rather spend those tokens on set manipulation functions for lists.

Would it be possible to write a CUE "function" that encapsulates your [if a {consequent}, {alternate}][0] technique? It would require at least two inputs; the alternate could be optional. It's not much compression, but might cut down on the "syntactic noise" with those brackets. Yes, I confess that I'm still looking for else.

cueckoo · 2021-07-03T11:03:27Z

Original reply by @mpvl in cuelang/cue#943 (comment)

@seh: you can do else with the switch approach and I’m not in favor of a dedicated If-else construct, as it encourages bad patterns.

But I see your points otherwise. I guess you could indeed express this as cue macros neatly if we had the call shorthand. head would then be defined as:

head: { #0[0], #0: […] }

One problem is that the first element cannot have a conflicting definition of #0.

But maybe this is enough for now to just point out the pattern and suggest that people comment the construct:

aSwitch: [ // select first match
   if a { … },
   if b { … },
   c // default
][0]

anIfElse: [ // if then else
   if a { … },
   c // else
][0]

This would not require any additions to the language and we can get some experience to see what works. The query addition may also provide useful patterns that obviates the need for this.

cueckoo · 2021-07-03T11:03:28Z

Original reply by @mpvl in cuelang/cue#943 (comment)

@seh in CUE, bottom (incomplete errors, to be more specific) is a bit like null in those languages. null can mean various things, often not compatible with the notion of null here. So it seemed impossible to assign any specific meaning to it.

myitcv · 2021-07-29T10:24:25Z

Noting that one use case of comparison against _|_ we should explicitly document (I'm not totally clear it is actually covered above) is that of type assertion, as discussed in #1161.

nyarly · 2021-09-23T17:36:17Z

Perhaps this warrants a new discussion or feature issue, but one thing I've found lacking in cases where I've wanted something like the list-as-choice pattern at the end there comes from FP paradigms doing pattern matching. Specifically, language support for guaranteeing that the options are exhaustive.

Reflecting here that the list comprehension version of this provides that in a roundabout (and at-runtime) way: if all the alternatives fail, the list is empty and the index will be out of bounds. So there's some safety railing there.

But: the user is going to get an "index out of bounds" error, (which is confusing when the cause is that an alternative was overlooked), and it'll be the user of the CUE program and not its author who gets the error.

It would be fantastic to have a language level match operator that could, at parse time, emit something like no alternative matches <16, >20 or something. There may be a correspondingly fantastic level of effort to provide that feature, but it sure would be nice.

verdverm · 2021-09-23T17:54:02Z

The default seems to prevent the out of bounds issue, assuming it is always required. One could use error in the case the default should fail the config and provide a more meaningful message.

It may be useful to know that something like <16 | >20 cannot be validated at parse time and requires the evaluator to do its thing ("runtime" in your message, though I'm not sure that is the most accurate term)

It might be also worth considering that, in many ways, CUE comes from Go and there is value in minimizing language features and syntax.

verdverm · 2022-01-06T11:03:01Z

What about an operator for subsumes?

like if subsumes(a, b) { "a subsumes b" }

I'm trying something like

t: int

result: [
  if (t & int) == _|_ { "int" },
  if (t & int64) == _|_ { "int64" },
  if (t & int32) == _|_ { "int32" },
  if (t & int8) == _|_ { "int8" },
  "unknown",
][0]

which won't work, I think something like this might

t: int

result: [
  if subsumes(t, int)   { "int" },
  if subsumes(t, int64) { "int64" },
  if subsumes(t, int32) { "int32" },
  if subsumes(t, int8)  { "int8" },
  "unknown",
][0]

The goal of the example is to turn CUE types into a string, maybe there could be a builtin or stdlib package that helps with that in a more targeted way. A subsume builtin might still be useful more generally

sdboyer · 2022-01-06T16:06:11Z

Strong +1 to that - a native subsumption operator is a key roadmap item for thema (née scuemata). For now, the necessary enforcement of a subsumption relation has to be done in Go. (Though that doesn't work either because of a panic that i need to post an issue for, once i have a clear reproduction)

haydenflinner · 2023-06-05T12:54:14Z

What's the general status on the extensions discussed here? I'm particularly interested in functions.

myitcv · 2023-07-11T09:30:57Z

Noting that we should also consider downcasts #454 in scope of new builtins.

myitcv · 2024-03-12T06:20:07Z

Noting what I think is a tricky edge case here:

#X: {
    // at least one of foo or bar must be specified by the user
    numexist(>0, foo, bar)
    foo?: int
    bar?: int
}

The definition #X itself will be in error in this case.

vergenzt · 2024-04-02T00:49:05Z

I came here from https://cuetorials.com/patterns/functions and am especially interested in the functions syntax, but I have yet to have a use case for any of the other proposals.

Should some of these use cases be split out into different issues? I feel like there's a lot being proposed here. It might make it clearer which features are priorities to users if these were separate issues.

myitcv · 2024-05-22T15:08:17Z

I have just created #3165 for further discussion regarding the encoding of oneofs in CUE.

This prepares for both adding new builtins (such as the proposed numExist et al.) as well as adjusting some existing ones, like `and`. This CL is supposed to be a no-op (aside from adding the functionality) and we separate it out to make future diffs smaller. We will test RawFunc itself with the respective builtins.

The issue with `and`, for instance, is that it "weaves" in partially evaluated expressions into existing evaluation. In some cases this may lead to cycles. To prevent this, there needs to be a back channel from the function to the evaluator. Only the function can know exactly which cycle information is needed.

Other uses are functions like `numExists` or any other builtin that needs to operate on CUE expressions rather than values.

Issue #943
Signed-off-by: Marcel van Lohuizen <mpvl@gmail.com>
Change-Id: I32ef92bfdc2a8318b00801bc067df4a073a10a73
Reviewed-on: https://review.gerrithub.io/c/cue-lang/cue/+/1202442
Reviewed-by: Matthew Sackman <matthew@cue.works>
TryBot-Result: CUEcueckoo <cueckoo@cuelang.org>
Unity-Result: CUE porcuepine <cue.porcuepine@gmail.com>

myitcv · 2025-01-12T06:28:05Z

Building somewhat on #3289 (cue: Value needs a method to finalize a value), and motivated by the discussion in #3674 and details covered in #3296, I think we also need to consider two further builtins:

v: _
b1: finalize(v)
b2: concrete(v)

finalize() would be the builtin analog of #3289. Defaults would be selected, templates "removed" (see #3674) amongst other things.

concrete() would further require that the value of its argument be fully concrete, and return a recursively closed value.

rogpeppe · 2025-01-13T11:05:39Z

> concrete() would further require that the value of its argument be fully concrete, and return a recursively closed value.

FWIW I would not define it to return a recursively closed value: I think such a primitive would probably be better just returning data exactly as if it had arrived from JSON, for example. It should be easy to close it if needed.

cueckoo added the Proposal label Jul 3, 2021

This was referenced Jul 3, 2021
Too early evaluation? #955 (Closed)
Inconsistent evaluation of unification? #964 (Open)
Proposal: core builtin extensions cuelang/cue#943 (Closed)

This was referenced Jul 29, 2021
internal/eval: type assertion issue on complex definition - different behavior than with native types #1161 (Open)
Support type assertion #881 (Closed)
vscode-cue: Sublime Text / TextMate / Sourcegraph syntax highlighting support #3427 (Open)

sdboyer mentioned this issue Nov 30, 2021
Constrain sequence lenses so that an absent declaration is not valid grafana/thema#2 (Closed)

sdboyer mentioned this issue Dec 9, 2021
testing framework grafana/thema#3 (Closed)

This was referenced Feb 18, 2022
cmd/cue: ambiguous disjunction leads to incomplete value on export #1487 (Open)
time.Format does something that is not intuitive #1508 (Open)

verdverm mentioned this issue Apr 1, 2022
Issue with hidden refs/lists/modules #1613 (Closed)

myitcv mentioned this issue May 4, 2022
Better syntax support for user-defined "functions" #1480 (Closed)

myitcv mentioned this issue Aug 18, 2022
evaluator: structural cycle not terminated by null disjunction #1827 (Closed)

myitcv mentioned this issue Sep 9, 2022
evaluator: should close() distribute across elements of a disjunction? #1917 (Closed)

myitcv added the zGarden label Jun 13, 2023

myitcv mentioned this issue Jun 20, 2023
Support for not() builtin #571 (Open)

myitcv mentioned this issue Jul 11, 2023
cue: improve documentation for Value.UnifyAccept #2300 (Open)

myitcv mentioned this issue Aug 10, 2023
tools/flow: task deps lost when more than one level "deep" #2517 (Closed)

This was referenced Sep 20, 2023
internal/core/adt: add isconcrete builtin func #2605 (Closed)
internal/core/adt: add exists builtin func #2606 (Closed)

verdverm mentioned this issue Oct 9, 2023
OneOf pattern does not work for open structures #2636 (Closed)

myitcv mentioned this issue Nov 11, 2023
OpenAPI incorrect oneOf definition when used with required #2686 (Closed)

myitcv removed the zGarden label Nov 29, 2023

myitcv mentioned this issue Feb 23, 2024
explanation: on embedding, root/package values, closedness, multiple declarations, etc cue-lang/docs-and-content#64 (Open)

myitcv mentioned this issue Mar 17, 2024
evaluator: export as YAML vs JSON emits different templated data #2916 (Closed)

myitcv mentioned this issue May 24, 2024
evaluator: encoding of JSON Schema (and other) oneofs #3165 (Closed)

jpluscplusm mentioned this issue Jun 14, 2024
docs/howto/negate-a-disjunction cue-lang/docs-and-content#160 (Closed)

myitcv mentioned this issue Jul 15, 2024
evaluator: comparison with bottom not consistent in the presence of defaults #3292 (Open)

myitcv mentioned this issue Aug 19, 2024
language: add not builtin #3382 (Open)

myitcv mentioned this issue Oct 24, 2024
docs/howto: Ensuring structs in a list contain a field with a unique value cue-lang/docs-and-content#108 (Open)

This was referenced Jan 13, 2025
proposal: add downcasts #454 (Open)
cue: Value needs a method to finalize a value #3289 (Open)
evaluator: incorrect behaviour when choosing defaults #3296 (Open)

myitcv mentioned this issue Jan 14, 2025
pkg/time: difficult to use time.Duration with exported constants #3676 (Open)

myitcv mentioned this issue Feb 4, 2025
eval: issues around the function pattern #3711 (Open)
