Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: "forward declarations" for custom built-ins #800

Open
neongreen opened this issue Nov 1, 2019 · 36 comments
Open

Proposal: "forward declarations" for custom built-ins #800

neongreen opened this issue Nov 1, 2019 · 36 comments

Comments

@neongreen
Copy link

@neongreen neongreen commented Nov 1, 2019

While dhall-haskell (and perhaps other implementations too) support custom built-ins, using them in practice is quite annoying – particularly because IDE support is lost.

It would be lovely if I could tell Dhall that an identifier is a built-in and has to be supplied by the interpreter:

let sha256 
    : Text -> Text
    = builtin

let ...

in ...
@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Nov 1, 2019

Note: sha256 might not be a convincing example once #739 lands.

However, I suspect that making it easy for people to toy with built-ins will be useful in the long run because then we'll have better data on

  • which built-ins are useful "in the wild",
  • and which only seem useful but are later abandoned.

Plus, I have problems I want to solve using Dhall and I'd be able to solve them faster if I didn't have to work around missing built-ins (or wait for every built-in I need to get standardised).

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Nov 1, 2019

This also overlaps with functionality for TODOs (https://github.com/dhall-lang/dhall-lang/wiki/How-to-type-check-and-normalize-incomplete-code), and could probably be handled uniformly:

let sha256 
    : Text -> Text
    = ...builtin

let something : ??? = ...todo

let something_else : ??? = ...

The ...<smth> syntax might be too cute, though. But at least it seems intuitive.

@Gabriel439

This comment has been minimized.

Copy link
Contributor

@Gabriel439 Gabriel439 commented Nov 1, 2019

@neongreen: Yeah, I think this is a good idea.

One thing to note is that we should make sure that the forward declaration is propagated when type-checking transitive imports. In other words, when type-checking them instead of providing an empty context we provide the forward declarations as the type-checking context.

@Gabriel439

This comment has been minimized.

Copy link
Contributor

@Gabriel439 Gabriel439 commented Nov 1, 2019

Thinking about this more, what does this buy us over the equivalent λ expression? In other words, why not this:

  λ(sha256 : Text  Text)
 

The reason I ask is that I realized that we would need to add forward declarations to transitive imports, too (otherwise they would not type-check in isolation), but then if we did that we might as well be explicitly passing assumed builtins as ordinary function arguments, like this:

-- ./foo.dhall

  λ(sha256 : Text  Text)
 ./bar.dhall sha256
-- ./bar.dhall
  λ(sha256 : Text  Text)
 sha256 "example"
@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Nov 1, 2019

If you have a chain of modules importing each other – foo.dhall -> bar.dhall -> qux.dhall – and qux.dhall wants a built-in, you will have to propagate it through the chain.

Unknown-sized annoyance at random times is worse than one predictable constant-time annoyance, so you might create BuiltinsType.dhall and institute a rule that all modules shall begin with λ(builtins : ./BuiltinsType.dhall) →.

This is the same thing but with worse ergonomics. Plus it's something that has to be discovered (as an idiom) and maintained (as a styleguide rule).

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Nov 1, 2019

What's worse, when qux.dhall won't need the built-in anymore for one reason or another, somebody will be surely tempted to remove λ(builtins : ./BuiltinsType.dhall) and simplify the code. This has good potential to result in high levels of churn.

@Gabriel439

This comment has been minimized.

Copy link
Contributor

@Gabriel439 Gabriel439 commented Nov 2, 2019

@neongreen: You would have the same churn under the forward declarations solution, too. Under both proposals the user would have to declare all built-ins for every transitively imported module and have to decide whether or not to preserve the built-ins "header" for modules that temporarily don't use it.

There's also another issue with how forward declarations interact with semantic integrity checks: the computed hash of an import would change depending on whether or not the interpreter supports the built-in.

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Nov 2, 2019

Ok, I evidently misunderstood "add forward declarations to transitive imports". Do you mean this?

-- foo.dhall

let sha256 
    : Text -> Text
    = ...

in ... sha256 ...
-- bar.dhall

let foo = ./foo.dhall

let -- necessary even though sha256 is not mentioned in bar.dhall?
    sha256 
    : Text -> Text
    = ...

in ...
@Gabriel439

This comment has been minimized.

Copy link
Contributor

@Gabriel439 Gabriel439 commented Nov 2, 2019

@neongreen: I mean that forward declarations would be necessary in transitive imports that used the newly-added built-in

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Nov 2, 2019

Ok, so if the file uses a builtin directly, it needs a forward declaration, otherwise it doesn't?

In this case I don't see why levels of churn would be the same.

With lambdas:

                                        builtin used here
                                        ~~~~~~~~~
main.dhall -> foo.dhall -> bar.dhall -> qux.dhall
              ^^^^^^^^^    ^^^^^^^^^    ^^^^^^^^^
                 ^------------^------------^--------- lambda added here

With ...:

                                        builtin used here
                                        ~~~~~~~~~
main.dhall -> foo.dhall -> bar.dhall -> qux.dhall
                                        ^^^^^^^^^
                                           ^--------- declaration added here
@Gabriel439

This comment has been minimized.

Copy link
Contributor

@Gabriel439 Gabriel439 commented Nov 2, 2019

@neongreen: Ah, okay, I misunderstood the request. I thought all reverse dependencies of qux.dhall would have to declare the built-in.

Yeah, then that seems okay, although we still need to work out what the semantic integrity check should be for a module with a forward declaration.

@Gabriel439

This comment has been minimized.

Copy link
Contributor

@Gabriel439 Gabriel439 commented Nov 2, 2019

Another thing to consider here is if forward declarations must be present at the outermost level of an imported expression or if they can be permitted anywhere within an expression. In other words, is the following expression legal?

"sha256:${let sha256 : Text → Text in sha256 "foo"}"

My guess is that we should probably permit forward declarations anywhere within the syntax tree since it would simplify things. For example, then import resolution would not need to be aware of forward resolution at all (just inline them like any other imported expression), and they would only be interpreted at type-checking time by extending the context.

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Nov 3, 2019

My guess is that we should probably permit forward declarations anywhere within the syntax tree

This is my guess as well.

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Dec 1, 2019

we still need to work out what the semantic integrity check should be for a module with a forward declaration

Hm. Isn't it just "however they are serialized"? I assume that a forward declaration is serialized like BuiltinDecl <name> <type> and later on calls to those declarations are treated as calls to normal declarations, except that they can't be normalized away (like the free variables they are).

Or do you want to treat them differently? Not sure what the usecase would be.

By the way, what should happen to two built-in declarations with different types but same names? Should those be forbidden?

@Gabriel439

This comment has been minimized.

Copy link
Contributor

@Gabriel439 Gabriel439 commented Dec 1, 2019

@neongreen: The reason I asked about how integrity checks interacts with forward declarations is that expressions are interpreted before they are hashed to produce a semantic integrity check, so if the interpreter supports the built-in referenced by a forward declaration then it might produce a different hash than one that doesn't.

@Gabriel439

This comment has been minimized.

Copy link
Contributor

@Gabriel439 Gabriel439 commented Dec 1, 2019

@neongreen: Regarding conflicts, I think a forward declaration should conceptually behave the exact same as the equivalent lambda expression, so it should work even if they had the same name and type. You would disambiguate through the use of indices (i.e. foo@n).

For example, if you wrote:

let foo : Integer

let foo : Integer

in  foo@1

The foo@1 would refer to the first forward declaration. This implies that the interpreter would need to provide an ordered sequence of built-ins satisfying those declarations instead of an unordered map.

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Dec 2, 2019

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Dec 2, 2019

@Gabriel439

This comment has been minimized.

Copy link
Contributor

@Gabriel439 Gabriel439 commented Dec 2, 2019

@neongreen: Yeah, I think keeping forward declarations unevaluated for hashing purposes seems like the right solution.

Regarding this:

In my mental model, you can have the same declaration in two different
files and then both declarations will be interpreted as the same builtin.

I now understand the problem with my proposal that you are referring to. We could run into a situation like this:

-- ./foo.dhall

[ ./bar.dhall, ./baz.dhall ]
-- ./bar.dhall

let sha256 : Text  Text

in  ./baz.dhall # [ sha256 "1" ]
let sha256 : Text  Text

in  [ sha256 "2" ] 

Under my proposal, the import chain of ./foo.dhall → ./bar.dhall → ./baz.dhall would add sha256 : Text → Text to the context twice, whereas the import chain of ./foo.dhall → ./baz.dhall would only add sha256 : Text → Text to the context once, so ./baz.dhall would no longer be referentially transparent.

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Dec 3, 2019

Okay.

@sjakobi @f-f since you ❤️-ed the proposal:

  • What usecases do you have in mind?

  • How do you think cases above should be handled? (duplicate forward declarations in different files; duplicate forward declarations in the same file)

  • Is the syntax above (a bare type annotation without =) acceptable?

@sjakobi

This comment has been minimized.

Copy link
Collaborator

@sjakobi sjakobi commented Dec 3, 2019

What usecases do you have in mind?

I didn't have a particular usecase in mind, but I thought that the idea seemed quite elegant! :)

Aren't there some open-source projects that define custom built-ins, where we could give this proposal a spin? I thought dhall-to-cabal did, but I might be mistaken. (CC @quasicomputational, @ocharles).

How do you think cases above should be handled? (duplicate forward declarations in different files; duplicate forward declarations in the same file)

Sorry, I didn't follow the discussion in detail. Can you clarify what the question is?

Is the syntax above (a bare type annotation without =) acceptable?

I think it's acceptable, but I wish it were a bit more explicit about the fact that it is a forward declaration. I think that involving some kind of keyword would be clearer.

How about handling it a bit like a special import:

builtin sha256 : Text -> Text

or

builtin "sha256" : Text -> Text

You would use it like

let sha256 = builtin sha256 : Text -> Text

in sha256 "bla"

or

(builtin sha256 : Text -> Text) "bla"

or

-- File sha256.dhall
builtin sha256 : Text -> Text
let sha256 = ./sha256.dhall

in sha256 "bla"

And maybe custom-builtin would be a better keyword than just builtin.

@sjakobi

This comment has been minimized.

Copy link
Collaborator

@sjakobi sjakobi commented Dec 3, 2019

BTW, what should be the result of type-checking a forward declaration, when the interpreter doesn't implement the specific custom builtin?

On one hand, you'd want type-checking to succeed when the types match up.

OTOH, it would be good to inform the user that normalization will fail!

So maybe we should require interpreters to output a warning when they typecheck unsupported custom builtins.

@Nadrieril

This comment has been minimized.

Copy link
Collaborator

@Nadrieril Nadrieril commented Dec 3, 2019

Or normalization could just ignore those builtins, so that you can further normalize it later with a different implementation that does support the builtin ? Not a common usecase though.

@Gabriel439

This comment has been minimized.

Copy link
Contributor

@Gabriel439 Gabriel439 commented Dec 3, 2019

I assume it would work the same as if those built-ins were lambda-bound, meaning that they would just be uninterpreted functions. In other words:

    \(sha256 : Text -> Text)
->  sha256 "1"

... would be in normal form, so therefore:

let sha256 : Text -> Text

in  sha256 "1"

... would also be in normal form if the built-in were not provided by the interpreter

@matheus23

This comment has been minimized.

Copy link
Collaborator

@matheus23 matheus23 commented Dec 3, 2019

And maybe custom-builtin would be a better keyword than just builtin.

How about changing the 'non-custom' builtins that we currently have to be those kinds of built ins, too?

So the Prelude's definition of List.head would become:

{-
Retrieve the first element of the list
-}
let head = builtin List/head : (a : Type)  List a  Optional a

let example0 = assert : head Natural [ 0, 1, 2 ]  Some 0

let example1 = assert : head Natural ([] : List Natural)  None Natural

in  head
@sjakobi

This comment has been minimized.

Copy link
Collaborator

@sjakobi sjakobi commented Dec 3, 2019

@matheus23 I guess that would nicely address #835, but it seems rather inergonomic IMHO.

@matheus23

This comment has been minimized.

Copy link
Collaborator

@matheus23 matheus23 commented Dec 3, 2019

@matheus23 I guess that would nicely address #835, but it seems rather inergonomic IMHO.

Because users who previously used List/head directly now have to import the Prelude?

What is the point of the List/head function in the prelude? Is it only documentation? Is it recommended to use the Prelude's List/head, or the built-in?

It seems to me it should be recommended to use the prelude instead of builtins directly. This would mean that changing the standard is less annoying for users, so e.g. replacing the List/head builtin with a List/match primitive could be done gracefully by providing an actual implementation for List/head in the prelude.

Interestingly, I simply used List/head in renderYAML... 🤔 :D

Just to be clear: I'm throwing ideas out here, which I think would be good and healthy for dhall, but I have to admit I don't have much skin in the game. So everything IMHO as well :)

@sjakobi

This comment has been minimized.

Copy link
Collaborator

@sjakobi sjakobi commented Dec 4, 2019

@matheus23 While I kind of agree that using the Prelude instead of using builtins directly may offer better forward compatibility, I think encouraging that might send the wrong message: Dhall code written today ought to just keep working with future interpreters – in fact we try to avoid churn in the builtins, see e.g. #807.

Imports are also rather tricky to implement, so encouraging their use where they aren't strictly necessary might have the effect of excluding other interpreters.

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Dec 4, 2019

If we standardize this, I think we should also require that unused builtins don't have to be provided, i.e. let foo : X in True should work fine without providing foo.

@Gabriel439

This comment has been minimized.

Copy link
Contributor

@Gabriel439 Gabriel439 commented Dec 5, 2019

Based on #847, it sounds like you suggested another way to accomplish the goal of this proposal, which is to add a builtin keyword that can bring into scope a value of any type. If we had such a keyword, then instead of this:

let sha256 : Text -> Text

in  

... we could instead write it as:

let sha256 = builtin sha256 : Text -> Text

in  ...

One of the advantages of doing things this way is that I believe it resolves many of the issues that we've mentioned so far. For example, if the builtin is unused after β-normalization then the interpreter never attempts to satisfy it.

It would also do the right thing across module boundaries, because we're not artificially modifying the context. It would behave the same as any other let-bound variable.

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Dec 5, 2019

Huh.

Yeah, looks like a solution.

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Dec 5, 2019

Also looks like @sjakobi suggested the same thing and I missed it while reading the thread. @Gabriel439 @sjakobi ❤️

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Dec 5, 2019

Keyword name bikeshedding: I like builtin instead of custom-builtin because the current set of Dhall builtins seems pretty ad-hoc to me.

Specifically, I feel that builtins are a property of the interpreter (or even dialect) more than a property of the language. E.g. #741 argues for making List.map a builtin because it's hard to optimize – so, List.map could be a builtin for simple interpreters and a normal piece of code for smart interpreters.

@neongreen

This comment has been minimized.

Copy link
Author

@neongreen neongreen commented Dec 5, 2019

I feel that builtins are a property of the interpreter (or even dialect) more than a property of the language

Aw shucks. This would also make normalization a property of the interpreter, breaking semantic hashes along the way :/

Now I'm leaning towards evaluating builtins during normalization. Well, either that, or provide a new normalization mode that would require all builtins to be evaluated – otherwise https://github.com/Gabriel439/dhall-manual/blob/develop/manuscript/02-Refactors.md#semantic-integrity-checks will be unusable on code with custom builtins.

@Gabriel439

This comment has been minimized.

Copy link
Contributor

@Gabriel439 Gabriel439 commented Dec 5, 2019

@neongreen: I think it's important to step back and state the problem we're trying to solve before we proceed further with the design. Without a formal statement of requirements we don't have a way to evaluate the fitness of alternative solutions

@f-f

This comment has been minimized.

Copy link
Member

@f-f f-f commented Dec 10, 2019

@neongreen @Gabriel439 @sjakobi sorry for the delayed answer to the ping, I'm travelling and my backlog slipped a bit

I originally ❤️-ed the proposal because I always thought of Dhall as an excellent "platform for building config languages": vanilla-Dhall is pretty good for most configs, but in those cases in which you need a more sophisticated DSL the Haskell implementation is extensible enough so that you can just add some builtins to Dhall and avoid maintaining a custom interpreter of a superset of YAML/some lisp/etc.

E.g. I'm currently evaluating exactly this usecase for a client: they need a language that has similar characteristics to Dhall, except with more business-specific builtins.

Now, it's already possible to add buitins, but the ergonomics of adding a lambda binding to every module are ok, but I guess we could explore something more ergonomic (some reasoning for this has been detailed in the above comments and I agree with it). The thread went on since the ping, and I like the proposal in #800 (comment).

Re "normalization as a property of the interpreter": I think it's ok to make "extra-builtins" and "normalization" properties of the interpreter in case of custom builtins. At least for my usecase above it would be totally fine to have an expression which normalizes to some form with a generic interpreter and normalizes to another form with the business-specific interpreter: the goal here is to have all the tooling in the ecosystem to work with non-vanilla-Dhall expressions, not to keep hashes consistent in the presence of new builtins.
In short: "adding a custom builtin voids the guarantee on switching interpreters", which I think it's fair.
On the other hand, I think we should keep the "official language builtins" as magical - i.e. all interpreters have to implement and normalize them - so that semantic hashes would be consistent across interpreters. I see how it's tempting to declare all builtins explicitly so that breaking changes are less painful, but if the tradeoff would be to lose semantic-hashes-consistency then I think it's not worth it.

Example of how I'd use this feature:

-- CustomBuiltins.dhall
-- note: this is a bit clumsy because I'd like to inject both
-- new types and functions on these types, and they depend on each other
-- so I cannot just write down the record with the new types
-- Maybe there's a more ergonomic way to write this?

let DateTime = builtin DateTime : Type

let now = builtin now : DateTime

in
  { DateTime : Type
  , now : DateTime
  }
-- someCustomConfig.dhall
let customBuiltins = builtin custom : ./CustomBuiltins.dhall

in ..
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
6 participants
You can’t perform that action at this time.