Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Builtins #487

Closed
effectfully opened this issue Jan 16, 2019 · 13 comments
Closed

Builtins #487

effectfully opened this issue Jan 16, 2019 · 13 comments

Comments

@effectfully
Copy link
Contributor

effectfully commented Jan 16, 2019

We have several kinds of builtins:

  1. Static builtin names like addInteger, takeByteString, BlockNum, etc. Right now each of them has its Haskell interpretation except intToByteString, VerifySignature, TxHash and BlockNum. This document provides some information on how we compute static builtins application in a well-typed way. In the implementation static builtin names are not saturated, while in the specification they are saturated, so this is something that needs to be fixed.

  2. Dynamic builtin names like charToString and trace. Those are functions that we can extend the language with, but that are not provided by default. Dynamic builtin names are handled analogously to static builtin names, it's just that for a static builtin name we know its TypeScheme (see the doc reference above) and interpretation statically, but for a dynamic builtin name we require the user (where "the user" = "someone who invokes the type checking procedure or one of the abstract machines") to provide them. This means that we only do some look-ups in maps storing types or interpretations of dynamic builtin names during type checking and evaluation, i.e. dynamic builtin names are not harder than static builtin names, just a little bit of boilerplate.

  3. Dynamic builtin types. There are two possible forms of them: a complicated one and a very complicated one. Right now the former is implemented, but we really want the latter. The complicated form is described in Language.PlutusCore.Constant.Typed in Note [Semantics of dynamic built-in types]. The most important part is

We only allow dynamic built-in types that can be represented using static types in PLC. For example Haskell's 'Char' can be represented as integer 4 in PLC. This restriction makes the dynamic built-in types machinery somewhat similar to type aliases in Haskell (defined via the type keyword).

You might think this is a minor thing, however together with dynamic builtin names this gives us some great potential and a lot of complexity. For example, we already can convert Plutus Core values to Haskell values using some terrible hacks:

we evaluate terms using some evaluator (normally, the CEK machine). For the temporary lack of ability to put values of arbitrary Haskell types into the Plutus Core AST, we convert PLC values to Haskell values and "emit" the latter via a combination of unsafePerformIO and IORef. For example, we fold a PLC list with a dynamic built-in name (called emit) that calls unsafePerformIO over a Haskell function that appends an element to the list stored in an IORef:

    plcListToHaskellList list =
        evaluateCek anEnvironment (foldList {dyn} {unit} (\(r : unit) -> emit) unitval list)

After evaluation finishes, we read a Haskell list from the IORef (which requires another unsafePerformIO) and return it.

This all is elaborated in Language.PlutusCore.Constant.Typed.

However, we of course want to remove this unsafePerformIO nonsense which means we need to be able to somehow put values of arbitrary Haskell types into the Plutus Core AST. Having this ability we can simply fold over a PLC data type inside PLC (using the CEK machine or any other evaluators that knows how to handle dynamic builtin names), but collect a Haskell value in an accumulator. And then the result of evaluation of a fold over a PLC data type is a very shallow wrapper around a Haskell value which we can then easily extract. And it should be possible to construct such fold in generic way, i.e. derive them from the Generic instance of a Haskell data type.

Of course, embedding the entire Haskell into Plutus Core is not a trivial thing. I have some ideas, but I do not really know whether they make sense or not.

Meanwhile, @j-mueller, right now we can convert PLC lists to Haskell lists (and hence lists of lists, lists of lists of lists, etc). I think, I can add support for sum and product types (without unsafePerformIO nonsense). This does not allow to convert an arbitrary PLC to value to a Haskell value, but maybe it's enough to unblock you at least?

Oh, and please someone suggest better names for these things.

@michaelpj
Copy link
Contributor

On static/dynamic builtins: IMO it would be nice if there were eventually only one class of builtin, i.e. our current "static" builtins were just turned into dynamic builtins. Dynamic ones seem strictly more powerful, so removing the weaker version in favour of that seems good. This also solves the naming issues since we only have one thing! (Although they're no longer "builtin", really...)

For people following along: here's how I've been thinking about the "turn PLC terms into Haskell values" problem.

  • We want to use our usual evaluator to do this - that's the sensible way to do something to a PLC term.
  • But we have no way of representing the result of this computation, since the evaluators just work with PLC terms.
  • However, if we have dynamic builtin types, then we naturally have (con a) for any a that's of a dynamic builtin type.
  • So we can stuff Haskell values into these constant and use dynamic builtins to manipulate them.
  • A simple version of this is using the dynamic builtin strings to fold a PLC term into a string - but if it's of a type compiled from a Haskell type, then we should also be able to fold it into a Haskell value of that Haskell type!
  • We would need the ability to create such a fold generically, but this is probably a similar exercise in generic programming to the ones we've already done for lifting etc.

@effectfully
Copy link
Contributor Author

I can add support for sum and product types (without unsafePerformIO nonsense)

Nah, probably still requires unsafePerformIO.

On static/dynamic builtins: IMO it would be nice if there were eventually only one class of builtin, i.e. our current "static" builtins were just turned into dynamic builtins. Dynamic ones seem strictly more powerful, so removing the weaker version in favour of that seems good.

I was previously opposed to that, but now I think you're right. I've just written the following comment in TypeSynthesis.hs:

-- We have a weird corner case here: the type of a 'BuiltinName' can contain 'TypedBuiltinDyn', i.e.
-- a static built-in name is allowed to depend on a dynamic built-in type which are not required
-- to be normalized. For dynamic built-in names we store a map from them to their *normalized types*,
-- with the normalization happening in this module, but what should we do for static built-in names?
-- Right now we just renormalize the type of a static built-in name each time we encounter that name.

If we are going to use dynamic builtins extensively (which is the case, right?), then there is no point in having more corner cases. But if we're going to use dynamic builtin names only for some hacks, then it makes sense to distinguish between hacks and the blessed approach.

We want to use our usual evaluator to do this - that's the sensible way to do something to a PLC term.

One alternative is to analyze the PLC AST directly, but it's a nightmare. The motivation for using evaluators is elaborated in Language.PlutusCore.Constant.Typed as well.

But we have no way of representing the result of this computation, since the evaluators just work with PLC terms.

You once said this:

perhaps the problem is that our evaluators expect to always produce a PLC term? Perhaps we should have a different notion of value for the result of execution that could include the output of dynamic builtins.

It can be indeed the case that extending the result of evaluation is something that we actually need rather than embedding the entire Haskell into Plutus Core. I'll play with that.

@j-mueller
Copy link
Contributor

Meanwhile, @j-mueller, right now we can convert PLC lists to Haskell lists (and hence lists of lists, lists of lists of lists, etc). I think, I can add support for sum and product types (without unsafePerformIO nonsense). This does not allow to convert an arbitrary PLC to value to a Haskell value, but maybe it's enough to unblock you at least?

Yeah product types and lists should be enough, but I just didn't want to write the contracts on the assumption that we would be able to do this - now that I know this feature is planned I'm not blocked on it, I can mimick it easily for testing purposes.

@effectfully
Copy link
Contributor Author

Yeah product types and lists should be enough

Ok, I'll add them then.

but I just didn't want to write the contracts on the assumption that we would be able to do this - now that I know this feature is planned I'm not blocked on it

Makes perfect sense, but note that even though the feature is planned, it's not clear whether it's possible to implement it in a sensible way. This all is a bit hand-wavy right now.

I can mimick it easily for testing purposes.

Great.

@effectfully
Copy link
Contributor Author

effectfully commented Feb 17, 2019

Added support for throwing a PLC error from a builtin name in #642 (not merged yet).

We need to do a few things (better before experimenting with parameterizing Term by additional type variables):

  1. Allow parametric polymorphism in TypeScheme
  2. Rename KnownDynamicBuiltinType to KnownType and reflect all the types via this type class, including "static" ones. I've recently (in [PLC] [Builtins] Allow to throw 'PLC.error' from dynamic builtin names #642) removed TypedBuiltinBool as a special case and realized that we can remove all special cases, even integers and bytestrings, like that. This is needed, because right now we can throw an error from a builtin name that normally returns an Bool, but not from one that returns an Integer, which is a silly limitation. The challenge however is in handling sizes, which I don't know how to approach at the moment.
  3. A funny thought: we do not really need to change the AST in order to be able to embed arbitrary Haskell into Plutus Core. We can just carry an environment of Haskell values together with a Plutus Core term which references values in the environment by their indices (say, in the Var constructor, but it's better to have a separate MetaVar constructor for this). At the very least this should allow us to play with the ideas without needing to fix 10000 errors that adding additional parameters to Term would cause.

@michaelpj
Copy link
Contributor

We can just carry an environment of Haskell values together with a Plutus Core term which references values in the environment by their indices (say, in the Var constructor, but it's better to have a separate MetaVar constructor for this)

Interesting! My only question is how we would type a PLC term with these in it? I guess providing a type-environment at typechecking type and a term-environment at runtime is actually very similar to what we do for the current dynamic builtins.

@effectfully
Copy link
Contributor Author

Interesting! My only question is how we would type a PLC term with these in it? I guess providing a type-environment at typechecking type and a term-environment at runtime is actually very similar to what we do for the current dynamic builtins.

Yes. Except right now we look up only dynamic builtin names and there is a finite amount of them, while with the new approach we can extend environments indefinitely. "Got a Haskell value that you can't embed into PLC? Put it into the current environment and reference it by its index". But this sounds like it may require a GC with reference counting or something.

@michaelpj
Copy link
Contributor

Plus everything is done with unconstrained indices, which is unpleasant.

@effectfully
Copy link
Contributor Author

Yes. But it's very good for experimenting at least and maybe as a temporary intermediate solution.

@kwxm
Copy link
Contributor

kwxm commented Apr 2, 2019

This comment's an offshoot from the now-closed "Do we have keywords?" issue: #463

The PLC parser currently has special treatment for builtin names, so they really are built in. This'll have to change once we have extensible builtins: we'll have to look up names in a table or something to make sure they really exist. Fortunately you now have to say builtin every time you mention a built in name, so there's no ambiguity about what's what.

We'll also have to change the spec, but let's cross that bridge when we come to it.

@effectfully
Copy link
Contributor Author

This'll have to change once we have extensible builtins: we'll have to look up names in a table or something to make sure they really exist.

We already have extensible builtins in this sense. Here is an excerpt from the classic pretty-printer:

instance PrettyBy (PrettyConfigClassic configName) (Builtin a) where
    prettyBy _ (BuiltinName    _ n) = pretty n
    prettyBy _ (DynBuiltinName _ n) = pretty n
...
instance (PrettyClassicBy configName (tyname a), PrettyClassicBy configName (name a)) =>
        PrettyBy (PrettyConfigClassic configName) (Term tyname name a) where
    prettyBy config = cata a where
        a (BuiltinF _ bi)      = parens' ("builtin" </> prettyBy config bi)

I don't think we've ever tested how this interacts with the parser. For example in Language.PlutusCore.Generators.AST we have

genBuiltin :: MonadGen m => m (Builtin ())
genBuiltin = BuiltinName () <$> genBuiltinName

i.e. we basically never generate a DynBuiltinName.

we'll have to look up names in a table

I'd start with "if it's marked as a builtin, but it's not a static builtiln, then it's an extensible one". Passing tables around is a good solution, but we're going to change the related interface to builtins, so I'd do the simplest thing and leave the better approach for future.

If you wish, you can add some tests related to extensible builtins and add support for extensible builtlins to the parser.

@michaelpj
Copy link
Contributor

Right, we haven't added anything about extensible builtins to the parser or spec, they're basically only available if you're constructing an AST directly.

@effectfully
Copy link
Contributor Author

We need to do a few things (better before experimenting with parameterizing Term by additional type variables):

1. Allow parametric polymorphism in TypeScheme

Done in #751.

2. Rename KnownDynamicBuiltinType to KnownType and reflect all the types via this type class, including "static" ones. I've recently (in #642) removed TypedBuiltinBool as a special case and realized that we can remove all special cases, even integers and bytestrings, like that. This is needed, because right now we can throw an error from a builtin name that normally returns an Bool, but not from one that returns an Integer, which is a silly limitation. The challenge however is in handling sizes, which I don't know how to approach at the moment.

Doing that in #983. Now that we don't have sizes, everything is straightforward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants