Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cast elision inside statically typed code #285

Closed
yannham opened this issue Feb 2, 2021 · 9 comments
Closed

Cast elision inside statically typed code #285

yannham opened this issue Feb 2, 2021 · 9 comments

Comments

@yannham
Copy link
Member

yannham commented Feb 2, 2021

Is your feature request related to a problem? Please describe.
Contracts are nice, but they do incur some runtime cost. There are probably many optimization opportunities, one being the elision of unneeded checks. Currently, every type annotation gives rise to a contract at run-time:

let f : Num -> Num =
  let id : forall a. a -> a = fun x => x in
  fun x => (id x) + (id 5)

This example is contrived but corresponds to a real use-case: using a type annotation inside a statically context, either to introduce a polymorphic type, to debug a type error, or just as a documenting annotation for a subterm.

In that case, the contract should be useless, since as the annotated sub-term is well-typed, it wouldn't elicit positive blame, and since the context is well-typed, it shouldn't elicit negative blame, assuming type soundness and blame safety.

Describe the solution you'd like
Any annotation in a statically typed block would be erased at run-time. This could be done at program transformation time, since typechecking occurs before, rewriting such Promise(ty, term) as term directly. If the typing information is still needed later, e.g. for the query subcommand, we can introduce a new node in the AST, which would be the same as a promise during typechecking but would be transparent at run-time. We could also store typing information as meta-values, with more control over which one give rise to a contract or not.

@aspiwack
Copy link
Member

aspiwack commented Feb 3, 2021

The contract is useless only if id is never used in a dynamically typed context.

It's actually a fairly non-trivial transformation, which I've been considering as out of scope for the near future.

@yannham
Copy link
Member Author

yannham commented Feb 3, 2021

The contract is useless only if id is never used in a dynamically typed context.

I am not sure. Since the enclosing block f is also statically typed, either it consumes the function internally (and, if well-typed, should not elicit blame), or the function escapes the block, but then it must appear in the codomain of f (say ... -> (forall a. a -> a)), and will thus be guarded by the outer contract anyway.

This example is particular because id has a polymorphic type that we happen to not infer, but another intuition in this direction is that if we take the same example with a monomorphic type id: Num -> Num, or imagine our system but with HM type inference, then the annotation is entirely superfluous and adding it or not shouldn't change the runtime behavior of the program.

It's actually a fairly non-trivial transformation, which I've been considering as out of scope for the near future.

I agree it's not urgent, and that it needs more justification than "it seems to work": probably a description of the type system together with a proof of the soundness of the transformation under reasonable hypotheses. I just wanted to write down the idea first.

@aspiwack
Copy link
Member

aspiwack commented Feb 3, 2021

You can insert dynamically typed blocks in a statically typed context using Assert. So simply having defined id in a static context is not sufficient.

The comparison with having no type signature is intriguing, though. It's an interesting curiosity to keep in mind.

@yannham
Copy link
Member Author

yannham commented Feb 3, 2021

You can insert dynamically typed blocks in a statically typed context using Assert. So simply having defined id in a static context is not sufficient.

Oh, you're right, of course. That's strange, because it means that in one "direction", when an untyped context calls from an inner typed term, then the typed code is always protected and blame safety holds (if it doesn't contain any untyped chunk, at least).

But this is not true in the other direction without annotations, that is an inner untyped code calling from enclosing typed code can call anything without firing any contract, making a well-typed term elicit blame, as in this example:

let funs = {
  x : Num = let plus = fun x => (minus x) + 1 in (plus 1) + (plus "a" | Num);
  minus : Num -> Num = fun x => x - 2
}

where the well-typed (fun x => minus x) get blamed for the contract violation. Intriguing, indeed. I don't remember how this is handled in Wadler's paper, but I'm gonna look again.

@aspiwack
Copy link
Member

aspiwack commented Feb 3, 2021

There is a sense in which this is the correct behaviour. But it's definitely quirky.

@yannham
Copy link
Member Author

yannham commented Feb 3, 2021

For the record, (one of) my use-case was the following: currently, we have the list stdlib as

{
  lists = {
    ...
    fold : forall a b. (a -> b -> b) -> List a -> b -> b = fun f l x => ...
    ...
  }
}

As noted in #226, this currently makes the stdlib unusable in typed context. A second problem appeared during #284 when trying out sort for example, which is performance: on mutual recursive functions, the costs of contracts calls quickly add up.

A simple solution, even if temporary, is to hoist up the type annotations up to the lists record:

lists : {
  ...
  fold: forall a b. (a -> b -> b) -> List a -> b -> b,
} = {
  ...
  fold = fun f l x => ...
}

The gain is twofold:

  1. The type information becomes available outside of the lists record
  2. Since functions inside the module are not annotated anymore (although we could argue this is a bad thing, documentation wise), they don't fire any contract during recursive calls.

Sadly, this doesn't work because we don't do generalization automatically, thus fold and alike need to be re-annotated locally to allow them to have a polymorphic type. This would still solve 1., at the price of two type annotations, and two contracts checks when calling stdlib functions from the outside, but it makes 2. impossible, because polymorphism => annotation => contract.

The point is, we could have a way of annotating subterms just for the sake of typechecking, but without firing a contract. Maybe it could be special syntax restricted to the stdlib. Or maybe that's already too many different type annotations.

@aspiwack
Copy link
Member

aspiwack commented Feb 3, 2021

Sadly, this doesn't work because we don't do generalization automatically

This has really little to do with automatic generalisation. It's only a limitation in the implementation.

I'm surprised about the connection to the sort story. I don't expect sort to bear a lot of cost from contracts: it's a superlinear function, with pretty much no tail calls.

To preserve tail calls we may want to be a bit clever about contract check insertion in (mutually) recursive definitions. It's true that not checking them at all is attractive, but it will give less precise blame when using the recursive call in a dynamically typed context. Maybe we can leverage some ideas from the Threesomes, With and Without Blame paper.

@yannham
Copy link
Member Author

yannham commented Feb 3, 2021

I'm surprised about the connection to the sort story. I don't expect sort to bear a lot of cost from contracts: it's a superlinear function, with pretty much no tail calls.

On an very unscientific benchmark™, for a few non random at all samples™ of small size (between 50 and 100), I roughly get a factor 10 in real time with and without contracts (without means that all contracts are directly disabled in the eval function). Higher order contracts with polymorphism still does a bunch of mumbo jumbo unwrapping, passing continuations, allocating a few local variables, plus the list contracts that are linear and may need to copy their argument which basically doubles the allocations, so all of this is probably not free.

But memory management is also very naive and the whole interpreter as well as the implementation of contracts is pretty much un-optimized, so maybe we can bring down this factor to something reasonable, even without eliding recursive calls.

@yannham
Copy link
Member Author

yannham commented Feb 4, 2021

Closing for now, since the proposal as issued initially is unsound. There are probably others sound contract optimizations to make, but let's discuss them on dedicated issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants