Module references #12

Open
sirlensalot opened this issue Sep 3, 2020 · 3 comments

Comments

@sirlensalot
Contributor

sirlensalot commented Sep 3, 2020

Problem Statement

Pact lacks a mechanism for polymorphic dynamic dispatch: all dispatch in Pact is resolved at compile/install time, and polymorphic dispatch is not offered at all. While this is not a significant limitation for many smart contract applications, and indeed "static linking" has safety benefits, certain kinds of apps need to interact with other applications dynamically.

Consider a Uniswap-style exchange that interacts with fungible-v2 tokens. On the EVM, Uniswap offers the ability to add new pairs on-demand, as a business-critical feature. In Pact, dynamic interaction with other contracts is not possible.

A work-around could employ a mechanism, such as a "fungible registry" that would provide indirection along the lines of dispatching on token symbol to fungible-v2 methods like transfer. A new token would simply require a modification referencing the new token's code and an update to the code on the blockchain. Such a registry could service any number of applications seeking to access fungible tokens.

Nonetheless, in addition to updating the registry, the client software (in this example, the exchange) would have to be re-uploaded simply to link to the new registry version. This is because Pact intentionally does not automatically upgrade downstream dependencies, as a safety feature for code stability and resilience.

However, the registry code itself would reveal another problem: the lack of a polymorphic dispatch mechanism. The registry would have to manually case on symbol for each operation, repeating the same dispatch around transfer and every other method. The code proliferates, and with it the potential for bugs.

A solution must allow for dynamic dispatch to a module, and polymorphic access to the methods of implemented interfaces.
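The contrast between the registry work-around and the desired behaviour can be sketched outside Pact. The following is a minimal Python model, not Pact code and not a proposed implementation: `Coin`, `KDogeCoin`, `transfer_static`, `transfer_dynamic`, and the `registry` dict are all hypothetical names, and only `transfer` is modelled from fungible-v2.

```python
from typing import Dict, Protocol


class Fungible(Protocol):
    """Stand-in for the fungible-v2 interface (only `transfer` is modelled)."""

    def transfer(self, sender: str, receiver: str, amount: float) -> str: ...


class Coin:
    def transfer(self, sender, receiver, amount):
        return f"coin: {sender} -> {receiver} {amount}"


class KDogeCoin:
    def transfer(self, sender, receiver, amount):
        return f"kdogecoin: {sender} -> {receiver} {amount}"


coin, kdogecoin = Coin(), KDogeCoin()


def transfer_static(symbol, sender, receiver, amount):
    """Work-around: each operation needs its own case-on-symbol ladder,
    and adding a token means editing and re-uploading this code."""
    if symbol == "KDA":
        return coin.transfer(sender, receiver, amount)
    if symbol == "kDOGE":
        return kdogecoin.transfer(sender, receiver, amount)
    raise KeyError(symbol)


# Desired behaviour: references are registered at runtime and every
# operation dispatches through the interface, with no per-token branch.
registry: Dict[str, Fungible] = {}


def add_fungible(symbol: str, fungible: Fungible) -> None:
    registry[symbol] = fungible


def transfer_dynamic(symbol, sender, receiver, amount):
    return registry[symbol].transfer(sender, receiver, amount)
```

In this model, supporting a new token is one `add_fungible` call, whereas `transfer_static` and every sibling operation would each need a new branch.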

Proposal Draft: Module References

Module references reify a Pact module into a concrete datatype. For purposes of this draft we will say its type is ref, but it might benefit from a different name. Here is a function signature that accepts a module ref:

(defun add-fungible ( symbol:string fungible:ref ) )

ref by itself is not very useful as it gives no indication what interface the module implements. A simple re-purposing of Pact's user-schema type syntax could indicate a single supported interface:

(defun add-fungible ( symbol:string fungible:ref{fungible-v2} ) )

If refs were to support multiple interfaces, this syntax would be insufficient. However, an argument can be made for supporting only single interfaces, to make roles clear. Consider an ancillary interface to fungible-v2 that specifies whitelisting; a signature would then expect two arguments:

(defun add-whitelist-fungible ( symbol:string fungible:ref{fungible-v2} wlist:ref{whitelist-v1} ) )

Usage

The simplest way to use a module ref is for it to "stand in" for standard Pact unqualified namespace syntax:

  (wlist.approve "Alice" 100.0)
  (fungible.transfer "Alice" "Bob" 10.0)

This of course will complicate resolution. Presumably reference arguments would shadow unqualified namespaces.

Instantiation

The most straightforward way to provide a module reference would be to reference the module itself as a bareword:

  (exchange.add-fungible "KDA" coin)

In the case of a module that implements two interfaces, it would just be used twice:

  (exchange.add-whitelist-fungible "kDOGE" user.kdogecoin user.kdogecoin)

At least in the root namespace, this would cause name clashes with existing Pact code which is free to reuse module names. For instance the coin contract can use a variable called coin with no ambiguity.

This also impacts resolution as it would introduce [namespace.]module references in Pact code as concrete datatypes. An alternate approach would be to explicitly declare module references, with perhaps a defref form:

(module kdogecoin GOV
  (implements fungible-v2)
  (implements whitelist-v1)
  (defref FUNGIBLE fungible-v2)
  ...
)

Or, an implements declaration could be extended to declare the ref:

  (implements fungible-v2 as FUNGIBLE)

The advantage of this is simply to provide a context for resolution, as opposed to special syntactic support for recognizing module refs. It also avoids the name clash issues.

Additional requirements

A module reference MUST be storable in the database, in order for something like add-fungible to work as intended.
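One way to picture the storability requirement is a reference that round-trips through a database as plain data. This is a minimal Python sketch under assumptions not stated in the draft: the `ModuleRef` record, its two fields, and the JSON encoding are hypothetical, and a dict stands in for the database.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ModuleRef:
    name: str          # fully qualified module name, e.g. "user.kdogecoin"
    interfaces: tuple  # interfaces the reference is typed at


def store(db: dict, key: str, ref: ModuleRef) -> None:
    """Persist the reference as a JSON string, i.e. as ordinary row data."""
    db[key] = json.dumps(asdict(ref))


def load(db: dict, key: str) -> ModuleRef:
    """Rebuild the reference; JSON lists are converted back to a tuple."""
    raw = json.loads(db[key])
    return ModuleRef(raw["name"], tuple(raw["interfaces"]))
```

Under this assumption, something like add-fungible would write the reference on registration and a later transaction would read it back before dispatching.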

Further considerations

Safety. Pact's existing system of static linking provides strong guarantees:

  1. Turing incompleteness. Pact statically guarantees that no code can recurse. This also has performance benefits as the runtime is always walking a fixed tree of outcomes.
  2. Dependency safety. Static linking guarantees that "the code you upload (and sign) is the code you run". Dynamic linking implies that new code can be executed at any time.
  3. Amenability to formal verification. Pact's static, terminating trees are very easy to verify. Interfaces do support properties, so there is some ability to leverage those in a verification environment. However, it is simply impossible to make guarantees about arbitrary code.

@emilypi
Member

emilypi commented Sep 4, 2020

This looks like it will be an incredibly useful feature for DEXes. I have some comments and general questions surrounding the implementation of something like this.

Syntax

  1. I propose that we reuse the module keyword here, generating syntax a la

    (defun add-fungible ( symbol:string fungible:module{fungible-v2} ))

    Which can be read "add-fungible takes a string symbol and a module reference that implements fungible-v2 as
    arguments". I think this flows intuitively with the understanding that one passes a module reference here, where ref is
    somewhat ambiguous.

  2. I enjoy the bareword syntax for passing module references and the as qualification for disambiguating module references (and imports in general!). However, I'm unsure about the defref syntax. It seems to me that using [namespace.]module-style references would be the most transparent, and we could make it a rule that module references can only be a fully qualified module name or an alias to one. Module names already store this information in the database, and the user must know precisely which module reference they are working with at all times. That's a reasonable bar to meet IMO.

Questions

  1. Since interface members must not clash in the implementing module, we are guaranteed that if a module M implements A and B, then the module reference should be a sub-rowtype of A + B - that is, the module's reference type should be valid for either module{A} or module{B}, since module{A + B} is a subtype of both. It would be nice to be able to say something more complicated about module reference parameters like this, instead of just being confined to a single interface reference. What do you think of the addition of a with keyword, so that we can express this? For example,

    (defun add-whitelist-fungible ( symbol:string fungible:ref{fungible-v2 with whitelist-v1} ))

    Then, you can say

      (exchange.add-whitelist-fungible "kDOGE" user.kdogecoin)

    And the reference is sound. One nice thing about this approach is that if a module implements two interfaces, formal
    verification does not suffer because models are either contradictory or they dovetail, so module{A + B} in this case
    will have a set of properties that will be models(A) ∪ models(B) ⊆ models(M) and will not clash. So we can say
    meaningful things about properties of modules using their reference type - it would just have to be restricted to the
    models defined in the interfaces.

  2. This is more of a comment than a question, but if we do go with the above syntax, then we have to aggressively resolve members of an interface in order to resolve its type. Since these are signatures, it shouldn't be too much of a problem.

@sirlensalot
Contributor Author

sirlensalot commented Sep 4, 2020

  1. I like with but might prefer a terser , or +.

  2. Agree that defref et al. is unnecessary, happy to go with bareword module names. EDIT: I think I am accustomed to "module terms declared in the module are in-scope in the module", i.e. there's a clear production for each in-scope name. If module was instead defmodule, the production of the module reference would be crystal-clear, but it's fine the way it is too -- we'll simply introduce the module reference into the environment when module is declared.

  3. I wanted to point out that a concern in dynamic dispatch is looping. We already prevent this at runtime for user guards and keyset predicates (via applyByName in Pact.Eval). We are of course losing the compile-time protection, but runtime will be ok.

@sirlensalot
Contributor Author

@emilypi Wanted to record here our discussion about the isomorphism with row types, which rationalizes borrowing the syntax from user schema types. For purposes of this discussion I'll go with + instead of with.

We can generally extend the {...} type parameter production to allow for composition. Thus if {foo} designates the schema foo for an object or table, {foo + bar} constructs the composite schema foo ∪ bar. Presumably this would also invoke requirements for disjoint field names. This essentially constructs an "anonymous schema" that can probably be tracked by a canonical mangling of the component names.

Likewise, for module references, {baz} designates the type of a reference to a module that implements baz. {baz + quux} constructs the "composite interface" baz ∪ quux with the requirements noted above of disjoint signatures.
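The composition rule for both schemas and interfaces can be sketched as a disjoint union of name-to-signature maps. This is a small Python model under stated assumptions: `compose` and `implements` are hypothetical helpers, and the signature strings are abbreviated placeholders rather than the real fungible-v2 or whitelist-v1 declarations.

```python
def compose(*interfaces):
    """Build the composite interface; member names must be disjoint."""
    combined = {}
    for iface in interfaces:
        clash = combined.keys() & iface.keys()
        if clash:
            raise TypeError(f"clashing members: {sorted(clash)}")
        combined.update(iface)
    return combined


def implements(module, interface):
    """True when the module provides every member at the declared signature."""
    return all(module.get(name) == sig for name, sig in interface.items())


# Placeholder signatures, for illustration only.
fungible_v2 = {"transfer": "(string, string, decimal) -> string"}
whitelist_v1 = {"approve": "(string, decimal) -> bool"}

coin = dict(fungible_v2)
kdogecoin = {**fungible_v2, **whitelist_v1, "mint": "(string, decimal) -> string"}
```

In this model a module that satisfies the composite also satisfies each component, so a reference typed at {baz + quux} could be passed wherever {baz} alone is expected.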

The isomorphism is fascinating to me, as I've never considered "row types for module signatures" and in general have only heard of row types applying to "plain old" datatypes. But I suppose an interface can be seen as a constructor taking function types.

I guess the only thing I might throw into this discussion is "what about type variables"? For instance, Haskell allows for composition of typeclass constraints by accumulating constraints on a type variable, as opposed to a + operator. I mention this because of a planned upgrade for higher-kinded user schemas in Pact, e.g. (forgive the ugly syntax):

  (defschema row-with-key <r>
     key:string
     row:r)

At the usage site, how is r specialized? I'm guessing it would be some kind of universal vs specialized thing:

  ;; 'r' is free/undefined so it expects the calling context to specialize, and is universally quantified for 'foo' 
  (defun foo (row:{row-with-key r}) ... )

  ;; specialization allows 'bar' to know about the type
  (defun bar (row:{row-with-key person}) ... )

I would (perhaps prematurely) conclude that schema type variables will not feature constraints and instead will be universal vs specialized in this fashion. This seems to motivate the "row type" presentation.

The one thing I was noticing about row type presentation vs Haskell constraints on a type variable is you get a little bit of reuse within a signature:

  (defun baz:object{foo + quux} (bar:object{foo + quux}) ... )

There's no "shorthand" for the {foo + quux} construction like you get with baz :: Foo a => Quux a => a -> a in Haskell. I think we can live without it.
