
Design parameters ('epic issue') #1

Open
kriskowal opened this issue Apr 11, 2021 · 24 comments

Comments

@kriskowal
Collaborator

Let’s consider this massive ticket an epic and consider breaking out individual design threads as linked issues. Please feel free to join me in evolving the ticket description.

⚠️ The following opinions are not informed by much familiarity with Agoric’s current CapTP! My experience flows from prototyping a very limited Q-Connection CapTP many years ago and, much more pressingly, from my experience with Thrift and Protobuf. I’m vaguely aware of the designs of Cap’n Proto and FlatBuffers.

Wire protocols have a bunch of non-orthogonal design dimensions, and cross-language communication has a lot of gotchas. A protocol designed for high-fidelity communication among JavaScript workers won’t necessarily (but might) be just as suitable for JavaScript workers communicating with Racket workers, and vice versa. A different protocol might be suited for communication among closely related languages like Python, Ruby, and shell scripts. A different protocol might generalize to all those languages plus C, C♯, Java, Go, and Rust. That last language class might be the first where it’s necessary to introduce an IDL to participate idiomatically.

We might also be in need of multiple coherent CapTP protocols, possibly with the assistance of gateways.

I would much rather design a wire protocol or IDL for a closed set of design languages than pretend that the design is suitable for any! Every new language brings some limiting quirk to the table, like JavaScript’s null and undefined, JavaScript’s 53 bit integers, JavaScript’s conflation of objects-as-structs and objects-as-dictionaries, Java not having unsigned integers, Go’s zero-value idiom, C’s struct packing, Python’s snake case, Perl’s advanced dementia.

So I’d like to collect some opinions about the goals and non-goals of OCapN CapTP.

  • participating languages. I like limiting the scope to Racket and JavaScript only if adding even one more participating language is a specified non-goal.
  • binary on the wire, text on the wire, or one of each.
  • support for schema evolution. Should it be possible for a proxy to forward a message without loss of information, regardless of whether its IDL version is older than the origin or destination?
  • IDL, no IDL, or option of IDL
  • self-describing on the wire? connection-oriented shared evolving dictionaries?
  • compact on the wire or lean on compression
  • lossless intermediates (proxies and storage), regardless of language of proxy
  • safe to assume: frozen data (records and tuples, not objects and arrays)
  • null, undefined, or both? This relates to the issue of whether an intermediary (proxy or storage) is obliged to round-trip both null and undefined, or if they both bottom out to one or the other.
  • connection oriented and supporting evolving shared dictionary caches for interned keys and enums?
  • treatment of symbols (I’m sure the only sane answer here is that symbols do not transit except for some symbols that are recognized to correspond to specific common features with suitable idioms in every participating language, like async iterators and async linked lists)
  • are records structs? are the field names consistent with the idioms of the wire protocol or consistent with the idioms of the subjective language and translated? For example, if JavaScript sends {keyName}, does Python receive {key_name}, C♯ {KeyName}?
  • is this protocol suitable for use as an arena allocator (Cap’N’Proto and FlatBuffers are)

I’d suggest we would at minimum need support for the following data types:

  • byte arrays, which might have to be base64 encoded if the protocol is text
  • utf-8 strings
  • signed integers up to 53 bits, as would be represented with a JavaScript number; an IDL might specify exact widths for languages that care.
  • signed integers of 64 bits or more, as would be represented with a JavaScript BigInt; an IDL might specify exact widths for languages that care.
  • floating point numbers of up to 64 bits (53 bit mantissa) as would be represented in JavaScript as number, with suitable representations for Infinity, -Infinity, NaN, and maybe even -0, unlike JSON.
  • enums, for which the representation in JavaScript and in command-line tools is idiomatically a string, but numbers most places, and depending on whether the wire protocol is self-describing and whether schema evolution is a goal, can be quite complicated to get right.
  • tuples
  • records
  • maps?
  • sets?
  • dictionaries? (as would be represented with JavaScript Object but Python dict)
@zarutian

As the one who is writing an explorative implementation of Spritely Goblins, who has specified a msgpack CapTP schema, and who has spent too much time digging around in the CapTP pages on erights.org and in the E-on-Java code base, I have quite some opinions on these matters.

What I have found out is that I detest IDLs and other non-self-descriptive formats like JOSS that make architectural (both design and ISA) and coding-environment assumptions. (For instance, capnproto struct datum packing order has yet to be specified in any form other than template-heavy C++ code.) Such IDL-based systems just require too much 'hacktivation' energy to get started. Meanwhile, writing a Syrup parser was stupidly simple and took about half a day.

Which datum and data forms to support?
Well, bytestrings and UTF-8 strings are a good start for binary blobs and texts. BigInts, perhaps with different size ranges encoded differently, msgpack-style.
Perhaps Symbols too, for the most-used verb selectors, JS-like symbols, and so on.
The compound data forms needed at minimum are the list (array) and the map (key-value pairs), for passing positional arguments and keyword arguments of invocations over the wire.
I do like Syrup records for stuff like the ops and descs.

Probably more thoughts and comments later.
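To illustrate why a Syrup-style encoder is "stupidly simple": the format is essentially netstrings plus bracketed containers. The sketch below covers only bytestrings, strings, lists, and dictionaries; it follows my reading of the Syrup README (length prefix, then a type tag character), so treat the exact tag characters as assumptions and check them against the spec. Integers, records, sets, and the canonical key ordering Syrup requires for dictionaries are omitted.

```javascript
// Minimal Syrup-flavored encoder sketch (Node.js). NOT spec-complete:
// integer/record/set encodings and canonical dict-key sorting are omitted.
function encode(value) {
  if (value instanceof Uint8Array) {
    // bytestring: <byte length>:<raw bytes>
    return Buffer.concat([Buffer.from(`${value.length}:`), Buffer.from(value)]);
  }
  if (typeof value === 'string') {
    // string: <utf-8 byte length>"<utf-8 bytes>
    const bytes = Buffer.from(value, 'utf8');
    return Buffer.concat([Buffer.from(`${bytes.length}"`), bytes]);
  }
  if (Array.isArray(value)) {
    // list: [<item1><item2>...]
    return Buffer.concat([Buffer.from('['), ...value.map(encode), Buffer.from(']')]);
  }
  if (value instanceof Map) {
    // dictionary: {<key1><val1>...} -- real Syrup requires sorted keys
    const parts = [Buffer.from('{')];
    for (const [k, v] of value) parts.push(encode(k), encode(v));
    parts.push(Buffer.from('}'));
    return Buffer.concat(parts);
  }
  throw new TypeError(`unsupported value: ${typeof value}`);
}

// encode(['cat', 'dog']).toString('utf8') → '[3"cat3"dog]'
```

The decoder is symmetric: read digits until the tag character, then consume that many bytes, recursing on `[`/`{`. That round trip is the whole parser, which is why it fits in an afternoon.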

@MostAwesomeDude

I wanted to add a couple notes from the E-flavored object-capability world. In Monte, we've experimented with several flavors of CapTP.

AMP and JSON were sufficient to allow me to implement multiprocessing for Monte, over AMP. This suggests that we don't need to invent a new serialization format (and IDL, etc.) merely for transporting capabilities.

I wish that I could say "let's use Capn Proto" and be done. However, it turns out that implementing a full Capn Proto subsystem is a lot of work! We've needed a capnpc entrypoint, and there is an asymmetry between reading and writing which is hard to factor well and leads to a messy support module. This is also a problem with JSON and AMP; I suspect that, in general, the capabilities to encode and decode are not actually facets of a single codec but two distinct scripts welded into a single location. However, the JSON serialization format and the AMP transport protocol are not very hard to implement.

That said, we do already have toolchain support for compiling Capn Proto IDL to Monte bytecode. So let's use Capn Proto and be done with wire details.

@kriskowal
Collaborator Author

Thanks @MostAwesomeDude. Can you confirm that “Let’s use Cap’n’Proto” implies you’re in favor of requiring an IDL?

@kriskowal
Collaborator Author

I agree we should not invent a wire format, and I’d like to frame the conversation around choosing a format that satisfies the rest of the requirements, once we have a firmer notion of which branch of the design space we’re wandering down. If we’re leaning binary, MsgPack or CBOR are likely sufficient. I like the Protobuf varint because it’s precision-agnostic, can conceivably scale up to BigInt, and allows for precision increases over schema migration, but these are nice-to-haves, not necessities. I don’t know Cap’n Proto’s wire protocol well enough to judge what constraints it puts on JavaScript or Scheme idioms in particular, so at some point I hope to ask an expert how it (and any other binary protocol) would behave for specific edge cases. I also don’t know whether you can run Cap’n Proto without an IDL. Agoric runs without an IDL today, which is great for rapid iteration.
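The precision-agnostic property of the Protobuf-style varint mentioned above is easy to see in code: each byte carries 7 payload bits plus a continuation bit, so the same wire format covers 53-bit safe integers and arbitrarily large BigInts alike. A sketch:

```javascript
// Base-128 varint (LEB128-style, as used by Protobuf), over BigInt so the
// same routine handles values well beyond 53 bits.
function encodeVarint(n) {            // n: non-negative BigInt
  const bytes = [];
  do {
    let b = Number(n & 0x7fn);        // low 7 bits
    n >>= 7n;
    if (n > 0n) b |= 0x80;            // continuation bit: more bytes follow
    bytes.push(b);
  } while (n > 0n);
  return Uint8Array.from(bytes);
}

function decodeVarint(bytes) {
  let n = 0n;
  let shift = 0n;
  for (const b of bytes) {
    n |= BigInt(b & 0x7f) << shift;
    shift += 7n;
    if ((b & 0x80) === 0) break;      // last byte has continuation bit clear
  }
  return n;
}

// Classic Protobuf example: 300 encodes as the two bytes 0xAC 0x02.
// decodeVarint(encodeVarint(300n)) === 300n
```

A decoder written before BigInt support existed simply caps the value it will accept; a newer one scales up transparently, which is the schema-evolution-friendly behavior described above.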

@zenhack
Collaborator

zenhack commented Apr 13, 2021

I'd second @MostAwesomeDude's suggestion to just pick capnp for serialization and be done with it, and I'd take it one step further: let's use capnproto rpc as the basis for the protocol, and figure out what extensions are needed to do the things people want to do.

Unfortunately, capnproto exists already, so if we do anything other than build something compatible with it, we end up with a world where there is more than one CapTP protocol in active use -- so I would like to thoroughly explore the idea of building the things people want on top of Cap'n Proto RPC, possibly with extensions where needed and feasible, before going ahead and building something incompatible. Maybe there will be true dealbreakers, but maybe not.

The biggest advantage of this approach is that if we can make it work, we can potentially avoid gateways, which would segment the network when it comes to three-party handoff; they become bottlenecks through which all inter-protocol traffic must pass, which is unfortunate. If we can't make it work, we can potentially build gateways, but I suspect that we can.


This leaves open a couple questions that I've thought about:

  • Goblins and Agoric both want to use language-native objects relatively seamlessly. These are dynamically typed languages, so there's the question of how to model that in capnproto. Whatever types we include in the interfaces used for this purpose, we can just define a union type in the IDL to cover them, e.g. for javascript we might do:
struct KV {
  key @0 :Text;
  value @1 :Value;
}

struct Value {
  union {
    number @0 :Float64;
    string @1 :Text;
    array @2 :List(Value);
    null @3 :Void;
    undefined @4 :Void;
    object @5 :List(KV); # Unfortunately capnp doesn't have a built-in map type, but we can layer semantics on top for that.
    function @6 :Function;
    # ...
  }
}

interface Function {
  call @0(args :List(Value)) -> (result :Value);
}

...and the libraries can hide the IDL from their users, embedding js types in capnproto like the above, but full-capnproto implementations can still use this schema to talk to programs using the extra js layer.

  • For pipelining purposes, right now capnproto only supports projecting on fields, and since those are struct fields, this won't allow pipelining on things like javascript objects if those are modeled. The way pipeline operations are modeled in the protocol right now actually makes this a little hard to evolve, but I have an idea of how to do it: just allow calling methods on things other than "capability" pointers, and capnp implementations can provide a way to specify how to service method calls on non-capabilities. Then, we can add pipeline ops as needed with an interface:
interface JsPipeline {
  getProperty @0 (name :Text) -> (value :Value);
  # ...
}

If the remote vat doesn't understand the method, it will throw an exception with type = unimplemented, in which case the operation can just be performed locally when the promise resolves. This allows us to experiment with all sorts of possible operators, without needing to graft each one onto the protocol in an ad-hoc way.
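The try-remote-then-fall-back-local flow described here can be sketched as follows. Everything named below (`remoteCall`, the `'unimplemented'` error tag, the `JsPipeline`-style `getProperty` op) is an assumption for illustration, not an existing capnp API:

```javascript
// Sketch: attempt a pipelined getProperty on the still-unresolved remote
// promise; if the remote vat reports the op as unimplemented, wait for the
// promise to resolve and perform the property access locally instead.
async function pipelinedGet(remotePromise, name, remoteCall) {
  try {
    // Fast path: the remote vat services the pipeline op directly.
    return await remoteCall(remotePromise, 'getProperty', [name]);
  } catch (err) {
    if (err.type !== 'unimplemented') throw err; // real failure: propagate
    // Fallback: resolve first, then act locally.
    const resolved = await remotePromise;
    return resolved[name];
  }
}
```

The caller's code is identical either way; only latency differs, which is what lets new pipeline ops be rolled out without a protocol flag day.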

This still leaves open the question of what Value should look like exactly, and how many language models we should or shouldn't cram into it. It also opens up the possibility of just having different types for Racket vs. JS values, and having to drop down to a lower-level interface to send "non-native" values, whether those be a Value for a different language or some other capnproto interface.

I'm sure there are various other considerations I haven't thought of.

@kriskowal
Collaborator Author

If I may recap @zenhack, I believe you are proposing that Agoric and Goblins communicate using a Cap’n Proto meta-IDL. Other languages would codegen a client from this meta-IDL in order to interact with Agoric and Goblins vats, and would have the option of communicating amongst themselves with their own hand-rolled IDLs.

The hand-rolled IDLs provide a superior developer experience for languages using generated clients, since the IDL maps more directly to language idioms. This creates an incentive for such implementations to provide both their own IDL and a meta-IDL bridge. These are effectively hand-rolled gateways embedded in services.

Protobuf 3 similarly provides a JSON equivalence, but the conversion is well-defined and not hand-rolled.

Do we believe that it would be possible for Agoric/Goblin intercommunication to occur without an IDL, using a partially self-describing wire protocol, that can be fully-described for the purpose of generated clients not built upon Agoric or Goblins?

@MostAwesomeDude

I want to summon @kentonv, who can steelman my claims (and more likely will remind me how wrong I am!) I have three points to address: whether the IDL is required to traverse buffers, whether dynamic languages can quickly adopt an interface with the IDL, and whether Goblins, Agoric, Monte, etc. could talk with clients in other languages.

It's possible to traverse Capn Proto buffers without an IDL. The resulting data structure is a tree. (Technically, it can be a cyclic graph, but @zenhack or @kentonv would ask implementations to not do that. Monte doesn't mind cyclic structures though.) The tree has full type information WRT the buffer's layout; everything can be pulled out and read. However, the names of unions and enums are missing; it's all numbers.

I've done this before, because writing a Capn Proto implementation requires hand-parsing buffers in a bootstrap. After the necessary support library is factored out, the bootstrap module is pretty small, but it was a difficult time. For JS, there's an official upstream Node.js implementation. For Racket, I can't find anything, but Racket's got all of Monte's tools and more, so it sounds feasible although difficult.

Capn Proto RPC offers a sort of high-level memory-safe polymorphism via AnyPointer. This mechanism is like void* casting in C, or reinterpret_cast<> in C++, but respects the actual shape of the buffer's tree and can safely indicate errors to a parser. When we encode the actual payload in the RPC protocol, we are allowed to encode AnyPointer to a data structure, made out of any Capn Proto schemata which we happen to have on hand, and we also have a tagalong List(CapDescriptor) which allows us to recover capabilities embedded within that structure.

Those were not great words. In better words: If you know Monte's schema for messages, then you can send messages to Monte vats speaking Capn Proto RPC. Two things to note: The Monte schema is very plain and would be easy to reverse-engineer (read: sort of self-describing), and also monte-language/typhon#220 would allow us to accept multiple schemata which are indexed by those hexadecimal UUIDs that are on the first line of the IDL (Goblins, Agoric, Sandstorm, etc.)

(Here is where my persistent vat implementation would go -- if I had one~! A persistent vat is just one which can host meaningful SturdyRefs with durability and the ability to take backups and time-warp and etc.)

I gotta endorse @zenhack's point that Capn Proto RPC has seen real-world use, mostly via Sandstorm.

@kentonv

kentonv commented Apr 13, 2021

I haven't had a chance to read this whole thread, but wanted to throw a couple things out there...

Cloudflare's Durable Objects is an actor-model global distributed compute platform that is today implemented using Cap'n Proto RPC. At present we don't directly expose Cap'n Proto to the application layer, instead requiring apps to use HTTP-shaped interactions. We do, however, implement e-order (it's even in the docs).

We'd like to expose non-HTTP-shaped RPC to the application layer. It's likely we'd do this using V8 serialization to encode JavaScript objects. V8 serialization implements the "structured clone" algorithm that appears commonly in the web platform. This is the best thing for us to use because V8's implementation is well-optimized, well-tested, and supports a well-defined and widely-understood subset of types. In our system, the serialized bytes will never be exposed directly to the application, so it doesn't matter that it's V8's particular format. This format is also available in Node.js.

But I guess defining a common standard based on V8 serialization -- which itself is not documented as a standard -- might be awkward.

> The biggest advantage of this approach is that if we can make it work, we can potentially avoid gateways, which would segment the network when it comes to three-party handoff; they become bottlenecks through which all inter-protocol traffic must pass, which is unfortunate. If we can't make it work, we can potentially build gateways, but I suspect that we can.

It's possible to build distributed gateways. Complicated, but possible.

@cwebber
Contributor

cwebber commented Apr 14, 2021

Note that some of us had a bunch of useful discussions last evening (somewhat impromptu on a call that wasn't planned for this purpose specifically), and there's another meeting crossing pretty much the Agoric, Spritely, CapnProto worlds on an upcoming call.

Foolishly I did not take notes, but I'm going to write down what I remember:

  • Everyone seems more optimistic now that @zarutian has implemented a basic version of Goblins' version of CapTP on top of JS/Agoric's tools
  • Goblins' CapTP uses Syrup as a temporary encoding option. It is very similar to SPKI canonical s-expressions, bencode, and netstrings. See these comments for a short summary of the entire system. Syrup was well received at the meeting as being simple and comprehensible. Nobody is making a commitment to sticking with it (you could encode the same data more efficiently using other systems) but it seems to very nicely keep things moving along for the moment, and seems to have aided @zarutian in getting Spritely's CapTP done fast. It might be worth just sticking with it for the moment: it's a binary format that's also easy enough, and switching it out for a different encoding can be done fairly easily later (Syrup is already an encoding of Preserves, which already has two other encodings, a human-readable syntax and a binary syntax... as long as we stick to those same base types, swapping out the encoding can be easy enough).
  • The general sense is that it's probably best to first focus on getting Agoric's and Spritely's stuff demonstrably interoperable before talking about any other interoperability, including with Sandstorm and CapnProto; however, we would like to do a survey of the major differences in approach between the systems, which we think can positively inform things, and start the wider compatibility conversation later.
  • Regarding the IDL bit: note that part of the conversation has memetically seemed close to "If OCapN uses CapnProto, this is going to simplify talking to sandstorm apps from Agoric and Spritely programs!" However this doesn't necessarily seem true... in general, if you're gonna "gödel it", it's easier to gödel upwards in abstraction layers rather than crash through the abstraction floor. Layering Spritely and Agoric programs on top of CapnProto would involve adding a "dynamic captp type", which is probably the "only" type that Spritely and Agoric programs could speak... so talking to existing sandstorm apps is not a clear and assured thing. (What about in the reverse direction? Many Haskell programs do take what's arguably an IDL approach in unpacking something like JSON into expected structures at "parse time", so an IDL can also be layered on top of an untyped system... see also Typed Racket and how it negotiates at the boundary.)
  • We had good progress on discussing the "method-centric vs procedure-centric vs procedures-or-methods-centric" approaches to invocation. We seem to have two paths we seem to think are interesting:
    • Have two different application styles, one for method invocation, one for procedure invocation. This would be possible in Goblins, but ugly. It also would mean that what an interface "looks like" would vary depending on whether Agoric or Goblins implemented it first, possibly leading one or the other side to be unhappy, but maybe this is minor.
    • It would be possible to do the "lambda the ultimate" type approach, i.e. procedure-centric, and still work with objects on Agoric's side. For incoming messages in CapTP, if a message is recognized as being sent to an object, we would assume the first argument is a method selection and treat it that way to perform method dispatch. If sending a message to a procedure, apply all arguments. This has the upside of supporting more-unified application interfaces between the Spritely and Agoric implementations, but has the downside that you cannot have a function in JavaScript which also has methods. @erights posited (correct me if I get this wrong please) that "we can design for this, and this might confuse JavaScript developers as something they must adapt to initially, but it might not be more confusing than the new considerations JavaScript developers are already adapting to"
  • It seems likely we're going to very easily and quickly agree on a core set of types for Spritely/Agoric interop, and this is good news.
  • Some time was spent discussing the choices of memory management. E-style (including Spritely, Agoric) CapTP implementations make an assumption of distributed (acyclic) GC; CapnProto makes the assumption of manual release. @erights seems convinced these lead to different design considerations, particularly in the extra work in E-style CapTPs to avoid a GC race condition that might not be encoded in CapnProto (but it seems we might have to review this). I asked whether the reverse-direction approach could be done: could one start with a protocol that default-assumes GC might be used, where an "explicit release" in a memory-managed language is effectively "actively reducing the final count" in the same way the GC would have? @erights confirmed he believed this was a viable direction.
  • The most important next thing might be to do a demo of Agoric-based tooling and Goblins tooling talking over goblin-chat, for instance. That would be a visually powerful demo.

There's a lot else to discuss obviously I think. Also my memory, as usual, may have gotten something wrong. Let me know if something needs correcting.

@cwebber
Contributor

cwebber commented Apr 14, 2021

Oh one other thing that was an interesting and useful observation: this might not be an Agoric/Spritely CapTP vs CapnProto duality debate. CapTP is a membrane... so that means OCapN CapTP is also a membrane. Likewise, CapnProto is a membrane. So it could be possible to write the same program that works in either. Indeed this is exactly the idea for supplying ActivityPub support in Spritely: make it a membrane.

One interesting bit though is that even if we had two protocols/membranes, the idea of the OCapN netlayer interface idea could work for both... it makes sense to layer both CapnProto and OCapN CapTP over the same base connection types. So, the mechanism to either connect over Tor Onion Services, I2P, DNS+TLS, etc with OCapN goblin-chat could also be generalized for CapnProto to be able to use. In this sense, if we added a separate layer for CapnProto, we could still get a big win so that Spritely and Agoric programs could easily still talk to CapnProto applications, even if the default for Spritely / Agoric programs might not be as IDL-centric and might not have the same memory management assumptions.

@cwebber
Contributor

cwebber commented Apr 14, 2021

BTW, one of the reasons I chose Syrup for my initial implementation is exactly because I wanted something simple that could be written in just 2-3 hours on any platform that wasn't aiming to be the best encoding of all time. It's a huge bikeshed and we're going to be very vulnerable to spending all our time painting it if we're not careful. This is really the least important layer for us to spend time on (well, maybe that's not true if you start with a static/IDL centric approach, but it's true from my perspective); the really hard design decisions aren't in the layer that could be swapped out in about three or so lines of code if necessary later. I'd prefer we could pick something and move on. Since we have two implementations already using Syrup and the implementors of those seem to think it was decent enough, I'd say let's focus on the other much more interesting and difficult bits.

@zenhack
Collaborator

zenhack commented Apr 14, 2021

Regarding GC vs. manual memory management, I strongly suspect there's a misunderstanding somewhere, because you seem to be suggesting that this has protocol level implications, and I do not think that's the case -- capnproto uses the same refcounting design at the protocol level as everything else. In a GC'd language there's absolutely no reason you couldn't just rig up a finalizer to drop the refcount and call it a day, as I have in haskell-capnp so far. If the race condition you mention is the one I think it is, capnproto does indeed deal with this. As far as I can tell the "viable direction" you describe is where we already are. But we should make sure we're on the same page.
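The "rig up a finalizer to drop the refcount" idea translates directly to JavaScript via FinalizationRegistry. A minimal sketch, where `sendRelease` stands in for the wire-level refcount-decrement message (nothing here is an existing capnp or CapTP API):

```javascript
// When a registered local proxy becomes unreachable, the registry callback
// fires (eventually -- timing is entirely up to the GC, which is exactly
// the memory-pressure concern raised below) and notifies the remote vat.
const pendingReleases = new FinalizationRegistry(({ exportId, sendRelease }) => {
  sendRelease(exportId); // e.g. emit a Release / gcExportOp message
});

function makeProxy(exportId, sendRelease) {
  const proxy = { exportId };            // stand-in for a real remote stub
  pendingReleases.register(proxy, { exportId, sendRelease });
  return proxy;
}

// Usage: once all references to `p` are dropped, a later GC cycle may
// trigger the release message -- but nothing forces the GC to run.
const p = makeProxy(7, (id) => console.log(`release export ${id}`));
```

The sketch also makes the failure mode easy to see: if the holding vat never comes under memory pressure, the callback may never fire, and the remote export leaks indefinitely.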

> Regarding the IDL bit: note that part of the conversation has memetically seemed close to "If OCapN uses CapnProto, this is going to simplify talking to sandstorm apps from Agoric and Spritely programs!" However this doesn't necessarily seem true... in general, if you're gonna "gödel it", it's easier to gödel upwards in abstraction layers rather than crash through the abstraction floor. Layering Spritely and Agoric programs on top of CapnProto would involve adding a "dynamic captp type", which is probably the "only" type that Spritely and Agoric programs could speak... so talking to existing sandstorm apps is not a clear and assured thing.

I guess I'd envisioned dealing with the other direction via reflection, which still needs designing/implementing on the capnp side, but would allow calling into capnp from dynamically typed captp by mapping the dynamic data to the methods described by the capnproto schema.

@kentonv

kentonv commented Apr 14, 2021

Right. Cap'n Proto, as a protocol, implements exactly the same refcounting approach as CapTP.

The difference is philosophical: I don't believe you can simply hook up this refcounting to local GC finalization hooks and achieve acceptable results. Good GC implementations depend heavily on memory pressure notifications. In order to have distributed GC, you need a notion of distributed memory pressure. Otherwise, a machine where no memory pressure exists will simply hold on to all its capabilities forever, and the remote machines hosting the target objects will not be able to collect them no matter how much memory pressure they are seeing.

So unless someone has a proposal for actual distributed GC that solves these problems, I think it's necessary for applications to explicitly drop their capabilities when they no longer need them. But that's really up to the application; if you disagree and think finalization callbacks are fine, there's no reason you can't do it that way on top of Cap'n Proto...

That said, I do agree that the choice of philosophy here can heavily influence how application interfaces are designed.

@zenhack
Collaborator

zenhack commented Apr 14, 2021

Note that GHC's runtime will start a GC cycle if it is idle for some (relatively short) amount of time, and it's very rare to have activity going on that's non-allocating. I'm not 100% convinced kenton's concern applies to any GC design (for that matter, the refcounted pointers that capnproto-c++ uses internally are arguably a form of GC, an approach shared by CPython, though not otherwise terribly common), but it does seem reasonable to worry that depending on the GC design and the idioms of the language, relying on finalizers may result in leaking memory on remote machines.

I also wonder if triggering network communication with finalizers opens us up to sidechannel attacks, by leaking information about unrelated allocation activity inside the vat. I don't really know how to think about trying to quantify this.

But, again, all of this is entirely orthogonal to the protocol.

@zenhack
Collaborator

zenhack commented Apr 14, 2021

Note also that capnproto-c++ uses refcounted smart pointers for capabilities, so the code you write does not really spend any text explicitly freeing things. Arguably refcounting is a form of GC, which trades throughput and the ability to reclaim cycles for predictability and simplicity. Most language runtimes make a different trade-off, but CPython is a notable exception.

@kentonv

kentonv commented Apr 14, 2021

Yes, I am in the RAII camp, which is very different from "manual" memory management. Please don't lump us together. 😃

@cwebber
Contributor

cwebber commented Apr 16, 2021

That's good news if true re: the memory management stuff being the same. I got the impression from @erights that Cap'n Proto did something different about memory management than CapTP did... I think @erights thought that a very different assumption was being taken with Cap'n Proto's handling of memory. If that's not true, that's one major point of concern removed.

@erights
Collaborator

erights commented Apr 16, 2021

I may indeed be confused about this. I would love to find out that this is not a barrier.

Looking forward to clarity on this!

@erights
Collaborator

erights commented Apr 19, 2021

See https://github.com/Agoric/agoric-sdk/pull/2909/files for a draft of more detailed semantics for the Agoric CapTP's Passable. The parts above the divider (currently the entire PR) are the abstract syntax and semantics. Only the part below the divider is about our concrete encoding into JSON. Switching to a different concrete serialization (e.g., Syrup) should not affect the part above the divider.

Although somewhat biased towards JS, the abstract semantics above the divider is intended to be language independent enough to serve as a basis of inter-language interoperability, e.g., with Goblins and perhaps with Cap'n Proto. We'll see.

@cwebber
Contributor

cwebber commented Apr 19, 2021

I already commented on the above PR, but this is really great! I agree with describing the passables in terms of their abstractions, as opposed to in terms of a specific marshalling. (Perhaps that's the definition of a recent pun: when a serialization system too heavily leaks into the abstraction of the datatypes, maybe we have entered into "marshall law".)

We should thus do the same also for all the operations, as well as the certificate structures for handoffs. If we can describe this all abstractly, we'll have a lot of flexibility in terms of swapping out the serialization, and can focus on it less.

(Of course the serialization in terms of canonicalization will matter a lot on the wire, since that will affect signatures of certificates... but it should not affect seriously the way we write our code.)

@zarutian

zarutian commented Apr 20, 2021

Something like this abstract spec-wise, @cwebber ?

op := deliverOp | deliverOnlyOp | gcAnswerOp | gcExportOp |
       listenOp | terminateOp   | bootstrapOp | eventualGetOp |
       deliverFuncOp |  deliverFuncOnlyOp
deliverOp := deliverOpMarker answerPos redirector target verb arguments kwarguments
deliverFuncOp := deliverFuncOpMarker answerPos redirector target arguments kwarguments
deliverOnlyOp := deliverOnlyOpMarker target verb arguments kwarguments
deliverFuncOnlyOp := deliverFuncOnlyOpMarker target arguments kwarguments
gcAnswerOp := gcAnswerOpMarker answerPos
gcExportOp := gcExportOpMarker exportPos wireDelta
listenOp := listenOpMarker remotePromise resolver
terminateOp := terminateOpMarker terminationReason
bootstrapOp := bootstrapOpMarker answerPos resolver
eventualGetOp := eventualGetOpMarker answerPos redirector target prop
redirector := anyDesc
resolver := anyDesc
remotePromise := anyDesc
target := anyDesc
verb := string | symbol
arguments := argumentsMarker args
args := arg [ args ]
arg := any
kwarguments := kwargumentsMarker kwargs
kwargs := argKey argValue [ kwargs ]
argKey := any
argValue := any
prop := any
any := anyDesc | datum | compoundData
anyDesc := answerDesc | exportDesc | importDesc | handoffDesc
answerDesc := answerDescMarker answerPos
exportDesc := exportDescMarker exportPos
importDesc := importObjDesc | importPromiseDesc | newImportObjDesc | newImportPromiseDesc
importObjDesc := importObjDescMarker importPos
importPromiseDesc := importPromiseDescMarker importPos
newImportObjDesc := newImportObjDescMarker importPos
newImportPromiseDesc := newImportPromiseDescMarker importPos
...

(I'll probably edit and add more when I have the nenna/gumption for it)

@cwebber
Contributor

cwebber commented Apr 20, 2021

Wow that's a great start! I have more comments but incredible work @zarutian!

@cwebber
Contributor

cwebber commented Apr 20, 2021

Note that re: deliver*Op, currently in Goblins we do have the verb, but don't actually use it. I added it in anticipation of working with my understanding of the plan to have methods be core in agoric, but then dropped it when I went the "lambda the ultimate" route... but not from the ops yet, it seems...

I see you added listenOp, which I appreciate, even though I think only Goblins and the version you ported to your agoric-sdk branch have had a separate listenOp before... it turns out to be important to the procedure-application-centric approach combined with promise pipelining. The reason is that since promises can be sent arbitrary messages to be pipelined, intercepting a "specific method" doesn't make sense / isn't necessarily safe, hence an independent op.

This seems like a good start... maybe we have enough to go off of to start working on some fresh docs.

@zenhack
Collaborator

zenhack commented Apr 23, 2021

> (Of course the serialization in terms of canonicalization will matter a lot on the wire, since that will affect signatures of certificates... but it should not affect seriously the way we write our code.)

As xentrac on IRC pointed out forcefully the other night, relying on canonicalization for signatures is probably a bad idea; any signature checking should be on an existing blob, and we should not rely on being able to reproduce the exact bytes for the purpose of checking a signature. So I think this is not really a concern either.

@jar398 jar398 changed the title Design parameters Design parameters ('epic issue') Mar 21, 2023