updated the documentation pretty massively, but not ready with devcontainer yet
Ian Smith committed Jul 4, 2023
1 parent e9efa95 commit 4f3bfcd
Showing 8 changed files with 402 additions and 97 deletions.
5 changes: 4 additions & 1 deletion site/content/en/docs/Concepts/_index.md
@@ -1,11 +1,14 @@
---
title: Concepts
weight: 1
weight: 5
description: |
This document explains the core concepts that are important to understand when
working with parigot.
---

{{% pageinfo %}}

The information in this page refers to the `atlanta-0.3` release of
parigot.

{{% /pageinfo %}}
163 changes: 163 additions & 0 deletions site/content/en/docs/Concepts/marshal.md
@@ -0,0 +1,163 @@
+++
title= "Marshaling and Unmarshaling"
description= """Marshaling packs an internal data structure in a program into a \
well-defined external format so the data structure can be given or transmitted \
to another system. Unmarshal does the reverse upon receiving."""
date='2023-07-04'
weight= 2
+++

### Protocol Buffers

Protocol Buffers ("protobufs") is a specification of a data serialization format, an interface
definition language (IDL), and a code library. The first of these is the most important,
with the other two more ancillary. By serialization format here we mean a
particular byte layout that is carefully specified in terms of how each data
type should be formatted. For example, if I want to send an integer value from one
system to another system, what *exactly* should I send? What if the receiver
has 32-bit integers and the sender 64-bit ones? What if the sender and receiver have
different [Endianness](https://en.wikipedia.org/wiki/Endianness)? Even this
trivial example is fraught with peril.
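
To make the integer question concrete, here is a minimal Go sketch that uses only the standard library. The helper names `encodeFixed32` and `encodeVarint` are ours for illustration and are not part of parigot or the protobuf library. It writes the same value, 300, as three different byte sequences; the last is the base-128 varint scheme that the protobuf wire format uses for ordinary integer types such as `int32` and `uint64`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeFixed32 writes v as exactly four bytes in the given byte order.
// (Hypothetical helper, for illustration only.)
func encodeFixed32(v uint32, order binary.ByteOrder) []byte {
	buf := make([]byte, 4)
	order.PutUint32(buf, v)
	return buf
}

// encodeVarint writes v as a base-128 varint: seven bits of payload per
// byte, least significant group first, with the high bit set on every
// byte except the last. (Hypothetical helper, for illustration only.)
func encodeVarint(v uint64) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, v)
	return buf[:n]
}

func main() {
	fmt.Printf("big endian:    % x\n", encodeFixed32(300, binary.BigEndian))
	fmt.Printf("little endian: % x\n", encodeFixed32(300, binary.LittleEndian))
	fmt.Printf("varint:        % x\n", encodeVarint(300))
}
```

The two fixed-width forms contain the same four bytes in opposite order (`00 00 01 2c` versus `2c 01 00 00`), while the varint form needs only two bytes (`ac 02`). A receiver that guesses the wrong layout reads a different number, which is why the format has to be pinned down in a specification.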

The Protocol Buffers format has been widely used for 15 years, and for more than 20
years inside Google. It is battle-tested. There are numerous other, typically
newer, challengers to the data serialization throne, but none have managed to
dislodge protobufs because it is well known, well tested, and reasonably good on nearly all
dimensions of goodness for a data serialization format. Other challengers include
formats like Thrift, Avro, MessagePack, BSON, Hessian and (god help us) XML. Many of
these, but not all, have an accompanying IDL, as protobufs does, to allow users to
specify their data structures of interest.

{{% alert title="Deep Cut" color="info" %}}
parigot is not really tied to the protobuf serialization format. parigot would
operate exactly the same with another combination of an IDL and a serialization
format. Although protobufs is being used presently, this is primarily because of the
wide acceptance of protobufs rather than some feature of it.
{{% /alert %}}

#### Protocol Buffers IDL

Here is a lightly edited example of a pretty trivial "service" definition. This example
defines a service called `Greeting` with a single method called `FetchGreeting`,
which naturally takes a `FetchGreetingRequest` as input and returns a
`FetchGreetingResponse`.

{{< tabpane right=true >}}
{{% tab text=true header="Golang" lang="go" highlight=true %}}

// Greeting is a microservice with a very simple job, return a greeting in
// language selected from the Tongue enum.
service Greeting {
  // FetchGreeting returns a greeting in the language given by the
  // Request, field "tongue".
  rpc FetchGreeting(FetchGreetingRequest) returns (FetchGreetingResponse);
}
{{% /tab %}}
{{% tab header="Python" lang="python" disabled=true /%}}
{{% tab header="Java" lang="java" disabled=true /%}}
{{< /tabpane >}}

Content like the above would be contained in the file `greeting.proto` or similar.
Although it looks like a programming language, and it is clearly quite similar
to one, this is a "specification", "spec", or "schema" in that it only defines the
data to be transmitted, the functions that the data is transmitted to, and
the data that comes back as return values.

Let's take a look at the specification of the __messages__, which are the data
objects in a protobuf schema. In our example, we have two of them, the matching
`FetchGreetingRequest` and `FetchGreetingResponse`.

{{< tabpane right=true >}}
{{% tab text=true header="Golang" lang="go" highlight=true %}}

// FetchGreetingRequest is sent to retrieve a common greeting, like
// Bonjour in French.
message FetchGreetingRequest {
  Tongue tongue = 1;
}

// FetchGreetingResponse is returned to a caller who sent a request
// to the FetchGreeting endpoint.
message FetchGreetingResponse {
  string greeting = 1;
}
{{% /tab %}}
{{% tab header="Python" lang="python" disabled=true /%}}
{{% tab header="Java" lang="java" disabled=true /%}}
{{< /tabpane >}}

This example is largely what you would expect: the caller requests the
greeting in a particular language, the "tongue", and the callee returns
a response that contains text like __bonjour__ or __guten tag__.

It is worth noticing that the definition of `FetchGreetingRequest` is not finished
at this point because it references a different "type" called `Tongue`. Let's
show the last two types.

[#enum-protobuf]
{{< tabpane right=true >}}
{{% tab text=true header="Golang" lang="go" highlight=true %}}

// Which language do you want?
enum Tongue {
  Unspecified = 0;
  English = 1;
  French = 2;
  German = 3;
}

// The first four values of any error enum are to be as shown below.
enum GreetErr {
  option (protosupport.v1.parigot_error) = true;

  NoError = 0; // required
  // Dispatch error occurs when we are trying to call a service
  // implemented elsewhere. This error indicates that the process
  // of the call itself had problems, not the execution of the
  // service's method.
  DispatchError = 1; // required
  // UnmarshalFailed is used to indicate that in unmarshaling
  // a request or result, the protobuf layer returned an error.
  UnmarshalFailed = 2; // required
  // MarshalFailed is used to indicate that in marshaling
  // a request or result, the protobuf layer returned an error.
  MarshalFailed = 3; // required

  // FetchGreeting returns this when the parameter presented to
  // it is not a language in its list.
  UnknownLang = 4;
}

{{% /tab %}}
{{% tab header="Python" lang="python" disabled=true /%}}
{{% tab header="Java" lang="java" disabled=true /%}}
{{< /tabpane >}}

An __enum__ in a protobuf specification is a collection of small integer values
with names that make them easier to remember and understand. Our
first enum here is a "normal" enum and the second one, `GreetErr`, is a special
one for parigot. You will note the extra "option" that is used to inform parigot
about this special type. A parigot error value is returned from every call to
a remote service. It is expected that developers will specify all the possible
error values in their error types.
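
As a rough illustration of how these enums might be used, here is a hand-written Go sketch. It is not the code that parigot or the protobuf compiler generates: the type declarations simply mirror the schema above, and `fetchGreeting` and the `greetings` table are hypothetical names invented for this example. The point is only that the method returns an error value alongside its result, and that `UnknownLang` covers the one failure specific to this service.

```go
package main

import "fmt"

// Tongue mirrors the Tongue enum from the schema (hand-written here,
// not generated code).
type Tongue int32

const (
	Unspecified Tongue = 0
	English     Tongue = 1
	French      Tongue = 2
	German      Tongue = 3
)

// GreetErr mirrors the GreetErr enum; the first four values are the
// ones the schema marks as required.
type GreetErr int32

const (
	NoError         GreetErr = 0
	DispatchError   GreetErr = 1
	UnmarshalFailed GreetErr = 2
	MarshalFailed   GreetErr = 3
	UnknownLang     GreetErr = 4
)

// greetings maps each supported language to its greeting text.
var greetings = map[Tongue]string{
	English: "hello",
	French:  "bonjour",
	German:  "guten tag",
}

// fetchGreeting is a hypothetical implementation of the service method.
// It always returns an error value, even on success.
func fetchGreeting(t Tongue) (string, GreetErr) {
	g, ok := greetings[t]
	if !ok {
		return "", UnknownLang
	}
	return g, NoError
}

func main() {
	for _, t := range []Tongue{French, Unspecified} {
		g, err := fetchGreeting(t)
		fmt.Printf("tongue=%d greeting=%q err=%d\n", t, g, err)
	}
}
```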

It is worth considering that `FetchGreetingRequest` references another type
as part of its definition--`Tongue` in this case. What if `Tongue` referenced
one or two more types in __its__ definition? This is where the protobuf IDL
works in concert with the data serialization format. Any combination of things
that can be specified in the protobuf IDL will be marshaled to a well-known
sequence of bytes and when unmarshaled by another program will produce the
correct data structure.
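
To show what a "well-known sequence of bytes" looks like, the following Go sketch hand-encodes the two messages from the example according to the protobuf wire format. Real programs would use generated code and the protobuf library instead; the helpers `appendTag`, `appendVarint`, `marshalRequest`, and `marshalResponse` are hypothetical and exist only for this illustration. One simplification to be aware of: a proto3 encoder normally omits a field that holds its default value (such as `tongue = 0`), and this sketch does not.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Wire types from the protobuf encoding specification.
const (
	wireVarint = 0 // integers, bools, and enums
	wireBytes  = 2 // length-delimited: strings, bytes, nested messages
)

// appendVarint appends v in base-128 varint form.
func appendVarint(buf []byte, v uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], v)
	return append(buf, tmp[:n]...)
}

// appendTag appends the key that precedes every field: the field number
// shifted left by three bits, combined with the wire type.
func appendTag(buf []byte, fieldNum, wireType uint64) []byte {
	return appendVarint(buf, fieldNum<<3|wireType)
}

// marshalRequest encodes a FetchGreetingRequest whose field 1 (tongue)
// holds the given enum value.
func marshalRequest(tongue int32) []byte {
	buf := appendTag(nil, 1, wireVarint)
	return appendVarint(buf, uint64(tongue))
}

// marshalResponse encodes a FetchGreetingResponse whose field 1
// (greeting) holds the given string.
func marshalResponse(greeting string) []byte {
	buf := appendTag(nil, 1, wireBytes)
	buf = appendVarint(buf, uint64(len(greeting)))
	return append(buf, greeting...)
}

func main() {
	fmt.Printf("request(French):   % x\n", marshalRequest(2))
	fmt.Printf("response(bonjour): % x\n", marshalResponse("bonjour"))
}
```

The request for French comes out as the two bytes `08 02`: a tag for field 1 with the varint wire type, then the enum value 2. The response is `0a 07` followed by the seven bytes of `bonjour`. Any receiver that follows the same specification can rebuild the original message from these bytes, regardless of its language or hardware.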

{{% alert title="Deep Cut" color="info" %}}

parigot analyzes the definitions in
the protobuf spec and determines if there are request messages or result
messages that do not have any members. These need to be in the specification
(in case you want to add something later) but are not useful in programs.
parigot removes these parameters from the code it generates so the method
`Bar()` takes no parameters if the corresponding input message has no members.
parigot behaves analogously for output; parigot, however, always returns an
error code.

{{% /alert %}}
103 changes: 103 additions & 0 deletions site/content/en/docs/Concepts/remote.md
@@ -0,0 +1,103 @@
+++
title= "Remote Procedure Calls"
description= """Remote procedure calls are function calls that take place over a network."""
date='2023-07-04'
weight= 3
+++

Remote Procedure Calls (RPCs) have been around since the earliest days of
computer networks, dating from about 1980. It seems natural that one computer
might want to send a request for computation to be accomplished by another
computer and then for the remote computer to send back the result. These
systems have mostly followed the same basic design.

In this design, a specification states what procedure calls
can be made between the two computers, what parameters the first will send
to the second, and what return result will be sent from the second to the first.
This "specification" requires that the two computers send compatible data formats
between them when making the request and the response.

It should be clear that the [previous section]({{< ref "marshal" >}}) explained a particular
specification language (the protobuf IDL) and the data format to use on the
communication channel (the protobuf data serialization format).

The second part of the common design of RPC systems is that the RPC system generates
"boilerplate" code that handles the packing up of parameters, sending them, waiting
for the response, then unpacking the result and presenting it back to other layers
of the program in a way that makes sense for the programming language. This generated,
repetitive code is often referred to as the __stubs__. There is a "stub" for each
method call and response that makes the calling of a Remote Procedure Call look
either mostly or completely like a "normal" procedure call that does not utilize
the network.
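
The shape of a client stub can be sketched in a few lines of Go. Everything here is hypothetical and heavily simplified: `Transport`, `GreetingClient`, and `loopback` are names invented for this example, the "marshaling" is a single byte rather than real protobuf encoding, and the in-memory `loopback` type stands in for an actual network. This is not what parigot generates; it only shows the pack, send, wait, unpack sequence described above.

```go
package main

import (
	"errors"
	"fmt"
)

// Transport is a hypothetical stand-in for the network: it carries the
// request bytes for a named method and returns the response bytes.
type Transport interface {
	RoundTrip(method string, request []byte) ([]byte, error)
}

// GreetingClient plays the role of a generated client stub.
type GreetingClient struct {
	transport Transport
}

// FetchGreeting looks like a local method call to its caller.
func (c *GreetingClient) FetchGreeting(tongue byte) (string, error) {
	// Pack the parameter (real stubs would marshal a protobuf message).
	request := []byte{tongue}
	// Send the request and wait for the response.
	response, err := c.transport.RoundTrip("FetchGreeting", request)
	if err != nil {
		return "", fmt.Errorf("FetchGreeting: %w", err)
	}
	// Unpack the result into a native Go value.
	return string(response), nil
}

// loopback is an in-memory Transport that answers requests directly,
// so the sketch runs without a network.
type loopback struct{}

func (loopback) RoundTrip(method string, request []byte) ([]byte, error) {
	if method != "FetchGreeting" || len(request) != 1 {
		return nil, errors.New("bad request")
	}
	switch request[0] {
	case 1:
		return []byte("hello"), nil
	case 2:
		return []byte("bonjour"), nil
	case 3:
		return []byte("guten tag"), nil
	}
	return nil, errors.New("unknown language")
}

func main() {
	client := &GreetingClient{transport: loopback{}}
	greeting, err := client.FetchGreeting(2)
	fmt.Println(greeting, err)
}
```

Because the stub depends only on the `Transport` interface, the same calling code would work whether the other side is in the same process or on another machine.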

This ability to "hide" the use of the network in an RPC is of arguable value. Consider
this procedure call where the method `Bar` is being called on the object `Foo` and
the result placed in the variable `result`.

{{< tabpane right=true >}}
{{% tab text=true header="Golang" lang="go" highlight=true %}}

result=Foo.Bar()

{{% /tab %}}
{{% tab header="Python" lang="python" disabled=true /%}}
{{% tab header="Java" lang="java" disabled=true /%}}
{{< /tabpane >}}

If this is a normal, non-networked function call, of the kind our current processors
can do billions of times per second, the odds of this function truly failing
are very low. "Failing" here might mean that the program is out
of memory, memory corruption by cosmic rays has been detected, or the processor is
powering down so as to not overheat. These are all failures, it is true, but apart from
the first one these types of failures are so rare that a typical developer may never
see them in their whole career. Running out of memory is not outside the experience
of most developers, but that usually is a catastrophic development for the running
program, not one that most programming languages provide much of a way to do anything
about (the program just crashes).

However, if by the generation of stubs the call in the example above uses a network,
not only is the class of failures that can happen much larger, but the likelihood
of a failure is also vastly higher. Some plausible failures might be:
* The machine that was expected to do the computation `Bar` on behalf of `Foo` is
currently overloaded and cannot accept the request, although it might be able
to later!
* The caller and the receiver of the message about `Bar` are
not connected via a network right now (the network is down, the plug was pulled
out by the dog).
* The remote machine that would normally do the computation of `Bar` is offline
for maintenance.

... and there are many more. This is the reason that parigot *always* returns
an error code from any method call, because we do not want to have a situation
where the parigot developer "forgets" that they are using a network when making
a procedure call. The case of `Foo.Bar()` above makes it very pleasant to ignore
the networking involved in the computation, until it doesn't.
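
Here is a small Go sketch of what it means for the caller when every call returns an error code. The names `CallErr`, `flakyBar`, and `callWithRetry` are hypothetical and are not parigot APIs; `flakyBar` simulates a remote `Bar` that cannot be reached for its first few calls, as in the "overloaded, but might be able to later" case above.

```go
package main

import "fmt"

// CallErr is a hypothetical error code in the style of the GreetErr
// enum from the marshaling section.
type CallErr int32

const (
	NoError       CallErr = 0
	DispatchError CallErr = 1 // the call itself failed, not the method
)

// flakyBar simulates a remote Bar that fails to dispatch for its first
// `failures` calls and succeeds afterwards.
type flakyBar struct {
	failures int
}

// Bar returns a result and an error code on every call.
func (f *flakyBar) Bar() (int, CallErr) {
	if f.failures > 0 {
		f.failures--
		return 0, DispatchError
	}
	return 42, NoError
}

// callWithRetry calls Bar up to `attempts` times and returns the first
// success, or the last error code seen. A real program would usually
// wait between attempts; that is omitted to keep the sketch short.
func callWithRetry(f *flakyBar, attempts int) (int, CallErr) {
	err := DispatchError
	for i := 0; i < attempts; i++ {
		var result int
		result, err = f.Bar()
		if err == NoError {
			return result, NoError
		}
	}
	return 0, err
}

func main() {
	result, err := callWithRetry(&flakyBar{failures: 2}, 3)
	fmt.Println("recovers after retries:", result, err)

	result, err = callWithRetry(&flakyBar{failures: 5}, 3)
	fmt.Println("gives up:", result, err)
}
```

Because the error code is part of the return signature, the caller has to decide what to do about it, whether that is retrying, reporting the failure, or falling back to something else.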

### parigot and generated code

Previously, we mentioned that RPC systems have been generating
code for 40+ years to make RPC calls easier and more pleasant to use. We also mentioned that
there are risks with *completely* hiding the network from a user trying to do
what appears to be a simple procedure call. parigot generates a large amount of
code based on the `.proto` files that specify the interfaces between services in
your system. parigot tries to strike a balance between convenience of notation
and exposing the multitude of ways that a network can fail. In the case of the
current golang support, parigot generates code that is strongly typed such that
the developer must use the correct types when interacting between a caller and
receiver (there are no loopholes in the type system). Further, parigot is
careful to expose to the user the return value that would be expected, and that
value might include information about the details of a network failure. Finally,
when using the continuations style of development with parigot, parigot has
strongly typed notions called `Futures` that express that a network call is in
progress and may yet fail.
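
To give a feel for the idea of a strongly typed future, here is a deliberately simplified Go sketch. It is not parigot's `Futures` API: the `Future` type below and its `Success`, `Failure`, and `Complete` methods are assumptions made up for this illustration, it is single-threaded and not safe for concurrent use, and it requires Go 1.18 or later for generics. It shows only that the result type is fixed at compile time and that the failure path is a first-class part of the value.

```go
package main

import "fmt"

// Future is a hypothetical, simplified future for a call whose result
// has type T. It is completed exactly once, with a value or an error code.
type Future[T any] struct {
	done      bool
	value     T
	err       int32
	onSuccess func(T)
	onFailure func(int32)
}

// Success registers the handler to run if the call succeeds.
func (f *Future[T]) Success(fn func(T)) *Future[T] {
	f.onSuccess = fn
	f.fire()
	return f
}

// Failure registers the handler to run if the call fails.
func (f *Future[T]) Failure(fn func(int32)) *Future[T] {
	f.onFailure = fn
	f.fire()
	return f
}

// Complete is what an (imagined) runtime would call once the response
// arrives or the call fails. An error code of 0 means success.
func (f *Future[T]) Complete(value T, err int32) {
	if f.done {
		return
	}
	f.done, f.value, f.err = true, value, err
	f.fire()
}

// fire runs the matching handler once the future is complete, whether
// the handler was registered before or after completion.
func (f *Future[T]) fire() {
	if !f.done {
		return
	}
	if f.err == 0 && f.onSuccess != nil {
		f.onSuccess(f.value)
		f.onSuccess = nil
	}
	if f.err != 0 && f.onFailure != nil {
		f.onFailure(f.err)
		f.onFailure = nil
	}
}

func main() {
	fut := &Future[string]{}
	fut.Success(func(g string) { fmt.Println("greeting:", g) }).
		Failure(func(code int32) { fmt.Println("failed with code", code) })
	fut.Complete("bonjour", 0)
}
```

The success handler can only receive a `string` here, so passing a handler for the wrong result type is a compile-time error, and the presence of a `Failure` handler in the API keeps the possibility of a network failure in view.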

{{% alert title="Limitation" color="warning" %}}
In the [earlier example with the enums]({{< ref "/marshal#enum-protobuf" >}}), the reader may have noticed that
some of the enum values of an error are "mandatory". These values are
reserved for errors that parigot's generated code raises. At the current
time, the additional values that need to be added to the "reserved" list
of enum error values are missing. These additional values are to allow
parigot to propagate network failures it detects to your code. At the
moment, all the various network errors are conflated with the Marshal and
Unmarshal errors.
{{% /alert %}}
29 changes: 29 additions & 0 deletions site/content/en/docs/Concepts/together.md
@@ -0,0 +1,29 @@
+++
title= "Putting it all together"
description= """How parigot unifies all the concepts in this section."""
date='2023-07-04'
weight= 4
+++

We have discussed in this Concepts section three concepts that are clearly somewhat related:

1. WASM
2. Marshaling and unmarshaling
3. Remote procedure calls

... but parigot makes them brothers.

parigot lets you do the following things (see if you can spot the correspondence!):

1. Define an application of many services where the interfaces between the services are specified with the protobuf IDL.
2. Program that app using the programming language of your choice.
3. Test and debug your application as a single program that has multiple
services within it.
4. Deploy your application as a constellation of services
that are separated by a network.

If you skipped the previous sections, the connection is that 1 in the second
list corresponds to 2 in the first list. Similarly, 2 corresponds to 1, and
3 and 4 of the second list are the two variants of 3 discussed earlier.

Happy microservicing!
