Simplified Event API


P2 Event proposal

The goal of this formulation of events under P2 is to establish a system in which *everything* is a tuple, and in which tuples are always either in a materialized table or in a transient event queue. Networking, events, etc. would be implemented as triggers written in another language, such as C or Python.

In this proposal, an event queue is just a table that stores tuples with zero lifetime and a fixed creation time (which may be in the future). Triggers check for the existence of tuples in tables (usually event queues) in between each fixpoint calculation. Triggers may modify the contents of the tables, so they can create events by inserting tuples into the event queues. Tuples with zero lifetime are guaranteed to come into existence for exactly one datalog fixpoint calculation.

Non-goals

  • A proposal for new syntax
  • Concern for implementation efficiency / discussion of potential optimizations
  • Discussion of delta_insert, delta_delete rules, datalog execution semantics, etc. (These are being covered in parallel elsewhere...)
  • Discussion of how these event handlers and their callbacks are registered. Allowing overlog code to request native callbacks by name should work. We could play linker tricks in order to service such requests.

Goals

  • A simple, unified conceptual model for the next iteration of overlog.
  • A simple, unified conceptual model for people writing extensions to the next implementation of overlog.
  • Movement of as much of the runtime as possible into overlog, or into extensions written to a narrow event-handling API.

Background

Tyson's redesign of P2 is based on a local datalog interpreter, probably with some facilities for event input and output. The (fuzzy) proposal introduces two types of tables: materialized tables, which live from iteration to iteration, and event queues, which are emptied after each fixpoint is completed. Each datalog fixpoint corresponds to a timestep.

Datalog code executes during such fixpoints. Event side effects are executed between fixpoints.

Overlog-style event syntax

  • foo["a",1,c]
      A zero-lifetime triple in event queue foo. It stores a string, an integer, and the current value of some variable. Such tuples are called "event tuples".
  • foo("a",1,c)
      An infinite-lifetime triple in relation foo. Such tuples are called "persistent tuples". I'll assume that "soft-state", tuple expiration, and other time-based semantics can be implemented via rewrites and triggers written in native languages.
  • baz :- bar, bat
      Some datalog rule. We'll allow arbitrary combinations of event tuples and persistent tuples.

C-style event syntax

  • void foo(Tuple t);
      A function that takes a native encoding of a tuple, and returns nothing.
  • Tuple t = { "a",1,c };
      Populate a tuple object with some values. The tuple may be destined for an event queue, or for a table.
  • Tuple t = { "a",1,_ };
      Set t to be a pattern that matches any triple with "a" and 1 in its first two positions. The _ is a wildcard.
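To pin down what these literals and patterns might mean, here is one possible C encoding of tuples and wildcard matching. Everything in this sketch (ValType, Value, MAX_ARITY, tuple_matches) is hypothetical; the page's examples use a looser literal syntax.

  #include <string.h>

  /* Hypothetical encoding: a field is a tagged string, integer, or wildcard. */
  typedef enum { VAL_STRING, VAL_INT, VAL_WILDCARD } ValType;

  typedef struct {
    ValType type;
    union { const char *s; int i; } u;
  } Value;

  #define MAX_ARITY 8

  typedef struct {
    int   arity;
    Value fields[MAX_ARITY];
  } Tuple;

  /* A pattern matches a tuple when every non-wildcard field is equal. */
  int tuple_matches(const Tuple *pattern, const Tuple *t) {
    if (pattern->arity != t->arity) return 0;
    for (int i = 0; i < pattern->arity; i++) {
      const Value *p = &pattern->fields[i];
      const Value *v = &t->fields[i];
      if (p->type == VAL_WILDCARD) continue;
      if (p->type != v->type) return 0;
      if (p->type == VAL_INT    && p->u.i != v->u.i)            return 0;
      if (p->type == VAL_STRING && strcmp(p->u.s, v->u.s) != 0) return 0;
    }
    return 1;
  }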

Proposal

All cross-fixpoint events (including networking, timeouts, and deltas generated from table insertions/deletions) would be implemented using the same event mechanism. The following two functions would be exported via a programming interface, where callback is a function-pointer type such as typedef void (*callback)(Tuple t):

  • void registerListener(char * tableName, Tuple t, callback function);
  • void generateEvent(char * tableName, Tuple t);

Or, perhaps even:

  • void generateEvent(char * tableName, Tuple t, timestamp when);

(XXX I arbitrarily chose "tableName" over "eventQueueName", or even "eventType"...)

We may need to differentiate between pre-execution callbacks and events, and post-execution callbacks and events.
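As a usage sketch, reusing the hypothetical Tuple encoding above: register a listener on a made-up event queue named "ping". The wildcard_pattern helper is an assumption of this sketch, not part of the proposal.

  /* The proposed interface, plus an assumed convenience helper. */
  typedef void (*callback)(Tuple t);

  void registerListener(char *tableName, Tuple t, callback function);
  void generateEvent(char *tableName, Tuple t);

  /* Assumed helper: build an all-wildcard pattern of the given arity. */
  Tuple wildcard_pattern(int arity) {
    Tuple t = { .arity = arity };
    for (int i = 0; i < arity; i++) t.fields[i].type = VAL_WILDCARD;
    return t;
  }

  void on_ping(Tuple t) {
    /* runs between fixpoints, once per tuple matching the pattern */
    (void)t;
  }

  void install_ping_listener(void) {
    /* the page's notation for this would be registerListener("ping", {_,_}, on_ping) */
    registerListener("ping", wildcard_pattern(2), on_ping);
  }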

Example 1 (networking)

Rewrite this localized rule:

  • a(@x,y,z) :- b(@y,z),c(z,@y).

to

  • msg_delta_a_insert[y, x, x, y, z] :- b(@y,z),c(z,@y).

(where the first two arguments to msg_delta_a_insert are the source and destination, while the last three encode the tuple that will be sent)

Register the following C-style code callback:

  • registerListener("msg_delta_a_insert", {_,_,_,_,_}, net_send_tuple_callback);

The function net_send_tuple_callback would actually send the tuple over the network, and would have a signature like this:

  • void net_send_tuple_callback(Tuple t);

It would know which fields in the tuple were used for which purposes, and act accordingly (open a TCP connection if necessary, serialize the tuple, send it over the wire, etc.). I'm not particularly happy with creating an event type named "msg_delta_a_insert". I see a few alternatives; none of them are pleasant:

  • Allow event queues to contain tuples with different cardinalities and types. Event queues would no longer be relations, and would therefore be difficult to query.
  • Create a bunch of event types: wrap_1_uple, wrap_2_uple, wrap_3_uple, etc., which would pack tuples into opaque types depending on cardinality. Somehow pass the output of that into "msg_delta", which would send network tuples directly. On the other end, the tuples would remain opaque until they were processed by an event handler that unpacks msg_delta events.

Note that implementing networking in this way should make it very easy to add support for other types of network connections (e.g., UDP), or to layer P2 programs on top of overlays implemented in P2. Such overlays would simply need to provide an event-triggered rule with the same interface as msg_delta_a_insert.
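For concreteness, here is a rough sketch of what net_send_tuple_callback might look like, reusing the hypothetical Tuple encoding from earlier. The connect_to and serialize_tuple helpers, and the convention that field 1 holds a "host:port" destination string, are assumptions of this sketch; error handling and connection caching are elided.

  #include <unistd.h>

  int    connect_to(const char *hostport);                      /* assumed helper */
  size_t serialize_tuple(const Tuple *t, char *buf, size_t n);  /* assumed helper */

  void net_send_tuple_callback(Tuple t) {
    /* layout: { source, destination, payload fields... } */
    int fd = connect_to(t.fields[1].u.s);  /* open a TCP connection if necessary */
    char buf[1024];
    size_t n = serialize_tuple(&t, buf, sizeof buf);
    (void)write(fd, buf, n);               /* ship the serialized tuple */
    close(fd);                             /* a real runtime would keep connections open */
  }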

Example 2 (periodic)

Let's say we wanted to implement "periodic". This rule would reinsert a tuple into a table every ten seconds for five iterations:

  • rulename a(@J, E) :- periodic[@J, E, 10, 5].

It could be implemented by inserting a tuple of the form {@J, E, 10, 5, "num iterations remaining"} into the "periodic" event queue. The rule could be rewritten as follows:

  • a(J, E) :- periodic[rulename, J, E, 10, 5, _].
  • periodic[rulename, J, E, 10, 5, 5].

Note that the second line generates an event at compile time. I believe this is illegal in the current runtime.

Now, periodic's event listener callback would look something like this:

  void periodic_callback(Tuple t) {
    //                    { 0         1  2  3   4  5 }
    // t looks like this: { rulename, J, E, 10, 5, ? }
    if(t[5] > 0) {  // iterations remaining
      Tuple t2 = { t[0], t[1], t[2], t[3], t[4], t[5]-1 };
      generateEvent("periodic", t2, now() + t[3]);  // t[3] is the 10-second period
    }
  }

The rewritten rule would automatically be triggered at the same time as this callback, as it is also subscribed to the event.
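For completeness, the callback itself would presumably be subscribed to the six-field "periodic" queue in the same way as in the networking example:

  • registerListener("periodic", {_,_,_,_,_,_}, periodic_callback);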

Example 3 (poor-man's table implementation)

The table implementation sketched here doesn't accept updates from the datalog interpreter, and can't implement shadow copies or other versioning-based optimizations when applying changes from fixpoints. Instead, it would be broken into three related modules:

  • A general-purpose cache that temporarily holds updates created by the current fixpoint for further rule evaluation. Tuples not found in the cache would be looked up in the version of the table as it existed at the beginning of the fixpoint.
  • A set of table-delta event callbacks that update tables correctly between fixpoints, interpreting delta events and using them to apply changes to the data stored in the table.
  • A set of scans and other access methods over the data stored in the table.

APIs for the cache and the functions that provide table scans, indices, etc. are beyond the scope of this document.
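Still, as a sketch of the second module, reusing the declarations from the earlier sketches: suppose deltas for table a arrive in event queues named delta_a_insert and delta_a_delete, and that native storage helpers exist (all names here are hypothetical).

  void table_insert(const char *table, Tuple t);  /* assumed native storage helper */
  void table_delete(const char *table, Tuple t);  /* assumed native storage helper */

  void delta_a_insert_callback(Tuple t) {
    table_insert("a", t);  /* apply one derived insertion between fixpoints */
  }

  void delta_a_delete_callback(Tuple t) {
    table_delete("a", t);  /* apply one derived deletion */
  }

  void install_table_a(void) {
    /* a(x,y,z) has three fields, so the patterns are {_,_,_} */
    registerListener("delta_a_insert", wildcard_pattern(3), delta_a_insert_callback);
    registerListener("delta_a_delete", wildcard_pattern(3), delta_a_delete_callback);
  }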

Extra stuff

More on instantiating event handlers

Code like this could be used to instantiate new event handlers:

// built in event handler; analogous to materialize(...):

registerNativeEventHandler[trigger, /* implementation name? schema? */].  

trigger[a,b,c] :- foo(a,b), bar(b,c).

This uses event handlers to bootstrap the event handler mechanism.

Potential Optimizations

  • Most programs probably won't derive any facts from events with associated event handlers. The runtime could detect this, invoke the associated function, then drop the event without passing it into the next fixpoint computation...