DDlog Java API

This document describes the steps required to invoke a DDlog program from Java, based on the example DDlog program test/datalog_tests/redist.dl and the Java code in java/test_flatbuf.

Prerequisites

See the README for instructions on how to install the Google FlatBuffers library required to compile DDlog Java bindings.

Compile DDlog Java bindings

Skip this step if you are using a binary release of DDlog, where pre-compiled Java bindings can be found in java/ddlogapi.jar. To compile them from source:

cd java
make

Compiling DDlog programs with Java API enabled

Pass the -j switch to the DDlog compiler to generate the Rust and Java code required to communicate with the DDlog program from Java. When compiling the resulting Rust program, enable the flatbuf feature:

ddlog -i test/datalog_tests/redist.dl -L lib -j
cd test/datalog_tests/redist_ddlog
cargo build --features=flatbuf --release

Link the compiled DDlog program along with the DDlog Java API bindings in a single dynamic library. Assuming the $DDLOG_HOME environment variable points to the directory where DDlog is installed:

cc -shared -fPIC -I${JAVA_HOME}/include -I${JAVA_HOME}/include/${JDK_OS} -I. -I${DDLOG_HOME}/lib ${DDLOG_HOME}/java/ddlogapi.c -Ltarget/release/ -lredist_ddlog -o libddlogapi.so

(On a Mac, use the .dylib extension instead of .so.)

Linking against DDlog API

Finally, add the following dependencies to your Java project to be able to use the DDlog API:

  • The generic DDlog Java API shared by all DDlog programs in the ddlogapi.jar package (see above).

  • The FlatBuffers Java runtime library, found in the java directory of the FlatBuffers distro.

  • The auto-generated ddlog package that contains program-specific bindings needed to serialize and deserialize data exchanged by Java and DDlog. It is found in the flatbuf/java directory inside the generated <progname>_ddlog source tree.

Using the DDlog API from Java

Here is a minimal Java program, copied from java/test_flatbuf/Test.java, that instantiates and uses the DDlog API.

import java.io.IOException;
import java.util.*;
import java.lang.RuntimeException;

/* Generic DDlog API shared by all programs. */
import ddlogapi.DDlogException;
import ddlogapi.DDlogAPI;
import ddlogapi.DDlogCommand;

/* Additional program-specific bindings generated by `ddlog`. */
import ddlog.redist.*;

public class Test {
    private final DDlogAPI api;

    Test() throws DDlogException, IOException {
        /* Create an instance of the DDlog program with one worker thread. */
        this.api = new DDlogAPI(1, false);
        api.recordCommands("replay.dat", false);
    }

    void onCommit(DDlogCommand<Object> command) {
        int relid = command.relid();
        switch (relid) {
            case redistRelation.Span:
                SpanReader span = (SpanReader)command.value();
                System.out.println("From " + relid + " " + command.kind() + " Span{" + span.entity() + "," + span.tns() + "}");
                break;
            default: throw new IllegalArgumentException("Unknown relation id " + relid);
        }
    }

    void run() throws DDlogException {

        /* First transaction */
        {
            /* Start transaction.  All DDlog table updates must be made in the
             * context of a transaction. */
            this.api.transactionStart();

            /* Create a builder object that will be used to serialize DDlog commands
             * into a buffer. */
            redistUpdateBuilder builder = new redistUpdateBuilder();

            /* Create several DDlog commands.  Commands are stored inside the
             * builder. */
            builder.insert_DdlogNode(10000);
            builder.insert_DdlogBinding((short)100, 10000);
            builder.insert_DdlogDependency(10000, 20000);

            /* Apply commands serialized by the builder to the DDlog program. */
            builder.applyUpdates(this.api);

            /* Commit transaction, triggering the `onCommit` callback for every
             * record in an output relation modified by the transaction. */
            redistUpdateParser.transactionCommitDumpChanges(this.api, r -> this.onCommit(r));
        }

        /* Second transaction */
        {
            this.api.transactionStart();
            /* each applyUpdates requires its own builder */
            redistUpdateBuilder builder = new redistUpdateBuilder();
            builder.insert_DdlogNode(20000);
            builder.insert_DdlogBinding((short)200, 20000);
            builder.delete_DdlogNode(10000);

            builder.applyUpdates(this.api);

            redistUpdateParser.transactionCommitDumpChanges(this.api, r -> this.onCommit(r));
            this.api.stop();
        }
    }

    public static void main(String[] args) throws IOException, DDlogException {
        Test test = new Test();
        test.run();
    }
}

To run the program:

java -Djava.library.path=. Test > test.dump

Note the use of the -D switch to tell Java where to look for the libddlogapi.so dynamic library.
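If the JVM cannot locate the native library, loading it typically fails with an UnsatisfiedLinkError. As a rough pre-flight check, the sketch below uses the standard System.mapLibraryName call to compute the platform-specific file name (libddlogapi.so on Linux, libddlogapi.dylib on macOS) and searches a path list in java.library.path format for it. The LibraryLocator class and its methods are illustrative names, not part of DDlog.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/* Illustrative helper, not part of DDlog. */
class LibraryLocator {
    /* Platform-specific file name, e.g. libddlogapi.so on Linux. */
    static String fileName(String lib) {
        return System.mapLibraryName(lib);
    }

    /* Search each directory of a path list (java.library.path format)
     * for the library file and return the first match, if any. */
    static Optional<Path> find(String lib, String pathList) {
        for (String dir : pathList.split(File.pathSeparator)) {
            if (dir.isEmpty()) continue;
            Path candidate = Paths.get(dir, fileName(lib));
            if (Files.isRegularFile(candidate)) return Optional.of(candidate);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path", ".");
        System.out.println(find("ddlogapi", path)
            .map(p -> "found " + p)
            .orElse(fileName("ddlogapi") + " not found in " + path));
    }
}
```

Running this with the same -Djava.library.path value as the real program shows whether the library would be visible to the JVM.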

This program uses two Java packages to interact with the DDlog program. The first one is the ddlogapi package, which exports Java wrappers around all DDlog C API methods in rust/template/ddlog.h. The source code of this package can be found in the java/ddlogapi directory. While in principle complete, this API is not particularly ergonomic when working with DDlog values (see below). To support programmer-friendly, type-safe manipulation of input and output records, the DDlog compiler, when invoked with the -j flag, generates an additional Java package specialized to each particular DDlog program. The package is called ddlog.<prog_name> (ddlog.redist in the above example) and can be found in the <prog_name>_ddlog/flatbuf/java/redist directory. Below, we discuss both packages in more detail.

The ddlogapi package

class DDlogAPI

An instance of the DDlogAPI class represents a running DDlog program. The DDlogAPI constructor starts the program. DDlogAPI public methods are wrappers around C API functions declared in ddlog.h and have the same names, formatted in camel case instead of snake case, e.g.:

  • stop - terminate the DDlog program
  • transactionStart - start a transaction
  • applyUpdates - apply updates to input tables
  • transactionCommit - commit the transaction
  • dumpTable - dump the content of an output relation
  • dumpIndex - dump the content of an index
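The naming rule above can be written down as a small conversion function. This is only an illustration of the convention; it assumes that the C functions in ddlog.h carry a ddlog_ prefix (for example ddlog_transaction_start), which this document does not state explicitly. The ApiNames class is a hypothetical name.

```java
/* Illustrative only: derives a Java method name from a C API name. */
class ApiNames {
    static String toJavaName(String cName) {
        // Assumption: C API functions are prefixed with "ddlog_".
        String prefix = "ddlog_";
        String s = cName.startsWith(prefix) ? cName.substring(prefix.length()) : cName;
        StringBuilder out = new StringBuilder();
        boolean upperNext = false;
        for (char c : s.toCharArray()) {
            if (c == '_') {          // drop the underscore, capitalize what follows
                upperNext = true;
                continue;
            }
            out.append(upperNext ? Character.toUpperCase(c) : c);
            upperNext = false;
        }
        return out.toString();
    }
}
```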

Most DDlogAPI methods throw an instance of DDlogException, containing an error message from DDlog. In addition, methods that work with files, e.g., recordCommands, are declared with throws IOException.

class DDlogCommand

The DDlogCommand<T> class represents an update to a DDlog relation, i.e., an insertion or deletion of a record (record modifications are currently not supported through the Java API). To perform a set of updates to input relations, the client creates an array of DDlogCommand objects and passes it to the DDlogAPI.applyUpdates() method. Dually, when the client commits a transaction by calling DDlogAPI.transactionCommitDumpChanges(), they get back zero or more DDlogCommand objects, which represent changes to output relations.

The DDlogCommand<T> class is parameterized with a class that represents DDlog records. One such class, defined in the ddlogapi package, is DDlogRecord, which implements a self-describing representation of arbitrary DDlog values. It provides constructors to instantiate primitive DDlog types (e.g., DDlogRecord(boolean b) creates a value of type bool) and static methods to inductively build more complex data structures by assembling records into structs, tuples, vectors, and other container types (e.g., public static DDlogRecord makeTuple(DDlogRecord[] fields) groups multiple values in a tuple). It also provides methods to introspect DDlog values and extract their individual fields.

The DDlogRecord class offers a program-independent way to work with DDlog values, including values whose types are not known ahead of time. At the same time, it can be cumbersome to use, does not enforce type safety (e.g., one can accidentally construct a record that does not match its type declaration), and has suboptimal performance.

These limitations are addressed by the auto-generated API presented below.

The ddlog.<prog_name> package

The ddlog.<prog_name> package, generated by the DDlog compiler when invoked with the -j switch, offers a user-friendly way to read and write DDlog values. It is specialized to a particular DDlog program and defines type-safe bindings for DDlog types declared in this program, along with a convenience <prog_name>Relation class that contains symbolic bindings for program relation identifiers.

class <prog_name>Relation

A DDlogCommand instance contains a numeric identifier of an input or output relation this command modifies, accessible with the relid() method:

int relid = command.relid();

The ddlogapi.DDlogAPI class provides a program-independent way to map a relation name to its identifier via the int getTableId(String table) method. The auto-generated <prog_name>Relation class provides a safer alternative by defining symbolic constants for all of the program's input and output relations. The following snippet chooses an action to perform based on the relid value:

switch (relid) {
    case redistRelation.Span: ...
    default: throw new IllegalArgumentException("Unknown relation id " + relid);
}

class <prog_name>UpdateBuilder

Internally, the ddlog.<prog_name> package works by serializing input updates into a FlatBuffer and deserializing changes received from DDlog via a FlatBuffer. At the low level, this functionality is backed by FlatBuffers-enabled C APIs.

The package exposes two sets of auto-generated classes responsible for serializing and deserializing DDlog commands respectively. The main class in the serialization API is <prog_name>UpdateBuilder. It supports the following workflow:

  1. Create an instance of <prog_name>UpdateBuilder:

    redistUpdateBuilder builder = new redistUpdateBuilder();
    
  2. Use the builder to create one or more DDlog commands:

    builder.insert_DdlogNode(10000);
    builder.insert_DdlogBinding((short)100, 10000);
    builder.insert_DdlogDependency(10000, 20000);
    

    Here, the insert_ prefix indicates that we are creating a command that inserts a record into a DDlog relation; the rest of the method name is the name of the input relation to insert the record into, e.g., DdlogNode. The number and types of arguments to each insert_XXX method match the relation's signature.

  3. Call the applyUpdates() method of the builder to push updates to the DDlog program:

    builder.applyUpdates(this.api);
    
  4. Go back to step 1 to perform new updates. Note that this requires creating a new builder instance for each new set of updates.
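The one-builder-per-batch rule in step 4 can be illustrated with a toy model. The BatchBuilder class below is a hypothetical stand-in, not the generated redistUpdateBuilder: it buffers commands as strings and refuses to be reused after applyUpdates. How the generated builder behaves when reused is not specified in this document, so the exception is this sketch's own choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/* Hypothetical stand-in for a generated update builder. */
class BatchBuilder {
    private final List<String> commands = new ArrayList<>();
    private boolean applied = false;

    /* Buffer an insertion; the real builder serializes into a FlatBuffer. */
    BatchBuilder insert(String relation, Object... fields) {
        checkOpen();
        commands.add("insert " + relation + Arrays.toString(fields));
        return this;
    }

    /* Hand the whole batch to the sink once, then seal the builder. */
    void applyUpdates(Consumer<List<String>> sink) {
        checkOpen();
        applied = true;
        sink.accept(new ArrayList<>(commands));
    }

    private void checkOpen() {
        if (applied) {
            throw new IllegalStateException("builder already applied; create a new one");
        }
    }
}
```

Sealing the builder turns an accidental reuse into an immediate error instead of a silently repeated or corrupted batch.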

Additional classes are generated to facilitate the construction of more complex types. For example, consider a relation that has a field whose type is a tuple:

typedef tuple = (bool, bit<8>, string)
input relation JI(a: (bool, bit<8>, string))

To insert into this relation, we must first construct a tuple object and then use it to create a JI record:

Tuple3__bool__bit_8___stringWriter ji = builder.create_Tuple3__bool__bit_8___string(true, (byte)10, "string");
builder.insert_JI(ji);

The Tuple3__bool__bit_8___stringWriter class is generated by DDlog. Similar classes are generated for all complex DDlog types used in the program.
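The generated class names appear to follow a regular pattern. The sketch below reproduces that pattern as inferred only from the names shown in this document (Tuple3__bool__bit_8___stringWriter and std_Option__bit_32_Writer); the compiler's actual naming rules may cover more cases. GeneratedNames is a hypothetical helper, not part of DDlog.

```java
import java.util.List;

/* Naming pattern inferred from class names shown in this document;
 * not an authoritative description of the DDlog compiler. */
class GeneratedNames {
    /* "bit<8>" -> "bit_8_": angle brackets become underscores. */
    static String mangle(String ddlogType) {
        return ddlogType.replace('<', '_').replace('>', '_');
    }

    /* Tuple with N component types -> "TupleN__t1__...__tN". */
    static String tupleName(List<String> componentTypes) {
        StringBuilder sb = new StringBuilder("Tuple").append(componentTypes.size());
        for (String t : componentTypes) {
            sb.append("__").append(mangle(t));
        }
        return sb.toString();
    }
}
```

Appending Writer or Reader to the result gives the serialization or deserialization class name for that tuple type.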

class <prog_name>UpdateParser

The <prog_name>UpdateParser class is the dual of <prog_name>UpdateBuilder; its job is to deserialize updates received from DDlog. Its main method, transactionCommitDumpChanges(), takes a DDlogAPI instance and a callback. It commits the current transaction and invokes the callback for each output relation update computed by DDlog.

redistUpdateParser.transactionCommitDumpChanges(
    this.api, r -> this.onCommit(r));

Here is an example callback:

void onCommit(DDlogCommand<Object> command) {
    int relid = command.relid();
    switch (relid) {
        case redistRelation.Span:
            SpanReader span = (SpanReader)command.value();
            System.out.println("From " + relid + " " + command.kind() + " Span{" + span.entity() + "," + span.tns() + "}");
            break;
        default: throw new IllegalArgumentException("Unknown relation id " + relid);
    }
}

The callback takes an argument of type DDlogCommand<Object>, where the Object contains a DDlog record. In order to access its fields, we must downcast the value to the appropriate class for the relation it belongs to. For example, if the record belongs to a relation named Span, its actual type is SpanReader, which in turn provides methods to access its fields.
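A common use of such a callback is to keep a local copy of an output relation in sync by applying each insertion and deletion as it arrives. The sketch below shows the idea with stand-in types: Change, Kind, and the SPAN constant are hypothetical and only imitate the shape of DDlogCommand and redistRelation.Span, and a String takes the place of SpanReader.

```java
import java.util.HashSet;
import java.util.Set;

/* Toy model of committed changes; these are stand-ins, not ddlogapi classes. */
class OutputMirror {
    enum Kind { INSERT, DELETE }

    static final class Change {
        final int relid;
        final Kind kind;
        final Object value;
        Change(int relid, Kind kind, Object value) {
            this.relid = relid;
            this.kind = kind;
            this.value = value;
        }
    }

    static final int SPAN = 1;  // hypothetical relation id

    private final Set<String> spans = new HashSet<>();

    /* Callback: pick the downcast by relation id, then apply the change. */
    void onCommit(Change c) {
        switch (c.relid) {
            case SPAN:
                String span = (String) c.value;  // real code would cast to SpanReader
                if (c.kind == Kind.INSERT) spans.add(span); else spans.remove(span);
                break;
            default:
                throw new IllegalArgumentException("Unknown relation id " + c.relid);
        }
    }

    Set<String> spans() { return spans; }
}
```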

class <prog_name>Query

The auto-generated ddlog.<prog_name>.<prog_name>Query class provides methods to query DDlog indexes. There are two methods for each index: query<index_name> returns all values associated with a given key, and dump<index_name> returns all values in the indexed relation. For example, given a program called graph.dl with the following declarations:

// graph.dl

relation Edge(from: bit<32>, to: bit<32>)
index Edge_by_from(from: bit<32>) on Edge(from, _)

DDlog generates class graphQuery as follows:

public class graphQuery
{
    public static void queryEdge_by_from(DDlogAPI hddlog, long from, Consumer<EdgeReader> callback) throws DDlogException
    {...}

    public static void dumpEdge_by_from(DDlogAPI hddlog, Consumer<EdgeReader> callback) throws DDlogException
    {...}
}

Both methods take a callback invoked once for each record returned by the query. Both methods return after enumerating all records via the callback.
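Because a query returns only after the last callback invocation, results can be gathered into an ordinary collection that is complete once the call returns. The sketch below imitates this with an in-memory list; EdgeQueryDemo, its Edge class, and queryByFrom are hypothetical stand-ins for graphQuery, EdgeReader, and queryEdge_by_from, and no DDlog program is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/* In-memory imitation of a generated query class; not DDlog code. */
class EdgeQueryDemo {
    static final class Edge {
        final long from;
        final long to;
        Edge(long from, long to) { this.from = from; this.to = to; }
    }

    static final List<Edge> EDGES =
        Arrays.asList(new Edge(1, 2), new Edge(1, 3), new Edge(2, 3));

    /* Same shape as queryEdge_by_from: one callback call per matching record. */
    static void queryByFrom(long from, Consumer<Edge> callback) {
        for (Edge e : EDGES) {
            if (e.from == from) callback.accept(e);
        }
    }

    /* The query is synchronous, so the list is complete when it returns. */
    static List<Long> successors(long from) {
        List<Long> out = new ArrayList<>();
        queryByFrom(from, e -> out.add(e.to));
        return out;
    }
}
```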

Here is a more exotic example:

input relation Rel(m: Option<bit<32>>)
index Rel_by_m(m: Option<bit<32>>) on Rel(m)

Here, the index has a complex key type (Option<bit<32>>). Serializing this type into a FlatBuffer requires a FlatBuffer builder; therefore, the generated query method has a more complex signature:

public static void queryRel_by_m(DDlogAPI hddlog,
                                 Function<testFlatBufferBuilder, std_Option__bit_32_Writer> m,
                                 Consumer<TIReader> callback) throws DDlogException

Note how instead of supplying the value of m, the caller supplies a callback that takes a FlatBufferBuilder instance and returns a handle to the serialized representation of m.

DDlog-to-Java type mapping summary

Fields of primitive types, e.g., bit<8>, map directly to Java primitives; complex types, like tuples and structs, map to their own <type_name>Reader instances.

The following table summarizes the Java types used in the serialization and deserialization APIs for each DDlog type. In the table, T_w and T_r stand for the serialization and deserialization types of T, which can be looked up in the same table.

| DDlog type | Java serialization API | Java deserialization API |
| --- | --- | --- |
| bool | boolean | boolean |
| bigint | BigInteger | BigInteger |
| string | String | String |
| IString | String | String |
| bit<N>, N<=16 | int | int |
| bit<N>, 16<N<=64 | long | long |
| bit<N>, N>64 | BigInteger | BigInteger |
| signed<N>, N<=8 | byte | byte |
| signed<N>, 8<N<=16 | short | short |
| signed<N>, 16<N<=32 | int | int |
| signed<N>, 32<N<=64 | long | long |
| signed<N>, N>64 | BigInteger | BigInteger |
| Tuple: (t1,..,tN) | TupleN__t1..__tnWriter | TupleN__t1..__tnReader |
| Struct: S<t1,..,tN> | S__t1..__tnWriter | S__t1..__tnReader |
| Vec<T> | List<T_w> | List<T_r> |
| Set<T> | List<T_w> | List<T_r> |
| Map<K,V> | List<Tuple2__K_w__V_w> | Map<Tuple2__K_r__V_r> |
| Ref<T> | T_w | T_r |
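The width-dependent rows of the table can be read as a simple lookup. The sketch below encodes only the bit<N> and signed<N> rows exactly as listed; TypeMapping is a hypothetical helper and is not generated by DDlog.

```java
/* Encodes the bit<N> and signed<N> rows of the table above. */
class TypeMapping {
    /* Java type used for an unsigned DDlog bit<width>. */
    static String javaTypeForBit(int width) {
        if (width <= 0) throw new IllegalArgumentException("width must be positive");
        if (width <= 16) return "int";
        if (width <= 64) return "long";
        return "BigInteger";
    }

    /* Java type used for a DDlog signed<width>. */
    static String javaTypeForSigned(int width) {
        if (width <= 0) throw new IllegalArgumentException("width must be positive");
        if (width <= 8) return "byte";
        if (width <= 16) return "short";
        if (width <= 32) return "int";
        if (width <= 64) return "long";
        return "BigInteger";
    }
}
```

Java has no unsigned integer primitives, which is presumably why bit<N> maps to a wider signed type than signed<N> of the same width.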