A scratch space for collaborative notes about the day's activity at Shonan.

Day 1 (1st July 2019)

Dynamic schedule, beginning with introduction talks.

Introductions

Oleg Kiselyov

  • Generating code for HPC kernels (A Shonan Challenge)
  • Language-integrated query: how to integrate and optimise generated queries and output standard SQL
  • Metcast - a practical use of some of these techniques to build a meteorological dissemination system
  • Developed since 1999, people on-call to keep it operational
  • Complicated high performance stream processing (windowing, multiple producer/consumer) with big and frequently updated data
  • What can we effectively apply code generation to?

Anil Madhavapeddy

  • Didn't take notes on myself!

Ryan Newton

  • Started with stream data at MIT tracking marmots!
  • Then worked on monotonic data: instead of parallel reduction, developed LVars for in-memory shared data structures modelled as lattices, with restricted interfaces that can ensure deterministic observations
  • Compact data: move from discontinuous subgraphs to compact regions, and now to fully serialised heaps on which IO can be performed directly.
  • The Gibbon project is a compiler for a subset of Haskell that allows code to run directly on the fully serialised heap.
  • Also working on reproducible containers (cloudseal.io) which is important to data processing in order to provide machine independence.

Tiark Rompf

  • Started out working on Scala; a key idea is the ease of language extension.
  • Comes with a cost: performance. Lots of people started shifting away since abstractions didn't come for free.
  • How far can we push squashing the stack? Can we have an end-to-end native data path based around metaprogramming so that there is no perf cost?
  • Initial results with Relational+ML are extremely good.
  • Flare/Lantern are two systems that connect back-propagation with delimcc and fit very well with metaprogramming ideas.
  • Combining metaprogramming techniques with verification -- connecting code with invariants and generating code with annotations that can be used by existing verification systems.
  • How can we combine the high level work here with unikernels and other low level systems techniques?

Aggelos Biboudis

  • Works on compiler engineering and has a research focus on metaprogramming, currently at EPFL working on Scala 3.
  • Streaming APIs: started to benchmark new Java 8 features and how they compared with other language implementations. Realised just how many different approaches exist for different streaming APIs of different languages.
  • Staging stream fusion: the library generates code so that the final program is the result of evaluating the stream pipeline, with no heap allocation or other expensive runtime operations left in the output.
  • At EPFL, started working on the new macro system in Scala to support the unification of multi-stage programming and macros in practice.
  • Scala 3: now with inlining (not simple C++-style inlining, but a directive that guarantees expansion), match types (computing new types), macros (metaprogramming with quotes/splices), multi-stage programming (runtime code generation), and analytical metaprogramming with TASTy-Reflect.
  • Looking for inspiration for data-intensive problems with principled code generation, for type-level metaprogramming case studies (such as Yallop's Staging Generic Programming), and for how to work with heterogeneous hardware (GPU–main-memory transfers).

Christopher Meiklejohn

  • PhD student at CMU (3rd year); came from industry after 15 years to apply his views on what's hard about building distributed systems.
  • Worked at Basho (Riak: a Dynamo clone, with the first "popular" CRDTs and geo-replication). Learnt that runtime support for geo-distribution is very hard and that failure scenarios are difficult. Then worked at Machine Zone on chain replication (very difficult to build with existing components). Then at Mesosphere (Lashup).
  • Scalable transport layer for actor-based programming (ATC 19) and can parallelise network communication to avoid head-of-line blocking. Can alter comms topology at runtime and used as the basis for a very high performance CRDT system.
  • Filibuster is a fault-injection framework <...>
  • Reactive Machine: programming system for composing cloud services reliably. Compose activities across APIs with at-least-once guarantees. Supports stateful entities and locks during orchestrations. One example used was Uber, who can double-allocate drivers since they want to stay eventually consistent in their data model. Compiles to a deterministic state machine that is replayable and durable. Targeted to Ambrosia or Azure. Has been productised in "Azure Durable Functions v2.0" by Sebastian Burckhardt after his internship with the Azure team.
  • Interested in deterministic programming as an essential component of any large scale data intensive system.

Annette Bieniusa

  • At TU Kaiserslautern, previously at Inria and Freiburg, working on shared state in highly scalable distributed systems -- e.g. CRDTs or distributed STM. Is developing correct applications on weakly-consistent data stores, and also a framework for offline functionality in mobile apps.
  • AntidoteDB: a geo-replicated data store with low latency and high availability. Uses CRDTs as its data model, with transactional causal+ consistency. Replication between datacentres is master-to-master, since every DC stores a full copy of the transaction log. CRDTs prevent write-write conflicts using predefined conflict resolution strategies.
  • Lots of projects around it: Repliss (specification, verification and testing of AntidoteDB apps via data invariants and invariants over the history of invocations); AcGreGate (access control models for highly available data stores) and semantics for protection relations; and client-side replication by extending consistency guarantees to the edge (e.g. browsers).
  • It's hard to replicate logs for object stores -- you need efficient snapshot construction and safe garbage collection across nodes. Objects that have inherently growing state, such as sets, are difficult.
  • Interested in programming models for distributed computing: "just-right consistency" for shared-memory abstractions (synchronisation patterns for invariant preservation and verification of them). Also working on flow-oriented programming for streaming and type-safe microservice composition.

Lindsey Kuper

  • At UC Santa Cruz, working on language-based approaches to building parallel and distributed systems that are correct and efficient.
  • Went from "databases are cool!" to "PL!" and then moved onto wider distributed systems usage from people who were interested in her PhD results. Thought of her PhD as PL though!
  • Lattice-based data structures for deterministic programming and then non-invasive DSLs for productive parallelism: write portions of a Julia program in a subset of the language that compiles to fast parallel code. Looking for the right high-level abstractions to enable efficient computation. Also been working on SMT-based verification for neural networks -- another example of finding the right high-level abstraction.
  • Now excited by domain-specific solvers for distributed consistency, presented a SNAPL talk about this recently.

Suresh Jagannathan

  • Systems that have many sources of data, and can pump out data at different rates. In this diverse environment, how can we practically apply verification tools?
  • Worked on MLton previously -- another approach to squash the software stack via whole-program compilation. Now we have even more tools to tackle this problem.
  • Reasoning about consistency: lots of domain-specific implementations over the years (Quelea, Q9, Acidifier, Craftsman, Quark). Deal with properties such as convergence, serializability, recency and user-level invariants.
  • Tries to parameterise the assumptions (e.g. characteristics of the storage systems, the form of provided specifications, the programming model, and application-level guarantees/assertions) so that the reasoning systems don't have to be rewritten whenever one of them changes (which happens frequently in modern systems).
  • Program+specifications that can be elaborated into verification conditions that go well beyond what's possible with just static type systems. Often handed off to a solver like Z3.
  • Also applying it to machine learning: how do we know that (e.g.) a self-driving car's algorithms are doing the right thing? How do we even specify what "right" means? Working on converting machine-learning preconditions to apply to practical systems, by only requiring the programmer to supply post-conditions on the model (e.g. the car must not crash); the system will then infer and enforce all the internal preconditions needed to satisfy them.
  • Building large scale dataflow graphs that can work on reconfigurable hardware and checking that the transformations are sensible with respect to the goals of the developer down to what the application provides.
  • Challenges for this meeting: scalable whole-system verification and compositional reasoning, what synthesis methods can we use for correct-by-construction systems (as opposed to post-facto determination of correctness). Are these transformed systems correct and semantically equivalent? Can we go beyond functional correctness to other properties such as liveness? We need to build systems that move beyond individual procedures and up to verifying entire protocols automatically -- current techniques do not automatically scale up.

Gowtham Kaki

  • Just submitted his PhD thesis to the committee -- two members are present at Shonan!
  • Are databases really ACID in reality? A study in HotOS 13 showed that 9 out of 18 COTS databases are fully ACID.
  • Bridging the gap between specifications (from rely-guarantee logic to Hoare logic). Built a system called Acidifier to introduce the minimal guarantees needed to ensure ACID behaviour in a program. Uses an OCaml frontend to synthesise code.
  • The next step was to build a bounded verification tool (Q9) that does symbolic execution over execution traces. It provides a counter-example when a violating trace is observed. Demo of Q9 running in a browser!
  • Bound is not the length of the program, but the amount of concurrency.
  • Took standard benchmarks and automatically found the minimal consistency level needed to ensure they remain ACID.
  • Verifies systems up to a certain amount of concurrent operations. Aim is to make this bound high enough such that there will not be a significant performance drop vs running unverified system.
  • Also working on mergeable data structures that are "democratised" so that they are as simple to define as normal ADTs. Derive a replication semantics from the type definition and the programmer-supplied invariants.
  • Also working on region-based memory management for stream processing systems (e.g. Naiad or Dryad).
  • Grand challenge: can we automatically derive the replication semantics of sequential applications? Develop it locally on a database and then (guided by invariants, test cases, traces, user input) to turn it into a distributed equivalent that preserves the sequential semantics with reasonable performance. Target applications include conventional db applications but also ones that are replicated by definition (e.g. blockchain or collaborative text editors).

KC Sivaramakrishnan

  • Programming languages for weakly consistent stores (not just databases, but also multicore shared memory systems).
  • Generally interested in PL for (concurrent / parallel / distributed). Aiming to make systems programming more boring (from a bug occurrence perspective!).
  • Working on upstreaming multicore OCaml and also hacking on Irmin to look at PL approaches to the systems problems around distributed data structures.
  • Quelea: a Haskell DSL that represents replicated datatypes as Haskell programs, and uses Burckhardt's POPL 14 formalism for describing application-level guarantees. Encoded as a contract language the programmer writes to express application-level contracts, which are then mapped down to the backend store (in this case, Cassandra). One drawback is that the contract author still needs to think about concurrency.
  • Instead of talking about low-level relations, can we get to application-level specifications (uniqueness, foreign keys, integrity constraints)? Implemented this in Q9 with Gowtham and Suresh.
  • Challenge: real-world weakly consistent systems tolerate errors pretty well -- verification systems do not take this into account. How do we bridge this gap so that we can verify real systems that have some error tolerance built into them, but still use nice automated verification? Probabilistic verification is one possible route (PLDI 19 - "Scalable Verification of Probabilistic Networks" or "Bayonet: probabilistic inference for networks" in PLDI 18).
  • Scalable replicated data types: CRDTs are a systems-level abstraction and not composable. Can we take run-of-the-mill polymorphic algebraic data types and make them composable? We can, with a merge function that takes the lowest common ancestor and two versions and returns a merged structure. This leads to nice relational properties and lets us inductively reason about multiple data structures. The verification only involves sequential data structures (the project is called Quark). LCA maintenance places strong expectations on the underlying store, and Git is terrible for this with a large number of objects. Need to figure out what a good network protocol for this might be (using a set difference).
  • What might a distributed backend for Irmin look like? AntidoteDB might be the right candidate! No need for strong consistency, good write throughput, manages a large number of objects, supports mutations. IPFS fits the bill on paper, but is rather slow per node.
  • Memory models: an unambiguous specification of program outcomes. It's more than just thread interleavings. It needs to be not too weak (good for programmers), not too strong (good for hardware), must admit optimisations (good for compilers) and be mathematically rigorous (good for verification).
  • Is there an equivalent of a DRF-SC theorem for weakly consistent distributed stores? Lindsey points to a paper that answers this -- fastest challenge solution record?

Heather Miller

  • Making it easier to piece together distributed systems. Started doing this at EPFL and then founded and directed the Scala Center. Changed her opinion about how to conduct PL research after seeing how big companies used Scala. PhD at the intersection of programming languages and data-centric distributed systems. Feedback loop with big industrial users to establish context and problem domains.
  • Her PhD came at the time when the trend was the shift towards in-memory distributed computing and the availability of low-cost memory in data centres. Apache Spark appeared and made systems much more interactive, and Scala made the system much more expressive. Worked at the level between Spark and Scala during this time, trying to solve issues at the level of the PL and runtime.
  • Everyone is becoming a distributed programmer now (Spark can easily be used from notebooks by people with no programming experience). And yet, issues related to distribution and even code generation trickle up to these novice users. Aspects of distribution cannot be hidden, low-level complexities of scheduling distribution move up the stack, and developers must be prepared to debug issues at the level of the runtime.
  • Projects generally seek to improve 3 concerns central to distributed programming: latency, reliability (from a distribution perspective) and customisability.
  • Worked on generating efficient pickling (serialisation), spores (distributed closures), FP programming models (generalising Spark and MapReduce), and implementing futures and promises. This was all tightly integrated with the development of the metaprogramming system in Scala. For example, the pickling project was the development of static type class derivation.
  • Currently working on: typing configurations of microservices, type systems for tracking properties for consistency (monotonicity), and improving distributed actor runtimes (deterministic replay, tracing and equivalent fault injection).
  • Questions: do we really need new languages? Why not use existing tricks from the world of PL/static analysis to manage our existing languages? Although we talk about design, we often forget the people in the language design work that we do. We solve what we think is intuitive, but that's only intuitive to us or people like us.

Yukiyoshi Kameyama

  • Works on metaprogramming and its applications, functional programming and type theory, and control operators and continuations.
  • Also researches type-safe and scope-safe program generation, and offshoring from OCaml to C and LLVM.
  • Language-integrated queries convert normal language code into SQL queries using program generation. The problem is that the PL's abstraction power is too strong for LIQ: PLs allow arbitrary nesting of control and data structures. Other SQL constructs are hard to treat in this framework, such as ORDER BY, LIMIT, or GROUP BY.
  • Borrow Cheney's approach (Generate and Transform/Normalise) and use typed multi-stage programming. "Grouping as an effect"
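
As an illustration of the "generate, then emit flat SQL" idea (a minimal sketch, not Cheney's or Kameyama's actual system; the table and column names are made up), a host-language program can build a first-order query representation and pretty-print it as SQL:

```ocaml
(* Hypothetical sketch: the host program manipulates a query AST that is
   already in a flat, SQL-shaped normal form; arbitrary nesting in the
   source language has to be normalised away before reaching this stage. *)
type expr =
  | Col of string
  | Int of int
  | Str of string
  | Eq of expr * expr
  | And of expr * expr

type query = {
  select : string list;   (* projected columns *)
  from   : string list;   (* tables *)
  where  : expr;          (* flat predicate *)
}

let rec sql_of_expr = function
  | Col c -> c
  | Int n -> string_of_int n
  | Str s -> Printf.sprintf "'%s'" s
  | Eq (a, b) -> Printf.sprintf "(%s = %s)" (sql_of_expr a) (sql_of_expr b)
  | And (a, b) -> Printf.sprintf "(%s AND %s)" (sql_of_expr a) (sql_of_expr b)

let sql_of_query q =
  Printf.sprintf "SELECT %s FROM %s WHERE %s"
    (String.concat ", " q.select)
    (String.concat ", " q.from)
    (sql_of_expr q.where)

let () =
  let q = {
    select = ["orders.id"; "customers.name"];
    from   = ["orders"; "customers"];
    where  = And (Eq (Col "orders.cid", Col "customers.id"),
                  Eq (Col "customers.region", Str "EU"));
  } in
  print_endline (sql_of_query q)
```

In the typed, staged versions discussed above, quoted code takes the place of this string generation, and normalisation guarantees that only SQL-expressible programs reach this final printing step.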

Peter Braam

  • Started as a physicist working on string theory and then moved over to computing and distributed systems. Built Lustre and then went back to academia and got involved with a large radio telescope project (SKA at Cambridge).
  • Got involved in file systems in Linux quite early: distributed filesystems such as Coda and also local filesystems such as ext3/ext4. Built Lustre under a DoE contract, and it became the most widely used storage system on very large HPC systems. Used by companies such as Netflix (in the 2000s) and Uber (today), by Fujitsu (K-systems), and even offered by Amazon as a cloud service.
  • Telescopes basically run imaging pipelines very similar to normal photo services, but the data layout is radically different (e.g. most of each image is black and so highly sparse).
  • Challenge: Program + Data + Hardware -- to what degree can they be held separate, and how can they be combined?

Wilmer Ricciotti

  • Working on "Skye: a language bridging the theory and practice of data provenance". Did PhD in Bologna on the implementation of interactive theorem provers. Background in type theory and the mechanisation and verification of programming languages.
  • Started working on language-integrated provenance and how to derive provenance for functional languages and the related field of slicing based on the tracing semantics of programs.
  • Curated scientific data: typically multi-tier web applications with SQL or RDF data models. Provenance is ad-hoc code or via modified DB engines.
  • Need a new programming language for data curation that works across tiers and can bridge computational islands implemented in different languages. Provide dialects and translations, and implement provenance tracking as a special (endomorphic) function.
  • Skye uses language-integrated queries, NRC-style normalisation (using Links as the basis for this). Links lacks extensibility, so trying to add these as libraries. Using execution traces as a general form of provenance and type-directed generic programming (Typerec and typecase). Uses heterogeneous metaprogramming to compose several dialects in the same binary. Has mechanised semantics (using nullSQL) with the goal of adding provenance information to the semantics over the next few months.
  • LinksT extends Links with traces (Lit/Op/Cell/If/For) and conversion of queries to self-traces.
  • In a standard setup, you send a query to a database, which responds with a result. In a provenance setup, you instead send a query-trace to the database, which responds with a large traced result; trace analysis is then applied to produce both the result and its provenance. But this requires a lot of intermediate information. Skye does automatic pruning of irrelevant trace information and does not require large intermediate results (all computation is efficiently performed within the database itself).
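
A rough sketch of what a trace datatype along these lines could look like (the constructor set follows the Lit/Op/Cell/If/For list above; the field names and the example value are invented):

```ocaml
(* Illustrative trace datatype: each query result carries a trace saying
   how it was computed, down to the database cells it depends on. *)
type trace =
  | Lit of string                                           (* a literal constant *)
  | Cell of { table : string; row : int; column : string }  (* a database cell read *)
  | Op of string * trace list                               (* primitive op and its inputs *)
  | If of { cond : trace; value : bool; branch : trace }    (* condition, its value, taken branch *)
  | For of (int * trace) list                               (* one sub-trace per input row *)

(* Example: the trace of (price * quantity) computed from row 3 of "orders". *)
let example : trace =
  Op ("*", [ Cell { table = "orders"; row = 3; column = "price" };
             Cell { table = "orders"; row = 3; column = "quantity" } ])
```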

Hideyuki Kawashima

  • Works in the centre for computational science; now an associate professor at Keio University. The Subaru Telescope (at about 4 km altitude) in Hawaii can image other galaxies using 100s of CCDs. It generates 300GB of data per day, which is processed and then stored in a distributed filesystem. The challenge is how to scale transactions to the metadata server, which becomes a bottleneck.
  • Looking at the breakdown of workload in the server, about 10% is spent on logging. Proposed P-WAL to speed this up, using isolated buffers instead of shared buffers, which helps with logging.
  • The remaining overhead, around 20%, is in concurrency control. There are many solutions proposed in the literature in recent SIGMOD/VLDB/SOSP papers. Implemented all of them to analyse which approach is best! Performance varies depending on the balance of reads vs writes.

Stack Squashing Working Group

Diagrams about kernel components can be found in https://en.wikipedia.org/wiki/Linux_kernel

Here's the initial system stack we drew together:

  • Initial system stack diagram

And then here's the picture after we added some applications that motivate stack squashing and some other notes:

Enriched stack squashing wall

(TODO: rrnewton can replace above Evernote links by pushing images into wiki repo when direct push access is granted...)

Day 2 (2nd July 2019)

Squashing WG show and tell

(TODO notes -- avsm was presenting)

Talked about wasm WASI. Discussion points:

  • one WASI API or modules?
  • app-specific scheduling?
  • CDN usecase with Fastly, what else?
  • incremental machine learning? how is it tied to wasi?

Peter Braam

The SKA radio telescope tackles astronomical problems and is due for launch in 2027 or so. It is a cluster of telescopes in South Africa and Perth, with beamformed antennas forming an interferometer. Around 1 TB/sec of input of so-called visibility data.

Why FPGA and not ASIC? Not enough volume for custom ASICs. Also need techniques for new things such as using signatures of rough paths, and so need the ability to upgrade FPGA.

Processing is complex since it is not as simple as just dealing with a flat plane. Sky is a sphere, earth rotates and atmosphere distorts, so there is a lot of non-linear processing.

The imaging pipeline is complex; it needs to iterate until convergence, about 10 times, comparing with an already known model of the sky and then subtracting everything known before using what is left. Steps include standard things like Fourier transforms and convolutions, and then unusual things such as image cleanup to remove temporary obstructions and so on.

Two data types: an irregular sparse uvw grid of baselines, and data read continuously at 10TB/sec. Need a lot of RAM to process. System engineers at SKA prefer to stick to known computer science to reduce project risk. Published a document called the parametric model, which calls for 100 PFLOP/s of compute but 200 PB/sec of memory-to-CPU bandwidth.

Many small variations in the fourier transforms used. Some aligned, some not. The two supercomputers are different specs due to differences in the parametric model.

The processing pipeline needs to be generic to any processing framework and defer the pipeline execution framework selection for later. The SKA team mandated that everything is written in Python since they found that other DSLs were too complicated to modify.

From maths to physics to algorithms: the mathematical abstractions become complex formulas and matrix-related code. Need systematic transformations here.

Many OSS projects looked promising but lacked essential features for the SKA pipeline. Evaluated things like Halide, Legion, Parsec, Swift/T, Cloud Haskell, MPI, DASK, Spark. Had about 50 high-level requirements (e.g. "the system should be able to schedule and maximise memory usage to avoid underuse or overload"). Although many of the requirements were nominally met, most systems fell over when faced with a scaling problem (e.g. how to multiply two matrices larger than available memory). See braam.io for the collection of documentation about this evaluation.

Opportunities for array processing are abundant. New floating point formats (posithub.org) decouple range and precision and offer significant data savings with higher-precision calculations -- possibly 2x savings for the SKA via lower bandwidth and reduced memory and storage capacity. The resulting data arrays need special accessors for the compressed representations, so a[n] in C can't work. Decompressing at 1TB/sec isn't easy either.
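
A toy sketch of why indexing has to go through an accessor once the data is compressed; delta-encoded blocks stand in here for the real compressed formats, and the block size and encoding are made up:

```ocaml
(* Values are stored delta-encoded in fixed-size blocks, so reading element n
   means locating its block and re-running the decode up to that index; plain
   pointer arithmetic like a[n] no longer works. *)
let block_size = 4

type block = { base : int; deltas : int array }

let encode (xs : int array) : block array =
  let nblocks = (Array.length xs + block_size - 1) / block_size in
  Array.init nblocks (fun b ->
      let start = b * block_size in
      let len = min block_size (Array.length xs - start) in
      { base = xs.(start);
        deltas = Array.init (len - 1) (fun i -> xs.(start + i + 1) - xs.(start + i)) })

(* The accessor: decoding is part of every read. *)
let get (blocks : block array) (n : int) : int =
  let b = blocks.(n / block_size) in
  let rec decode acc i = if i = 0 then acc else decode (acc + b.deltas.(i - 1)) (i - 1) in
  decode b.base (n mod block_size)

let () =
  let xs = [| 10; 12; 15; 15; 20; 21 |] in
  let blocks = encode xs in
  assert (get blocks 2 = 15 && get blocks 4 = 20);
  print_endline "ok"
```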

TensorFlow is doing some useful framework evolution via MLIR. There is an opportunity for a re-use strategy for other array computations; for example, HPC numerical linear algebra could be implemented on top of TensorFlow to reduce duplication of implementation.

There is also no real standard for scheduling jobs - there may be dozens running in SKA and they need to be harmonised. Processing for SKA requires up to 6 hours of ingest, which works out to 21,600 TB of data as the unit of data that must be ingested to work on. This requires double buffering due to the large dataset.

A computation may also be too large to fit into addressable memory. Cache hierarchies cannot be hidden away from HPC code, nor from the GC runtime, or else performance suffers badly.

Discussion: can we build smaller versions of the problem spaces to run at research scale? Several of the frameworks didn't have the same results when run distributed.

Ryan Newton

  • Gibbon Project: specialise and monomorphise with whole-program optimisation. Data needs to be flat, dense and application specific. There is no memory model needed since the subset is pure and linear, and it has first-class arenas. Evaluation order is unconstrained since it uses Liquid Haskell for termination checking. There is bounded parallelism, and it is a non-invasive DSL since it is a subset of Haskell.

  • Compact, isolated heap regions (ICFP 2015) take a disparate heap and turn it into a distinct address space where objects are contiguous and placed into a single shared region. This saves on GC scanning; it was merged into GHC 8.2 and is used by projects such as Haxl at Facebook. But it also gives data representation savings, since heap data can be accessed directly using mmap without serialisation. There can be pointers within the heap; the system tries to map them into the same address space in the other heap or does an O(n) fixup if it can't.

  • The new work gets rid of the pointers within the compact region for even more density.

  • A dense preorder representation might squash by getting rid of pointers. A microbenchmark on traversing a tree showed that the packed representation is a significant improvement over OCaml/Rust/Haskell/GCC. Now how do we compute over the packed data?

  • Gibbon is a prototype compiler and runtime that runs an accelerated subset of Haskell, using Liquid Haskell separately (for now) to verify termination. Gibbon will generate code that operates directly over the serialised data without ever deserialising it.
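
A small sketch of computing over a packed representation (not Gibbon's actual layout or generated code): a binary tree is serialised in preorder into a byte buffer, and the traversal works directly on the bytes, returning the offset just past each subtree so the caller knows where the next field starts.

```ocaml
(* Tag 0 = Leaf followed by a 4-byte int; tag 1 = Node followed by the
   packed left subtree and then the packed right subtree. *)
type tree = Leaf of int | Node of tree * tree

let rec pack buf off = function
  | Leaf n ->
      Bytes.set_uint8 buf off 0;
      Bytes.set_int32_le buf (off + 1) (Int32.of_int n);
      off + 5
  | Node (l, r) ->
      Bytes.set_uint8 buf off 1;
      let off = pack buf (off + 1) l in
      pack buf off r

(* Sum the leaves without ever rebuilding the tree: the traversal walks the
   buffer and returns (sum, offset just past this subtree). *)
let rec sum_packed buf off =
  match Bytes.get_uint8 buf off with
  | 0 -> (Int32.to_int (Bytes.get_int32_le buf (off + 1)), off + 5)
  | _ ->
      let sl, off = sum_packed buf (off + 1) in
      let sr, off = sum_packed buf off in
      (sl + sr, off)

let () =
  let t = Node (Node (Leaf 1, Leaf 2), Leaf 3) in
  let buf = Bytes.create 64 in
  let _len = pack buf 0 t in
  Printf.printf "sum = %d\n" (fst (sum_packed buf 0))
```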

  • Eval: since this is about performance, let's look at how it works vs Cap'n Proto and Haskell Compact Normal Form. Heather Miller: the Tungsten project from Spark is relevant; it also does similar things to avoid serialisation.

  • LoCal (Location Calculus) is a type-safe language with data types, but the types are annotated with symbolic locations, and data constructors take location arguments. Each region is a serialised data structure, and locations are positions inside regions. Locations are written to only once.

  • In the future, aiming to teach the Gibbon compiler about preexisting formats to use existing datasets, as well as integrate with GHC.

  • QA: still working on adding mutation (in-place, without changing the structure). Q: Are there programmer-controlled regions? A: There are first-class arena interfaces.

Lindsey Kuper

  • Three way tradeoff between performance, productivity and generality. This leads to a "two language" problem (e.g. Python or C), although some languages (e.g. Julia) claim to solve the three way problem. Sometimes it's worth trading for productivity and performance by giving up generality. Going to talk about domain-specific solvers for verifying distributed consistency properties.

  • Motivating example of a domain-specific solver: Reluplex for neural network verification. The goal was to reduce the storage requirements for an aircraft collision-avoidance system so that it fits onboard an aircraft. The neural-network-based implementation took 1000x less memory than the original implementation. But is it correct?

  • How Reluplex works: take a trained neural network, and every unit in the network becomes part of a huge formula. Then use an SMT solver to determine whether the formula is satisfiable with respect to some theory (like linear real arithmetic). An eager approach converts the whole SMT formula to a SAT formula and then uses a SAT solver. A lazy approach preserves the underlying structure of the theory, so we can have a top layer of theory solvers, each specific to a particular theory. For example, if addition is commutative you can exploit that, but it's hard if you only have a boolean formula. Oleg notes that error messages are also problematic if you go to an eager implementation.

  • The theory of Linear Real Arithmetic lets us try multiple representations, and with activation functions we can also learn complex non-linear functions. Unfortunately the LP solver cannot deal with disjunctions, as it only takes conjunctions of all its pieces. This is bad for solving efficiency, since you need to drop down to the underlying SAT solver. Reluplex instead extends the LP solver by adding a ReLU primitive to the solver code, and the result was a solver that allowed verifying properties of networks 1-2 orders of magnitude better than previous tools. KC: is LP decidable? A: decidable, but performance was slow (intractable).
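
For reference, the case split that each ReLU unit forces on a plain LP/SMT encoding is the standard disjunction below; handling this constraint natively in the simplex-based solver avoids eagerly splitting on it for every unit.

```latex
y = \mathrm{ReLU}(x) \iff (x \le 0 \land y = 0) \lor (x \ge 0 \land y = x)
```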

  • How to apply this to distributed consistency? The long-term aim is to democratise the development of these solvers. Replication is the solution to many distributed systems problems but also the cause of them! There are many, many ways in which replicas can disagree and lead to inconsistency. What is the minimum consistency level needed, based on your app requirements?

  • Consistency order is not expressible in first order logic, so Quelea had to fix it up. Consistency aware solvers for existing contract language are a good starting point. Want to get a framework for building high performance domain-specific solvers. "Delite for domain-specific solvers"

Gowtham Kaki

  • Some distributed data structures aren't commutative and hence not a CRDT. However, we can still get some useful convergence semantics despite this. Start by capturing the effect of multiplication through the commutative addition operation. Can we program and reason about replicated data types as an extension of their sequential counterparts?

  • Add a merge function to each data structure. It takes the lowest common ancestor and two versions as arguments and returns a merged version. The three-way merge function makes them suitable for convergent replication. Does this generalise?
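
As a minimal sketch (illustrative, not the exact formulation from the talk), a replicated counter's three-way merge treats each replica's divergence from the lowest common ancestor as a delta, and the deltas combine commutatively:

```ocaml
(* merge lca v1 v2 = lca plus both replicas' deltas since the ancestor *)
let merge_counter ~lca v1 v2 = lca + (v1 - lca) + (v2 - lca)

let () =
  (* Starting from lca = 10, one replica added 5 and the other added 3. *)
  assert (merge_counter ~lca:10 15 13 = 18)
```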

  • Consider a polymorphic queue with at-least-once semantics. Just replicating queues with async transmission of operations results in errors (for example, a pop might take effect twice if two nodes pop simultaneously). Discussion about CRDTs and how they use vector clocks. Convergence alone is not sufficient; intent must also be preserved.

  • What is intent preservation? We need to formalise the intent of operations on a data structure. E.g. for a replicated queue, any element popped in either v1 or v2 does not remain in the merged version; and other similar rules. Introduce relational specifications as the way to capture membership and ordering (Rmem and Rob for membership and occurs-before).
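
A sketch of a queue merge that respects this intent, assuming unique elements (a simplification, not the derived implementation from the work being described): ancestor elements survive only if neither version popped them, and each version's new pushes are appended afterwards.

```ocaml
let merge_queue ~lca v1 v2 =
  (* keep ancestor elements that neither replica has popped *)
  let surviving = List.filter (fun x -> List.mem x v1 && List.mem x v2) lca in
  (* elements pushed since the ancestor, per replica *)
  let fresh v = List.filter (fun x -> not (List.mem x lca)) v in
  surviving @ fresh v1 @ fresh v2

let () =
  (* lca = [1;2;3]; v1 popped 1 and pushed 4; v2 popped 1 and 2, and pushed 5 *)
  assert (merge_queue ~lca:[1; 2; 3] [2; 3; 4] [3; 5] = [3; 4; 5])
```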

  • Discussion: is this use of relations useful? Oleg suggests it's too heavyweight for developers to use. Suresh points out that the relations are derived automatically and can also drive theorem provers (they are not intended to be seen by developers using the data structure).

  • The result is that compositionality is much easier to implement with (for example) Pair. The merge of a pair is the merge of the corresponding constituents.
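
A sketch of that compositionality for pairs (again illustrative):

```ocaml
(* The merge of a pair is the component-wise merge, each using the
   corresponding component's own three-way merge function. *)
let merge_pair merge_a merge_b ~lca:(la, lb) (a1, b1) (a2, b2) =
  (merge_a ~lca:la a1 a2, merge_b ~lca:lb b1 b2)

let () =
  let merge_counter ~lca v1 v2 = lca + (v1 - lca) + (v2 - lca) in
  assert (merge_pair merge_counter merge_counter ~lca:(0, 0) (1, 2) (3, 4) = (4, 6))
```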

  • Q (Heather): how is this relevant to Irmin? With these three-way merges, one can add the merge functions to the data structures stored in Irmin. There are no optimality results, but the merge functions "just work" using this approach.

Day 3 (3rd July 2019)

Starting with a wrapup of the previous day's working groups.

Show and tell and between-session notes

Stack Squashing

There was a twitter thread on OS "multicalls", which is one thing we talked about in the "Stack Squashing" session. The thread cites this paper which discusses the increasing relevance of these ideas in light of Meltdown/Spectre. There's also a recent blog post by John Regehr on related topics.

Language-Determined Data Representation (notes by Peter Braam)

To solve a data-intensive problem typically one needs Problem Specification (not just Program Specification):

Algorithm + Data + System Architecture

PL is primarily addressing the study of algorithms. For data-intensive problems, these are married, not separate.

Examples of relevance:

  • Algorithm - focused on complexity, efficiency, parallelization
  • Ryan’s talk – compact data representations minimize data cost
  • Adaptive Mesh Refinement – system architecture hugely influential
  • SKA – illustrates difficulties at multiple levels: suitability of SW for data, sufficiently clear problem specifications

Opportunity in solution architecture of data intensive computing. Where we typically see 4 quite separate layers:

  • Problem specification. Often PDE’s or calculus, mathematics, physics, … or engineering
  • Numerical solution. Discretization, proofs of or belief in convergence.
  • Algorithm
  • System

Challenge: design a language that includes data, architectures, problems, and algorithmic information. Automate much of the above architecture and optimize for metrics along the lines shown above: computational, data, energy, cost efficiency.

Notes:

  1. Paul Kelly's group and collaborators did this for fluid dynamics
  2. Are there other good opportunities?

Application Drivers

TODO

WG on Verification under Weak Consistency and RDTs (notes by Gowtham Kaki)

We started the discussion with a review of some related work on weak consistency verification, which included MixT by Milano et al from Cornell, and Consistency Types from Holt et al from Washington. In both these works, data is partitioned based on the requirements of consistency, and an emphasis is placed on not letting weakly consistent data flow into strongly consistent data banks.

On this note we asked the question of whether there are any statistics on real-world applications manifestly changing the consistency levels from one that is the default in the data store. Chris pointed out that CosmosDB offers 5 different levels of consistency, including some of the session guarantees such as monotonic reads. Applications are required to choose one among these levels for every table or view they define, and most applications, i.e., almost 90% of them, end up choosing between EC and SC, most likely because they have no idea what the semantics of other consistency levels are, and how they relate to their applications. There are indeed applications that use weakly consistent stores that care deeply about correctness, such as the National Health Service (NHS) of UK, which uses Riak (I think) because it never wants to lose a write.

However, even among such applications, there hasn't been any published evidence that choosing an inappropriate consistency level has led to a manifest violation of application integrity. Taking these observations into consideration, in my opinion, maybe we are focusing on the wrong problem in verifying applications under weak consistency. Maybe we should focus on synthesizing weakly consistent applications that are soundy or truthy w.r.t. their sequential specification.

While discussing RDTs, Suresh pointed out that CRDTs and notions of consistency are not orthogonal concerns, as is often thought. There exist CRDTs whose correctness crucially depends on assumptions about the machine model which are never made explicit. For instance, the OR-set requires the underlying machine to be causally consistent, which is not part of its Observe-Remove semantics. On a related note, Suresh lamented the balkanization of weak consistency research, which is not letting us take advantage of the problems already solved in the weak memory models world. As several authors have observed, there is a strong correlation between weak memory and weak consistency, yet our vocabulary is so different from the weak memory world that there is no straightforward way to reason about how an algorithm A working under assumptions X in the weak memory world behaves in the weak consistency world under a different set of assumptions Y. If we can solve this problem by inventing a unified model to reason about weak memory and weak consistency, then we can bring to bear on weak consistency the benefits of all the hard work that went into designing correct lock-free concurrent data structures (such as those in java.util.concurrent).

(Lindsey: I propose replacing "machine" above with "system model of message delivery")

(Lindsey: I think that Burckhardt's RDT specification framework might already be this unified model. Sohum Banerjea had a blog post about this in my seminar last fall.)

With respect to the practical problems, there has been a discussion on the lack of formal models that account for partial replication. A sample system with partial replication is PRACTI (Belaramani et al). KC proposed research on probabilistic consistency enforcement that takes into account the prior distribution of consistency in the system. A relevant paper in this direction is Ramabaja’s The Bloom Clock.

Tiark Rompf

  • Tutorial about scala-lms.org

  • Big data systems like Spark and TensorFlow are shifting to a deferred execution model. This allows the system to look at a query and run an analysis to figure out the right data processing strategy before diving in. It helps ensure that the data layer runs at high performance, whereas the control layer can remain flexible.
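
A toy sketch of the deferred-execution idea (the operations and the constant-folding pass are invented for illustration): user code builds an expression graph instead of computing, so the runtime can inspect and rewrite the whole pipeline before running it.

```ocaml
type expr =
  | Lit of float
  | Add of expr * expr
  | Mul of expr * expr

(* Because nothing has executed yet, the runtime can rewrite the graph --
   here just folding constants; a real system would pick data layouts,
   fuse operators, plan distribution, etc. *)
let rec optimise = function
  | Lit x -> Lit x
  | Add (a, b) ->
      (match optimise a, optimise b with
       | Lit x, Lit y -> Lit (x +. y)
       | a', b' -> Add (a', b'))
  | Mul (a, b) ->
      (match optimise a, optimise b with
       | Lit x, Lit y -> Lit (x *. y)
       | a', b' -> Mul (a', b'))

let rec run = function
  | Lit x -> x
  | Add (a, b) -> run a +. run b
  | Mul (a, b) -> run a *. run b

let () =
  let graph = Mul (Add (Lit 1.0, Lit 2.0), Lit 10.0) in  (* deferred, not yet run *)
  Printf.printf "%g\n" (run (optimise graph))
```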

Annette Bieniusa

  • Access control for weakly consistent systems. Every replica is acting autonomously and accepting both ACL changes and data changes simultaneously. Therefore temporary inconsistency is possible when concurrent modifications are not synchronised. What should access control look like in such an environment?

  • One approach is to use a strongly consistent server (either centralised or decentralised). This gives nice, clear semantics, but is a synchronisation bottleneck. Therefore, can we have a weakly consistent ad-hoc server with much greater throughput? Let's take a systematic approach to this.

  • Need to enforce confidentiality, to prevent access to some data. Start with an intuition, then formally define in temporal logic, then do ACL modelling and evaluate the system built from the formalism.

  • The basis of ACLs is to grant a subject the right to execute an operation on an object, to revoke that right, and to execute an operation on an object as a subject. The default policy is that no subject has the right to execute anything, so rights have to be granted first.

  • The intuition was specified in Isabelle. It uses a notion of linear time, so temporal logic is a natural step for interpreting it over possible traces. Used Abstract Execution (Burckhardt et al, POPL 14) as the specific temporal logic. LTL lists all possible traces, and CTL gives tree-shaped time with existential properties.

  • EPTL is an event-based parallel temporal logic with the possibility to express systems that may never converge (as opposed to starting with a presupposition of eventual convergence of the system). Oleg suggests that hybrid logic, as a way of expressing tense, may also be relevant. EPTL has propositions over events for operations.

  • Conflict resolution on concurrent updates must be well defined. For example, if the profile is set to private and someone removes all private numbers in parallel, and then a private number is added and the profile is published, then some private data is leaked. Revokes need to take priority over grants to ensure confidentiality can be preserved while the system is inconsistent.

  • Given an app-specific decide-function as an interpretation of the rights, an abstract execution A of a distributed access control system translates this policy.

  • The system was implemented in AntidoteDB, based on a modified version of a multi-value register. The read operation returns the minimum over all currently assigned rights. It is represented as an Antidote CRDT.
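
A small sketch of that read-as-minimum resolution (illustrative types only, not the Antidote CRDT's actual interface):

```ocaml
(* Rights are totally ordered; concurrent assignments are all kept in a
   multi-value register, and the effective right is the minimum of the
   concurrent values, so a concurrent revoke wins over a concurrent grant. *)
type right = NoAccess | Read | Write | Admin

let rank = function NoAccess -> 0 | Read -> 1 | Write -> 2 | Admin -> 3

(* The multi-value register holds every concurrently written value. *)
type mv_register = right list

let effective (reg : mv_register) : right =
  match reg with
  | [] -> NoAccess   (* default policy: nothing has been granted *)
  | r :: rs -> List.fold_left (fun acc x -> if rank x < rank acc then x else acc) r rs

let () =
  (* One replica granted Write while another concurrently downgraded to Read. *)
  assert (effective [Write; Read] = Read);
  assert (effective [] = NoAccess)
```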

  • Open problems: how can we ensure there is at least one administrative user in the system? This is hard to enforce as it's a property of the global state of the system. Also, a malicious user should not be able to permanently take over a replica.

Day 4 (4th July 2019)

Starting with talks in the morning.

Chris Meiklejohn

Equivalence partitioned fault injection (with Heather Miller, Peter Alvaro)

Testing REAL distributed applications.

Motivation: distributed programming is difficult due to network uncertainty; abstract, unrealistic assumptions often do not hold in practice. State of the art: (1) specification-based model checking (e.g. at AWS), where the spec is different from the implementation (TLA+/SPIN); (2) chaos engineering: no models, test everything in production. Ryan: you need the infrastructure to do fault injection, and most chaos engineering approaches just crash nodes -- are there other approaches? There are very few approaches that do message reordering (e.g. intercept gRPC messages and reorder them).

Goal: Fault injection as part of the development process. Must be fast!

Example: 2PC with 4 participants. If you only consider message omissions there are 2^12 possible executions (with 4 participants there are 12 messages -- a prepare, a vote and a decision per participant -- each of which may be dropped), of which 9 executions deadlock. It takes a long time to exhaustively test this in practice. Another variant of 2PC is the Collaborative Termination Protocol (CTP), an extension of 2PC that resolves the 9 deadlocks: 2^12 (2PC) + 2^27 (CTP) executions.

Instead, we want to do Equivalence Partitioned Fault Injection. Intuition: start from a correct execution and reason backwards. Generate a trace and omit things to trigger a fault; dynamically this still gives lots of executions. The idea is to partition the executions into equivalence classes using lightweight static analysis and then only test one execution from each class, with a small dynamic component to remove false positives. We also provide code coverage metrics.

How to cut down the search space? A small static analysis over the Erlang code captures causal dependencies between the messages to rule out impossible executions. In 2PC, if the coordinator never sends a prepare message, then the participants don't send vote messages, so rule those executions out. We allow static analysis refinements to express these application-specific rules, as the static analysis does not reason about the number of messages without annotations. Risk: refinements are written by developers, so how do you know a refinement is correct? We test the system to produce a counterexample that violates the refinement, and we only look at violations of the weakest precondition. A dynamic analysis then picks only one member of each equivalence class. Execution is multiple nodes on a single machine, and we get a code coverage report from all of the nodes.
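
A toy sketch of the causal-pruning idea, under invented encodings (the real analysis works over the Erlang code): model an execution as the set of omitted messages, and discard any execution that "omits" a vote whose prepare was already dropped, since that vote never exists.

```ocaml
type msg = Prepare of int | Vote of int   (* int = participant id *)

module MsgSet = Set.Make (struct
  type t = msg
  let compare = compare
end)

(* Causal rule: if Prepare p was dropped, participant p never votes, so
   also dropping Vote p does not describe a distinct execution. *)
let causally_distinct omitted =
  MsgSet.for_all
    (function
      | Vote p -> not (MsgSet.mem (Prepare p) omitted)
      | Prepare _ -> true)
    omitted

let rec subsets = function
  | [] -> [MsgSet.empty]
  | m :: rest ->
      let ss = subsets rest in
      ss @ List.map (MsgSet.add m) ss

let () =
  let participants = [1; 2; 3; 4] in
  let msgs = List.concat_map (fun p -> [Prepare p; Vote p]) participants in
  let all = subsets msgs in
  let kept = List.filter causally_distinct all in
  Printf.printf "%d of %d omission schedules remain after causal pruning\n"
    (List.length kept) (List.length all)
```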

Prototype: Filibuster -- a deterministic distributed runtime for the Erlang programming language. Built upon previous work, Partisan, a high-performance distribution layer for Erlang. Record all the messages and then replay using the message trace. Determinism is only at the message level: if I send the same message, the application must behave the same.

Preliminary results: evaluated three well-known atomic commit protocols. Compared to the state-of-the-art lineage-driven fault injection. Slower than SAT-based techniques because we run actual code.

Drawbacks: (a) protocols that flood the network with the same messages, such as gossip protocols or Demerara anti-entropy; (b) the memory requirements for larger executions (30+ messages) become very high due to schedule explosion; (c) relies on user-defined refinements.

Anil Madhavapeddy

TODO notes

Aggelos Biboudis

TODO notes

Suresh Jagannathan

  • Specifications should be sufficient to derive invariants. We are thinking at a level of sophistication which is different from the level a developer is working at (e.g. complex loop invariants with state machines that are attached to functions as pre or post conditions).
  • Then we have programs with machine learning components. This generates a non-linear, non-convex neural network that's very different from the sorts of things we normally specify. But there clearly are reasonable post-conditions ("don't crash that self-driving car into a tree") that should be mappable back to software.
  • There is a common thread through this software: there are program invariants that we can discover.
  • Core idea: use the verifier as a source of data to draw from all parts of the system being examined. Then use examples of valid programs to construct counterexamples to refine the spec.
  • Data-driven invariant inference (PLDI 18, A Data Driven CHC Solver) describes a system that can figure out loop invariants from a few examples of data. The intuition is that we know how to differentiate between positive and negative samples without knowing more about the underlying structure, so can we use this to derive formal specs.
  • Use polyhedra to build a hypothesis domain -- a machine learning technique for inferring invariants of functions. The first challenge we must overcome is that we need nonlinear classifiers for non-trivial problem domains. The first approach to solving this is to break it down into a smaller set of linear problems (using linear classification).
  • Linear classification is used because of its ability to infer high-quality classifiers even from data that isn't linearly separable. To be safe, the linear classification is called recursively until all the samples are cleanly separated.
  • Can we generalise the learnt invariant solely using the data from which the linear classifiers are produced? This is so we can throw the invariants at Z3, which will fall over if the learned classifiers are too complex. The machine learning community optimises techniques like decision tree learning to come up with the smallest explanation for a condition, so we apply this to separate our learned invariants into the smallest set.
  • Intuitively, we want to find an inductive invariant that holds over arbitrary executions of a loop. We start from a bad state that we know about (let's assume we know there exists a bad state), and we are looking for an invariant that keeps us in good states only. An example we pick might put us into the bad state, or it might put us in a region we already know about. We give Z3 a particular region and ask for a transition from that region to the bad state; if one exists, we iteratively weaken the invariant until it can't (see the sketch after this list).
  • To automate this, the system takes C code together with a learner that reasons symbolically. It held up well against many other existing systems (see paper).
  • Machine learning is synergistic with many of our verification goals.
  • SynthML: can we also apply this to neural network models to verify that they are sound? We don't want to open up the network to learn it; we want to treat it as a black box and verify it.
  • Example of an inverted pendulum. If we have a state transition function, then can use that as a spec. This function is not directly used since the neural network is faster, but the state transition function can remain a useful spec. This is mainly true for cyber physical systems but not necessarily all uses of neural nets.
  • Build a network and a problem, and generate a program using program synthesis and verify it using off-the-shelf techniques. This lets us approximate a neural policy as a program. As a result we don't have to replace the neural network, but we can still make asynchronous course corrections. We never run the program and the network together; the program is just-in-time executed when there is an issue with the neural net.
  • Can overapproximate -- whenever we get anywhere close to a bad state, we run the synthesised program for the bad case. Counterexample Guided Inductive Synthesis (CEGIS).
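
Below is a schematic sketch of the counterexample-guided loop described in the bullets above; the learner and checker here are toy stand-ins (the real system uses linear classification over polyhedra and Z3 queries), so treat it as an illustration rather than the actual algorithm.

```ocaml
type sample = int list                  (* a program state as variable values *)
type candidate = sample -> bool         (* a learned classifier / invariant *)

type check_result =
  | Valid                               (* the invariant is accepted *)
  | Counterexample of bool * sample     (* (should_be_inside, state) to learn from *)

(* Learn a candidate from positive/negative samples, ask the checker for a
   counterexample, add it to the samples, and repeat until accepted. *)
let rec cegis ~learn ~check pos neg : candidate =
  let inv = learn pos neg in
  match check inv with
  | Valid -> inv
  | Counterexample (true, s) -> cegis ~learn ~check (s :: pos) neg
  | Counterexample (false, s) -> cegis ~learn ~check pos (s :: neg)

(* Toy instantiation: learn a lower bound "x >= c" for a variable that starts
   at 0 and only increases; the known bad state is x = -1. *)
let () =
  let learn pos _neg =
    let c = List.fold_left (fun acc s -> min acc (List.hd s)) max_int pos in
    fun s -> List.hd s >= c
  in
  let check inv =
    if not (inv [0]) then Counterexample (true, [0])      (* initial state must be inside *)
    else if inv [-1] then Counterexample (false, [-1])    (* bad state must be outside *)
    else Valid
  in
  let inv = cegis ~learn ~check [[5]] [] in
  Printf.printf "inv [0] = %b, inv [-1] = %b\n" (inv [0]) (inv [-1])
```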