Safe streaming with linear types
This is a fork of the original streaming library, modified for increased type safety through utilisation of linear types! I have been working on this in tight collaboration with software innovation lab Tweag I/O over the summer, and have been able to do this thanks to the Summer of Haskell stipend. I have also documented interesting happenings throughout the project in a blog, check it (and corresponding Reddit discussions) out!
This project is made possible by a new linear types extension to GHC that is currently in development by Tweag I/O with friends. For the curious, it can be found here. I have written instructions in the README of that branch on how you can get this running (today!) either by compiling this GHC version yourself, or using a precompiled build from a Docker image which integrates nicely with Stack. It also documents how you can get started with writing linearly typed programs, debugging tips, and some of the peculiarities that we have found so far. Remember this is still work in progress so some things are a bit weird, but it already works pretty well. This extension is required to build this project, since it uses novel type system features not present in vanilla GHC.
What's wrong with
Well, these streams are effectful, meaning that they can perform arbitrary monadic actions to conjure their elements. When you ask a stream for its next value, it will return you this together with a reference to the rest of the stream. The old stream reference (that you just used) should now be considered dead; only the new one is to be used. This is because if you again pull from the first stream reference, it will repeat whatever monadic action that was performed to produce that last element! This is capable of resulting both in arbitrarily weird and unexpected results, as well as runtime errors. In my project blog I have documented this issue in better detail.
In adition to this, when we stream from a file handle, something requiring a lock, or any similar "borrow-and-return"-abstraction, it would be nice to have the type system help you with this responsibility of cleanup. This would mean that errors of the classical use-after-free and double-free kind would not even compile.
As you might have figured out, these are both problems of linearity, and can be solved with linear types. A value of linear type has to be used exactly once. So for the first problem, if we return linear streams, the stream variables can't be re-used. For the other, we can stick a function that closes your filehandle (or whatever you need to do) as a linear value at the end of the stream. Now the program will not compile if that is not called in your code. Both of these solutions give you compile time guarantees of things that you before had to keep track of yourself, which is super neat!
So what am I doing?
The main point of this project is to see if I can use the linear type system
to solve these problems with
streaming, and how it affects the semantics of
the library. The internals
contains the functor-general library "backend", and the
consists of more specialised functions that mimics the behaviour of the
API as well as other common streaming models like
List of interesting parts to check out
If you aren't in the mood to dig through 4K SLOC and guess where I have hid other interesting things, here is a list of noteworthy stuff that has come from this project:
A linear State monad which can be parametrised with mixed-linearity state types: well documented code
Discovery of the steep price we had to pay in
streamingto solve the repeatable effects issue: blog post
A summary of the discovered usability issues with linear Haskell: blog post
A less-than-extensive test suite that at least provides a light demo on the linear state monad and a safe implementation of the main issues.
This is not my work, but the underlying paper for the linear types extension to GHC is an interesting read.