Vinyl is now in beta. Feedback is welcome!
Inspired by SQL, built for Java. Vinyl extends Java Streams with relational operations, based around its central Record type. It aims to integrate smoothly and minimally on top of existing Streams, while staying efficient, safe, and easy to use.
Vinyl requires Java 9 or later.
Vinyl's Maven group ID is io.avery, and its artifact ID is vinyl.
To add a dependency on Vinyl using Maven, use the following:
<dependency>
    <groupId>io.avery</groupId>
    <artifactId>vinyl</artifactId>
    <version>0.1</version>
</dependency>
The package documentation gives a couple of examples and introduces core concepts.
To make use of Vinyl, we first need to declare some "fields" that we will use later in relational operations.
Field<Integer> number = new Field<>("number");
Field<Integer> times2 = new Field<>("times2");
Field<Integer> square = new Field<>("square");
Notice that we give each field a name and a type argument. The name is the field's toString() representation. The type argument says what type of values a "record" can associate with this field.
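To see why a type argument on a key is useful, here is a minimal plain-Java sketch of a typed-key lookup. It is not Vinyl's implementation: the names TypedKeySketch, Key, and Row are hypothetical and exist only for this illustration. It shows how a generic key lets the compiler infer the value type on both writes and reads.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: Key and Row are hypothetical, not part of Vinyl.
class TypedKeySketch {
    // A key that carries a name (used for toString) and a value type T.
    static final class Key<T> {
        private final String name;

        Key(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    // A container that maps each key to a value of that key's type.
    static final class Row {
        private final Map<Key<?>, Object> values = new HashMap<>();

        <T> Row put(Key<T> key, T value) {
            values.put(key, value);
            return this;
        }

        // The cast is safe because put() only accepts a T for a Key<T>.
        @SuppressWarnings("unchecked")
        <T> T get(Key<T> key) {
            return (T) values.get(key);
        }
    }

    public static void main(String[] args) {
        Key<Integer> number = new Key<>("number");
        Key<String> label = new Key<>("label");

        Row row = new Row().put(number, 7).put(label, "seven");

        // No casts at the call site: the key's type argument decides the result type.
        int n = row.get(number);
        String s = row.get(label);
        System.out.println(number + " = " + n + ", " + label + " = " + s);
    }
}
```

With this shape, a call such as row.put(number, "seven") would be rejected at compile time, which is the kind of safety a typed field can provide.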
Now we can start writing streams. For a simple example, we'll enrich a sequence of numbers:
RecordStream numbers = RecordStream.aux(IntStream.range(0, 10).boxed())
    .mapToRecord(into -> into
        .field(number, i -> i)
        .field(times2, i -> i + i)
        .field(square, i -> i * i)
    );
We first wrap a normal stream in a RecordStream.Aux, an "auxiliary" stream that extends Stream with the mapToRecord() method. With mapToRecord(), we describe how to create each field on an outgoing record from an incoming element. The API ensures that all outgoing records share the same set of fields, or "header". This is a key aspect of a RecordStream.
At this point, the data conceptually looks like:
number | times2 | square |
0 | 0 | 0 |
1 | 2 | 1 |
2 | 4 | 4 |
3 | 6 | 9 |
4 | 8 | 16 |
5 | 10 | 25 |
6 | 12 | 36 |
7 | 14 | 49 |
8 | 16 | 64 |
9 | 18 | 81 |
Let's try this one more time. This time, instead of generating numbers, we'll convert some data we already have, a list of Child objects, into a record stream.
Field<String> firstName = new Field<>("firstName");
Field<String> lastName = new Field<>("lastName");
Field<Integer> favoriteNumber = new Field<>("favoriteNumber");
RecordStream favoriteNumbers = RecordStream.aux(children.stream())
    .mapToRecord(into -> into
        .field(firstName, Child::getFirstName)
        .field(lastName, Child::getLastName)
        .field(favoriteNumber, Child::getFavoriteNumber)
    );
This data conceptually looks like:
firstName | lastName | favoriteNumber |
Amelia | Rose | 7 |
James | Johnson | 7 |
Maria | Cabrero | 4 |
Lisa | Woods | 100 |
Marc | Vincent | 3 |
Tyler | Laine | 9 |
Olivia | Pineau | 2 |
Sunder | Suresh | 22 |
Megan | Alis | 4 |
If we wanted, we could relate the children's favorite numbers to more information about those numbers by joining our two data sets together:
RecordStream joinedNumbers = favoriteNumbers
    .leftJoin(numbers,
        on -> on.match((left, right) -> Objects.equals(left.get(favoriteNumber), right.get(number))),
        select -> select
            .leftAllFields()
            .rightAllFieldsExcept(number)
    );
We left-join the favoriteNumbers to the numbers, providing a join condition that matches when the left-side record's favoriteNumber is equal to the right-side record's number. For our outgoing records, we select all fields from the left-side record and all fields from the right-side record (excluding number, since it would be redundant with favoriteNumber). The resulting data conceptually looks like:
firstName | lastName | favoriteNumber | times2 | square |
Amelia | Rose | 7 | 14 | 49 |
James | Johnson | 7 | 14 | 49 |
Maria | Cabrero | 4 | 8 | 16 |
Lisa | Woods | 100 | null | null |
Marc | Vincent | 3 | 6 | 9 |
Tyler | Laine | 9 | 18 | 81 |
Olivia | Pineau | 2 | 4 | 4 |
Sunder | Suresh | 22 | null | null |
Megan | Alis | 4 | 8 | 16 |
While this join yields the results we expect, it is not efficient for larger input data. The problem is that the match() lambda is opaque to Vinyl, so there is not enough information for Vinyl to optimize the join. When the join is evaluated, for each left-side record, we will loop over the whole right side searching for records that match: a nested loop. We could provide more information by writing the join condition differently:
RecordStream joinedNumbers = favoriteNumbers
    .leftJoin(numbers,
        on -> on.eq(on.left(favoriteNumber), on.right(number)),
        select -> select
            .leftAllFields()
            .rightAllFieldsExcept(number)
    );
Now, Vinyl knows we are doing an equality test between the left and right sides. When the join is evaluated, we will first index the right side, grouping records by their number value. Then, for each left-side record, we will look up its favoriteNumber value in the index, quickly finding all right-side records that match.
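The difference between the two evaluation strategies can be sketched in plain Java, without Vinyl. This is only an illustration of the general idea (a nested-loop left join versus an indexed left join); it makes no claim about Vinyl's internals. The class JoinSketch, the NumberInfo type, and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Illustration only: none of these names come from Vinyl.
class JoinSketch {
    // Hypothetical stand-in for a right-side record with "number" and "square".
    static final class NumberInfo {
        final int number;
        final int square;

        NumberInfo(int number, int square) {
            this.number = number;
            this.square = square;
        }
    }

    // Nested loop: scans the whole right side once per left-side element.
    static List<String> nestedLoopLeftJoin(List<Integer> left, List<NumberInfo> right) {
        List<String> out = new ArrayList<>();
        for (Integer fav : left) {
            boolean matched = false;
            for (NumberInfo info : right) {
                if (Objects.equals(fav, info.number)) {
                    out.add(fav + ":" + info.square);
                    matched = true;
                }
            }
            if (!matched) {
                out.add(fav + ":null"); // left join keeps unmatched left rows
            }
        }
        return out;
    }

    // Indexed: groups the right side by number once, then does one lookup per left element.
    static List<String> indexedLeftJoin(List<Integer> left, List<NumberInfo> right) {
        Map<Integer, List<NumberInfo>> index = new HashMap<>();
        for (NumberInfo info : right) {
            index.computeIfAbsent(info.number, k -> new ArrayList<>()).add(info);
        }
        List<String> out = new ArrayList<>();
        for (Integer fav : left) {
            List<NumberInfo> matches = index.get(fav);
            if (matches == null) {
                out.add(fav + ":null"); // left join keeps unmatched left rows
            } else {
                for (NumberInfo info : matches) {
                    out.add(fav + ":" + info.square);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<NumberInfo> right = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            right.add(new NumberInfo(i, i * i));
        }
        List<Integer> left = List.of(7, 7, 4, 100, 3);
        System.out.println(nestedLoopLeftJoin(left, right));
        System.out.println(indexedLeftJoin(left, right));
    }
}
```

Both methods produce the same rows. The nested loop does work proportional to the product of the two input sizes, while the indexed version does roughly one pass over each side, which is why exposing the equality condition matters for larger inputs.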
Since a RecordStream is a Stream, we can still use any of the usual stream operations:
int sumOfMSquares = joinedNumbers
    .filter(record -> record.get(firstName).startsWith("M"))
    .mapToInt(record -> record.get(square))
    .sum();
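In the example above, the firstName filter happens to exclude the two unmatched rows (Lisa and Sunder), whose square is null in the joined table. Unboxing a null Integer to int throws a NullPointerException, so a sum over all rows would need a null check first. The following plain-Java sketch shows that pattern on the square column alone; the class and method names are hypothetical and it does not use Vinyl.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Illustration only: a null-safe sum over a column that may contain nulls.
class NullSafeSum {
    // Drops nulls before unboxing, so mapToInt never sees a null Integer.
    static int sumNonNull(List<Integer> values) {
        return values.stream()
            .filter(Objects::nonNull)
            .mapToInt(Integer::intValue)
            .sum();
    }

    public static void main(String[] args) {
        // The "square" column of the joined table, including the two unmatched rows.
        List<Integer> squares = Arrays.asList(49, 49, 16, null, 9, 81, 4, null, 16);
        System.out.println(sumNonNull(squares)); // prints 224
    }
}
```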
Or, if we need to use the same records again, we may store them in a RecordSet:
RecordSet data = joinedNumbers.toRecordSet();
Like a RecordStream, a RecordSet has a single header shared by all its records. This means we can easily get back to a RecordStream from a RecordSet:
RecordStream reStream = data.stream();
Here is our full example again, with streams inlined:
Field<Integer> number = new Field<>("number");
Field<Integer> times2 = new Field<>("times2");
Field<Integer> square = new Field<>("square");
Field<String> firstName = new Field<>("firstName");
Field<String> lastName = new Field<>("lastName");
Field<Integer> favoriteNumber = new Field<>("favoriteNumber");
RecordSet data = RecordStream.aux(children.stream())
    .mapToRecord(into -> into
        .field(firstName, Child::getFirstName)
        .field(lastName, Child::getLastName)
        .field(favoriteNumber, Child::getFavoriteNumber)
    )
    .leftJoin(RecordStream.aux(IntStream.range(0, 10).boxed())
            .mapToRecord(into -> into
                .field(number, i -> i)
                .field(times2, i -> i + i)
                .field(square, i -> i * i)
            ),
        on -> on.eq(on.left(favoriteNumber), on.right(number)),
        select -> select
            .leftAllFields()
            .rightAllFieldsExcept(number)
    )
    .toRecordSet();