New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fast relation traversals for rule materialisation #6382
Fast relation traversals for rule materialisation #6382
Conversation
graph/edge/impl/ThingEdgeImpl.java
Outdated
@@ -220,6 +222,87 @@ public final int hashCode() { | |||
} | |||
} | |||
|
|||
public static class Virtual extends ThingEdgeImpl implements ThingEdge { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we create a virtual edge that may not exist when we want to use a seek
operation on an edge iterator! Pretty interesting concept!
import static com.vaticle.typedb.core.common.exception.ErrorMessage.Internal.ILLEGAL_ARGUMENT; | ||
import static com.vaticle.typedb.core.common.exception.ErrorMessage.Internal.ILLEGAL_STATE; | ||
|
||
public class FlatMergeSortedIterator<T, U extends Comparable<? super U>> extends AbstractFunctionalIterator.Sorted<U> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this iterator merges N sorted iterators into another sorted iterator using a Min-Heap data structure
@@ -203,7 +206,7 @@ public ThingVertex convertToReadable(VertexIID.Thing iid) { | |||
VertexIID.Thing iid = generate(keyGenerator, typeVertex.iid(), typeVertex.properLabel()); | |||
ThingVertex.Write vertex = new ThingVertexImpl.Write.Buffered(this, iid, isInferred); | |||
thingsByIID.put(iid, vertex); | |||
thingsByTypeIID.computeIfAbsent(typeVertex.iid(), t -> new ConcurrentSet<>()).add(vertex); | |||
thingsByTypeIID.computeIfAbsent(typeVertex.iid(), t -> new ConcurrentSkipListSet<>()).add(vertex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to support using sorted iterators, we must use natively navigable data structures - in this case we use a ConcurrentSkipList to read a sorted iterator from
graph/TypeGraph.java
Outdated
@@ -406,8 +406,8 @@ public References references() { | |||
|
|||
public FunctionalIterator<RuleStructure> all() { | |||
Encoding.Prefix index = IndexIID.Rule.prefix(); | |||
FunctionalIterator<RuleStructure> persistedRules = storage.iterate(index.bytes(), (key, value) -> | |||
convert(StructureIID.Rule.of(value))); | |||
FunctionalIterator<RuleStructure> persistedRules = storage.iterate(index.bytes(), Storage.SortedPair::new) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't want necessarily make StructureIID
implements ByteComparable
, so have a sortable pair that we can then use a normal map
operation over it, discarding the sortedness property we don't care about in this case
graph/adjacency/ThingAdjacency.java
Outdated
default ThingIteratorSortedBuilder edge(Encoding.Edge.Thing encoding, IID... lookAhead) { | ||
if (encoding == Encoding.Edge.Thing.HAS) return edgeHas(lookAhead); | ||
else if (encoding == Encoding.Edge.Thing.PLAYING) return edgeHas(lookAhead); | ||
else if (encoding == Encoding.Edge.Thing.RELATING) return edgeHas(lookAhead); | ||
else if (encoding == Encoding.Edge.Thing.ROLEPLAYER) { | ||
if (lookAhead.length > 0) return edgeRolePlayer(lookAhead[0], Arrays.copyOfRange(lookAhead, 1, lookAhead.length)); | ||
else throw TypeDBException.of(ILLEGAL_OPERATION); | ||
} | ||
else throw TypeDBException.of(ILLEGAL_STATE); | ||
} | ||
ThingIteratorSortedBuilder edgeHas(IID... lookAhead); | ||
ThingIteratorSortedBuilder edgePlaying(IID... lookAhead); | ||
ThingIteratorSortedBuilder edgeRelating(IID... lookAhead); | ||
ThingIteratorSortedBuilder edgeRolePlayer(IID roleType, IID... lookAhead); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only under certain circumstances can we provide a sorted edge iterator - specifically any has
/relating
/playing
edge iterator, or a roleplayer
edge when also given a roleType
lookahead. We enforce this via the function signature
graph/adjacency/ThingAdjacency.java
Outdated
abstract class EdgeDirected implements Comparable<EdgeDirected> { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a new construct: we wrap an Edge which is undirected but therefore not sortable -- if it is an In
adjacency, it may be sortable but in an Out
. In other words, we can only sort in the direction of the adjacency. This wrapper enables us to return sorted edge iterators
@Override | ||
public int compareTo(ThingVertex o) { | ||
return iid.bytes().compareTo(o.iid().bytes()); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thing vertices, which we want to sort, and directed edges are all sortable and therefore require compareTo
rocks/RocksIterator.java
Outdated
public void seek(T target) { | ||
if (state == State.INIT) initialise(target.getBytes()); | ||
else internalRocksIterator.seek(target.getBytes().getArray()); | ||
state = State.SEEKED_EMPTY; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because we seek but don't want to immediately check if there's a valid next, we introduce a new state SEEKED_EMPTY
which will just check the current key without calling next()
(which progresses the rocksDB iterator)
import static com.vaticle.typedb.core.common.iterator.Iterators.iterate; | ||
import static java.util.stream.Collectors.toMap; | ||
|
||
public class RelationIterator extends AbstractFunctionalIterator<VertexMap> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the new Fast relation lookup algorithm. Did my best to make it as simple as possible, but still good to go through I think!
3cd9da0
to
e992496
Compare
@@ -30,6 +30,7 @@ java_library( | |||
|
|||
# External Maven Dependencies | |||
"@maven//:com_github_ben_manes_caffeine_caffeine", | |||
"@maven//:com_google_guava_guava", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we use guava for lexicographical byte array comparison -- it contains an optimised implementation that is OS dependent
common/collection/ByteArray.java
Outdated
public ByteArray getBytes() { | ||
return this; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ByteComparable
classes must implement getBytes()
which pushed the old getBytes()
to become getArray()
common/collection/Bytes.java
Outdated
public interface ByteComparable<T extends ByteComparable<T>> extends Comparable<T> { | ||
|
||
ByteArray getBytes(); | ||
|
||
@Override | ||
default int compareTo(T o) { | ||
return getBytes().compareTo(o.getBytes()); | ||
} | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this interface ensures that anything that is byte comparable is compared the exact same way, using our lexicographical ordering (which matches RocksDB's comparator exactly)
common/iterator/BaseIterator.java
Outdated
private T next; | ||
private T last; | ||
|
||
public Sorted(NavigableSet<T> source) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the "basis" of a Sorted
functional iterator is a NavigableSet
, which can critically be used to implement seek()
graph/iid/EdgeIID.java
Outdated
@@ -113,7 +113,7 @@ public static Type of(VertexIID.Type start, Encoding.Infix infix, VertexIID.Type | |||
} | |||
} | |||
|
|||
public static class Thing extends EdgeIID<Encoding.Edge.Thing, InfixIID.Thing, VertexIID.Thing, VertexIID.Thing> { | |||
public static class Thing extends EdgeIID<Encoding.Edge.Thing, InfixIID.Thing, VertexIID.Thing, VertexIID.Thing> implements Comparable<Thing> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
anything we want to use in sortable iterators must be comparable
@@ -570,7 +572,7 @@ public Relation asRelation() { | |||
playersWithIIDs.add(playerId); | |||
} | |||
}); | |||
return traversalEng.iterator(traversal).map(conceptMgr::conceptMap) | |||
return traversalEng.relations(traversal).map(conceptMgr::conceptMap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
our new type of traversal for fast relation lookups is on traversalEng.relations()
!
rocks/RocksIterator.java
Outdated
case SEEKED_EMPTY: | ||
return checkValidNext(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after we have done a forward()
, we want to lazily check whether there is a valid next element without actually calling next()
as this moves the rocks iterator forward again. To introduce this we create a new state SEEKED_EMPTY
which does half the work of EMPTY
, just checking iterator validity without progressing the rocks iterator to the next element
rocks/RocksStorage.java
Outdated
@@ -185,7 +186,7 @@ public ByteArray get(ByteArray key) { | |||
} | |||
|
|||
@Override | |||
public <G> FunctionalIterator<G> iterate(ByteArray key, BiFunction<ByteArray, ByteArray, G> constructor) { | |||
public <G extends Bytes.ByteComparable<G>> FunctionalIterator.Sorted<G> iterate(ByteArray key, BiFunction<ByteArray, ByteArray, G> constructor) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now anything we iterate from storage must be ByteComparable
. We can mapSorted
it later to convert it to other types as needed, all of which must be comparable as well
traversal/TraversalEngine.java
Outdated
public FunctionalIterator<VertexMap> relations(Traversal traversal) { | ||
return traversal.relations(graphMgr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new fast relation lookup!
graph/ThingGraph.java
Outdated
FunctionalIterator.Sorted<ThingVertex> storageIterator = storage.iterate(prefix) | ||
.mapSorted( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be one line
public final FunctionalIterator.Sorted.Forwardable<T> merge(FunctionalIterator.Sorted.Forwardable<T>... iterators) { | ||
return Iterators.Sorted.merge(this, iterators); | ||
} | ||
|
||
@Override | ||
public <U extends Comparable<? super U>> FunctionalIterator.Sorted.Forwardable<U> mapSorted(Function<T, U> mappingFn, Function<U, T> reverseMappingFn) { | ||
return Iterators.Sorted.mapSorted(this, mappingFn, reverseMappingFn); | ||
} | ||
|
||
@Override | ||
public FunctionalIterator.Sorted.Forwardable<T> distinct() { | ||
return Iterators.Sorted.distinct(this); | ||
} | ||
|
||
@Override | ||
public FunctionalIterator.Sorted.Forwardable<T> filter(Predicate<T> predicate) { | ||
return Iterators.Sorted.filter(this, predicate); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have to re-implement these in each Forwardable
class because we can't do multiple inheritance
2b86352
to
304d3de
Compare
expose edge iterator as a sorted edge iterator fix sorted iterator bugs fix merge iterator split ThingAdjacency into sorted and unsorted builders fix arraycopy wip Implement discussed comparators and sorted iterator interface implement BaseIterator.Sorted from NavigableSet rebase master and back to buildable fix sortability of vertex based on edge fix remove() call fix rule retrieval sortedness implement edges sorted correctly ByteArrays use unsigned byte-by-byte ordering byte array comparison use unsigned byte comparisons loading rules obeys sortedness of storage iteration expose EdgeSortable outwardly EdgeSortable implement hashcode and equals clean up TODOs and implement seek() in FlatMerge fix license rebase successful cleanup ByteComparable to be only in minmal places fix sortable type edges fix build make distinct() in ThingGraph.getReadable() efficient fix null ptr cleanup cleanup revert NavigableSet in ThingGraph cleanup Revert "revert NavigableSet in ThingGraph" This reverts commit f9f48d4. parent a35c269 author joshua <joshuasend1@gmail.com> 1625158655 +0100 committer joshua <joshuasend1@gmail.com> 1625675337 +0100 implement fast relation lookup algorithm add missing impls and fix bugs refactor RelationIterator, add assertions use relation type information in lookahead fix filtered iterator fetch cleaning up refactor RocksIterator changes refactor RelationIterator All sorted iterators validate no backward seeks are allowed cleanup
LOG not print
RocksIterator always returns KeyValue fix build error refactor ByteArray method names cleanup comments rename FlatMerge to MergeMap implement reverse mapping assertion and rename to past tense iterator remove newline and add parameter make constructors non-public Add KeyValue equality and hide public constructors of iterators inline inner classes rename edge builders fix adjacency build wip stash decisions
Replace previous usages of Sorted with Sorted.Forwardable Create optimised and non-optimised Thing edge encodings Revert split api of edgeHas/edgePlaying/edgeRelating/edgeRolePlayer cleanup cleanup reorganise RelationIterator
cleanup encoding enum split rename Edge.Thing.Data to Edge.Thing.Base rename ThingEdgeImpl.Virtual to Target remove merge() varargs in Sorted iterator simplify ThingAdjacencyImpl.SortedIteratorBuilder reorder and rename RocksIterator apis cleanup imports cleanup cleanup simplify matchRelation add hasNext catch
WIP fix iterator lost argument
RelationIterator can handle multiple roles and relation types tighten up RelationTraversal API to only allow star shaped relations cleanup redudant variable move some APIs to RelationTraversal Cleaned up implementation of RelationIterator address comment
2b7e74c
to
69f3e07
Compare
## What is the goal of this PR? Using a RocksIterator after it is closed can cause a `SIGSEGV`, which crashes the server. #6382 introduced a new API on RocksIterator that was not protected against `close()` race conditions, which is now corrected. ## What are the changes implemented in this PR? * since `forward()` is synchronized, we only add a check to see if the iterator is completed before performing a seek() operation on the rocks iterator. This prevents utilising a closed rocksDB resouce, which would cause the SIGSEGV
What is the goal of this PR?
We dramatically improve the performance of a rule materialisation by implementing a new traversal type and traversal algorithm for relation lookups (when all role players and roles are known)
What are the changes implemented in this PR?
FunctionalIterator.Sorted<T>
which always returns an element greater (by it's well-defined comparison) than the prior oneDistinct
/Filtered
/Mapped
/FlatMerged
iterators which transform sorted iterators into other sorted iteratorsFunctionalIterator.Sorted
calledForwardable
which implement an efficientforward(T target)
method, which supports forward seeksByteArray
is now sortable based on lexicographical ordering, which is what RocksDB keys use tooThingAdjacency.DirectedEdge
which is a sortable wrapper around undirectedThingEdge
. We then create a new set of APIs for retrieving sorted/forwardable edge iterators fromThingAdjacency
ThingEdge.Target
, which may or may not exist between twoThingVertex
with a given encoding. This is used for seeking a sorted iterator ofThingEdge
//traversal/iterator/RelationIterator
, used in combination with aRelationTraversal
, which only allows connecting a single relation over multiple role-player edges (eg. star-shaped)