Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement edge type indices #1542

Merged
merged 45 commits into from
Mar 8, 2024
Merged

Implement edge type indices #1542

merged 45 commits into from
Mar 8, 2024

Conversation

gvolfing
Copy link
Contributor

@gvolfing gvolfing commented Nov 24, 2023

Currently there are two different kinds of indices in the database indices for label and indices for label+property based look-ups on
vertices. With the edge-type indexing capabilities it should be possible to create and deal with indices on edge types in the same way as with indices on vertex labels and properties. The case for edge-type indexes is, that in the case if the user would like to query all the edges based on a specific edge-type, the created query-plan has to traverse the entire graph, looking for the specific edge-type. The edge-type indexing feature solves this issue, however the flag, --storage-properties-on-edges has to be enabled.

Currently there are two different kinds of indices in the database
indices for label and indiced for label+property based lookups on
vertices. With this feature it should be possible to create and deal
with indices on edge types in the same way as with indices on vertices.
The already existing implementations of indices are good as a start for
indices for edge-types, however they are not mappable one to one. Issue
with edge-type based indices is that it is not always clearly defines
what en edge is. If the flag --storage-properties-enabled-on-edges is
False, then there is no singular datastrucrture that is the collection
of all the edges present in the database. The only way to get the edges
is to gather all the information runtime using the source and
destination vertices of the given edge, and compute everything we need
to be able to construct the EdgeAccessor objects. This might be
computationally to expensive, especially in the presence of supernodes.
If the datastructure that is the collection of all the edges is enabled
by setting the --storage-properties-enabled-on-edges to True, it is
still not trivial to get all the data that is needed to create the
EdgeAccessor objects. Solution for these issues should come in
subsequent commits.
The indexing datastructures that are holding the relevant information
about the edge-type indices are in place. This commit adds the
functionality to be able to CREATE and DROP edge-type indices.
@gvolfing gvolfing added the Docs needed Docs needed label Nov 24, 2023
@gvolfing gvolfing added this to the mg-v2.13.0 milestone Nov 24, 2023
@gvolfing gvolfing self-assigned this Nov 24, 2023
Copy link

sonarcloud bot commented Nov 24, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 6 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@gitbuda gitbuda modified the milestones: mg-v2.13.0, mg-v2.14.0 Dec 2, 2023
Implement the planned side of edge-type indexing by creating a new
operator that represents the edge-type indexing operation itself and
update the rewriter logic to make it understand when the new operator
should replace the scanall + expand operations.
Previously abstract base classes were used to implement the on-disk
related stubs as we do not plan on implementing this functionality for
now for the on-disk storage mode. With this commit the relevant virtual
functions have been made pure virtual and the stuv implementations have
been moved to dedicated subclasses.
The edge-type index creation itself is scanning through the skiplist and
building up the indices based on that, however it does not take into
account those edges that are being created after the index itself is
created. With this commit the index is updated upon adding edges to the
database if needed.
@gitbuda gitbuda modified the milestones: mg-v2.14.0, mg-v2.15.0 Jan 3, 2024
The complexity brought by storage::View::NEW and OLD are making some
edgecases hard to deal with. Add neccessary metadata, needed by the edge
indexing process to work correctly in these edge-cases. Currently that
is an additional pointer to the Edge constructs in the Edges skiplist.
Add more comprehensive tests for the indexing and planning.
This may be a leaksanitizer fluke. If the loop is rewrtitten this way,
the sanitizer stops reporting leaked memory. Pushing it now to see if
that is the case on CI as well. Further investigation is needed for this
case.
@katarinasupe
Copy link
Contributor

Can you link #662 to development if this PR is fixing that issue?

@gvolfing
Copy link
Contributor Author

@katarinasupe
These are two different indexing strategies.
This one is about indexing the edge-types themselves. Not properties on edges. However I can see this one being used to ease the use-case that the user is talking about. Once this is done, I will link this PR in the issue you tagged here, but again, this is not a direct solution to the issue the user presents in #662.

@katarinasupe
Copy link
Contributor

Okay, thanks, good to know!

@katarinasupe
Copy link
Contributor

@gvolfing, I realized that the linked issue automatically closes when the pull request merges, so it's enough that we mentioned it in the comment since it's visible on both PR and issue.

@gvolfing gvolfing marked this pull request as ready for review February 13, 2024 19:06
src/query/frontend/semantic/symbol.hpp Show resolved Hide resolved
src/query/db_accessor.hpp Outdated Show resolved Hide resolved
src/query/plan/planner.hpp Show resolved Hide resolved
src/query/plan/operator.cpp Outdated Show resolved Hide resolved
src/storage/v2/inmemory/edge_type_index.cpp Outdated Show resolved Hide resolved
src/storage/v2/inmemory/edge_type_index.cpp Outdated Show resolved Hide resolved
src/storage/v2/inmemory/edge_type_index.cpp Outdated Show resolved Hide resolved
src/storage/v2/inmemory/storage.cpp Show resolved Hide resolved
src/storage/v2/inmemory/storage.cpp Show resolved Hide resolved
src/storage/v2/inmemory/storage.cpp Outdated Show resolved Hide resolved
src/storage/v2/durability/snapshot.cpp Show resolved Hide resolved
src/storage/v2/durability/snapshot.cpp Show resolved Hide resolved
src/storage/v2/durability/snapshot.cpp Show resolved Hide resolved
src/storage/v2/replication/replication_client.cpp Outdated Show resolved Hide resolved
Copy link
Collaborator

@antoniofilipovic antoniofilipovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work, I left few comments where I think we might have bug

src/query/plan/rewrite/edge_type_index_lookup.hpp Outdated Show resolved Hide resolved
src/storage/v2/vertices_iterable.hpp Outdated Show resolved Hide resolved
src/storage/v2/disk/storage.cpp Show resolved Hide resolved
src/storage/v2/disk/storage.cpp Show resolved Hide resolved
src/storage/v2/inmemory/storage.hpp Outdated Show resolved Hide resolved
src/storage/v2/inmemory/edge_type_index.cpp Outdated Show resolved Hide resolved
src/storage/v2/inmemory/edge_type_index.cpp Outdated Show resolved Hide resolved
src/storage/v2/inmemory/edge_type_index.cpp Outdated Show resolved Hide resolved
src/storage/v2/durability/wal.hpp Show resolved Hide resolved
src/storage/v2/inmemory/edge_type_index.cpp Show resolved Hide resolved
Copy link
Collaborator

@antoniofilipovic antoniofilipovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, all comments addressed, I don't see anything new, good work 💪

@gvolfing
Copy link
Contributor Author

Docs
memgraph/documentation#502

@gvolfing gvolfing modified the milestones: mg-v2.15.0, mg-v2.16.0 Feb 26, 2024
src/storage/v2/inmemory/edge_type_index.cpp Outdated Show resolved Hide resolved
src/storage/v2/inmemory/storage.cpp Outdated Show resolved Hide resolved
@gitbuda gitbuda added the Priority - Now Priority - Now label Feb 28, 2024
Copy link
Contributor

@Darych Darych left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gvolfing gvolfing merged commit 619b01f into master Mar 8, 2024
6 checks passed
@gvolfing gvolfing deleted the Implement-edge-type-indices branch March 8, 2024 07:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs needed Docs needed Priority - Now Priority - Now
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants