Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port table_slice API to FlatBuffers #1140

Merged
merged 60 commits into from Nov 16, 2020

Conversation

dominiklohmann
Copy link
Member

@dominiklohmann dominiklohmann commented Nov 4, 2020

📔 Description

Transform the table_slice API to use FlatBuffers for persistence, enabling versioning of table slice encodings and sharing ownership between segments and contained table slices.

🗺️ Implementation Status

📝 Checklist

  • All user-facing changes have changelog entries.
  • The changes are reflected on docs.tenzir.com/vast, if necessary.
  • The PR description contains instructions for the reviewer, if necessary.

🎯 Review Instructions

Because of the sheer size of this change, this PR is split into multiple PRs that will be merged into this epic branch. After all merge commits are through, this PR will contain fixups only, which are best reviewed commit-by-commit.

The corresponding row_major_matrix_table_slice_builder.hpp file was
removed in an earlier commit, but this file was left in by accident.
This is a first step towards removing the header altogether.
We previously had both `table_slice::column_view` and
`table_slice_column` with slightly different behavior. This removes the
`table_slice::column_view` and makes it behave as expected with the new
API. Additionally this removes the `table_slice::column(...)` functions
from the `table_slice` API, because they are just covered by the
constructors of `table_slice_column`.
This supersedes `table_slice::row_view` and allows for removing
`table_slice::row(...)`, because the function is covered by the
constructor of `table_slice_row`.
It was not even used anywhere—we just forgot to remove it.
Additionally, this sorts the functions in table_slice.hpp and
table_slice.cpp in a consistent way.
The agreed upon design for the new table_slice API has
`table_slice::layout()` return by value. This removes some code that
would then probably not work correctly.
At this point in time this is strictly a pessimization, but we need to
change this down the road and at this point it is easy to detect the
required changes.
The `table_slice::inspect` function must go through the constructor that
handles the instance counting when loading table slices from a
`caf::message`.
The offset being in the `table_slice` encoding-specific FlatBuffers
tables was the only reason why we required support for mutable
FlatBuffers. This commit changes `table_slice` to hold the offset
in-memory. The offset is now assigned by the importer on import and by
the segment on lookup.
The FlatBuffers API guarantees that the encoding pointers are not
`nullptr`, so we might as well dereference them in the visitor function
already.
@dominiklohmann dominiklohmann force-pushed the epic/flatbuffers-persistence-table-slices branch from 4e48f8e to f3fc26b Compare November 13, 2020 21:07
This was referenced Nov 15, 2020
Copy link
Member

@tobim tobim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this effort, it was certainly not an easy endeavor. The explicit layout will make many things easier in the future!

@dominiklohmann dominiklohmann merged commit 5f4e072 into master Nov 16, 2020
@dominiklohmann dominiklohmann deleted the epic/flatbuffers-persistence-table-slices branch November 16, 2020 07:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New functionality performance Improvements or regressions of performance refactoring Restructuring of existing code
Projects
None yet
2 participants