Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

utils: Move cstore column and frame implementation to utils #15889

Merged
merged 3 commits into from
Jan 4, 2024

Conversation

Lazin
Copy link
Contributor

@Lazin Lazin commented Dec 22, 2023

This PR simply moving the code from cloud_storage/segment_meta_cstore.h to the utils/delta_for.h so it could be reused more easily.

The tests/benchmarks are also separated and moved to the new location in utils.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x
  • v23.1.x

Release Notes

  • none

dotnwat
dotnwat previously approved these changes Dec 22, 2023
Comment on lines 842 to 843
deltafor_column<value_t, details::delta_xor, max_frame_size>,
max_frame_size>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wonder if this duplication of max_frame_size points at a slightly different decomposition of the types being possible?

for example, would it make logical sense for the deltafor_column to expose its configured max size as a constexpr field and then the column_impl would use that?

i'm assuming that it would be problematic for the two max sizes to be different?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's problematic because the value is pushed from top to bottom during instantiation. The deltafor_column instantiates deltafor_column_impl which actually uses the value. The stuff in the column_store doesn't really care about the value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the actual impl of append() is in the _impl class, I don't think making a constexpr field will change much, since it still needs to get passed to the _impl. I dont feel strongly about it at all, but perhaps an alternative is something like:

  • deltafor column: owns the _frames, is also an appender and a seeker, and instantiates them to point at _frames
  • frame appender: references the frames, appends to the column, templatized on max_frame_size, uses it to roll new frames, identical impl regardless of algo
  • frame seeker: references the frames, templatized on algo, seeks based on which algo is templatized

Though I'm not convinced further decomposition adds much beyond maybe it clarifies the roll of the frame size, and solidifies the difference the algos' different seek implementations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want to change the logic much. This PR only moves things around. Changes like this can be done in a followup.

@vbotbuildovich
Copy link
Collaborator

@dotnwat
Copy link
Member

dotnwat commented Dec 23, 2023

what changed?

@Lazin
Copy link
Contributor Author

Lazin commented Dec 23, 2023

what changed?

I forgot to git add the delt_for_bench.cc in first push.

Copy link
Contributor

@andrwng andrwng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple nits and some clarifying questions, but mostly lgtm

@@ -865,15 +862,15 @@ class segment_meta_column<value_t, details::delta_xor>
/// in the column. The actual decoding is only performed for a single frame.
/// The access by index is also fast (same order of magnitued as search).
template<class value_t>
class segment_meta_column<value_t, details::delta_delta<value_t>>
: public segment_meta_column_impl<
class deltafor_column<value_t, details::delta_delta<value_t>>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: comment here needs an update too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

class segment_meta_column<value_t, details::delta_delta<value_t>>
: public segment_meta_column_impl<
class deltafor_column<value_t, details::delta_delta<value_t>>
: public deltafor_column_impl<
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

column for monotonic sequences.

Curious, what guarantee or failure modes exist with regards to monotonicity of appends? If the caller doesn't append in order, what breaks? If it's unexpected usage, it's probably worth adding a warning in the description, maybe even an assert?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that currently it will trigger an assertion

value_t,
details::delta_xor{},
segment_meta_column<value_t, details::delta_xor>>;
deltafor_column<value_t, details::delta_xor>>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since we're trying to make this a bit more generic, perhaps it's worth calling out the expected limitations that non-segment-cstore users may need to consider, like the fact that pred_search will return the first frame that matches the predicate.

Comment on lines 842 to 843
deltafor_column<value_t, details::delta_xor, max_frame_size>,
max_frame_size>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the actual impl of append() is in the _impl class, I don't think making a constexpr field will change much, since it still needs to get passed to the _impl. I dont feel strongly about it at all, but perhaps an alternative is something like:

  • deltafor column: owns the _frames, is also an appender and a seeker, and instantiates them to point at _frames
  • frame appender: references the frames, appends to the column, templatized on max_frame_size, uses it to roll new frames, identical impl regardless of algo
  • frame seeker: references the frames, templatized on algo, seeks based on which algo is templatized

Though I'm not convinced further decomposition adds much beyond maybe it clarifies the roll of the frame size, and solidifies the difference the algos' different seek implementations.

@Lazin Lazin force-pushed the pr/move-cstore-column-and-frame-impl branch from ef4159d to 1ddfc51 Compare January 3, 2024 14:57
@Lazin Lazin requested a review from andrwng January 3, 2024 14:59
@Lazin Lazin merged commit a06ea91 into redpanda-data:dev Jan 4, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants