Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(storage): record first_key in block_index_builder #598

Merged
merged 3 commits into from
Apr 9, 2022

Conversation

adlternative
Copy link
Contributor

This patch want to complete #589 first step, which record first_key in order to support seek.

Signed-off-by: ZheNing Hu adlternative@gmail.com

Copy link
Member

@skyzh skyzh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, first_key is only useful when the column is fully sorted. So we can have a parameter on column builder whether to store the first_key or not.

@@ -83,4 +87,8 @@ impl BlockIndexBuilder {
pub fn into_index(self) -> Vec<BlockIndex> {
self.indexes
}

pub fn set_first_key(&mut self, first_key: Vec<u8>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer having the signature as

pub fn set_first_key(&mut self, first_key: &[u8]) {
  self.first_key.clear();
  self.first_key.extend(first_key);
}

So as to reduce allocation as much as possible.

By the way, it seems that first_key will only be used on finish_block. Is it possible to make it a parameter of finish_block instead of having a separate function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, first_key is only useful when the column is fully sorted. So we can have a parameter on column builder whether to store the first_key or not.

Agree. But I am curious about where can we know that column need sort or not? Maybe pass from ColumnBuilderOptions later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer having the signature as

pub fn set_first_key(&mut self, first_key: &[u8]) {
  self.first_key.clear();
  self.first_key.extend(first_key);
}

So as to reduce allocation as much as possible.

By the way, it seems that first_key will only be used on finish_block. Is it possible to make it a parameter of finish_block instead of having a separate function?

This means that we need to modify PrimitiveColumnBuilder<T>::finish_builder() interface, that's ok...

@@ -121,7 +121,7 @@ pub fn append_one_by_one<'a, A: Array>(
impl<T: PrimitiveFixedWidthEncode> ColumnBuilder<T::ArrayType> for PrimitiveColumnBuilder<T> {
fn append(&mut self, array: &T::ArrayType) {
let mut iter = array.iter().peekable();

let mut first_flag = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first_flag should be an inner state of ColumnBuilder instead of a local variable. One append doesn't necessarily produce a block.

I think a better way is to store the first_key in the struct on switching builder (when self.current_builder.is_none())?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first_flag should be an inner state of ColumnBuilder instead of a local variable. One append doesn't necessarily produce a block.

I think a better way is to store the first_key in the struct on switching builder (when self.current_builder.is_none())?

Yes, the condition self.current_builder.is_none() can be reused.

@adlternative
Copy link
Contributor Author

By the way, should we apply this feature to CharColumnBuilder or BlobColumnBuilder, they have similar interface...

@skyzh
Copy link
Member

skyzh commented Apr 4, 2022

By the way, should we apply this feature to CharColumnBuilder or BlobColumnBuilder, they have similar interface...

Yes, indeed. But we can work on primitive types first, and add support for other columns later.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
@adlternative
Copy link
Contributor Author

adlternative commented Apr 6, 2022

Hi, now we can use a new option record_first_key to control if record first key of the block, is this change reasonable?
I am not sure about if named it as is_sorted or something else will be better?
@skyzh

@skyzh
Copy link
Member

skyzh commented Apr 6, 2022

Hi, now we can use a new option record_first_key to control if record first key of the block, is this change reasonable?
I am not sure about if named it as is_sorted or something else will be better?

I think record_first_key is clear enough! I'll take a look at this PR soon. Thanks!

Copy link
Member

@skyzh skyzh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM, but there're still some misc fixes to do. Thanks for your contribution!

) {
self.indexes.push(BlockIndex {
offset: column_data.len() as u64,
length: block_data.len() as u64 + BLOCK_HEADER_SIZE as u64,
first_rowid: self.last_row_count as u32,
row_count: (self.row_count - self.last_row_count) as u32,
/// TODO(chi): support sort key
first_key: "".into(),
first_key: first_key.drain(..).collect(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to get a full ownership of first_key in this function. Otherwise this interface would look confusing 🤣

&mut self.data,
&mut block_data,
stats,
&mut self.first_key,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's drain here: self.first_key.drain(..).collect_vec()

@@ -80,6 +89,11 @@ impl ColumnBuilder<BlobArray> for BlobColumnBuilder {
PlainBlobBlockBuilder::new(self.options.target_block_size - 16),
));
}
if let Some(to_be_appended) = iter.peek() {
if self.options.record_first_key && to_be_appended.is_some() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None is also a valid first_key. I think we can decide on encoding of None keys.

Maybe we can use \0 to represent null keys, and \1 + BlobEncode to represent non-null keys?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also for the encoding of the key, we can use utilities provided in https://github.com/risinglightdb/risinglight/blob/main/src/storage/secondary/encode.rs

For blob types, simply call to_be_appended.to_byte_slice().to_vec(), and store it in first_key.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also assert!(self.first_key.is_empty()) here, for an extra layer of sanity check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None is also a valid first_key. I think we can decide on encoding of None keys.

Maybe we can use \0 to represent null keys, and \1 + BlobEncode to represent non-null keys?

So we use \1 for CharColumnBuilder too, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we use this as a temporary encoding for nullable elements, for both char, blob and primitive types.

Copy link
Member

@skyzh skyzh Apr 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we can have two fields: is_first_key_null and first_key in proto. Maybe this could be more intuitive? If first_key is null, first_key field is set to empty, and is_first_key_null is set to true?

@skyzh
Copy link
Member

skyzh commented Apr 6, 2022

By the way, please have some basic unit tests, at least for primitive type builders 🥰

@adlternative adlternative force-pushed the record_first_key branch 2 times, most recently from ef44350 to 107f9da Compare April 8, 2022 12:02
@adlternative
Copy link
Contributor Author

By the way, please have some basic unit tests, at least for primitive type builders smiling_face_with_three_hearts

Hi, I have followed your suggestions to add is_first_key_null to proto and add some tests for primitive type builders. 😉
@skyzh

Copy link
Member

@skyzh skyzh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still some small issues. Thanks again for your contribution!

if self.options.record_first_key && to_be_appended.is_some() {
self.is_first_key_null = false;
self.first_key = to_be_appended.unwrap().as_bytes().to_vec();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that once is_first_key_null is set to false, it will never be set back to true?

We can define self.first_key as Option<Vec<u8>>, so as to make the logic clearer.

if self.options.record_first_key {
   self.first_key = to_be_appended.map(|x| x.as_bytes().to_vec());
}

And we can modify finish_block to accept first_key: Option<Vec<u8>>. Inside the block builder, we match first_key. If null, set is_first_key_null to true. Otherwise, set first_key to the inner value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still some small issues. Thanks again for your contribution!

Thanks also for your review too! 😭

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that once is_first_key_null is set to false, it will never be set back to true?

We can define self.first_key as Option<Vec<u8>>, so as to make the logic clearer.

if self.options.record_first_key {
   self.first_key = to_be_appended.map(|x| x.as_bytes().to_vec());
}

And we can modify finish_block to accept first_key: Option<Vec<u8>>. Inside the block builder, we match first_key. If null, set is_first_key_null to true. Otherwise, set first_key to the inner value.

Maybe finish_block() will be like this?

    pub fn finish_block(
        &mut self,
        block_type: BlockType,
        column_data: &mut Vec<u8>,
        block_data: &mut Vec<u8>,
        stats: Vec<BlockStatistics>,
        first_key: &mut Option<Vec<u8>>,
    ) {
        self.indexes.push(BlockIndex {
            offset: column_data.len() as u64,
            length: block_data.len() as u64 + BLOCK_HEADER_SIZE as u64,
            first_rowid: self.last_row_count as u32,
            row_count: (self.row_count - self.last_row_count) as u32,
            /// TODO(chi): support sort key
            first_key: match first_key {
                Some(x) => x.drain(..).collect(),
                None => Default::default(),
            },
            stats,
            is_first_key_null: first_key.is_none(),
        });
        ...

Copy link
Member

@skyzh skyzh Apr 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly! And I’d prefer to have Option<Vec<u8>> as parameter instead of &mut. As block is typically large, this re-allocation is acceptable.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Copy link
Member

@skyzh skyzh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, good work!

@skyzh skyzh enabled auto-merge (squash) April 9, 2022 04:09
@skyzh skyzh merged commit f591970 into risinglightdb:main Apr 9, 2022
@adlternative
Copy link
Contributor Author

LGTM, good work!

Thanks a lot! @skyzh

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants