Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perf: Optimize the implementation of IndexIter #190

Closed
KKould opened this issue Mar 30, 2024 · 1 comment · Fixed by #193
Closed

Perf: Optimize the implementation of IndexIter #190

KKould opened this issue Mar 30, 2024 · 1 comment · Fixed by #193
Assignees
Labels
enhancement New feature or request

Comments

@KKould
Copy link
Member

KKould commented Mar 30, 2024

Feature Request

Currently, the main performance issue: #162 (comment) in the benchmark is the execution of IndexIter

The current implementation of IndexIter is relatively redundant and cumbersome. Its main principle is to use three iterators in order: expression -> index value -> tuple (the primary key is expression -> tuple)
, the current implementation is relatively poor in readability, is there a better implementation?

FnckSQL/src/storage/mod.rs

Lines 146 to 356 in 173c395

impl IndexIter<'_> {
fn offset_move(offset: &mut usize) -> bool {
if *offset > 0 {
offset.sub_assign(1);
true
} else {
false
}
}
fn bound_key(&self, val: &ValueRef, is_upper: bool) -> Result<Vec<u8>, DatabaseError> {
match self.index_meta.ty {
IndexType::PrimaryKey => TableCodec::encode_tuple_key(&self.table.name, val),
IndexType::Unique => {
let index =
Index::new(self.index_meta.id, slice::from_ref(val), self.index_meta.ty);
TableCodec::encode_index_key(&self.table.name, &index, None)
}
IndexType::Normal => {
let index =
Index::new(self.index_meta.id, slice::from_ref(val), self.index_meta.ty);
TableCodec::encode_index_bound_key(&self.table.name, &index, is_upper)
}
IndexType::Composite => {
let values = if let DataValue::Tuple(Some(values)) = val.as_ref() {
values.as_slice()
} else {
slice::from_ref(val)
};
let index = Index::new(self.index_meta.id, values, self.index_meta.ty);
TableCodec::encode_index_bound_key(&self.table.name, &index, is_upper)
}
}
}
fn get_tuple_by_id(&mut self, tuple_id: &TupleId) -> Result<Option<Tuple>, DatabaseError> {
let key = TableCodec::encode_tuple_key(&self.table.name, tuple_id)?;
Ok(self.tx.get(&key)?.map(|bytes| {
TableCodec::decode_tuple(
&self.table.types(),
&self.projections,
&self.tuple_schema_ref,
&bytes,
)
}))
}
fn is_empty(&self) -> bool {
self.scope_iter.is_none() && self.index_values.is_empty() && self.ranges.is_empty()
}
}
/// expression -> index value -> tuple
impl Iter for IndexIter<'_> {
fn next_tuple(&mut self) -> Result<Option<Tuple>, DatabaseError> {
// 1. check limit
if matches!(self.limit, Some(0)) || self.is_empty() {
self.scope_iter = None;
self.ranges.clear();
return Ok(None);
}
// 2. try get tuple on index_values and until it empty
while let Some(value) = self.index_values.pop_front() {
if Self::offset_move(&mut self.offset) {
continue;
}
let tuple = match value {
IndexValue::PrimaryKey(tuple) => {
if let Some(num) = self.limit.as_mut() {
num.sub_assign(1);
}
return Ok(Some(tuple));
}
IndexValue::Normal(tuple_id) => {
self.get_tuple_by_id(&tuple_id)?.ok_or_else(|| {
DatabaseError::NotFound("index's tuple_id", tuple_id.to_string())
})?
}
};
return Ok(Some(tuple));
}
assert!(self.index_values.is_empty());
// 3. If the current expression is a Scope,
// an iterator will be generated for reading the IndexValues of the Scope.
if let Some(iter) = &mut self.scope_iter {
while let Some((_, value_option)) = iter.try_next()? {
if let Some(value) = value_option {
let index = if matches!(self.index_meta.ty, IndexType::PrimaryKey) {
let tuple = TableCodec::decode_tuple(
&self.table.types(),
&self.projections,
&self.tuple_schema_ref,
&value,
);
IndexValue::PrimaryKey(tuple)
} else {
IndexValue::Normal(TableCodec::decode_index(&value, &self.index_meta.pk_ty))
};
self.index_values.push_back(index);
break;
}
}
if self.index_values.is_empty() {
self.scope_iter = None;
}
return self.next_tuple();
}
// 4. When `scope_iter` and `index_values` do not have a value, use the next expression to iterate
if let Some(binary) = self.ranges.pop_front() {
match binary {
Range::Scope { min, max } => {
let table_name = &self.table.name;
let index_meta = &self.index_meta;
let bound_encode =
|bound: Bound<ValueRef>, is_upper: bool| -> Result<_, DatabaseError> {
match bound {
Bound::Included(val) => {
Ok(Bound::Included(self.bound_key(&val, is_upper)?))
}
Bound::Excluded(val) => {
Ok(Bound::Excluded(self.bound_key(&val, is_upper)?))
}
Bound::Unbounded => Ok(Bound::Unbounded),
}
};
let check_bound = |value: &mut Bound<Vec<u8>>, bound: Vec<u8>| {
if matches!(value, Bound::Unbounded) {
let _ = mem::replace(value, Bound::Included(bound));
}
};
let (bound_min, bound_max) = if matches!(index_meta.ty, IndexType::PrimaryKey) {
TableCodec::tuple_bound(table_name)
} else {
TableCodec::index_bound(table_name, &index_meta.id)
};
let mut encode_min = bound_encode(min, false)?;
check_bound(&mut encode_min, bound_min);
let mut encode_max = bound_encode(max, true)?;
check_bound(&mut encode_max, bound_max);
let iter = self.tx.iter(
encode_min.as_ref().map(Vec::as_slice),
encode_max.as_ref().map(Vec::as_slice),
)?;
self.scope_iter = Some(iter);
}
Range::Eq(val) => {
match self.index_meta.ty {
IndexType::PrimaryKey => {
let bytes =
self.tx.get(&self.bound_key(&val, false)?)?.ok_or_else(|| {
DatabaseError::NotFound(
"secondary index",
format!("value -> {}", val),
)
})?;
let tuple = TableCodec::decode_tuple(
&self.table.types(),
&self.projections,
&self.tuple_schema_ref,
&bytes,
);
self.index_values.push_back(IndexValue::PrimaryKey(tuple));
self.scope_iter = None;
}
IndexType::Unique => {
let bytes =
self.tx.get(&self.bound_key(&val, false)?)?.ok_or_else(|| {
DatabaseError::NotFound(
"secondary index",
format!("value -> {}", val),
)
})?;
self.index_values.push_back(IndexValue::Normal(
TableCodec::decode_index(&bytes, &self.index_meta.pk_ty),
));
self.scope_iter = None;
}
IndexType::Normal | IndexType::Composite => {
let min = self.bound_key(&val, false)?;
let max = self.bound_key(&val, true)?;
let iter = self.tx.iter(
Bound::Included(min.as_slice()),
Bound::Included(max.as_slice()),
)?;
self.scope_iter = Some(iter);
}
}
}
_ => (),
}
}
self.next_tuple()
}
}

@KKould KKould self-assigned this Mar 30, 2024
@KKould KKould added the enhancement New feature or request label Mar 30, 2024
@KKould
Copy link
Member Author

KKould commented Mar 30, 2024

The previous implementation was temporary. Since Index is now almost complete, I will implement iterators for each index to avoid prediction difficulties caused by a large number of branches.

@KKould KKould closed this as completed Apr 4, 2024
@KKould KKould linked a pull request Apr 4, 2024 that will close this issue
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant