perf(813): Avoid materialization of product values in subscriptions #959

Merged
merged 1 commit into from
Mar 15, 2024

Conversation

Centril
Contributor

@Centril Centril commented Mar 12, 2024

Description of Changes

Benchmarking full-scan: Collecting 100 samples in estimated 
full-scan               time:   [162.84 ms 164.37 ms 166.88 ms]
                        change: [-47.765% -47.089% -46.260%] (p = 0.00 < 0.05)
                        Performance has improved.

Benchmarking query-indexes-multi: Collecting 100 samples in 
query-indexes-multi     time:   [1.1162 µs 1.1202 µs 1.1274 µs]
                        change: [-6.4601% -5.3782% -4.0002%] (p = 0.00 < 0.05)
                        Performance has improved.

API and ABI breaking changes

None

Expected complexity level and risk

2

@Centril Centril changed the title from "[WIP] Splitting this up" to "Don't clone QueryExpr; Make eval + iter_filtered_chunks avoid PVs; Related cleanup" Mar 12, 2024
@Centril Centril changed the title from "Don't clone QueryExpr; Make eval + iter_filtered_chunks avoid PVs; Related cleanup" to "Don't clone QueryExpr; Make eval + iter_filtered_chunks avoid PVs" Mar 12, 2024
@joshua-spacetime
Collaborator

Here are the numbers I get compared to latest master:

full-scan               time:   [82.200 ms 82.378 ms 82.581 ms]
                        change: [-16.491% -16.056% -15.598%] (p = 0.00 < 0.05)
                        Performance has improved.

full-join               time:   [236.58 µs 236.81 µs 237.07 µs]
                        change: [-8.2742% -8.0187% -7.7901%] (p = 0.00 < 0.05)
                        Performance has improved.

incr-select             time:   [666.93 ns 667.28 ns 667.64 ns]
                        change: [+53.585% +53.879% +54.170%] (p = 0.00 < 0.05)
                        Performance has regressed.

incr-join               time:   [3.6968 µs 3.7051 µs 3.7123 µs]
                        change: [-18.685% -18.431% -18.173%] (p = 0.00 < 0.05)
                        Performance has improved.

I haven't reviewed this patch yet, so I don't know where the regression in incr-select is coming from. I'll get back to you once I have more information. But pretty good numbers otherwise.

@jdetter
Collaborator

jdetter commented Mar 13, 2024

Master benchmark on 14900k:

    Finished bench [optimized + debuginfo] target(s) in 0.15s
     Running benches/subscription.rs (/home/ubuntu/SpacetimeDB/target/release/deps/subscription-7c24f5921892496c)
Benchmarking full-scan: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 29.7s, or reduce sample count to 10.
full-scan               time:   [243.00 ms 243.98 ms 244.95 ms]

full-join               time:   [686.00 µs 686.16 µs 686.33 µs]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high severe

incr-select             time:   [481.07 ns 481.26 ns 481.45 ns]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

incr-join               time:   [6.4075 µs 6.4135 µs 6.4199 µs]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

query-indexes-multi     time:   [925.65 ns 925.95 ns 926.25 ns]

centril/borrow-qe:

Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 12.0s, or reduce sample count to 40.
full-scan               time:   [113.10 ms 113.30 ms 113.50 ms]
                        change: [-53.768% -53.563% -53.363%] (p = 0.00 < 0.05)
                        Performance has improved.

full-join               time:   [573.18 µs 573.49 µs 573.83 µs]
                        change: [-16.468% -16.422% -16.372%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild

incr-select             time:   [455.01 ns 455.35 ns 455.72 ns]
                        change: [-5.4337% -5.3479% -5.2623%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

incr-join               time:   [5.0606 µs 5.0649 µs 5.0692 µs]
                        change: [-21.097% -21.008% -20.923%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

query-indexes-multi     time:   [843.18 ns 843.48 ns 843.79 ns]
                        change: [-8.9223% -8.8640% -8.8035%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  7 (7.00%) high mild
  3 (3.00%) high severe
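
The "change" figures above are relative differences against the stored baseline. As a rough cross-check, the arithmetic can be sketched from the point estimates of the two runs. The function names here are assumptions made for illustration, and Criterion derives its intervals from the sample distributions, so its reported value can differ slightly from this point-estimate calculation.

```rust
/// Relative change from `baseline` to `current`, in percent.
/// Negative values mean the new measurement is faster.
fn relative_change_pct(baseline: f64, current: f64) -> f64 {
    (current - baseline) / baseline * 100.0
}

/// Speedup factor: how many times faster `current` is than `baseline`.
fn speedup(baseline: f64, current: f64) -> f64 {
    baseline / current
}
```

For full-scan, relative_change_pct(243.98, 113.30) comes out at about -53.56, in line with the reported -53.563%, and speedup(243.98, 113.30) is roughly 2.15.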

@Centril Centril requested a review from gefjon March 13, 2024 23:47
Collaborator

@joshua-spacetime joshua-spacetime left a comment

The performance numbers are good, so I would like to merge this as soon as possible.

However, I'm particularly concerned about the extra lifetime that was added to the datastore traits. I'm totally fine with it if the perf justifies it, but I'm not aware of it being an issue, so I really think we should revert those changes.

-pub struct IndexCursor<'a, R: RangeBounds<AlgebraicValue>> {
-pub table: DbTable,
-pub iter: IterByColRange<'a, R>,
+pub struct IndexCursor<'a, 'c, R: RangeBounds<AlgebraicValue>> {
Collaborator

Do you know by how much this improves performance?

@@ -71,13 +71,13 @@ impl StateView for CommittedState {
/// Returns an iterator,
/// yielding every row in the table identified by `table_id`,
/// where the values of `cols` are contained in `range`.
-fn iter_by_col_range<'a, R: RangeBounds<AlgebraicValue>>(
+fn iter_by_col_range<'a, 'c, R: RangeBounds<AlgebraicValue>>(
Collaborator

Similarly here?

crates/core/src/host/instance_env.rs (outdated, resolved)
lhs: DatabaseTableUpdate,
-) -> Result<impl Iterator<Item = ProductValue>, DBError> {
+) -> Result<(usize, impl 'a + Iterator<Item = ResPV>), DBError> {
Collaborator

What is the usize for?

Contributor Author

The usize is the size estimate that RelOps has. It doesn't exactly fit the size_hint method of the Iterator trait, so it's propagated here to be used in with_capacity.
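
A minimal sketch of the pattern described in this reply: return a size estimate next to the iterator and use it for with_capacity. Everything here (filtered_rows, collect_with_estimate, rows as plain u32 values) is a hypothetical stand-in, not code from the PR. One plausible reason the estimate does not fit size_hint, not stated in the thread, is that adapters such as filter report a lower bound of 0, which gives little to preallocate from.

```rust
/// Hypothetical stand-in for a relational operator that knows roughly
/// how many rows it may yield. Returns `(estimate, iterator)`.
fn filtered_rows<'a>(rows: &'a [u32], min: u32) -> (usize, impl 'a + Iterator<Item = u32>) {
    // Upper-bound estimate: the filter cannot yield more rows than it reads.
    let estimate = rows.len();
    let iter = rows.iter().copied().filter(move |r| *r >= min);
    (estimate, iter)
}

/// Collects the rows, preallocating from the estimate
/// instead of relying on the iterator's own `size_hint`.
fn collect_with_estimate(rows: &[u32], min: u32) -> Vec<u32> {
    let (estimate, iter) = filtered_rows(rows, min);
    let mut out = Vec::with_capacity(estimate);
    out.extend(iter);
    out
}
```

The trade-off in this sketch is that an upper-bound estimate may over-allocate when the filter is selective, in exchange for avoiding repeated reallocation while the vector grows.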

Comment on lines 563 to 566
join_2
.into_iter()
.chain(join_4)
.chain(join_6)
.map(TableOp::delete)
.chain(join_1.into_iter().chain(join_5).map(TableOp::insert)),
Collaborator

Personally, I find this easier to reconcile with the formula above.

Contributor Author

Not sure what you mean; the reason this now has to use .push rather than .chain is that we no longer store temporary Vecs. It isn't pretty, but temporary allocations seem wasteful.

Collaborator

What temporary allocations, exactly? These are iterators, so we should still be able to chain them, right?

Contributor Author

They are iterators over Result<PV, ErrorVM> whereas the other ones are iterators over PV.
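
A sketch of why the item types matter, under simplified assumptions: rows are u32, the error type is String, and TableOp and merge_updates are hypothetical stand-ins for the PR's types. Iterators whose items differ (a row versus a Result wrapping a row) cannot be chained directly, so the fallible rows are pushed one at a time, with `?` stopping at the first error.

```rust
#[derive(Debug, PartialEq)]
enum TableOp {
    Insert(u32),
    Delete(u32),
}

/// Merges already-evaluated rows with rows that are still being evaluated
/// lazily and may fail, without collecting the lazy rows into a temporary Vec.
fn merge_updates(
    deletes_ready: Vec<u32>,
    deletes_fallible: impl Iterator<Item = Result<u32, String>>,
    inserts_ready: Vec<u32>,
    fallible_estimate: usize,
) -> Result<Vec<TableOp>, String> {
    let mut updates =
        Vec::with_capacity(deletes_ready.len() + fallible_estimate + inserts_ready.len());

    // Plain rows can be mapped and appended in bulk.
    updates.extend(deletes_ready.into_iter().map(TableOp::Delete));

    // Fallible rows are unwrapped one by one; `?` returns the first error.
    for row in deletes_fallible {
        updates.push(TableOp::Delete(row?));
    }

    updates.extend(inserts_ready.into_iter().map(TableOp::Insert));
    Ok(updates)
}
```

An alternative would be to wrap the plain rows in Ok and collect the chained iterator into a Result; the push loop above is closer to what the reply describes.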

Comment on lines 577 to 578
let mut updates =
Vec::with_capacity(join_2.len() + join_4_estimate + join_6.len() + join_1_estimate + join_5.len());
Collaborator

I get it, but I personally think it makes the code less readable.

Contributor Author

I think the two helper functions make the code slightly more readable, but I don't disagree that this with_capacity makes things less readable. Still, we have these size estimates, so let's use them or lose them.

crates/core/src/vm.rs (outdated, resolved)
crates/core/src/vm.rs (outdated, resolved)
-/// The column id for which the index is defined.
-pub index_col: ColId,
+/// The column ids for which the index is defined.
+pub index_cols: &'c ColList,
Collaborator

If we support multi-column index joins, we need a test for it.

Contributor Author

We don't; the field is always a singleton. This is just because a borrowed ColList is needed now so we need to propagate it.

Collaborator

This is probably something I would revert. I think it's fine to keep changes related to #813 and #833 in this patch, but this change (passing a borrowed ColList), if we are going to make it, should probably be in a separate patch.

@@ -602,7 +602,9 @@ impl IndexJoin {
// In other words, when an index join has two delta tables.
pub fn to_inner_join(self) -> QueryExpr {
if self.return_index_rows {
-let col_lhs = self.index_side.head().fields[usize::from(self.index_col)].field.clone();
+let col_lhs = self.index_side.head().fields[self.index_cols.head().idx()]
Collaborator

This seems like a potential correctness issue.

Contributor Author

The field is a singleton list so it's semantically equivalent.
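
The equivalence claimed in this reply can be illustrated with simplified stand-ins. ColId, ColList, JoinOld, JoinNew and the lookup helpers below are hypothetical models written for this sketch, not the crate's real definitions; only the call shapes head() and idx() are taken from the diff above. The sketch also assumes that the lifetime `'c` names the borrow of the column list.

```rust
/// Hypothetical model of a column id.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ColId(u32);

impl ColId {
    fn idx(self) -> usize {
        self.0 as usize
    }
}

impl From<ColId> for usize {
    fn from(c: ColId) -> usize {
        c.idx()
    }
}

/// Hypothetical model of a non-empty column list: a head plus further columns.
struct ColList {
    head: ColId,
    tail: Vec<ColId>,
}

impl ColList {
    fn singleton(head: ColId) -> Self {
        ColList { head, tail: Vec::new() }
    }
    fn head(&self) -> ColId {
        self.head
    }
    fn len(&self) -> usize {
        1 + self.tail.len()
    }
}

/// Before: the join owns a single column id.
struct JoinOld {
    index_col: ColId,
}

/// After: the join borrows a column list for the lifetime `'c`.
struct JoinNew<'c> {
    index_cols: &'c ColList,
}

fn field_old<'f>(fields: &[&'f str], join: &JoinOld) -> &'f str {
    fields[usize::from(join.index_col)]
}

fn field_new<'f>(fields: &[&'f str], join: &JoinNew<'_>) -> &'f str {
    // For a singleton list, the head is the only column.
    fields[join.index_cols.head().idx()]
}
```

With a singleton list, both lookups select the same field. If the list ever held more than one column, head() would silently ignore the rest, which is the kind of case the reviewer's question about correctness points at.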

@cloutiertyler
Contributor

I, too, am a bit concerned about modifying the datastore trait in this way, if only because it may have implications for MVCC that I need to be aware of. Could you please make a post here (or, better yet, a comment in the codebase) that explains the purpose of this lifetime and why it is necessary?

@gefjon
Contributor

gefjon commented Mar 14, 2024

I'd really like for these two tickets to get separate PRs so we can measure the performance impact of the solutions separately.

@Centril Centril added the abi-break label (A PR that makes an ABI breaking change) Mar 14, 2024
@Centril Centril removed the abi-break label (A PR that makes an ABI breaking change) Mar 14, 2024
@Centril Centril force-pushed the centril/borrow-qe branch 2 times, most recently from 0d3f1c8 to 9154d83 Compare March 14, 2024 23:45
@Centril Centril changed the title from "Don't clone QueryExpr; Make eval + iter_filtered_chunks avoid PVs" to "Don't clone QueryExpr; Make eval avoid PVs" Mar 14, 2024
@joshua-spacetime joshua-spacetime changed the base branch from centril/fix-933 to master March 15, 2024 22:06
@joshua-spacetime joshua-spacetime changed the title from "Don't clone QueryExpr; Make eval avoid PVs" to "perf(813): Avoid materialization of product values in subscriptions" Mar 15, 2024
@joshua-spacetime
Collaborator

full-scan               time:   [84.373 ms 84.635 ms 84.909 ms]
                        change: [-16.825% -16.324% -15.837%] (p = 0.00 < 0.05)
                        Performance has improved.

full-join               time:   [241.59 µs 242.01 µs 242.56 µs]
                        change: [-13.306% -12.923% -12.537%] (p = 0.00 < 0.05)
                        Performance has improved.

Closes #813.

A subscription will no longer materialize product values
for queries with read-only row operations;
instead, it will serialize from bflatn straight to bsatn.

Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
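
The idea in the commit message above can be sketched with a toy layout. This is not the real bflatn or bsatn format: the 8-byte padded row, the Value enum and all function names are assumptions made for illustration. The first path builds an owned value per row before encoding it; the second writes the packed encoding straight from the stored bytes.

```rust
/// Toy stored row: a u32 id, a u16 level, and 2 bytes of padding.
const ROW_SIZE: usize = 8;
/// Packed wire size of one row: 4 bytes for the id, 2 for the level.
const WIRE_SIZE: usize = 6;

/// A materialized element, standing in for the elements of a product value.
enum Value {
    U32(u32),
    U16(u16),
}

/// Builds a page of padded rows in the toy stored layout.
fn make_page(rows: &[(u32, u16)]) -> Vec<u8> {
    let mut page = Vec::with_capacity(rows.len() * ROW_SIZE);
    for (id, level) in rows {
        page.extend_from_slice(&id.to_le_bytes());
        page.extend_from_slice(&level.to_le_bytes());
        page.extend_from_slice(&[0, 0]); // padding
    }
    page
}

/// Path 1: materialize each row into an owned Vec<Value>, then serialize it.
fn serialize_materialized(page: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for row in page.chunks_exact(ROW_SIZE) {
        let id = u32::from_le_bytes([row[0], row[1], row[2], row[3]]);
        let level = u16::from_le_bytes([row[4], row[5]]);
        // One heap allocation per row, discarded right after encoding.
        let pv = vec![Value::U32(id), Value::U16(level)];
        for v in &pv {
            match v {
                Value::U32(x) => out.extend_from_slice(&x.to_le_bytes()),
                Value::U16(x) => out.extend_from_slice(&x.to_le_bytes()),
            }
        }
    }
    out
}

/// Path 2: write the packed encoding straight from the stored bytes.
fn serialize_direct(page: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(page.len() / ROW_SIZE * WIRE_SIZE);
    for row in page.chunks_exact(ROW_SIZE) {
        out.extend_from_slice(&row[0..4]); // id, already little-endian
        out.extend_from_slice(&row[4..6]); // level; the padding is skipped
    }
    out
}
```

Both paths produce the same bytes; the direct path simply skips the per-row intermediate value, which is the kind of work the full-scan numbers above suggest was significant.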
@joshua-spacetime joshua-spacetime added this pull request to the merge queue Mar 15, 2024
Merged via the queue into master with commit 755457a Mar 15, 2024
6 checks passed
Successfully merging this pull request may close these issues.

Query evaluators should return iterators
5 participants