Don't do alignment computations during BFLATN ser/de #986

Merged
merged 1 commit into from
Mar 18, 2024

Conversation

gefjon
Contributor

@gefjon gefjon commented Mar 15, 2024

Description of Changes

`AlgebraicTypeLayout` and friends already include full layout information, including properly-aligned offsets for `ProductTypeElementLayout`s. As such, there's no need to do any alignment computation during `serialize_value` or `write_value`.

Instead, while traversing a `ProductTypeLayout`, we can use each element's `offset` to update the `curr_offset`.
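
A minimal sketch of the idea, not the code from this PR: `ElementLayout`, `align_to`, and both traversal functions are hypothetical stand-ins for the real layout types and for `serialize_value` / `write_value`. It assumes each element's `offset` is measured from the start of the product and that the product's base address is itself suitably aligned, so recomputing alignment and reading the stored offset give the same positions.

```rust
/// Hypothetical, simplified stand-in for a product element's layout.
#[derive(Clone, Copy)]
struct ElementLayout {
    /// Byte offset from the start of the product, already aligned.
    offset: usize,
    /// Size of the element in bytes.
    size: usize,
    /// Required alignment in bytes (a power of two).
    align: usize,
}

/// Round `offset` up to the next multiple of `align` (a power of two).
fn align_to(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

/// Recompute each element's start while traversing:
/// one alignment computation per element.
fn element_starts_recomputed(base: usize, elems: &[ElementLayout]) -> Vec<usize> {
    let mut curr_offset = base;
    let mut starts = Vec::with_capacity(elems.len());
    for e in elems {
        curr_offset = align_to(curr_offset, e.align);
        starts.push(curr_offset);
        curr_offset += e.size;
    }
    starts
}

/// Use the precomputed `offset` instead: no alignment arithmetic at all.
fn element_starts_from_layout(base: usize, elems: &[ElementLayout]) -> Vec<usize> {
    elems.iter().map(|e| base + e.offset).collect()
}

/// Example layout resembling a `(u32, u64, u8)` product.
fn example_layout() -> Vec<ElementLayout> {
    vec![
        ElementLayout { offset: 0, size: 4, align: 4 },
        ElementLayout { offset: 8, size: 8, align: 8 },
        ElementLayout { offset: 16, size: 1, align: 1 },
    ]
}
```

Both functions return the same start positions for a well-formed layout; the second simply skips the per-element rounding, which is the work this change removes from the hot path.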

API and ABI breaking changes

N/A

Expected complexity level and risk

3 - minor change, but in close contact with some unsafe code.

Testing

  • table tests pass, including proptests.
  • Wait for full CI to pass, incl. core tests and smoketests.

@gefjon gefjon requested a review from Centril March 15, 2024 15:21
@gefjon
Contributor Author

gefjon commented Mar 15, 2024

benchmarks please

github-actions bot commented Mar 15, 2024

Criterion benchmark results

Criterion benchmark report

YOU SHOULD PROBABLY IGNORE THESE RESULTS.

Criterion is a wall time based benchmarking system that is extremely noisy when run on CI. We collect these results for longitudinal analysis, but they are not reliable for comparing individual PRs.

Go look at the callgrind report instead.

empty

db on disk new latency old latency new throughput old throughput
sqlite 💿 - 412.2±1.39ns - -
sqlite 🧠 - 406.1±1.22ns - -
stdb_raw 💿 707.1±1.14ns 713.9±1.10ns - -
stdb_raw 🧠 684.0±0.88ns 680.0±0.74ns - -

insert_1

db on disk schema indices preload new latency old latency new throughput old throughput

insert_bulk

db on disk schema indices preload count new latency old latency new throughput old throughput
sqlite 💿 u32_u64_str btree_each_column 2048 256 - 530.0±49.57µs - 1886 tx/sec
sqlite 💿 u32_u64_str unique_0 2048 256 - 134.7±0.36µs - 7.2 Ktx/sec
sqlite 💿 u32_u64_u64 btree_each_column 2048 256 - 421.7±0.73µs - 2.3 Ktx/sec
sqlite 💿 u32_u64_u64 unique_0 2048 256 - 121.1±0.73µs - 8.1 Ktx/sec
sqlite 🧠 u32_u64_str btree_each_column 2048 256 - 443.9±0.67µs - 2.2 Ktx/sec
sqlite 🧠 u32_u64_str unique_0 2048 256 - 120.7±0.78µs - 8.1 Ktx/sec
sqlite 🧠 u32_u64_u64 btree_each_column 2048 256 - 366.8±0.45µs - 2.7 Ktx/sec
sqlite 🧠 u32_u64_u64 unique_0 2048 256 - 103.0±0.76µs - 9.5 Ktx/sec
stdb_raw 💿 u32_u64_str btree_each_column 2048 256 709.7±0.61µs 725.6±1.60µs 1409 tx/sec 1378 tx/sec
stdb_raw 💿 u32_u64_str unique_0 2048 256 617.2±0.56µs 625.7±0.57µs 1620 tx/sec 1598 tx/sec
stdb_raw 💿 u32_u64_u64 btree_each_column 2048 256 417.7±0.68µs 435.4±0.93µs 2.3 Ktx/sec 2.2 Ktx/sec
stdb_raw 💿 u32_u64_u64 unique_0 2048 256 376.2±0.52µs 393.9±0.68µs 2.6 Ktx/sec 2.5 Ktx/sec
stdb_raw 🧠 u32_u64_str btree_each_column 2048 256 487.6±0.44µs 500.8±0.22µs 2.0 Ktx/sec 1996 tx/sec
stdb_raw 🧠 u32_u64_str unique_0 2048 256 398.2±1.19µs 407.1±0.65µs 2.5 Ktx/sec 2.4 Ktx/sec
stdb_raw 🧠 u32_u64_u64 btree_each_column 2048 256 315.8±0.57µs 328.6±0.42µs 3.1 Ktx/sec 3.0 Ktx/sec
stdb_raw 🧠 u32_u64_u64 unique_0 2048 256 278.2±0.38µs 295.9±0.99µs 3.5 Ktx/sec 3.3 Ktx/sec

iterate

db on disk schema indices new latency old latency new throughput old throughput
sqlite 💿 u32_u64_str unique_0 - 20.4±0.18µs - 47.9 Ktx/sec
sqlite 💿 u32_u64_u64 unique_0 - 19.4±0.16µs - 50.3 Ktx/sec
sqlite 🧠 u32_u64_str unique_0 - 18.7±0.15µs - 52.3 Ktx/sec
sqlite 🧠 u32_u64_u64 unique_0 - 18.1±0.26µs - 53.8 Ktx/sec
stdb_raw 💿 u32_u64_str unique_0 18.7±0.00µs 18.7±0.00µs 52.3 Ktx/sec 52.3 Ktx/sec
stdb_raw 💿 u32_u64_u64 unique_0 15.9±0.00µs 15.8±0.00µs 61.6 Ktx/sec 61.6 Ktx/sec
stdb_raw 🧠 u32_u64_str unique_0 18.7±0.00µs 18.6±0.00µs 52.3 Ktx/sec 52.4 Ktx/sec
stdb_raw 🧠 u32_u64_u64 unique_0 15.8±0.00µs 15.8±0.00µs 61.7 Ktx/sec 61.8 Ktx/sec

find_unique

db on disk key type preload new latency old latency new throughput old throughput

filter

db on disk key type index strategy load count new latency old latency new throughput old throughput
sqlite 💿 string index 2048 256 - 68.3±0.28µs - 14.3 Ktx/sec
sqlite 💿 u64 index 2048 256 - 64.0±0.19µs - 15.2 Ktx/sec
sqlite 🧠 string index 2048 256 - 65.5±0.17µs - 14.9 Ktx/sec
sqlite 🧠 u64 index 2048 256 - 59.7±0.29µs - 16.4 Ktx/sec
stdb_raw 💿 string index 2048 256 5.6±0.00µs 5.6±0.00µs 175.5 Ktx/sec 175.5 Ktx/sec
stdb_raw 💿 u64 index 2048 256 5.5±0.00µs 5.5±0.00µs 178.1 Ktx/sec 178.2 Ktx/sec
stdb_raw 🧠 string index 2048 256 5.5±0.00µs 5.5±0.01µs 176.6 Ktx/sec 176.3 Ktx/sec
stdb_raw 🧠 u64 index 2048 256 5.4±0.00µs 5.5±0.00µs 179.3 Ktx/sec 179.0 Ktx/sec

serialize

schema format count new latency old latency new throughput old throughput
u32_u64_str bsatn 100 2.4±0.00µs 2.4±0.00µs 39.6 Mtx/sec 39.0 Mtx/sec
u32_u64_str json 100 5.2±0.04µs 5.0±0.04µs 18.2 Mtx/sec 19.3 Mtx/sec
u32_u64_str product_value 100 623.8±1.08ns 651.3±0.91ns 152.9 Mtx/sec 146.4 Mtx/sec
u32_u64_u64 bsatn 100 1690.0±28.78ns 1722.3±37.03ns 56.4 Mtx/sec 55.4 Mtx/sec
u32_u64_u64 json 100 3.4±0.07µs 3.4±0.09µs 28.4 Mtx/sec 27.9 Mtx/sec
u32_u64_u64 product_value 100 598.0±0.40ns 603.2±0.27ns 159.5 Mtx/sec 158.1 Mtx/sec

stdb_module_large_arguments

arg size new latency old latency new throughput old throughput
64KiB 78.7±4.33µs 96.2±6.65µs - -

stdb_module_print_bulk

line count new latency old latency new throughput old throughput
1 39.5±8.01µs 43.2±5.37µs - -
100 346.6±5.52µs 359.8±7.97µs - -
1000 2.9±0.15ms 3.1±0.10ms - -

remaining

name new latency old latency new throughput old throughput
sqlite/💿/update_bulk/u32_u64_str/unique_0/load=2048/count=256 - 45.9±0.29µs - 21.3 Ktx/sec
sqlite/💿/update_bulk/u32_u64_u64/unique_0/load=2048/count=256 - 40.6±0.19µs - 24.1 Ktx/sec
sqlite/🧠/update_bulk/u32_u64_str/unique_0/load=2048/count=256 - 38.3±0.31µs - 25.5 Ktx/sec
sqlite/🧠/update_bulk/u32_u64_u64/unique_0/load=2048/count=256 - 34.8±0.29µs - 28.0 Ktx/sec
stdb_module/💿/update_bulk/u32_u64_str/unique_0/load=2048/count=256 3.1±0.04ms 3.1±0.01ms 319 tx/sec 321 tx/sec
stdb_module/💿/update_bulk/u32_u64_u64/unique_0/load=2048/count=256 2.0±0.01ms 2.2±0.02ms 490 tx/sec 462 tx/sec
stdb_raw/💿/update_bulk/u32_u64_str/unique_0/load=2048/count=256 1108.9±0.59µs 1121.6±1.34µs 901 tx/sec 891 tx/sec
stdb_raw/💿/update_bulk/u32_u64_u64/unique_0/load=2048/count=256 744.5±0.66µs 759.1±0.62µs 1343 tx/sec 1317 tx/sec
stdb_raw/🧠/update_bulk/u32_u64_str/unique_0/load=2048/count=256 784.8±0.52µs 795.9±0.53µs 1274 tx/sec 1256 tx/sec
stdb_raw/🧠/update_bulk/u32_u64_u64/unique_0/load=2048/count=256 552.6±0.40µs 564.2±0.98µs 1809 tx/sec 1772 tx/sec

github-actions bot commented Mar 15, 2024

Callgrind benchmark results

Callgrind Benchmark Report

These benchmarks were run using callgrind,
an instruction-level profiler. They allow comparisons between sqlite (sqlite), SpacetimeDB running through a module (stdb_module), and the underlying SpacetimeDB data storage engine (stdb_raw). Callgrind emulates a CPU to collect the below estimates.

Measurement changes larger than five percent are in bold.
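
The Δrw and Δcycles columns below appear consistent with a plain relative change of the new value against the old one. The following is a small sketch under that assumption; `percent_change` and `is_notable` are hypothetical names, not part of the benchmark tooling.

```rust
/// Relative change of `new` against `old`, in percent.
/// Negative values mean the new measurement is smaller (an improvement).
fn percent_change(new: u64, old: u64) -> f64 {
    (new as f64 - old as f64) / old as f64 * 100.0
}

/// Hypothetical helper mirroring the "larger than five percent" rule.
fn is_notable(delta_percent: f64) -> bool {
    delta_percent.abs() > 5.0
}
```

For example, the in-memory `stdb_raw` `no_index` filter on the `u64` column reports 84053 reads + writes against 91094 before, which this formula gives as roughly -7.73%, above the five percent threshold.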

In-memory benchmarks

callgrind: empty transaction

db total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw 6154 6171 -0.28% 6986 6943 0.62%
sqlite 5568 5558 0.18% 6100 6072 0.46%

callgrind: filter

| db | schema | indices | count | preload | _column | data_type | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| stdb_raw | u32_u64_str | no_index | 64 | 128 | 1 | u64 | 84053 | 91094 | **-7.73%** | 84997 | 92180 | **-7.79%** |
| stdb_raw | u32_u64_str | no_index | 64 | 128 | 2 | string | 126745 | 132378 | -4.26% | 128009 | 133838 | -4.36% |
| stdb_raw | u32_u64_str | btree_each_column | 64 | 128 | 2 | string | 29442 | 29443 | -0.00% | 30638 | 30779 | -0.46% |
| stdb_raw | u32_u64_str | btree_each_column | 64 | 128 | 1 | u64 | 28430 | 28401 | 0.10% | 29398 | 29499 | -0.34% |
| sqlite | u32_u64_str | no_index | 64 | 128 | 1 | u64 | 118033 | 118033 | 0.00% | 119267 | 119439 | -0.14% |
| sqlite | u32_u64_str | no_index | 64 | 128 | 2 | string | 138612 | 138612 | 0.00% | 140118 | 140264 | -0.10% |
| sqlite | u32_u64_str | btree_each_column | 64 | 128 | 2 | string | 128475 | 128475 | 0.00% | 130187 | 130249 | -0.05% |
| sqlite | u32_u64_str | btree_each_column | 64 | 128 | 1 | u64 | 125350 | 125350 | 0.00% | 126906 | 126834 | 0.06% |

callgrind: insert bulk

db schema indices count preload total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str unique_0 64 128 1464239 1484619 -1.37% 1492735 1513867 -1.40%
stdb_raw u32_u64_str btree_each_column 64 128 1605629 1631435 -1.58% 1646437 1679011 -1.94%
sqlite u32_u64_str unique_0 64 128 396372 396342 0.01% 414614 420392 -1.37%
sqlite u32_u64_str btree_each_column 64 128 981664 981664 0.00% 1019010 1027218 -0.80%

callgrind: iterate

db schema indices count total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str unique_0 1024 179195 179196 -0.00% 179627 179696 -0.04%
stdb_raw u32_u64_str unique_0 64 18805 18806 -0.01% 19125 19182 -0.30%
sqlite u32_u64_str unique_0 1024 1044679 1044669 0.00% 1047857 1047835 0.00%
sqlite u32_u64_str unique_0 64 74741 74741 0.00% 75917 75977 -0.08%

callgrind: serialize_product_value

count format total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
64 json 49521 49521 0.00% 52207 52143 0.12%
64 bsatn 26627 26627 0.00% 28905 28905 0.00%
16 json 12643 12643 0.00% 14581 14513 0.47%
16 bsatn 8357 8357 0.00% 9683 9717 -0.35%

callgrind: update bulk

db schema indices count preload total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str unique_0 1024 1024 46256448 46821607 -1.21% 47592230 48311273 -1.49%
stdb_raw u32_u64_str unique_0 64 128 2709207 2744552 -1.29% 2800241 2845820 -1.60%
sqlite u32_u64_str unique_0 1024 1024 1801965 1801965 0.00% 1811333 1811231 0.01%
sqlite u32_u64_str unique_0 64 128 128479 128479 0.00% 131427 131369 0.04%

On-disk benchmarks

callgrind: empty transaction

db total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw 6396 6405 -0.14% 7252 7177 1.05%
sqlite 5600 5600 0.00% 6128 6144 -0.26%

callgrind: filter

| db | schema | indices | count | preload | _column | data_type | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| stdb_raw | u32_u64_str | no_index | 64 | 128 | 1 | u64 | 84295 | 91336 | **-7.71%** | 85259 | 92494 | **-7.82%** |
| stdb_raw | u32_u64_str | no_index | 64 | 128 | 2 | string | 126987 | 132620 | -4.25% | 128495 | 134280 | -4.31% |
| stdb_raw | u32_u64_str | btree_each_column | 64 | 128 | 1 | u64 | 28642 | 28643 | -0.00% | 29630 | 29733 | -0.35% |
| stdb_raw | u32_u64_str | btree_each_column | 64 | 128 | 2 | string | 29684 | 29685 | -0.00% | 30872 | 30969 | -0.31% |
| sqlite | u32_u64_str | no_index | 64 | 128 | 1 | u64 | 119954 | 119954 | 0.00% | 121532 | 121660 | -0.11% |
| sqlite | u32_u64_str | no_index | 64 | 128 | 2 | string | 140533 | 140533 | 0.00% | 142399 | 142493 | -0.07% |
| sqlite | u32_u64_str | btree_each_column | 64 | 128 | 2 | string | 130525 | 130525 | 0.00% | 132587 | 132545 | 0.03% |
| sqlite | u32_u64_str | btree_each_column | 64 | 128 | 1 | u64 | 127446 | 127446 | 0.00% | 129360 | 129284 | 0.06% |

callgrind: insert bulk

db schema indices count preload total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str unique_0 64 128 2274331 2294563 -0.88% 2304955 2325197 -0.87%
stdb_raw u32_u64_str btree_each_column 64 128 2424892 2448052 -0.95% 2466430 2493708 -1.09%
sqlite u32_u64_str unique_0 64 128 413848 413848 0.00% 431508 437264 -1.32%
sqlite u32_u64_str btree_each_column 64 128 1019854 1019854 0.00% 1058018 1066370 -0.78%

callgrind: iterate

db schema indices count total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str unique_0 1024 179437 179438 -0.00% 179889 179998 -0.06%
stdb_raw u32_u64_str unique_0 64 19047 19048 -0.01% 19363 19480 -0.60%
sqlite u32_u64_str unique_0 1024 1047737 1047737 0.00% 1051503 1051523 -0.00%
sqlite u32_u64_str unique_0 64 76507 76507 0.00% 77857 77857 0.00%

callgrind: serialize_product_value

count format total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
64 json 49521 49521 0.00% 52207 52143 0.12%
64 bsatn 26627 26627 0.00% 28905 28905 0.00%
16 json 12643 12643 0.00% 14581 14513 0.47%
16 bsatn 8357 8357 0.00% 9683 9717 -0.35%

callgrind: update bulk

db schema indices count preload total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str unique_0 1024 1024 71503262 72071065 -0.79% 73169820 73908125 -1.00%
stdb_raw u32_u64_str unique_0 64 128 4147562 4183139 -0.85% 4248708 4294189 -1.06%
sqlite u32_u64_str unique_0 1024 1024 1809726 1809726 0.00% 1818670 1818596 0.00%
sqlite u32_u64_str unique_0 64 128 132627 132627 0.00% 135811 135701 0.08%

@bfops bfops added performance release-any To be landed in any release window labels Mar 15, 2024
Collaborator

@joshua-spacetime joshua-spacetime left a comment

full-scan               time:   [59.269 ms 59.444 ms 59.639 ms]
                        change: [-28.386% -28.129% -27.845%] (p = 0.00 < 0.05)
                        Performance has improved.

full-join               time:   [210.61 µs 210.80 µs 211.01 µs]
                        change: [-13.686% -13.497% -13.312%] (p = 0.00 < 0.05)
                        Performance has improved.

@Centril
Contributor

Centril commented Mar 16, 2024

I'll review this on Monday :)


@Centril Centril left a comment

🚀

@gefjon gefjon added this pull request to the merge queue Mar 18, 2024
Merged via the queue into master with commit 02aeac7 Mar 18, 2024
9 checks passed
@Centril Centril deleted the phoebe/layout-absolute-offset branch March 20, 2024 11:08