Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track variable sized memory manager allocations through the buffer manager #3564

Merged
merged 1 commit into from
Jun 5, 2024

Conversation

benjaminwinger
Copy link
Collaborator

To follow up the missing part of #3243, this adds BufferManager tracking for variable-sized memory allocations made through the MemoryManager.

I also noticed that the buffer manager wasn't freeing the memory related to evictions if it fails to allocate the total amount of memory requested when claiming a frame. I don't know how much of our code at the moment can safely recover from failing to claim memory and continue allocations though, so it may not make a difference at the moment. I had noticed subsequent failures which seemed to be related to data not being successfully written in the multi copy test that was failing due to not having enough memory in #3557 (e.g. accessing out of bounds in a hash index disk array, which I think would have been in bounds if the data had been updated correctly), so at the moment it may not always be possible to continue using the database after such a failure.

Copy link
Contributor

@ray6080 ray6080 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Can we add some tests over this? We should add a new call function memory_usage(), which returns currently used memory in bm.

@benjaminwinger
Copy link
Collaborator Author

For the tests, do we just want to make sure it's not increasing after queries? In the e2e tests, is there a way of storing and comparing variables? Otherwise it seems like it would be easier to write a C++ test that could check both during and after the memory is allocated, but that wouldn't need a call function (we could have both I guess).

Page caching will also be an issue with that, as unless we also add a way to drop those caches we won't be able to guarantee that the memory usage won't go up or down after a query if new pages are cached, or if old pages are dropped to make space.

@ray6080
Copy link
Contributor

ray6080 commented Jun 3, 2024

For the tests, do we just want to make sure it's not increasing after queries? In the e2e tests, is there a way of storing and comparing variables? Otherwise it seems like it would be easier to write a C++ test that could check both during and after the memory is allocated, but that wouldn't need a call function (we could have both I guess).

Page caching will also be an issue with that, as unless we also add a way to drop those caches we won't be able to guarantee that the memory usage won't go up or down after a query if new pages are cached, or if old pages are dropped to make space.

Yeah, I think we just want to make sure that the mem usage is not going up after re-run same small queries. The case could be we record the mem_usage() after we open the database, then run a small query should bump the mem_usage() a bit higher, but if we run the small query again, the mem_usage should not go up assuming the tables the query touched is small to be fully cached already, and MM always releases its block buffers immediately after it's unpinned.

There isn't a way of storing variables now in testing framework, might be easier to do through CPP instead of Cypher.

@benjaminwinger
Copy link
Collaborator Author

I've added a simple test.

Copy link

github-actions bot commented Jun 4, 2024

Benchmark Result

Master commit hash: 35eee929fc96706a163dc03b4f7706978d78ccb8
Branch commit hash: dcfb07b85cd708a0715b19d79e92982e615c7e9d

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 650.08 652.25 -2.16 (-0.33%)
aggregation q28 13892.40 13072.70 819.70 (6.27%)
filter q14 131.82 132.51 -0.69 (-0.52%)
filter q15 143.12 137.05 6.07 (4.43%)
filter q16 309.60 310.13 -0.53 (-0.17%)
filter q17 449.36 450.91 -1.56 (-0.35%)
filter q18 1915.61 1953.11 -37.51 (-1.92%)
fixed_size_expr_evaluator q07 571.91 571.74 0.17 (0.03%)
fixed_size_expr_evaluator q08 801.23 799.51 1.73 (0.22%)
fixed_size_expr_evaluator q09 797.58 796.27 1.32 (0.17%)
fixed_size_expr_evaluator q10 253.15 247.65 5.50 (2.22%)
fixed_size_expr_evaluator q11 246.25 242.89 3.36 (1.38%)
fixed_size_expr_evaluator q12 242.26 241.68 0.57 (0.24%)
fixed_size_expr_evaluator q13 1480.11 1482.80 -2.69 (-0.18%)
fixed_size_seq_scan q23 123.42 123.69 -0.27 (-0.22%)
join q29 664.25 704.94 -40.69 (-5.77%)
join q30 1359.06 1395.87 -36.82 (-2.64%)
join q31 48.98 38.77 10.22 (26.36%)
ldbc_snb_ic q35 3139.28 3223.53 -84.25 (-2.61%)
ldbc_snb_ic q36 121.13 131.44 -10.31 (-7.84%)
ldbc_snb_is q32 13.01 11.62 1.40 (12.02%)
ldbc_snb_is q33 94.70 98.06 -3.35 (-3.42%)
ldbc_snb_is q34 101.66 100.23 1.43 (1.42%)
order_by q25 134.92 132.79 2.13 (1.61%)
order_by q26 437.71 446.48 -8.76 (-1.96%)
order_by q27 1411.08 1423.13 -12.04 (-0.85%)
scan_after_filter q01 176.33 174.51 1.82 (1.05%)
scan_after_filter q02 156.46 155.28 1.19 (0.77%)
shortest_path_ldbc100 q39 55.28 55.90 -0.62 (-1.10%)
var_size_expr_evaluator q03 2028.22 2049.14 -20.92 (-1.02%)
var_size_expr_evaluator q04 2216.08 2273.39 -57.31 (-2.52%)
var_size_expr_evaluator q05 2646.25 2545.55 100.71 (3.96%)
var_size_expr_evaluator q06 1366.59 1394.92 -28.33 (-2.03%)
var_size_seq_scan q19 1448.08 1468.85 -20.77 (-1.41%)
var_size_seq_scan q20 3008.57 3045.50 -36.94 (-1.21%)
var_size_seq_scan q21 2373.73 2375.60 -1.86 (-0.08%)
var_size_seq_scan q22 128.92 131.82 -2.91 (-2.20%)

@benjaminwinger benjaminwinger merged commit f0fe8cb into master Jun 5, 2024
19 checks passed
@benjaminwinger benjaminwinger deleted the track-variable-sized-allocs branch June 5, 2024 20:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants