Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pageserver: coalesce read paths #7477

Merged
merged 9 commits into from
Apr 25, 2024
Merged

Conversation

VladLazar
Copy link
Contributor

@VladLazar VladLazar commented Apr 23, 2024

Problem

We are currently supporting two read paths. No bueno.

Summary of changes

High level: use vectored read path to serve get page requests - gated by get_impl config
Low level:

  1. Add ps config, get_impl to specify which read path to use when serving get page requests
  2. Fix base cached image handling for the vectored read path. This was subtly broken: previously we
    would not mark keys that went past their cached lsn as complete. This is a self standing change which
    could be its own PR, but I've included it here because writing separate tests for it is tricky.
  3. Fork get page to use either the legacy or vectored implementation
  4. Validate the use of vectored read path when serving get page requests against the legacy implementation.
    Controlled by validate_vectored_get ps config.
  5. Use the vectored read path to serve get page requests in tests (with validation).

Note

Since the vectored read path does not go through the page cache to read buffers, this change also
amounts to a removal of the buffer page cache. Materialized page cache is still used.

Deployment Plan

  1. Make sure this is correct.
    1.1. Run test suite with validation - DONE
    1.2. Run a prodlike bench with validation in staging

  2. Understand perf degradation (if any)
    2.1. Run checked in benchmarks - DONE here - looks ok, didn't notice significant degradations
    2.2. I'll manually run test_pageserver_max_throughput_getpage_at_latest_lsn - DONE here - looks better than the current read path
    2.2. Run prodlike bench without validation in staging

  3. Staged rollout in prod without validation

Related #7381

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@VladLazar VladLazar force-pushed the vlad/coalesce-read-paths-test branch from d52eb01 to 0edc368 Compare April 23, 2024 11:51
Copy link

github-actions bot commented Apr 23, 2024

2766 tests run: 2645 passed, 0 failed, 121 skipped (full report)


Code coverage* (full report)

  • functions: 28.1% (6486 of 23064 functions)
  • lines: 47.0% (46102 of 98113 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
fed8579 at 2024-04-24T15:30:19.671Z :recycle:

@VladLazar VladLazar force-pushed the vlad/coalesce-read-paths-test branch 3 times, most recently from 064d1c8 to 1826b13 Compare April 23, 2024 17:45
@VladLazar VladLazar changed the title [DNM] coalescing read paths testing pageserver: coalesce read paths Apr 23, 2024
@VladLazar VladLazar added run-benchmarks Indicates to the CI that benchmarks should be run for PR marked with this label and removed run-benchmarks Indicates to the CI that benchmarks should be run for PR marked with this label labels Apr 24, 2024
@VladLazar VladLazar force-pushed the vlad/coalesce-read-paths-test branch from ff94f16 to fed8579 Compare April 24, 2024 14:44
@VladLazar VladLazar marked this pull request as ready for review April 24, 2024 17:04
@VladLazar VladLazar requested a review from a team as a code owner April 24, 2024 17:04
@VladLazar VladLazar requested review from skyzh and jcsp April 24, 2024 17:04
Copy link
Member

@skyzh skyzh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@VladLazar
Copy link
Contributor Author

I reran test_pageserver_max_throughput_getpage_at_latest_lsn to compare the implementations:
Interestingly when vectored get is enabled, the benchmark achieves higher throughput and lower mean latencies, but higher tail latencies. I think this can be explained by the fact that we are not using the buffer cache in the vectored implementation. What's surprising is that the vectored implementation performs well even with a cold linux page cache.

Non vectored get - baseline

[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.n_tenants: 1
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pgbench_scale: 6
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.duration: 30 s
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.page_cache_size: 134217728 byte
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.max_file_descriptors: 500000
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.request_count: 114762
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_mean: 0.260 ms
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p95: 0.416 ms
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99: 0.450 ms
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.9: 0.536 ms
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.99: 1.018 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.n_tenants: 1
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pgbench_scale: 13
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.duration: 30 s
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.page_cache_size: 134217728 byte
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.max_file_descriptors: 500000
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.request_count: 108648
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_mean: 0.275 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p95: 0.376 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99: 0.409 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.9: 0.510 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.99: 0.979 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.n_tenants: 10
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pgbench_scale: 6
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.duration: 30 s
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.page_cache_size: 134217728 byte
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.max_file_descriptors: 500000
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.request_count: 425238
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_mean: 0.704 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p95: 1.162 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99: 1.404 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.9: 1.869 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.99: 3.939 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.n_tenants: 10
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pgbench_scale: 13
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.duration: 30 s
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.page_cache_size: 134217728 byte
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.max_file_descriptors: 500000
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.request_count: 473836
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_mean: 0.632 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p95: 0.939 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99: 1.128 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.9: 1.542 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.99: 2.875 ms

Vectored get - cold page page cache

[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.n_tenants: 1
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pgbench_scale: 13
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.duration: 30 s
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.page_cache_size: 134217728 byte
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.max_file_descriptors: 500000
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.request_count: 136567
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_mean: 0.218 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p95: 0.516 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99: 0.747 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.9: 1.307 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.99: 2.443 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.n_tenants: 10
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pgbench_scale: 6
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.duration: 30 s
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.page_cache_size: 134217728 byte
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.max_file_descriptors: 500000
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.request_count: 504282
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_mean: 0.594 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p95: 1.042 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99: 1.544 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.9: 2.751 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.99: 5.391 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.n_tenants: 10
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pgbench_scale: 13
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.duration: 30 s
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.page_cache_size: 134217728 byte
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.max_file_descriptors: 500000
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.request_count: 498062
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_mean: 0.601 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p95: 1.005 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99: 1.394 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.9: 2.271 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.99: 4.207 ms

Vectored get - hot page cache

[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.n_tenants: 1
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pgbench_scale: 6
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.duration: 30 s
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.page_cache_size: 134217728 byte
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.max_file_descriptors: 500000
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.request_count: 360067
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_mean: 0.082 ms
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p95: 0.130 ms
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99: 0.425 ms
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.9: 0.492 ms
[release-pg15-1-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.99: 0.921 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.n_tenants: 1
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pgbench_scale: 13
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.duration: 30 s
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.page_cache_size: 134217728 byte
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.max_file_descriptors: 500000
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.request_count: 156694
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_mean: 0.190 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p95: 0.384 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99: 0.421 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.9: 0.479 ms
[release-pg15-1-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.99: 0.911 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.n_tenants: 10
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pgbench_scale: 6
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.duration: 30 s
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.page_cache_size: 134217728 byte
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.max_file_descriptors: 500000
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.request_count: 547575
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_mean: 0.547 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p95: 0.897 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99: 1.101 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.9: 1.605 ms
[release-pg15-10-6-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.99: 5.243 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.n_tenants: 10
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pgbench_scale: 13
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.duration: 30 s
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.page_cache_size: 134217728 byte
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.pageserver_config_override.max_file_descriptors: 500000
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.request_count: 575608
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_mean: 0.520 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p95: 0.782 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99: 0.966 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.9: 1.458 ms
[release-pg15-10-13-30].pageserver_max_throughput_getpage_at_latest_lsn.latency_percentiles.p99.99: 2.963 ms

@VladLazar
Copy link
Contributor Author

At this point I think we should merge and try it out in staging.

@VladLazar VladLazar merged commit e4a279d into main Apr 25, 2024
57 checks passed
@VladLazar VladLazar deleted the vlad/coalesce-read-paths-test branch April 25, 2024 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants