test_runner/performance: skip test_bulk_insert #7238

bayandin · 2024-03-25T23:05:48Z

Problem

test_bulk_insert has become too slow and now fails consistently: #7124

Summary of changes

Skip test_bulk_insert until the slowdown tracked in #7124 is fixed.
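
The diff itself is not shown in this conversation, so the following is only a minimal sketch of how a pytest test is typically skipped. It assumes the test is a plain pytest function. The reason string and the placeholder body are illustrative, not the actual change.

```python
import pytest


# Assumption: the real test lives under test_runner/performance and takes
# fixtures; this stub only illustrates the skip marker itself.
@pytest.mark.skip(reason="test_bulk_insert is too slow and fails consistently, see #7124")
def test_bulk_insert():
    # Placeholder body; the real benchmark logic is omitted here.
    ...
```

With a marker like this, pytest reports the test as skipped together with the reason, so the link to #7124 stays visible in test reports until the marker is removed.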

Checklist before requesting a review

I have performed a self-review of my code.
If it is a core feature, I have added thorough tests.
Do we need to implement analytics? If so, did you add the relevant metrics to the dashboard?
If this PR requires public announcement, mark it with the /release-notes label and add several sentences in this section.

Checklist before merging

Do not forget to reformat commit message to not include the above checklist

github-actions · 2024-03-25T23:55:44Z

2718 tests run: 2581 passed, 0 failed, 137 skipped (full report)

Code coverage* (full report)

functions: 28.1% (6285 of 22345 functions)
lines: 47.0% (44210 of 94093 lines)

* collected from Rust tests only

The comment gets automatically updated with the latest test results.
28470cf at 2024-03-25T23:55:43.882Z

@jcsp

part of #7124

# Problem

(Re-stating the problem from #7124 for posterity)

The `test_bulk_ingest` benchmark shows about 2x lower throughput with `tokio-epoll-uring` compared to `std-fs`. That's why we temporarily disabled it in #7238.

The reason for this regression is that the benchmark runs on a system without memory pressure, so std-fs writes don't block on disk IO but only copy the data into the kernel page cache. `tokio-epoll-uring` cannot beat that at this time, and possibly never will. (However, under memory pressure, std-fs would stall the executor thread on kernel page cache writeback disk IO. That's why we want to use `tokio-epoll-uring`. And we likely want to use O_DIRECT in the future, at which point std-fs becomes an absolute show-stopper.)

More elaborate analysis: https://neondatabase.notion.site/Why-test_bulk_ingest-is-slower-with-tokio-epoll-uring-918c5e619df045a7bd7b5f806cfbd53f?pvs=4

# Changes

This PR increases the buffer size of `blob_io` and `EphemeralFile` from PAGE_SZ=8k to 64k.

Longer-term, we probably want to do double-buffering / pipelined IO.

# Resource Usage

We currently do not flush the buffer when freezing the InMemoryLayer. That means a single Timeline can have multiple 64k buffers alive, especially if flushing is slow. This poses an OOM risk.

We should either bound the number of frozen layers (#7317) or change the freezing code to flush the buffer and drop the allocation. However, that's future work.

# Performance

(Measurements done on i3en.3xlarge.)

The `test_bulk_insert.py` is too noisy, even with instance storage. It varies by 30-40%. I suspect that's due to compaction. Raising the amount of data by 10x doesn't help with the noisiness.

So, I used the `bench_ingest` from @jcsp's #7409. Specifically, the `ingest-small-values/ingest 128MB/100b seq` and `ingest-small-values/ingest 128MB/100b seq, no delta` benchmarks.

| Buffer size | IO engine         | seq (MiB/s) | seq, no delta (MiB/s) |
|-------------|-------------------|-------------|-----------------------|
| 8k          | std-fs            | 55          | 165                   |
| 8k          | tokio-epoll-uring | 37          | 107                   |
| 64k         | std-fs            | 55          | 180                   |
| 64k         | tokio-epoll-uring | 48          | 164                   |

The `8k` rows are from before this PR, the `64k` rows are with this PR. The values are the throughput reported by the benchmark (MiB/s).

We see that this PR gets `tokio-epoll-uring` from 67% to 87% of `std-fs` performance in the `seq` benchmark. Notably, `seq` appears to hit some other bottleneck at `55 MiB/s`. CC'ing #7418 due to the apparent bottlenecks in writing delta layers.

For `seq, no delta`, this PR gets `tokio-epoll-uring` from 64% to 91% of `std-fs` performance.
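
To make the effect of the larger buffer concrete, here is a toy sketch (not the actual Rust `blob_io` / `EphemeralFile` code) that counts how many times a buffered writer has to hand its buffer to the underlying sink when ingesting small values. The class name `CountingBufferedWriter`, the helper `count_flushes`, and the 1 MiB workload are all assumptions made for illustration. Only the 8k and 64k buffer sizes and the 100-byte value size come from the description above.

```python
import io


class CountingBufferedWriter:
    """Toy buffered writer: accumulates bytes and flushes once the buffer is full."""

    def __init__(self, sink, buffer_size):
        self.sink = sink
        self.buffer_size = buffer_size
        self.buf = bytearray()
        self.flushes = 0  # number of writes issued to the underlying sink

    def write(self, data):
        view = memoryview(data)
        while len(view) > 0:
            # Copy as much as still fits into the buffer.
            room = self.buffer_size - len(self.buf)
            chunk = view[:room]
            self.buf += chunk
            view = view[len(chunk):]
            if len(self.buf) == self.buffer_size:
                self.flush()

    def flush(self):
        # Hand the buffered bytes to the sink in a single write.
        if self.buf:
            self.sink.write(self.buf)
            self.flushes += 1
            self.buf.clear()


def count_flushes(total_bytes, value_size, buffer_size):
    """Write total_bytes // value_size values and return (flush count, bytes written)."""
    sink = io.BytesIO()
    writer = CountingBufferedWriter(sink, buffer_size)
    value = b"x" * value_size
    for _ in range(total_bytes // value_size):
        writer.write(value)
    writer.flush()  # flush the final, partially filled buffer
    return writer.flushes, len(sink.getvalue())


if __name__ == "__main__":
    for size in (8 * 1024, 64 * 1024):
        flushes, written = count_flushes(1024 * 1024, 100, size)
        print(f"buffer={size} bytes: {flushes} flushes for {written} bytes")
```

In this model, an 8x larger buffer means roughly 8x fewer writes to the sink for the same data. That matters when each write carries a fixed submission overhead, as it does for an IO engine that cannot simply copy into the page cache. The trade-off is the one noted under Resource Usage: every live buffer now holds 64k instead of 8k.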

test_runner/performance: skip test_bulk_insert

28470cf

bayandin requested a review from problame March 25, 2024 23:05

problame approved these changes Mar 26, 2024

problame mentioned this pull request Mar 26, 2024

test_bulk_insert / walingest generally is slower with tokio-epoll-uring #7124

Closed

2 tasks

bayandin merged commit 3426619 into main Mar 26, 2024
53 checks passed

bayandin deleted the bayandin/skip-test_bulk_insert branch March 26, 2024 15:10

problame mentioned this pull request Mar 28, 2024

perf(walingest): mitigate bulk ingest throughput regression through larger EphemeralFile in-memory buffer #7273

Closed

5 tasks

problame mentioned this pull request Apr 23, 2024

perf!: use larger buffers for blob_io and ephemeral_file #7485

Merged
