Adds a new spine that does compaction in a thread. #1863

Merged
merged 1 commit into from
Jun 20, 2024
gz
Collaborator

@gz gz commented Jun 11, 2024

  • The idea is to have one thread per spine. Later we may relax this to n background compactor threads that load-balance work across multiple spines (this is, e.g., what RocksDB does).
  • The thread just does the compaction on behalf of the spine and sends the results back. For now it follows a simple FIFO order and compacts one by one. Future improvements could add n-way merges, and round-robin or other strategies to merge multiple things at the same time.
  • Having compaction in a different thread meant slightly changing how the backend metadata about read-only files is kept. Before, it was strictly per-thread and stored by the thread that created the files. Now it needs to be shareable between the compactor and the threads that hold spines, so this metadata lives in a concurrent hashmap.
  • There was also an issue with the lockfile on the storage dir not being dropped in tests. This was resolved by moving the StorageLocation to the RuntimeHandle.
  • Had to make some small changes to tests: some asserts were too strong and assumed behavior of the old spine that wasn't necessary.
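
A minimal sketch of the one-thread-per-spine FIFO scheme described above, assuming a toy batch type (a sorted vector of key/weight pairs) and plain `std::sync::mpsc` channels. `Batch`, `merge`, and `Compactor` are hypothetical names for illustration, not the types in `spine_async/merger.rs`:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a batch: (key, weight) pairs sorted by key.
type Batch = Vec<(u64, i64)>;

/// Merges two sorted batches, summing the weights of equal keys and
/// dropping entries whose weights cancel to zero.
fn merge(a: &Batch, b: &Batch) -> Batch {
    let mut out = Vec::with_capacity(a.len() + b.len());
    let (mut i, mut j) = (0, 0);
    while i < a.len() && j < b.len() {
        if a[i].0 < b[j].0 {
            out.push(a[i]);
            i += 1;
        } else if a[i].0 > b[j].0 {
            out.push(b[j]);
            j += 1;
        } else {
            let w = a[i].1 + b[j].1;
            if w != 0 {
                out.push((a[i].0, w));
            }
            i += 1;
            j += 1;
        }
    }
    out.extend_from_slice(&a[i..]);
    out.extend_from_slice(&b[j..]);
    out
}

/// Handle owned by the spine: merge requests go out, merged batches come back.
struct Compactor {
    requests: Option<Sender<(Batch, Batch)>>,
    results: Receiver<Batch>,
    worker: Option<JoinHandle<()>>,
}

impl Compactor {
    fn spawn() -> Self {
        let (req_tx, req_rx) = channel::<(Batch, Batch)>();
        let (res_tx, res_rx) = channel::<Batch>();
        let worker = thread::spawn(move || {
            // FIFO: requests are handled strictly in arrival order, one at a time.
            for (a, b) in req_rx {
                if res_tx.send(merge(&a, &b)).is_err() {
                    break; // the spine dropped its receiving end
                }
            }
        });
        Compactor { requests: Some(req_tx), results: res_rx, worker: Some(worker) }
    }

    /// Queues a merge of two batches on the background thread.
    fn submit(&self, a: Batch, b: Batch) {
        self.requests
            .as_ref()
            .expect("compactor is running")
            .send((a, b))
            .expect("compactor thread is alive");
    }

    /// Blocks until the next merged batch is available.
    fn wait_result(&self) -> Batch {
        self.results.recv().expect("compactor thread is alive")
    }
}

impl Drop for Compactor {
    fn drop(&mut self) {
        // Closing the request channel ends the worker's loop; then join it.
        self.requests.take();
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}
```

A spine would call `submit` when two batches are ready to be merged and pick up the result later with `wait_result`; because there is a single queue and a single thread, results come back in the same order the requests were sent.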

@gz requested a review from blp on June 11, 2024 06:52
@gz force-pushed the async-merge branch 4 times, most recently from 9f9efaa to 7892010, on June 12, 2024 16:44
@gz changed the title from "Adds a new spine that does compacts in a thread." to "Adds a new spine that does compaction in a thread." on Jun 12, 2024

Benchmark results

Nexmark

  • 1 out of 21 queries have regressed ❗
  • Compared results from 23eb7d2 (main) with 7892010 (PR)
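│ name │ main [kOp/s] │ PR [kOp/s] │ Tput change [%] │ Assessment │ Peak RSS diff │
├──────┼──────────────┼────────────┼─────────────────┼────────────┼───────────────┤
│ Q0   │ 7500.9       │ 7874.11    │ 5               │ ✔️         │ 15.7 MB       │
│ Q1   │ 5949.06      │ 6023.69    │ 1               │ ✔️         │ 39.3 MB       │
│ Q2   │ 7551.22      │ 7941.39    │ 5               │ 🌲         │ -64.9 MB      │
│ Q3   │ 7594.42      │ 7076.16    │ -7              │ 🔻         │ 15.1 MB       │
│ Q4   │ 3086.23      │ 4044.67    │ 31              │ 🌲         │ -2.5 GB       │
│ Q5   │ 7169.56      │ 7079.89    │ -1              │ ✔️         │ -2.5 GB       │
│ Q6   │ 2890.47      │ 3876.3     │ 34              │ 🌲         │ -3.6 GB       │
│ Q7   │ 2326.24      │ 2213.08    │ -5              │ ✔️         │ -3.7 GB       │
│ Q8   │ 7575.75      │ 7353.06    │ -3              │ ✔️         │ -3.7 GB       │
│ Q9   │ 448.593      │ 593.633    │ 32              │ 🌲         │ -3.9 GB       │
│ Q12  │ 7143.48      │ 6986.96    │ -2              │ ✔️         │ -3.9 GB       │
│ Q13  │ 2734.95      │ 4061.92    │ 49              │ 🌲         │ -3.9 GB       │
│ Q14  │ 7510.28      │ 7297.41    │ -3              │ ✔️         │ -3.9 GB       │
│ Q15  │ 3646.44      │ 3553.4     │ -3              │ ✔️         │ -3.9 GB       │
│ Q16  │ 828.606      │ 1082.05    │ 31              │ 🌲         │ -3.9 GB       │
│ Q17  │ 1993.92      │ 2559.88    │ 28              │ 🌲         │ -3.9 GB       │
│ Q18  │ 702.525      │ 952.903    │ 36              │ 🌲         │ -9.4 GB       │
│ Q19  │ 634.268      │ 767.434    │ 21              │ 🌲         │ -9.4 GB       │
│ Q20  │ 678.023      │ 1054.98    │ 56              │ 🌲         │ -9.4 GB       │
│ Q21  │ 7332.77      │ 7477.42    │ 2               │ ✔️         │ -9.4 GB       │
│ Q22  │ 7349.03      │ 7138.49    │ -3              │ ✔️         │ -9.4 GB       │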

Galen

  • 1 out of 1 benchmarks have regressed ❗
  • Compared results from 23eb7d2 (main) with 7892010 (PR)
│ name  │ main [s] │ PR [s]  │ Runtime change [%] │ Assessment │
├───────┼──────────┼─────────┼────────────────────┼────────────┤
│ galen │ 32.2619  │ 29.3539 │ -9                 │ 🔻         │
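
The change columns in both tables appear consistent with a plain relative difference against main, rounded to the nearest integer. A small sketch of that arithmetic follows; the function name is made up here, and the thresholds behind the Assessment column are not shown in this thread, so they are left out:

```rust
/// Relative change from `main` to `pr` in percent, rounded to the nearest integer.
/// Hypothetical helper; the benchmark bot's own code is not part of this thread.
fn change_percent(main: f64, pr: f64) -> i64 {
    ((pr - main) / main * 100.0).round() as i64
}
```

For example, Q4 gives (4044.67 - 3086.23) / 3086.23 ≈ 0.31, i.e. 31%. For Galen the measured quantity is a runtime, so the -9% means the PR run took less time than main.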

@gz
Collaborator Author

gz commented Jun 14, 2024

The benchmark comment above suggests that this improves performance for DRAM.

I'll add/run the results with storage enabled too.

@blp
Member

blp commented Jun 14, 2024

I'm reading the code this morning.

Member

@blp blp left a comment

I read the code and I like how much simpler it is than the fueled spine. I think that the assertion failure we're having in the fueled spine code is going to be hard to confidently fix.

This is going to create a lot of threads in cases where we have a lot of spines. I suspect we'll have to do something about that, but this is a fine place to start.

Thank you!

crates/dbsp/src/trace/spine_async/merger.rs (5 review comments, outdated and resolved)
@gz
Collaborator Author

gz commented Jun 15, 2024

some performance results with storage:

fueled spine

│ Query │ #Events   │ Cores │ Elapsed  │ Cores * + │ Throughp+ │ Total Us+ │ Total S+ │ Current+ │ Peak RSS │ Current+ │ Peak Co+ │ Page Fa+ │ # Files │ # Writes │ # Reads │ Avg WrSz │ Avg RdSz │ Writes   │ Reads    │ Cache H+ │ Cache M+ │ Compact+ │
├───────┼───────────┼───────┼──────────┼───────────┼───────────┼───────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼─────────┼──────────┼─────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ Q0    │ 10,000,0+ │ 16    │ 1.583s   │ 25.321s   │ 394.924 + │ 17.005s   │ 633.000+ │ 1.17 GiB │ 716.00 + │ 1.17 GiB │ 1.17 GiB │ 0        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q1    │ 10,000,0+ │ 16    │ 1.293s   │ 20.684s   │ 483.474 + │ 19.893s   │ 181.000+ │ 11.31 M+ │ 957.00 + │ 11.31 M+ │ 11.32 M+ │ 2        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q2    │ 10,000,0+ │ 16    │ 1.403s   │ 22.443s   │ 445.581 + │ 17.007s   │ 235.000+ │ 4.63 MiB │ 1.04 GiB │ 4.63 MiB │ 4.64 MiB │ 0        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q3    │ 10,000,0+ │ 16    │ 18.738s  │ 299.807s  │ 33.355 K+ │ 30.131s   │ 7.536s   │ 32.37 M+ │ 1.10 GiB │ 32.37 M+ │ 32.44 M+ │ 5        │ 15859   │ 15817    │ 0       │ 14.60 K+ │ 0 B      │ 225.54 + │ 0 B      │ 1740832  │ 0        │ 7827     │
│ Q4    │ 10,000,0+ │ 16    │ 34.794s  │ 556.698s  │ 17.963 K+ │ 111.813s  │ 75.080s  │ 771.75 + │ 1.62 GiB │ 771.75 + │ 771.89 + │ 0        │ 28773   │ 29554    │ 0       │ 110.97 + │ 0 B      │ 3.13 GiB │ 0 B      │ 31511872 │ 0        │ 17608    │
│ Q5    │ 10,000,0+ │ 16    │ 23.057s  │ 368.911s  │ 27.107 K+ │ 27.552s   │ 14.991s  │ 16.00 E+ │ 1.64 GiB │ 16.00 E+ │ 76.00 K+ │ 0        │ 28000   │ 27984    │ 0       │ 9.30 KiB │ 0 B      │ 254.10 + │ 0 B      │ 258449   │ 0        │ 3904     │
│ Q6    │ 10,000,0+ │ 16    │ 41.139s  │ 658.220s  │ 15.192 K+ │ 122.151s  │ 82.347s  │ 320.00 + │ 1.86 GiB │ 320.00 + │ 460.00 + │ 0        │ 31701   │ 32462    │ 0       │ 106.13 + │ 0 B      │ 3.29 GiB │ 0 B      │ 33878394 │ 0        │ 23453    │
│ Q7    │ 10,000,0+ │ 16    │ 22.670s  │ 362.715s  │ 27.570 K+ │ 48.027s   │ 32.302s  │ 1.00 GiB │ 2.56 GiB │ 1.00 GiB │ 1.00 GiB │ 0        │ 23984   │ 29184    │ 0       │ 282.63 + │ 0 B      │ 7.87 GiB │ 0 B      │ 1221426  │ 0        │ 3904     │
│ Q8    │ 10,000,0+ │ 16    │ 29.501s  │ 472.012s  │ 21.186 K+ │ 27.823s   │ 15.229s  │ 16.00 E+ │ 2.56 GiB │ 16.00 E+ │ 72.00 K+ │ 0        │ 31904   │ 31876    │ 0       │ 9.31 KiB │ 0 B      │ 289.75 + │ 0 B      │ 447156   │ 0        │ 7808     │
│ Q9    │ 10,000,0+ │ 16    │ 39.615s  │ 633.841s  │ 15.777 K+ │ 172.757s  │ 104.421s │ 2.00 GiB │ 4.41 GiB │ 2.00 GiB │ 2.00 GiB │ 0        │ 23766   │ 36770    │ 140455  │ 603.96 + │ 8.36 KiB │ 21.18 G+ │ 1.12 GiB │ 43490386 │ 140455   │ 15621    │
│ Q12   │ 10,000,0+ │ 16    │ 9.480s   │ 151.676s  │ 65.930 K+ │ 37.575s   │ 12.978s  │ 16.00 E+ │ 4.41 GiB │ 16.00 E+ │ 76.00 K+ │ 0        │ 7935    │ 7926     │ 0       │ 17.07 K+ │ 0 B      │ 132.11 + │ 0 B      │ 4924492  │ 0        │ 7834     │
│ Q13   │ 10,000,0+ │ 16    │ 16.870s  │ 269.916s  │ 37.049 K+ │ 78.633s   │ 46.438s  │ 16.00 E+ │ 4.42 GiB │ 16.00 E+ │ 76.00 K+ │ 0        │ 11942   │ 13402    │ 0       │ 252.35 + │ 0 B      │ 3.23 GiB │ 0 B      │ 25176917 │ 0        │ 3910     │
│ Q14   │ 10,000,0+ │ 16    │ 1.362s   │ 21.794s   │ 458.851 + │ 17.732s   │ 62.000ms │ 16.00 E+ │ 4.42 GiB │ 16.00 E+ │ 76.00 K+ │ 0        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q15   │ 10,000,0+ │ 16    │ 336.229s │ 5379.661s │ 1.859 K/s │ 99.950s   │ 149.748s │ 16.00 E+ │ 4.42 GiB │ 16.00 E+ │ 76.00 K+ │ 0        │ 293727  │ 293617   │ 0       │ 7.12 KiB │ 0 B      │ 2.00 GiB │ 0 B      │ 19567987 │ 0        │ 47723    │
│ Q16   │ 10,000,0+ │ 16    │ 661.049s │ 10576.78+ │ 0.945 K/s │ 578.757s  │ 394.532s │ 16.00 E+ │ 4.43 GiB │ 16.00 E+ │ 72.00 K+ │ 0        │ 529903  │ 529398   │ 0       │ 19.60 K+ │ 0 B      │ 9.90 GiB │ 0 B      │ 2299174+ │ 0        │ 309499   │
│ Q17   │ 10,000,0+ │ 16    │ 206.289s │ 3300.621s │ 3.030 K/s │ 620.605s  │ 773.481s │ 16.00 E+ │ 4.58 GiB │ 16.00 E+ │ 72.00 K+ │ 0        │ 221721  │ 221750   │ 0       │ 39.12 K+ │ 0 B      │ 8.27 GiB │ 0 B      │ 2540660+ │ 0        │ 136580   │
│ Q18   │ 10,000,0+ │ 16    │ 37.308s  │ 596.921s  │ 16.753 K+ │ 219.290s  │ 183.617s │ 1.50 GiB │ 5.78 GiB │ 1.50 GiB │ 1.50 GiB │ 0        │ 7918    │ 18885    │ 573     │ 806.77 + │ 11.49 K+ │ 14.53 G+ │ 6.43 MiB │ 88982112 │ 573      │ 7806     │
│ Q19   │ 10,000,0+ │ 16    │ 19.746s  │ 315.938s  │ 31.652 K+ │ 100.878s  │ 51.193s  │ 16.00 E+ │ 5.99 GiB │ 16.00 E+ │ 76.00 K+ │ 0        │ 7929    │ 19024    │ 534     │ 809.78 + │ 11.62 K+ │ 14.69 G+ │ 6.06 MiB │ 19572484 │ 534      │ 7815     │
│ Q20   │ 10,000,0+ │ 16    │ 27.281s  │ 436.499s  │ 22.910 K+ │ 96.672s   │ 44.742s  │ 16.00 E+ │ 6.14 GiB │ 16.00 E+ │ 76.00 K+ │ 0        │ 15848   │ 26460    │ 579     │ 583.66 + │ 13.29 K+ │ 14.73 G+ │ 7.52 MiB │ 17637787 │ 579      │ 7816     │
│ Q21   │ 10,000,0+ │ 16    │ 1.325s   │ 21.208s   │ 471.522 + │ 19.028s   │ 0.000ns  │ 16.00 E+ │ 6.17 GiB │ 16.00 E+ │ 76.00 K+ │ 9        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q22   │ 10,000,0+ │ 16    │ 1.440s   │ 23.038s   │ 434.062 + │ 19.423s   │ 0.000ns  │ 16.00 E+ │ 6.18 GiB │ 16.00 E+ │ 72.00 K+ │ 0        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
└───────┴───────────┴───────┴──────────┴───────────┴───────────┴───────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴─────────┴──────────┴─────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┘

async spine

│ Query │ #Events   │ Cores │ Elapsed  │ Cores * + │ Throughp+ │ Total Us+ │ Total S+ │ Current+ │ Peak RSS │ Current+ │ Peak Co+ │ Page Fa+ │ # Files │ # Writes │ # Reads │ Avg WrSz │ Avg RdSz │ Writes   │ Reads    │ Cache H+ │ Cache M+ │ Compact+ │
├───────┼───────────┼───────┼──────────┼───────────┼───────────┼───────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼─────────┼──────────┼─────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ Q0    │ 10,000,0+ │ 16    │ 1.560s   │ 24.966s   │ 400.551 + │ 16.778s   │ 636.000+ │ 1.17 GiB │ 748.00 + │ 1.17 GiB │ 1.17 GiB │ 0        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q1    │ 10,000,0+ │ 16    │ 1.329s   │ 21.268s   │ 470.193 + │ 19.952s   │ 298.000+ │ 7.75 MiB │ 949.00 + │ 7.75 MiB │ 7.76 MiB │ 5        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q2    │ 10,000,0+ │ 16    │ 1.372s   │ 21.959s   │ 455.389 + │ 16.633s   │ 238.000+ │ 4.94 MiB │ 1.04 GiB │ 4.94 MiB │ 4.95 MiB │ 1        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q3    │ 10,000,0+ │ 16    │ 6.631s   │ 106.103s  │ 94.248 K+ │ 49.738s   │ 101.102s │ 1.26 GiB │ 1.79 GiB │ 1.26 GiB │ 1.26 GiB │ 3        │ 15832   │ 15826    │ 40661   │ 40.55 K+ │ 5.05 KiB │ 626.72 + │ 200.43 + │ 13264726 │ 40661    │ 7794     │
│ Q4    │ 10,000,0+ │ 16    │ 22.515s  │ 360.247s  │ 27.759 K+ │ 153.699s  │ 1007.28+ │ 4.52 GiB │ 6.01 GiB │ 4.52 GiB │ 4.52 GiB │ 0        │ 28330   │ 29973    │ 114832  │ 148.02 + │ 6.39 KiB │ 4.23 GiB │ 716.84 + │ 50089825 │ 114832   │ 17186    │
│ Q5    │ 10,000,0+ │ 16    │ 15.075s  │ 241.195s  │ 41.460 K+ │ 32.582s   │ 28.217s  │ 1.02 MiB │ 6.55 GiB │ 1.02 MiB │ 1.59 MiB │ 0        │ 28064   │ 28288    │ 19712   │ 26.06 K+ │ 4.80 KiB │ 719.89 + │ 92.38 M+ │ 2110209  │ 19712    │ 4012     │
│ Q6    │ 10,000,0+ │ 16    │ 25.443s  │ 407.085s  │ 24.565 K+ │ 169.980s  │ 1465.59+ │ 3.75 GiB │ 10.07 G+ │ 3.75 GiB │ 3.75 GiB │ 0        │ 31276   │ 32797    │ 136735  │ 147.45 + │ 6.32 KiB │ 4.61 GiB │ 844.30 + │ 54616537 │ 136735   │ 23113    │
│ Q7    │ 10,000,0+ │ 16    │ 13.570s  │ 217.123s  │ 46.057 K+ │ 61.164s   │ 56.653s  │ 16.00 E+ │ 10.40 G+ │ 16.00 E+ │ 56.00 K+ │ 0        │ 24032   │ 33332    │ 419636  │ 383.24 + │ 7.80 KiB │ 12.18 G+ │ 3.12 GiB │ 2947467  │ 419636   │ 3999     │
│ Q8    │ 10,000,0+ │ 16    │ 14.874s  │ 237.977s  │ 42.021 K+ │ 34.029s   │ 25.443s  │ 1.43 MiB │ 10.48 G+ │ 1.43 MiB │ 1.82 MiB │ 0        │ 32048   │ 32048    │ 39435   │ 34.55 K+ │ 4.79 KiB │ 1.06 GiB │ 184.52 + │ 1917593  │ 39435    │ 7952     │
│ Q9    │ 10,000,0+ │ 16    │ 39.003s  │ 624.047s  │ 16.024 K+ │ 294.341s  │ 1940.71+ │ 4.75 GiB │ 14.42 G+ │ 4.75 GiB │ 4.75 GiB │ 0        │ 23422   │ 43377    │ 4332292 │ 676.81 + │ 8.00 KiB │ 28.00 G+ │ 33.06 G+ │ 68334873 │ 4325764  │ 15307    │
│ Q12   │ 10,000,0+ │ 16    │ 7.417s   │ 118.678s  │ 84.262 K+ │ 58.846s   │ 212.334s │ 16.00 E+ │ 14.57 G+ │ 16.00 E+ │ 68.00 K+ │ 0        │ 7717    │ 8425     │ 793772  │ 113.34 + │ 7.40 KiB │ 932.51 + │ 5.60 GiB │ 14126873 │ 793779   │ 7690     │
│ Q13   │ 10,000,0+ │ 16    │ 13.640s  │ 218.247s  │ 45.820 K+ │ 87.936s   │ 155.763s │ 16.00 E+ │ 14.74 G+ │ 16.00 E+ │ 72.00 K+ │ 0        │ 11984   │ 14853    │ 76606   │ 337.82 + │ 7.32 KiB │ 4.79 GiB │ 547.52 + │ 25886979 │ 76606    │ 3965     │
│ Q14   │ 10,000,0+ │ 16    │ 1.239s   │ 19.823s   │ 504.456 + │ 17.239s   │ 459.000+ │ 444.00 + │ 14.77 G+ │ 444.00 + │ 584.00 + │ 0        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q15   │ 10,000,0+ │ 16    │ 161.079s │ 2577.267s │ 3.880 K/s │ 100.348s  │ 121.810s │ 2.25 GiB │ 16.64 G+ │ 2.25 GiB │ 2.25 GiB │ 0        │ 294182  │ 294182   │ 161006  │ 17.74 K+ │ 4.95 KiB │ 4.98 GiB │ 778.76 + │ 10261080 │ 161006   │ 48178    │
│ Q16   │ 10,000,0+ │ 16    │ 195.022s │ 3120.348s │ 3.205 K/s │ 910.725s  │ 3590.75+ │ 27.25 G+ │ 34.41 G+ │ 27.25 G+ │ 27.25 G+ │ 0        │ 533445  │ 534151   │ 2241293 │ 42.78 K+ │ 6.18 KiB │ 21.79 G+ │ 13.22 G+ │ 3299019+ │ 2241293  │ 313763   │
│ Q17   │ 10,000,0+ │ 16    │ 559.018s │ 8944.283s │ 1.118 K/s │ 2897.206s │ 66092.5+ │ 26.00 G+ │ 60.39 G+ │ 26.00 G+ │ 26.00 G+ │ 0        │ 220426  │ 221397   │ 755074  │ 128.49 + │ 5.74 KiB │ 27.13 G+ │ 4.13 GiB │ 1099429+ │ 755074   │ 134856   │
│ Q18   │ 10,000,0+ │ 16    │ 63.385s  │ 1014.162s │ 9.860 K/s │ 390.905s  │ 1452.95+ │ 1.50 GiB │ 64.73 G+ │ 1.50 GiB │ 1.50 GiB │ 0        │ 7708    │ 19991    │ 6915171 │ 822.42 + │ 11.60 K+ │ 15.68 G+ │ 76.50 G+ │ 1254668+ │ 6905528  │ 7854     │
│ Q19   │ 10,000,0+ │ 16    │ 20.038s  │ 320.611s  │ 31.190 K+ │ 199.113s  │ 562.032s │ 16.00 E+ │ 65.48 G+ │ 16.00 E+ │ 72.00 K+ │ 0        │ 7690    │ 20651    │ 4361963 │ 833.34 + │ 11.25 K+ │ 16.41 G+ │ 46.79 G+ │ 28329846 │ 4355917  │ 7636     │
│ Q20   │ 10,000,0+ │ 16    │ 11.539s  │ 184.630s  │ 54.162 K+ │ 115.231s  │ 390.020s │ 255.82 + │ 65.67 G+ │ 255.82 + │ 256.00 + │ 0        │ 14755   │ 23592    │ 1069280 │ 577.63 + │ 10.92 K+ │ 13.00 G+ │ 11.13 G+ │ 20772286 │ 1068811  │ 6721     │
│ Q21   │ 10,000,0+ │ 16    │ 1.527s   │ 24.432s   │ 409.305 + │ 20.015s   │ 1.267s   │ 255.87 + │ 65.92 G+ │ 255.87 + │ 256.02 + │ 1        │ 0       │ 214      │ 61670   │ 984.92 + │ 10.96 K+ │ 205.83 + │ 660.12 + │ 223228   │ 61670    │ 17       │
│ Q22   │ 10,000,0+ │ 16    │ 1.361s   │ 21.777s   │ 459.205 + │ 18.831s   │ 74.000ms │ 16.00 E+ │ 65.98 G+ │ 16.00 E+ │ 76.00 K+ │ 0        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
└───────┴───────────┴───────┴──────────┴───────────┴───────────┴───────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴─────────┴──────────┴─────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┘
  • Looks like everything except q17/q18 got faster (or stayed the same). q17 slowed down quite a bit, which is unfortunate; on the plus side, q15/q16 are much faster now.
  • I did some more tests with q9 as it writes the most. I ran it with two disks in raid0 and with a single disk. The time spent on the query didn't improve at all with raid0 (though the write bandwidth did double, reaching ~2GiB at peak times), with either the fueled spine or the async spine, which was a bit surprising.
  • I don't quite understand why there are no reads in the metrics with the fueled spine and lots of bytes read with the async spine. Maybe it's because of the per-thread cache and merging happening on the other thread, or some misconfiguration when running in fueled-spine mode. If the metrics are true, it doesn't seem to impact things much (things are mostly served from the Linux page cache and nothing goes to disk).

Here is a graph showing disk bandwidth during the Nexmark queries. The biggest writes happen for q9; the rest all have very low write bandwidth but are still much slower than DRAM. I will try to look a bit at perf again after merging this.
[Screenshot: Netdata Agent Console on gz-desktop-pro, 2024-06-14 18:15:23]

@lalithsuresh
Collaborator

@gz any idea why the peak RSS seems to be very high for async spine? Is it accumulating work faster than it is able to process?

@gz
Collaborator Author

gz commented Jun 20, 2024

│ Query │ #Events   │ Cores │ Elapsed  │ Cores * + │ Throughp+ │ Total U+ │ Total S+ │ Current+ │ Peak RSS │ Current+ │ Peak Co+ │ Page Fa+ │ # Files │ # Writes │ # Reads  │ Avg WrSz │ Avg RdSz │ Writes   │ Reads    │ Cache H+ │ Cache M+ │ Compact+ │
├───────┼───────────┼───────┼──────────┼───────────┼───────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼─────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ Q0    │ 10,000,0+ │ 16    │ 1.495s   │ 23.913s   │ 418.184 + │ 16.868s  │ 603.000+ │ 1.11 GiB │ 728.00 + │ 1.11 GiB │ 1.11 GiB │ 0        │ 0       │ 0        │ 0        │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q1    │ 10,000,0+ │ 16    │ 1.268s   │ 20.295s   │ 492.732 + │ 19.888s  │ 223.000+ │ 15.37 M+ │ 932.00 + │ 15.37 M+ │ 15.57 M+ │ 6        │ 0       │ 0        │ 0        │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q2    │ 10,000,0+ │ 16    │ 1.257s   │ 20.113s   │ 497.192 + │ 17.233s  │ 147.000+ │ 4.19 MiB │ 998.00 + │ 4.19 MiB │ 4.20 MiB │ 0        │ 0       │ 0        │ 0        │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q3    │ 10,000,0+ │ 16    │ 5.375s   │ 86.000s   │ 116.279 + │ 39.686s  │ 24.304s  │ 1.07 GiB │ 1.57 GiB │ 1.07 GiB │ 1.07 GiB │ 2        │ 15677   │ 15677    │ 39401    │ 32.81 K+ │ 5.06 KiB │ 502.38 + │ 194.64 + │ 9923725  │ 39401    │ 7645     │
│ Q4    │ 10,000,0+ │ 16    │ 22.735s  │ 363.752s  │ 27.491 K+ │ 155.418s │ 371.730s │ 2.26 GiB │ 3.78 GiB │ 2.26 GiB │ 2.26 GiB │ 0        │ 28368   │ 29688    │ 133522   │ 136.77 + │ 6.35 KiB │ 3.87 GiB │ 828.62 + │ 52193332 │ 133522   │ 17019    │
│ Q5    │ 10,000,0+ │ 16    │ 16.678s  │ 266.851s  │ 37.474 K+ │ 27.163s  │ 17.410s  │ 4.06 MiB │ 4.12 GiB │ 4.06 MiB │ 4.26 MiB │ 0        │ 28064   │ 28064    │ 19760    │ 18.86 K+ │ 4.80 KiB │ 517.02 + │ 92.62 M+ │ 339960   │ 19760    │ 3968     │
│ Q6    │ 10,000,0+ │ 16    │ 24.654s  │ 394.465s  │ 25.351 K+ │ 167.625s │ 435.500s │ 1.25 GiB │ 5.50 GiB │ 1.25 GiB │ 1.25 GiB │ 0        │ 30963   │ 32182    │ 129107   │ 129.59 + │ 6.31 KiB │ 3.98 GiB │ 795.75 + │ 56319582 │ 129107   │ 22357    │
│ Q7    │ 10,000,0+ │ 16    │ 15.246s  │ 243.936s  │ 40.994 K+ │ 56.063s  │ 47.345s  │ 767.87 + │ 6.27 GiB │ 767.87 + │ 768.07 + │ 0        │ 24032   │ 33150    │ 419477   │ 380.18 + │ 7.80 KiB │ 12.02 G+ │ 3.12 GiB │ 1336694  │ 419477   │ 3952     │
│ Q8    │ 10,000,0+ │ 16    │ 16.573s  │ 265.173s  │ 37.711 K+ │ 36.037s  │ 26.728s  │ 16.00 E+ │ 6.29 GiB │ 16.00 E+ │ 140.00 + │ 0        │ 32048   │ 32048    │ 39343    │ 34.55 K+ │ 4.79 KiB │ 1.06 GiB │ 184.04 + │ 1917417  │ 39343    │ 7952     │
│ Q9    │ 10,000,0+ │ 16    │ 36.049s  │ 576.779s  │ 17.338 K+ │ 270.367s │ 596.775s │ 1.75 GiB │ 7.95 GiB │ 1.75 GiB │ 1.75 GiB │ 7        │ 22914   │ 40038    │ 5284549  │ 654.95 + │ 10.91 K+ │ 25.01 G+ │ 54.99 G+ │ 60522316 │ 5284549  │ 14488    │
│ Q12   │ 10,000,0+ │ 16    │ 6.108s   │ 97.724s   │ 102.329 + │ 49.531s  │ 124.668s │ 16.00 E+ │ 8.06 GiB │ 16.00 E+ │ 72.00 K+ │ 0        │ 7720    │ 7720     │ 18730    │ 30.98 K+ │ 5.02 KiB │ 233.55 + │ 91.89 M+ │ 13805151 │ 18730    │ 7408     │
│ Q13   │ 10,000,0+ │ 16    │ 14.407s  │ 230.515s  │ 43.381 K+ │ 90.785s  │ 159.859s │ 255.87 + │ 8.39 GiB │ 255.87 + │ 256.07 + │ 0        │ 11984   │ 14832    │ 76683    │ 337.40 + │ 7.31 KiB │ 4.77 GiB │ 547.73 + │ 25841685 │ 76683    │ 3952     │
│ Q14   │ 10,000,0+ │ 16    │ 1.228s   │ 19.655s   │ 508.769 + │ 17.018s  │ 284.000+ │ 16.00 E+ │ 8.41 GiB │ 16.00 E+ │ 72.00 K+ │ 0        │ 0       │ 0        │ 0        │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q15   │ 10,000,0+ │ 16    │ 169.810s │ 2716.953s │ 3.681 K/s │ 107.065s │ 128.287s │ 16.00 E+ │ 8.42 GiB │ 16.00 E+ │ 72.00 K+ │ 0        │ 294182  │ 294182   │ 161309   │ 17.74 K+ │ 4.96 KiB │ 4.98 GiB │ 781.23 + │ 10262333 │ 161309   │ 48178    │
│ Q16   │ 10,000,0+ │ 16    │ 196.087s │ 3137.385s │ 3.187 K/s │ 888.424s │ 1861.96+ │ 16.00 E+ │ 8.46 GiB │ 16.00 E+ │ 72.00 K+ │ 0        │ 533439  │ 534142   │ 2316079  │ 42.82 K+ │ 6.23 KiB │ 21.81 G+ │ 13.77 G+ │ 3299075+ │ 2316079  │ 313760   │
│ Q17   │ 10,000,0+ │ 16    │ 400.611s │ 6409.774s │ 1.560 K/s │ 2128.84+ │ 4362.22+ │ 2.50 GiB │ 8.46 GiB │ 2.50 GiB │ 2.50 GiB │ 0        │ 223029  │ 224284   │ 23824724 │ 102.91 + │ 6.82 KiB │ 22.01 G+ │ 155.05 + │ 8341747+ │ 23824724 │ 138238   │
│ Q18   │ 10,000,0+ │ 16    │ 69.002s  │ 1104.031s │ 9.058 K/s │ 406.906s │ 1449.91+ │ 1023.88+ │ 10.14 G+ │ 1023.88+ │ 1.00 GiB │ 0        │ 7680    │ 20322    │ 9429849  │ 832.59 + │ 11.63 K+ │ 16.14 G+ │ 104.59 + │ 1262396+ │ 9429849  │ 7325     │
│ Q19   │ 10,000,0+ │ 16    │ 16.085s  │ 257.357s  │ 38.857 K+ │ 157.926s │ 299.624s │ 16.00 E+ │ 11.24 G+ │ 16.00 E+ │ 72.00 K+ │ 0        │ 7701    │ 20953    │ 2357250  │ 834.82 + │ 11.04 K+ │ 16.68 G+ │ 24.83 G+ │ 26994009 │ 2357250  │ 7374     │
│ Q20   │ 10,000,0+ │ 16    │ 8.079s   │ 129.272s  │ 77.356 K+ │ 93.030s  │ 93.370s  │ 255.87 + │ 11.29 G+ │ 255.87 + │ 256.07 + │ 0        │ 14540   │ 22533    │ 733474   │ 559.21 + │ 10.32 K+ │ 12.02 G+ │ 7.22 GiB │ 14944457 │ 733474   │ 6508     │
│ Q21   │ 10,000,0+ │ 16    │ 1.209s   │ 19.339s   │ 517.100 + │ 18.438s  │ 146.000+ │ 16.00 E+ │ 11.47 G+ │ 16.00 E+ │ 72.00 K+ │ 0        │ 0       │ 0        │ 0        │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q22   │ 10,000,0+ │ 16    │ 1.416s   │ 22.651s   │ 441.483 + │ 19.429s  │ 23.000ms │ 16.00 E+ │ 11.50 G+ │ 16.00 E+ │ 72.00 K+ │ 0        │ 0       │ 0        │ 0        │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
└───────┴───────────┴───────┴──────────┴───────────┴───────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴─────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┘

I changed the number of background threads to the number of workers instead of the number of spines.

  • q17 has improved quite a bit with this change, though q17 and q18 are still a bit slower than with the fueled spine.
  • All other queries are either faster or the same (compared to the fueled spine) and didn't suffer from this change. Some improved quite a bit, like q16, by about 450 seconds.
  • Memory utilization is also low again. The biggest chunk of the added expense is the per-thread cache, which is set to 128MiB per active thread in the system, which is now 2*n instead of n.
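
To make the sizing concrete, here is a rough sketch of a pool with one background merger thread per worker, plus the cache arithmetic from the last bullet. `MergerPool`, `Job`, and `cache_budget_mib` are hypothetical names, and routing a job by the index of the worker that owns the spine is an assumption for illustration, not necessarily how the PR assigns work:

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// A unit of background work, e.g. one merge (hypothetical type).
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Per-thread cache size mentioned in the comment above.
const CACHE_PER_THREAD_MIB: u64 = 128;

/// Total cache budget when every worker and background thread has its own cache.
fn cache_budget_mib(workers: u64, background_threads: u64) -> u64 {
    (workers + background_threads) * CACHE_PER_THREAD_MIB
}

/// One background thread per worker, regardless of how many spines exist.
struct MergerPool {
    queues: Vec<Sender<Job>>,
    threads: Vec<JoinHandle<()>>,
}

impl MergerPool {
    fn new(workers: usize) -> Self {
        let mut queues = Vec::with_capacity(workers);
        let mut threads = Vec::with_capacity(workers);
        for _ in 0..workers {
            let (tx, rx) = channel::<Job>();
            queues.push(tx);
            // Each thread drains its own FIFO queue until the sender is dropped.
            threads.push(thread::spawn(move || {
                for job in rx {
                    job();
                }
            }));
        }
        MergerPool { queues, threads }
    }

    /// Assumption: all spines of worker `worker` share that worker's merger thread.
    fn submit(&self, worker: usize, job: Job) {
        self.queues[worker % self.queues.len()]
            .send(job)
            .expect("merger thread is alive");
    }

    fn thread_count(&self) -> usize {
        self.threads.len()
    }

    /// Closes all queues and waits for outstanding jobs to finish.
    fn shutdown(self) {
        drop(self.queues);
        for t in self.threads {
            let _ = t.join();
        }
    }
}
```

With 16 workers, `cache_budget_mib(16, 16)` is 4096 MiB, compared with 2048 MiB when only the 16 worker threads hold a cache; the thread count no longer grows with the number of spines.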

@gz
Collaborator Author

gz commented Jun 20, 2024

By popular demand, with min_storage_rows set to 10_000 instead of 0 (as in the previous tables):

async spine

│ Query │ #Events   │ Cores │ Elapsed │ Cores * + │ Throughp+ │ Total Us+ │ Total Sy+ │ Current+ │ Peak RSS │ Current+ │ Peak Co+ │ Page Fa+ │ # Files │ # Writes │ # Reads │ Avg WrSz │ Avg RdSz │ Writes   │ Reads    │ Cache H+ │ Cache M+ │ Compact+ │
├───────┼───────────┼───────┼─────────┼───────────┼───────────┼───────────┼───────────┼──────────┼──────────┼──────────┼──────────┼──────────┼─────────┼──────────┼─────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ Q0    │ 10,000,0+ │ 16    │ 1.418s  │ 22.689s   │ 440.736 + │ 17.164s   │ 587.000ms │ 1.11 GiB │ 648.00 + │ 1.11 GiB │ 1.11 GiB │ 0        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q1    │ 10,000,0+ │ 16    │ 1.336s  │ 21.381s   │ 467.702 + │ 20.201s   │ 268.000ms │ 11.12 M+ │ 896.00 + │ 11.12 M+ │ 11.19 M+ │ 0        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q2    │ 10,000,0+ │ 16    │ 1.387s  │ 22.198s   │ 450.492 + │ 17.003s   │ 121.000ms │ 4.56 MiB │ 979.00 + │ 4.56 MiB │ 4.63 MiB │ 0        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q3    │ 10,000,0+ │ 16    │ 1.431s  │ 22.891s   │ 436.862 + │ 20.786s   │ 420.000ms │ 1.03 GiB │ 1.21 GiB │ 1.03 GiB │ 1.03 GiB │ 12       │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 7965     │
│ Q4    │ 10,000,0+ │ 16    │ 7.669s  │ 122.701s  │ 81.499 K+ │ 75.975s   │ 78.967s   │ 3.32 GiB │ 3.35 GiB │ 3.32 GiB │ 3.32 GiB │ 0        │ 2641    │ 3927     │ 10410   │ 607.25 + │ 6.26 KiB │ 2.27 GiB │ 63.64 M+ │ 18414781 │ 10410    │ 17048    │
│ Q5    │ 10,000,0+ │ 16    │ 1.573s  │ 25.167s   │ 397.343 + │ 18.461s   │ 1.234s    │ 0 B      │ 3.80 GiB │ 0 B      │ 200.00 + │ 0        │ 144     │ 144      │ 432     │ 349.14 + │ 5.33 KiB │ 49.10 M+ │ 2.25 MiB │ 55772    │ 432      │ 3952     │
│ Q6    │ 10,000,0+ │ 16    │ 7.903s  │ 126.452s  │ 79.081 K+ │ 77.126s   │ 86.088s   │ 508.00 + │ 5.21 GiB │ 508.00 + │ 712.00 + │ 0        │ 2744    │ 4043     │ 10799   │ 605.40 + │ 6.26 KiB │ 2.33 GiB │ 66.02 M+ │ 18836997 │ 10799    │ 22823    │
│ Q7    │ 10,000,0+ │ 16    │ 2.464s  │ 39.431s   │ 253.607 + │ 39.370s   │ 19.953s   │ 512.69 + │ 5.84 GiB │ 512.69 + │ 512.89 + │ 0        │ 1354    │ 9007     │ 59515   │ 946.21 + │ 7.82 KiB │ 8.13 GiB │ 454.37 + │ 1106274  │ 59515    │ 3880     │
│ Q8    │ 10,000,0+ │ 16    │ 1.311s  │ 20.971s   │ 476.839 + │ 19.050s   │ 93.000ms  │ 124.00 + │ 5.86 GiB │ 124.00 + │ 328.00 + │ 1        │ 16      │ 16       │ 48      │ 328.00 + │ 5.33 KiB │ 5.12 MiB │ 256.00 + │ 7168     │ 48       │ 7952     │
│ Q9    │ 10,000,0+ │ 16    │ 15.921s │ 254.739s  │ 39.256 K+ │ 163.362s  │ 161.137s  │ 3.25 GiB │ 8.76 GiB │ 3.25 GiB │ 3.25 GiB │ 0        │ 2575    │ 16192    │ 4091312 │ 948.31 + │ 10.53 K+ │ 14.64 G+ │ 41.09 G+ │ 20607018 │ 4091312  │ 14576    │
│ Q12   │ 10,000,0+ │ 16    │ 1.337s  │ 21.386s   │ 467.601 + │ 21.565s   │ 1.571s    │ 16.00 E+ │ 8.87 GiB │ 16.00 E+ │ 200.00 + │ 0        │ 31      │ 31       │ 156     │ 349.68 + │ 6.41 KiB │ 10.59 M+ │ 1000.00+ │ 388336   │ 156      │ 7906     │
│ Q13   │ 10,000,0+ │ 16    │ 1.619s  │ 25.912s   │ 385.925 + │ 38.042s   │ 6.982s    │ 16.00 E+ │ 9.30 GiB │ 16.00 E+ │ 72.00 K+ │ 0        │ 1415    │ 3904     │ 4895    │ 846.88 + │ 7.41 KiB │ 3.15 GiB │ 35.43 M+ │ 2471487  │ 4895     │ 3800     │
│ Q14   │ 10,000,0+ │ 16    │ 1.338s  │ 21.403s   │ 467.229 + │ 17.377s   │ 0.000ns   │ 16.00 E+ │ 9.34 GiB │ 16.00 E+ │ 72.00 K+ │ 0        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q15   │ 10,000,0+ │ 16    │ 2.165s  │ 34.635s   │ 288.725 + │ 35.867s   │ 6.933s    │ 16.00 E+ │ 9.34 GiB │ 16.00 E+ │ 72.00 K+ │ 0        │ 435     │ 435      │ 886     │ 352.28 + │ 6.00 KiB │ 149.65 + │ 5.19 MiB │ 1951085  │ 886      │ 48083    │
│ Q16   │ 10,000,0+ │ 16    │ 11.761s │ 188.180s  │ 53.141 K+ │ 141.177s  │ 68.533s   │ 16.00 E+ │ 9.35 GiB │ 16.00 E+ │ 72.00 K+ │ 0        │ 1456    │ 2162     │ 17415   │ 664.68 + │ 7.50 KiB │ 1.37 GiB │ 127.55 + │ 20247763 │ 17415    │ 313830   │
│ Q17   │ 10,000,0+ │ 16    │ 27.245s │ 435.917s  │ 22.940 K+ │ 195.347s  │ 195.316s  │ 16.00 E+ │ 9.40 GiB │ 16.00 E+ │ 72.00 K+ │ 0        │ 3773    │ 4951     │ 16450   │ 596.35 + │ 6.22 KiB │ 2.82 GiB │ 99.86 M+ │ 64400557 │ 16450    │ 138189   │
│ Q18   │ 10,000,0+ │ 16    │ 21.669s │ 346.712s  │ 28.842 K+ │ 162.422s  │ 385.173s  │ 2.75 GiB │ 11.05 G+ │ 2.75 GiB │ 2.75 GiB │ 0        │ 967     │ 7577     │ 2169903 │ 955.86 + │ 11.81 K+ │ 6.91 GiB │ 24.45 G+ │ 42032649 │ 2169903  │ 6563     │
│ Q19   │ 10,000,0+ │ 16    │ 6.014s  │ 96.224s   │ 103.924 + │ 93.739s   │ 71.152s   │ 16.00 E+ │ 11.42 G+ │ 16.00 E+ │ 72.00 K+ │ 0        │ 1100    │ 9573     │ 884895  │ 964.38 + │ 11.35 K+ │ 8.80 GiB │ 9.58 GiB │ 10370420 │ 884895   │ 6742     │
│ Q20   │ 10,000,0+ │ 16    │ 3.647s  │ 58.356s   │ 171.362 + │ 71.169s   │ 27.435s   │ 16.00 E+ │ 11.62 G+ │ 16.00 E+ │ 72.00 K+ │ 0        │ 1029    │ 7827     │ 424504  │ 953.78 + │ 11.60 K+ │ 7.12 GiB │ 4.70 GiB │ 6411146  │ 424504   │ 6885     │
│ Q21   │ 10,000,0+ │ 16    │ 1.361s  │ 21.780s   │ 459.131 + │ 18.857s   │ 229.000ms │ 16.00 E+ │ 11.62 G+ │ 16.00 E+ │ 76.00 K+ │ 2        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
│ Q22   │ 10,000,0+ │ 16    │ 1.158s  │ 18.524s   │ 539.835 + │ 18.221s   │ 250.000ms │ 16.00 E+ │ 11.62 G+ │ 16.00 E+ │ 72.00 K+ │ 0        │ 0       │ 0        │ 0       │ 0 B      │ 0 B      │ 0 B      │ 0 B      │ 0        │ 0        │ 0        │
└───────┴───────────┴───────┴─────────┴───────────┴───────────┴───────────┴───────────┴──────────┴──────────┴──────────┴──────────┴──────────┴─────────┴──────────┴─────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┘

Unfortunately the fueled spine fails with "Attempted to insert batch into incomplete merge!", so I guess it's good that this is fixed now.

@gz force-pushed the async-merge branch 4 times, most recently from 356c12f to 333e287, on June 20, 2024 05:56
@gz
Collaborator Author

gz commented Jun 20, 2024

ran q9 with 50M events (min_storage_rows=100_000)..

  • spine fueled DRAM
┌───────┬────────────┬───────┬─────────┬─────────────────┬──────────────────┬───────────────┬───────────────┬─────────────┬───────────┬────────────────┬─────────────┬─────────────┐
│ Query │ #Events    │ Cores │ Elapsed │ Cores * Elapsed │ Throughput/Cores │ Total Usr CPU │ Total Sys CPU │ Current RSS │ Peak RSS  │ Current Commit │ Peak Commit │ Page Faults │
├───────┼────────────┼───────┼─────────┼─────────────────┼──────────────────┼───────────────┼───────────────┼─────────────┼───────────┼────────────────┼─────────────┼─────────────┤
│ Q9    │ 50,000,000 │ 16    │ 38.181s │ 610.889s        │ 81.848 K/s       │ 382.155s      │ 24.837s       │ 26.44 GiB   │ 27.91 GiB │ 26.44 GiB      │ 41.73 GiB   │ 17          │
└───────┴────────────┴───────┴─────────┴─────────────────┴──────────────────┴───────────────┴───────────────┴─────────────┴───────────┴────────────────┴─────────────┴─────────────┘
  • async spine DRAM
┌───────┬────────────┬───────┬─────────┬─────────────────┬──────────────────┬───────────────┬───────────────┬───────────┬─────────────┐
│ Query │ #Events    │ Cores │ Elapsed │ Cores * Elapsed │ Throughput/Cores │ Total Usr CPU │ Total Sys CPU │ Peak RSS  │ Page Faults │
├───────┼────────────┼───────┼─────────┼─────────────────┼──────────────────┼───────────────┼───────────────┼───────────┼─────────────┤
│ Q9    │ 50,000,000 │ 16    │ 33.835s │ 541.363s        │ 92.359 K/s       │ 627.634s      │ 39.526s       │ 26.41 GiB │ 10          │
└───────┴────────────┴───────┴─────────┴─────────────────┴──────────────────┴───────────────┴───────────────┴───────────┴─────────────┘
  • spine fueled storage
┌───────┬───────────┬───────┬──────────┬───────────┬───────────┬──────────┬──────────┬──────────┬──────────┬──────────┬──────────┬──────────┬─────────┬──────────┬──────────┬──────────┬──────────┬──────────┬──────────┬──────────┬──────────┬──────────┐
│ Query │ #Events   │ Cores │ Elapsed  │ Cores * + │ Throughp+ │ Total U+ │ Total S+ │ Current+ │ Peak RSS │ Current+ │ Peak Co+ │ Page Fa+ │ # Files │ # Writes │ # Reads  │ Avg WrSz │ Avg RdSz │ Writes   │ Reads    │ Cache H+ │ Cache M+ │ Compact+ │
├───────┼───────────┼───────┼──────────┼───────────┼───────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼─────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ Q9    │ 50,000,0+ │ 16    │ 112.238s │ 1795.813s │ 27.843 K+ │ 720.484s │ 382.237s │ 12.94 G+ │ 11.45 G+ │ 12.94 G+ │ 12.94 G+ │ 409      │ 1029    │ 49836    │ 36024055 │ 1016.16+ │ 7.77 KiB │ 48.30 G+ │ 266.93 + │ 43633836 │ 36024055 │ 79612    │
└───────┴───────────┴───────┴──────────┴───────────┴───────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴─────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┘
  • async spine storage
┌───────┬────────────┬───────┬─────────┬─────────────────┬─────────────────┬───────────────┬───────────────┬───────────┬─────────────┬─────────┬─────────────┬──────────┬───────────┬────────────┬────────────────┬───────┬─────────────┬────────────────┐
│ Query │ #Events    │ Cores │ Elapsed │ Cores * Elapsed │ Throughput/Cor+ │ Total Usr CPU │ Total Sys CPU │ Peak RSS  │ Page Faults │ # Files │ Avg WrSz    │ Avg RdSz │ Writes    │ Reads      │ Cache Hit Rate │ Cpcts │ Cpct Saving │ Cpct Stall [s] │
├───────┼────────────┼───────┼─────────┼─────────────────┼─────────────────┼───────────────┼───────────────┼───────────┼─────────────┼─────────┼─────────────┼──────────┼───────────┼────────────┼────────────────┼───────┼─────────────┼────────────────┤
│ Q9    │ 50,000,000 │ 16    │ 96.064s │ 1537.021s       │ 32.530 K/s      │ 949.363s      │ 790.242s      │ 14.36 GiB │ 6           │ 1428    │ 1015.84 KiB │ 7.91 KiB │ 73.20 GiB │ 223.35 GiB │ 77%            │ 78092 │ 90476       │ 60             │
└───────┴────────────┴───────┴─────────┴─────────────────┴─────────────────┴───────────────┴───────────────┴───────────┴─────────────┴─────────┴─────────────┴──────────┴───────────┴────────────┴────────────────┴───────┴─────────────┴────────────────┘

  • The buffer cache is now also shared between the merger and the worker thread.
  • One remaining issue that could bring sizable wins is avoiding head-of-line blocking for big merges. The last column of the async spine storage table shows the time stalled due to compaction (60 seconds in total across all 16 cores). I'll probably fix this in a follow-up PR.

- The idea is to have one background thread per worker to
  handle spine merges in the background.
  Maybe later the number of background workers can be
  made configurable.
- The thread just does the compaction on behalf of the
  spine and sends the results back. For now
  it will just follow a simple FIFO order and
  compact one by one.
  In future improvements we can add n-way merges,
  and round-robin or other strategies to merge multiple
  things at the same time.
- Having compaction in a different thread meant slightly
  changing the way that the backend meta-data about
  read-only files is kept. Before it was strictly
  per-thread and stored by the thread that creates them.
  Now it needs to be shareable between the compactor and
  the threads that hold spines, so this meta-data lives
  in a concurrent hashmap.
- There was also some issue with not dropping the lockfile on
  the storage dir in tests. This was resolved by moving the
  StorageLocation to the RuntimeHandle.
- Had to make some small changes to tests because some asserts
  were too strong: they assumed behavior of the old spine
  that wasn't necessary.
- I changed the concurrency for the buffer cache (again),
  now it is shared and protected with a Mutex, but the
  sharing is limited and just between the worker and
  its background thread.
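
The design described in the list above (one background thread per worker, FIFO compaction, results sent back, a Mutex-protected cache shared only between the worker and its thread) can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from this PR: `Batch`, `SharedCache`, `Compactor`, `spawn_compactor`, `compact_all` and `merge` are hypothetical names, and a batch is assumed to be a key-sorted list of `(key, weight)` pairs.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for a spine batch: (key, weight) pairs sorted by key.
type Batch = Vec<(u64, i64)>;
// Hypothetical stand-in for the shared buffer cache: merge id -> merged length.
type SharedCache = Arc<Mutex<HashMap<u64, usize>>>;

// Merge two sorted batches, summing weights of equal keys and dropping zeros.
fn merge(a: &[(u64, i64)], b: &[(u64, i64)]) -> Batch {
    let mut out = Batch::with_capacity(a.len() + b.len());
    let (mut i, mut j) = (0, 0);
    while i < a.len() && j < b.len() {
        if a[i].0 < b[j].0 {
            out.push(a[i]);
            i += 1;
        } else if a[i].0 > b[j].0 {
            out.push(b[j]);
            j += 1;
        } else {
            let w = a[i].1 + b[j].1;
            if w != 0 {
                out.push((a[i].0, w));
            }
            i += 1;
            j += 1;
        }
    }
    out.extend_from_slice(&a[i..]);
    out.extend_from_slice(&b[j..]);
    out
}

// Handle kept by the worker: one channel for requests, one for results.
struct Compactor {
    requests: Sender<(u64, Batch, Batch)>,
    results: Receiver<(u64, Batch)>,
}

fn spawn_compactor(cache: SharedCache) -> Compactor {
    let (req_tx, req_rx) = channel::<(u64, Batch, Batch)>();
    let (res_tx, res_rx) = channel::<(u64, Batch)>();
    thread::spawn(move || {
        // The channel acts as a FIFO queue: merges run one by one, in order.
        for (id, a, b) in req_rx {
            let merged = merge(&a, &b);
            cache.lock().unwrap().insert(id, merged.len());
            if res_tx.send((id, merged)).is_err() {
                break; // the worker dropped its receiver
            }
        }
    });
    Compactor { requests: req_tx, results: res_rx }
}

// Worker side: queue every pair, then collect the results as they come back.
fn compact_all(pairs: Vec<(Batch, Batch)>, cache: SharedCache) -> Vec<(u64, Batch)> {
    let compactor = spawn_compactor(cache);
    let n = pairs.len();
    for (id, (a, b)) in pairs.into_iter().enumerate() {
        compactor.requests.send((id as u64, a, b)).unwrap();
    }
    (0..n).map(|_| compactor.results.recv().unwrap()).collect()
}

fn main() {
    let cache: SharedCache = Arc::new(Mutex::new(HashMap::new()));
    let out = compact_all(
        vec![(vec![(1, 1), (3, 1)], vec![(2, 1), (3, -1)])],
        Arc::clone(&cache),
    );
    println!("{:?}", out); // prints [(0, [(1, 1), (2, 1)])]
}
```

Because a single thread drains the queue in order, one large merge delays every request behind it, which is the head-of-line blocking mentioned in the benchmark comment. In this sketch the worker waits for all results at the end; a real spine would presumably poll for finished merges between steps instead.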

Signed-off-by: Gerd Zellweger <mail@gerdzellweger.com>
@gz gz merged commit 5e1c7bc into main Jun 20, 2024
5 checks passed
@gz gz deleted the async-merge branch June 20, 2024 08:46
3 participants