-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
utils: compact-radix-tree: fix accidental cache line bouncing #9246
Conversation
@avikivity That's it. That's the entire NUMA slowness bug. Sharing By the way, I drilled down to this by adding a REST API for calling mbind() and bisecting the relevant ELF segment with it. (I stashed an array somewhere into Scylla and recompiled it with various sizes to bisect the segment with a page boundary, and send the right half to the other NUMA node with mbind()). Sampling with |
Also, I had to guess the relevant piece of memory by reading the assembly. Many symbols in the Scylla binary come out obfuscated for some reason (they look like this: |
Ah, so I guess the machine has directory-based coherence. If the cache line isn't local to the core, it asks the directory on the node that owns the memory that the line caches. So even though we cache line was always on node 0, we had to ask node 1 where it was. |
It also explains why we saw the radix tree in the profiles! |
Well done. The patch is fine, but I'll let @xemul admire it first. |
Maybe I'm mixing this with a report from @pveentjer |
if (_v != nullptr) { | ||
// nil_root is shared between shards, so this nil_root check is necessary | ||
// to avoid cache line bouncing caused by writing to _backref. | ||
if (_v != nullptr && _v != &nil_root) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checkinf for node being nil_root
is done in many places to decide whether the node in question is "empty" and should be, e.g., populated, not visited, skipped on erase, etc. If nil_root assignment is skipped then all the above logic breaks. And actually if debug tests continue to pass after this that's bad sign as well.
The motivation behind having the nil_root
is to avoid extra node == nullptr
checks on lookup. It only need to exist and contain no slots inside. What if you make the nil_root
thread-local, does it help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the assignment is not skipped, is it? It's just the backref on nil_root that's not assigned and @michoecho wrote that backref on nil_root is never checked/read, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me check it one more time, @haaawk ... yes, missed that. Then the patch is correct :)
I'm highly impressed by your detective work @michoecho . If I could, I would give you all my Monster coins this quarter. Unfortunately I can't grant my coins to my own team so it has to be enough for you to know how happy I am to have such a talented engineer on my team. |
@michoecho please file an issue and add a Fixes #, to satisfy the backport bureaucracy. |
84fc251
to
91e899a
Compare
Done. I also modified the code comment slightly. |
Whenever a node_head_ptr is assigned to nil_root, the _backref inside it is overwritten. But since nil_root is shared between shards, this causes severe cache line bouncing. (It was observed to reduce the total write throughput of Scylla by 90% on a large NUMA machine). This backreference is never read anyway, so fix this bug by not writing it. Fixes scylladb#9252
91e899a
to
5b42f8f
Compare
Whenever a node_head_ptr is assigned to nil_root, the _backref inside it is overwritten. But since nil_root is shared between shards, this causes severe cache line bouncing. (It was observed to reduce the total write throughput of Scylla by 90% on a large NUMA machine). This backreference is never read anyway, so fix this bug by not writing it. Fixes #9252 Closes #9246 (cherry picked from commit 126baa7)
Whenever a node_head_ptr is assigned to nil_root, the _backref inside it is
overwritten. But since nil_root is shared between shards, this causes severe
cache line bouncing. (It was observed to reduce the total write throughput
of Scylla by 90% on a large NUMA machine).
This backreference is never read anyway, so fix this bug by not writing it.
Fixes #9252