maint: Remove dependency on some elements of folly #1370

jjerphan · 2024-02-26T09:34:11Z

Reference Issues/PRs

What does this implement or fix?

Remove or replace elements of folly.

Any other comments?

Checklist

Checklist for code changes...

Have you updated the relevant docstrings, documentation and copyright notice?
Is this contribution tested against all ArcticDB's features?
Do all exceptions introduced raise appropriate error messages?
Are API changes highlighted in the PR description?
Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?

Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>

folly recommends against using it: https://github.com/facebook/folly/blob/main/folly/SpinLock.h#L18-L19 Since `std::mutex` is non-copyable, we use an indirection via `std::unique_ptr`. Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
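
To illustrate the constraint mentioned in the commit message: a type that holds a `std::mutex` by value can be neither copied nor moved, so it cannot be stored in a `std::vector` that has to grow. The following is a minimal sketch of the `std::unique_ptr` indirection. `GuardedCounter` and `demo_guarded_counters` are hypothetical names used only for illustration and are not taken from the ArcticDB code base.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <type_traits>
#include <vector>

// Hypothetical type for illustration only; not ArcticDB code.
struct GuardedCounter {
    // Holding the mutex through a unique_ptr keeps the struct movable.
    std::unique_ptr<std::mutex> mutex_ = std::make_unique<std::mutex>();
    std::size_t value_ = 0;

    void increment() {
        std::lock_guard<std::mutex> guard(*mutex_);
        ++value_;
    }
};

// std::mutex itself can be neither copied nor moved...
static_assert(!std::is_copy_constructible_v<std::mutex>);
static_assert(!std::is_move_constructible_v<std::mutex>);
// ...but the wrapper can be moved, which is what std::vector needs to grow.
static_assert(std::is_move_constructible_v<GuardedCounter>);

// Builds a growing vector of counters, bumps each once, and sums the values.
inline std::size_t demo_guarded_counters() {
    std::vector<GuardedCounter> counters;
    for (int i = 0; i < 4; ++i) {
        counters.emplace_back();  // may reallocate and move existing elements
    }
    for (auto& counter : counters) {
        counter.increment();
    }
    std::size_t total = 0;
    for (const auto& counter : counters) {
        total += counter.value_;
    }
    return total;
}
```

One caveat of this design is that a moved-from `GuardedCounter` holds a null `mutex_` and must not be locked again.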

vasil-pashov · 2024-02-27T10:16:21Z

cpp/arcticdb/pipeline/read_frame.cpp

@@ -646,7 +645,7 @@ class StringReducer {
    virtual void reduce(PipelineContextRow& context_row, size_t column_index) = 0;
 };

-using LockType = folly::SpinLock;


Why was this required? Spin locks tend to be faster than std::mutex, and than mutexes in general, if the lock is held only briefly. I'm not sure about the current state of the code; I'm just pointing out that this can have an impact.

folly itself recommends against using SpinLock:
https://github.com/facebook/folly/blob/main/folly/SpinLock.h#L18-L19

I agree that this holds in general. However, I've seen cases where a spin lock improves performance by a large factor, because acquiring a contended mutex can require a kernel call. If the lock is only held for a short period of time, the cost of that kernel call can outweigh any benefit of the mutex. I was asking whether there was a reason for using the spin lock in the first place, and whether there are any tests showing that nothing changed significantly after the switch to a mutex.

I do not know.

Probably, @willdealtry can provide some insights?

I guess if the asv benchmarks aren't showing regressions, this might not be a concern. What do you think?

ASV benchmarks are not quite complete (still better than nothing). Let's see if @willdealtry has any concerns.

This code has had quite a lot of optimization effort because it's a known bottleneck for us (and for anyone else trying to write string data). If we want to move away from using the folly spinlock, it's not difficult to write one that should be sufficient for our purposes, even if it may not have all the trickery that the folly one has:

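    #include <atomic>

    // CPU hint for busy-wait loops; the intrinsic differs per platform.
    #ifdef WIN32
    #include <intrin.h>
    #define PAUSE _mm_pause()
    #elif defined(__arm__) || defined(__ARM_ARCH)
    #include <arm_acle.h>
    #define PAUSE __yield()
    #else
    #define PAUSE __builtin_ia32_pause()
    #endif

    struct SpinLock {
        std::atomic<bool> lock_{false};

        void lock() noexcept {
            do {
                // Try to take the lock; exchange returns the previous value.
                if (!lock_.exchange(true, std::memory_order_acquire))
                    return;
                // Spin on a plain load until the lock looks free, to avoid
                // hammering the cache line with writes.
                while (lock_.load(std::memory_order_relaxed))
                    PAUSE;
            } while (true);
        }

        bool try_lock() noexcept {
            // Cheap relaxed check first, then the real acquire attempt.
            return !lock_.load(std::memory_order_relaxed)
                && !lock_.exchange(true, std::memory_order_acquire);
        }

        void unlock() noexcept {
            lock_.store(false, std::memory_order_release);
        }
    };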

N.B. I have not compiled, much less tested this, although it's pretty canonical. I remember doing almost exactly this with volatile bool and intrinsics in the early Cretaceous period (when I was young)
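
One way to check the concern raised earlier in this thread is a small contention harness that runs the same short critical section under either lock type. The sketch below is an assumption-laden illustration, not a benchmark from this PR: `YieldingSpinLock`, `RunResult`, and `hammer` are hypothetical names, and the platform-specific pause intrinsic is swapped for `std::this_thread::yield()` so that the sketch stays portable. Timings from such a harness depend heavily on the machine and on the number of threads, so they should be read as indicative at most.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Simplified stand-in for a spin lock (hypothetical name). It yields to the
// scheduler instead of issuing a CPU pause instruction, purely for portability.
struct YieldingSpinLock {
    std::atomic<bool> locked_{false};

    void lock() noexcept {
        while (locked_.exchange(true, std::memory_order_acquire)) {
            while (locked_.load(std::memory_order_relaxed)) {
                std::this_thread::yield();
            }
        }
    }

    bool try_lock() noexcept {
        return !locked_.load(std::memory_order_relaxed) &&
               !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock() noexcept { locked_.store(false, std::memory_order_release); }
};

struct RunResult {
    std::size_t count;                 // final value of the shared counter
    std::chrono::nanoseconds elapsed;  // wall-clock time for the whole run
};

// Runs `num_threads` workers that each increment a shared counter
// `iterations` times while holding a lock of type `Lock`.
template <typename Lock>
RunResult hammer(std::size_t num_threads, std::size_t iterations) {
    Lock lock;
    std::size_t counter = 0;
    const auto start = std::chrono::steady_clock::now();

    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (std::size_t i = 0; i < iterations; ++i) {
                std::lock_guard<Lock> guard(lock);  // very short critical section
                ++counter;
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }

    const auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - start);
    return {counter, elapsed};
}
```

Calling `hammer<std::mutex>(4, 100000)` and `hammer<YieldingSpinLock>(4, 100000)` and comparing the `elapsed` fields gives a rough first impression. Because both types expose `lock()` and `unlock()`, they work with `std::lock_guard` without further changes.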

In terms of the maps, std::unordered_map has generally horrible performance. I think over time we will probably end up writing some of our own maps for specific purposes, because quite a lot of the time we have use-cases where we could be much faster than a general-purpose hash map. For example in some cases, we can do without deletion, because we populate the map and then drop it entirely, and in other cases the values are known to be unique at insertion time, meaning that we can do away with a substantial part of the branching and complexity of a generic container.
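
As a rough illustration of the kind of special case described above, here is a sketch of a map that supports neither deletion nor duplicate keys. `InsertOnlyMap` is a hypothetical name and design, not something that exists in ArcticDB. It assumes 64-bit integer keys, a default-constructible value type, and a capacity fixed at construction, and it uses plain linear probing. Since the caller guarantees uniqueness, insertion never compares keys, and since nothing is ever erased, lookups need no tombstone handling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical insert-only hash map for illustration; not ArcticDB code.
template <typename Value>
class InsertOnlyMap {
public:
    // Sizes the table once for `expected_size` entries (load factor <= 0.5).
    explicit InsertOnlyMap(std::size_t expected_size) {
        std::size_t capacity = 16;
        while (capacity < expected_size * 2) {
            capacity *= 2;  // keep the capacity a power of two
        }
        slots_.resize(capacity);
        mask_ = capacity - 1;
    }

    // Precondition: `key` is not already present. No key comparison is done.
    void insert_unique(std::uint64_t key, Value value) {
        assert(size_ + 1 < slots_.size());  // always keep one empty slot
        std::size_t index = slot_for(key);
        while (slots_[index].occupied) {
            index = (index + 1) & mask_;  // linear probing
        }
        slots_[index] = Slot{key, std::move(value), true};
        ++size_;
    }

    // Returns a pointer to the stored value, or nullptr if `key` is absent.
    const Value* find(std::uint64_t key) const {
        std::size_t index = slot_for(key);
        while (slots_[index].occupied) {
            if (slots_[index].key == key) {
                return &slots_[index].value;
            }
            index = (index + 1) & mask_;
        }
        return nullptr;  // hit an empty slot: the key was never inserted
    }

    std::size_t size() const { return size_; }

private:
    struct Slot {
        std::uint64_t key = 0;
        Value value{};
        bool occupied = false;
    };

    // Multiplicative mixing so that sequential keys spread over the table.
    std::size_t slot_for(std::uint64_t key) const {
        return static_cast<std::size_t>((key * 0x9E3779B97F4A7C15ULL) >> 32) & mask_;
    }

    std::vector<Slot> slots_;
    std::size_t mask_ = 0;
    std::size_t size_ = 0;
};

// Small usage example: populate, query, then let the map go out of scope.
inline bool demo_insert_only_map() {
    InsertOnlyMap<int> map(3);
    map.insert_unique(10, 1);
    map.insert_unique(20, 2);
    map.insert_unique(30, 3);
    const int* hit = map.find(20);
    return hit != nullptr && *hit == 2 && map.find(99) == nullptr && map.size() == 3;
}
```

Whether something like this actually beats a good general-purpose map would need to be measured on the real workloads.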

If we're going to move everything away from folly and robin_hood at the same time, it would seem to be worthwhile at least migrating to a modern and performant implementation globally, then replacing that where we think we have a special case.

I would be in favour of using this one, as it's at least as fast if not faster than any of the ones we currently use, is header-only and has an MIT license: https://github.com/martinus/unordered_dense
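
A possible way to make such a global migration cheap is to route all uses through project-wide aliases, so that the underlying implementation is chosen in a single place. The sketch below is only an assumption about how this could look: the `containers` namespace and the `HashMap` and `HashSet` aliases are hypothetical names, and `std::unordered_map` and `std::unordered_set` serve as placeholder implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <initializer_list>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical project-wide aliases; the names are not taken from ArcticDB.
namespace containers {

// Changing the right-hand side here switches the implementation everywhere.
template <typename Key, typename Value,
          typename Hash = std::hash<Key>,
          typename KeyEqual = std::equal_to<Key>>
using HashMap = std::unordered_map<Key, Value, Hash, KeyEqual>;

template <typename Key,
          typename Hash = std::hash<Key>,
          typename KeyEqual = std::equal_to<Key>>
using HashSet = std::unordered_set<Key, Hash, KeyEqual>;

}  // namespace containers

// Call sites only mention the alias, never the concrete container.
inline std::size_t demo_distinct_symbols() {
    containers::HashMap<std::string, int> counts;
    for (const char* symbol : {"a", "b", "a"}) {
        ++counts[symbol];
    }
    return counts.size();
}
```

This only works as a drop-in switch if call sites stick to the part of the interface that both implementations share. Properties such as reference stability and iterator invalidation differ between node-based and flat hash maps, so they would have to be checked before swapping.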

std::unordered_map is for now a placeholder replacement, and I agree more performant implementations are needed.

Let's wait for the benchmarks triggered by b625e1f to complete to see how severe the performance regressions are.

The recent boost::unordered_flat_{set,map} are also worth considering.

I have opened #1390 to see whether we can replace robin hood with boost's collections.


Klaim

While the change of hash maps will probably have an impact on performance, I think that's fine as long as we haven't settled on a proper replacement; we can change that later. So LGTM. 👍🏽


jjerphan · 2024-03-01T08:29:32Z

According to b625e1f's ASV benchmarks, there are no performance regressions.

I am also for replacing all the hash maps and hash sets with martinus/unordered_dense as proposed by @willdealtry in #1370 (comment) in another PR. What do you think?

Let me know if this looks good to you, and I will revert b625e1f before merging this PR.



jjerphan · 2024-03-08T10:31:49Z

Weirdly enough this PR as of 76b7fc3 had worse performance than master. That is why I reverted the changes then.

jjerphan added 6 commits February 27, 2024 10:03

Remove dependency on F14Set

b0d365f

Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>

Replace folly::small_vector with boost::container::small_vector

eb202c4

Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>

Remove dependency on F14Map

f3d24a0

Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>

Remove dependency on ThreadedExecutor

2b2e561

Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>

Remove dependency on Likely

5a9839c

Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>

Remove dependency on SpinLock

61dc210

folly recommends against using it: https://github.com/facebook/folly/blob/main/folly/SpinLock.h#L18-L19 Since `std::mutex` is non-copyable, we use an indirection via `std::unique_ptr`. Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>

vasil-pashov reviewed Feb 27, 2024


jjerphan force-pushed the maint/remove-folly-parts branch from 599677b to 61dc210 Compare February 27, 2024 16:26

jjerphan marked this pull request as ready for review February 28, 2024 16:25

Remove dependency on ScopeGuard

98340f8

Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>

JohanMabille approved these changes Feb 29, 2024


Klaim approved these changes Feb 29, 2024


Run ASV benchmark by activating EC2 runners on PR

b625e1f

This reverts "Disable ec2 runners on PR (#1357)" This reverts commit 198c9cc.

jjerphan added 4 commits March 6, 2024 15:07

Merge branch 'master' into maint/remove-folly-parts

f6706c6

Use ankerl::unordered_dense::map instead of std::unordered_map

76b7fc3

Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>

Revert "Use ankerl::unordered_dense::map instead of `std::unordered…

9395360

…_map`" This reverts commit 76b7fc3.

Merge branch 'master' into maint/remove-folly-parts

d867cb3

jjerphan force-pushed the maint/remove-folly-parts branch 2 times, most recently from 81cebbb to d867cb3 Compare March 7, 2024 09:05

fixup! Merge branch 'master' into maint/remove-folly-parts

0a642de

jjerphan merged commit 4184a46 into master Mar 11, 2024
114 checks passed

jjerphan deleted the maint/remove-folly-parts branch March 11, 2024 12:12

jjerphan mentioned this pull request Mar 11, 2024

maint: Folly Replacement Plan #1412

Open

17 tasks
