Jwinkler2083233/eject fake random #945

Merged
45 commits merged into master on Jul 30, 2021

Conversation

jwinkler2083233
Contributor

Jeff Winkler and others added 6 commits July 3, 2021 21:35
EjectTrueRandom appears to be saturating CPUs, so this changeset
is for testing the alternative.
These changes are to eliminate a majority of the linear searching that
was occurring during the EjectTrueRandom() method.
This has some updates to the eject logic
Removed a couple of magic numbers and used constants instead.
Improved comments
Jeff Winkler added 6 commits July 12, 2021 17:36
This changeset alters the unit tests to handle changes to the
eject logic.
…/flow-go into jwinkler2083233/eject-fake-random
This has a minor fix in whitespace for lint
This fixes a whitespace error on line 31
Fixes up whitespace
The unit tests are updated to reflect the change in logic for
the eject feature. The ejections are now batched in groups of
64 and triggered by an accumulation that is greater than or
equal to 128 over the limit.
@codecov-commenter

codecov-commenter commented Jul 13, 2021

Codecov Report

Merging #945 (21ca55d) into master (8e0d32b) will decrease coverage by 0.08%.
The diff coverage is 38.09%.


@@            Coverage Diff             @@
##           master     #945      +/-   ##
==========================================
- Coverage   53.38%   53.29%   -0.09%     
==========================================
  Files         318      318              
  Lines       21491    21511      +20     
==========================================
- Hits        11472    11465       -7     
- Misses       8450     8475      +25     
- Partials     1569     1571       +2     
Flag Coverage Δ
unittests 53.29% <38.09%> (-0.09%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
module/mempool/stdmap/incorporated_result_seals.go 80.82% <0.00%> (ø)
module/mempool/stdmap/eject.go 47.36% <21.42%> (-34.22%) ⬇️
module/mempool/stdmap/backend.go 92.04% <72.72%> (ø)
module/mempool/stdmap/options.go 100.00% <100.00%> (ø)
module/mempool/stdmap/pending_receipts.go 68.04% <0.00%> (-4.13%) ⬇️


Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8e0d32b...21ca55d.

Member
@AlexHentschel left a comment

Thanks Jeff. This is an interesting PR. I am not quite clear on what stochastic guarantees the EjectTrueRandom policy is supposed to satisfy.

Context

From a mathematical standpoint, there is one safe answer: a simple random sample (which is actually not simple to implement efficiently):

  • consider a mempool with N elements and the desire to eject k
  • what we desire is that each subset of k elements has the same probability of being chosen

Any sampling method that does not generate a simple random sample introduces a bias, which could be exploited to compromise liveness.

  • If we don't care about unbiased sampling, we can just eject the first k elements
  • Based on the terminology of "TrueRandom" in EjectTrueRandom I would assume that we actually want an unbiased sample (?)

So does this implementation produce an unbiased sample? Unfortunately not:

  • Consider a mempool with N=4096 = 64 * 64 elements from which we want to eject k=64.
    • An unbiased sampling algorithm must be able to generate the sample 0, 1, 2, ..., 63 as one possible outcome.
    • But the current implementation can only pick a single index from the interval [0, 63] and then picks the second index from [64, 127] and so on. It is quite biased in that it enforces a fairly uniform coverage of the total index space.
  • As a second example, consider a mempool with N=4100 = 64 * 64 + 4. Due to the integer division
    maxInterval := mapSize / batchSize
    the last 4 elements cannot be ejected (see the short illustration after this list).
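
A minimal numeric illustration of that second example (standalone Go, not code from this PR; mapSize and batchSize follow the names used above):

package main

import "fmt"

func main() {
    mapSize, batchSize := 4100, 64
    maxInterval := mapSize / batchSize         // integer division yields 64; the remainder of 4 is discarded
    lastReachable := batchSize*maxInterval - 1 // highest index the batched picker can address: 4095
    fmt.Println(maxInterval, lastReachable)    // indices 4096..4099 can never be chosen
}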

So overall, we can see that the ejection policy is not "true random"; it is in fact quite biased.

Suggestions

The fact of the matter is that the previous implementation had an unbiased sampling policy, which is provably safe (but with horrible performance). You have changed the default behaviour to an unsafe (yet more performant) ejection policy. This concerns me, because you might be breaking safety assumptions (I don't think this is actually the case ... but nevertheless, it's not good practice to change the default behaviour from safe to unsafe).

So what can we do? I think there are a few options.

In any case

We should make it unambiguously clear that the fast ejection policy is statistically biased:

  • choose a more accurate name, e.g. FastBiasedRandom
  • and include a detailed documentation in which way this ejection policy is biased (feel free to reuse my examples above)

Default behaviour

I think the default behaviour should remain a safe ejection policy. Nodes can always opt into the unsafe, more performant policy after careful consideration.

There are efficient unbiased sampling algorithms, which we could implement

The old implementation had complexity O(k*N). I think we should use your trick of ejecting in batches of constant size. But maybe we can use a provably unbiased sampling algorithm? There are some that are comparable in cost to your implementation. Suggestion:

  • you want to sample k elements from the index set [0, 1, ..., N-1]
  • The basic approach would be a Fisher-Yates shuffle, but it requires enumerating the full index set. Note, though, that your initial population P would simply be P[i] = i.
  • Instead of materializing P, you can memorize only the slots of P that you have actually touched.

Happy to write up a pseudo-code algorithm if that would help (a rough sketch follows after the list below). I feel we then have a good set of algorithms:

  • FakeRandomSampling if you don't need unbiased sampling
  • a solution that is much more efficient than the current EjectTrueRandom (utilizing your idea of batch ejection) that still produces an unbiased sample
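
A rough sketch of that idea (illustrative Go only; the name sampleIndices, its signature, and the demo in main are assumptions, not code from this PR). The population P[i] = i stays virtual: only slots that have actually been swapped are memorized in a map, so time and memory are O(k) while every k-subset remains equally likely:

package main

import (
    "fmt"
    "math/rand"
)

// sampleIndices draws k distinct indices uniformly at random from [0, n),
// assuming k <= n, using a partial Fisher-Yates shuffle over a virtual
// population P[i] = i. Only the slots that were swapped are stored.
func sampleIndices(n, k uint64, rng *rand.Rand) []uint64 {
    swapped := make(map[uint64]uint64, k) // overrides of the virtual population
    out := make([]uint64, 0, k)
    for i := uint64(0); i < k; i++ {
        // pick a slot uniformly from the not-yet-fixed tail [i, n)
        j := i + uint64(rng.Int63n(int64(n-i)))
        vj, ok := swapped[j] // value currently sitting at slot j (defaults to j)
        if !ok {
            vj = j
        }
        vi, ok := swapped[i] // value currently sitting at slot i (defaults to i)
        if !ok {
            vi = i
        }
        swapped[j] = vi // swap slots i and j ...
        swapped[i] = vj
        out = append(out, vj) // ... and take the element now at slot i
    }
    return out
}

func main() {
    rng := rand.New(rand.NewSource(1))
    fmt.Println(sampleIndices(4100, 64, rng)) // 64 distinct indices from [0, 4100)
}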

Resolved review threads: module/mempool/stdmap/eject.go (2 threads), module/mempool/stdmap/backend.go
jwinkler2083233 and others added 2 commits July 16, 2021 09:39
Co-authored-by: Alexander Hentschel <alex.hentschel@axiomzen.co>
Co-authored-by: Alexander Hentschel <alex.hentschel@axiomzen.co>
@jwinkler2083233
Contributor Author

I agree completely that the new logic has bias. The bias we're seeing is very small (<0.1%), but it should be eliminated nonetheless.

As a fix, how about I modify the part that generates random indices, so that the last index uses the actual remaining count of values to determine a random index? That would theoretically eliminate the bias in cases where the count of elements in the container is not evenly divisible by 64. (64 is an arbitrary value I chose because the compiler generates optimal code via bit shifting when 64 is the modulus.)
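
A hypothetical sketch of that tweak (identifiers such as pickBatchIndices, mapSize, and batchSize are illustrative, not the code actually merged; uses math/rand's *rand.Rand): one index is drawn per interval, and the final interval is stretched to cover the remainder, so the tail past the last full interval is no longer unreachable:

// pickBatchIndices picks one random index from each of batchSize consecutive
// intervals, assuming mapSize >= batchSize; the last interval absorbs the
// remainder of mapSize % batchSize.
func pickBatchIndices(mapSize, batchSize uint64, rng *rand.Rand) []uint64 {
    interval := mapSize / batchSize // size of each regular interval
    picks := make([]uint64, 0, batchSize)
    for i := uint64(0); i < batchSize; i++ {
        start := i * interval
        size := interval
        if i == batchSize-1 {
            size = mapSize - start // stretch the last interval to the end of the map
        }
        picks = append(picks, start+uint64(rng.Int63n(int64(size))))
    }
    return picks
}

Note that this only removes the "unreachable tail" bias; the one-pick-per-interval structure itself remains, as discussed in the follow-up comment below.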

This changeset tweaks the algorithm for ejecting elements
from the cache, such that there is less bias against the
remaining elements at the end of the cache.
Resolved review threads: module/mempool/stdmap/eject.go (3 threads)
@AlexHentschel
Member

AlexHentschel commented Jul 16, 2021

the last index uses the actual remaining count of values to determine a random index

That would eliminate one source of bias, namely that some elements have zero probability of being ejected.

Based on my understanding, the first bias:

  • Consider a mempool with N=4096 = 64 * 64 elements from which we want to eject k=64.
    • An unbiased sampling algorithm must be able to generate the sample 0, 1, 2, ..., 63 as one possible outcome.
    • But the current implementation can only pick a single index from the interval [0, 63] and then picks the second index from [64, 127] and so on. It is quite biased in that it enforces a fairly uniform coverage of the total index space.

still remains (?)

Jeff Winkler added 3 commits July 19, 2021 09:40
…/flow-go into jwinkler2083233/eject-fake-random
This changeset has a larger population used for the random values,
and those values are then sorted before they are used.
jwinkler2083233 and others added 3 commits July 23, 2021 14:52
Co-authored-by: François Garillot <4142+huitseeker@users.noreply.github.com>
I checked this in godbolt, and Go uses 10 fewer instructions
when a tuple is used for a swap.
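
For illustration only (not the PR's code; xs, i, and j are placeholder names), the two swap styles being compared:

package main

import "fmt"

func main() {
    xs := []int{10, 20, 30}
    i, j := 0, 2

    // temp-variable swap
    tmp := xs[i]
    xs[i] = xs[j]
    xs[j] = tmp

    // tuple-assignment swap (the form adopted here); on godbolt this compiled
    // to fewer instructions than the temp-variable version
    xs[i], xs[j] = xs[j], xs[i]

    fmt.Println(xs) // back to the original order after the two swaps
}
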
@jwinkler2083233
Contributor Author

jwinkler2083233 commented Jul 23, 2021

https://github.com/dapperlabs/flow-go/issues/5645 is the related issue

Member
@AlexHentschel left a comment

Some high-level design comments:

(1) Allowing the ejector to go below the specified capacity limit

(removed, as this was a misunderstanding on my part when interpreting the test case)

(2) Return values of the EjectFunc

With the updated convention for the EjectFunc, the return values no longer make sense to me: we specifically allow the ejector to remove multiple elements, yet it can only return a single one.

Suggestion:

EjectFunc func(b *Backend)

This is the API from the viewpoint of the Backend: it doesn't care what the ejector does specifically. If you need additional logic around the ejector (as your comment seems to indicate), I would suggest wrapping the ejector itself.
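
A hypothetical sketch of such a wrapper, assuming the EjectFunc signature suggested above (the name withEjectionCounter is illustrative, not part of this PR): the extra bookkeeping runs around the inner policy while the Backend still sees a plain EjectFunc:

// withEjectionCounter decorates an ejector with auxiliary logic without
// changing the EjectFunc signature the Backend relies on.
func withEjectionCounter(inner EjectFunc, counter *uint64) EjectFunc {
    return func(b *Backend) {
        *counter++ // auxiliary bookkeeping before delegating to the inner policy
        inner(b)
    }
}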

(3) Documentation of the EjectFunc

I think it would be worthwhile to clearly specify the conventions an ejector implementation has to satisfy. Currently, it states:

// EjectFunc is a function used to pick an entity to evict from the memory pool
// backend when it overflows its limit. A custom eject function can be injected
// into the memory pool upon creation, which allows us to hook into the eject
// to clean up auxiliary data and/or to change the strategy of eviction.
type EjectFunc func(b *Backend) (flow.Identifier, flow.Entity, bool)

Here is my interpretation of the convention:

// EjectFunc implements an ejection policy to remove elements when the mempool
// exceeds its specified capacity. A custom ejection policy can be injected
// into the memory pool upon creation to change the strategy of eviction.
// The ejection policy is executed from within the thread that serves the
// mempool. Implementations should adhere to the following convention:
//  * The ejector function has the freedom to eject _multiple_ elements.
//  * In a single `eject` call, it must eject as many elements as needed to
//    statistically keep the mempool size within the desired limit.
//  * The ejector _might_ (for performance reasons) retain more elements in the
//    mempool than the targeted capacity.
//  * The ejector _must_ notify the `Backend.ejectionCallbacks` for _each_
//    element it removes from the mempool.
//  * Implementations do _not_ need to be concurrency safe. The Backend handles
//    concurrency (specifically, it locks the mempool during ejection).
//  * The implementation should be non-blocking (though, it is allowed to
//    take a bit of time; the mempool will just be locked during this time).
type EjectFunc func(b *Backend)

I have roughly implemented my suggestions in https://github.com/onflow/flow-go/pull/1025/files

Resolved review threads: module/mempool/stdmap/backend.go, module/mempool/stdmap/backend_test.go, module/mempool/stdmap/eject.go (3 threads), module/mempool/stdmap/incorporated_result_seals.go
Member
@AlexHentschel left a comment

Looks good. There is a panicking edge case in EjectTrueRandomFast (see #945 (comment)); it is easy to fix. Approving this PR, trusting that this will be fixed before merging.

Thanks

Contributor
@huitseeker left a comment

Thanks! Batch ejection makes sense; I appreciate you taking the time to also make it legible!

@jwinkler2083233 jwinkler2083233 merged commit f2547a2 into master Jul 30, 2021
@jwinkler2083233 jwinkler2083233 deleted the jwinkler2083233/eject-fake-random branch July 30, 2021 16:14