New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEATURE BRANCH] More work on shuffle sharding utils #81182
[FEATURE BRANCH] More work on shuffle sharding utils #81182
Conversation
/sig api-machinery |
/cc @yue9944882 |
515aa1e
to
92359be
Compare
return hands[i] < hands[j] | ||
}) | ||
for i := uint64(0); i < hashMax; i++ { | ||
ShuffleAndDealIntoHand(i, deckSize, aHand[:]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, using i
rather than rand.Uint64()
as the hash value made the distribution in the histogram much tighter. Using i
does an even sampling of the hash value space (modulo the fact that the falling factorial is not a power of two), whereas using rand.Uint64()
relies on the quality of rand
(which evidently is not that great).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, if we set hashMax touint64(fallingFactorial)
then every hand gets dealt the same number of times.
/cc @yliaog |
92359be
to
3581c05
Compare
if handSize <= 0 || deckSize <= 0 || handSize > deckSize { | ||
return false | ||
// ValidateParameters finds errors in the parameters for shuffle | ||
// sharding. Returns an empty slice iff there are no errors. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iff is correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"iff" is standard jargon for "if and only if"
errs = append(errs, "deckSize is not positive") | ||
} | ||
if len(errs) > 0 { | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we combine some if to make return in if?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but I chose to use the existing structure because it made a certain amount of sense. First, we check the range of each individual value, accumulating errors. If there are any then take an early out. Then check logically necessary relationships. Early out if that revealed problems. Finally do the more complicated and costly quantitative check. I thought that would be easiest on other peoples' minds. But since you objected, I will rewrite to be more direct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i agree with Mike, the current flow is easier to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yliaog : I am not sure which you are agreeing with. Is the revised code OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the current code structure looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK,let's keep this.
allCoordinateCount := 128 * 127 * 126 / 6 | ||
fallingFactorial := 128 * 127 * 126 | ||
allCoordinateCount := fallingFactorial / 6 | ||
hashMax := uint64(16 << uint32(math.Ceil(math.Log2(float64(fallingFactorial))))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we have hashMax := fallingFactorial << (64 - maxHashBits)
here? The center will be 2^(64-maxHashBits). Seems easy to be understood.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is easy to prove that whenever hashMax is an integer multiple of fallingFactorial
the distribution of hand occurrences is perfect --- every hand gets dealt the same number of times. A more interesting test is when hashMax is a power of 2 rather than a multiple of fallingFactorial
.
We could replace 16 << X
with 1 << (X + 64 - maxHashBits)
.
histogram[count] = histogram[count] + 1 | ||
} | ||
|
||
for count := int(math.Ceil(center * 0.9)); count <= int(math.Floor(center*1.1)); count++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are still 3 fixed number here for test, 0.9, 1.1 and 0.99, and they are linked, hard to maintain I think. How about using standard deviation? Then we will only have one fixed number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @yue9944882
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
am fine with the hardcodes in the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are really only two numbers, not three; 0.9 and 1.1 are just 0.1 below and above unity.
I am not sure there is a real issue with maintainability here. The ShuffleAndDeal function is intended to implement a certain mathematical function; alternative implementations of the same function will produce exactly the same histogram (because it is a consequence of the mathematical function of shuffle-and-deal).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we bikeshed the evaluation of the histogram in follow-up PRs? I would like to get the validation function put to bed.
From the math of the ShuffleAndDeal function, when N*fallingFactorial
(for any whole number N) consecutive hash values are studied we know that each hand will be dealt the same number of times and that number is N*factorial(handSize)
. So to test that this is what happened the evaluation should sum the histogram from floor(hashMax/fallingFactorial)*factorial(handSize)
through ceiling(hashMax/fallingFactorial)*factorial(handSize)
and insist on finding 100% of the mass there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I revised again, making the uniformity tester run multiple test cases and evaluate whether all the counts in the histogram fall in the expected range.
// sharding. Returns an empty slice iff there are no errors. The | ||
// entropy requirement is evaluated in a fast but approximate way: | ||
// bits(deckSize^handSize). | ||
func ValidateParameters(deckSize, handSize int32) (errs []string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(errs []string) {
avoid named return value as much as possible, it's conventional in k/k to avoid that usage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can imagine reasons to avoid named return values in general, but in this case I think it works well. The alternative starts out with
func ValidateParameters(deckSize, handSize int32) []string {
errs := nil
and has more verbosity in the remainder too.
histogram[count] = histogram[count] + 1 | ||
} | ||
|
||
for count := int(math.Ceil(center * 0.9)); count <= int(math.Floor(center*1.1)); count++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
am fine with the hardcodes in the test
@@ -71,14 +88,14 @@ func ShuffleAndDealWithValidation(hashValue uint64, deckSize, handSize int32, pi | |||
|
|||
// ShuffleAndDealToSlice will use specific pick function to return slices of indices | |||
// after ShuffleAndDeal | |||
func ShuffleAndDealToSlice(hashValue uint64, deckSize, handSize int32) []int32 { | |||
func ShuffleAndDealToSlice(hashValue uint64, deckSize, handSize int32) []int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the point of s/int32/int/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort.Slice(int32Array, func(i, j int){ return int32Array[i] < int32Array[j]})
we can sort int32 in another style. all the elements in the returned slice are not supposed to exceed an int32 deckSize
, returning []int32 is more reasonable here..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://golang.org/ref/spec#Numeric_types promises that an int
can hold an int32
.
Yes, we could make clients sort in another way. But why should we? Returning int
makes for simpler client code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On further thought: I understand the dislike of the dissonance between int32
arguments and int
returns. Using int32
for the arguments is really just inertia from when this was entirely dedicated to the priority and fairness feature, which is configured with int32
parameters. But for general purpose use, I think int
is preferred as it is the natural integer data type. There is an error here in putting the boundary between priority-and-fairness config fealty and general sensibilities inside this function. I will recode so that this general purpose function takes int
size parameters as well as returning int
cards.
if handSize <= 0 || deckSize <= 0 || handSize > deckSize { | ||
return false | ||
// ValidateParameters finds errors in the parameters for shuffle | ||
// sharding. Returns an empty slice iff there are no errors. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iff is correct
f3105c4
to
a8342c6
Compare
/retest |
errs = append(errs, "deckSize is not positive") | ||
} | ||
if len(errs) > 0 { | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i agree with Mike, the current flow is easier to understand.
if handSize <= 0 || deckSize <= 0 || handSize > deckSize { | ||
return false | ||
// ValidateParameters finds errors in the parameters for shuffle | ||
// sharding. Returns an empty slice iff there are no errors. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it an empty slice or nil slice returned when there are no errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nil
slice is empty. https://play.golang.org/p/AdlgjTL2DdP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are subtle differences between the two,
https://www.pixelstech.net/article/1539870875-Empty-slice-vs-nil-slice-in-GoLang,
https://programming.guide/go/nil-slice-vs-empty-slice.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, by "empty" I meant "len() returns 0". I will reword to make it clearer.
handSize := int32(len(hand)) | ||
var idx int | ||
ShuffleAndDeal(hashValue, deckSize, handSize, func(card int32) { | ||
hand[idx] = int(card) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't idx always 0, since ShuffleAndDeal is called only once inside ShuffleAndDealIntoHand?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ShuffleAndDeal
calls its pick
parameter (which is the function here) handSize
times, and each increments idx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Mike is right.
3aa0ea3
to
270765e
Compare
errs = append(errs, "deckSize is not positive") | ||
} | ||
if len(errs) > 0 { | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK,let's keep this.
handSize := int32(len(hand)) | ||
var idx int | ||
ShuffleAndDeal(hashValue, deckSize, handSize, func(card int32) { | ||
hand[idx] = int(card) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Mike is right.
} | ||
|
||
// ShuffleAndDeal can shuffle a hash value to handSize-quantity and non-redundant | ||
// indices of decks, with the pick function, we can get the optimal deck index | ||
// Eg. From deckSize=128, handSize=8, we can get an index array [12 14 73 18 119 51 117 26], | ||
// then pick function will choose the optimal index from these | ||
// Algorithm: https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/20190228-priority-and-fairness.md#queue-assignment-proof-of-concept | ||
func ShuffleAndDeal(hashValue uint64, deckSize, handSize int32, pick func(int32)) { | ||
remainders := make([]int32, handSize) | ||
func ShuffleAndDeal(hashValue uint64, deckSize, handSize int, pick func(int)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MikeSpreitzer @yue9944882 why do we change int32 to int here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I documented that in the commit comment. See it at aadd486 , for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also wrote about it in another comment: #81182 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, I think both are OK if we accept the type conversion in priority and fairness. cc @yue9944882
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
convinced, let's make it an int
minCount := permutations * int(math.Floor(nff)) | ||
maxCount := permutations * int(math.Ceil(nff)) | ||
aHand := make([]int, test.handSize) | ||
for i := 0; i < test.hashMax; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is good to test from 0 to hashMax, because we all know the principle of our algorithm, it will get same results in [allCoordinateCount*N, allCoordinateCount*(N+1) ), so the test case will always succeed, which is not expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the point of unit tests is to check whether the implementation is doing what it is intended to do. If we were confident that the implementation does what it is intended to do then these unit tests would prove nothing, and the only things to prove would be math exercises that do not need to be repeated in unit tests.
By exploring from 0 to hashMax-1, we can test a range of hash space that is like what gets used in the priority and fairness filter --- a range from 0 to SomePowerOfTwo-1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I think these test cases are enough for priority and fairness filter.
}{ | ||
{64, 3, 1 << uint(math.Ceil(math.Log2(float64(ff(64, 3))))+spare)}, | ||
{128, 3, ff(128, 3)}, | ||
{128, 3, 3 * ff(128, 3)}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If {128, 3, ff(128,3)} pass, {128, 3, 3 *ff(128, 3)} will pass either, so I think this case is duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thinking for the {128, 3, 3 * ff(128, 3)}
test case is that by exploring a larger hash value space this gives the implementation more opportunities to mess up.
/hold cancel |
minCount := permutations * int(math.Floor(nff)) | ||
maxCount := permutations * int(math.Ceil(nff)) | ||
aHand := make([]int, test.handSize) | ||
for i := 0; i < test.hashMax; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I think these test cases are enough for priority and fairness filter.
/lgtm |
@mars1024: changing LGTM is restricted to collaborators In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/hold for the clean-up from FQ package |
aadd486
to
4c4f536
Compare
Latest revision changes the fairqueuing package to use the shuffle sharding utility package. |
@@ -254,10 +231,16 @@ func (qs *queueSetImpl) ChooseQueueIdx(hashValue uint64, handSize int) int { | |||
// TODO(aaron-prindle) currently a lock is held for this in a larger anonymous function | |||
// verify that makes sense... | |||
|
|||
bestQueueIdx := -1 | |||
bestQueueLen := 2147483647 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
math.MaxInt32?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This way it is an integer and no conversion is required when doing the comparison
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see that constant is actually untyped. So the revised code has an unnecessary typecast. Oh well.
Changes following up on PR #807810 . Made the validation checking function return a slice of error messages rather than just a bit. Replaced all the `int32` with `int` because this is intended for more than just the priority-and-faireness feature and so should not be a slave to its configuration datatypes. Introduced ShuffleAndDealIntoHand, to make memory allocation the caller's problem/privilege. Made the hand uniformity tester avoid reflection, evaluate the histogram against the expected range of counts, and run multiple test cases, including one in which the number of hash values is a power of two with four extra bits (as the validation check requires) and one in which the deck size is not a power of two. Updated the fairqueuing implementation to use the shuffle sharding utility package.
/hold cancel |
24e9424
to
e5c9f50
Compare
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: MikeSpreitzer The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
/test pull-kubernetes-integration |
/test pull-kubernetes-kubemark-e2e-gce-big |
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Changes following up on PR #80710 .
Made the validation checking function return a slice of error messages
rather than just a bit.
Made the shard-to-slice functions return
[]int
rather than[]int32
onthe expectation of increased convenience.
Made the hand uniformity tester avoid reflection, use a number of hash
values corresponding to the validation check, and evaluate a histogram
of hand counts.
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: