Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shuffle sharding package for priority and fairness #83665

Open
wants to merge 3 commits into
base: master
from

Conversation

@mars1024
Copy link
Member

commented Oct 9, 2019

What type of PR is this?

/kind feature

What this PR does / why we need it:

Implement package for shuffle sharding algorithm, which will be used in KEP priority and fariness .

Special notes for your reviewer:

Squashed PRs #80710 #81182 , related PRs #81833 (need to be merged in flowcontrol package later)

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

Implement several shuffle sharding functions including ShuffleAndDeal,
ShuffleAndDealToSlice.

Add benchmarks and tests for shuffle sharding to test performance,
correctness and distribution uniformity.

Signed-off-by: Bruce Ma <brucema19901024@gmail.com>
@mars1024

This comment has been minimized.

Copy link
Member Author

commented Oct 9, 2019

@mars1024 mars1024 changed the title shuffle sharding package shuffle sharding package for priority and fairness Oct 9, 2019
Changes following up on shuffle sharding util package.

Made the validation checking function return a slice of error messages
rather than just a bit.

Replaced all the `int32` with `int` because this is intended for more
than just the priority-and-faireness feature and so should not be a
slave to its configuration datatypes.

Introduced ShuffleAndDealIntoHand, to make memory allocation the
caller's problem/privilege.

Made the hand uniformity tester avoid reflection, evaluate the
histogram against the expected range of counts, and run multiple test
cases, including one in which the number of hash values is a power of
two with four extra bits (as the validation check requires) and one in
which the deck size is not a power of two.
@mars1024 mars1024 force-pushed the mars1024:feat/shuffle-sharding branch from 99738ec to da0b647 Oct 9, 2019
@mars1024

This comment has been minimized.

Copy link
Member Author

commented Oct 9, 2019

/retest

Copy link
Member

left a comment

I made some independent comments in-line.
Also, why not have this PR put the files in the proper directory?

Copy link
Member

left a comment

First set of comments, I might have more but I think this public interface needs to be changed first.

@mars1024

This comment has been minimized.

Copy link
Member Author

commented Oct 11, 2019

I made some independent comments in-line.
Also, why not have this PR put the files in the proper directory?

Which proper directory you mean?

@mars1024 mars1024 force-pushed the mars1024:feat/shuffle-sharding branch from a6504f3 to 6238761 Oct 12, 2019
@mars1024

This comment has been minimized.

Copy link
Member Author

commented Oct 12, 2019

Updated, introduce Dealer and address some comments.

@mars1024 mars1024 force-pushed the mars1024:feat/shuffle-sharding branch from 6238761 to 28d8eaa Oct 12, 2019
@mars1024

This comment has been minimized.

Copy link
Member Author

commented Oct 12, 2019

/retest

1 similar comment
@mars1024

This comment has been minimized.

Copy link
Member Author

commented Oct 14, 2019

/retest

@mars1024 mars1024 force-pushed the mars1024:feat/shuffle-sharding branch from 28d8eaa to e022dba Oct 15, 2019
@lavalamp

This comment has been minimized.

Copy link
Member

commented Oct 15, 2019

I think we are super close now, thanks for bearing with us :)

@mars1024 mars1024 force-pushed the mars1024:feat/shuffle-sharding branch from e022dba to 08c7ad8 Oct 18, 2019
@mars1024

This comment has been minimized.

Copy link
Member Author

commented Oct 18, 2019


// DealIntoHand shuffles a deck of the given size by the given hash value and
// deals cards into a given hand(slice) which should be pre-allocated.
// The size of the given hand must be equal to the initial handSize of the Dealer.

This comment has been minimized.

Copy link
@mars1024

mars1024 Oct 18, 2019

Author Member

Last question, should we handle the situation that the size of given hand is not equal to the handSize of Dealer, do nothing or return error? Or just put this limitation in comments to let user obey this? @lavalamp @MikeSpreitzer

This comment has been minimized.

Copy link
@MikeSpreitzer

MikeSpreitzer Oct 18, 2019

Member

This function should return an error that is non-nil in that case and that case only.

This comment has been minimized.

Copy link
@lavalamp

lavalamp Oct 18, 2019

Member

An alternative is:

func (d *Dealer) DealIntoHand(hashValue uint64, hand []int) []int {
	h := hand[:0]
	d.Deal(hashValue, func(card int) { h = append(h, int(card)) })
	return h
}

This has no chance of failing and is zero-alloc if a hand of the correct size is passed in. Callers don't have to check an error.

This comment has been minimized.

Copy link
@MikeSpreitzer

MikeSpreitzer Oct 18, 2019

Member

Ah, good idea! And you mean "a hand of an adequate size", right? This code even works if the caller passes in nil for the slice, right? That would be good. The comment should be updated too.

We could add a check of cap(hand) >= d.handSize at the start of that function and explicitly allocate a big-enough h if needed; that would perform better in the case that the capacity is 2 or more lower than needed, right? I would be happy to continue not doing that, on the grounds that that would be for a use case that we do not want to encourage.

This comment has been minimized.

Copy link
@mars1024

mars1024 Oct 21, 2019

Author Member

Good idea from @lavalamp , thanks!

Clean up useless functions, only keep the basic function Deal
and the function DealIntoHand which will be used by Priority
and Fairness.

Improve some comments for constants and functions.

Introduce Dealer to combine parameters and methods into a whole.

Use fixed-size slice to improve performance.

Use math.Ceil and math.Log2 to calculate required entropy bits.

Make the given hand adaptive to handSize in DealIntoHand.

Signed-off-by: Bruce Ma <brucema19901024@gmail.com>
@mars1024 mars1024 force-pushed the mars1024:feat/shuffle-sharding branch from 08c7ad8 to c67c64d Oct 21, 2019
if handSize > deckSize {
return nil, fmt.Errorf("handSize %d is greater than deckSize %d", handSize, deckSize)
}
if deckSize > 1<<26 {

This comment has been minimized.

Copy link
@MikeSpreitzer

MikeSpreitzer Oct 21, 2019

Member

Since the function's comment is documenting error conditions, this one has to be included.

// DealIntoHand shuffles a deck of the given size by the given hash value and
// deals cards into a given hand(slice) which should be pre-allocated, if the
// size of the given hand is not equal to the handSize of the Dealer, the given
// hand will be expanded or shrunk to the "handSize", then will be returned.

This comment has been minimized.

Copy link
@MikeSpreitzer

MikeSpreitzer Oct 21, 2019

Member

This comment needs some adjusting. The given slice is passed by value, so the caller sees no modification to that. Exact equality is not important here, note that what we actually have is a one-sided constraint. I suggest a revision like the following.

// DealIntoHand suffles and deals according to the Dealer's parameters,
// using the given hashValue as the source of entropy and
// returning the dealt cards as a slice of int. The returned slice uses the
// backing array of the given slice if it is big enough --- which it should be;
// the function will perform correctly but not optimally otherwise.

This comment has been minimized.

Copy link
@lavalamp

lavalamp Oct 21, 2019

Member

Perhaps even clearer:

// DealIntoHand suffles and deals according to the Dealer's parameters,
// using the given hashValue as the source of entropy and
// returning the dealt cards as a slice of `int`.
// If `hand` is the correct length, it will be used as-is and no allocations will be made.
// If `hand` is nil or too small, it will be extended (performing an allocation).
// If `hand` is too large, a sub-slice will be returned.
// This function makes handSize sequential calls to pick, one for each dealt card.
// Each card is identified by an integer in the range [0, deckSize).
// For example, for deckSize=128 and handSize=4 this function might call pick(14); pick(73); pick(119); pick(26).
// This function can be extended with different pick functions.

This comment has been minimized.

Copy link
@MikeSpreitzer

MikeSpreitzer Oct 21, 2019

Member

This comment needs adjusting. The parameters are no longer given here. The statement that "This function can be extended with different pick functions" is awkward: "extend" is not quite right, the caller must always supply a pick function, and it's obvious (since we say nothing to forbid it) that the caller can supply whatever pick function it wants. I suggest something like the following.

// Deal shuffles a card deck and deals a hand of cards, using the given hashValue as the source of entropy.
// The deck size and hand size are properties of the Dealer.
// This function synchronously makes sequential calls to pick, one for each dealt card.
// Each card is identified by an integer in the range [0, deckSize).
// For example, for deckSize=128 and handSize=4 this function might call pick(14); pick(73); pick(119); pick(26).

@MikeSpreitzer

This comment has been minimized.

Copy link
Member

commented Oct 21, 2019

/retest

@MikeSpreitzer

This comment has been minimized.

Copy link
Member

commented Oct 21, 2019

FYI, I wondered whether returning the slice would confuse the compiler's escape analysis. To investigate that, I ran the unit tests with go test -gcflags -m .. That produced many remarks from the escape analysis, but none said that the given hand escapes in the unit test lines that invoke Dealer::DealIntoHand. However, those are probably not the lines we're looking for.

@MikeSpreitzer

This comment has been minimized.

Copy link
Member

commented Oct 21, 2019

Looking more carefully at the escape analysis remarks, I see the following. The unit test file makes three calls to DealIntoHand. The last two are coded in such a way that the given hand escapes to the heap regardless of what DealIntoHand does, and the analysis says that the hand escapes. The first one is not coded in that way, but the given hand is always empty (make([]int, 0)). For this the analysis says that it does not escape.

@MikeSpreitzer

This comment has been minimized.

Copy link
Member

commented Oct 21, 2019

I extended the last test function with some code that uses a non-empty given hand, and the analysis said it does not leak. Here is the code:

	hf := make([]int, 6, 12)
	hf2 := dealer.DealIntoHand(42, hf)
	if len(hf2) != 6 {
		t.Errorf("Failed at last!")
	}

and here is what the analysis said:

./shufflesharding_test.go:314:12: TestDealer_DealIntoHand make([]int, 6, 12) does not escape
@lavalamp

This comment has been minimized.

Copy link
Member

commented Oct 21, 2019

I am satisfied by this, I think everything remaining is a nit.

/approve

(anyone can LGTM one you fix the comments)

Thanks for the escape analysis, Mike!

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Oct 21, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lavalamp, mars1024

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.