Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: retry when overdraft peer might be closest #3052

Merged
merged 8 commits into from Jul 6, 2022
Merged

Conversation

notanatol
Copy link
Contributor

@notanatol notanatol commented Jun 29, 2022

Checklist

Description

Push Sync closestPeer overdraft case

Closes: #3048


This change is Reviewable

@notanatol notanatol marked this pull request as draft June 29, 2022 07:52
@notanatol notanatol changed the title fix: wip fix: retry when overdraft peer might be closest Jun 29, 2022
@notanatol notanatol marked this pull request as ready for review June 29, 2022 10:22
pkg/pushsync/pushsync.go Outdated Show resolved Hide resolved
@istae istae requested a review from metacertain June 29, 2022 17:03
pkg/pushsync/pushsync.go Outdated Show resolved Hide resolved
Copy link
Contributor

@mrekucci mrekucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 4 files at r1, 1 of 1 files at r2.
Reviewable status: 3 of 4 files reviewed, 3 unresolved discussions (waiting on @acud, @aloknerurkar, @istae, @metacertain, @mrekucci, and @notanatol)


pkg/pushsync/pushsync.go line 388 at r2 (raw file):

				skipPeers.ResetOverdraft() // reset the overdraft list and retry (in case the closest peer was there)
				return swarm.ZeroAddress, errContinue

My suggestion here would be to return three parameters instead of using a specific error to signal the retry.


pkg/pushsync/pushsync.go line 428 at r2 (raw file):

Previously, aloknerurkar wrote…

If we continue here after an errContinue, the select case would just hang. This assumes a push has been initiated and we wait for the resultChan to unblock us.

I'd suggest avoiding goto here, if 3rd param is returned, this can be rewritten as:

// ...
peer, retry, err := nextPeer()
if err != nil {
	return nil, err
}

// ...

if retry {
	// reached the limit, do not set timer to retry
	if allowedRetries <= 0 || allowedPushes <= 0 {
		continue
	}

	// retry
	timer.Reset(p90TTL)
}

// ...

pkg/skippeers/skip_peers_test.go line 37 at r2 (raw file):

	})

	t.Run("add overdraft removes from addresses", func(t *testing.T) {

The subtests don't add much value here and can cause problems. For example: if I only run the "add overdraft removes from addresses" subtest, it will fail because it depends on the "add peer" subtest being run before it.


pkg/skippeers/skip_peers.go line 16 at r2 (raw file):

	overdraftAddresses []swarm.Address
	addresses          []swarm.Address
	mu                 sync.Mutex

Mutex is usually placed before the fields it guards.


pkg/skippeers/skip_peers.go line 36 at r2 (raw file):

	defer s.mu.Unlock()

	for _, a := range s.addresses {

I'd suggest using map in order to avoid iterating the whole set before adding a new element.


pkg/skippeers/skip_peers.go line 49 at r2 (raw file):

	defer s.mu.Unlock()

	for _, a := range s.overdraftAddresses {

I'd suggest using map in order to avoid iterating the whole set before adding a new element.


pkg/skippeers/skip_peers.go line 70 at r2 (raw file):

	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.overdraftAddresses) <= 0

Just a nitpick, len(s.overdraftAddresses) == 0 is enough, since len() cannot be less than 0.

@mrekucci mrekucci self-requested a review June 30, 2022 09:36
Copy link
Contributor Author

@notanatol notanatol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 3 unresolved discussions (waiting on @acud, @aloknerurkar, @istae, @metacertain, @mrekucci, and @notanatol)


pkg/skippeers/skip_peers.go line 36 at r2 (raw file):

Previously, mrekucci wrote…

I'd suggest using map in order to avoid iterating the whole set before adding a new element.

for small datasets a slice is cheaper than a map, if I'm not mistaken.

Copy link
Contributor

@mrekucci mrekucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r3, 2 of 2 files at r4, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @acud, @aloknerurkar, @istae, @metacertain, and @notanatol)


pkg/skippeers/skip_peers.go line 36 at r2 (raw file):

Previously, notanatol (Anatol) wrote…

for small datasets a slice is cheaper than a map, if I'm not mistaken.

It's up to a ~100 entries.

pkg/pushsync/pushsync.go Outdated Show resolved Hide resolved
pkg/pushsync/pushsync.go Outdated Show resolved Hide resolved
pkg/skippeers/skip_peers.go Outdated Show resolved Hide resolved
pkg/pushsync/pushsync.go Outdated Show resolved Hide resolved
Copy link
Contributor Author

@notanatol notanatol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 8 unresolved discussions (waiting on @acud, @aloknerurkar, @istae, @metacertain, @mrekucci, and @notanatol)


pkg/pushsync/pushsync.go line 388 at r2 (raw file):

Previously, mrekucci wrote…

My suggestion here would be to return three parameters instead of using a specific error to signal the retry.

fixed


pkg/pushsync/pushsync.go line 428 at r2 (raw file):

Previously, mrekucci wrote…

I'd suggest avoiding goto here, if 3rd param is returned, this can be rewritten as:

// ...
peer, retry, err := nextPeer()
if err != nil {
	return nil, err
}

// ...

if retry {
	// reached the limit, do not set timer to retry
	if allowedRetries <= 0 || allowedPushes <= 0 {
		continue
	}

	// retry
	timer.Reset(p90TTL)
}

// ...

Done.


pkg/pushsync/pushsync.go line 351 at r4 (raw file):

Previously, istae (istae) wrote…

replace with a better error message

swapped for a boolean instead of signal error


pkg/pushsync/pushsync.go line 488 at r4 (raw file):

Previously, istae (istae) wrote…

no need to export, and can be moved to the top of the file.

Done.


pkg/skippeers/skip_peers.go line 23 at r4 (raw file):

Previously, istae (istae) wrote…

this is a bit strange, why do we do this?

not sure, this piece of code was moved without changing, I think it's just a redundant way to make sure we make a copy of a slice...

Copy link
Contributor Author

@notanatol notanatol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 8 unresolved discussions (waiting on @acud, @aloknerurkar, @istae, @metacertain, @mrekucci, and @notanatol)


pkg/pushsync/pushsync.go line 415 at r5 (raw file):

Previously, istae (istae) wrote…

can be replaced with a variable, @metacertain is half a second sufficient?

done

@notanatol notanatol changed the base branch from master to ayscn_refresh July 5, 2022 07:46
Copy link
Contributor Author

@notanatol notanatol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 13 files reviewed, 8 unresolved discussions (waiting on @acud, @aloknerurkar, @istae, @metacertain, and @mrekucci)


pkg/pushsync/pushsync.go line 411 at r5 (raw file):

Previously, istae (istae) wrote…

much cleaner, thanks :)

👍

@notanatol notanatol changed the base branch from ayscn_refresh to master July 5, 2022 10:53
Copy link
Member

@acud acud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 4 files at r1, 1 of 1 files at r6, 2 of 2 files at r7, all commit messages.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @aloknerurkar, @istae, @metacertain, @mrekucci, and @notanatol)


pkg/pushsync/pushsync.go line 361 at r7 (raw file):

	defer timer.Stop()

	nextPeer := func() (swarm.Address, bool, error) {

it is not clear what this boolean value means. it would be nice to have named returns here at least for the sake of clarity, or at least adequate comments describing what is being returned


pkg/skippeers/skip_peers.go line 23 at r4 (raw file):

Previously, notanatol (Anatol) wrote…

fixed

that's correct, the double append makes sure that we create a new slice. if you return an append you must make sure that there's no calling context that does an append on the returned slice, otherwise you risk some external component manipulating the underlying array in an unsolicited manner.

in general in getters i would keep things like this, as it is very risky to return the underlying address of the slice. also, bear in mind that when you do an append(s.addresses, s.overdraftAddresses...) you're basically mutating s.addresses to have s.overdraftAddresses, so every time you'll call the getter you're constantly mutating s.addresses.

I created a go playground for this but I suspect a compiler optimization is preventing the mutation of the underlying field. The documentation clearly shows that the underlying slice should be mutated here. I would stick to returning a new slice.

Copy link
Member

@acud acud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @aloknerurkar, @istae, @metacertain, @mrekucci, and @notanatol)


pkg/skippeers/skip_peers.go line 23 at r4 (raw file):

Previously, acud (acud) wrote…

that's correct, the double append makes sure that we create a new slice. if you return an append you must make sure that there's no calling context that does an append on the returned slice, otherwise you risk some external component manipulating the underlying array in an unsolicited manner.

in general in getters i would keep things like this, as it is very risky to return the underlying address of the slice. also, bear in mind that when you do an append(s.addresses, s.overdraftAddresses...) you're basically mutating s.addresses to have s.overdraftAddresses, so every time you'll call the getter you're constantly mutating s.addresses.

I created a go playground for this but I suspect a compiler optimization is preventing the mutation of the underlying field. The documentation clearly shows that the underlying slice should be mutated here. I would stick to returning a new slice.

Maybe it would be good to have a unit test, to see that returning the appended output does not result in a grow of the field (i.e. repeated calls with different output return the expected output, not resulting in a grow of the slice, consider also mutating the appended content to see that the returned value does not mutate).

Copy link
Contributor Author

@notanatol notanatol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 4 files reviewed, 9 unresolved discussions (waiting on @acud, @aloknerurkar, @istae, @metacertain, and @mrekucci)


pkg/pushsync/pushsync.go line 361 at r7 (raw file):

Previously, acud (acud) wrote…

it is not clear what this boolean value means. it would be nice to have named returns here at least for the sake of clarity, or at least adequate comments describing what is being returned

added comment and names


pkg/skippeers/skip_peers.go line 23 at r4 (raw file):

Previously, acud (acud) wrote…

Maybe it would be good to have a unit test, to see that returning the appended output does not result in a grow of the field (i.e. repeated calls with different output return the expected output, not resulting in a grow of the slice, consider also mutating the appended content to see that the returned value does not mutate).

I replaced it with an explicit copy-on-read version, pls review

Copy link
Member

@acud acud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r8, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @aloknerurkar, @istae, @metacertain, and @mrekucci)

@notanatol notanatol merged commit 7cc505e into master Jul 6, 2022
@notanatol notanatol deleted the fix-push-overdraft branch July 6, 2022 07:48
@aloknerurkar aloknerurkar added this to the 1.6.3 milestone Jul 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Push Sync closestPeer overdraft case
6 participants