Redis: Backfill last acknowledged #1288

Merged: 17 commits into googleforgames:master on Dec 2, 2020

Collaborator

@aLekSer aLekSer commented Nov 19, 2020

What this PR does / Why we need it:
Add a sorted set that stores Backfills in sorted order, so that all expired Backfills can be retrieved.

Which issue(s) this PR fixes:
Work on #1240

Special notes for your reviewer:
This PR is dependent on StateStore changes.
We can select a specific TTL config variable later during implementation of Cleanup Service.

func (rb *redisBackend) AcknowledgeBackfill(ctx context.Context, id string) error
func (rb *redisBackend) GetExpiredBackfills(ctx context.Context) ([]string, error)
// Used in DeleteBackfill()
func (rb *redisBackend) deleteExpiredBackfillID(ctx context.Context, backfillID string) error

google-cla bot commented Nov 19, 2020

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added the cla: no label Nov 19, 2020
@hsorellana
Contributor

Hey @aLekSer, look at this. "Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request"

@aLekSer
Collaborator Author

aLekSer commented Nov 19, 2020

@hsorellana that will not be an issue once Alexey's PR is merged. After that, this PR will contain only my changes.

@google-cla google-cla bot added cla: yes and removed cla: no labels Nov 23, 2020
@aLekSer aLekSer marked this pull request as ready for review November 23, 2020 10:23
}

// DeleteExpiredBackfillIDs - delete expired BackfillIDs from a sorted set
func (rb *redisBackend) DeleteExpiredBackfillIDs(ctx context.Context, backfillIDs []string) error {

It's not clear to me how this will be used?

Collaborator Author

@aLekSer aLekSer Nov 23, 2020

It would be used by a cleanup service:
1st step: backfillIds, err := GetExpiredBackfills()
2nd step: return the tickets associated with those Backfills to the pool.
Some AcknowledgeBackfill() calls could occur in between.
3rd step: DeleteExpiredBackfillIDs(backfillIds). Note that more Backfills could have expired in the meantime, but we only delete those that were processed.

Why not just use the same mechanics as the delete method? Reduces implementation complexity and possible bugs.

Collaborator Author

Some DeleteBackfill() requests can have issues while others do not, so the backend has the opportunity to select on its own.

Collaborator Author

Got an idea now. This can be done as part of DeleteBackfill

defer handleConnectionClose(&redisConn)

// the same TTL is used for both Backfills and pendingRelease Tickets
ttl := rb.cfg.GetDuration("pendingReleaseTimeout")

This should be something like 80% to 90% of ttl, since the backfill should expire before the tickets waiting on it. It would be a problem if the tickets timed out while the backfill did not time out and was still being acknowledged.

}
defer handleConnectionClose(&redisConn)

currentTime := time.Now().UnixNano()

@Laremere is there an agreement not to use UTC()? time.Now() is used everywhere.

The assumption is that all computers in a cluster will have clocks that are synchronized closely enough.

}
defer handleConnectionClose(&redisConn)

cmds := make([]interface{}, 0, len(backfillIDs)+1)

good one

@@ -301,3 +302,54 @@ func TestDeleteBackfill(t *testing.T) {
require.Equal(t, codes.Unavailable.String(), status.Convert(err).Code().String())
require.Contains(t, status.Convert(err).Message(), "DeleteBackfill, id: 12345, failed to connect to redis:")
}

func TestAcknowledgeBackfill(t *testing.T) {

@akremsa akremsa Nov 23, 2020

Actually, this is no longer a unit test for AcknowledgeBackfill. Here you test the whole acknowledge scenario. This is nice, but I would call this test something like TestAcknowledgeLifecycle.
A unit test should exercise exactly one unit (method) under different conditions.
For backfill, it's enough to check that after calling AcknowledgeBackfill the passed ID is added to the backfill_last_ack_time sorted set.
So I suggest renaming this method and adding a TestAcknowledgeBackfill that tests only the AcknowledgeBackfill method. What do you think?

@@ -84,6 +84,15 @@ type Service interface {

// NewMutex returns an interface of a new distributed mutex with given name
NewMutex(key string) RedisLocker

// AcknowledgeBackfill - store Backfill's last accessed time
AcknowledgeBackfill(ctx context.Context, id string) error

nit: I suggest sticking to the current comments format like:
AcknowledgeBackfill stores Backfill's last accessed time

Collaborator Author

// AcknowledgeBackfill - stores Backfill's last acknowledgement time

_, err = redisConn.Do("ZADD", cmds...)
if err != nil {
err = errors.Wrap(err, "failed to store backfill's last acknowledgement time")
return status.Error(codes.Internal, err.Error())

nit: I think here you can use status.Errorf(...., err) as you do above. We should probably settle on a single error logging format across this file.

endTimeInt := curTime.Add(-ttl).UnixNano()
startTimeInt := 0

// Filter out tickets that are fetched but not assigned within TTL time (ms).

Filter out backfill ids ... (not "tickets")

@aLekSer aLekSer changed the title Redis/backfill last ack Redis: backfill last acknowledged Nov 25, 2020
@aLekSer aLekSer changed the title Redis: backfill last acknowledged Redis: Backfill last acknowledged Nov 25, 2020
@aLekSer
Collaborator Author

aLekSer commented Nov 25, 2020

@Laremere Scott, note that I will add a config parameter for the Backfill time in a separate PR to reduce the complexity of this one. It is ready for review; I hope I addressed all comments correctly.

Store Backfills in a sorted set to retrieve all expired backfills by time.
It takes as input the output of the GetExpiredBackfillIDs function.
Some backfillIDs can be acknowledged in between, but this should not affect the work of the cleanup service.
@Laremere

Force push is adding more work to re-review. Friendly reminder to not do that (otherwise "see changes since your last view" breaks.)
If you're having trouble pushing (eg, you used update branch button at some point), then do git pull before you push.

@aLekSer
Collaborator Author

aLekSer commented Nov 26, 2020

@Laremere Do you mean that we should also not rebase onto a new master? I am used to doing that on Agones.

Read and update Redis content to make them independent of other
functions.
}

// deleteExpiredBackfillID - delete expired BackfillID from a sorted set
func (rb *redisBackend) deleteExpiredBackfillID(ctx context.Context, backfillID string) error {

You can pass the redis connection here from the caller method. What do you think?

_, err := conn.Do("ZREM", cmds...)
if err != nil {
return status.Errorf(codes.Internal, "failed to delete expired backfill ID %v",
errors.Wrap(err, "failed to delete expired backfill ID from Sorted Set"))

You use the same error message twice here; please clean it up.

}

// test that Backfill also deleted from last acknowledged sorted set
_, err = redis.Int64(conn.Do("ZSCORE", backfillLastAckTime, tc.backfillID))

There is no point in performing this check when testing the case with an empty backfill id. I think it's better to put it under an if tc.backfillID != "" statement.

@Laremere

@Laremere Do you mean that we should also not rebase onto a new master? I am used to doing that on Agones.

We squash merge all PRs anyways, so rebasing isn't necessary.

// test that Backfill last acknowledged is in a sorted set
ts, err := redis.Int64(conn.Do("ZSCORE", backfillLastAckTime, tc.backfillID))
require.NoError(t, err)
require.True(t, ts > 0, "timestamp is valid")

timestamp is NOT valid

Fixing error returned and text in Test failure.
}
defer handleConnectionClose(&redisConn)

// Use a fraction 80% of pendingRelease Tickets TTL
Collaborator Author

@aLekSer aLekSer Nov 30, 2020

Will update this in this PR or the next one: create a new config parameter.

cmds := make([]interface{}, 0)
cmds = append(cmds, backfillLastAckTime, currentTime, id)

_, err = redisConn.Do("ZADD", cmds...)

There's no need for the oddness with cmds, just do:
_, err = redisConn.Do("ZADD", backfillLastAckTime, currentTime, id)
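
To illustrate why the intermediate slice is unnecessary: with a variadic parameter, expanding a slice with `...` and passing the values directly hand the callee the same arguments. In the sketch below, `do` is a hypothetical stand-in for the connection's Do method and only renders what it receives; the key, score, and ID values are made up.

```go
package main

import "fmt"

// do is a hypothetical stand-in for a connection's Do method. It renders
// the command and its arguments so the two call styles can be compared.
func do(cmd string, args ...interface{}) string {
	return fmt.Sprintf("%s %v", cmd, args)
}

func main() {
	key, score, id := "backfill_last_ack_time", int64(42), "bf-1"

	// Style used in the hunk: build a slice, then expand it.
	cmds := make([]interface{}, 0)
	cmds = append(cmds, key, score, id)
	viaSlice := do("ZADD", cmds...)

	// Style suggested in the review: pass the values directly.
	direct := do("ZADD", key, score, id)

	fmt.Println(viaSlice == direct) // prints "true"
}
```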

@@ -28,7 +29,8 @@ import (
)

const (
allBackfills = "allBackfills"
backfillLastAckTime = "backfill_last_ack_time"

This also needs to be set during backfill creation, so that a server which never starts up can be cleaned up.

Collaborator Author

Yes, that's true. I made a TODO for this and wanted to do it as part of a separate PR.
But I think it can be added here as well.

Collaborator Author

Added an additional acknowledgeBackfill call to StateStore CreateBackfill.

cmds := make([]interface{}, 0, 2)
cmds = append(cmds, backfillLastAckTime, backfillID)

_, err := conn.Do("ZREM", cmds...)

Here too.

require.NoError(t, err)
err = service.AcknowledgeBackfill(ctx, bf2)
require.NoError(t, err)
// Sleep until the pending release expired and verify we still have all the tickets

After acknowledging, but before sleeping, verify the expired backfills.

The better place to do updates on CreateBackfill.
Reuse one connection, added a helper function for that.
@@ -59,6 +61,8 @@ func (rb *redisBackend) CreateBackfill(ctx context.Context, backfill *pb.Backfil
if res.(int64) == 0 {
return status.Errorf(codes.AlreadyExists, "backfill already exists, id: %s", backfill.GetId())
}

acknowledgeBackfill(redisConn, backfill.GetId())

@akremsa akremsa Dec 1, 2020

you are ignoring an error here

Collaborator Author

yes, thanks for noticing, will update
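
One possible shape for the fix, sketched with the standard library only. The functions `acknowledge` and `create` are hypothetical stand-ins for the helper and its caller; the actual change in this PR may wrap the error differently (for example with a gRPC status).

```go
package main

import (
	"errors"
	"fmt"
)

// acknowledge is a hypothetical helper that can fail.
func acknowledge(id string) error {
	if id == "" {
		return errors.New("empty backfill id")
	}
	return nil
}

// create returns the helper's error to its caller instead of discarding it.
func create(id string) error {
	if err := acknowledge(id); err != nil {
		return fmt.Errorf("create backfill %q: %w", id, err)
	}
	return nil
}

func main() {
	fmt.Println(create("bf-1"))
	fmt.Println(create(""))
}
```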

TestCreateBackfill should be updated according to this new behavior.

Don't CreateBackfill prior to running AckBackfill it is covered in
CreateBackfill.
@aLekSer
Collaborator Author

aLekSer commented Dec 1, 2020

@Laremere all comments got resolved.

@Laremere Laremere merged commit 26d1aa2 into googleforgames:master Dec 2, 2020
@syntxerror syntxerror added this to the v1.2.0 milestone Mar 25, 2021
@syntxerror syntxerror added breaking api change Breaking API change that may require migration for existing customers to update to new version. area/feature labels Mar 29, 2021
@syntxerror syntxerror removed the breaking api change Breaking API change that may require migration for existing customers to update to new version. label Apr 15, 2021

5 participants