Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[manager/orchestrator/reaper] Fix the condition used for skipping over running tasks #2677

Merged
merged 1 commit into from Aug 1, 2018

Conversation

@anshulpundir
Copy link
Contributor

anshulpundir commented Jun 27, 2018

Addresses the following from #2672 (comment):

The previous logic for skipping over running tasks in tick() was:

if desired=running AND state <= running then don't delete else delete

For example, if a task is (desired=complete, state=running) then this code will delete it from SwarmKit, causing SwarmKit to believe that its resources are no longer in use, which is not correct.

This fixes the logic to ignore tasks which are running (including tasks which are desired to be shutdown), or which are desired to be running (desired state running).

@anshulpundir anshulpundir requested review from cyli, talex5, dperny and MagnusS Jun 27, 2018
@codecov

This comment has been minimized.

Copy link

codecov bot commented Jun 27, 2018

Codecov Report

Merging #2677 into master will increase coverage by 0.26%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #2677      +/-   ##
==========================================
+ Coverage   61.99%   62.25%   +0.26%     
==========================================
  Files         134      134              
  Lines       21771    21746      -25     
==========================================
+ Hits        13496    13539      +43     
+ Misses       6818     6750      -68     
  Partials     1457     1457
})
return nil
}))

This comment has been minimized.

Copy link
@dperny

dperny Jun 27, 2018

Member

why did all of this get removed?

This comment has been minimized.

Copy link
@talex5

talex5 Jun 28, 2018

Contributor

I don't know why it was removed, but I'd also like to point out that 1 isn't a negative number...

					// set TaskHistoryRetentionLimit to a negative value, so
					// that tasks are cleaned up right away.
					TaskHistoryRetentionLimit: 1,

This comment has been minimized.

Copy link
@anshulpundir

anshulpundir Jun 29, 2018

Author Contributor

cluster object is not needed for this test

This comment has been minimized.

Copy link
@cyli

cyli Jul 30, 2018

Contributor

Non-blocking: I think it will default to this anyway, but just for good test hygiene, would it make sense to set taskReaper.taskHistory = 0 immediately after the task reaper is created, just like with your new test?

This comment has been minimized.

Copy link
@anshulpundir

anshulpundir Jul 31, 2018

Author Contributor

taskReaper.taskHistory should be 0 by default ? The rest of the test requires it to be 1.

This comment has been minimized.

Copy link
@cyli

cyli Aug 1, 2018

Contributor

Yep, it is definitely the default. I was just suggesting that setting it would be more explicit/easier to read/consistent style-wise as it is in the other test. Not important though.

@dperny

This comment has been minimized.

Copy link
Member

dperny commented Jun 27, 2018

LGTM except comment on removed code.

@cyli
cyli approved these changes Jun 27, 2018
Copy link
Contributor

cyli left a comment

LGTM

// Don't delete running tasks
// Ignore tasks which are running (including tasks which are desired to be shutdown),
// or which are desired to be running (desired state running).
if t.Status.State == api.TaskStateRunning || t.DesiredState <= api.TaskStateRunning {

This comment has been minimized.

Copy link
@talex5

talex5 Jun 28, 2018

Contributor

What about e.g. (state=starting, desired=shutdown)? We shouldn't reap that.

As I understand it, the reaper is only allowed to remove a task when:

  • state >= completed (resources freed), OR
  • state < assigned AND desired >= completed (resources will never be assigned on a node)

This comment has been minimized.

Copy link
@anshulpundir

anshulpundir Jun 29, 2018

Author Contributor

fair point. I think a task can still start even though it hasn't yet and even though it has been marked for shutdown, so it shouldn't be reaped in that case.

This comment has been minimized.

Copy link
@thaJeztah

thaJeztah Jul 19, 2018

Member

For readability, perhaps these should be split over two if-statements; it allows describing/documenting per condition, which may help understanding the logic)

if t.Status.State == api.TaskStateRunning {
	runningTasks++
	continue
}
if t.DesiredState <= api.TaskStateRunning {
	runningTasks++
	continue
}

or a switch, whatever looks best 👍

switch {
case t.Status.State == api.TaskStateRunning:
	runningTasks++
	continue

case  t.DesiredState <= api.TaskStateRunning:
	runningTasks++
	continue
}
// 1. The task has reached a terminal state i.e. actual state beyond TaskStateRunning.
// 2. The task has not yet become running and desired state is a terminal state i.e.
// actual state not yet TaskStateAssigned and desired state beyond TaskStateRunning.
if t.Status.State > api.TaskStateRunning ||

This comment has been minimized.

Copy link
@thaJeztah

thaJeztah Jul 27, 2018

Member

This is becoming hard to grasp (also a reason we now need a long comment to explain)

Perhaps we should split these up (may give some duplicated code) or extract the check to a function; e.g okToCleanup(task) bool

Inside that function we can do an early return for each check to reduce complexity

This comment has been minimized.

Copy link
@anshulpundir

anshulpundir Jul 27, 2018

Author Contributor

ahh fair point. I'll try to make it easier to read.

BTW the big comment is needed.

This comment has been minimized.

Copy link
@thaJeztah

thaJeztah Jul 28, 2018

Member

Yes, the comment itself is useful, sorry, didn't meant to imply that, just that it's easy to make mistakes in these conditions; breaking them up makes the code easier to read, and less likely to make mistakes

@anshulpundir anshulpundir force-pushed the anshulpundir:reaper branch from fcb6519 to 00b11eb Jul 27, 2018
Copy link
Member

thaJeztah left a comment

LGTM, thanks for updating!

I can't merge (don't have permissions to merge in protected branches)

@cyli
cyli approved these changes Jul 30, 2018
Copy link
Contributor

cyli left a comment

Non-blocking comments, LGTM though.

})
return nil
}))

This comment has been minimized.

Copy link
@cyli

cyli Jul 30, 2018

Contributor

Non-blocking: I think it will default to this anyway, but just for good test hygiene, would it make sense to set taskReaper.taskHistory = 0 immediately after the task reaper is created, just like with your new test?

assert.Equal(t, "id1task3", deletedTask1.ID)
}

// desired = TaskStateRunning, actual = TaskStateNew

This comment has been minimized.

Copy link
@cyli

cyli Jul 30, 2018

Contributor

Non-blocking: These all seem to be mostly repeated blocks of code. Would it make sense for this to just a range over the variable objects (desired state, actual state, cleaned up)?

for _, testcase := range []struct{
    desired, actual api.TaskState
    cleanedUp bool
} {
    {desired: api.TaskStateRunning, actual: api.TaskStateNew, cleanedUp: False},
    ...
} {
    testfunc(testcase.desired, testcase.actual)
    assert.Zero(t, len(taskReaper.dirty))
    if testcase.cleanedUp {
        waitForTaskDelete(api.TaskStateRunning, api.TaskStateCompleted)
    }
    s.View(func(tx store.ReadTx) {
        task := store.GetTask(tx, "id1task3")
        if testcase.cleanedUp {
            assert.Nil(t, task)
        } else {
            assert.NotNil(t, task)
        }
    }
}
…r running tasks.

Signed-off-by: Anshul Pundir <anshul.pundir@docker.com>
@anshulpundir anshulpundir force-pushed the anshulpundir:reaper branch from 00b11eb to 8c5d353 Jul 31, 2018
@cyli cyli merged commit 9f35cb5 into docker:master Aug 1, 2018
3 checks passed
3 checks passed
ci/circleci Your tests passed on CircleCI!
Details
codecov/project 62.25% (target 0%)
Details
dco-signed All commits are signed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.