This repository was archived by the owner on Jan 30, 2020. It is now read-only.

fleet engine leadership loss triggers restart of all units on CentOS 7 #1181

@holmes86

Hi, I run a fleet 0.8 cluster on CentOS 7. Recently I found that fleet triggers a restart of all units on one machine of the cluster. There is no pattern to it: sometimes it happens once every two days, sometimes once every twenty minutes. The other machine is fine. The log is as follows (newest entries first):

Apr 09 21:22:46 machine1 fleetd[21543]: INFO engine.go:208: Waiting 26s for previous lease to expire before continuing reconciliation
Apr 09 21:22:46 machine1 fleetd[21543]: INFO engine.go:205: Stole engine leadership from Machine(9c33fade6ddf46448fcbacd8ed8495a5)
Apr 09 21:22:44 machine1 fleetd[21543]: ERROR engine.go:218: Engine leadership lost, renewal failed: 101: Compare failed ([13679370 != 1
Apr 09 21:22:20 machine1 fleetd[21543]: INFO engine.go:208: Waiting 24s for previous lease to expire before continuing reconciliation
Apr 09 21:22:20 machine1 fleetd[21543]: INFO engine.go:205: Stole engine leadership from Machine(9c33fade6ddf46448fcbacd8ed8495a5)
Apr 09 21:22:18 machine1 fleetd[21543]: ERROR engine.go:218: Engine leadership lost, renewal failed: 101: Compare failed ([13678688 != 1
Apr 09 21:21:54 machine1 fleetd[21543]: INFO engine.go:208: Waiting 24s for previous lease to expire before continuing reconciliation
Apr 09 21:21:54 machine1 fleetd[21543]: INFO engine.go:205: Stole engine leadership from Machine(9c33fade6ddf46448fcbacd8ed8495a5)
Apr 09 21:21:52 machine1 fleetd[21543]: ERROR engine.go:218: Engine leadership lost, renewal failed: 101: Compare failed ([13677937 != 1
Apr 09 21:21:27 machine1 fleetd[21543]: INFO engine.go:208: Waiting 25s for previous lease to expire before continuing reconciliation
Apr 09 21:21:27 machine1 fleetd[21543]: INFO engine.go:205: Stole engine leadership from Machine(9c33fade6ddf46448fcbacd8ed8495a5)
Apr 09 21:21:25 machine1 fleetd[21543]: ERROR engine.go:218: Engine leadership lost, renewal failed: 101: Compare failed ([13677236 != 1
Apr 09 21:20:59 machine1 fleetd[21543]: INFO engine.go:208: Waiting 26s for previous lease to expire before continuing reconciliation
Apr 09 21:20:59 machine1 fleetd[21543]: INFO engine.go:205: Stole engine leadership from Machine(9c33fade6ddf46448fcbacd8ed8495a5)
Apr 09 21:20:57 machine1 fleetd[21543]: ERROR engine.go:218: Engine leadership lost, renewal failed: 101: Compare failed ([13676575 != 1
Apr 09 21:20:31 machine1 fleetd[21543]: INFO engine.go:208: Waiting 25s for previous lease to expire before continuing reconciliation
Apr 09 21:20:31 machine1 fleetd[21543]: INFO engine.go:205: Stole engine leadership from Machine(9c33fade6ddf46448fcbacd8ed8495a5)
Apr 09 21:20:30 machine1 fleetd[21543]: ERROR engine.go:218: Engine leadership lost, renewal failed: 101: Compare failed ([13675908 != 1
Apr 09 21:20:29 machine1 fleetd[21543]: WARN engine.go:116:
Apr 09 21:20:29 machine1 fleetd[21543]: INFO reconciler.go:163: EngineReconciler completed task: {Type: AttemptScheduleUnit, JobName: 
Apr 09 21:20:29 machine1 fleetd[21543]: ERROR engine.go:268: Failed scheduling Unit(login.service) to Machine(21abf58f4
Apr 09 21:20:29 machine1 fleetd[21543]: INFO reconciler.go:163: EngineReconciler completed task: {Type: AttemptScheduleUnit, JobName: 
Apr 09 21:20:29 machine1 fleetd[21543]: ERROR engine.go:268: Failed scheduling Unit(user.service) to Machine(21abf58f49294
Apr 09 21:20:29 machine1 fleetd[21543]: INFO reconciler.go:163: EngineReconciler completed task: {Type: AttemptScheduleUnit, JobName: 
Apr 09 21:20:29 machine1 fleetd[21543]: ERROR engine.go:268: Failed scheduling Unit(tradecenter.service) to Machine(21abf58f49294
Apr 09 21:20:29 machine1 fleetd[21543]: INFO reconciler.go:163: EngineReconciler completed task: {Type: AttemptScheduleUnit, JobName: 
Apr 09 21:20:29 machine1 fleetd[21543]: ERROR engine.go:268: Failed scheduling Unit(user.service) to Machine(21abf58f49294
Apr 09 21:20:28 machine1 fleetd[21543]: INFO reconciler.go:163: EngineReconciler completed task: {Type: AttemptScheduleUnit, JobName: 
Apr 09 21:20:28 machine1 fleetd[21543]: ERROR engine.go:268: Failed scheduling Unit(mysql.service) to Machine(21abf58f49
Apr 09 21:20:28 machine1 fleetd[21543]: INFO reconcile.go:321: AgentReconciler completed task: type=StartUnit job=vendor.s
Apr 09 21:20:28 machine1 fleetd[21543]: INFO reconcile.go:321: AgentReconciler completed task: type=StartUnit job=home.ser
.......
.......
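
To get a rough idea of how often leadership is lost, the journal output above can be fed to a small script. This is only a sketch for illustration: the function name `leadership_loss_intervals` and the `year` parameter are hypothetical and not part of fleet, and the regular expression assumes the journal line layout shown in the log above.

```python
import re
from datetime import datetime

# Matches journal lines of the form seen above, e.g.
# "Apr 09 21:22:44 machine1 fleetd[21543]: ERROR engine.go:218: Engine leadership lost, ..."
LOST_RE = re.compile(
    r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) \S+ fleetd\[\d+\]: ERROR \S+ Engine leadership lost"
)

def leadership_loss_intervals(lines, year=2000):
    """Return the seconds between consecutive 'Engine leadership lost' events.

    The journal timestamps carry no year, so `year` is an arbitrary
    placeholder (an assumption); it only matters across year boundaries.
    Month names are parsed with %b, which assumes an English/C locale.
    """
    times = []
    for line in lines:
        m = LOST_RE.match(line)
        if m:
            times.append(datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S"))
    times.sort()  # the log is newest-first, so put events in chronological order
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]

# The six ERROR lines from the log above, truncated exactly as pasted.
sample = [
    "Apr 09 21:22:44 machine1 fleetd[21543]: ERROR engine.go:218: Engine leadership lost, renewal failed: 101: Compare failed ([13679370 != 1",
    "Apr 09 21:22:18 machine1 fleetd[21543]: ERROR engine.go:218: Engine leadership lost, renewal failed: 101: Compare failed ([13678688 != 1",
    "Apr 09 21:21:52 machine1 fleetd[21543]: ERROR engine.go:218: Engine leadership lost, renewal failed: 101: Compare failed ([13677937 != 1",
    "Apr 09 21:21:25 machine1 fleetd[21543]: ERROR engine.go:218: Engine leadership lost, renewal failed: 101: Compare failed ([13677236 != 1",
    "Apr 09 21:20:57 machine1 fleetd[21543]: ERROR engine.go:218: Engine leadership lost, renewal failed: 101: Compare failed ([13676575 != 1",
    "Apr 09 21:20:30 machine1 fleetd[21543]: ERROR engine.go:218: Engine leadership lost, renewal failed: 101: Compare failed ([13675908 != 1",
]

print(leadership_loss_intervals(sample))  # [27, 28, 27, 26, 26]
```

For the excerpt above, leadership is lost again roughly every 26 to 28 seconds once the problem starts.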

Thanks
