This repository was archived by the owner on Jan 30, 2020. It is now read-only.

UnitStateGenerator stopped publishing state #833

@gabrtv

This unit and one other were both automatically rescheduled after a host failure. This one now appears to be indestructible:

core@ip-10-21-2-210 ~ $ fleetctl list-units |grep router
deis-router@1.service           6d323006.../10.21.2.210 activating  start-pre
core@ip-10-21-2-210 ~ $ fleetctl status deis-router@1
Unit deis-router@1.service does not exist.
core@ip-10-21-2-210 ~ $ fleetctl destroy deis-router@1.service
core@ip-10-21-2-210 ~ $ fleetctl list-units |grep router
deis-router@1.service           6d323006.../10.21.2.210 activating  start-pre
core@ip-10-21-2-210 ~ $ etcdctl ls --recursive /_coreos.com/fleet | grep router
/_coreos.com/fleet/state/deis-router@1.service
/_coreos.com/fleet/states/deis-router@1.service
/_coreos.com/fleet/states/deis-router@1.service/6d32300640cd41b4a25a55b6a59fd4f4
core@ip-10-21-2-210 ~ $ journalctl -n 200 --no-pager -u fleet | grep router
Aug 28 23:59:27 ip-10-21-2-210.us-west-2.compute.internal fleet[600]: I0828 23:59:27.310391 00600 manager.go:221] Writing systemd unit deis-router@1.service (629b)
Aug 28 23:59:27 ip-10-21-2-210.us-west-2.compute.internal fleet[600]: I0828 23:59:27.366615 00600 reconcile.go:310] AgentReconciler completed task: type=LoadJob job=deis-router@1.service reason="job scheduled here but not loaded"
Aug 29 00:00:23 ip-10-21-2-210.us-west-2.compute.internal fleet[600]: I0829 00:00:23.011464 00600 manager.go:114] Started systemd unit deis-router@1.service(done)
Aug 29 00:00:23 ip-10-21-2-210.us-west-2.compute.internal fleet[600]: I0829 00:00:23.011614 00600 reconcile.go:310] AgentReconciler completed task: type=StartJob job=deis-router@1.service reason="job currently loaded but desired state is launched"
Aug 29 00:07:25 ip-10-21-2-210.us-west-2.compute.internal fleet[600]: I0829 00:07:25.291084 00600 reconcile.go:303] AgentReconciler task chain failed: chain={%!s(*job.Job=&{deis-router@1.service 0xc208302380   {map[] []}}) [{UnloadJob job loaded but not scheduled here}]} err=task already in flight
Aug 29 00:07:26 ip-10-21-2-210.us-west-2.compute.internal fleet[600]: I0829 00:07:26.684357 00600 manager.go:122] Stopped systemd unit deis-router@1.service(done)
Aug 29 00:07:26 ip-10-21-2-210.us-west-2.compute.internal fleet[600]: I0829 00:07:26.684382 00600 manager.go:234] Removing systemd unit deis-router@1.service
Aug 29 00:07:26 ip-10-21-2-210.us-west-2.compute.internal fleet[600]: I0829 00:07:26.760170 00600 reconcile.go:310] AgentReconciler completed task: type=UnloadJob job=deis-router@1.service reason="job loaded but not scheduled here"
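
One way to check whether other units are in the same condition is to scan the recursive etcd listing for units that still have `state`/`states` keys but no job entry. The following is only a rough sketch: the helper name `orphaned_state_units` is hypothetical, and it assumes fleet keeps job definitions under `/_coreos.com/fleet/job/<unit>`, a prefix that does not appear in the output above and has not been verified for this fleet version.

```shell
#!/bin/sh
# Hypothetical helper: reads `etcdctl ls --recursive /_coreos.com/fleet`
# output on stdin and prints units that have state keys but no job key.
# ASSUMPTION: job definitions live under /_coreos.com/fleet/job/<unit>.
orphaned_state_units() {
  awk -F/ '
    # Split on "/": $4 is the key category, $5 is the unit name.
    $4 == "job"                     { job[$5] = 1 }
    $4 == "state" || $4 == "states" { if ($5 != "") state[$5] = 1 }
    END {
      for (u in state)
        if (!(u in job)) print u
    }
  ' | sort
}
```

It could be run as `etcdctl ls --recursive /_coreos.com/fleet | orphaned_state_units`. A unit name in the output only means its state keys have no matching job key in the listing; it does not by itself explain why the state was left behind.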

Host info:

core@ip-10-21-2-210 ~ $ cat /etc/lsb-release
DISTRIB_ID=CoreOS
DISTRIB_RELEASE=423.0.0
DISTRIB_CODENAME="Red Dog"
DISTRIB_DESCRIPTION="CoreOS 423.0.0"
core@ip-10-21-2-210 ~ $ fleet --version
fleet version 0.7.1
core@ip-10-21-2-210 ~ $ etcd --version
etcd version 0.4.6
