tectonic-identity frequently restarts after 1.9.6-tectonic.2 release #3339

pierrebeaucamp · 2018-12-12T15:22:47Z

What keywords did you search in tectonic-installer issues before filing this one?

tectonic-identity. I also looked through recently opened and closed bugs.

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

After our cluster updated itself to 1.9.6-tectonic.2, we're getting a lot of alerts about tectonic-identity pods frequently restarting.

containerStatuses:
    - name: tectonic-identity
      state:
        running:
          startedAt: '2018-12-12T15:11:10Z'
      lastState:
        terminated:
          exitCode: 137
          reason: OOMKilled
          startedAt: '2018-12-12T15:05:39Z'
          finishedAt: '2018-12-12T15:10:47Z'
          containerID: >-
            docker://861e6545dc9ac82aca3d101cb4b0e4b9129e005b345612649b7a8cd001ec0b5d
      ready: true
      restartCount: 1018
      image: 'quay.io/coreos/dex:v2.8.1'
      imageID: >-
        docker-pullable://quay.io/coreos/dex@sha256:19510b560e851bce6a27023fcbab9b6b8a3928d493de11c026e06df854cb37e1
      containerID: >-
        docker://45ed2b5b0958ba6235e7d5567e5fc327a3530a87533d19308f9e481fa2185458

(This pattern has repeated continuously since the upgrade to the 1.9.6-tectonic.2 release.)
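For anyone triaging the same symptom, here is a minimal sketch of how the status above can be read programmatically. It assumes the container status has already been loaded into a plain dict (for example from the pod's JSON output); the helper names `parse_ts` and `summarize_last_termination` are made up for illustration and are not part of any Tectonic or Kubernetes tooling. The only outside fact it relies on is the usual convention that an exit code above 128 means the process died from signal `exit code - 128`, so 137 corresponds to signal 9 (SIGKILL).

```python
from datetime import datetime, timezone


def parse_ts(ts: str) -> datetime:
    """Parse a timestamp in the form used above, e.g. '2018-12-12T15:05:39Z'."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def summarize_last_termination(status: dict) -> dict:
    """Summarize the previous termination of one container status entry.

    Hypothetical helper: returns the restart count, whether the last
    termination was an OOM kill, the signal implied by the exit code,
    and how long the container lived before it was killed.
    """
    summary = {
        "name": status["name"],
        "restart_count": status.get("restartCount", 0),
        "oom_killed": False,
        "signal": None,
        "lifetime_seconds": None,
    }
    terminated = status.get("lastState", {}).get("terminated")
    if terminated is None:
        return summary

    summary["oom_killed"] = terminated.get("reason") == "OOMKilled"

    # Exit codes above 128 conventionally encode 128 + signal number.
    exit_code = terminated.get("exitCode", 0)
    if exit_code > 128:
        summary["signal"] = exit_code - 128

    lifetime = parse_ts(terminated["finishedAt"]) - parse_ts(terminated["startedAt"])
    summary["lifetime_seconds"] = int(lifetime.total_seconds())
    return summary


# Values copied from the containerStatuses excerpt in this report.
status = {
    "name": "tectonic-identity",
    "restartCount": 1018,
    "lastState": {
        "terminated": {
            "exitCode": 137,
            "reason": "OOMKilled",
            "startedAt": "2018-12-12T15:05:39Z",
            "finishedAt": "2018-12-12T15:10:47Z",
        }
    },
}

summary = summarize_last_termination(status)
print(summary)
```

With the values from this report, the previous container instance lived for 308 seconds (a little over five minutes) before being OOM-killed, which is consistent with the restart count of 1018.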

Versions

Tectonic version (release or commit hash): 1.9.6-tectonic.2
Terraform version (terraform version): Terraform v0.11.8
Platform (aws|azure|openstack|metal): AWS

What happened?

The cluster, having auto-update enabled, updated itself to 1.9.6-tectonic.2. Afterwards the tectonic-identity pods started failing. When they're up and running, everything works fine and we can access and interact with the tectonic console. When they're down, we're getting 503 errors when accessing the console.

What you expected to happen?

The tectonic-identity pods should be stable

How to reproduce it (as minimally and precisely as possible)?

Update to 1.9.6-tectonic.2

Anything else we need to know?

All other components of our cluster operate normally


pierrebeaucamp · 2018-12-16T03:56:05Z

This issue seems to be more severe than I initially thought. Twice in the last 36 hours, our entire cluster has required recovery (by re-bootstrapping the masters). kubectl suddenly stopped working (error: You must be logged in to the server, similar to coreos/tectonic-forum#161), and ingress stopped working as well. I will be forced to migrate to another Kubernetes solution if this is not addressed soon.

Andrei-Paul · 2018-12-16T18:41:42Z

Got a similar issue on a metal setup that was updated to 1.9.6-tectonic.2.
I type in user and pass, get a glimpse of the dashboard and am immediately redirected to the login prompt again.

ghost · 2019-02-04T18:56:23Z

I'm also facing an issue with a fresh install of 1.9.6-tectonic.2: the masters won't come online behind the ELB. Something is definitely wrong.
