teleport-cluster-auth-test job failing when upgrading from 15.2.5 to 15.3.0 #41087

Closed
PierreBesson opened this issue May 1, 2024 · 5 comments
@PierreBesson

Expected behavior:

The upgrade should complete successfully.

Current behavior:

The test job fails with this error message:

FAIL /etc/teleport/teleport.yaml
ERROR: failed parsing the config file: yaml: unmarshal errors:  line 1: field Error not found in type config.FileConfig

The faulty configmap teleport-cluster-auth-test is rendered as:

Name:         teleport-cluster-auth-test
Namespace:    teleport
Labels:       app.kubernetes.io/component=auth
              app.kubernetes.io/instance=teleport-cluster
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=teleport-cluster
              app.kubernetes.io/version=15.3.0
              argocd.argoproj.io/instance=teleport-cluster-public
              helm.sh/chart=teleport-cluster-15.3.0
              teleport.dev/majorVersion=15
Annotations:  helm.sh/hook: pre-install,pre-upgrade
              helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded
              helm.sh/hook-weight: 4

Data
====
apply-on-startup.yaml:
----
kind: token
version: v2
metadata:
  name: teleport-cluster-proxy
  expires: "3000-01-01T00:00:00Z"
spec:
  roles: [Proxy]
  join_method: kubernetes
  kubernetes:
    allow:
      - service_account: "teleport:teleport-cluster-proxy"

teleport.yaml:
----
Error: 'error converting YAML to JSON: yaml: line 20: did not find expected key'

[omitted our config from `auth.teleportConfig`]

BinaryData
====

From our debugging, this appears to be caused by a faulty configmap merge on this line: https://github.com/gravitational/teleport/blob/branch/v15/examples/chart/teleport-cluster/templates/auth/predeploy_config.yaml#L34

The issue was probably introduced by commit f431c3b.

@hugoShaka
Contributor

[omitted our config from auth.teleportConfig]

To validate and reproduce the issue, we need your values.yaml with any secrets redacted, and the generated auth configmap that contains the invalid YAML.

@PierreBesson
Author

PierreBesson commented May 2, 2024

@hugoShaka thank you for the response. The teleport-cluster-auth-test configmap was posted in the original issue.

Here is our values.yaml for our teleport-cluster helm release:

teleport-cluster:
  enterprise: true
  proxyProtocol: "off"
  auth:
    teleportConfig:
      auth_service:
        client_idle_timeout: 4h
        web_idle_timeout: 4h
  authentication:
    type: oidc
    secondFactor: "on"
  proxyListenerMode: multiplex
  exposeDiagPort: false
  sessionRecording: "'off'"
  operator:
    enabled: false
  chartMode: gcp
  gcp:
    backendTable: teleport-backend
    auditLogTable: teleport-auditlog
    auditLogMirrorOnStdout: true
    sessionRecordingBucket: redacted-session-recordings
    credentialSecretName: ""
  podMonitor:
    enabled: true
  highAvailability:
    replicaCount: 2
    requireAntiAffinity: true
    podDisruptionBudget:
      enabled: true
      minAvailable: 1
    certManager:
      enabled: true
      addCommonName: true
      issuerName: letsencrypt-dns01
      issuerKind: ClusterIssuer
      issuerGroup: cert-manager.io
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      memory: 2Gi
  log:
    level: INFO
    output: stderr
    format: json
    extraFields: ["timestamp", "level", "component", "caller"]
  priorityClassName: high-priority
  exposeDiagPort: true
  authentication:
    localAuth: true
  clusterName: teleport.example.com
  gcp:
    projectId: example
    sessionRecordingBucket: example-session-recordings
  global:
    license: redacted_teleport_license

@hugoShaka
Contributor

sessionRecording: "'off'"

This is the broken bit: you're relying on the previous bug and working around it by wrapping a string inside another string. Since f431c3b, and as indicated in the release notes, the typing bug is fixed and you no longer need a workaround in the chart values to keep the field a string.

The correct value to disable session recording is:

sessionRecording: 'off'
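
As a rough illustration of why the quoting matters, the sketch below shows how a YAML 1.1 style parser would typically read each spelling. It assumes Helm's values loading behaves this way, and every key except `sessionRecording` is hypothetical, used only to put the variants side by side.

```yaml
# Hypothetical keys for comparison only; they are not chart values.
unquoted: off          # YAML 1.1 style parsers read this as the boolean false
singleQuoted: 'off'    # the three-character string: off
doubleQuoted: "off"    # the same three-character string: off
nested: "'off'"        # a five-character string that includes the quotes: 'off'

# The old workaround passed the "nested" form; with the typing bug fixed,
# the plain quoted string is enough:
sessionRecording: 'off'
```

With the nested form, the inner single quotes are part of the value itself, so they end up in the rendered teleport.yaml, which is consistent with the parse error shown in the configmap above.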

@PierreBesson
Author

Thank you @hugoShaka, I confirm this fixes our issue with the upgrade.

Your help is very much appreciated 👍!

@kriegster108

Just wanted to say thank you for this feedback. It resolved our upgrade from 15.0.0 -> 16.0.1.
