Scheduler not starting when using proxy environment variables #33630

@ViniciusBastosTR

Official Helm Chart version

1.10.0 (latest released)

Apache Airflow version

2.6.2

Kubernetes Version

v1.24.10

Helm Chart configuration

config:
  webserver:
    base_url: "https://{{ url_base }}/airflow"
extraEnv: |
  - name: "AIRFLOW__WEBSERVER__BASE_URL"
    value: "https://{{ url_base }}/airflow"
airflow:
  config:
    AIRFLOW__WEBSERVER__BASE_URL: "https://{{ url_base }}/airflow"
    AIRFLOW__CELERY__FLOWER_URL_PREFIX: "/airflow/flower"
flower:
  enabled: true
ingress:
  web:
    enabled: true
    ingressClassName: "nginx"
    path: /airflow
    pathType: "ImplementationSpecific"
    hosts:
      - name: "{{ url_base }}"
  flower:
    enabled: true
    ingressClassName: "nginx"
    path: /airflow/flower
    pathType: "ImplementationSpecific"
    hosts:
      - name: "{{ url_base }}"
dags:
  persistence:
    annotations: {}
    enabled: true
    size: 1Gi
    storageClassName: longhorn
    accessMode: ReadWriteMany
    existingClaim:
    subPath: ~
  gitSync:
    enabled: true
    repo: https://github.com/apache/airflow.git
    branch: v2-2-stable
    rev: HEAD
    depth: 1
    maxFailures: 0
    subPath: "tests/dags"
    # credentialsSecret: git-credentials
    wait: 5
    containerName: git-sync
    uid: 65533
    securityContext: {}
    securityContexts:
      container: {}
    extraVolumeMounts: []
    env:
      - name: http_proxy
        value: "{{ http_proxy }}"
      - name: https_proxy
        value: "{{ http_proxy }}"
      - name: HTTP_PROXY
        value: "{{ http_proxy }}"
      - name: HTTPS_PROXY
        value: "{{ http_proxy }}"
    resources:
      limits:
        cpu: 100m
        memory: 128Mi
      requests:
        cpu: 100m
        memory: 128Mi
workers:
  replicas: 1
  revisionHistoryLimit: ~
  command: ~
  args:
    - "bash"
    - "-c"
    - |-
      exec \
      airflow {{ '{{' }} semverCompare ">=2.0.0" .Values.airflowVersion | ternary "celery worker" "worker" {{ '}}' }}
  livenessProbe:
    enabled: true
    initialDelaySeconds: 10
    timeoutSeconds: 20
    failureThreshold: 5
    periodSeconds: 60
    command: ~
  updateStrategy: ~
  strategy:
    rollingUpdate:
      maxSurge: "100%"
      maxUnavailable: "50%"
  securityContext: {}
  securityContexts:
    pod: {}
    container: {}
  serviceAccount:
    create: true
    name: ~
    annotations: {}
  keda:
    enabled: false
    namespaceLabels: {}
    pollingInterval: 5
    cooldownPeriod: 30
    minReplicaCount: 0
    maxReplicaCount: 10
    advanced: {}

  persistence:
    enabled: true
    size: 15Gi
    storageClassName: longhorn
    fixPermissions: false
    annotations: {}
    securityContexts:
      container: {}

  kerberosSidecar:
    enabled: false
    resources: {}
    securityContexts:
      container: {}

  resources: {}
  terminationGracePeriodSeconds: 600
  safeToEvict: true
  extraContainers: []
  extraInitContainers: []
  extraVolumes: []
  extraVolumeMounts: []
  nodeSelector: {}
  priorityClassName: ~
  affinity: {}
  tolerations: []
  topologySpreadConstraints: []
  hostAliases: []
  annotations: {}
  podAnnotations: {}
  labels: {}

  logGroomerSidecar:
    enabled: true
    command: ~
    args: ["bash", "/clean-logs"]
    retentionDays: 15
    resources: {}
    securityContexts:
      container: {}

  waitForMigrations:
    enabled: true
    env: []
    securityContexts:
      container: {}

  env: []
triggerer:
  enabled: true
  replicas: 1
  revisionHistoryLimit: ~
  command: ~
  args: ["bash", "-c", "exec airflow triggerer"]
  updateStrategy: ~
  strategy:
    rollingUpdate:
      maxSurge: "100%"
      maxUnavailable: "50%"
  livenessProbe:
    initialDelaySeconds: 10
    timeoutSeconds: 20
    failureThreshold: 5
    periodSeconds: 60
    command: ~
  serviceAccount:
    create: true
    name: ~
    annotations: {}
  securityContext: {}
  securityContexts:
    pod: {}
    container: {}
  persistence:
    enabled: true
    size: 15Gi
    storageClassName: longhorn
    fixPermissions: false
    annotations: {}
logs:
  persistence:
    enabled: true
    size: 15Gi
    annotations: {}
    storageClassName: longhorn
    existingClaim:

Docker Image customizations

No response

What happened

The Airflow scheduler does not start when proxy environment variables are set for git-sync. The git-sync-init init container fails with a timeout on git fetch:

kubectl logs -f  -n airflow airflow-scheduler-bdc4769d5-58gpn -c git-sync-init
INFO: detected pid 1, running init handler
I0822 21:07:24.311563      13 main.go:389] "level"=0 "msg"="starting up" "pid"=13 "args"=["/git-sync"]
I0822 21:07:24.409175      13 main.go:934] "level"=0 "msg"="cloning repo" "origin"="https://github.com/apache/airflow.git" "path"="/git"
I0822 21:07:24.411938      13 main.go:940] "level"=0 "msg"="git root exists and is not empty (previous crash?), cleaning up" "path"="/git"
I0822 21:07:36.781274      13 main.go:748] "level"=0 "msg"="syncing git" "rev"="HEAD" "hash"="620dc90f61a676ceed74553a259b850ef6fe077b"
E0822 21:09:29.207631      13 main.go:535] "msg"="too many failures, aborting" "error"="Run(git fetch -f --tags --depth 1 https://github.com/apache/airflow.git v2-2-stable): context deadline exceeded: { stdout: "", stderr: "" }" "failCount"=1
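
To make the failure easier to spot in longer logs, the output above can be filtered down to its error-severity lines. This is only a sketch: `git_sync_errors` is a hypothetical helper name, and it assumes the log format shown above, where error lines begin with `E` followed by a four-digit date.

```shell
# git_sync_errors: hypothetical helper, not part of git-sync or kubectl.
# Reads git-sync log output on stdin and prints only the lines whose prefix
# marks them as errors ("E" followed by MMDD, as in the log above).
git_sync_errors() {
  # "|| true" keeps the exit status at 0 when no error lines are present.
  grep -E '^E[0-9]{4} ' || true
}

# Usage against the pod from this report (requires cluster access):
#   kubectl logs -n airflow airflow-scheduler-bdc4769d5-58gpn -c git-sync-init | git_sync_errors
```

For the log above, this leaves only the `too many failures, aborting` line, which carries the failing `git fetch` command and the `context deadline exceeded` error.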

What you think should happen instead

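The git-sync-init init container should be able to fetch the repository through the configured proxy, and the scheduler should start. I think it's a problem with proxy resolution in the git-sync-init init container.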

How to reproduce

Set up HTTP proxy environment variables in gitSync:

  gitSync:
    enabled: true
    repo: https://github.com/apache/airflow.git
    branch: v2-2-stable
    rev: HEAD
    depth: 1
    maxFailures: 0
    subPath: "tests/dags"
    # credentialsSecret: git-credentials
    wait: 5
    containerName: git-sync
    uid: 65533
    securityContext: {}
    securityContexts:
      container: {}
    extraVolumeMounts: []
    env:
      - name: http_proxy
        value: "{{ http_proxy }}"
      - name: https_proxy
        value: "{{ http_proxy }}"
      - name: HTTP_PROXY
        value: "{{ http_proxy }}"
      - name: HTTPS_PROXY
        value: "{{ http_proxy }}"

Deploy it:

helm upgrade --install airflow apache-airflow/airflow -f ./apache-airflow-values.yaml --namespace airflow --create-namespace
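
When reproducing, it may help to first confirm that the four proxy variables from the `gitSync.env` block actually appear in the rendered manifest. The sketch below assumes the manifest is piped in on stdin; `missing_proxy_vars` is a hypothetical helper name, and it checks the whole manifest rather than a specific container.

```shell
# missing_proxy_vars: hypothetical helper, not part of Helm or the chart.
# Reads a rendered manifest on stdin and prints the proxy variable names
# (from the gitSync.env block above) that do NOT appear as env entry names.
# Prints an empty line when all four are present.
missing_proxy_vars() (
  manifest="$(cat)"
  missing=""
  for var in http_proxy https_proxy HTTP_PROXY HTTPS_PROXY; do
    # Case-sensitive match on an env entry such as `- name: http_proxy`,
    # with or without double quotes around the name.
    if ! printf '%s\n' "$manifest" | grep -E "name: \"?${var}\"?[[:space:]]*\$" >/dev/null; then
      missing="${missing}${var} "
    fi
  done
  printf '%s\n' "${missing% }"
)

# Usage with the values file from this report (requires helm and the chart repo):
#   helm template airflow apache-airflow/airflow -f ./apache-airflow-values.yaml --namespace airflow | missing_proxy_vars
```

An empty result only shows that the variables were rendered somewhere in the manifest; it does not show that the proxy itself is reachable from the pod.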

Anything else

No response

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!

Labels

  • area:helm-chart (Airflow Helm Chart)
  • kind:bug (This is a clearly a bug)
  • needs-triage (label for new issues that we didn't triage yet)