Skip to content

New job fail immediately after start, no logs #17558

@rchangj

Description

@rchangj

Apache Airflow version:1.10.12

Apache Airflow Provider versions (please include all providers that are relevant to your bug):
No other providers

Kubernetes version (if you are using kubernetes) (use kubectl version):
No using k8s

Environment:
AWS Linux

  • Cloud provider or hardware configuration:AWS
  • OS (e.g. from /etc/os-release):
    -NAME="Amazon Linux"
    VERSION="2"
    ID="amzn"
    ID_LIKE="centos rhel fedora"
  • Kernel (e.g. uname -a): 4.14.186-146.268.amzn2.x86_64 Improving the search functionality in the graph view #1 SMP Tue Jul 14 18:16:52 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: yum, pip3
  • Others:

What happened:
I have been running airflow 1.10.12 for several months without any issues. Recently I updated airflow to use SSL cert. After that I reboot the airflow server and then restarted the airflow scheduler and Webserver as following:
airflow Webserver &
airflow scheduler &

The jobs run normally at beginning. However, within one day, all the new jobs will immediately fail. From UI, I saw the task status as failed. However there is no log under the task when I checked the log from UI.

I also checked the /logs/scheduler directory, except a warning of "psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "variable_key_key", which has existed for long time, I didn't see any other warning or error.

Now this problem happened every day, i.e., if I restart airflow Webserver and scheduler, it works well. But within one day, all the new jobs will immediately fail with no log.

Do you have any insight what can go wrong? Thanks.

What you expected to happen:

How to reproduce it:
Not exactly sure how to reproduce. I just reboot airflow server.

Anything else we need to know:

How often does this problem occur? Once? Every time etc?
I noticed the problem happens within 1 day.

Any relevant logs to include? Put them here in side a detail tag:

Metadata

Metadata

Assignees

No one assigned

    Labels

    kind:bugThis is a clearly a bugpending-responsestaleStale PRs per the .github/workflows/stale.yml policy file

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions