
"Failed to fetch log file from worker" when running CeleryExecutor in docker worker #17350

@paantya

Description


**Apache Airflow version**:
AIRFLOW_IMAGE_NAME: apache/airflow:2.1.2-python3.7

**Environment**:

**What happened**:

I run docker-compose without a worker on one server, and a worker via docker run on another server.
When a task has finished, I cannot fetch its logs from the other server.

**What you expected to happen**:

I expect the problem is something in the docker worker run settings on the second server.

**How to reproduce it**:

Run on server1:

mkdir tmp-airflow
cd tmp-airflow
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.1.2/docker-compose.yaml'
mkdir ./dags ./logs ./plugins
echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env

Change the image name to apache/airflow:2.1.2-python3.7 and add open ports for postgres to the docker-compose.yaml file:

  postgres:
    image: postgres:13
    ports:
      - 5432:5432

and change the published port for redis to 6380:

  redis:
    image: redis:latest
    ports:
      - 6380:6379

Add port 8793 to the webserver for logs (not sure whether this is needed):

  airflow-webserver:
    <<: *airflow-common
    command: webserver
    ports:
      - 8080:8080
      - 8793:8793

Comment out or delete the worker service, like this:

#  airflow-worker:
#    <<: *airflow-common
#    command: celery worker
#    healthcheck:
#      test:
#        - "CMD-SHELL"
#        - 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"'
#      interval: 10s
#      timeout: 10s
#      retries: 5
#    restart: always
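
Before bringing the stack up, the edited file can be sanity-checked. A small sketch, assuming the docker-compose v1 CLI used above:

# Validate the edited docker-compose.yaml (prints nothing if it is well-formed)
docker-compose config --quiet && echo "compose file OK"

# Confirm the changed port mappings survived the edit and the worker service is gone
docker-compose config | grep -E '5432|6379|8793'
docker-compose config --services | grep -c airflow-worker   # expect 0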

Next, run in the terminal:

docker-compose up airflow-init
docker-compose up
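
It may be worth confirming the stack on server1 is healthy before moving on to server2. A quick check, assuming the stock webserver /health endpoint:

docker-compose ps                        # all services should be Up (healthy)
curl -s http://localhost:8080/health     # reports metadatabase and scheduler status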

Then run the worker on server2:

mkdir tmp-airflow
cd tmp-airflow
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.1.2/docker-compose.yaml'
mkdir ./dags ./logs ./plugins
echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
cat ./.env

docker run --rm -it \
  -e AIRFLOW__CORE__EXECUTOR="CeleryExecutor" \
  -e AIRFLOW__CORE__SQL_ALCHEMY_CONN="postgresql+psycopg2://airflow:airflow@10.0.0.197:5432/airflow" \
  -e AIRFLOW__CELERY__RESULT_BACKEND="db+postgresql://airflow:airflow@10.0.0.197:5432/airflow" \
  -e AIRFLOW__CELERY__BROKER_URL="redis://:@10.0.0.197:6380/0" \
  -e AIRFLOW__CORE__FERNET_KEY="" \
  -e AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION="true" \
  -e AIRFLOW__CORE__LOAD_EXAMPLES="true" \
  -e AIRFLOW__API__AUTH_BACKEND="airflow.api.auth.backend.basic_auth" \
  -e _PIP_ADDITIONAL_REQUIREMENTS="" \
  -v /home/apatshin/tmp/airflow-worker/dags:/opt/airflow/dags \
  -v /home/apatshin/tmp/airflow-worker/logs:/opt/airflow/logs \
  -v /home/apatshin/tmp/airflow-worker/plugins:/opt/airflow/plugins \
  -p "6380:6379" -p "5432:5432" -p "8793:8793" \
  -e DB_HOST="10.0.0.197" \
  --user 1012:0 --hostname="host197" \
  "apache/airflow:2.1.2-python3.7" celery worker
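
If the worker cannot connect, a quick connectivity check from server2 toward the ports published on server1 helps rule out networking problems. A sketch, assuming nc (netcat) is installed on server2:

nc -zv 10.0.0.197 5432   # postgres (metadata DB and result backend)
nc -zv 10.0.0.197 6380   # redis (celery broker, remapped to 6380 above)
nc -zv 10.0.0.197 8080   # airflow webserver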

10.0.0.197 -- the IP of server1
host197 -- the hostname of server2 on the local network
--user 1012:0 -- 1012 is the AIRFLOW_UID value from the ./.env file
Check your own .env file and substitute your value:
cat ./.env
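
For reference, the value passed to --user can be read straight from that .env file instead of being copied by hand. A small sketch, assuming the .env generated earlier:

AIRFLOW_UID=$(grep '^AIRFLOW_UID=' .env | cut -d= -f2)
echo "${AIRFLOW_UID}"                  # 1012 in this example
# then pass it to docker run as: --user "${AIRFLOW_UID}:0"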

Next, run a DAG from the webserver on server1:

Go in a browser to 10.0.0.197:8080 (or 0.0.0.0:8080)
and trigger the DAG "tutorial".
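
Since AIRFLOW__API__AUTH_BACKEND is set to basic_auth, the DAG can also be unpaused and triggered from the command line. A sketch, assuming the default airflow/airflow account created by airflow-init in the official compose file:

# Unpause the example DAG (DAGS_ARE_PAUSED_AT_CREATION is "true")
curl -X PATCH "http://10.0.0.197:8080/api/v1/dags/tutorial?update_mask=is_paused" \
  -H "Content-Type: application/json" -u airflow:airflow -d '{"is_paused": false}'

# Trigger a new DAG run
curl -X POST "http://10.0.0.197:8080/api/v1/dags/tutorial/dagRuns" \
  -H "Content-Type: application/json" -u airflow:airflow -d '{}'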

Then open the log of the first task and see the following errors:

*** Log file does not exist: /opt/airflow/logs/tutorial/print_date/2021-07-30T18:48:54.052656+00:00/1.log
*** Fetching from: http://host197:8793/log/tutorial/print_date/2021-07-30T18:48:54.052656+00:00/1.log
*** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: http://host197:8793/log/tutorial/print_date/2021-07-30T18:48:54.052656+00:00/1.log
For more information check: https://httpstatuses.com/403
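
The failing request can be reproduced by hand from server1, which makes the 403 easier to inspect (the URL is taken verbatim from the error above):

curl -v "http://host197:8793/log/tutorial/print_date/2021-07-30T18:48:54.052656+00:00/1.log"
# returns HTTP/1.1 403 FORBIDDEN, matching the message shown in the web UI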

In the terminal on server2:
[2021-07-30 18:49:12,966] {_internal.py:113} INFO - 10.0.0.197 - - [30/Jul/2021 18:49:12] "GET /log/tutorial/print_date/2021-07-30T18:48:54.052656+00:00/1.log HTTP/1.1" 403 -
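
So the worker's log-serving endpoint is reachable but rejects the request. To confirm the log file itself was written, one can look inside the worker container on server2. A sketch; the container ID is looked up with docker ps, since the container was started without --name:

WORKER=$(docker ps --filter "ancestor=apache/airflow:2.1.2-python3.7" -q | head -n1)
docker exec "$WORKER" ls -l "/opt/airflow/logs/tutorial/print_date/2021-07-30T18:48:54.052656+00:00/"
# 1.log should be listed; it is also visible on the host through the mounted ./logs volume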
