"Operation not permitted" error when chmod on log folder #29112

Closed
1 of 2 tasks
denskh opened this issue Jan 23, 2023 · 12 comments · Fixed by #30123
Labels
area:helm-chart Airflow Helm Chart kind:bug This is a clearly a bug

@denskh

denskh commented Jan 23, 2023

Official Helm Chart version

1.7.0 (latest released)

Apache Airflow version

2.5.1

Kubernetes Version

1.24.6

Helm Chart configuration

executor: "KubernetesExecutor"  # however, the same issue happens with LocalExecutor
logs:
  persistence:
    enabled: true
    size: 50Gi
    storageClassName: azurefile-csi

Docker Image customizations

Using airflow-2.5.1-python3.10 as a base image.
Copy custom shared libraries into folder under /opt/airflow/company
Copy DAGs /opt/airflow/dags

What happened

After migrating from Airflow 2.4.3 to 2.5.1, I started getting the error below. There were no other changes to the custom image. No task can run because of this error:
Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/__main__.py", line 39, in main
    args.func(args)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/cli/cli_parser.py", line 52, in command
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/cli.py", line 108, in wrapper
    return f(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/cli/commands/task_command.py", line 384, in task_run
    ti.init_run_context(raw=args.raw)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 2414, in init_run_context
    self._set_context(self)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/log/logging_mixin.py", line 77, in _set_context
    set_context(self.log, context)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/log/logging_mixin.py", line 213, in set_context
    flag = cast(FileTaskHandler, handler).set_context(value)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/log/file_task_handler.py", line 71, in set_context
    local_loc = self._init_file(ti)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/log/file_task_handler.py", line 382, in _init_file
    self._prepare_log_folder(Path(full_path).parent)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/log/file_task_handler.py", line 358, in _prepare_log_folder
    directory.chmod(mode)
  File "/usr/local/lib/python3.10/pathlib.py", line 1191, in chmod
    self._accessor.chmod(self, mode, follow_symlinks=follow_symlinks)
PermissionError: [Errno 1] Operation not permitted: '/opt/airflow/logs/dag_id=***/run_id=manual__2023-01-22T02:59:43.752407+00:00/task_id=***'

What you think should happen instead

It seems that Airflow attempts to change the log folder's permissions and is not permitted to do so.
I get the same error when executing the command manually (I confirmed that the folder path exists):
chmod 511 '/opt/airflow/logs/dag_id=/run_id=manual__2023-01-22T02:59:43.752407+00:00/task_id='
chmod: changing permissions of '/opt/airflow/logs/dag_id=/run_id=scheduled__2023-01-23T15:30:00+00:00/task_id=': Operation not permitted
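
For anyone trying to narrow this down on their own volume, here is a small diagnostic sketch. It is not part of Airflow; `describe_permissions` is a hypothetical helper name. It reports who owns a folder and whether the current user would normally be allowed to chmod it. The owner-or-root rule it checks is the standard POSIX one, so treat the result as a hint only: a network filesystem or CSI driver may refuse chmod even for the owner.

```python
import os
import stat
import tempfile
from pathlib import Path


def describe_permissions(path):
    """Return ownership and mode details relevant to a failing chmod.

    Hypothetical helper for diagnosis only (POSIX systems).
    """
    info = Path(path).stat()
    euid = os.geteuid()
    return {
        "owner_uid": info.st_uid,
        "current_euid": euid,
        "mode": oct(stat.S_IMODE(info.st_mode)),
        # On POSIX filesystems only the owner or root may chmod. Network
        # filesystems may add restrictions that this check cannot see.
        "chmod_expected_to_work": euid == 0 or euid == info.st_uid,
    }


if __name__ == "__main__":
    # Demonstration on a throwaway directory; point it at the log folder
    # inside the failing pod to inspect the real case.
    with tempfile.TemporaryDirectory() as tmp:
        print(describe_permissions(tmp))
```

If `owner_uid` and `current_euid` differ and the process is not root, the "Operation not permitted" error from chmod is the expected outcome.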

How to reproduce

My understanding is that this error happens before any custom code is executed.

Anything else

The error happens every time; I am unable to start any DAG while using Airflow 2.5.1. The exact same configuration works with 2.5.0 and 2.4.3.
The same image and configuration work fine when running locally using docker-compose.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@denskh denskh added area:helm-chart Airflow Helm Chart kind:bug This is a clearly a bug labels Jan 23, 2023
@boring-cyborg

boring-cyborg bot commented Jan 23, 2023

Thanks for opening your first issue here! Be sure to follow the issue template!

@Taragolis
Contributor

I guess it is a regression from #28477.

@potiuk potiuk added this to the Airflow 2.5.2 milestone Jan 23, 2023
@potiuk
Member

potiuk commented Jan 23, 2023

Should be an easy fix, but it is likely another reason to speed up the 2.5.2 release. cc: @pierrejeambrun @ephraimbuddy

@ephraimbuddy
Contributor

@pingzh do you want to take a look at it?

@nsphung

nsphung commented Jan 31, 2023

I have the same issue with an image based on apache/airflow:2.5.1-python3.9. I upgraded because of CVE-2023-22884.

@potiuk
Member

potiuk commented Jan 31, 2023

@nsphung You can upgrade the MySQL provider and build your own image if you do not want to wait for 2.5.2 to be released.

@nsphung

nsphung commented Feb 3, 2023

I don't need the provider, so for now I've just uninstalled it from my Airflow instances. It's just the Dependabot alert that is annoying me, since it fires whenever it sees Airflow < 2.5.1 at the moment.

@mmichaels01

@denskh Did you wind up with a workaround for this?

I'm having the same issue, would like to use the shared logs, but I can disable them until we can upgrade to the 2.5.2 version when that's released

@denskh
Author

denskh commented Feb 17, 2023

Nope, still on 2.5.0, which doesn't have this issue. Hope it gets fixed soon.

@SamWheating
Contributor

SamWheating commented Mar 2, 2023

I have been trying to replicate this issue this morning, here's what I did:

  • Set up a Kind cluster with a ReadWriteMany PVC (running on NFS)
  • Deployed Airflow 2.5.1 with the latest helm chart, writing task logs to said PVC with similar logging config.
  • Ran some tasks, which all succeeded.

So I think that it must be an issue of the access / fsGroup on your volume, and might be related to the CSI?

If anyone has any other suggestions, I am happy to work on a fix, but would like to replicate the issue first.

Update: I wasn't using impersonation 🤦. Once I set AIRFLOW__CORE__DEFAULT_IMPERSONATION=nobody, I was able to replicate the issue.

@potiuk
Member

potiuk commented Mar 15, 2023

Apparently there are cases where changing the permission is impossible. We should ignore the error rather than fail the log file creation in such cases (as it is anyhow only needed to handle a very specific subdag issue).
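
A minimal sketch of the "ignore the error" approach described above. This is not the code merged in #30123; `prepare_log_folder` is a hypothetical standalone helper, and the default mode of `0o777` is an assumption made for illustration. It catches only `PermissionError`, the exception shown in the traceback in this issue.

```python
import logging
from pathlib import Path

log = logging.getLogger(__name__)


def prepare_log_folder(directory, mode=0o777):
    """Create `directory` and try to set `mode`, tolerating chmod failures.

    Hypothetical helper, not Airflow's implementation. Returns True if the
    chmod succeeded and False if it was refused.
    """
    directory = Path(directory)
    directory.mkdir(mode=mode, parents=True, exist_ok=True)
    try:
        directory.chmod(mode)
    except PermissionError as exc:
        # The folder may belong to another user (impersonation), or the
        # filesystem may not allow permission changes. Logging can still
        # proceed, so warn instead of failing the task.
        log.warning("Could not change permissions of %s: %s", directory, exc)
        return False
    return True
```

With this shape, a refused chmod turns into a warning in the log rather than an exception that stops the task from starting.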

potiuk added a commit to potiuk/airflow that referenced this issue Mar 15, 2023
In some circumstances, changing the permission of a parent folder
for the log might not be possible - for example when it was
created by another user (with impersonation) or when the filesystem
does not allow for permission change.

Fixes: apache#29112
potiuk added a commit that referenced this issue Mar 15, 2023
@potiuk potiuk modified the milestones: Airflow 2.5.3, Airflow 2.6.0 Mar 28, 2023
@thesuperzapper
Contributor

thesuperzapper commented Apr 6, 2023

@potiuk I am still seeing many users report errors in airflow 2.5.3 as the fix in PR #30123 was not back-ported.

The error is always associated with this line airflow/utils/log/file_task_handler.py#L358, perhaps we can put a temporary fix for 2.5.4 which puts a try/catch around that (and any other chmod attempts)?

Alternatively, can we revert #28477 which is the cause of this problem?
