Validate executor_config keys in BaseOperator #53459

Closed
HsiuChuanHsu wants to merge 2 commits into apache:main from HsiuChuanHsu:bug/k8s-executor-config-validation
@HsiuChuanHsu HsiuChuanHsu commented Jul 17, 2025

Description

Closes: #47702 by adding validation for the executor_config dictionary in the BaseOperator class of task-sdk/src/airflow/sdk/bases/operator.py. The change ensures that only the allowed keys (pod_override and pod_template_file) are used in executor_config. If invalid keys are present, an AirflowException is raised with a descriptive error message, preventing silent failures and providing clear feedback to users.

Changes:

  • Added a new method validate_executor_config in task-sdk/src/airflow/sdk/bases/operator.py to validate the executor_config dictionary.
  • The validation checks that executor_config is a dictionary and only contains the keys pod_override or pod_template_file.
  • If invalid keys are found, an AirflowException is raised with an error message that includes the task ID, DAG ID (if available), and the list of invalid keys.

Additional Notes:

The error would be displayed in the import-errors section of the Airflow UI.



Validate executor_config in BaseOperator to ensure it only contains 'pod_override' or 'pod_template_file' keys. This prevents invalid configurations from being set during initialization or attribute updates.

- Add validate_executor_config method to check for valid keys
- Call validation in __setattr__ when executor_config is modified
- Call validation during BaseOperator initialization
- Introduce tests to validate executor_config in BaseOperator
- Test valid keys (pod_override, pod_template_file) are accepted
- Test invalid keys raise RuntimeError with appropriate message
- Test error message includes DAG ID when DAG is provided
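
The hook points listed in this commit message can be illustrated with a minimal standalone sketch. It is not the PR's actual code: `SketchOperator` and `ExecutorConfigError` are hypothetical stand-ins for `BaseOperator` and `AirflowException`, and the sketch assumes that routing the check through `__setattr__` is enough to cover both initialization and later reassignment.

```python
class ExecutorConfigError(Exception):
    """Hypothetical stand-in for AirflowException."""


# Keys the PR proposed to allow (taken from the description above).
VALID_KEYS = {"pod_override", "pod_template_file"}


class SketchOperator:
    """Hypothetical stand-in for BaseOperator; not Airflow code."""

    def __init__(self, task_id, executor_config=None):
        self.task_id = task_id
        # This assignment also goes through __setattr__, so it is validated.
        self.executor_config = executor_config or {}

    def __setattr__(self, name, value):
        # Validate before storing, so a rejected value never replaces the old one.
        if name == "executor_config":
            self._validate_executor_config(value)
        super().__setattr__(name, value)

    def _validate_executor_config(self, executor_config):
        # Empty or non-dict values are skipped, mirroring the PR's early return.
        if not executor_config or not isinstance(executor_config, dict):
            return
        invalid_keys = set(executor_config) - VALID_KEYS
        if invalid_keys:
            raise ExecutorConfigError(
                f"Invalid executor_config keys for task '{self.task_id}': "
                f"{sorted(invalid_keys)}. "
                "Only 'pod_override' and 'pod_template_file' are allowed."
            )
```

In this sketch a rejected assignment leaves the previous `executor_config` untouched, because the check runs before the attribute is stored.
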
@HsiuChuanHsu HsiuChuanHsu force-pushed the bug/k8s-executor-config-validation branch from b3d3998 to 8e6e3b9 Compare July 17, 2025 15:11
@eladkal eladkal added this to the Airflow 3.0.4 milestone Jul 17, 2025
@eladkal eladkal added the type:bug-fix Changelog: Bug Fixes label Jul 17, 2025
Comment on lines +1206 to +1225
    def validate_executor_config(self, executor_config: dict | None) -> None:
        """
        Validate the executor_config to ensure it only contains 'pod_override' or 'pod_template_file' keys.

        :param executor_config: The executor_config dictionary to validate.
        :raises AirflowException: If executor_config contains keys other than 'pod_override' or 'pod_template_file'.
        """
        if not executor_config or not isinstance(executor_config, dict):
            return

        valid_keys = {"pod_override", "pod_template_file"}
        invalid_keys = set(executor_config.keys()) - valid_keys
        if invalid_keys:
            error_msg = (
                f"Invalid executor_config keys for task '{self.task_id}'"
                f"{' in DAG ' + self.dag.dag_id if self.has_dag() else ''}: {sorted(invalid_keys)}. "
                f"Only 'pod_override' and 'pod_template_file' are allowed."
            )
            self.log.error(error_msg)
            raise AirflowException(error_msg)

@o-nikolas does this hold true for the ECS executor too?

I see one of the tests doing this, and I wanted to check if that's the case:

        tags_exec_config = [{"key": "FOO", "value": "BAR"}]
        workload.ti.executor_config = {"tags": tags_exec_config}


Whoa, whoa, whoa, thanks for tagging me @amoghrajesh we do not want to do this.

We have many executors other than KubernetesExecutor, so at the base class level we absolutely should not assume/assert anything about the executor_config.

For example the ECS executor treats the executor_config as kwargs for the ECS run_task api call, to allow users to set overrides for the parameters that API accepts.

If we want to do any validation, it's going to have to be a lot more complicated, since it would have to try to detect the executor being used, which feels like a lot of work and I expect it to be a little brittle. You could also do validation at task queued time, which would at least bring it a bit earlier.

@o-nikolas o-nikolas left a comment

Adding a blocker so that this change does not get merged.

@kaxil kaxil left a comment

Yup, wrong change. This will not allow anyone to use custom executors, which might accept a different executor_config.

@HsiuChuanHsu
Contributor Author

Hi @amoghrajesh, @o-nikolas, @kaxil,
Thanks a ton for your reviews and feedback! After going through your comments, I realized I might’ve overlooked that other executors besides KubernetesExecutor could be affected too.
To be honest, this issue feels a bit bigger than what I can handle right now. To avoid going down the wrong path, I’m gonna close this PR for the time being. I’ll dig into it more and get a better grasp of the whole picture, maybe come back with a fresh PR later when I’m more confident.

Thanks again for all the help and pointers!

Labels

area:task-sdk type:bug-fix Changelog: Bug Fixes

Development

Successfully merging this pull request may close these issues.

Invalid keys in executor_config should raise an error

5 participants