Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a mechanism to warn if executors override existing CLI commands #33423

Merged
merged 9 commits into from Aug 19, 2023

Conversation

vandonr-amz
Copy link
Contributor

@vandonr-amz vandonr-amz commented Aug 15, 2023

Currently, if an executor redefines and existing command (allowed by #29055), I think it silently overrides the existing one, which can be an issue.
I'm in the process of adding one more way to add commands, so doing this seems like a good idea.

Also fixed a test side-effect, which led to fixing a bunch of tests that were only working because of the order in which they were called.

airflow/cli/cli_parser.py Outdated Show resolved Hide resolved
Comment on lines -48 to +50
airflow_commands = core_commands
airflow_commands = core_commands.copy() # make a copy to prevent bad interactions in tests
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we do airflow_commands.extend(...) after that, and airflow_commands is just pointing to core_commands, we were actually modifying core_commands, and whatever we were adding in there stayed there for all tests.

Comment on lines -37 to +46
custom_executor_module.CustomCeleryExecutor = type( # type: ignore
"CustomLocalExecutor", (celery_executor.CeleryExecutor,), {}
)
custom_executor_module.CustomCeleryKubernetesExecutor = type( # type: ignore
"CustomLocalKubernetesExecutor", (celery_kubernetes_executor.CeleryKubernetesExecutor,), {}
custom_executor_module.CustomLocalExecutor = type( # type: ignore
"CustomLocalExecutor", (local_executor.LocalExecutor,), {}
)
custom_executor_module.CustomCeleryExecutor = type( # type: ignore
"CustomKubernetesExecutor", (celery_executor.CeleryExecutor,), {}
custom_executor_module.CustomLocalKubernetesExecutor = type( # type: ignore
"CustomLocalKubernetesExecutor", (local_kubernetes_executor.LocalKubernetesExecutor,), {}
)
custom_executor_module.CustomCeleryKubernetesExecutor = type( # type: ignore
"CustomCeleryKubernetesExecutor", (celery_kubernetes_executor.CeleryKubernetesExecutor,), {}
custom_executor_module.CustomKubernetesExecutor = type( # type: ignore
"CustomKubernetesExecutor", (kubernetes_executor.KubernetesExecutor,), {}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this whole thing was a mess of copy-paste mistakes. Tests were passing only because commands were never removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

"custom_executor.CustomCeleryKubernetesExecutor",
],
)
def test_dag_parser_celery_command_accept_celery_executor(self, executor):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test was originally checking that celery executors were accepting the celery sub-command, but this is now included in the parameterized test below, with the kubernetes command at the same time.

parser.parse_args([expected_arg, "--help"])
assert e.value.code == 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this assert was never executed because it was in the block that captures the exception

stderr = stderr.getvalue()
assert "airflow command error" not in stderr

def test_dag_parser_config_command_dont_required_celery_executor(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was a obsolete test remaining from when there was code to check the executor when executing a command. It was introduced after a bugfix on that code (#17071), but the tested code has been removed since, so it's not testing anything anymore.

Comment on lines +76 to +77
f"This can be due to the executor '{ExecutorLoader.get_default_executor_name()}' "
f"redefining core airflow CLI commands."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to have this message? In the future it might no longer be true and other components such as auth managers could be able to also define their own CLI command

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we'll need to edit the message then, but in my opinion, we cannot crash the whole thing in the face of the user without giving them as much information as possible on why this crash is happening.

Or maybe the only people who would encounter this error would be developers trying to add new commands and a vague message would be enough because they know what they just modified ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I think we cannot entirely rule out the possibility that a user would face this error, especially as we add more "command vendors", the number of combination raises quadratically, and we cannot expect developers to test them all.
We can try to enforce it with "officially" provided components, but with custom components, all bets are off.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I see what you mean and I dont disagree but at the same time, it might not be related at all to the default executor and then point the wrong direction to the user

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont disagree but at the same time, it might not be related at all to the default executor and then point the wrong direction to the user

I think the language used in the message is already pretty cautious, it just suggests that the executor could be the issue:

This can be due to the executor ...

We could try soften the language even more, but personally I think this is okay for now

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I think we cannot entirely rule out the possibility that a user would face this error, especially as we add more "command vendors", the number of combination raises quadratically, and we cannot expect developers to test them all.
We can try to enforce it with "officially" provided components, but with custom components, all bets are off.

Yes. But one comment/ suggestion.

This is the same with "config" we handle now in exactly the same way. I think there are two solutions for "all bets off" and potential conflicts:

  1. introduce some kind of central registry of commands and "give them" to those who want certain id (pessimistic, top-down-driven, centralized, does not play well with free/open-source project). This for example what IP addresses, DNS names, IMEI numbers for simcards are about as well as few others.

  2. make those who introduce custom commands painfully aware that THEY have to do everythig to make their commands unique. This is far softer, distributed and more self-regulating. The problem is that those who develop things will have problem if they use non-unique names. So anyone doing "aws executors" ;) should have commands starting with "aws-" to avoid contention. That is more or less what java and python packages follow. Nobody keeps the registry of those, yet somehow everyone takes care to not use generic package names following common conventions ("google.",. "aws.") and it just works (TM).

Obviously 2) is better for us.

This is something we might want to document as "soft convention" and mention why ("When you create a custom config/CLI, you should make sure to use unique-enough prefix indicating your unique product/service/etc. to avoid conflicts with other commands/configs added by others.")

Maybe we should mention it somewhere where we document how to create custom executor (and likely should be the same on custom provider for config). Once we document it, it's really on those who will add new commands to worry about it (and yes I know you are on both sides of it :D ).

Comment on lines -37 to +46
custom_executor_module.CustomCeleryExecutor = type( # type: ignore
"CustomLocalExecutor", (celery_executor.CeleryExecutor,), {}
)
custom_executor_module.CustomCeleryKubernetesExecutor = type( # type: ignore
"CustomLocalKubernetesExecutor", (celery_kubernetes_executor.CeleryKubernetesExecutor,), {}
custom_executor_module.CustomLocalExecutor = type( # type: ignore
"CustomLocalExecutor", (local_executor.LocalExecutor,), {}
)
custom_executor_module.CustomCeleryExecutor = type( # type: ignore
"CustomKubernetesExecutor", (celery_executor.CeleryExecutor,), {}
custom_executor_module.CustomLocalKubernetesExecutor = type( # type: ignore
"CustomLocalKubernetesExecutor", (local_kubernetes_executor.LocalKubernetesExecutor,), {}
)
custom_executor_module.CustomCeleryKubernetesExecutor = type( # type: ignore
"CustomCeleryKubernetesExecutor", (celery_kubernetes_executor.CeleryKubernetesExecutor,), {}
custom_executor_module.CustomKubernetesExecutor = type( # type: ignore
"CustomKubernetesExecutor", (kubernetes_executor.KubernetesExecutor,), {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@potiuk
Copy link
Member

potiuk commented Aug 19, 2023

I am merging it, but I have a comment for the future (and also part of earlier discussion with @o-nikolas ). We do not currently have "How to create custom executor" docs. And it's (IMHO) one of the most important thing to complete AIP-51. We need to make sure in the future docs that we mention the "conventions" we have for CLI/Config uniqueness.

Just a note so that we do not forget about it.

@potiuk potiuk merged commit 1945c1a into apache:main Aug 19, 2023
42 checks passed
@potiuk potiuk added this to the Airflow 2.7.1 milestone Aug 19, 2023
@potiuk
Copy link
Member

potiuk commented Aug 19, 2023

Marked it as 2.7.1 as well - the sooner we release it, the better.

@vandonr-amz vandonr-amz deleted the vandonr/fab2 branch August 19, 2023 23:50
@o-nikolas
Copy link
Contributor

We do not currently have "How to create custom executor" docs. And it's (IMHO) one of the most important thing to complete AIP-51.

@potiuk Yupp, still on my radar. I've been pushing hard to get our first executor itself completed, but I'll try make some time for this task this week!

@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Aug 27, 2023
@ephraimbuddy
Copy link
Contributor

We can't be able to cherrpick this one to 2.7.1 due to a dependence on #33279.

@potiuk
Copy link
Member

potiuk commented Aug 28, 2023

Yeah. It's not a bugfix as well - can be easily added in 2.8

@ephraimbuddy ephraimbuddy added type:improvement Changelog: Improvements and removed type:bug-fix Changelog: Bug Fixes labels Nov 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:CLI type:improvement Changelog: Improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants