Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align cncf provider file names with AIP-21 #29905

Merged
merged 1 commit into from
Mar 4, 2023
Merged

Conversation

eladkal
Copy link
Contributor

@eladkal eladkal commented Mar 3, 2023

Changes to import paths:

operators.kubernetes_pod -> operators.pod
triggers.kubernetes_pod -> triggers.pod

This PR is backward compatible


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added provider:cncf-kubernetes Kubernetes provider related issues area:providers area:system-tests provider:amazon-aws AWS/Amazon - related issues provider:google Google (including GCP) related issues labels Mar 3, 2023
@o-nikolas
Copy link
Contributor

Some static checks need fixing and warnings need accounting for, but otherwise the changes lgtm 👍

@eladkal eladkal requested review from potiuk and ashb as code owners March 4, 2023 00:00
@potiuk
Copy link
Member

potiuk commented Mar 4, 2023

(Pending fixing the tests :)

Changes to import paths:

`operators.kubernetes_pod` -> `operators.pod`
`triggers.kubernetes_pod` -> `triggers.pod`

This PR is backward compatible
@eladkal eladkal merged commit 9710394 into apache:main Mar 4, 2023
@eladkal eladkal deleted the k8s branch March 4, 2023 20:44
@pierrejeambrun pierrejeambrun added this to the Airflow 2.6.0 milestone Mar 22, 2023
@pierrejeambrun pierrejeambrun added the type:misc/internal Changelog: Misc changes that should appear in change log label Mar 22, 2023
@dstandish
Copy link
Contributor

dstandish commented May 15, 2023

I am confused what was the basis of this change. I thought AIP-21 just said that you should omit the _operator suffix. Why remove the kubernetes_? We still keep it (and very much need it), for example, on the kubernetes hook?

I should clarify, specifically I'm concerned with the rename of kubernetes_pod.py > pod.py

@eladkal
Copy link
Contributor Author

eladkal commented May 16, 2023

I am confused what was the basis of this change. I thought AIP-21 just said that you should omit the operator suffix. Why remove the kubernetes? We still keep it (and very much need it), for example, on the kubernetes hook?

Please explain why this is needed?
All providers are aligned without the prefix this is the convention.

I should clarify, specifically I'm concerned with the rename of kubernetes_pod.py > pod.py

Please clarify why. We are not going to remove kubernetes_pod.py any time soon.. This is one that will stay with us for long till we are fully confident that most users migrated. At least 1 year to my taste.. even more (at least this is how I see it)

@dstandish
Copy link
Contributor

dstandish commented May 16, 2023

I am confused what was the basis of this change. I thought AIP-21 just said that you should omit the operator suffix. Why remove the kubernetes? We still keep it (and very much need it), for example, on the kubernetes hook?

Please explain why this is needed? All providers are aligned without the prefix this is the convention.

My sentence structure maybe makes it harder to see but to clarify I'm talking about with kubernetes hook we call the file kubernetes.py -- what would you call it if you removed the "kubernetes"? That's what I mean. :)

I should clarify, specifically I'm concerned with the rename of kubernetes_pod.py > pod.py

Please clarify why. We are not going to remove kubernetes_pod.py any time soon.. This is one that will stay with us for long till we are fully confident that most users migrated. At least 1 year to my taste.. even more (at least this is how I see it)

When I wrote "specifically I'm concerned" I was clarifying that that was the rename I was asking about because there were other renames in the PR but I was just asking about pod.py (initially i said "what's the reason for this change" so it wasn't clear what I was asking about). I'm not saying that the change is like... especially "concerning" just clarifying what I'm asking about.

But yeah so anyway, just not sure why the change and curious to understand, does not seem that it is required / suggested / indicated by the AIP. I thought that kubernetes_pod.py was the right name -- just chopping the _operator suffix. So like KubernetesHook -> kubernetes.py and KubernetesPodOperator -> kubernetes_pod.py both seem consistent.

@potiuk
Copy link
Member

potiuk commented May 16, 2023

I think (and this is why I approve it) - this change was really going together with the "spirit" of the AIP 21 even if not necessarily the "letter" of it.

The spirit of AIP-21 was that we remove redundant, repeated names from the full path of the class/module. That was the gist of it and the reason we implemented the change.

And In this case. the change was rather following that:

from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator

contains kubernetes 3 times.

To be perfectly honest, I have not agreed with the change when we implemented it. I was against shortening the imports because I though that rather than length of the imports, the important thing is that when I press SHIFT (search anywhere) three times in my IntelliJ IDE and start typing datadog I can see datadog_operators.py, datadog_hooks.py, datadog_sensors.py rather than three datadog.py so that I can choose the one I want easier. It was made easier (actually during our discussion back then) that Intellij started to also show package in the result of the search so I did not oppose that much and went into "disagree but engage" mode (but it somewhat haunts me till today even though it is pretty usable even now with those shorter names and package hints):

image

I think this is just a natural application to be more consistent there. Again - I would personally prefer to go with the kubernetes_ prefix but the spirit of AIP-21 was to optimise the import length and reduce redundancy and for consistency, I think it makes sense to use the pod.py

@eladkal
Copy link
Contributor Author

eladkal commented May 16, 2023

My sentence structure maybe makes it harder to see but to clarify I'm talking about with kubernetes hook we call the file kubernetes.py -- what would you call it if you removed the "kubernetes"? That's what I mean. :)

The import is
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

You know it's Kubernetes because you import it from Kubernetes provider. There is no reason to have kubernetes_ for all py file in the provider.

When I wrote "specifically I'm concerned" I was clarifying that that was the rename I was asking about because there were other renames in the PR but I was just asking about pod.py (initially i said "what's the reason for this change" so it wasn't clear what I was asking about). I'm not saying that the change is like... especially "concerning" just clarifying what I'm asking about.

When we have single hook we normally name it after the provider itself.

from airflow.providers.asana.hooks.asana import AsanaHook
from airflow.providers.ftp.hooks.ftp import FTPHook
from airflow.providers.http.hooks.http import HttpHook

When we have multiple hooks we name it after the service (like you can see in Amazon and Google providers.

About operators note that #29930 is going to add
airflow.providers.cncf.kubernetes.operators.resource
so having
airflow.providers.cncf.kubernetes.operators.pod
along side make sense to me.

We are not quite there in having this completely organized, we probably need better defined policy with enforcement to bind all providers.

@dstandish
Copy link
Contributor

dstandish commented May 16, 2023

The import is from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

You know it's Kubernetes because you import it from Kubernetes provider. There is no reason to have kubernetes_ for all py file in the provider.

Yea I guess my interpretation of AIP-21 was not that we avoid all redundancy (and hooks is a good example where we don't do this, because then your file is _.py) but a rule about how to map to module name from class name. So FtpHook becomes ftp.py, KubernetesPodOperator becomes kubernetes_pod.py -- and indeed, someone in the past must have also shared this interpretation when it was originally renamed from kubernetes_pod_operator.py to kubernetes_pod.py.

I see the logic, that from organization perspective, the information ("kubernetes") is there in the module path, so in that sense you can say it's not necessary. I guess I'm with Jarek in preferring the verbosity -- and like he was mentioning, it's nice for finding a file, e.g.:

image

vs

image

But I am also with Jarek in the sense of not standing in the way of the shorter naming if that's what the community prefers.

@potiuk
Copy link
Member

potiuk commented May 16, 2023

Well. If there are two of us, we might actually attempt to revert that decision..... 🤯

@eladkal
Copy link
Contributor Author

eladkal commented May 16, 2023

AIP-21 did not completely resolved all issues.

For example:
We have AutoMLPredictOperator, BigtableCreateInstanceOperator in Google provider
We have EC2TerminateInstanceOperator, GlueJobOperator in Amazon provider

Then why do we have KubernetesPodOperator? Why the provider name is part of the operator name? Should it be change to PodOperator? I'm not sure.
I can show other examples that we do have the name of the provider as part of the operator name.

I can raise like 5-6 more issues of inconstancy of how we name things. We can reopen this and potentially come up with a replacement to AIP-21 with new rules and some enforcements but that is a massive refactor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers area:system-tests provider:amazon-aws AWS/Amazon - related issues provider:cncf-kubernetes Kubernetes provider related issues provider:google Google (including GCP) related issues type:misc/internal Changelog: Misc changes that should appear in change log
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants