Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Public Interface description to Airflow documentation #28300

Merged
merged 1 commit into from
Jan 11, 2023

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented Dec 12, 2022


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member Author

potiuk commented Dec 12, 2022

About 80% of it was written by the GPT Chat :D

docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
Copy link
Contributor

@BasPH BasPH left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides several structural changes, my main concern is the naming. To me "public APIs" 99% of the time relates to a REST API. I think "public interfaces" is better suited here. WDYT?

docs/apache-airflow/index.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-api.rst Outdated Show resolved Hide resolved
@potiuk
Copy link
Member Author

potiuk commented Dec 12, 2022

Besides several structural changes, my main concern is the naming. To me "public APIs" 99% of the time relates to a REST API. I think "public interfaces" is better suited here. WDYT?

Yes. That might avoid a lot of confusion.

@potiuk potiuk force-pushed the add-public-api-documentation branch from 6ec44ba to 065138c Compare December 13, 2022 01:28
@potiuk potiuk changed the title Add Public API description to Airflow documentation Add Public Interface description to Airflow documentation Dec 13, 2022
@potiuk potiuk force-pushed the add-public-api-documentation branch from 065138c to dc71c20 Compare December 13, 2022 01:28
@potiuk
Copy link
Member Author

potiuk commented Dec 13, 2022

First round review passed - all comments applied @BasPH @o-nikolas

@potiuk
Copy link
Member Author

potiuk commented Dec 13, 2022

(and docs building is fixed here).

@potiuk potiuk force-pushed the add-public-api-documentation branch from dc71c20 to ad15bc9 Compare December 13, 2022 16:20
@potiuk potiuk requested a review from BasPH December 13, 2022 16:20
@potiuk potiuk force-pushed the add-public-api-documentation branch from ad15bc9 to 9d56cd0 Compare December 13, 2022 16:21
@potiuk
Copy link
Member Author

potiuk commented Dec 13, 2022

Next round of reviews completed. I've also added Trigger as it was missing. I think it's coming up nicely with everyone's contributions.

Please take a look and see if there are other things I missing, maybe it is close to mergability

BTW. The more review. the less GPT Chatbot's text remains:) .

@potiuk potiuk force-pushed the add-public-api-documentation branch from 9d56cd0 to f7499c6 Compare December 13, 2022 20:08
Copy link
Contributor

@BasPH BasPH left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nitpicky consistency things, but overall looking good!

docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
Comment on lines 113 to 115
There are a number of different executor implementations built-in Airflow, each with its own unique
characteristics and capabilities. Executor interface was available in earlier version of Airflow but
only as of version 2.6 executors are decoupled and Airflow does not rely on built-in set of executors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the key message is here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message is that only as 2.6 executors are truly extendable. Because they weren't before. There are certain modifications and changes in the interface (as explained and being implemented in https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-51+Removing+Executor+Coupling+from+Core+Airlfow ) that allow you to implement custom executor without hittting some roadblocks.

We just failed in delivering it before even if it was somewhat promised. And this is to warn users they should only target 2.6+. I want to avoid the situation that people mistakenly think they can do it in earlier versions. Of course there is the "version" selector at the top of the documentation, but many people do not realise that the feature they see there in the "stable" version are not available in their version and loose time and effort to try to make them work. And then they come up with bad feelings about Airflow, because we failed to warn them clearly enough.

Just one example from last week:

User thinking that breadth-first mapping is implemented in 2.3

https://apache-airflow.slack.com/archives/CCQ7EGB1P/p1670968167747709?thread_ts=1670937467.554809&cid=CCQ7EGB1P

User is at 2.3.4 in Composer and cannot upgrade to 2.5.0:

User: any possibility to implement an inheritance on TaskGroup and add the "jinja interpretation mechanism" somehow?? even simplistically, just to get a {{ds_nodash}} interpretation (edited)
Jarek: No
User: RIP 10 days of work lmao

I want to malke sure our users don't even start working on 2.3, 2.4, 2.5 with custom executors basing on the new documentation that's coming on 2.6 where we explicitly admit Executors are extendable. Before we hinted at it but never admitted to it. Now when we explicitly state it that users "can" do it, we should tell them they should only do it in 2.5 and explain why. Otherwise they will start working on it with earlier versions and when they fail, they will complain. We can prevent it by being explicit about it and if they complain we can link to that doc and tell them "you had a chance to read it, it's your fault". Otherwise they can tell "you have not explained this explicitly that it's only 2.6" and they will be right.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is simple not true though -- I know of multiple people (Astro and others) who have successfully written custom executors against 2.1 and above.

Copy link
Member Author

@potiuk potiuk Dec 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is simple not true though -- I know of multiple people (Astro and others) who have successfully written custom executors against 2.1 and above.

It's very strong statement to claim something is not true. But I think we are mixing two things here. I know there are people who wrote some implementations of custom executors that worked in some cases. This is true. I agree with that statement. And I am not arguing with that statement at all. In fact I wholeheartedly agree with your statement @ashb. And I believe it is not invalidating what I wrote (or at least what my intention was). What I wrote that the earlier attempts to write executor could have failed in a number of non-obvious cases because there were a number of hard-coded behaviours that made executior not a "well defined API".

I do not want to make an attempt to describe what things you should avoid and what problems you would hit with tryibng to implement Executors before. I think it's a topic that requires some intrinsic analysis of where the "hard-coded" behaviours could cause the problems. And if someone would like to attempt it - I am all ears.

But - let me rephrase it in the way that will make it clear that there were successful attempts before, but you could have hit difficult to diagnose and solve problems because built-in executors were hard-coded (and I will mention that a there were cases successes in writing limited implementations of executors). I hope this will be acceptable.

Copy link
Member Author

@potiuk potiuk Dec 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the note now:

There are a number of different executor implementations built-in Airflow, each with its own unique
characteristics and capabilities. Executor interface was available in earlier version of Airflow but
only as of version 2.6 executors are fully decoupled and Airflow does not rely on built-in set of executors.
You could have implemented (and succeeded) with implementing Executors before Airflow 2.6 and a number
of people succeeded in doing so, but there were some hard-coded behaviours that preferred in-built
executors, and custom executors could not provide full functionality that built-in executors had.

I hope it is much closer to "truth" now - it mentions that it was possible to implement executors before and people succeed with it (truth 1) but that you could have hit the limits because of some hard-coded behaviours (truth 2) and that in 2.6 you can rely on public api of executor to be on-par with the built-in executors (truth 3)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope there are no more non-truths in this statement @ashb

Copy link
Member Author

@potiuk potiuk Dec 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW. I can also link to AIP-51 for details of what "could not be done" before. I think it describes it pretty well (and completely). But I think we rarely link to AIPs in our docs (and it was criricised before when I added it), so I think just leaving a generic note is better.

@potiuk
Copy link
Member Author

potiuk commented Dec 14, 2022

Hey @BasPH - explained reasoning why I addded those a bit longer explanation and context.

I hope this is good for you as well.

One of the cases which I want to address is to be able to explain people that if they do rely on some of those things being explained as "public interface" , even if they do, whenever they complain or raise questions about that I want to simply send them there - "Please read the docs, it's all explained there and you likely missed it".

That saves enormous mental effort and time in disputes when you engage in such discussions (which are often) "but you did not explain it, I had the right to think it works this way".

Also just to stress a bit that my "thinking" is not theorethical. It is very much based on practice of interacting with our users A LOT. I would encourage everyone to engage more in discussions we have in our community - or at least review them and follow. So far not many of us, maintainers are engaged, but I am. And I think I understand quite well - and likely bettter than others - what our users are struggling with as well what is helpful for people.

I am not sure if you realize, but over the last 6 months or so the status of Github Discussions of ours looks like that (I have - very consistently 11/15 "most helpful" answers last 30 days while the next person never had more than 1-2:

Screenshot 2022-12-14 at 12 12 02

And the raise in a number of discussions looks like that - so this is an extremely important medium where our users look for help:

Screenshot 2022-12-14 at 12 16 02

@potiuk potiuk requested a review from BasPH December 14, 2022 11:22
@potiuk
Copy link
Member Author

potiuk commented Dec 14, 2022

BTW. That was a good exercise (and learning) for GPT Chat after the reviews. I don't think it is good in writing docs like that - it's far to chatty (it's a chat bot eventually). Almost none of the original text from the GPT chat is left, and the number of omissions (which were added during the review) and mistakes it made was quite big.

@potiuk
Copy link
Member Author

potiuk commented Dec 16, 2022

Hey everyone - I spoke to @TohnJhomas and after we merged the big restructuring of his in #27235 we decided together to move all the changes for "customizing airflow" and "public-airlfow-interface" to "administration-and-deployment".

This is mostly what it is all about - people writing DAGs are rarely concerned with changes that would involve breaking the API - those are more people who are installing and administering airlfow that are concerned about it, but I think we can still discuss what is the best place if others think otherwise.

Looking forward to merging that one afte the big restructure of docs is merged.

@potiuk
Copy link
Member Author

potiuk commented Dec 29, 2022

Ok. I brought it to the state that - I hope we can start discussing what is missing. I re-wrote the headers to reflect the current thinking of mine how our users could find this page useful, but also to make this page a really comprehensive and complete description of what the users might expect from Airflow as a "library".

This is the current proposal for the "structure" of the page:

Screenshot 2022-12-29 at 21 52 13

And I like that we can get rid of other pages that were "almost, but not quite entirely unlike what i wanted to achieve" (quoting my favourite writer).

@potiuk potiuk force-pushed the add-public-api-documentation branch from 5c14858 to c78e64b Compare January 5, 2023 10:33
@potiuk
Copy link
Member Author

potiuk commented Jan 5, 2023

@BasPH ?

@potiuk
Copy link
Member Author

potiuk commented Jan 10, 2023

Hey @BasPH - just to repeat it - is your "change requested" still valid?

Is the current approach good enough for you or do you think there is something stil not OK with teh approach I finally came up with after listening and getting feedback and applying multiple ideas on how to address the issue that has been already discussed in the devlist as needed ?

I think we need to start openly communicate with our users on what they should expect as something they can rely on, And I think this is the first step tp do it.

Anything else that needs to be done here do you thinkl? I would really love to merge it to move forward and seems it's been stalled on your review.

Copy link
Contributor

@BasPH BasPH left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did another review. Starting to look nice and clear!

docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
docs/apache-airflow/public-airflow-interface.rst Outdated Show resolved Hide resolved
Airflow is a complex system and since it is a platform, it is supposed
to be extended by the users by writing custom code wherever they miss
functionality in Airflow's core or its providers.

This page is an attempt to have a single place where we can express
our intention of what is the Public Interface of Airflow that the
user can depend on when implementing such customizations.

This is never 100% possible, we know that some users workflows might
depend on stuff that is internal implementation details or behaviours,
however the user doing so should be aware of the risk they are taking
by relying on something that was not intentionally exposed.

This page is intended to serve as a guideline for the users who would
like to make a decision to rely on some of the Airflow behaviours, so
that they know whether the API they want to rely on were explicitly
intended by Airflow community to expose as Public, or not.

Co-authored-by: Bas Harenslak <basph@users.noreply.github.com>
@potiuk potiuk force-pushed the add-public-api-documentation branch from 340673d to 858e6f9 Compare January 11, 2023 10:38
@potiuk
Copy link
Member Author

potiuk commented Jan 11, 2023

Did another review. Starting to look nice and clear!

Indeed. I like how this change now consolidates few places where we tried to communicate what should be considered as "public" in one dedicated place and I like the way how concise it is after all those corrections and feedback.

I updated it now:

  • applied your corrections Bas (I actually made you co-author @BasPH if you do not mind, as I think there are enormous improvements in clarity from you here :)
  • i re-added the links to providers with just a "read more" - similarly as in other cases - I think this is consistent with all other parts and allows the reader to dig-deeper if they want.

I hope we will be able to get it in soon :)

@potiuk potiuk merged commit 352d492 into apache:main Jan 11, 2023
@potiuk potiuk deleted the add-public-api-documentation branch January 11, 2023 13:37
@pierrejeambrun pierrejeambrun added the type:doc-only Changelog: Doc Only label Feb 27, 2023
@pierrejeambrun pierrejeambrun added this to the Airflow 2.5.2 milestone Feb 27, 2023
pierrejeambrun pushed a commit that referenced this pull request Mar 6, 2023
Airflow is a complex system and since it is a platform, it is supposed
to be extended by the users by writing custom code wherever they miss
functionality in Airflow's core or its providers.

This page is an attempt to have a single place where we can express
our intention of what is the Public Interface of Airflow that the
user can depend on when implementing such customizations.

This is never 100% possible, we know that some users workflows might
depend on stuff that is internal implementation details or behaviours,
however the user doing so should be aware of the risk they are taking
by relying on something that was not intentionally exposed.

This page is intended to serve as a guideline for the users who would
like to make a decision to rely on some of the Airflow behaviours, so
that they know whether the API they want to rely on were explicitly
intended by Airflow community to expose as Public, or not.

Co-authored-by: Bas Harenslak <basph@users.noreply.github.com>

Co-authored-by: Bas Harenslak <basph@users.noreply.github.com>
(cherry picked from commit 352d492)
potiuk added a commit to potiuk/airflow that referenced this pull request May 2, 2023
The Airflow models have been accidentally removed from the docs
when apache#28300 was implemented. The whole models documentation have
been removed accidentally, even if there were references to the
actual classes and packages used as the models package was
entirely excluded.

This PR fixes it by selectively including the models that should
be included and by linking the package indexes directly in
the public interface documentation.
potiuk added a commit that referenced this pull request May 2, 2023
The Airflow models have been accidentally removed from the docs
when #28300 was implemented. The whole models documentation have
been removed accidentally, even if there were references to the
actual classes and packages used as the models package was
entirely excluded.

This PR fixes it by selectively including the models that should
be included and by linking the package indexes directly in
the public interface documentation.
ephraimbuddy pushed a commit that referenced this pull request May 2, 2023
The Airflow models have been accidentally removed from the docs
when #28300 was implemented. The whole models documentation have
been removed accidentally, even if there were references to the
actual classes and packages used as the models package was
entirely excluded.

This PR fixes it by selectively including the models that should
be included and by linking the package indexes directly in
the public interface documentation.

(cherry picked from commit 6720a90)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants