
Update public interface doc re operators (#36767)
Base classes such as BaseOperator and BaseSensorOperator are public in the traditional sense.  We publish them explicitly for the purpose of being extended.

But I think that derivatives of them are published only as end products, in that their behavior and signature should be subject to semver, but not their structure, which should be considered "internal" and not user-facing.  Users can extend these classes but they should do so at their own risk, with the knowledge that we might refactor.  Otherwise the "public interface" is just unnecessarily big.
dstandish committed Jan 19, 2024
1 parent fba05db commit 66de4bd
Showing 5 changed files with 30 additions and 57 deletions.
6 changes: 5 additions & 1 deletion airflow/models/baseoperator.py
Expand Up @@ -15,7 +15,11 @@
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Base operator for all operators."""
"""
Base operator for all operators.
:sphinx-autoapi-skip:
"""
from __future__ import annotations

import abc
Expand Down
7 changes: 6 additions & 1 deletion airflow/operators/__init__.py
Expand Up @@ -16,7 +16,12 @@
# specific language governing permissions and limitations
# under the License.
# fmt: off
"""Operators."""
"""
Operators.
:sphinx-autoapi-skip:
"""

from __future__ import annotations

from airflow.utils.deprecation_tools import add_deprecated_classes
Expand Down
6 changes: 5 additions & 1 deletion airflow/sensors/__init__.py
Expand Up @@ -16,7 +16,11 @@
# specific language governing permissions and limitations
# under the License.
# fmt: off
"""Sensors."""
"""
Sensors.
:sphinx-autoapi-skip:
"""
from __future__ import annotations

from airflow.utils.deprecation_tools import add_deprecated_classes
Expand Down
66 changes: 12 additions & 54 deletions docs/apache-airflow/public-airflow-interface.rst
Expand Up @@ -18,27 +18,29 @@
Public Interface of Airflow
...........................

The Public Interface of Apache Airflow is a set of interfaces that allow developers to interact
with and access certain features of the Apache Airflow system. This includes operations such as
creating and managing DAGs (Directed Acyclic Graphs), managing tasks and their dependencies,
The Public Interface of Apache Airflow is the collection of interfaces and behaviors in Apache Airflow
whose changes are governed by semantic versioning. A user interacts with Airflow's public interface
by creating and managing DAGs, managing tasks and dependencies,
and extending Airflow capabilities by writing new executors, plugins, operators and providers. The
Public Interface can be useful for building custom tools and integrations with other systems,
and for automating certain aspects of the Airflow workflow.

Using Airflow Public Interfaces
===============================

Using Airflow Public Interfaces is needed when you want to interact with Airflow programmatically:
The following are some examples of the public interface of Airflow:

* When you are extending Airflow classes such as Operators and Hooks. This can be done by DAG authors to add missing functionality in their DAGs or by those who write reusable custom operators for other DAG authors.
* When you are writing your own operators or hooks. This is commonly done when no hook or operator exists for your use case, or when one exists but you need to customize its behavior.
* When writing new :doc:`Plugins <authoring-and-scheduling/plugins>` that extend Airflow's functionality beyond
DAG building blocks. Secrets, Timetables, Triggers, Listeners are all examples of such functionality. This
is usually done by users who manage Airflow instances.
* Bundling custom Operators, Hooks, Plugins and releasing them together via
:doc:`provider packages <apache-airflow-providers:index>` - this is usually done by those who intend to
provide a reusable set of functionality for external services or applications Airflow integrates with.
* Using the TaskFlow API to write tasks
* Relying on the consistent behavior of Airflow objects

All the ways above involve extending or using Airflow Python classes and functions. The classes
One aspect of "public interface" is extending or using Airflow Python classes and functions. The classes
and functions mentioned below can be relied on to keep backwards-compatible signatures and behaviours within
a MAJOR version of Airflow. On the other hand, classes and methods starting with ``_`` (also known
as protected Python methods) and ``__`` (also known as private Python methods) are not part of the Public
Expand Down Expand Up @@ -73,8 +75,6 @@ Airflow has a set of example DAGs that you can use to learn how to write DAGs

You can read more about DAGs in :doc:`DAGs <core-concepts/dags>`.

.. _pythonapi:operators:

References for the modules used in DAGs are here:

.. toctree::
Expand All @@ -95,57 +95,14 @@ Properties of a :class:`~airflow.models.dagrun.DagRun` can also be referenced in

_api/airflow/models/dagrun/index

.. _pythonapi:operators:

Operators
---------

Operators allow for the generation of certain types of tasks that become nodes in
the DAG when instantiated.

There are 3 main types of operators:

- Operators that perform an **action**, or tell another system to
  perform an action
- **Transfer** operators move data from one system to another
- **Sensors** are a certain type of operator that will keep running until a
certain criterion is met. Examples include a specific file landing in HDFS or
S3, a partition appearing in Hive, or a specific time of the day. Sensors
are derived from :class:`~airflow.sensors.base.BaseSensorOperator` and run a poke
method at a specified :attr:`~airflow.sensors.base.BaseSensorOperator.poke_interval` until it
returns ``True``.

All operators are derived from :class:`~airflow.models.baseoperator.BaseOperator` and acquire much
functionality through inheritance. Since this is the core of the engine,
it's worth taking the time to understand the parameters of :class:`~airflow.models.baseoperator.BaseOperator`
to understand the primitive features that can be leveraged in your DAGs.

Airflow has a set of Operators that are considered public. You are also free to extend their functionality
by subclassing them:

.. toctree::
:includehidden:
:glob:
:maxdepth: 1

_api/airflow/operators/index

_api/airflow/sensors/index


You can read more about operators in :doc:`core-concepts/operators` and :doc:`core-concepts/sensors`.
You can also learn how to write a custom operator in :doc:`howto/custom-operator`.

.. _pythonapi:hooks:

References for the modules used for operators are here:

.. toctree::
:includehidden:
:glob:
:maxdepth: 1

_api/airflow/models/baseoperator/index
The base classes :class:`~airflow.models.baseoperator.BaseOperator` and :class:`~airflow.sensors.base.BaseSensorOperator` are public and may be extended to make new operators.

Subclasses of BaseOperator which are published in Apache Airflow are public in *behavior* but not in *structure*. That is to say, an operator's parameters and behavior are governed by semver, but its methods are subject to change at any time.

Task Instances
--------------
Expand Down Expand Up @@ -175,6 +132,7 @@ Task instance keys are unique identifiers of task instances in a DAG (in a DAG R

_api/airflow/models/taskinstancekey/index

.. _pythonapi:hooks:

Hooks
-----
Expand Down
2 changes: 2 additions & 0 deletions docs/conf.py
Expand Up @@ -197,6 +197,8 @@
exclude_patterns = [
# We only link to selected subpackages.
"_api/airflow/index.rst",
# "_api/airflow/operators/index.rst",
# "_api/airflow/sensors/index.rst",
# Included in the cluster-policies doc
"_api/airflow/policies/index.rst",
"README.rst",
Expand Down
