Fix providers documentation formatting (#28754)
Taragolis committed Jan 5, 2023
1 parent 455d05d commit 2b92c3c
Showing 29 changed files with 163 additions and 105 deletions.
7 changes: 5 additions & 2 deletions airflow/providers/apache/livy/hooks/livy.py
@@ -344,8 +344,11 @@ def build_post_batch_body(
) -> dict:
"""
Build the post batch request body.
For more information about the format refer to
.. seealso:: https://livy.apache.org/docs/latest/rest-api.html
.. seealso::
For more information about the format refer to
https://livy.apache.org/docs/latest/rest-api.html
:param file: Path of the file containing the application to execute (required).
:param proxy_user: User to impersonate when running the job.
:param class_name: Application Java/Spark main class string.
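
Purely as an illustration of the batch-body fields documented above (not part of this commit; the file path, class name, and user are placeholders), a LivyOperator task that ultimately calls ``build_post_batch_body`` might look like this:

from airflow.providers.apache.livy.operators.livy import LivyOperator

# Hypothetical usage sketch; all values are placeholders.
submit_spark_job = LivyOperator(
    task_id="submit_spark_job",
    file="/user/spark/examples.jar",   # path of the application to execute (required)
    class_name="org.example.SparkPi",  # Java/Spark main class
    proxy_user="analyst",              # user to impersonate when running the job
    livy_conn_id="livy_default",
)
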
6 changes: 4 additions & 2 deletions docs/apache-airflow-providers-apache-drill/operators.rst
@@ -22,14 +22,16 @@ Apache Drill Operators
Prerequisite
------------

To use ``DrillOperator``, you must configure a :doc:`Drill Connection <connections/drill>`.
To use :class:`~airflow.providers.apache.drill.operators.drill.DrillOperator`,
you must configure a :doc:`Drill Connection <connections/drill>`.

.. _howto/operator:DrillOperator:

DrillOperator
-------------

Executes one or more SQL queries on an Apache Drill server. The ``sql`` parameter can be templated and be an external ``.sql`` file.
Executes one or more SQL queries on an Apache Drill server.
The ``sql`` parameter can be templated and be an external ``.sql`` file.

Using the operator
""""""""""""""""""
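
As a rough sketch of the usage described above (not part of this diff; the connection id and query are placeholders), a DrillOperator task could look like:

from airflow.providers.apache.drill.operators.drill import DrillOperator

run_drill_query = DrillOperator(
    task_id="run_drill_query",
    drill_conn_id="drill_default",                       # the Drill connection configured above
    sql="SELECT * FROM dfs.tmp.`example.csv` LIMIT 10",  # can also be a templated .sql file
)
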
3 changes: 2 additions & 1 deletion docs/apache-airflow-providers-apache-druid/operators.rst
@@ -22,7 +22,8 @@ Apache Druid Operators
Prerequisite
------------

To use ``DruidOperator``, you must configure a Druid Connection first.
To use :class:`~airflow.providers.apache.druid.operators.druid.DruidOperator`,
you must configure a Druid Connection first.

DruidOperator
-------------------
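
For orientation only (not part of the commit), submitting a Druid ingestion spec with DruidOperator might look like the following; the spec path and connection id are placeholders:

from airflow.providers.apache.druid.operators.druid import DruidOperator

submit_druid_ingest = DruidOperator(
    task_id="submit_druid_ingest",
    json_index_file="ingestion_spec.json",        # path to the ingestion spec (placeholder)
    druid_ingest_conn_id="druid_ingest_default",  # the Druid connection configured above
)
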
@@ -29,8 +29,8 @@ WebHdfsSensor
Waits for a file or folder to land in HDFS
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The :class:`~airflow.providers.apache.hdfs.sensors.web_hdfs.WebHdfsSensor`. is used to check for a file or folder
to land in HDFS
The :class:`~airflow.providers.apache.hdfs.sensors.web_hdfs.WebHdfsSensor` is used to check for a file or folder
to land in HDFS.

Use the ``filepath`` parameter to poke until the provided file is found.

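
A minimal sketch of the sensor described above (not part of the diff; the HDFS path is a placeholder):

from airflow.providers.apache.hdfs.sensors.web_hdfs import WebHdfsSensor

wait_for_hdfs_file = WebHdfsSensor(
    task_id="wait_for_hdfs_file",
    filepath="/data/landing/input.csv",  # poke until this file or folder lands in HDFS
    webhdfs_conn_id="webhdfs_default",
)
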
2 changes: 1 addition & 1 deletion docs/apache-airflow-providers-apache-pig/operators.rst
@@ -24,7 +24,7 @@ Apache Pig is a platform for analyzing large data sets that consists of a high-l
for expressing data analysis programs, coupled with infrastructure for evaluating these programs.
Pig programs are amenable to substantial parallelization, which in turns enables them to handle very large data sets.

use the PigOperator to execute a pig script
Use the :class:`~airflow.providers.apache.pig.operators.pig.PigOperator` to execute a pig script.

.. exampleinclude:: /../../tests/system/providers/apache/pig/example_pig.py
:language: python
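
As an illustrative sketch (not part of the commit; the Pig Latin snippet is a placeholder), a PigOperator task could be defined like this:

from airflow.providers.apache.pig.operators.pig import PigOperator

run_pig_script = PigOperator(
    task_id="run_pig_script",
    pig="ls /;",                        # inline Pig Latin; can also be templated
    pig_cli_conn_id="pig_cli_default",
)
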
10 changes: 7 additions & 3 deletions docs/apache-airflow-providers-apache-spark/operators.rst
@@ -22,9 +22,13 @@ Apache Spark Operators
Prerequisite
------------

To use ``SparkJDBCOperator`` and ``SparkSubmitOperator``, you must configure a :doc:`Spark Connection <connections/spark>`. For ``SparkJDBCOperator``, you must also configure a :doc:`JDBC connection <apache-airflow-providers-jdbc:connections/jdbc>`.

``SparkSqlOperator`` gets all the configurations from operator parameters.
* To use :class:`~airflow.providers.apache.spark.operators.spark_submit.SparkSubmitOperator`
you must configure :doc:`Spark Connection <connections/spark>`.
* To use :class:`~airflow.providers.apache.spark.operators.spark_jdbc.SparkJDBCOperator`
you must configure both :doc:`Spark Connection <connections/spark>`
and :doc:`JDBC connection <apache-airflow-providers-jdbc:connections/jdbc>`.
* :class:`~airflow.providers.apache.spark.operators.spark_sql.SparkSqlOperator`
gets all the configurations from operator parameters.

.. _howto/operator:SparkJDBCOperator:

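
A minimal sketch of the first bullet above (not part of the diff; the application path is a placeholder):

from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

spark_pi = SparkSubmitOperator(
    task_id="spark_pi",
    application="/opt/spark/examples/src/main/python/pi.py",  # placeholder application
    conn_id="spark_default",                                  # the Spark connection configured above
)
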
@@ -24,10 +24,11 @@ The ArangoDB connection provides credentials for accessing the ArangoDB.
Configuring the Connection
--------------------------
ArangoDB Host (required)
Specify ArangoDB Host URL or comma separated list of URLs (coordinators in a cluster) `eg. "http://127.0.0.1:8529"` or `"http://127.0.0.1:8529,http://127.0.0.1:8530 so on.`
Specify ArangoDB Host URL or comma separated list of URLs (coordinators in a cluster),
e.g. ``http://127.0.0.1:8529`` or ``http://127.0.0.1:8529,http://127.0.0.1:8530``.
ArangoDB Database/Schema (required)
Specify `Database/Schema` for the ArangoDB. eg. `_system`
Specify **Database/Schema** for the ArangoDB. eg. ``_system``.
ArangoDB Username (required)
Specify `username` for the ArangoDB, eg. `root`
Specify **username** for the ArangoDB, e.g. ``root``.
ArangoDB Password (required)
Specify `password` for the ArangoDB
Specify **password** for the ArangoDB
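
For orientation only (not part of the commit), the fields above could also be declared programmatically with Airflow's generic Connection model; all values are placeholders:

from airflow.models.connection import Connection

arangodb_conn = Connection(
    conn_id="arangodb_default",
    conn_type="arangodb",
    host="http://127.0.0.1:8529",  # single URL or comma-separated coordinator URLs
    schema="_system",              # ArangoDB database/schema
    login="root",
    password="password",
)
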
12 changes: 6 additions & 6 deletions docs/apache-airflow-providers-arangodb/operators/index.rst
@@ -21,14 +21,14 @@

Operators
=======================
You can build your own Operator hook in :class:`~airflow.providers.arangodb.hooks.ArangoDBHook`,
You can build your own Operator hook in :class:`~airflow.providers.arangodb.hooks.arangodb.ArangoDBHook`.


Use the :class:`~airflow.providers.arangodb.operators.AQLOperator` to execute
Use the :class:`~airflow.providers.arangodb.operators.arangodb.AQLOperator` to execute
AQL query in `ArangoDB <https://www.arangodb.com/>`__.

You can further process your result using :class:`~airflow.providers.arangodb.operators.AQLOperator` and
further process the result using **result_processor** Callable as you like.
You can further process your result using :class:`~airflow.providers.arangodb.operators.arangodb.AQLOperator` and
further process the result using :class:`result_processor <airflow.providers.arangodb.operators.arangodb.AQLOperator>`
Callable as you like.

An example of Listing all Documents in **students** collection can be implemented as following:

@@ -48,7 +48,7 @@ please provide **template_searchpath** while creating **DAG** object,
Sensors
=======

Use the :class:`~airflow.providers.arangodb.sensors.AQLSensor` to wait for a document or collection using
Use the :class:`~airflow.providers.arangodb.sensors.arangodb.AQLSensor` to wait for a document or collection using
AQL query in `ArangoDB <https://www.arangodb.com/>`__.

An example for waiting a document in **students** collection with student name **judy** can be implemented as following:
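
A hedged sketch of the AQLOperator / result_processor pattern described above (not part of the diff; the query and the processing lambda are placeholders):

from airflow.providers.arangodb.operators.arangodb import AQLOperator

list_students = AQLOperator(
    task_id="list_students",
    query="FOR doc IN students RETURN doc",
    result_processor=lambda cursor: print([doc for doc in cursor]),  # post-process the AQL cursor
    arangodb_conn_id="arangodb_default",
)
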
2 changes: 1 addition & 1 deletion docs/apache-airflow-providers-asana/index.rst
@@ -27,7 +27,7 @@ Content
:caption: Guides

Connection types <connections/asana>
Operators <operators/asana>
Operators <operators/index>

.. toctree::
:maxdepth: 1
11 changes: 5 additions & 6 deletions docs/apache-airflow-providers-asana/operators/asana.rst
@@ -16,7 +16,6 @@
under the License.
.. _howto/operator:AsanaCreateTaskOperator:

AsanaCreateTaskOperator
@@ -27,7 +26,7 @@ create an Asana task.


Using the Operator
^^^^^^^^^^^^^^^^^^
------------------

The AsanaCreateTaskOperator minimally requires the new task's name and
the Asana connection to use to connect to your account (``conn_id``). There are many other
@@ -46,7 +45,7 @@ delete an existing Asana task.


Using the Operator
^^^^^^^^^^^^^^^^^^
------------------

The AsanaDeleteTaskOperator requires the task id to delete. Use the ``conn_id``
parameter to specify the Asana connection to use to connect to your account.
@@ -55,14 +54,14 @@ parameter to specify the Asana connection to use to connect to your account.
.. _howto/operator:AsanaFindTaskOperator:

AsanaFindTaskOperator
=======================
=====================

Use the :class:`~airflow.providers.asana.operators.AsanaFindTaskOperator` to
search for Asana tasks that fit some criteria.


Using the Operator
^^^^^^^^^^^^^^^^^^
------------------

The AsanaFindTaskOperator requires a dict of search parameters following the description
`here <https://developers.asana.com/docs/get-multiple-tasks>`_.
@@ -80,7 +79,7 @@ update an existing Asana task.


Using the Operator
^^^^^^^^^^^^^^^^^^
------------------

The AsanaUpdateTaskOperator minimally requires the task id to update and
the Asana connection to use to connect to your account (``conn_id``). There are many other
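
As a rough illustration of the create-task case above (not part of the commit; the module path ``asana_tasks`` and all values are assumptions based on the provider's layout):

from airflow.providers.asana.operators.asana_tasks import AsanaCreateTaskOperator

create_asana_task = AsanaCreateTaskOperator(
    task_id="create_asana_task",
    conn_id="asana_default",
    name="New task created from Airflow",           # minimally required: the new task's name
    task_parameters={"notes": "Created by a DAG"},  # optional extra Asana task fields
)
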
26 changes: 26 additions & 0 deletions docs/apache-airflow-providers-asana/operators/index.rst
@@ -0,0 +1,26 @@
.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
.. http://www.apache.org/licenses/LICENSE-2.0
.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
Asana Operators
===============


.. toctree::
:maxdepth: 1
:glob:

*
@@ -26,14 +26,13 @@ The Kubernetes cluster Connection type enables connection to a Kubernetes cluste
Authenticating to Kubernetes cluster
------------------------------------

There are three ways to connect to Kubernetes using Airflow.

1. Use kube_config that reside in the default location on the machine(~/.kube/config) - just leave all fields empty
2. Use in_cluster config, if Airflow runs inside Kubernetes cluster take the configuration from the cluster - mark:
In cluster configuration
3. Use kube_config from different location - insert the path into ``Kube config path``
4. Use kube_config in JSON format from connection configuration - paste kube_config into
``Kube config (JSON format)``
There are different ways to connect to Kubernetes using Airflow.

#. Use kube_config that reside in the default location on the machine(~/.kube/config) - just leave all fields empty
#. Use in_cluster config, if Airflow runs inside Kubernetes cluster take the configuration from the cluster - mark:
In cluster configuration
#. Use kube_config from different location - insert the path into ``Kube config path``
#. Use kube_config in JSON format from connection configuration - paste kube_config into ``Kube config (JSON format)``


Default Connection IDs
@@ -34,7 +34,8 @@ There are several ways to connect to Databricks using Airflow.
i.e. add a token to the Airflow connection. This is the recommended method.
2. Use Databricks login credentials
i.e. add the username and password used to login to the Databricks account to the Airflow connection.
Note that username/password authentication is discouraged and not supported for ``DatabricksSqlOperator``.
Note that username/password authentication is discouraged and not supported for
:class:`~airflow.providers.databricks.operators.databricks_sql.DatabricksSqlOperator`.
3. Using Azure Active Directory (AAD) token generated from Azure Service Principal's ID and secret
(only on Azure Databricks). Service principal could be defined as a
`user inside workspace <https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/aad/service-prin-aad-token#--api-access-for-service-principals-that-are-azure-databricks-workspace-users-and-admins>`_, or `outside of workspace having Owner or Contributor permissions <https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/aad/service-prin-aad-token#--api-access-for-service-principals-that-are-not-workspace-users>`_
@@ -83,7 +84,8 @@ Extra (optional)
* ``azure_resource_id``: optional Resource ID of the Azure Databricks workspace (required if managed identity isn't
a user inside workspace)

Following parameters could be set when using ``DatabricksSqlOperator``:
Following parameters could be set when using
:class:`~airflow.providers.databricks.operators.databricks_sql.DatabricksSqlOperator`:

* ``http_path``: optional HTTP path of Databricks SQL endpoint or Databricks cluster. See `documentation <https://docs.databricks.com/dev-tools/python-sql-connector.html#get-started>`_.
* ``session_configuration``: optional map containing Spark session configuration parameters.
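
A minimal sketch of the DatabricksSqlOperator extras discussed above (not part of the diff; the http_path and query are placeholders):

from airflow.providers.databricks.operators.databricks_sql import DatabricksSqlOperator

select_one = DatabricksSqlOperator(
    task_id="select_one",
    databricks_conn_id="databricks_default",
    http_path="/sql/1.0/warehouses/abc123",  # HTTP path of a SQL endpoint or cluster (placeholder)
    sql="SELECT 1",
)
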
@@ -25,10 +25,11 @@ Elasticsearch Hook that is using the native Python client to communicate with El
Parameters
------------
hosts
A list of a single or many Elasticsearch instances. Example: ["http://localhost:9200"]
A list of a single or many Elasticsearch instances. Example: ``["http://localhost:9200"]``.
es_conn_args
Additional arguments you might need to enter to connect to Elasticsearch.
Example: {"ca_cert":"/path/to/cert", "basic_auth": "(user, pass)"}
Example: ``{"ca_cert":"/path/to/cert", "basic_auth": "(user, pass)"}``

For all possible configurations, consult with Elasticsearch documentation.
Reference: https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/connecting.html

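
For orientation only (not part of the commit), instantiating the hook with the parameters listed above might look like this; the search helper and its signature are assumptions, and the certificate path is a placeholder:

from airflow.providers.elasticsearch.hooks.elasticsearch import ElasticsearchPythonHook

es_hook = ElasticsearchPythonHook(
    hosts=["http://localhost:9200"],
    es_conn_args={"ca_certs": "/path/to/cert"},  # extra arguments passed to the native client
)
# Assumed helper for running a query against an index:
hits = es_hook.search(index="test_index", query={"query": {"match_all": {}}})
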
17 changes: 11 additions & 6 deletions docs/apache-airflow-providers-github/operators/index.rst
@@ -20,12 +20,16 @@
Operators
=========

Use the :class:`~airflow.providers.github.operators.GithubOperator` to execute
Use the :class:`~airflow.providers.github.operators.github.GithubOperator` to execute
Operations in a `GitHub <https://www.github.com/>`__.

You can build your own operator using :class:`~airflow.providers.github.operators.GithubOperator`
and passing **github_method** and **github_method_args** from top level `PyGithub <https://pygithub.readthedocs.io/>`__ methods.
You can further process the result using **result_processor** Callable as you like.
You can build your own operator using :class:`~airflow.providers.github.operators.github.GithubOperator`
and passing :class:`github_method <airflow.providers.github.operators.github.GithubOperator>`
and :class:`github_method_args <airflow.providers.github.operators.github.GithubOperator>`
from top level `PyGithub <https://pygithub.readthedocs.io/>`__ methods.

You can further process the result using
:class:`result_processor <airflow.providers.github.operators.github.GithubOperator>` Callable as you like.

An example of Listing all Repositories owned by a user, **client.get_user().get_repos()** can be implemented as following:

@@ -55,7 +59,7 @@ You can also implement your own sensor on Repository using :class:`~airflow.prov
an example of this is :class:`~airflow.providers.github.sensors.GithubTagSensor`


Use the :class:`~airflow.providers.github.sensors.GithubTagSensor` to wait for creation of
Use the :class:`~airflow.providers.github.sensors.github.GithubTagSensor` to wait for creation of
a Tag in `GitHub <https://www.github.com/>`__.

An example for tag **v1.0**:
@@ -66,7 +70,8 @@ An example for tag **v1.0**:
:start-after: [START howto_tag_sensor_github]
:end-before: [END howto_tag_sensor_github]

Similar Functionality can be achieved by directly using :class:`~airflow.providers.github.sensors.GithubSensor` ,
Similar Functionality can be achieved by directly using
:class:`~from airflow.providers.github.sensors.github.GithubSensor`.

.. exampleinclude:: /../../tests/system/providers/github/example_github.py
:language: python
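
A hedged sketch of the github_method / result_processor pattern described above (not part of the diff; the processing lambda is a placeholder):

from airflow.providers.github.operators.github import GithubOperator

list_repositories = GithubOperator(
    task_id="list_repositories",
    github_method="get_user",  # top-level PyGithub client method to call
    result_processor=lambda user: [repo.full_name for repo in user.get_repos()],
)
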
22 changes: 10 additions & 12 deletions docs/apache-airflow-providers-influxdb/connections/influxdb.rst
@@ -31,19 +31,17 @@ Extra (required)
Specify the extra parameters (as json dictionary) that can be used in InfluxDB
connection.

The following extras are required:
``token``: (required) `Create token <https://docs.influxdata.com/influxdb/cloud/security/tokens/create-token/>`_
using the influxdb cli or UI

- token - Create token - https://docs.influxdata.com/influxdb/cloud/security/tokens/create-token/
- org_name - Create organization - https://docs.influxdata.com/influxdb/cloud/reference/cli/influx/org/create/
``org_name``: (required) `Create org <https://docs.influxdata.com/influxdb/cloud/reference/cli/influx/org/create/>`_
name using influxdb cli or UI

* ``token``: Create token using the influxdb cli or UI
* ``org_name``: Create org name using influxdb cli or UI
Example "extras" field:

Example "extras" field:
.. code-block:: JSON
.. code-block:: JSON
{
"token": "343434343423234234234343434",
"org_name": "Test"
}
{
"token": "343434343423234234234343434",
"org_name": "Test"
}
2 changes: 1 addition & 1 deletion docs/apache-airflow-providers-influxdb/operators/index.rst
@@ -22,7 +22,7 @@
InfluxDBOperator
=================

Use the :class:`~airflow.providers.influxdb.operators.InfluxDBOperator` to execute
Use the :class:`~airflow.providers.influxdb.operators.influxdb.InfluxDBOperator` to execute
SQL commands in a `InfluxDB <https://www.influxdata.com/>`__ database.

An example of running the query using the operator:
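
As an illustration of the operator referenced above (not part of the commit; the Flux query and bucket name are placeholders):

from airflow.providers.influxdb.operators.influxdb import InfluxDBOperator

query_influxdb = InfluxDBOperator(
    task_id="query_influxdb",
    influxdb_conn_id="influxdb_default",
    sql='from(bucket:"test-bucket") |> range(start: -10m)',  # Flux query (placeholder)
)
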
6 changes: 0 additions & 6 deletions docs/apache-airflow-providers-neo4j/connections/neo4j.rst
@@ -43,12 +43,6 @@ Extra (optional)
connection.

The following extras are supported:

- Default - uses bolt scheme(bolt://)
- neo4j_scheme - neo4j://
- certs_self_signed - neo4j+ssc://
- certs_trusted_ca - neo4j+s://

* ``encrypted``: Sets encrypted=True/False for GraphDatabase.driver, Set to ``True`` for Neo4j Aura.
* ``neo4j_scheme``: Specifies the scheme to ``neo4j://``, default is ``bolt://``
* ``certs_self_signed``: Sets the URI scheme to support self-signed certificates(``neo4j+ssc://``)
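
For orientation only (not part of the diff), a connection carrying the extras listed above could be declared programmatically; treating the extras as booleans is an assumption, and all values are placeholders:

import json

from airflow.models.connection import Connection

neo4j_conn = Connection(
    conn_id="neo4j_default",
    conn_type="neo4j",
    host="localhost",
    port=7687,
    login="neo4j",
    password="password",
    extra=json.dumps({"neo4j_scheme": True, "encrypted": False}),  # use neo4j:// instead of bolt://
)
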
13 changes: 7 additions & 6 deletions docs/apache-airflow-providers-oracle/connections/oracle.rst
@@ -53,20 +53,21 @@ Extra (optional)
The maximum length for this string is 48 and if you exceed this length you will get ORA-24960.
* ``thick_mode`` (bool) - Specify whether to use python-oracledb in thick mode. Defaults to False.
If set to True, you must have the Oracle Client libraries installed.
See `oracledb docs<https://python-oracledb.readthedocs.io/en/latest/user_guide/initialization.html>` for more info.
See `oracledb docs <https://python-oracledb.readthedocs.io/en/latest/user_guide/initialization.html>`__ for more info.
* ``thick_mode_lib_dir`` (str) - Path to use to find the Oracle Client libraries when using thick mode.
If not specified, defaults to the standard way of locating the Oracle Client library on the OS.
See `oracledb docs<https://python-oracledb.readthedocs.io/en/latest/user_guide/initialization.html#setting-the-oracle-client-library-directory>` for more info.
See `oracledb docs <https://python-oracledb.readthedocs.io/en/latest/user_guide/initialization.html#setting-the-oracle-client-library-directory>`__ for more info.
* ``thick_mode_config_dir`` (str) - Path to use to find the Oracle Client library configuration files when using thick mode.
If not specified, defaults to the standard way of locating the Oracle Client library configuration files on the OS.
See `oracledb docs<https://python-oracledb.readthedocs.io/en/latest/user_guide/initialization.html#optional-oracle-net-configuration-files>` for more info.
See `oracledb docs <https://python-oracledb.readthedocs.io/en/latest/user_guide/initialization.html#optional-oracle-net-configuration-files>`__ for more info.
* ``fetch_decimals`` (bool) - Specify whether numbers should be fetched as ``decimal.Decimal`` values.
See `defaults.fetch_decimals<https://python-oracledb.readthedocs.io/en/latest/api_manual/defaults.html#defaults.fetch_decimals>` for more info.
See `defaults.fetch_decimals <https://python-oracledb.readthedocs.io/en/latest/api_manual/defaults.html#defaults.fetch_decimals>`_ for more info.
* ``fetch_lobs`` (bool) - Specify whether to fetch strings/bytes for CLOBs or BLOBs instead of locators.
See `defaults.fetch_lobs<https://python-oracledb.readthedocs.io/en/latest/api_manual/defaults.html#defaults.fetch_decimals>` for more info.
See `defaults.fetch_lobs <https://python-oracledb.readthedocs.io/en/latest/api_manual/defaults.html#defaults.fetch_decimals>`_ for more info.


Connect using `dsn`, Host and `sid`, Host and `service_name`, or only Host `(OracleHook.getconn Documentation) <https://airflow.apache.org/docs/apache-airflow-providers-oracle/stable/_modules/airflow/providers/oracle/hooks/oracle.html#OracleHook.get_conn>`_.
Connect using ``dsn``, Host and ``sid``, Host and ``service_name``,
or only Host `(OracleHook.getconn Documentation) <https://airflow.apache.org/docs/apache-airflow-providers-oracle/stable/_modules/airflow/providers/oracle/hooks/oracle.html#OracleHook.get_conn>`_.

For example:

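
A hedged sketch of the thick-mode extras described above (not part of the commit; the host, service name, and client library directory are placeholders):

import json

from airflow.models.connection import Connection

oracle_conn = Connection(
    conn_id="oracle_default",
    conn_type="oracle",
    host="oracle.example.com",
    login="system",
    password="password",
    extra=json.dumps({
        "service_name": "ORCLPDB1",                             # connect via Host + service_name
        "thick_mode": True,                                     # requires Oracle Client libraries
        "thick_mode_lib_dir": "/opt/oracle/instantclient_21_8",
        "fetch_decimals": True,
    }),
)
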
