Closed
84 commits
d26dca8
refactor: Added Apache Arrow provider and implemented basic AdbcHook
davidblain-infrabel Jun 27, 2025
a42d821
Merge branch 'main' into feature/apache-arrow-provider
dabla Jun 27, 2025
d44a7f2
Merge branch 'main' into feature/apache-arrow-provider
dabla Jun 28, 2025
b7ec972
refactor: Fixed adbc connection docs, still needs additional document…
davidblain-infrabel Jun 28, 2025
61f480e
refactor: Updated link to ADBC site
davidblain-infrabel Jun 28, 2025
6171a75
refactor: Fixed version in pyproject.toml
davidblain-infrabel Jun 28, 2025
edc7408
refactor: Fixed breeze test_docs_filter for Common SQL provider packa…
davidblain-infrabel Jun 28, 2025
7e0b3ad
Merge branch 'main' into feature/apache-arrow-provider
dabla Jun 30, 2025
f0aba05
Merge branch 'main' into feature/apache-arrow-provider
dabla Jun 30, 2025
a95ba2a
Merge branch 'main' into feature/apache-arrow-provider
dabla Jul 2, 2025
a2695ef
Merge branch 'main' into feature/apache-arrow-provider
dabla Jul 3, 2025
6c307d9
Merge branch 'main' into feature/apache-arrow-provider
dabla Jul 3, 2025
565bbd2
Merge branch 'main' into feature/apache-arrow-provider
dabla Jul 4, 2025
985163d
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 4, 2025
04163a8
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 5, 2025
80208ab
refactor: Implemented ADBC dialect
davidblain-infrabel Aug 6, 2025
b56cb0c
refactor: Implemented get_column_names in PostgresDialect so we don't…
davidblain-infrabel Aug 7, 2025
d4edee5
refactor: Removed AdbcDialect and make sure AdbcHook can work with ex…
davidblain-infrabel Aug 7, 2025
6f85e54
refactor: Don't use prepared statement in PostgresDialect for get_pri…
davidblain-infrabel Aug 12, 2025
ff296c6
refactor: Refactored insert_rows of AdbcHook using pyarrow RecordBatch
davidblain-infrabel Aug 12, 2025
1def92d
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 12, 2025
b1bad8f
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 13, 2025
55ae8ae
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 13, 2025
0b7993d
refactor: Replace native placeholders with ADBC placeholder to suppor…
davidblain-infrabel Aug 13, 2025
f2c19eb
refactor: Use prepared statements in PostgresDialect
davidblain-infrabel Aug 13, 2025
a741d46
refactor: Fixed placeholders in prepared statements in PostgresDialect
davidblain-infrabel Aug 13, 2025
fcb1d20
refactor: Refactored _run_command in AdbcHook
davidblain-infrabel Aug 13, 2025
6fded87
refactor: Updated adbc-driver-manager and pyarrow dependencies
davidblain-infrabel Aug 13, 2025
11d21d6
refactor: Refactored insert_rows
dabla Aug 13, 2025
8ea29bf
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 13, 2025
fc6068d
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 13, 2025
a44d975
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 13, 2025
67bc924
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 13, 2025
1124109
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 14, 2025
20f3e4e
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 18, 2025
ac59e63
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 18, 2025
fda7c41
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 18, 2025
fc18a8a
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 19, 2025
59b9454
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 20, 2025
3e3e1c3
refactor: Removed apache-airflow-providers-apache-arrow from airflow …
dabla Aug 20, 2025
c1c30b6
refactor: Removed apache-airflow-providers-apache-arrow from airflow …
dabla Aug 21, 2025
596d8ed
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 21, 2025
ffc3b2c
Revert "refactor: Removed apache-airflow-providers-apache-arrow from …
dabla Aug 21, 2025
e36613a
refactor: Removed apache-airflow-providers-apache-arrow from airflow …
dabla Aug 21, 2025
a3f166f
refactor: Removed duplicate get_column_names in PostgresDialect
dabla Aug 21, 2025
59f4b39
refactor: Added missing apache arrow entries in pyproject.toml
dabla Aug 21, 2025
c8769a4
refactor: Changed signature get_records in AdbcHook
dabla Aug 21, 2025
0bb6a96
refactor: Changed minimum required pyarrow version to 18.0.0 to not h…
dabla Aug 21, 2025
0a3f462
fix: Fixed name for flit module in apache arrow
dabla Aug 21, 2025
507e5e0
fix: Fixed import of AdbcHook in TestAdbcHook
dabla Aug 21, 2025
ce8da7d
refactor: Added apache arrow to remove and test sources YAML files
dabla Aug 21, 2025
d9eccff
refactor: Updated pyproject.toml
dabla Aug 21, 2025
49d35a5
refactor: Fixed provider package name in docs of arrow
dabla Aug 21, 2025
802108b
refactor: Fixed project urls in pyproject.toml
dabla Aug 21, 2025
a688fd1
refactor: Fixed version in apache arrow provider
dabla Aug 21, 2025
3e60219
refactor: Fixed message in RuntimeError for provider info in apache a…
dabla Aug 21, 2025
519d725
refactor: Reformatted get_provider_info
dabla Aug 21, 2025
cecdcc8
refactor: Reformatted AdbcHook
dabla Aug 21, 2025
13179ad
refactor: Reorganized imports for TestAdbcHook
dabla Aug 21, 2025
24a3581
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 21, 2025
fe7d65b
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 21, 2025
600ddf6
refactor: Removed handler parameter from overridden get_records metho…
dabla Aug 21, 2025
5702b6e
refactor: Fixed include for security.rst
dabla Aug 21, 2025
954f7d6
refactor: Fixed include for installing-providers-from-sources.rst
dabla Aug 21, 2025
bfb4836
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 21, 2025
90d85a8
refactor: Fixed static checks and mypy issues on AdbcHook
dabla Aug 22, 2025
6f080bf
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 22, 2025
bde3082
refactor: Fxied fetch_all_handler
dabla Aug 22, 2025
f857231
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 22, 2025
24f04e8
Merge branch 'main' into feature/apache-arrow-provider
dabla Aug 28, 2025
d73a2b9
Merge branch 'main' into feature/apache-arrow-provider
dabla Sep 9, 2025
442ccb6
Merge branch 'main' into feature/apache-arrow-provider
dabla Sep 13, 2025
aa0301e
Merge branch 'main' into feature/apache-arrow-provider
dabla Sep 13, 2025
899dc5e
Merge branch 'main' into feature/apache-arrow-provider
dabla Sep 13, 2025
fd4f318
Merge branch 'main' into feature/apache-arrow-provider
dabla Sep 14, 2025
fd77c65
Merge branch 'main' into feature/apache-arrow-provider
dabla Sep 15, 2025
8629eea
Merge branch 'main' into feature/apache-arrow-provider
dabla Sep 17, 2025
bae2196
Merge branch 'main' into feature/apache-arrow-provider
dabla Sep 22, 2025
a62c473
Merge branch 'main' into feature/apache-arrow-provider
dabla Sep 24, 2025
7751929
Merge branch 'main' into feature/apache-arrow-provider
dabla Sep 25, 2025
a49e018
refactor: Changed required python version check
dabla Sep 25, 2025
312410f
refactor: Uncommented apache-arrow in pyproject.toml
dabla Sep 25, 2025
5a14869
refactor: Removed min version from arrow
dabla Sep 28, 2025
f041530
refactor: Added missing max_retry_delay property to MappedOperator model
dabla Oct 4, 2025
4 changes: 4 additions & 0 deletions airflow-core/src/airflow/models/mappedoperator.py
Expand Up @@ -280,6 +280,10 @@ def retry_delay(self) -> datetime.timedelta:
    def retry_exponential_backoff(self) -> bool:
        return bool(self.partial_kwargs.get("retry_exponential_backoff"))

    @property
    def max_retry_delay(self) -> datetime.timedelta | None:
        return self.partial_kwargs.get("max_retry_delay")

    @property
    def weight_rule(self) -> PriorityWeightStrategy:
        return validate_and_load_priority_weight_strategy(
Expand Down
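The added property follows the same pattern as its neighbours: the value is read straight from the dictionary of arguments passed to ``.partial()``. The following is a minimal sketch of that pattern, not Airflow code. ``MappedOperatorSketch``, ``capped_delay``, the 300-second fallback, and the plain doubling rule are all assumptions made for illustration; Airflow's real retry calculation is more involved.

```python
from datetime import timedelta
from typing import Any, Optional


class MappedOperatorSketch:
    """Hypothetical stand-in for illustration, not Airflow's MappedOperator."""

    def __init__(self, **partial_kwargs: Any) -> None:
        # Arguments given to .partial() are kept in a plain dict.
        self.partial_kwargs = partial_kwargs

    @property
    def retry_delay(self) -> timedelta:
        # Assumed fallback of 300 seconds, chosen only for this sketch.
        return self.partial_kwargs.get("retry_delay", timedelta(seconds=300))

    @property
    def max_retry_delay(self) -> Optional[timedelta]:
        # dict.get returns None when the key was never supplied.
        return self.partial_kwargs.get("max_retry_delay")


def capped_delay(op: MappedOperatorSketch, try_number: int) -> timedelta:
    """Simplified doubling backoff that is capped when a maximum is set."""
    delay = op.retry_delay * (2 ** (try_number - 1))
    if op.max_retry_delay is not None:
        delay = min(delay, op.max_retry_delay)
    return delay
```

Returning ``None`` for a missing key lets callers tell "no cap configured" apart from a configured cap without a separate flag.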
61 changes: 61 additions & 0 deletions airflow-core/tests/unit/models/test_mappedoperator.py
Expand Up @@ -18,11 +18,13 @@
from __future__ import annotations

from collections import defaultdict
from datetime import timedelta
from typing import TYPE_CHECKING
from unittest import mock
from unittest.mock import patch

import pytest
from airflow.models.mappedoperator import MappedOperator
from sqlalchemy import select

from airflow.exceptions import AirflowSkipException
Expand All @@ -31,6 +33,8 @@
from airflow.models.taskmap import TaskMap
from airflow.providers.standard.operators.python import PythonOperator
from airflow.sdk import DAG, BaseOperator, TaskGroup, setup, task, task_group, teardown
from airflow.serialization.serialized_objects import SerializedBaseOperator
from airflow.task.priority_strategy import PriorityWeightStrategy
from airflow.task.trigger_rule import TriggerRule
from airflow.utils.state import TaskInstanceState

Expand Down Expand Up @@ -1401,3 +1405,60 @@ def t3(a):
dr.task_instance_scheduling_decisions()
ti3 = dr.get_task_instance(task_id="tg1.t3")
assert not ti3.state


def test_properties():
    op = PythonOperator.partial(
        task_id="mapped",
        python_callable=print,
        email="email",
        execution_timeout=timedelta(seconds=10),
        retry_delay=timedelta(seconds=5),
        max_retry_delay=timedelta(seconds=60),
        retry_exponential_backoff=True,
        max_active_tis_per_dag=1,
        max_active_tis_per_dagrun=2,
        run_as_user="user",
    ).expand(op_args=["Hello", "world"])
    assert op.operator_name == PythonOperator.__name__
    assert op.roots == [op]
    assert op.leaves == [op]
    assert op.task_display_name == "mapped"
    assert op.owner == SerializedBaseOperator.owner
    assert op.email == "email"
    assert op.email_on_failure
    assert op.email_on_retry
    assert not op.map_index_template
    assert op.trigger_rule == SerializedBaseOperator.trigger_rule
    assert not op.is_setup
    assert not op.is_teardown
    assert not op.depends_on_past
    assert op.ignore_first_depends_on_past == bool(SerializedBaseOperator.ignore_first_depends_on_past)
    assert not op.wait_for_downstream
    assert op.retries == SerializedBaseOperator.retries
    assert op.queue == SerializedBaseOperator.queue
    assert op.pool == SerializedBaseOperator.pool
    assert op.pool_slots == SerializedBaseOperator.pool_slots
    assert op.execution_timeout == timedelta(seconds=10)
    assert op.max_retry_delay == timedelta(seconds=60)
    assert op.retry_delay == timedelta(seconds=5)
    assert op.retry_exponential_backoff
    assert op.priority_weight == SerializedBaseOperator.priority_weight
    assert isinstance(op.weight_rule, PriorityWeightStrategy)
    assert op.max_active_tis_per_dag == 1
    assert op.max_active_tis_per_dagrun == 2
    assert not op.resources
    assert not op.has_on_execute_callback
    assert not op.has_on_failure_callback
    assert not op.has_on_retry_callback
    assert not op.has_on_success_callback
    assert not op.has_on_skipped_callback
    assert op.run_as_user == "user"
    assert not op.executor_config
    assert not op.inlets
    assert not op.outlets
    assert not op.doc
    assert not op.doc_md
    assert not op.doc_json
    assert not op.doc_yaml
    assert not op.doc_rst
2 changes: 1 addition & 1 deletion dev/breeze/tests/test_selective_checks.py
Expand Up @@ -2125,7 +2125,7 @@ def test_upgrade_to_newer_dependencies(
pytest.param(
("providers/common/sql/src/airflow/providers/common/sql/common_sql_python.py",),
{
-"docs-list-as-string": "amazon apache.drill apache.druid apache.hive "
+"docs-list-as-string": "amazon apache.arrow apache.drill apache.druid apache.hive "
"apache.impala apache.pinot common.sql databricks elasticsearch "
"exasol google jdbc microsoft.mssql mysql odbc openlineage "
"oracle pgvector postgres presto slack snowflake sqlite teradata trino vertica ydb",
Expand Down
81 changes: 81 additions & 0 deletions providers/apache/arrow/README.rst
@@ -0,0 +1,81 @@

.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

.. http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.

.. NOTE! THIS FILE IS AUTOMATICALLY GENERATED AND WILL BE OVERWRITTEN!

.. IF YOU WANT TO MODIFY TEMPLATE FOR THIS FILE, YOU SHOULD MODIFY THE TEMPLATE
``PROVIDER_README_TEMPLATE.rst.jinja2`` IN the ``dev/breeze/src/airflow_breeze/templates`` DIRECTORY

Package ``apache-airflow-providers-apache-arrow``

Release: ``1.0.0``


`Apache Arrow <https://arrow.apache.org/>`__


Provider package
----------------

This is a provider package for the ``apache.arrow`` provider. All classes for this provider package
are in the ``airflow.providers.apache.arrow`` Python package.

You can find package information and changelog for the provider
in the `documentation <https://airflow.apache.org/docs/apache-airflow-providers-apache-arrow/1.0.0/>`_.

Installation
------------

You can install this package on top of an existing Airflow 2 installation (see ``Requirements`` below
for the minimum Airflow version supported) via
``pip install apache-airflow-providers-apache-arrow``

The package supports the following Python versions: 3.9, 3.10, 3.11, 3.12

Requirements
------------

=======================================  ==================
PIP package                              Version required
=======================================  ==================
``apache-airflow``                       ``>=2.10.0``
``apache-airflow-providers-common-sql``  ``>=1.20.0``
``adbc-driver-manager``                  ``>=1.6.0``
=======================================  ==================

Cross provider package dependencies
-----------------------------------

These are dependencies that might be needed in order to use all the features of the package.
You need to install the specified providers in order to use them.

You can install such cross-provider dependencies when installing from PyPI. For example:

.. code-block:: bash

    pip install apache-airflow-providers-apache-arrow[common.sql]


============================================================================================================  ==============
Dependent package                                                                                             Extra
============================================================================================================  ==============
`apache-airflow-providers-common-sql <https://airflow.apache.org/docs/apache-airflow-providers-common-sql>`_  ``common.sql``
============================================================================================================  ==============

The changelog for the provider package can be found in the
`changelog <https://airflow.apache.org/docs/apache-airflow-providers-apache-arrow/1.0.0/changelog.html>`_.
1 change: 1 addition & 0 deletions providers/apache/arrow/docs/.latest-doc-only-change.txt
@@ -0,0 +1 @@
7b2ec33c7ad4998d9c9735b79593fcdcd3b9dd1f
49 changes: 49 additions & 0 deletions providers/apache/arrow/docs/changelog.rst
@@ -0,0 +1,49 @@
.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

.. http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.


.. NOTE TO CONTRIBUTORS:
Please, only add notes to the Changelog just below the "Changelog" header when there are some breaking changes
and you want to add an explanation to the users on how they are supposed to deal with them.
The changelog is updated and maintained semi-automatically by release manager.

``apache-airflow-providers-apache-arrow``

Changelog
---------

1.0.0
.....

.. note::
    This release of the provider is only available for Airflow 2.10+ as explained in the
    `Apache Airflow providers support policy <https://github.com/apache/airflow/blob/main/PROVIDERS.rst#minimum-supported-version-of-airflow-for-community-managed-providers>`_.

Misc
~~~~

* ``Bump min Airflow version in providers to 2.10 (#49843)``

.. Below changes are excluded from the changelog. Move them to
appropriate section above if needed. Do not delete the lines(!):
* ``Update description of provider.yaml dependencies (#50231)``
* ``Avoid committing history for providers (#49907)``

1.0.0
.....

Initial version of the provider.
35 changes: 35 additions & 0 deletions providers/apache/arrow/docs/commits.rst
@@ -0,0 +1,35 @@

.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

.. http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.

.. NOTE! THIS FILE IS AUTOMATICALLY GENERATED AND WILL BE OVERWRITTEN!

.. IF YOU WANT TO MODIFY THIS FILE, YOU SHOULD MODIFY THE TEMPLATE
`PROVIDER_COMMITS_TEMPLATE.rst.jinja2` IN the `dev/breeze/src/airflow_breeze/templates` DIRECTORY

.. THE REMAINDER OF THE FILE IS AUTOMATICALLY GENERATED. IT WILL BE OVERWRITTEN!

Package apache-airflow-providers-apache-arrow
------------------------------------------------------

`ADBC: Arrow Database Connectivity <https://github.com/apache/arrow-adbc/>`__


This is a detailed commit list of changes for versions of the provider package: ``apache.arrow``.
For high-level changelog, see :doc:`package information including changelog <index>`.

.. airflow-providers-commits::
27 changes: 27 additions & 0 deletions providers/apache/arrow/docs/conf.py
@@ -0,0 +1,27 @@
# Disable Flake8 because of all the sphinx imports
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Configuration of Providers docs building."""

from __future__ import annotations

import os

os.environ["AIRFLOW_PACKAGE_NAME"] = "apache-airflow-providers-apache-arrow"

from docs.provider_conf import * # noqa: F403
19 changes: 19 additions & 0 deletions providers/apache/arrow/docs/configurations-ref.rst
@@ -0,0 +1,19 @@
.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

.. http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.

.. include:: /../../../devel-common/src/sphinx_exts/includes/providers-configurations-ref.rst
.. include:: /../../../devel-common/src/sphinx_exts/includes/sections-and-options.rst
23 changes: 23 additions & 0 deletions providers/apache/arrow/docs/connections/adbc.rst
@@ -0,0 +1,23 @@
.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

.. http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.

.. _howto/connection:adbc:

ADBC connection
===============

The ADBC connection type enables connecting to an ADBC data source.