Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support retriveing extra fields which do not have extra prefix set - Azure ADF DAG failure #899

Merged
merged 3 commits into from
Feb 27, 2023

Conversation

pankajkoti
Copy link
Collaborator

@pankajkoti pankajkoti commented Feb 27, 2023

With the 5.0.0 release of the Microsoft Azure PR, the PR apache/airflow#27047 introduced a breaking change where while creating new connections, the extra__ prefix is no longer set for extra fields in the conneciton. This issue was not identified with testing the 5.0.0 RC because, it only happens for new connections that are created. The existing connections still contain the extra fiels with the extra__ prefix. Hence, the existing code which looks for the connection field with the prefix extra__azure_data_factory__subscriptionId works on the older deployment with the new provider release as the connection was created before the release, but it fails on new deployments when a fresh connection is created.

To fix this, we're removing the prefix while retrieving the connection field and at the same time we're supporting previously created connections by using the same get_field method from Airflow OSS introduced in the same PR above to allow backward compatibility.

closes: #897

With the 5.0.0 release of the Microsoft Azure PR, the PR apache/airflow#27758
introduced a breaking change where while creating new connections,
the ``extra__`` prefix is no longer set for extra fields in the conneciton.
This issue was not identified with testing the 5.0.0 RC because, it
only happens for new connections that are created. The existing connections
still contain the extra fiels with the ``extra__`` prefix. Hence, the
existing code which looks for the connection field with the prefix
``extra__azure_data_factory__subscriptionId`` works on the older deployment
with the new provider release as the connection was created before the release,
but it fails on new deployments when a fresh connection is created.

To fix this, we're removing the prefix while retrieving the connection
field and at the same time we're supporting previously created connections
by using the same ``get_field`` method from Airflow OSS introduced
in the same PR above to allow backward compatibility.
@pankajkoti pankajkoti changed the title Support retriveing extra fields which do not have extra prefix set Support retriveing extra fields which do not have extra prefix set - Azure ADF DAG failure Feb 27, 2023
@codecov
Copy link

codecov bot commented Feb 27, 2023

Codecov Report

Base: 98.61% // Head: 98.61% // Increases project coverage by +0.00% 🎉

Coverage data is based on head (029bf25) compared to base (3606dfa).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #899   +/-   ##
=======================================
  Coverage   98.61%   98.61%           
=======================================
  Files          89       89           
  Lines        4974     4975    +1     
=======================================
+ Hits         4905     4906    +1     
  Misses         69       69           
Impacted Files Coverage Δ
...er/providers/microsoft/azure/hooks/data_factory.py 98.43% <100.00%> (+0.02%) ⬆️
astronomer/providers/microsoft/azure/hooks/wasb.py 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@rajaths010494
Copy link
Contributor

@pankajkoti good that we got the root cause of the issue and fixed it.

@pankajkoti
Copy link
Collaborator Author

I am merging this PR to test deployment from CircleCI main branch to new Astronomer Validation Org deployment. I will address comments if any that come later in a separate PR.

cc: @phanikumv @pankajastro

@pankajkoti pankajkoti merged commit 4767f18 into main Feb 27, 2023
@pankajkoti pankajkoti deleted the 897-fix-adf-dag branch February 27, 2023 11:45
import inspect
from functools import wraps
from typing import Any, Optional, TypeVar, Union, cast

from airflow.exceptions import AirflowException
from airflow.providers.microsoft.azure.hooks.data_factory import AzureDataFactoryHook
from airflow.providers.microsoft.azure.hooks.data_factory import AzureDataFactoryHook, get_field
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Importing this will fail on v4 of the provider I think - is that a problem? Is this meant to only work with v5+?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's meant to support both versions. Thanks, @ashb , I will create another PR to fix this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a PR #900 to fix this.

pankajkoti added a commit that referenced this pull request Feb 27, 2023
PR #899 introduced an import for method ``get_field`` and started
its usage. However, as reviewed and suggested by @ashb this will
fail on providers with version < 5.0.0 as it was only the release
5.0.0 which included the ``get_field`` method. To solve this,
we remove the dependency on the imported method and introduce the
same method in local module.
pankajkoti added a commit that referenced this pull request Feb 27, 2023
* Fix get_field import availability - PR #899 followup

PR #899 introduced an import for method ``get_field`` and started
its usage. However, as reviewed and suggested by @ashb this will
fail on providers with version < 5.0.0 as it was only the release
5.0.0 which included the ``get_field`` method. To solve this,
we remove the dependency on the imported method and introduce the
same method in local module.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix Microsoft Azure Provider example DAG to run ADF pipeline
3 participants