Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split redshift cluster and redshift sql #20276

Merged

Conversation

dstandish
Copy link
Contributor

The first redshift hook was for managing the cluster itself. Later a hook for using the cluster was added to the same module. Better to separate these into distinct modules redshift_sql and redshift_cluster. Here we split the redshift modules for operators, hooks, and sensors.

@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Dec 14, 2021
@potiuk
Copy link
Member

potiuk commented Dec 14, 2021

The deprecation warnings should be added to "known" list and tests are failing unfortunately :)

@dstandish
Copy link
Contributor Author

The deprecation warnings should be added to "known" list and tests are failing unfortunately :)

thanks wil work on it

@dstandish
Copy link
Contributor Author

The deprecation warnings should be added to "known" list and tests are failing unfortunately :)

Hey @potiuk , can you point me the way here...

Providers manager is warning here:

                        if already_registered.hook_class_name != hook_class_name:
                            log.warning(
                                "The hook connection type '%s' is registered twice in the"
                                " package '%s' with different class names: '%s' and '%s'. "
                                " Please fix it!",
                                hook_info.connection_type,
                                package_name,
                                already_registered.hook_class_name,
                                hook_class_name,
                            )

This causes a test to fail. I did not change anything about the class names or connection types. They're now in different modules though.

Do you know what to do here? Perhaps we should not have let conn_type be redshift for the RedshiftSqlHook when it was added.

@potiuk
Copy link
Member

potiuk commented Dec 14, 2021

Providers manager is warning here:
if already_registered.hook_class_name != hook_class_name:

This warning is not in the CI. I believe this is caused by .pyc file which remained from the Hook being present in the old location when you moved the .py file (in your local development environment). This is one of the main reasons in Breeze we have this:

    # Disable writing .pyc files - slightly slower imports but not messing around when switching
    # Python version and avoids problems with root-owned .pyc files in host
    export PYTHONDONTWRITEBYTECODE=${PYTHONDONTWRITEBYTECODE:="true"}

cc: @uranusjr (our slack discussion from yesterday) -> this is one of the bad sides of the .pyc files, that's why I always disable writing .pyc files in any development environment I have. Renaming and moving modules/packages causes this kind of problems.

@dstandish
Copy link
Contributor Author

dstandish commented Dec 14, 2021

Providers manager is warning here:
if already_registered.hook_class_name != hook_class_name:

This warning is not in the CI. I believe this is caused by .pyc file which remained from the Hook being present in the old location when you moved the .py file (in your local development environment). This is one of the main reasons in Breeze we have this:

    # Disable writing .pyc files - slightly slower imports but not messing around when switching
    # Python version and avoids problems with root-owned .pyc files in host
    export PYTHONDONTWRITEBYTECODE=${PYTHONDONTWRITEBYTECODE:="true"}

cc: @uranusjr (our slack discussion from yesterday) -> this is one of the bad sides of the .pyc files, that's why I always disable writing .pyc files in any development environment I have. Renaming and moving modules/packages causes this kind of problems.

but it is in ci @potiuk 🤔

see: https://github.com/apache/airflow/runs/4515244414?check_suite_focus=true#step:6:12103

@potiuk
Copy link
Member

potiuk commented Dec 14, 2021

Ah. i see.

See the FULL stack trace there (The summary of error does not have the warning just assert - but in the output of the test you have the whole error message:

https://github.com/apache/airflow/runs/4515244414?check_suite_focus=true#step:6:4063

@potiuk
Copy link
Member

potiuk commented Dec 14, 2021

You need to unfold the failed "core" tests to see it all:

  
  tests/core/test_providers_manager.py:151: AssertionError
  ----------------------------- Captured stdout call -----------------------------
  [2021-12-14 03:43:05,390] {providers_manager.py:511} WARNING - The hook connection type 'aws' is registered twice in the package 'apache-airflow-providers-amazon' with different class names: 'airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook' and 'airflow.providers.amazon.aws.hooks.redshift_cluster.RedshiftHook'.  Please fix it!
  ------------------------------ Captured log call -------------------------------
  WARNING  airflow.providers_manager:providers_manager.py:511 The hook connection type 'aws' is registered twice in the package 'apache-airflow-providers-amazon' with different class names: 'airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook' and 'airflow.providers.amazon.aws.hooks.redshift_cluster.RedshiftHook'.  Please fix it!
  ________________________ TestProviderManager.test_hooks ________________________

Is it clearer :) ?

@eladkal
Copy link
Contributor

eladkal commented Dec 14, 2021

@dstandish would it be possible to handle the class name changes on this PR as well?
#20296
The modifications for that are relatively small.

We shouldn't have AwsRedshiftClusterSensor. It should be:
from airflow.providers.amazon.aws.sensors.redshift_cluster import RedshiftClusterSensor

airflow/contrib/hooks/redshift_hook.py Outdated Show resolved Hide resolved
@uranusjr
Copy link
Member

@dstandish There are more deprecation messages that need changing

@dstandish
Copy link
Contributor Author

@dstandish There are more deprecation messages that need changing

Ok will search again tomorrow thought I got everything

@dstandish dstandish force-pushed the split-redshift-cluster-and-redshift-sql branch from 798bbbc to 467a3e5 Compare December 16, 2021 17:26
The first redshift hook was for managing the cluster itself.  Later a hook for _using_ the cluster was added to the same module.  Better to separate these  into distinct modules redshift_sql and redshift_cluster.  Here we split the redshift modules for operators, hooks, and sensors.
@dstandish dstandish force-pushed the split-redshift-cluster-and-redshift-sql branch from 467a3e5 to 816e1ed Compare December 16, 2021 17:29
Copy link
Contributor

@eladkal eladkal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

we will need a follow up on renaming AwsRedshiftClusterSensor #20296

@dstandish dstandish merged commit bfcaf19 into apache:main Dec 16, 2021
@dstandish
Copy link
Contributor Author

thanks everyone for the help in [hopefully] getting this one right

@eladkal eladkal mentioned this pull request Dec 28, 2021
@dstandish dstandish deleted the split-redshift-cluster-and-redshift-sql branch February 14, 2022 21:46
@ephraimbuddy ephraimbuddy added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Apr 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) full tests needed We need to run full set of tests for this PR to merge kind:documentation provider:amazon-aws AWS/Amazon - related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants