Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to test connections from UI or API #15795

Merged
merged 4 commits into from
May 24, 2021

Conversation

msumit
Copy link
Contributor

@msumit msumit commented May 12, 2021

Adding test connections support from UI or API. The connection will be tried from the webserver and report the outcome.

sqlite
mysql
hive

@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:core-operators Operators, Sensors and hooks within Core Airflow area:webserver Webserver related Issues labels May 12, 2021
@potiuk
Copy link
Member

potiuk commented May 12, 2021

Nice! One small (potential) issue - which might be worth thinking about here. I think often the Webserver is limited in connectivity. I know for a fact that MWAA and Composer have a much more restricted "environment" to run webserver on due to sensitivity of the webserver (it is exposed to outside world). I am not sure if they are limiting the outgoing connections from webserver, but I think they might @subashcanapathy ? or at least might want to limit it in the future.

Also there are are number of connections that might require the worker "environment" to work on (for example GOOGLE_APPLICATION_CREDENTIALS for gcp, or worker-identity configured for pod they are running on, or Kerberos configured for workers and only for workers (I've worked for a customer that had an environment where only the workers had the credentials that allowed them to make outgoing connections and authenticate using Kerberos).

This is not a blocker for that change, but just something to remember about - that it fhe "connection" test does not work via API/Webserver (the test will be executed in the webserver instance), it does not mean that the connection is not working from workers. It might be worth-while to add a "last successful connection from worker" - so basically store when last time the connection succeeded for actual worker running. This would give more complete information about the status of the connection.

@potiuk potiuk self-requested a review May 12, 2021 10:37
@msumit
Copy link
Contributor Author

msumit commented May 12, 2021

Completely agree with you @potiuk here. In fact, I had similar questions about this feature. But then with my experience of working with a couple of companies over the past 5yrs, the webserver is running the same code of what workers are running, so the feature would be useful for such installations. For those, who don't have access from the webserver, it can act as a quick validator :).

Updating things from the workers is also a good idea & for that, we can create a decorator which automatically updates the status behind the scene. But again, what if there are 2 or more Celery workers & are running in different VPCs? Or K8S pods with different images, some with correct creds & some not?

@github-actions
Copy link

The Workflow run is cancelling this PR. Building images for the PR has failed. Follow the workflow link to check the reason.

Copy link
Member

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, love where this is heading

@kaxil
Copy link
Member

kaxil commented May 12, 2021

Nice! One small (potential) issue - which might be worth thinking about here. I think often the Webserver is limited in connectivity. I know for a fact that MWAA and Composer have a much more restricted "environment" to run webserver on due to sensitivity of the webserver (it is exposed to outside world). I am not sure if they are limiting the outgoing connections from webserver, but I think they might @subashcanapathy ? or at least might want to limit it in the future.

Also there are are number of connections that might require the worker "environment" to work on (for example GOOGLE_APPLICATION_CREDENTIALS for gcp, or worker-identity configured for pod they are running on, or Kerberos configured for workers and only for workers (I've worked for a customer that had an environment where only the workers had the credentials that allowed them to make outgoing connections and authenticate using Kerberos).

This is not a blocker for that change, but just something to remember about - that it fhe "connection" test does not work via API/Webserver (the test will be executed in the webserver instance), it does not mean that the connection is not working from workers. It might be worth-while to add a "last successful connection from worker" - so basically store when last time the connection succeeded for actual worker running. This would give more complete information about the status of the connection.

I consider this more of an "I have added a new connection now, I want to test if it is working or not" as a user can validate before they put or use the connection in their DAG. So not "when was this connection last successfully" as the connection to "test" like running a "select 1" query. We can extend this feature once this PR is merged to more connections and to "CLI".

Actually "testing" will help us propagate the error message to UI, API (later CLI) unlike clubbing it to an exiting usage of connection is some Hook or Operator.

Also, once we have all the Hooks with this method, users/companies can use it as a pre-validation task.

@potiuk
Copy link
Member

potiuk commented May 12, 2021

Completely agree with you @potiuk here. In fact, I had similar questions about this feature. But then with my experience of working with a couple of companies over the past 5yrs, the webserver is running the same code of what workers are running, so the feature would be useful for such installations. For those, who don't have access from the webserver, it can act as a quick validator :).

Yeah, agree it's an edge case. Note that it''s not only the "image" of webserver vs. worker or "packages installed" - but also the "environment" it is running at. For example in our K8S deployment we had a side-car running next to the worker that kept refreshing kerberos tokens and this side-car shared a volume with the worker (and only worker had this enabled).

But yeah I think in vast majority of cases it will work. Maybe a good solution to that will be to be very explicit about it - i.e. in a comment/tooltip/API description be very explicit that this is the status of connections from webserver and it might be different if run from the workers?

I just really want to avoid questions "My connection is reported as not working in the UI, What should i do to make it works?". It's not obvious for the potential user that this might be caused by the worker/vs. webserver configuration.

@subashcanapathy
Copy link
Contributor

It comes down to adding a note that the test/feature is subject to network egress rules setup for your webserver. MWAA and Composer(likely?) will have plugin restrictions on the webserver fleet, and additionally the customer's VPC/PrivateNetwork might pose issues reaching to connection provider.
In the default setting, if the connections are stored in the providers that comes with the installation - this should work as is. If the customer is using a connections backend that requires a pip install of additional package and/or a plugin, then this feature won't work for the customers on the webui.

@kaxil
Copy link
Member

kaxil commented May 12, 2021

It comes down to adding a note that the test/feature is subject to network egress rules setup for your webserver. MWAA and Composer(likely?) will have plugin restrictions on the webserver fleet, and additionally the customer's VPC/PrivateNetwork might pose issues reaching to connection provider.
In the default setting, if the connections are stored in the providers that comes with the installation - this should work as is. If the customer is using a connections backend that requires a pip install of additional package and/or a plugin, then this feature won't work for the customers on the webui.

Yup, since the API also only show connections from Connections saved in DB not the ones set via other secrets backend like Env Vars or Vault

@potiuk
Copy link
Member

potiuk commented May 12, 2021

Sounds good :D

@github-actions
Copy link

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.

@github-actions
Copy link

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.

@msumit msumit changed the title [WIP] Ability to test connections from UI or API Ability to test connections from UI or API May 20, 2021
@msumit
Copy link
Contributor Author

msumit commented May 20, 2021

@ashb @kaxil @potiuk @ephraimbuddy this PR can be reviewed now.

Copy link
Member

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestions

@bbovenzi bbovenzi self-requested a review May 21, 2021 12:05
@ashb
Copy link
Member

ashb commented May 21, 2021

Oh. Oh no.

By default, the REST API is set to deny_all, meaning that out of the box the test button won't work :(

@msumit
Copy link
Contributor Author

msumit commented May 21, 2021

Oh. Oh no.

By default, the REST API is set to deny_all, meaning that out of the box the test button won't work :(

Yeah, just realized this while chatting with @bbovenzi.. we can handle the error in JS & let the user know the reason..

@ashb
Copy link
Member

ashb commented May 21, 2021

I wonder if we should special case and detect the case when the API is disabled and disable the form control with a tool tip?

Copy link
Contributor

@bbovenzi bbovenzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it working locally. Looking good. Left a few comments on the javascript. Two other things:

  • the js needs a linting pass. check out other files on the use of global at the beginning of the file
  • I noticed an alert popup to log into the API. Is that part of this PR or from something else? Ideally we don't want to use popups like that.

airflow/www/static/js/connection_form.js Outdated Show resolved Hide resolved
airflow/www/static/js/connection_form.js Outdated Show resolved Hide resolved
airflow/www/static/js/connection_form.js Outdated Show resolved Hide resolved
airflow/www/static/js/connection_form.js Outdated Show resolved Hide resolved
airflow/www/static/js/connection_form.js Outdated Show resolved Hide resolved
@msumit
Copy link
Contributor Author

msumit commented May 24, 2021

I wonder if we should special case and detect the case when the API is disabled and disable the form control with a tool tip?

done

@github-actions
Copy link

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.

@github-actions
Copy link

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.

Copy link
Contributor

@bbovenzi bbovenzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

msumit and others added 2 commits May 24, 2021 18:59
Co-authored-by: Brent Bovenzi <brent.bovenzi@gmail.com>
@msumit msumit merged commit 335fa10 into apache:master May 24, 2021
@msumit msumit deleted the test_connection branch May 24, 2021 15:28
@ashb ashb added this to the Airflow 2.2 milestone Jun 22, 2021
@BasPH BasPH mentioned this pull request Aug 9, 2021
@potiuk potiuk linked an issue Oct 27, 2021 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:API Airflow's REST/HTTP API area:core-operators Operators, Sensors and hooks within Core Airflow area:webserver Webserver related Issues full tests needed We need to run full set of tests for this PR to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Connection health check
6 participants