-
Notifications
You must be signed in to change notification settings - Fork 14.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AIRFLOW-2840] - add update connections cli option #3684
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3684 +/- ##
=========================================
+ Coverage 76.68% 77.5% +0.82%
=========================================
Files 199 205 +6
Lines 16189 15817 -372
=========================================
- Hits 12414 12259 -155
+ Misses 3775 3558 -217
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
connection ID is (by design) non-unique - its a mechanism of load balancing across multiple DBs for instance - you can have three conns with an ID of "mysql" and Airflow will pick a random one each time.
Can you explicitly handle/detect this case (probably by refusing to update any of them?) and add tests that cover it too? Given it's relatively infrequent to do this I think it is okay to just refuse to update any of the conns in this case.
Would updating all of the connections be acceptable? Maybe with a flag to indicate this behavior? Or is refusing to update preferable? |
I've also noticed the cli option for adding connections doesn't support adding connections with duplicate |
Please move the core of the code out of the cli (I.e. split between api/Commons and changes to the connection model) and make sure it works through the REST API as well. Direct access to the database should be limited in the cli as we will move away from this in the future. |
O sorry @ashb I think I accidentally removed the label |
@bolkedeburin no worries on the label, I was just using them as a way of tracking in the PRs page which ones we've looked at - entirely manual for now. I think the default should refuse to update multiple, with an --update-all flag |
@ashb @bolkedebruin I've been working on refactoring the connection api and wanted to run the implementation by you before going too far. My thought is in Then each client (cli, local, json) would be responsible for parsing arguments and constructing meaningful print/return statements to the user (e.g. the cli client would formulate the message thoughts? |
Something like that sounds sensible - removing duplication is good. Don't forget about the Flask web UI(s - right now we still have two) |
I wouldn't put the CRUD operations in separate files. Just use api/common/experimental/connections.py for simplicity |
Can you give some feedback on the json_api? I'm not really sure what I should be doing there or how to test it. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't had time to give this a full re-review yet, sorry.
Regarding @ashb first comment from Aug about multiple connections. There is also a bug to consider: https://issues.apache.org/jira/browse/AIRFLOW-2784 Your tests can't assume that all conn_id with same name are for the same conn type until this bug is fixed. |
@ron819 I'm going to address this bug in my PR. @ashb I'm still looking for some clarity on what is going on with the json_client. Even looking at the |
@ashb Think I got the json_client figured out, will try to get my changes committed tonight. |
Great! Sorry I didn’t respond - paid work has been busy :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small changes/comments. Looking good overall.
One thing that wasn't clear from a read of the code: does the API return the passwords on list or not? (I don't think it should, as right now none of the current methods of viewing connection in the CLI or WebUI give the value of the password)
airflow/bin/cli.py
Outdated
return | ||
conns = api_client.list_connections() | ||
# format it for the cli | ||
conns = list(map(lambda c: (c['conn_id'], c['conn_type'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor point: we don't need the list()
here
try: | ||
print(api_client.delete_connection(args.conn_id, args.delete_all)) | ||
except MissingArgument as ma: | ||
print(ma) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should exit with a Non-zero code here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is raising the exception good enough? or is there a specific way to exit with non-zero code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raising the ma
exception would work but would provide a messy output (i.e. it would include the stack trace which is not useful here):
except MissingArgument as ma:
raise SystemExit(ma)
Should probably do the trick (and not include any errant stack-traces. If that still includes a stack trace, or errors then SystemExit(str(ma)))
'--add flag and --conn_uri flag: {invalid!r}\n') | ||
msg = msg.format(invalid=invalid_args) | ||
try: | ||
new_conn = api_client.add_connection( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be easier to do new_conn = api_client.add_connection(vars(args))
? (that does rely on the names from argparser matching the named args in the api client, but they do right now)
Not sure on this one.
conn_extra=args.conn_extra) | ||
|
||
except MissingArgument as ma: | ||
print(ma) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-zero exit code here too
password="airflow", | ||
schema="airflow", | ||
port="5432" | ||
)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a quick look it appears that all the test cases use this connection - could we create it once in setUp() to reduce the amount of code in each test function?
|
||
session.query(models.Pool).delete() | ||
session.query(models.Variable).delete() | ||
session.query(models.Connection).delete() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is causing your test failures:
AirflowException: The conn_id
sqlite_default
isn't defined
from elsewhere in the tests.
Adds an Update Connection CLI option.\nThis is useful for updating connections in a CI/CD environment\n where deleting and adding connections\n could cause DAGS to fail. \nAlso exposes connection api via rest
I see 5 tests failing, 1 of which is my |
"Successfully added `conn_id`=new3 : postgres://airflow:airflow@host:5432/airflow", | ||
"Successfully added `conn_id`=new4 : postgres://airflow:airflow@host:5432/airflow", | ||
"Successfully added `conn_id`=new5 : hive_metastore://airflow:airflow@host:9083/airflow", | ||
"Successfully added `conn_id`=new6 : google_cloud_platform://:@:", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the "RFC 3986: Uniform Resource Identifier (URI): Generic Syntax" specification, this is not a valid address.
Schema part must be compatible with the format:
scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
I created a issue https://issues.apache.org/jira/browse/AIRFLOW-3616 and working on PR: PolideaInternal#30
which deals with this topic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please improve compatibility with RFC 3986 Section 3.1
:return: The updated Connection | ||
""" | ||
endpoint = '/api/experimental/connections/' | ||
url = urljoin(self._api_base_url, endpoint) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be done in the _request
method. Can you correct this or will we do it in a separate PR?
_log.error(err) | ||
response = jsonify(error="{}".format(err)) | ||
response.status_code = err.status_code | ||
return response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code snippet is repeated in any method. Should we think about how to avoid repeating this piece of code? Maybe introducing a decorator?
will reopen when working |
Make sure you have checked all steps below.
Jira
Description
Adds an Update Connection CLI option. This is useful for updating connections in a CI/CD environment where deleting and adding connections could cause DAGS to fail.
Tests
My PR adds a
update connection
step to an existing unit test intest/core.py
Commits
Documentation
Code Quality
git diff upstream/master -u -- "*.py" | flake8 --diff