-
Notifications
You must be signed in to change notification settings - Fork 14.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AIRFLOW-6975] Base AWSHook AssumeRoleWithSAML #7619
Conversation
I am unsure how to unit test this - suggestions welcome :). |
method = self._assume_role_with_saml | ||
else: | ||
raise NotImplementedError( | ||
'assume_role_method=%s' % assume_role_method) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I get this exception I've got no idea what went wrong. The information should be more meaningful and probably NotImplementedError is not a best here. Also, we can us f-strings for formatting :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point, thanks. What makes sense to the developer might make no sense to the user ;) I will update these cases accordingly.
sts_client, | ||
extra_config, | ||
role_arn, | ||
assume_role_kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add type annotation? Those really help :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, done
mutual_auth = saml_config['mutual_authentication'] | ||
if mutual_auth == 'REQUIRED': | ||
auth = requests_gssapi.HTTPSPNEGOAuth( | ||
requests_gssapi.REQUIRED) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason why we cannot keep this in one line? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked VS Code and it couldn't tell me why it put it on another line xD. Fixed
requests_gssapi.DISABLED) | ||
else: | ||
raise NotImplementedError( | ||
'mutual_authentication=%s' % mutual_auth) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, the exception does not help the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed thanks
if not saml_assertion: | ||
raise ValueError('Invalid SAML Assertion') | ||
else: | ||
raise NotImplementedError('idp_auth_method=%s' % idp_auth_method) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also fixed
Hello @baolsen -> I have a proposal - we have implemented a lot of "system tests" in GCP and maybe that will be the right time to try the same approach in AWS? @nuclearpinguin is just implementing some latest fixes to our "reusable" system test class and we have it already pretty well described in the documentation https://github.com/apache/airflow/blob/master/TESTING.rst#airflow-system-tests Breeze is now very well prepared to run such system tests automatically and I also have an open discussions about backporting the "providers" packages to 1.10 (part of AIP-21) and I want to propose that one of the conditions there is to have system tests semi-automatically that are covering the provider's package functionality. And we want to fully automate it with AIP-4. Maybe we can work together - me, @nuclearpinguin , @mik-laj and you to add first AWS system tests and be able to run it automatically? Maybe we can also - with your involvement - improve it all and make it easily usable by others? It's rather straightforward - please take a look at the docs. It boils down to having an "example_dag" that is actually runnable (providing authorisation/configuration information). The dag should perform setup/some operations/teardown on a real external system (GCP or AWS in this case) - we have some useful helpers written to make it easy to encapsulate it in a Pytest test with @pytest.markers.system("amazon") in your case. Then it can be easily run automatically (we will just have to provide authorisation file in the form of files/airflow-breeze-config/variables.env file - with authorisation variables. It's a bit involved as those tests usually run for a long time - but at the same time you have an environment to test your changes on your own with a real system (something that you do manually now and can be automated for any future changes). What do you think? We would love to help with that! |
Hey @potiuk . Sure, I'd be willing to give it a try :) Probably best to create a Slack channel... It is blocked at my work, but I can use it via mobile to keep in touch. |
Yeah - cool. That would be great. I will create a channel for that. I think it should be ok to proceed with that one without system tests and merge it and then add the system test as separate PR as some refactoring might be needed |
Build can be restarted - test timed out :) |
Restarted :) |
Timed out again... Please re-restart :) |
@baolsen can you rebase onto actual master? We've got some system test failing (should not be run at all). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The _assume_role_with_saml
method looks a bit complex to me, because you are importing so many different libraries and you are doing a lot of semantic blocks which are showing that it could be useful to split it - So I would probably refactor it.
In my opinion the whole _get_credentials
is too complex. Maybe it is time to add a sub-class AwsCredentialsExtractor
or similar where we can split this whole function into smaller pieces we also fully test.
But in general it really looks good to me. 👍 You added the documentation to the how-to which is great :)
# requests_gssapi will need paramiko > 2.6 since you'll need | ||
# 'gssapi' not 'python-gssapi' from PyPi. | ||
# https://github.com/paramiko/paramiko/pull/1311 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion this would be better to have in the commit message and/or the PR description.
But it's fine I guess - not that important to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to the PR description; will quickly add it to the how-to documentation as well :)
It is getting quite complex. I think I'd rather leave it for now, until the AWS system tests are in place and then refactoring and unit testing will be a possibility :) |
Codecov Report
@@ Coverage Diff @@
## master #7619 +/- ##
==========================================
- Coverage 86.76% 86.15% -0.62%
==========================================
Files 897 904 +7
Lines 42819 43787 +968
==========================================
+ Hits 37153 37725 +572
- Misses 5666 6062 +396
Continue to review full report at Codecov.
|
@nuclearpinguin Ready to merge :) |
Issue link: AIRFLOW-6975
Note that using AssumeRoleWithSAML uses
requests_gssapi
to obtain a SAML assertion from your IDP.You must then
pip install gssapi
rather thanpip install python-gssapi
to avoid a conflict withparamiko
.Also per code comment, this PR updates the required Paramiko version to be compatible.
requests_gssapi will need paramiko > 2.6 since you'll need
'gssapi' not 'python-gssapi' from PyPi.
paramiko/paramiko#131
Make sure to mark the boxes below before creating PR: [x]
[AIRFLOW-NNNN]
. AIRFLOW-NNNN = JIRA ID** For document-only changes commit message can start with
[AIRFLOW-XXXX]
.In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.