Support YARN endpoints protected by Kerberos/SPNEGO #16
kevin-bates
left a comment
This is really great Luciano - thanks! I just had a couple comments.
| if self.address is None: | ||
| raise ConfigurationError('API address is not set') | ||
| elif self.port is None: | ||
| raise ConfigurationError('API port is not set') |
We should probably add checks against None for address and port at the very end of the constructor to maintain parity.
Also, this is removing a property that allowed the user to get a connection to the particular server. Seems like that needs to be preserved unless we know there are no applications using that property.
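A minimal sketch of the constructor-time checks suggested above, assuming a hypothetical `ExampleYarnAPI` class and a locally defined `ConfigurationError` stand-in; the actual classes in yarn_api_client may be organized differently.

```python
class ConfigurationError(Exception):
    """Stand-in for the library's configuration error (defined locally here)."""


class ExampleYarnAPI(object):
    """Hypothetical client used only to illustrate where the checks go."""

    def __init__(self, address=None, port=8088, timeout=30):
        self.address, self.port, self.timeout = address, port, timeout
        # Validate at the very end of the constructor so a misconfigured
        # client fails immediately instead of on its first request.
        self._validate_configuration()

    def _validate_configuration(self):
        if self.address is None:
            raise ConfigurationError('API address is not set')
        elif self.port is None:
            raise ConfigurationError('API port is not set')
```

With this shape, `ExampleYarnAPI()` raises right away, while `ExampleYarnAPI(address='localhost')` is accepted with the default port.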
Good catch... added it back together with fixes for test cases that were failing
Sorry, not finding where property http_conn was added back in. Please advise.
yarn_api_client/node_manager.py
Outdated
| """ | ||
| def __init__(self, address=None, port=8042, timeout=30): | ||
| self.address, self.port, self.timeout = address, port, timeout | ||
| def __init__(self, address=None, port=8042, timeout=30, kerberosEnaled=False): |
Please fix typo to kerberosEnabled
yarn_api_client/resource_manager.py
Outdated
| """ | ||
| def __init__(self, address=None, port=8088, timeout=30): | ||
| self.address, self.port, self.timeout = address, port, timeout | ||
| def __init__(self, address=None, port=8088, timeout=30, kerberosEnaled=False): |
Please fix typo to kerberosEnabled
1db7505 to
5f72136
Compare
There are still some dependency issues with the tests around the SPNEGO dependency; I will try to figure that out tomorrow.
7f3b6e9 to
01d5144
Compare
OK, all good with the dependencies; it was a tox configuration issue.
setup.py
Outdated
| install_requires = install_requires, | ||
| install_requires = [ | ||
| 'requests>=2.7,<3.0', | ||
| 'requests-kerberos==0.12.0', |
It's not a good idea to pin requirements here, as it may conflict with project-wide requirements.
Perhaps it's possible to specify the version in terms of >=, <=?
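As a rough illustration of the range-based alternative, here is a sketch of the dependency list with the exact pin replaced by bounds. The name `INSTALL_REQUIRES` and the specific bounds chosen for `requests-kerberos` are assumptions for illustration, not values taken from the PR.

```python
# Hypothetical dependency list for setup.py; the bounds are assumptions.
INSTALL_REQUIRES = [
    'requests>=2.7,<3.0',
    # A range instead of '==0.12.0', so projects that depend on this
    # package can still resolve their own compatible version.
    'requests-kerberos>=0.12.0,<1.0',
]
```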
| import requests | ||
| import requests_mock | ||
|
|
||
| from tests import TestCase |
I wish we used pytest for testing,
but this is fine for now. I may find some time to update the tests.
|
|
||
| client.request('/ololo', foo='bar') | ||
| def test_valid_request(self): | ||
| with requests_mock.mock() as requests_get_mock: |
perhaps, use requests_mock as a decorator?
tox.ini
Outdated
|
|
||
| [testenv] | ||
| deps = | ||
| requests |
please keep this list sorted
| """ | ||
| def __init__(self, address=None, port=8088, timeout=30): | ||
| self.address, self.port, self.timeout = address, port, timeout | ||
| def __init__(self, address=None, port=8088, timeout=30, kerberosEnabled=False): |
don't mix variable naming styles. Project follows pep8 convention
| from urllib import urlencode | ||
| except ImportError: | ||
| from urllib.parse import urlencode | ||
| import requests |
with requests it becomes so much easier 👍
| if params: | ||
| path = api_path + '?' + params | ||
| params = query_args | ||
| api_endpoint = 'http://{}:{}{}'.format(self.address, self.port, api_path) |
I have a doubt about hardcoding http here
Enabling https requires some more work throughout the code; we could address that in the future.
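One possible direction for that future work, sketched under assumptions: the helper `build_endpoint` and its `scheme` parameter are hypothetical and are not part of this PR or the library.

```python
def build_endpoint(address, port, api_path, scheme='http'):
    """Build the endpoint URL; `scheme` is a hypothetical parameter."""
    if scheme not in ('http', 'https'):
        raise ValueError('Unsupported scheme: {}'.format(scheme))
    # Same format string as in the diff, with the scheme made configurable.
    return '{}://{}:{}{}'.format(scheme, address, port, api_path)
```

Defaulting `scheme` to `'http'` would keep existing callers unchanged while letting secured clusters opt in to `'https'`.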
Thanks for the PR!
.travis.yml
Outdated
| install: | ||
| - pip install tox coveralls | ||
| - pip install --upgrade setuptools pip | ||
| - pip install --upgrade requests requests_mock requests_kerberos tox coveralls |
these deps should be managed by tox
except tox and coveralls
| @@ -0,0 +1,82 @@ | |||
| # -*- coding: utf-8 -*- | |||
Is it possible to make the integration tests part of the CI process?
The integration tests require a running YARN cluster. We could probably add Docker for that; let me create an issue to address it in the near future.
| appstats = self.resourceManager.cluster_application_statistics() | ||
| pprint(appstats.data) | ||
| self.assertIsNotNone(appstats.data['appStatInfo']) | ||
| # TODO: test arguments |
remove TODO
@toidi and @kevin-bates All comments have been addressed, and a few issues have been created for the things that are really general enhancements and not introduced by these commits. Are there any other issues? Otherwise, I would like to add these commits to master and perform a release so I can use this in EG.
kevin-bates
left a comment
All looks good except I can't find where the http_conn property was restored.
In BaseYarnAPI there is now and also the validation tests in
Also note that, because we moved to the requests package, we are not using http_connection anymore.
And yes, the diffs look a little strange, so I had to go into the diff, find base.py, and view the file from the diff page. You can probably do the same from the commits list as well.
I understand we're not using http_connection any more, but
cfc8b30 to
0df6fb7
Compare
OK, I now understand your question. This is indeed a breaking change, as the HTTP connection is now managed internally by the requests package. How about naming the release 0.3.0 and updating the changelog to explicitly describe this breaking change:
yarn_api_client/resource_manager.py
Outdated
| loc_args = ( | ||
| ('states', states), | ||
| ('applicationTypes', applicationTypes)) | ||
| ('application_types', application_types)) |
first item in tuple goes to hadoop api
does it expect ?construct_parameters=X instead of ?applicationTypes=X in query string?
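To make the concern concrete, here is a small sketch of how the first item of each pair ends up in the query string. The function `build_query` is hypothetical; it only mirrors the `loc_args` shape from the diff and uses the standard library's `urlencode`.

```python
try:
    from urllib import urlencode  # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3


def build_query(states=None, application_types=None):
    # The Python argument names can follow PEP 8, but the first item of
    # each pair is the key sent in the query string, so it keeps the
    # camelCase name from the original tuple.
    loc_args = (
        ('states', states),
        ('applicationTypes', application_types))
    return urlencode([(key, value) for key, value in loc_args
                      if value is not None])
```

Renaming the string `'applicationTypes'` to `'application_types'` would change the parameter sent to the server, whereas renaming only the Python argument would not.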
ediskandarov-cur
left a comment
👍 looks good to me
maybe except rename of application_types query string argument
@kevin-bates your concern about
kevin-bates
left a comment
Integration tests that, given a provided YARN ENDPOINT, execute some real scenario tests against that server. Note that, if no YARN ENDPOINT is provided, the tests are ignored.
There are a few commits here, mostly around Kerberos/SPNEGO and Integration Tests.
Also, bumping the version to 0.2.6 to perform a release with these changes to be used in Jupyter Enterprise Gateway.
Please review, but don't squash/merge.
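The skip behavior described for the integration tests could look roughly like the sketch below. It assumes the endpoint is read from an environment variable named `YARN_ENDPOINT`; that name and the test class are hypothetical, and the PR may wire this up differently.

```python
import os
import unittest

# Hypothetical environment variable; the PR may read the endpoint differently.
YARN_ENDPOINT = os.environ.get('YARN_ENDPOINT')


@unittest.skipUnless(YARN_ENDPOINT, 'YARN_ENDPOINT is not set')
class ExampleIntegrationTestCase(unittest.TestCase):
    def test_endpoint_is_configured(self):
        # A real test would call the YARN REST API at this endpoint.
        self.assertTrue(YARN_ENDPOINT)
```

When the variable is unset, the whole class is reported as skipped rather than failed, so the unit test run stays green on machines without a cluster.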