Kerberos authentication support #355
Conversation
Thanks so much! Nice to see 👍 @pranayhasan, I hope you and your team can get started on this review. Can you help us with the review to make sure it works for your scenario as well? I would suggest taking the commit and trying it out on your system. Feel free to make any code/test comments as well. Thanks!
@joychak you may want to take a look :)
Let me know if you have any questions!
Thank you so much for the PR :)
sparkmagic/setup.py
Outdated
```diff
@@ -88,6 +88,7 @@ def version(path):
         'ipykernel>=4.2.2,<5',
         'ipywidgets>5.0.0,<7.0',
         'notebook>=4.2,<5.0',
-        'tornado>=4'
+        'tornado>=4',
+        'requests_kerberos'
```
Please add min, max versions. A good idea is to have the max version be the next major version.
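As a sketch, that pinning style might look like this in setup.py; the requests_kerberos version bounds below are assumptions for illustration, not values taken from the PR:

```python
# Hedged sketch of the reviewer's suggested pinning style: set a minimum
# version and cap at the next major version. The requests_kerberos bounds
# here are assumed, not confirmed by the PR.
install_requires = [
    'ipykernel>=4.2.2,<5',
    'ipywidgets>5.0.0,<7.0',
    'notebook>=4.2,<5.0',
    'tornado>=4',
    'requests_kerberos>=0.8.0,<1.0.0',  # assumed bounds: next major as the max
]
```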
Also, requirements.txt needs to be updated.
+1
Thanks for catching this!
```diff
@@ -32,21 +33,26 @@ def __init__(self, spark_controller, ipywidget_factory, ipython_display, endpoin
             value='password',
             width=widget_width
         )
+        self.auth_type = self.ipywidget_factory.get_dropdown(
+            options={constants.AUTH_KERBEROS: constants.AUTH_KERBEROS, constants.AUTH_BASIC: constants.AUTH_BASIC,
```
Can you please update the README's features section with information on the new authentication types supported?
We also need a section on how to use the Kerberos implementation here: basically an overview of how it works/what sparkmagic does vs what you need to do out of band regarding ticket management: e.g. you need to do kinit out of band.
A Password widget has been added in the 7.0.0 version of ipywidgets and should be available soon. For now we can use the 7.0.0a2 version of ipywidgets until 7.0.0 is published. But before that, we need to fix the display problems that appear once ipywidgets is updated to 6.0.0 or later, so as to make it usable.
Let's not use any alpha/dev versions of ipywidgets please.
I was not aware of these display issues with 6.0.0. Is this introduced by this PR or has it always been like this with 6.0.0?
What I meant is that we need to test the functionality with 7.0.0a2, so that we can use 7.0.0 once it gets published. We already have display issues with the 6.0.0 version, where the fields are displayed in a row as above. Can we revert the commit and address the UI issue so that the fields are output in columns? We can use the password widget only once we fix these.
Yes, we need to fix the issue with 6.0 first. I just created an issue for it (#358). I do not think I can prioritize it now. @pranayhasan do you want to take a crack at it?
@aggFTW Took this up. Thanks for creating an issue!
```diff
+            constants.NO_AUTH: constants.NO_AUTH},
+            description=u"Auth type:"
+        )
```
Do we need to show/hide the password widget based on the authentication type selected?
I'm looking at #284, and we had this comment:
On change, we should hide/show the appropriate fields. So:
- On change to NO_AUTH: hide username and password.
- On change to AUTH_KERBEROS: hide username and password.
- On change to AUTH_SSL: show username and password.
Ipywidgets has a way to make widgets visible or not:
https://github.com/jupyter-incubator/sparkmagic/blob/master/autovizwidget/autovizwidget/widget/encodingwidget.py#L126
@aggFTW Can you help me understand how to use that API? Any examples are appreciated.
Line number changed: https://github.com/jupyter-incubator/sparkmagic/blob/master/autovizwidget/autovizwidget/widget/encodingwidget.py#L134
Basically, you can take any widget and change its `.layout.display` property to `flex` or `none` to show/hide it. You would have to create a callback that is called whenever the value changes and knows to show/hide the appropriate widgets. Here, we add a callback, for example:
y_column_view = self.ipywidget_factory.get_dropdown(options=options_y_view,
description="Y", value=self.encoding.y)
y_column_view.on_trait_change(self._y_changed_callback, 'value')
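The show/hide pattern described above could be sketched as follows. The widget classes here are plain stand-ins rather than real ipywidgets, so the snippet is self-contained, and all names and constant values are illustrative:

```python
# Stand-in objects mimicking the small slice of the ipywidgets API the
# pattern relies on: a widget whose .layout.display can be 'flex' (shown)
# or 'none' (hidden).
class _Layout(object):
    def __init__(self):
        self.display = "flex"

class _Widget(object):
    def __init__(self):
        self.layout = _Layout()

# Hypothetical constant values; the real ones live in sparkmagic's constants module.
NO_AUTH, AUTH_BASIC, AUTH_KERBEROS = "None", "Basic", "Kerberos"

class AddEndpointWidgetSketch(object):
    def __init__(self):
        self.user_widget = _Widget()
        self.password_widget = _Widget()

    def _auth_changed_callback(self, new_value):
        # A callback like this would be wired up via
        # dropdown.on_trait_change(self._auth_changed_callback, 'value').
        # Show credentials only for basic auth; hide them otherwise.
        display = "flex" if new_value == AUTH_BASIC else "none"
        self.user_widget.layout.display = display
        self.password_widget.layout.display = display

w = AddEndpointWidgetSketch()
w._auth_changed_callback(AUTH_KERBEROS)  # hides username/password
```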
Ok thanks. Let me check and update the PR.
```diff
@@ -349,23 +349,25 @@ def _do_not_call_change_language(self, line, cell="", local_ns=None):
     @argument("-u", "--username", type=str, help="Username to use.")
     @argument("-p", "--password", type=str, help="Password to use.")
     @argument("-s", "--server", type=str, help="Url of server to use.")
+    @argument("-t", "--auth_type", type=str, help="Auth type for authentication")
```
This change will break https://github.com/jupyter-incubator/sparkmagic/blob/ed232d76f90ae9032d920efca21c90d1abcfc4d6/sparkmagic/sparkmagic/serverextension/handlers.py#L55:

```python
code = '%{} -s {} -u {} -p {}'.format(KernelMagics._do_not_call_change_endpoint.__name__, endpoint, username, password)
```

Please update the handler too so that it receives an auth parameter and passes it back.
```diff
+        if self._endpoint.auth_type == constants.AUTH_KERBEROS:
+            self._auth = HTTPKerberosAuth(mutual_authentication=REQUIRED)
+        elif self._endpoint.auth_type == constants.AUTH_BASIC:
+            self._auth = (self._endpoint.username, self._endpoint.password)
```
Please add a check for non-valid auth types and throw an error for them. As it is, non-valid ones would just silently fail.
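The requested validation could be sketched like this; the constant values are illustrative, and HTTPKerberosAuth is replaced by a placeholder string so the snippet runs without requests_kerberos installed:

```python
# Hedged sketch of the reviewer's request: pick the auth object by type and
# fail loudly on anything unrecognized instead of silently sending
# unauthenticated requests. Names and constant values are illustrative.
AUTH_KERBEROS, AUTH_BASIC, NO_AUTH = "Kerberos", "Basic_Access", "None"

def resolve_auth(auth_type, username="", password=""):
    if auth_type == AUTH_KERBEROS:
        # In the real client this would be:
        # HTTPKerberosAuth(mutual_authentication=REQUIRED)
        return "kerberos-auth-placeholder"
    if auth_type == AUTH_BASIC:
        # requests accepts a (user, password) tuple for basic auth.
        return (username, password)
    if auth_type == NO_AUTH:
        return None
    # Invalid auth types raise instead of silently failing.
    raise ValueError("Unsupported auth type: {}".format(auth_type))
```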
```diff
@@ -112,7 +114,7 @@ def spark(self, line, cell="", local_ns=None):
     # info
```
We need to update the documentation in the comment above to add the `-t` parameter for the `add` subcommand.
```diff
@@ -63,7 +64,7 @@ def base64_kernel_python3_credentials():

 @_with_override
 def kernel_scala_credentials():
-    return {u'username': u'', u'base64_password': u'', u'url': u'http://localhost:8998'}
+    return {u'username': u'', u'base64_password': u'', u'url': u'http://localhost:8998', u'auth_type': constants.NO_AUTH}
```
Ditto
We need the same for the other kernels too:
- kernel_r_credentials
- kernel_python3_credentials
```diff
@@ -202,7 +203,7 @@ def _credentials_override(f):
     If 'base64_password' is not set, it will fallback to 'password' in config.
     """
     credentials = f()
```
This is where you could see if auth_type is there to begin with or not. Please also test it with an actual file.
```diff
 class Endpoint(object):
-    def __init__(self, url, username="", password=""):
+    def __init__(self, url, auth_type=conf.default_livy_endpoint_auth_type(), username="", password=""):
```
I don't think we need a default value here, do we? If we don't need it, then we could get rid of the `default_livy_endpoint_auth_type` setting altogether.
```diff
+@_with_override
+def default_livy_endpoint_auth_type():
```
I think this is not needed.
Let me look into this tomorrow or this weekend.
Thanks,
Joy
On May 11, 2017, at 8:31 PM, Alejandro Guerrero Gonzalez wrote:
@aggFTW requested changes on this pull request.
In sparkmagic/sparkmagic/utils/configuration.py:
> @@ -44,7 +45,7 @@ def session_configs():
@_with_override
def kernel_python_credentials():
- return {u'username': u'', u'base64_password': u'', u'url': u'http://localhost:8998'}
+ return {u'username': u'', u'base64_password': u'', u'url': u'http://localhost:8998', u'auth_type': constants.NO_AUTH}
For this to be back-compatible with old config.json files, we need to move the logic of `constants.NO_AUTH if username == '' and password == '' else constants.AUTH_BASIC` into configuration.py.
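A minimal sketch of that back-compat rule, assuming illustrative constant values and a hypothetical helper name (the real logic would live in configuration.py):

```python
# Hedged sketch of the quoted backwards-compatibility rule: old config.json
# files carry no auth key, so infer it from username/password. Key names and
# constant values are illustrative assumptions.
NO_AUTH, AUTH_BASIC = "None", "Basic_Access"

def with_inferred_auth(credentials):
    creds = dict(credentials)
    if creds.get("auth_type") is None:
        # No auth specified: empty credentials mean no auth, anything else
        # falls back to basic auth for compatibility with old config files.
        if creds.get("username", "") == "" and creds.get("password", "") == "":
            creds["auth_type"] = NO_AUTH
        else:
            creds["auth_type"] = AUTH_BASIC
    return creds
```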
In sparkmagic/sparkmagic/tests/test_configuration.py:
>
@with_setup(_setup)
def test_configuration_override_fallback_to_password():
- kpc = { 'username': 'U', 'password': 'P', 'url': 'L' }
+ kpc = { 'username': 'U', 'password': 'P', 'url': 'L', 'auth_type': None }
We need some extra tests:
- when auth_type is not there at all in the config file, conf.base64_kernel_python_credentials (and all other languages) should return auth_type as constants.NO_AUTH if the username and password are the empty string
- when auth_type is not there at all in the config file, conf.base64_kernel_python_credentials (and all other languages) should return auth_type as constants.AUTH_BASIC if the username or password are not the empty string
This will ensure backwards compatibility with existing config.json files.
In sparkmagic/sparkmagic/tests/test_reliablehttpclient.py:
> import requests
+from pandas.util.testing import assertIsInstance
Please use assert isinstance() instead: https://docs.python.org/2/library/functions.html#isinstance since you are not checking Pandas dataframes here.
In sparkmagic/sparkmagic/tests/test_reliablehttpclient.py:
> @@ -207,3 +210,18 @@ def test_will_retry_error_no():
retry_policy.should_retry.assert_called_once_with(None, True, 0)
+ @with_setup(_setup, _teardown)
+ def test_basic_auth_check_auth():
We need tests for NO_AUTH and for non-valid auth methods.
In sparkmagic/sparkmagic/kernels/kernelmagics.py:
> @@ -349,23 +349,25 @@ def _do_not_call_change_language(self, line, cell="", local_ns=None):
@argument("-u", "--username", type=str, help="Username to use.")
@argument("-p", "--password", type=str, help="Password to use.")
@argument("-s", "--server", type=str, help="Url of server to use.")
+ @argument("-t", "--auth_type", type=str, help="Auth type for authentication")
Let's change the long name to --auth to be consistent with the other magics.
Thanks @joychak. I am mostly interested in being able to support your use case presented here: https://spark-summit.org/east-2017/events/secured-kerberos-based-spark-notebook-for-data-science/. I think we could simply add a new authentication method (e.g. PROXY_USER_AUTH or some similar name), and follow the entire Endpoint/HttpClient pattern here in the future.
Thanks for the comments @aggFTW. Let me look into them and update the PR this weekend.
We need to make the authentication type and endpoints configurable from config.json so that there is no need to re-enter them each time. It can be solved as part of this issue, or I can open a new issue as an enhancement and contribute to it.
I have a general question (after reviewing the pull request) about this approach to the Kerberized mechanism for the following 2 scenarios.

Scenario one: When Sparkmagic is being used inside JupyterHub, Kerberos authentication happens at JupyterHub (userid/passwd are authenticated at JupyterHub before opening any notebook) and a token (not any user-id/passwd) is normally passed to Sparkmagic for further authentication.

Scenario two: Kerberos authentication can also be done using a keytab file, which is very common in corporate environments. When you get authenticated using a keytab file (via "kinit"), you just need to pass the Kerberos principal name to the HTTPKerberosAuth object inside "reliablehttpclient.py" at the "_send_request_helper" method. The HTTPKerberosAuth object then picks up the Kerberos ticket from the KRB cache file for the given principal.

The above 2 scenarios don't need any mechanism within the Sparkmagic (notebook) context to accept userId/passwd. We need to make sure that we cover these scenarios, which is missing currently, because the changes in reliablehttpclient.py can only work with userId/passwd-based Kerberos authentication, but not keytab- or KRB-ticket-based Kerberos authentication.
```diff
@@ -11,6 +11,7 @@
 from .constants import HOME_PATH, CONFIG_FILE, MAGICS_LOGGER_NAME, LIVY_KIND_PARAM
 from .utils import get_livy_kind
+from sparkmagic.livyclientlib.exceptions import BadUserConfigurationException
```
We also need to modify the example_config.json file.
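For illustration, the new entry in example_config.json might look like this; the key name and value string are assumptions based on the diffs in this thread, not confirmed file contents:

```json
{
    "kernel_python_credentials": {
        "username": "",
        "password": "",
        "url": "http://localhost:8998",
        "auth_type": "None"
    }
}
```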
@pranayhasan, regarding making the authentication type and endpoints configurable from config.json: I believe this PR already addresses it. Look at the configuration.py file. Do you have something else in mind?
@joychak Thanks for reviewing! Those scenarios make sense and are needed. Is there anything on this PR that would prevent those scenarios from being implemented in a later PR? I'm thinking we could have a 4th and 5th authentication type (e.g. KERBEROS_TOKEN and KERBEROS_PRINCIPAL), and follow the entire Endpoint/HttpClient pattern here in the future. It seems you have implemented at least one of these. Would you be able to submit a PR after this one to add this support, or is this design restrictive?
Adding KERBEROS_TOKEN and KERBEROS_PRINCIPAL makes sense. I would be able to submit the KERBEROS_PRINCIPAL one immediately after this PR, and am currently working on KERBEROS_TOKEN. In the case of KERBEROS_TOKEN, the notebook UI should have the provision to provide the keytab file path.
Thanks @joychak! That would be wonderful. For the path, could we use a config-based approach?
+1 => config-based approach
@aggFTW In a corporate environment where all users need to use Kerberos by default, we don't want to show users the authentication type. It should default to Kerberos, with no display of auth_type in the widget. So we should be able to handle that by specifying auth/auth_type in ~/.sparkmagic/config.json.
You want to add a config setting to not display the authentication type with the regular Python kernel widget magics, and instead use the default from config. I think this could be a separate PR altogether. @pranayhasan would you mind sending that out if @praveenkanamarlapudi does not tackle it?
@aggFTW Yes, sure.
@praveenkanamarlapudi, any updates?
@aggFTW Sorry, I was overloaded with other stuff. Let me update the PR by this weekend. Thanks!
Force-pushed fd4865c to be94edf.
```diff
 import requests
+from pandas.util.testing import assertIsInstance
```
You are still using `assertIsInstance`. Previous comment:
> Please use `assert isinstance()` instead: https://docs.python.org/2/library/functions.html#isinstance since you are not checking Pandas dataframes here.
```diff
@@ -3,10 +3,13 @@
 import json
 from time import sleep
 import requests
+from requests_kerberos import HTTPKerberosAuth, REQUIRED
+import subprocess
```
Remove subprocess import
```diff
@@ -202,7 +203,7 @@ def _credentials_override(f):
     If 'base64_password' is not set, it will fallback to 'password' in config.
     """
     credentials = f()
-    base64_decoded_credentials = {k: credentials.get(k) for k in ('username', 'password', 'url')}
+    base64_decoded_credentials = {k: credentials.get(k) for k in ('username', 'password', 'url', 'auth_type')}
```
What happens if `auth_type` isn't there? Wouldn't this throw?
```diff
@@ -62,7 +64,7 @@ def setUp(self):

     # Mock request
     self.request = MagicMock()
-    self.request.body = json.dumps({"path": self.path, "username": self.username, "password": self.password, "endpoint": self.endpoint})
+    self.request.body = json.dumps({"path": self.path, "username": self.username, "password": self.password, "endpoint": self.endpoint, "auth_type": self.auth_type})
```
There are no tests here for when `auth` is not in the payload at all. There should be two tests for that case:
- `auth` is not on the payload, but due to empty username and password, auth should default to None.
- `auth` is not on the payload, but due to non-empty username and password, auth should default to Basic Auth.
@praveenkanamarlapudi Please look at all comments.
Force-pushed eb69f63 to 0ee14df.
@aggFTW Updated the code. Sorry, we had differences with our branch, so we missed pushing a few changes earlier. I will update the README soon.
```python
            self.ipywidget_factory.get_html(value="<br/>", width=widget_width), self.submit_widget]

        for child in self.children:
            child.parent_widget = self

        self._show_correct_endpoint_fields()
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling this initially to display the right elements per the auth type.
```python
    def refresh_configuration(self):
        credentials = getattr(conf, 'base64_kernel_' + self.language + '_credentials')()
        (username, password, url) = (credentials['username'], credentials['password'], credentials['url'])
        self.endpoint = Endpoint(url, username, password)
        if hasattr(credentials, 'auth'):
```
credentials is a dictionary, so the check would be `if 'auth' in credentials`.
This check shouldn't be here. When you call conf.base64_kernel_X_credentials(), the contract is that auth is always returned. So this code that sets auth when it's not specified by the user needs to be in configuration.py.
Please add unit tests for this code in test_configuration.py. This back compat is extremely important to get right.
```python
        elif self._endpoint.auth == constants.AUTH_BASIC:
            self._auth = (self._endpoint.username, self._endpoint.password)
        elif self._endpoint.auth != constants.NO_AUTH:
            raise ValueError(u"Unsupported auth." + self._endpoint.auth)
```
This is still a ValueError, when I asked to change it to the bad-config exception at https://github.com/jupyter-incubator/sparkmagic/blob/master/sparkmagic/sparkmagic/livyclientlib/exceptions.py#L41
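A hedged sketch of that change; the exception class below is a local stand-in so the snippet is self-contained (the real one lives in sparkmagic's livyclientlib.exceptions), and constant values are illustrative:

```python
# Stand-in for sparkmagic's configuration exception type; defined locally so
# the sketch runs on its own.
class BadUserConfigurationException(Exception):
    pass

# Hypothetical constant values; the real ones live in the constants module.
NO_AUTH, AUTH_BASIC, AUTH_KERBEROS = "None", "Basic_Access", "Kerberos"

def check_auth(auth):
    # Raise the project's configuration exception rather than a bare
    # ValueError when the configured auth type is unknown.
    if auth not in (NO_AUTH, AUTH_BASIC, AUTH_KERBEROS):
        raise BadUserConfigurationException(u"Unsupported auth type: %s" % auth)
```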
```python
        self.endpoints[self.address_widget.value] = endpoint
        self.ipython_display.writeln("Added endpoint {}".format(self.address_widget.value))

        # We need to call the refresh method because the drop down in Tab 2 for endpoints wouldn't refresh with the new
        # value otherwise.
        self.refresh_method()

    def _show_correct_endpoint_fields(self):
```
Did you try this signature on trait change? Did it work? I would have expected it to throw and for the fields not to be updated if the signature of the method were wrong.
I tested it and works fine.
Please review my comments again. Let's be thorough about reviewing comments and addressing them.
Force-pushed 0ee14df to a72c81b.
README.md
Outdated
```diff
@@ -107,6 +108,9 @@ However, there are some important limitations to note:
 2. Since all code is run on a remote driver through Livy, all structured data must be serialized to JSON and parsed by the Sparkmagic library so that it can be manipulated and visualized on the client side.
 In practice this means that you must use Python for client-side data manipulation in `%%local` mode.

+## Kerberos support
+Sparkmagic uses [requests-kerberos](https://github.com/requests/requests-kerberos) package for kerberos support. The sparkmagic expects kerberos ticket to be avaibale in the system. It picks up the kerbros ticket from cache file. User needs to run kinit to create the kerberos ticket. The sparkmagic doesn't support passing kerberos principal/token.
```
Can we please change this paragraph to:
## Authentication Methods
Sparkmagic supports:
* No auth
* Basic authentication
* Kerberos
Kerberos support is implemented via the [requests-kerberos](https://github.com/requests/requests-kerberos) package. Sparkmagic expects a kerberos ticket to be available in the system. Requests-kerberos will pick up the kerberos ticket from a cache file. For the ticket to be available, the user needs to have run [kinit](https://web.mit.edu/kerberos/krb5-1.12/doc/user/user_commands/kinit.html) to create the kerberos ticket.
Currently, sparkmagic does not support passing a kerberos principal/token, but we welcome pull requests.
Sure.
```python
    conf.override(conf.status_sleep_seconds.__name__, 1)
    assert_equals(conf.d, { conf.kernel_python_credentials.__name__: kpc,
                            conf.status_sleep_seconds.__name__: 1 })
    assert_equals(conf.status_sleep_seconds(), 1)
```
Great to see these tests! 😄 Thanks for adding them!
Why are you overriding `status_sleep_seconds` here?
Force-pushed from a223795 to 0ef38e6
kpc = { 'username': '', 'password': '', 'url': 'L'}
overrides = { conf.kernel_python_credentials.__name__: kpc }
conf.override_all(overrides)
conf.override(conf.status_sleep_seconds.__name__, 1)
Same, why override `status_sleep_seconds` at all?
Let me correct it.
I thought it would be an extra check. I can remove it if it's not needed.
Yes, I think the extra check is not needed. Thank you!
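The override semantics being tested here can be sketched with a simplified stand-in. This is not sparkmagic's actual `conf` module, just an illustration of the pattern: `override_all` resets the override dict, `override` merges a single key, and reads fall back to hard-coded defaults.

```python
# Minimal stand-in for an overridable config store (names are
# illustrative, not sparkmagic's real implementation).
_defaults = {"status_sleep_seconds": 2, "kernel_python_credentials": None}
d = {}

def override_all(overrides):
    # Replace all overrides at once.
    global d
    d = dict(overrides)

def override(name, value):
    # Override a single setting.
    d[name] = value

def status_sleep_seconds():
    # Read the setting, falling back to the default.
    return d.get("status_sleep_seconds", _defaults["status_sleep_seconds"])

kpc = {"username": "", "password": "", "url": "http://example:8998"}
override_all({"kernel_python_credentials": kpc})
override("status_sleep_seconds", 1)
assert d == {"kernel_python_credentials": kpc, "status_sleep_seconds": 1}
assert status_sleep_seconds() == 1
```

Once `override` has already placed `status_sleep_seconds` in `d`, asserting its value again is indeed redundant, which is the point being made in the review.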
@@ -62,7 +64,7 @@ def setUp(self):

# Mock request
self.request = MagicMock()
self.request.body = json.dumps({"path": self.path, "username": self.username, "password": self.password, "endpoint": self.endpoint})
Please add two tests where you don't send the `auth` parameter at all and still get no auth and basic auth as values for `auth`.
Ok
Done! 👍
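The requested backward-compatibility tests can be sketched roughly as follows. The helper and the `"None"`/`"Basic_Access"` constant strings are assumptions for illustration, not sparkmagic's exact API:

```python
import json
from unittest.mock import MagicMock

def resolve_auth(body):
    # Hypothetical helper mirroring the review request: if the request
    # body omits "auth", infer it from the credentials -- blank
    # username/password means no auth, otherwise basic auth.
    data = json.loads(body)
    if data.get("auth") is not None:
        return data["auth"]
    if data.get("username", "") == "" and data.get("password", "") == "":
        return "None"
    return "Basic_Access"

# Old-style request with credentials but no "auth" key -> basic auth.
request = MagicMock()
request.body = json.dumps({"path": "nb.ipynb", "username": "alice",
                           "password": "secret", "endpoint": "http://host:8998"})
assert resolve_auth(request.body) == "Basic_Access"

# Old-style request with blank credentials -> no auth.
request.body = json.dumps({"path": "nb.ipynb", "username": "",
                           "password": "", "endpoint": "http://host:8998"})
assert resolve_auth(request.body) == "None"
```

This keeps clients that predate the `auth` parameter working without any change on their side.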
@@ -212,9 +212,10 @@ def _credentials_override(f):
exception_type, exception, traceback = sys.exc_info()
msg = "base64_password for %s contains invalid base64 string: %s %s" % (f.__name__, exception_type, exception)
raise BadUserConfigurationException(msg)
if base64_decoded_credentials['auth'] is None:
Awesome! 👍
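The hunk above defaults `auth` after decoding the base64 credentials blob. A self-contained sketch of that flow (function name, default constants, and the plain `ValueError` standing in for `BadUserConfigurationException` are all assumptions):

```python
import base64
import json

def decode_credentials(blob):
    # Decode a base64-encoded JSON credentials blob; raise a clear
    # error on invalid base64, mirroring the exception path shown in
    # the diff above.
    try:
        raw = base64.b64decode(blob, validate=True)
    except Exception as exc:
        raise ValueError("base64_password contains invalid base64 string: %s" % exc)
    creds = json.loads(raw.decode("utf-8"))
    if creds.get("auth") is None:
        # Backward compatibility: infer the auth mode when the older
        # blob format carried no "auth" key (constants assumed).
        if creds.get("username") and creds.get("password"):
            creds["auth"] = "Basic_Access"
        else:
            creds["auth"] = "None"
    return creds

blob = base64.b64encode(json.dumps(
    {"username": "alice", "password": "secret"}).encode("utf-8"))
assert decode_credentials(blob)["auth"] == "Basic_Access"
```

Checking for `None` rather than key presence also covers blobs that explicitly set `"auth": null`.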
Ready for merge! 👍 Thanks so much @praveenkanamarlapudi. I hope a lot of people are going to find this very useful! @joychak, we would love to see those code reviews for other authentication methods 😃
* Make location of config.json file configurable using environment variables (#350)
* Make location of config.json file configurable using environment variables
* Update minor version to 0.11.3
* Fix column drop issue when first row has missing value (#353)
* Remove extra line
* initial fix of dropping columns
* add unit tests
* revert sql query test change
* revert sql query test change 2
* bump versions
* move outside if
* Adding a working Docker setup for developing sparkmagic (#361)
* Adding a working Docker setup for developing sparkmagic It includes the Jupyter notebook as well as the Livy+Spark endpoint. Documentation is in the README
* Pre-configure the ~/.sparkmagic/config.json Now you can just launch a PySpark wrapper kernel and have it work out of the box.
* Add R to Livy container Also added an R section to example_config.json to make it work out of the box - and I think it's just a good thing to have it anyway, otherwise how would users ever know it was meant to be there?
* Add more detail to the README container section
* Add dev_mode build-arg. Disabled by default. When enabled, builds the container using your local copy of sparkmagic, so that you can test your development changes inside the container.
* Adding missing kernels Was missing Scala and Python2. Confirmed that Python2 and Python3 are indeed separate environments on the spark container.
* Kerberos authentication support (#355)
* Enabled kerberos authentication on sparkmagic and updated test cases.
* Enabled hide and show username/password based on auth_type.
* Updated as per comments.
* Updated documentation for kerberos support
* Added test cases to test backward compatibility of auth in handlers
* Update README.md Change layout and add build status
* Bump version to 0.12.0 (#365)
* Remove extra line
* bump version
* Optional coerce (#367)
* Remove extra line
* added optional configuration to have optional coercion
* fix circular dependency between conf and utils
* add gcc installation for dev build
* fix parsing bug for coerce value
* fix parsing bug for coerce value 2
* Automatically configure wrapper-kernel endpoints in widget (#362)
* Add pre-configured endpoints to endpoint widget automatically
* Fix crash on partially-defined kernel configurations
* Use LANGS_SUPPORTED constant to get list of possible kernel config sections
* Rename is_default attr to implicitly_added
* Adding blank line between imports and class declaration
* Log failure to connect to implicitly-defined endpoints
* Adding comment explaining implicitly_added
* Pass auth parameter through
* Fix hash and auth to include auth parameter (#370)
* Fix hash and auth to include auth parameter
* fix endpoint validation
* remove unecessary commit
* Ability to add custom headers to HTTP calls (#371)
* Abiulity to add custom headers to rest call
* Fix import
* Ad basic conf test
* Fix tests
* Add test
* Fix tests
* Fix indent
* Addres review comments
* Add custom headers to example config
* Remove extra line
* Release 0.12.0 (#373)
* Make location of config.json file configurable using environment variables (#350)
* Make location of config.json file configurable using environment variables
* Update minor version to 0.11.3
* Fix column drop issue when first row has missing value (#353)
* Remove extra line
* initial fix of dropping columns
* add unit tests
* revert sql query test change
* revert sql query test change 2
* bump versions
* move outside if
* Adding a working Docker setup for developing sparkmagic (#361)
* Adding a working Docker setup for developing sparkmagic It includes the Jupyter notebook as well as the Livy+Spark endpoint. Documentation is in the README
* Pre-configure the ~/.sparkmagic/config.json Now you can just launch a PySpark wrapper kernel and have it work out of the box.
* Add R to Livy container Also added an R section to example_config.json to make it work out of the box - and I think it's just a good thing to have it anyway, otherwise how would users ever know it was meant to be there?
* Add more detail to the README container section
* Add dev_mode build-arg. Disabled by default. When enabled, builds the container using your local copy of sparkmagic, so that you can test your development changes inside the container.
* Adding missing kernels Was missing Scala and Python2. Confirmed that Python2 and Python3 are indeed separate environments on the spark container.
* Kerberos authentication support (#355)
* Enabled kerberos authentication on sparkmagic and updated test cases.
* Enabled hide and show username/password based on auth_type.
* Updated as per comments.
* Updated documentation for kerberos support
* Added test cases to test backward compatibility of auth in handlers
* Update README.md Change layout and add build status
* Bump version to 0.12.0 (#365)
* Remove extra line
* bump version
* Optional coerce (#367)
* Remove extra line
* added optional configuration to have optional coercion
* fix circular dependency between conf and utils
* add gcc installation for dev build
* fix parsing bug for coerce value
* fix parsing bug for coerce value 2
* Automatically configure wrapper-kernel endpoints in widget (#362)
* Add pre-configured endpoints to endpoint widget automatically
* Fix crash on partially-defined kernel configurations
* Use LANGS_SUPPORTED constant to get list of possible kernel config sections
* Rename is_default attr to implicitly_added
* Adding blank line between imports and class declaration
* Log failure to connect to implicitly-defined endpoints
* Adding comment explaining implicitly_added
* Pass auth parameter through
* Fix hash and auth to include auth parameter (#370)
* Fix hash and auth to include auth parameter
* fix endpoint validation
* remove unecessary commit
* Ability to add custom headers to HTTP calls (#371)
* Abiulity to add custom headers to rest call
* Fix import
* Ad basic conf test
* Fix tests
* Add test
* Fix tests
* Fix indent
* Addres review comments
* Add custom headers to example config
* bumping versions
I am trying to get Kerberos authentication working between Livy 0.3 (on a Linux edge node of a cluster) and sparkmagic 0.12.5 as part of a Jupyter notebook running on my Windows box. On the Windows side I am happy to kinit (I don't see the username/password fields in the UI when I select Kerberos as the auth type). Can you please confirm whether the versions mentioned will allow Kerberos authentication between Jupyter and Livy?
* Release 0.12.0 (jupyter-incubator#373)
* Make location of config.json file configurable using environment variables (jupyter-incubator#350)
* Make location of config.json file configurable using environment variables
* Update minor version to 0.11.3
* Fix column drop issue when first row has missing value (jupyter-incubator#353)
* Remove extra line
* initial fix of dropping columns
* add unit tests
* revert sql query test change
* revert sql query test change 2
* bump versions
* move outside if
* Adding a working Docker setup for developing sparkmagic (jupyter-incubator#361)
* Adding a working Docker setup for developing sparkmagic It includes the Jupyter notebook as well as the Livy+Spark endpoint. Documentation is in the README
* Pre-configure the ~/.sparkmagic/config.json Now you can just launch a PySpark wrapper kernel and have it work out of the box.
* Add R to Livy container Also added an R section to example_config.json to make it work out of the box - and I think it's just a good thing to have it anyway, otherwise how would users ever know it was meant to be there?
* Add more detail to the README container section
* Add dev_mode build-arg. Disabled by default. When enabled, builds the container using your local copy of sparkmagic, so that you can test your development changes inside the container.
* Adding missing kernels Was missing Scala and Python2. Confirmed that Python2 and Python3 are indeed separate environments on the spark container.
* Kerberos authentication support (jupyter-incubator#355)
* Enabled kerberos authentication on sparkmagic and updated test cases.
* Enabled hide and show username/password based on auth_type.
* Updated as per comments.
* Updated documentation for kerberos support
* Added test cases to test backward compatibility of auth in handlers
* Update README.md Change layout and add build status
* Bump version to 0.12.0 (jupyter-incubator#365)
* Remove extra line
* bump version
* Optional coerce (jupyter-incubator#367)
* Remove extra line
* added optional configuration to have optional coercion
* fix circular dependency between conf and utils
* add gcc installation for dev build
* fix parsing bug for coerce value
* fix parsing bug for coerce value 2
* Automatically configure wrapper-kernel endpoints in widget (jupyter-incubator#362)
* Add pre-configured endpoints to endpoint widget automatically
* Fix crash on partially-defined kernel configurations
* Use LANGS_SUPPORTED constant to get list of possible kernel config sections
* Rename is_default attr to implicitly_added
* Adding blank line between imports and class declaration
* Log failure to connect to implicitly-defined endpoints
* Adding comment explaining implicitly_added
* Pass auth parameter through
* Fix hash and auth to include auth parameter (jupyter-incubator#370)
* Fix hash and auth to include auth parameter
* fix endpoint validation
* remove unecessary commit
* Ability to add custom headers to HTTP calls (jupyter-incubator#371)
* Abiulity to add custom headers to rest call
* Fix import
* Ad basic conf test
* Fix tests
* Add test
* Fix tests
* Fix indent
* Addres review comments
* Add custom headers to example config
* Merge master to release (jupyter-incubator#390)
* Configurable retry for errors (jupyter-incubator#378)
* Remove extra line
* bumping versions
* configurable retry
* fix string
* Make statement and session waiting more responsive (jupyter-incubator#379)
* Remove extra line
* bumping versions
* make sleeping for sessions an exponential backoff
* fix bug
* Add vscode tasks (jupyter-incubator#383)
* Remove extra line
* bumping versions
* add vscode tasks
* Fix endpoints widget when deleting a session (jupyter-incubator#389)
* Remove extra line
* bumping versions
* add vscode tasks
* fix deleting from endpoint widget, add notebooks to docker file, refresh correctly, populate endpoints correctly
* fix tests
* add unit tests
* refresh after cleanup
* Merge master to release (jupyter-incubator#392)
* Configurable retry for errors (jupyter-incubator#378)
* Remove extra line
* bumping versions
* configurable retry
* fix string
* Make statement and session waiting more responsive (jupyter-incubator#379)
* Remove extra line
* bumping versions
* make sleeping for sessions an exponential backoff
* fix bug
* Add vscode tasks (jupyter-incubator#383)
* Remove extra line
* bumping versions
* add vscode tasks
* Fix endpoints widget when deleting a session (jupyter-incubator#389)
* Remove extra line
* bumping versions
* add vscode tasks
* fix deleting from endpoint widget, add notebooks to docker file, refresh correctly, populate endpoints correctly
* fix tests
* add unit tests
* refresh after cleanup
* Try to fix pypi repos (jupyter-incubator#391)
* Remove extra line
* bumping versions
* add vscode tasks
* try to fix pypi new repos
* Merge master to release (jupyter-incubator#394)
* Configurable retry for errors (jupyter-incubator#378)
* Remove extra line
* bumping versions
* configurable retry
* fix string
* Make statement and session waiting more responsive (jupyter-incubator#379)
* Remove extra line
* bumping versions
* make sleeping for sessions an exponential backoff
* fix bug
* Add vscode tasks (jupyter-incubator#383)
* Remove extra line
* bumping versions
* add vscode tasks
* Fix endpoints widget when deleting a session (jupyter-incubator#389)
* Remove extra line
* bumping versions
* add vscode tasks
* fix deleting from endpoint widget, add notebooks to docker file, refresh correctly, populate endpoints correctly
* fix tests
* add unit tests
* refresh after cleanup
* Try to fix pypi repos (jupyter-incubator#391)
* Remove extra line
* bumping versions
* add vscode tasks
* try to fix pypi new repos
* Test 2.7.13 environment for pypi push to prod (jupyter-incubator#393)
* Remove extra line
* bumping versions
* add vscode tasks
* try to fix pypi new repos
* try to fix pip push for prod pypi by pinning to later version of python
* bump versions (jupyter-incubator#395)
* Release v0.12.6 (jupyter-incubator#481)
* Add python3 option in %manage_spark magic (jupyter-incubator#427) Fixes jupyter-incubator#420
* Links fixed in README
* DataError in Pandas moved from core.groupby to core.base (jupyter-incubator#459)
* DataError in Pandas moved from core.groupby to core.base
* maintain backwards compatability with Pandas 0.22 or lower for DataError
* Bump autoviz version to 0.12.6
* Fix unit test failure caused by un-spec'ed mock which fails traitlet validation (jupyter-incubator#480)
* Fix failing unit tests Caused by an un-spec'ed mock in a test which fails traitlet validation
* Bump travis.yml Python3 version to 3.6 Python 3.3 is not only EOL'ed but is now actively unsupported by Tornado, which causes the Travis build to fail again.
* Bumping version numbers for hdijupyterutils and sparkmagic to keep them in sync
* add magic for matplotlib display
* repair
* Patch SparkMagic for latest IPythonKernel compatibility
  **Description**
  * The IPython interface was updated to return an asyncio.Future rather than a dict from version 5.1.0. This broke SparkMagic as it still expects a dictionart from the output
  * This change updates the SparkMagic base kernel to expect a Future and block on its result.
  * This also updates the dependencies to call out the the new IPython version dependency.
  **Testing Done**
  * Unit tests added
  * Validating that the kernel connects successfully
  * Validating some basic Spark additional operations on an EMR cluster.
* Fix decode json error at trailing empty line (jupyter-incubator#483)
* Bump version number to 0.12.7
* add a screenshot of an example for display matplot picture
* Fix guaranteed stack trace
* Simplify loop a bit
* We want to be able to interrupt the sleep, so move that outside the try / except
* Add missing session status to session.
* Correct to correct URL with full list.
* Better tests.
* Switch to Livy 0.6.
* Sketch of removal of PYSPARK3.
* Don't allow selection of Python3, since it's not a separate thing now.
* __repr__ for easier debugging of test failures.
* Start fixing tests.
* Rip out more no-longer-relevant "Python 3" code. Python 3 and Python 2 work again.
* Changelog.
* Add progress bar to sparkmagic/sparkmagic/livyclientlib/command.py. Tested with livy 0.4-0.6, python2 and python3
* Support Future and non-Future results from ipykernel.
* News entry.
* Unpin ipykernel so it works with Python 2.
* Python 3.7 support.
* Also update requirements.txt.
* Xenial has 3.7.
* from IPython.display import display to silence travis warning
* Couple missing entries.
* Update versions.
* Document release process, as I understand it.
* Correct file name.
* delete obsolete pyspark3kernel (jupyter-incubator#549)
* delete obsolete pyspark3kernel
* Update README.md
* Update setup.py
* Update test_kernels.py
* Remove old kernelspec installation from Dockerfile This kernel was removed in jupyter-incubator#549 but the Dockerfile still tries to install it, which fails the build. This corrects that.
* Relax constraints even more, and make sure to relax them in duplicate locations.
* Don't assume some pre-populated tables, create a new table from the Docker image's examples.
* Note new feature.
* Additional dependencies for matplotlib to work.
* Add support and documentation for extension use, refactor kernel use.
* Example in pyspark kernel.
* Test for Command's decoding of images.
* Switch to plotly 3.
* Try to switch to standard mechanism Sparkmagic uses for displaying.
* Another entry.
* Add documentation for JupyterLab.
* Prepare for 0.12.9.
* Revert get_session_kind change to be more consistent with upstream repo.
* Remove redundant python3 session test.
* Remove python3 references in livysession.
Fix #282