Kerberos authentication support #355
Conversation
Thanks so much! Nice to see 👍 @pranayhasan, I hope you and your team can get started on this review. Can you help us with the review to make sure it works for your scenario as well? I would suggest taking the commit and trying it out on your system. Feel free to make any code/test comments as well. Thanks!
@joychak you may want to take a look :)
Let me know if you have any questions!
Thank you so much for the PR :)
sparkmagic/setup.py
Outdated
```diff
@@ -88,6 +88,7 @@ def version(path):
         'ipykernel>=4.2.2,<5',
         'ipywidgets>5.0.0,<7.0',
         'notebook>=4.2,<5.0',
-        'tornado>=4'
+        'tornado>=4',
+        'requests_kerberos'
```
Please add min, max versions. A good idea is to have the max version be the next major version.
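As a sketch, that pinning style might look like this in setup.py; the requests_kerberos version bounds below are assumptions for illustration, not values taken from the PR:

```python
# Hedged sketch of the reviewer's suggested pinning style: set a minimum
# version and cap at the next major version. The requests_kerberos bounds
# here are assumed, not confirmed by the PR.
install_requires = [
    'ipykernel>=4.2.2,<5',
    'ipywidgets>5.0.0,<7.0',
    'notebook>=4.2,<5.0',
    'tornado>=4',
    'requests_kerberos>=0.8.0,<1.0.0',  # assumed bounds: next major as the max
]
```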
Also, requirements.txt needs to be updated.
+1
Thanks for catching this!
```diff
@@ -32,21 +33,26 @@ def __init__(self, spark_controller, ipywidget_factory, ipython_display, endpoin
             value='password',
             width=widget_width
         )
+        self.auth_type = self.ipywidget_factory.get_dropdown(
+            options={constants.AUTH_KERBEROS: constants.AUTH_KERBEROS, constants.AUTH_BASIC: constants.AUTH_BASIC,
```
Can you please update the README's features section with information on the new authentication types supported?
We also need a section on how to use the Kerberos implementation here: basically an overview of how it works/what sparkmagic does vs what you need to do out of band regarding ticket management: e.g. you need to do kinit out of band.
A Password widget has been added in the 7.0.0 version of ipywidgets and should be available soon. For now we can use the 7.0.0a2 version of ipywidgets until 7.0.0 is published. But before that, we need to fix the display problems that appear once ipywidgets is updated to 6.0.0 or later, so as to make it usable.
Let's not use any alpha/dev versions of ipywidgets please.
I was not aware of these display issues with 6.0.0. Is this introduced by this PR or has it always been like this with 6.0.0?
What I meant is that we need to test the functionality with 7.0.0a2, so that we can use 7.0.0 once it gets published. We already have display issues with the 6.0.0 version, where the fields are displayed in a row as above. Can we revert the commit and address the UI issue so that the fields are output in columns? We can use the password widget only once we fix these.
Yes, we need to fix the issue with 6.0 first. I just created an issue for it (#358). I do not think I can prioritize it now. @pranayhasan do you want to take a crack at it?
@aggFTW Took this up. Thanks for creating an issue!
```diff
+            constants.NO_AUTH: constants.NO_AUTH},
+            description=u"Auth type:"
+        )
```
Do we need to show/hide the password widget based on the authentication type selected?
I'm looking at #284, and we had this comment:
On change, we should hide/show the appropriate fields. So:
- On change to NO_AUTH: hide username and password.
- On change to AUTH_KERBEROS: hide username and password.
- On change to AUTH_SSL: show username and password.
Ipywidgets has a way to make widgets visible or not:
https://github.com/jupyter-incubator/sparkmagic/blob/master/autovizwidget/autovizwidget/widget/encodingwidget.py#L126
@aggFTW Can you help me understand how to use that API? Any examples are appreciated.
Line number changed: https://github.com/jupyter-incubator/sparkmagic/blob/master/autovizwidget/autovizwidget/widget/encodingwidget.py#L134
Basically, you can take any widget and change its `.layout.display` property to `flex` or `none` to show/hide it. You would have to create a callback that is called whenever the value changes and knows to show/hide the appropriate widgets. Here, we add a callback, for example:
y_column_view = self.ipywidget_factory.get_dropdown(options=options_y_view,
description="Y", value=self.encoding.y)
y_column_view.on_trait_change(self._y_changed_callback, 'value')
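The show/hide pattern described above could be sketched as follows. The widget classes here are plain stand-ins rather than real ipywidgets, so the snippet is self-contained, and all names and constant values are illustrative:

```python
# Stand-in objects mimicking the small slice of the ipywidgets API the
# pattern relies on: a widget whose .layout.display can be 'flex' (shown)
# or 'none' (hidden).
class _Layout(object):
    def __init__(self):
        self.display = "flex"

class _Widget(object):
    def __init__(self):
        self.layout = _Layout()

# Hypothetical constant values; the real ones live in sparkmagic's constants module.
NO_AUTH, AUTH_BASIC, AUTH_KERBEROS = "None", "Basic", "Kerberos"

class AddEndpointWidgetSketch(object):
    def __init__(self):
        self.user_widget = _Widget()
        self.password_widget = _Widget()

    def _auth_changed_callback(self, new_value):
        # A callback like this would be wired up via
        # dropdown.on_trait_change(self._auth_changed_callback, 'value').
        # Show credentials only for basic auth; hide them otherwise.
        display = "flex" if new_value == AUTH_BASIC else "none"
        self.user_widget.layout.display = display
        self.password_widget.layout.display = display

w = AddEndpointWidgetSketch()
w._auth_changed_callback(AUTH_KERBEROS)  # hides username/password
```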
Ok thanks. Let me check and update the PR.
```diff
@@ -349,23 +349,25 @@ def _do_not_call_change_language(self, line, cell="", local_ns=None):
     @argument("-u", "--username", type=str, help="Username to use.")
     @argument("-p", "--password", type=str, help="Password to use.")
     @argument("-s", "--server", type=str, help="Url of server to use.")
+    @argument("-t", "--auth_type", type=str, help="Auth type for authentication")
```
This change will break https://github.com/jupyter-incubator/sparkmagic/blob/ed232d76f90ae9032d920efca21c90d1abcfc4d6/sparkmagic/sparkmagic/serverextension/handlers.py#L55:

```python
code = '%{} -s {} -u {} -p {}'.format(KernelMagics._do_not_call_change_endpoint.__name__, endpoint, username, password)
```

Please update the handler too so that it receives an auth parameter and passes it back.
```diff
+        if self._endpoint.auth_type == constants.AUTH_KERBEROS:
+            self._auth = HTTPKerberosAuth(mutual_authentication=REQUIRED)
+        elif self._endpoint.auth_type == constants.AUTH_BASIC:
+            self._auth = (self._endpoint.username, self._endpoint.password)
```
Please add a check for non-valid auth types and throw an error for them. As it is, non-valid ones would just silently fail.
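The requested validation could be sketched like this; the constant values are illustrative, and HTTPKerberosAuth is replaced by a placeholder string so the snippet runs without requests_kerberos installed:

```python
# Hedged sketch of the reviewer's request: pick the auth object by type and
# fail loudly on anything unrecognized instead of silently sending
# unauthenticated requests. Names and constant values are illustrative.
AUTH_KERBEROS, AUTH_BASIC, NO_AUTH = "Kerberos", "Basic_Access", "None"

def resolve_auth(auth_type, username="", password=""):
    if auth_type == AUTH_KERBEROS:
        # In the real client this would be:
        # HTTPKerberosAuth(mutual_authentication=REQUIRED)
        return "kerberos-auth-placeholder"
    if auth_type == AUTH_BASIC:
        # requests accepts a (user, password) tuple for basic auth.
        return (username, password)
    if auth_type == NO_AUTH:
        return None
    # Invalid auth types raise instead of silently failing.
    raise ValueError("Unsupported auth type: {}".format(auth_type))
```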
```diff
@@ -112,7 +114,7 @@ def spark(self, line, cell="", local_ns=None):
     # info
```
We need to update the documentation in the comment above to add the `-t` parameter for the `add` subcommand.
```diff
@@ -63,7 +64,7 @@ def base64_kernel_python3_credentials():

 @_with_override
 def kernel_scala_credentials():
-    return {u'username': u'', u'base64_password': u'', u'url': u'http://localhost:8998'}
+    return {u'username': u'', u'base64_password': u'', u'url': u'http://localhost:8998', u'auth_type': constants.NO_AUTH}
```
Ditto
We need the same for the other kernels too:
- kernel_r_credentials
- kernel_python3_credentials
```diff
@@ -202,7 +203,7 @@ def _credentials_override(f):
     If 'base64_password' is not set, it will fallback to 'password' in config.
     """
     credentials = f()
```
This is where you could see if auth_type is there to begin with or not. Please also test it with an actual file.
```diff
 class Endpoint(object):
-    def __init__(self, url, username="", password=""):
+    def __init__(self, url, auth_type=conf.default_livy_endpoint_auth_type(), username="", password=""):
```
I don't think we need a default value here, do we? If we don't need it, then we could get rid of the `default_livy_endpoint_auth_type` setting altogether.
```diff
+@_with_override
+def default_livy_endpoint_auth_type():
```
I think this is not needed.
Let me look into this tomorrow or this weekend.
Thanks,
Joy
On May 11, 2017, at 8:31 PM, Alejandro Guerrero Gonzalez wrote:
@aggFTW requested changes on this pull request.
In sparkmagic/sparkmagic/utils/configuration.py:
> @@ -44,7 +45,7 @@ def session_configs():
@_with_override
def kernel_python_credentials():
- return {u'username': u'', u'base64_password': u'', u'url': u'http://localhost:8998'}
+ return {u'username': u'', u'base64_password': u'', u'url': u'http://localhost:8998', u'auth_type': constants.NO_AUTH}
For this to be back-compatible with old config.json files, we need to move the logic of `constants.NO_AUTH if username == '' and password == '' else constants.AUTH_BASIC` into configuration.py.
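A minimal sketch of that back-compat rule, assuming illustrative constant values and a hypothetical helper name (the real logic would live in configuration.py):

```python
# Hedged sketch of the quoted backwards-compatibility rule: old config.json
# files carry no auth key, so infer it from username/password. Key names and
# constant values are illustrative assumptions.
NO_AUTH, AUTH_BASIC = "None", "Basic_Access"

def with_inferred_auth(credentials):
    creds = dict(credentials)
    if creds.get("auth_type") is None:
        # No auth specified: empty credentials mean no auth, anything else
        # falls back to basic auth for compatibility with old config files.
        if creds.get("username", "") == "" and creds.get("password", "") == "":
            creds["auth_type"] = NO_AUTH
        else:
            creds["auth_type"] = AUTH_BASIC
    return creds
```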
In sparkmagic/sparkmagic/tests/test_configuration.py:
>
@with_setup(_setup)
def test_configuration_override_fallback_to_password():
- kpc = { 'username': 'U', 'password': 'P', 'url': 'L' }
+ kpc = { 'username': 'U', 'password': 'P', 'url': 'L', 'auth_type': None }
We need some extra tests:
- when auth_type is not there at all in the config file, conf.base64_kernel_python_credentials (and all other languages) should return auth_type as constants.NO_AUTH if the username and password are the empty string
- when auth_type is not there at all in the config file, conf.base64_kernel_python_credentials (and all other languages) should return auth_type as constants.AUTH_BASIC if the username or password are not the empty string
This will ensure backwards compatibility with existing config.json files.
In sparkmagic/sparkmagic/tests/test_reliablehttpclient.py:
> import requests
+from pandas.util.testing import assertIsInstance
Please use assert isinstance() instead: https://docs.python.org/2/library/functions.html#isinstance since you are not checking Pandas dataframes here.
In sparkmagic/sparkmagic/tests/test_reliablehttpclient.py:
> @@ -207,3 +210,18 @@ def test_will_retry_error_no():
retry_policy.should_retry.assert_called_once_with(None, True, 0)
+ @with_setup(_setup, _teardown)
+ def test_basic_auth_check_auth():
We need tests for NO_AUTH and for non-valid auth methods.
In sparkmagic/sparkmagic/kernels/kernelmagics.py:
> @@ -349,23 +349,25 @@ def _do_not_call_change_language(self, line, cell="", local_ns=None):
@argument("-u", "--username", type=str, help="Username to use.")
@argument("-p", "--password", type=str, help="Password to use.")
@argument("-s", "--server", type=str, help="Url of server to use.")
+ @argument("-t", "--auth_type", type=str, help="Auth type for authentication")
Let's change the long name to --auth to be consistent with the other magics.
Thanks @joychak. I am mostly interested in being able to support your use case presented here: https://spark-summit.org/east-2017/events/secured-kerberos-based-spark-notebook-for-data-science/. I think we could simply add a new authentication method (e.g. PROXY_USER_AUTH or some similar name), and follow the entire Endpoint/HttpClient pattern here in the future.
Thanks for the comments @aggFTW. Let me look into them and update the PR this weekend.
We need to make the authentication type and endpoints configurable from config.json so that there is no need to re-enter them each time. It can be solved as part of this issue, or I can open a new issue as an enhancement and contribute to it.
I have a general question (after reviewing the pull request) about this approach to the Kerberized mechanism for the following 2 scenarios.

Scenario one: When Sparkmagic is being used inside JupyterHub, Kerberos authentication happens at JupyterHub (userid/passwd are authenticated at JupyterHub before opening any notebook) and a token (not any user-id/passwd) is normally passed to Sparkmagic for further authentication.

Scenario two: Kerberos authentication can also be done using a keytab file, which is very common in corporate environments. When you get authenticated using a keytab file (via "kinit"), you just need to pass the Kerberos principal name to the HTTPKerberosAuth object inside "reliablehttpclient.py" at the "_send_request_helper" method. The HTTPKerberosAuth object then picks up the Kerberos ticket from the KRB cache file for the given principal.

The above 2 scenarios don't need any mechanism within the Sparkmagic (notebook) context to accept userId/passwd. We need to make sure that we cover these scenarios, which is missing currently, because the changes in reliablehttpclient.py can only work with userId/passwd-based Kerberos authentication, but not keytab- or KRB-ticket-based Kerberos authentication.
```diff
@@ -11,6 +11,7 @@
 from .constants import HOME_PATH, CONFIG_FILE, MAGICS_LOGGER_NAME, LIVY_KIND_PARAM
 from .utils import get_livy_kind
+from sparkmagic.livyclientlib.exceptions import BadUserConfigurationException
```
We also need to modify the example_config.json file.
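For illustration, the new entry in example_config.json might look like this; the key name and value string are assumptions based on the diffs in this thread, not confirmed file contents:

```json
{
    "kernel_python_credentials": {
        "username": "",
        "password": "",
        "url": "http://localhost:8998",
        "auth_type": "None"
    }
}
```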
@pranayhasan, regarding making the authentication type and endpoints configurable from config.json: I believe this PR already addresses it. Look at the configuration.py file. Do you have something else in mind?
@joychak Thanks for reviewing! Those scenarios make sense and are needed. Is there anything on this PR that would prevent those scenarios from being implemented in a later PR? I'm thinking we could have a 4th and 5th authentication type (e.g. KERBEROS_TOKEN and KERBEROS_PRINCIPAL), and follow the entire Endpoint/HttpClient pattern here in the future. It seems you have implemented at least one of these. Would you be able to submit a PR after this one to add this support, or is this design restrictive?
Adding KERBEROS_TOKEN and KERBEROS_PRINCIPAL makes sense. I would be able to submit the KERBEROS_PRINCIPAL one immediately after this PR, and am currently working on KERBEROS_TOKEN. In the case of KERBEROS_TOKEN, the notebook UI should have the provision to provide the keytab file path.
Thanks @joychak! That would be wonderful. For the path, could we use a config-based approach?
+1 => config-based approach
@aggFTW In a corporate environment where all users need to use Kerberos by default, we don't want to show users the authentication type. It should default to Kerberos, with no display of auth_type in the widget. So we should be able to handle that by specifying auth/auth_type in ~/.sparkmagic/config.json.
You want to add a config setting to not display the authentication type with the regular Python kernel widget magics, and instead use the default from config. I think this could be a separate PR altogether. @pranayhasan would you mind sending that out if @praveenkanamarlapudi does not tackle it?
@aggFTW Yes, sure.
@praveenkanamarlapudi, any updates?
@aggFTW Sorry, I was overloaded with other stuff. Let me update the PR by this weekend. Thanks!
Force-pushed fd4865c to be94edf.
```diff
 import requests
+from pandas.util.testing import assertIsInstance
```
You are still using `assertIsInstance`. Previous comment:
> Please use `assert isinstance()` instead: https://docs.python.org/2/library/functions.html#isinstance since you are not checking Pandas dataframes here.
```diff
@@ -3,10 +3,13 @@
 import json
 from time import sleep
 import requests
+from requests_kerberos import HTTPKerberosAuth, REQUIRED
+import subprocess
```
Remove subprocess import
```diff
@@ -202,7 +203,7 @@ def _credentials_override(f):
     If 'base64_password' is not set, it will fallback to 'password' in config.
     """
     credentials = f()
-    base64_decoded_credentials = {k: credentials.get(k) for k in ('username', 'password', 'url')}
+    base64_decoded_credentials = {k: credentials.get(k) for k in ('username', 'password', 'url', 'auth_type')}
```
What happens if `auth_type` isn't there? Wouldn't this throw?
```diff
@@ -62,7 +64,7 @@ def setUp(self):

     # Mock request
     self.request = MagicMock()
-    self.request.body = json.dumps({"path": self.path, "username": self.username, "password": self.password, "endpoint": self.endpoint})
+    self.request.body = json.dumps({"path": self.path, "username": self.username, "password": self.password, "endpoint": self.endpoint, "auth_type": self.auth_type})
```
There are no tests here for when `auth` is not in the payload at all. There should be two tests for that case:
- `auth` is not on the payload, but due to empty username and password, auth should default to None.
- `auth` is not on the payload, but due to non-empty username and password, auth should default to Basic Auth.
@praveenkanamarlapudi Please look at all comments.
Force-pushed eb69f63 to 0ee14df.
@aggFTW Updated the code. Sorry, we had differences with our branch, so we missed pushing a few changes earlier. I will update the README soon.
```python
            self.ipywidget_factory.get_html(value="<br/>", width=widget_width), self.submit_widget]

        for child in self.children:
            child.parent_widget = self

        self._show_correct_endpoint_fields()
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling this initially to display the right elements per the auth type.
```python
    def refresh_configuration(self):
        credentials = getattr(conf, 'base64_kernel_' + self.language + '_credentials')()
        (username, password, url) = (credentials['username'], credentials['password'], credentials['url'])
        self.endpoint = Endpoint(url, username, password)
        if hasattr(credentials, 'auth'):
```
credentials is a dictionary, so the check would be `if 'auth' in credentials`.
This check shouldn't be here. When you call conf.base64_kernel_X_credentials(), the contract is that auth is always returned. So this code that sets auth when it's not specified by the user needs to be in configuration.py.
Please add unit tests for this code in test_configuration.py. This back compat is extremely important to get right.
```python
        elif self._endpoint.auth == constants.AUTH_BASIC:
            self._auth = (self._endpoint.username, self._endpoint.password)
        elif self._endpoint.auth != constants.NO_AUTH:
            raise ValueError(u"Unsupported auth." + self._endpoint.auth)
```
This is still a ValueError, when I asked to change it to the bad-config exception at https://github.com/jupyter-incubator/sparkmagic/blob/master/sparkmagic/sparkmagic/livyclientlib/exceptions.py#L41
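A hedged sketch of that change; the exception class below is a local stand-in so the snippet is self-contained (the real one lives in sparkmagic's livyclientlib.exceptions), and constant values are illustrative:

```python
# Stand-in for sparkmagic's configuration exception type; defined locally so
# the sketch runs on its own.
class BadUserConfigurationException(Exception):
    pass

# Hypothetical constant values; the real ones live in the constants module.
NO_AUTH, AUTH_BASIC, AUTH_KERBEROS = "None", "Basic_Access", "Kerberos"

def check_auth(auth):
    # Raise the project's configuration exception rather than a bare
    # ValueError when the configured auth type is unknown.
    if auth not in (NO_AUTH, AUTH_BASIC, AUTH_KERBEROS):
        raise BadUserConfigurationException(u"Unsupported auth type: %s" % auth)
```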
```python
        self.endpoints[self.address_widget.value] = endpoint
        self.ipython_display.writeln("Added endpoint {}".format(self.address_widget.value))

        # We need to call the refresh method because the drop down in Tab 2 for endpoints wouldn't refresh with the new
        # value otherwise.
        self.refresh_method()

    def _show_correct_endpoint_fields(self):
```
Did you try this signature on trait change? Did it work? I would have expected it to throw and for the fields not to be updated if the signature of the method were wrong.
I tested it and works fine.
Please review my comments again. Let's be thorough about reviewing comments and addressing them.
Force-pushed 0ee14df to a72c81b.
README.md
Outdated
```diff
@@ -107,6 +108,9 @@ However, there are some important limitations to note:
 2. Since all code is run on a remote driver through Livy, all structured data must be serialized to JSON and parsed by the Sparkmagic library so that it can be manipulated and visualized on the client side.
 In practice this means that you must use Python for client-side data manipulation in `%%local` mode.

+## Kerberos support
+Sparkmagic uses [requests-kerberos](https://github.com/requests/requests-kerberos) package for kerberos support. The sparkmagic expects kerberos ticket to be avaibale in the system. It picks up the kerbros ticket from cache file. User needs to run kinit to create the kerberos ticket. The sparkmagic doesn't support passing kerberos principal/token.
```
Can we please change this paragraph to:
## Authentication Methods
Sparkmagic supports:
* No auth
* Basic authentication
* Kerberos
Kerberos support is implemented via the [requests-kerberos](https://github.com/requests/requests-kerberos) package. Sparkmagic expects a kerberos ticket to be available in the system. Requests-kerberos will pick up the kerberos ticket from a cache file. For the ticket to be available, the user needs to have run [kinit](https://web.mit.edu/kerberos/krb5-1.12/doc/user/user_commands/kinit.html) to create the kerberos ticket.
Currently, sparkmagic does not support passing a kerberos principal/token, but we welcome pull requests.
Sure.
```python
    conf.override(conf.status_sleep_seconds.__name__, 1)
    assert_equals(conf.d, { conf.kernel_python_credentials.__name__: kpc,
                            conf.status_sleep_seconds.__name__: 1 })
    assert_equals(conf.status_sleep_seconds(), 1)
```
Great to see these tests! 😄 Thanks for adding them!
Why are you overriding `status_sleep_seconds` here?
Force-pushed from a223795 to 0ef38e6
kpc = { 'username': '', 'password': '', 'url': 'L'}
overrides = { conf.kernel_python_credentials.__name__: kpc }
conf.override_all(overrides)
conf.override(conf.status_sleep_seconds.__name__, 1)
Same, why override `status_sleep_seconds` at all?
Let me correct it.
I thought it would be an extra check. I can remove it if it's not needed.
Yes, I think the extra check is not needed. Thank you!
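The override semantics being tested here can be sketched with a simplified stand-in. This is not sparkmagic's actual `conf` module, just an illustration of the pattern: `override_all` resets the override dict, `override` merges a single key, and reads fall back to hard-coded defaults.

```python
# Minimal stand-in for an overridable config store (names are
# illustrative, not sparkmagic's real implementation).
_defaults = {"status_sleep_seconds": 2, "kernel_python_credentials": None}
d = {}

def override_all(overrides):
    # Replace all overrides at once.
    global d
    d = dict(overrides)

def override(name, value):
    # Override a single setting.
    d[name] = value

def status_sleep_seconds():
    # Read the setting, falling back to the default.
    return d.get("status_sleep_seconds", _defaults["status_sleep_seconds"])

kpc = {"username": "", "password": "", "url": "http://example:8998"}
override_all({"kernel_python_credentials": kpc})
override("status_sleep_seconds", 1)
assert d == {"kernel_python_credentials": kpc, "status_sleep_seconds": 1}
assert status_sleep_seconds() == 1
```

Once `override` has already placed `status_sleep_seconds` in `d`, asserting its value again is indeed redundant, which is the point being made in the review.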
@@ -62,7 +64,7 @@ def setUp(self):

# Mock request
self.request = MagicMock()
self.request.body = json.dumps({"path": self.path, "username": self.username, "password": self.password, "endpoint": self.endpoint})
Please add two tests where you don't send the `auth` parameter at all and still get no auth and basic auth as values for `auth`.
Ok
Done! 👍
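The requested backward-compatibility tests can be sketched roughly as follows. The helper and the `"None"`/`"Basic_Access"` constant strings are assumptions for illustration, not sparkmagic's exact API:

```python
import json
from unittest.mock import MagicMock

def resolve_auth(body):
    # Hypothetical helper mirroring the review request: if the request
    # body omits "auth", infer it from the credentials -- blank
    # username/password means no auth, otherwise basic auth.
    data = json.loads(body)
    if data.get("auth") is not None:
        return data["auth"]
    if data.get("username", "") == "" and data.get("password", "") == "":
        return "None"
    return "Basic_Access"

# Old-style request with credentials but no "auth" key -> basic auth.
request = MagicMock()
request.body = json.dumps({"path": "nb.ipynb", "username": "alice",
                           "password": "secret", "endpoint": "http://host:8998"})
assert resolve_auth(request.body) == "Basic_Access"

# Old-style request with blank credentials -> no auth.
request.body = json.dumps({"path": "nb.ipynb", "username": "",
                           "password": "", "endpoint": "http://host:8998"})
assert resolve_auth(request.body) == "None"
```

This keeps clients that predate the `auth` parameter working without any change on their side.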
@@ -212,9 +212,10 @@ def _credentials_override(f):
exception_type, exception, traceback = sys.exc_info()
msg = "base64_password for %s contains invalid base64 string: %s %s" % (f.__name__, exception_type, exception)
raise BadUserConfigurationException(msg)
if base64_decoded_credentials['auth'] is None:
Awesome! 👍
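The hunk above defaults `auth` after decoding the base64 credentials blob. A self-contained sketch of that flow (function name, default constants, and the plain `ValueError` standing in for `BadUserConfigurationException` are all assumptions):

```python
import base64
import json

def decode_credentials(blob):
    # Decode a base64-encoded JSON credentials blob; raise a clear
    # error on invalid base64, mirroring the exception path shown in
    # the diff above.
    try:
        raw = base64.b64decode(blob, validate=True)
    except Exception as exc:
        raise ValueError("base64_password contains invalid base64 string: %s" % exc)
    creds = json.loads(raw.decode("utf-8"))
    if creds.get("auth") is None:
        # Backward compatibility: infer the auth mode when the older
        # blob format carried no "auth" key (constants assumed).
        if creds.get("username") and creds.get("password"):
            creds["auth"] = "Basic_Access"
        else:
            creds["auth"] = "None"
    return creds

blob = base64.b64encode(json.dumps(
    {"username": "alice", "password": "secret"}).encode("utf-8"))
assert decode_credentials(blob)["auth"] == "Basic_Access"
```

Checking for `None` rather than key presence also covers blobs that explicitly set `"auth": null`.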
Ready for merge! 👍 Thanks so much @praveenkanamarlapudi. I hope a lot of people are going to find this very useful! @joychak, we would love to see those code reviews for other authentication methods 😃
* Make location of config.json file configurable using environment variables (#350)
* Make location of config.json file configurable using environment variables
* Update minor version to 0.11.3
* Fix column drop issue when first row has missing value (#353)
* Remove extra line
* initial fix of dropping columns
* add unit tests
* revert sql query test change
* revert sql query test change 2
* bump versions
* move outside if
* Adding a working Docker setup for developing sparkmagic (#361)
* Adding a working Docker setup for developing sparkmagic It includes the Jupyter notebook as well as the Livy+Spark endpoint. Documentation is in the README
* Pre-configure the ~/.sparkmagic/config.json Now you can just launch a PySpark wrapper kernel and have it work out of the box.
* Add R to Livy container Also added an R section to example_config.json to make it work out of the box - and I think it's just a good thing to have it anyway, otherwise how would users ever know it was meant to be there?
* Add more detail to the README container section
* Add dev_mode build-arg. Disabled by default. When enabled, builds the container using your local copy of sparkmagic, so that you can test your development changes inside the container.
* Adding missing kernels Was missing Scala and Python2. Confirmed that Python2 and Python3 are indeed separate environments on the spark container.
* Kerberos authentication support (#355)
* Enabled kerberos authentication on sparkmagic and updated test cases.
* Enabled hide and show username/password based on auth_type.
* Updated as per comments.
* Updated documentation for kerberos support
* Added test cases to test backward compatibility of auth in handlers
* Update README.md Change layout and add build status
* Bump version to 0.12.0 (#365)
* Remove extra line
* bump version
* Optional coerce (#367)
* Remove extra line
* added optional configuration to have optional coercion
* fix circular dependency between conf and utils
* add gcc installation for dev build
* fix parsing bug for coerce value
* fix parsing bug for coerce value 2
* Automatically configure wrapper-kernel endpoints in widget (#362)
* Add pre-configured endpoints to endpoint widget automatically
* Fix crash on partially-defined kernel configurations
* Use LANGS_SUPPORTED constant to get list of possible kernel config sections
* Rename is_default attr to implicitly_added
* Adding blank line between imports and class declaration
* Log failure to connect to implicitly-defined endpoints
* Adding comment explaining implicitly_added
* Pass auth parameter through
* Fix hash and auth to include auth parameter (#370)
* Fix hash and auth to include auth parameter
* fix endpoint validation
* remove unecessary commit
* Ability to add custom headers to HTTP calls (#371)
* Abiulity to add custom headers to rest call
* Fix import
* Ad basic conf test
* Fix tests
* Add test
* Fix tests
* Fix indent
* Addres review comments
* Add custom headers to example config
* Remove extra line
* Release 0.12.0 (#373)
* Make location of config.json file configurable using environment variables (#350)
* Make location of config.json file configurable using environment variables
* Update minor version to 0.11.3
* Fix column drop issue when first row has missing value (#353)
* Remove extra line
* initial fix of dropping columns
* add unit tests
* revert sql query test change
* revert sql query test change 2
* bump versions
* move outside if
* Adding a working Docker setup for developing sparkmagic (#361)
* Adding a working Docker setup for developing sparkmagic It includes the Jupyter notebook as well as the Livy+Spark endpoint. Documentation is in the README
* Pre-configure the ~/.sparkmagic/config.json Now you can just launch a PySpark wrapper kernel and have it work out of the box.
* Add R to Livy container Also added an R section to example_config.json to make it work out of the box - and I think it's just a good thing to have it anyway, otherwise how would users ever know it was meant to be there?
* Add more detail to the README container section
* Add dev_mode build-arg. Disabled by default. When enabled, builds the container using your local copy of sparkmagic, so that you can test your development changes inside the container.
* Adding missing kernels Was missing Scala and Python2. Confirmed that Python2 and Python3 are indeed separate environments on the spark container.
* Kerberos authentication support (#355)
* Enabled kerberos authentication on sparkmagic and updated test cases.
* Enabled hide and show username/password based on auth_type.
* Updated as per comments.
* Updated documentation for kerberos support
* Added test cases to test backward compatibility of auth in handlers
* Update README.md Change layout and add build status
* Bump version to 0.12.0 (#365)
* Remove extra line
* bump version
* Optional coerce (#367)
* Remove extra line
* added optional configuration to have optional coercion
* fix circular dependency between conf and utils
* add gcc installation for dev build
* fix parsing bug for coerce value
* fix parsing bug for coerce value 2
* Automatically configure wrapper-kernel endpoints in widget (#362)
* Add pre-configured endpoints to endpoint widget automatically
* Fix crash on partially-defined kernel configurations
* Use LANGS_SUPPORTED constant to get list of possible kernel config sections
* Rename is_default attr to implicitly_added
* Adding blank line between imports and class declaration
* Log failure to connect to implicitly-defined endpoints
* Adding comment explaining implicitly_added
* Pass auth parameter through
* Fix hash and auth to include auth parameter (#370)
* Fix hash and auth to include auth parameter
* fix endpoint validation
* remove unecessary commit
* Ability to add custom headers to HTTP calls (#371)
* Abiulity to add custom headers to rest call
* Fix import
* Ad basic conf test
* Fix tests
* Add test
* Fix tests
* Fix indent
* Addres review comments
* Add custom headers to example config
* bumping versions
I am trying to get Kerberos authentication working between Livy 0.3 (on a Linux edge node of a cluster) and sparkmagic 0.12.5 as part of a Jupyter notebook running on my Windows box. On the Windows side I am happy to kinit (I don't see the username/password fields in the UI when I select Kerberos as the auth type). Can you please confirm whether the versions mentioned will allow Kerberos authentication between Jupyter and Livy?
* Release 0.12.0 (jupyter-incubator#373)
* Make location of config.json file configurable using environment variables (jupyter-incubator#350)
* Make location of config.json file configurable using environment variables
* Update minor version to 0.11.3
* Fix column drop issue when first row has missing value (jupyter-incubator#353)
* Remove extra line
* initial fix of dropping columns
* add unit tests
* revert sql query test change
* revert sql query test change 2
* bump versions
* move outside if
* Adding a working Docker setup for developing sparkmagic (jupyter-incubator#361)
* Adding a working Docker setup for developing sparkmagic It includes the Jupyter notebook as well as the Livy+Spark endpoint. Documentation is in the README
* Pre-configure the ~/.sparkmagic/config.json Now you can just launch a PySpark wrapper kernel and have it work out of the box.
* Add R to Livy container Also added an R section to example_config.json to make it work out of the box - and I think it's just a good thing to have it anyway, otherwise how would users ever know it was meant to be there?
* Add more detail to the README container section
* Add dev_mode build-arg. Disabled by default. When enabled, builds the container using your local copy of sparkmagic, so that you can test your development changes inside the container.
* Adding missing kernels Was missing Scala and Python2. Confirmed that Python2 and Python3 are indeed separate environments on the spark container.
* Kerberos authentication support (jupyter-incubator#355)
* Enabled kerberos authentication on sparkmagic and updated test cases.
* Enabled hide and show username/password based on auth_type.
* Updated as per comments.
* Updated documentation for kerberos support
* Added test cases to test backward compatibility of auth in handlers
* Update README.md Change layout and add build status
* Bump version to 0.12.0 (jupyter-incubator#365)
* Remove extra line
* bump version
* Optional coerce (jupyter-incubator#367)
* Remove extra line
* added optional configuration to have optional coercion
* fix circular dependency between conf and utils
* add gcc installation for dev build
* fix parsing bug for coerce value
* fix parsing bug for coerce value 2
* Automatically configure wrapper-kernel endpoints in widget (jupyter-incubator#362)
* Add pre-configured endpoints to endpoint widget automatically
* Fix crash on partially-defined kernel configurations
* Use LANGS_SUPPORTED constant to get list of possible kernel config sections
* Rename is_default attr to implicitly_added
* Adding blank line between imports and class declaration
* Log failure to connect to implicitly-defined endpoints
* Adding comment explaining implicitly_added
* Pass auth parameter through
* Fix hash and auth to include auth parameter (jupyter-incubator#370)
* Fix hash and auth to include auth parameter
* fix endpoint validation
* remove unecessary commit
* Ability to add custom headers to HTTP calls (jupyter-incubator#371)
* Abiulity to add custom headers to rest call
* Fix import
* Ad basic conf test
* Fix tests
* Add test
* Fix tests
* Fix indent
* Addres review comments
* Add custom headers to example config
* Merge master to release (jupyter-incubator#390)
* Configurable retry for errors (jupyter-incubator#378)
* Remove extra line
* bumping versions
* configurable retry
* fix string
* Make statement and session waiting more responsive (jupyter-incubator#379)
* Remove extra line
* bumping versions
* make sleeping for sessions an exponential backoff
* fix bug
* Add vscode tasks (jupyter-incubator#383)
* Remove extra line
* bumping versions
* add vscode tasks
* Fix endpoints widget when deleting a session (jupyter-incubator#389)
* Remove extra line
* bumping versions
* add vscode tasks
* fix deleting from endpoint widget, add notebooks to docker file, refresh correctly, populate endpoints correctly
* fix tests
* add unit tests
* refresh after cleanup
* Merge master to release (jupyter-incubator#392)
* Configurable retry for errors (jupyter-incubator#378)
* Remove extra line
* bumping versions
* configurable retry
* fix string
* Make statement and session waiting more responsive (jupyter-incubator#379)
* Remove extra line
* bumping versions
* make sleeping for sessions an exponential backoff
* fix bug
* Add vscode tasks (jupyter-incubator#383)
* Remove extra line
* bumping versions
* add vscode tasks
* Fix endpoints widget when deleting a session (jupyter-incubator#389)
* Remove extra line
* bumping versions
* add vscode tasks
* fix deleting from endpoint widget, add notebooks to docker file, refresh correctly, populate endpoints correctly
* fix tests
* add unit tests
* refresh after cleanup
* Try to fix pypi repos (jupyter-incubator#391)
* Remove extra line
* bumping versions
* add vscode tasks
* try to fix pypi new repos
* Merge master to release (jupyter-incubator#394)
* Configurable retry for errors (jupyter-incubator#378)
* Remove extra line
* bumping versions
* configurable retry
* fix string
* Make statement and session waiting more responsive (jupyter-incubator#379)
* Remove extra line
* bumping versions
* make sleeping for sessions an exponential backoff
* fix bug
* Add vscode tasks (jupyter-incubator#383)
* Remove extra line
* bumping versions
* add vscode tasks
* Fix endpoints widget when deleting a session (jupyter-incubator#389)
* Remove extra line
* bumping versions
* add vscode tasks
* fix deleting from endpoint widget, add notebooks to docker file, refresh correctly, populate endpoints correctly
* fix tests
* add unit tests
* refresh after cleanup
* Try to fix pypi repos (jupyter-incubator#391)
* Remove extra line
* bumping versions
* add vscode tasks
* try to fix pypi new repos
* Test 2.7.13 environment for pypi push to prod (jupyter-incubator#393)
* Remove extra line
* bumping versions
* add vscode tasks
* try to fix pypi new repos
* try to fix pip push for prod pypi by pinning to later version of python
* bump versions (jupyter-incubator#395)
* Release v0.12.6 (jupyter-incubator#481)
* Add python3 option in %manage_spark magic (jupyter-incubator#427) Fixes jupyter-incubator#420
* Links fixed in README
* DataError in Pandas moved from core.groupby to core.base (jupyter-incubator#459)
* DataError in Pandas moved from core.groupby to core.base
* maintain backwards compatability with Pandas 0.22 or lower for DataError
* Bump autoviz version to 0.12.6
* Fix unit test failure caused by un-spec'ed mock which fails traitlet validation (jupyter-incubator#480)
* Fix failing unit tests Caused by an un-spec'ed mock in a test which fails traitlet validation
* Bump travis.yml Python3 version to 3.6 Python 3.3 is not only EOL'ed but is now actively unsupported by Tornado, which causes the Travis build to fail again.
* Bumping version numbers for hdijupyterutils and sparkmagic to keep them in sync
* add magic for matplotlib display
* repair
* Patch SparkMagic for latest IPythonKernel compatibility
  **Description**
  * The IPython interface was updated to return an asyncio.Future rather than a dict from version 5.1.0. This broke SparkMagic as it still expects a dictionart from the output
  * This change updates the SparkMagic base kernel to expect a Future and block on its result.
  * This also updates the dependencies to call out the the new IPython version dependency.
  **Testing Done**
  * Unit tests added
  * Validating that the kernel connects successfully
  * Validating some basic Spark additional operations on an EMR cluster.
* Fix decode json error at trailing empty line (jupyter-incubator#483)
* Bump version number to 0.12.7
* add a screenshot of an example for display matplot picture
* Fix guaranteed stack trace
* Simplify loop a bit
* We want to be able to interrupt the sleep, so move that outside the try / except
* Add missing session status to session.
* Correct to correct URL with full list.
* Better tests.
* Switch to Livy 0.6.
* Sketch of removal of PYSPARK3.
* Don't allow selection of Python3, since it's not a separate thing now.
* __repr__ for easier debugging of test failures.
* Start fixing tests.
* Rip out more no-longer-relevant "Python 3" code. Python 3 and Python 2 work again.
* Changelog.
* Add progress bar to sparkmagic/sparkmagic/livyclientlib/command.py. Tested with livy 0.4-0.6, python2 and python3
* Support Future and non-Future results from ipykernel.
* News entry.
* Unpin ipykernel so it works with Python 2.
* Python 3.7 support.
* Also update requirements.txt.
* Xenial has 3.7.
* from IPython.display import display to silence travis warning
* Couple missing entries.
* Update versions.
* Document release process, as I understand it.
* Correct file name.
* delete obsolete pyspark3kernel (jupyter-incubator#549)
* delete obsolete pyspark3kernel
* Update README.md
* Update setup.py
* Update test_kernels.py
* Remove old kernelspec installation from Dockerfile This kernel was removed in jupyter-incubator#549 but the Dockerfile still tries to install it, which fails the build. This corrects that.
* Relax constraints even more, and make sure to relax them in duplicate locations.
* Don't assume some pre-populated tables, create a new table from the Docker image's examples.
* Note new feature.
* Additional dependencies for matplotlib to work.
* Add support and documentation for extension use, refactor kernel use.
* Example in pyspark kernel.
* Test for Command's decoding of images.
* Switch to plotly 3.
* Try to switch to standard mechanism Sparkmagic uses for displaying.
* Another entry.
* Add documentation for JupyterLab.
* Prepare for 0.12.9.
* Revert get_session_kind change to be more consistent with upstream repo.
* Remove redundant python3 session test.
* Remove python3 references in livysession.
Fix #282