BUG: Dependencies for Impala and Kerberos are insufficient in setup.py #2342

Closed
ZeroCool2u opened this issue Aug 19, 2020 · 4 comments
Labels
bug Incorrect behavior inside of ibis impala The Apache Impala backend

Comments

@ZeroCool2u
Contributor

After installing ibis with both the kerberos and impala extras, authentication to a kerberized cluster can still fail. I have confirmed this on Windows, but I'm not sure whether the issue is limited to Windows. To reproduce, create a new venv and run the following:

pip install ibis-framework[kerberos]
pip install ibis-framework[impala]

import ibis

hdfs_host = 'hdfs.corp.com'
impala_host = 'impala.corp.com'
# Both connections authenticate with Kerberos (GSSAPI).
hdfs = ibis.hdfs_connect(host=hdfs_host, auth_mechanism='GSSAPI', verify=False)
client = ibis.impala.connect(host=impala_host, hdfs_client=hdfs, auth_mechanism='GSSAPI', use_ssl=True, ca_cert=None)

This throws a ModuleNotFoundError that originates in the impyla library. Looking at impyla's setup.py, Kerberos auth is only available if impyla is installed with its kerberos extra, which installs thrift_sasl.
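
A quick way to confirm that a given environment is affected is to check whether the module can be found at all. This is only a sketch: `missing_modules` is a hypothetical helper name, and it assumes `thrift_sasl` is the importable module name that the failing import asks for.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # 'thrift_sasl' is assumed to be the module the failing import needs.
    missing = missing_modules(["thrift_sasl"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("thrift_sasl is importable")
```

Running this inside the fresh venv from the reproduction steps should show whether the two extras installed the module.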

The change is small, but there are a couple of ways to do it, each with trade-offs.

  1. Always install impyla with its kerberos extra, so we don't have to worry about using the correct version of thrift_sasl. This adds some bulk to the ibis impala install, though one could argue that most corporate Impala clusters use Kerberos auth anyway and the bulk is negligible.
  2. Make thrift_sasl part of the ibis kerberos extra. This risks the version requirements of the two projects conflicting, which creates a maintenance issue.
  3. Leave it as is, but add a note to the documentation, or catch the exception and tell the user they must install thrift_sasl. This doesn't seem ideal to me, but the option exists.
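
The three options can be sketched roughly as follows. This is not the actual ibis setup.py: the dependency lists are trimmed to the relevant entries, version specifiers are left out on purpose, and the names `OPTION_1_EXTRAS`, `OPTION_2_EXTRAS`, and `import_or_explain` are hypothetical, chosen only for illustration.

```python
import importlib

# Option 1: rely on impyla's own "kerberos" extra to pull in thrift_sasl.
OPTION_1_EXTRAS = {"impala": ["impyla[kerberos]"]}

# Option 2: list thrift_sasl directly under the ibis "kerberos" extra.
# Any version pin added here would have to track impyla's by hand.
OPTION_2_EXTRAS = {"impala": ["impyla"], "kerberos": ["thrift_sasl"]}


def import_or_explain(module_name, install_hint):
    """Option 3: import a module, re-raising a missing-module error with a hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so it is caught here.
        raise ImportError(
            f"{module_name!r} is required here; {install_hint}"
        ) from exc
```

With option 3, a call such as `import_or_explain("thrift_sasl", "install thrift_sasl to use Kerberos auth")` would replace the bare ModuleNotFoundError with a message telling the user what to install.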

I would advocate for option 1 as the downsides seem negligible, but there may be other problems that I have not identified yet, so I'd prefer to gather some feedback first.

datapythonista added the impala (The Apache Impala backend) and bug (Incorrect behavior inside of ibis) labels Aug 21, 2020
@datapythonista
Contributor

I don't understand the cons of option 2; it sounds like the best solution to me, but I may be missing your point. @jreback will have a more informed opinion than me on this.

@ZeroCool2u
Contributor Author

@datapythonista To clarify, impyla requires a specific version of thrift_sasl, and our version and impyla's could drift out of sync. Option 1 lets us rely on impyla to specify the correct version with no extra maintenance work. Of course, it's not a huge deal, just one less thing to think about.

@jreback
Contributor

jreback commented Aug 21, 2020

all about making this 0-maintenance; the impala backend doesn't have a lot of dev support

@ZeroCool2u
Contributor Author

Okay, sounds like option 2 is preferred then. I can create a PR to change the dependencies in setup.py, but I'm not as familiar with how conda does the equivalent. Does the change just need to be made to the environment.yml file?

ZeroCool2u added 7 commits to ZeroCool2u/ibis that referenced this issue, Sep 5 to Sep 14, 2020
@jreback jreback added this to the Next Bugfix Release milestone Sep 16, 2020