Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 69 PyHive and Transport mode - HTTP #135

Closed
wants to merge 2 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
27 changes: 22 additions & 5 deletions pyhive/hive.py
Expand Up @@ -24,7 +24,9 @@
import thrift.protocol.TBinaryProtocol
import thrift.transport.TSocket
import thrift.transport.TTransport
import thrift.transport.THttpClient
import thrift_sasl
import base64

# PEP 249 module globals
apilevel = '2.0'
Expand Down Expand Up @@ -69,7 +71,8 @@ class Connection(object):
"""Wraps a Thrift session"""

def __init__(self, host, port=10000, username=None, database='default', auth='NONE',
configuration=None, kerberos_service_name=None, password=None):
configuration=None, kerberos_service_name=None, password=None,
mode="NONE", http_path=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Can you change mode to transport_mode to better match the Hive setting and be more obvious?
  • Please add some docs:
    • transport_mode corresponds to hive.server2.transport.mode
    • http_path corresponds to hive.server2.thrift.http.path

"""Connect to HiveServer2

:param auth: The value of hive.server2.authentication used by HiveServer2
Expand All @@ -85,11 +88,16 @@ def __init__(self, host, port=10000, username=None, database='default', auth='NO
username = username or getpass.getuser()
configuration = configuration or {}

if (password is not None) != (auth == 'LDAP'):
raise ValueError("Password should be set if and only if in LDAP mode; "
"Remove password or add auth='LDAP'")
if (password is not None) != (auth == 'LDAP' or mode == 'HTTP'):
raise ValueError("password should be set if and only if in LDAP mode or HTTP mode")
if (kerberos_service_name is not None) != (auth == 'KERBEROS'):
raise ValueError("kerberos_service_name should be set if and only if in KERBEROS mode")
if (http_path is not None) and (mode != 'HTTP'):
raise ValueError("http_path should be set if and only if in HTTP mode")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Message is slightly wrong. Should be "http_path may be set only if in HTTP mode", since it's not mandatory to override in HTTP mode.

if (mode =='HTTP' and auth != 'NONE'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: unnecessary parens and space around equals

raise ValueError("http mode only works with None auth mode")

http_path = http_path or 'cliservice'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it valid for http_path to be an empty string?


if auth == 'NOSASL':
# NOSASL corresponds to hive.server2.authentication=NOSASL in hive-site.xml
Expand All @@ -116,7 +124,16 @@ def sasl_factory():
raise AssertionError
sasl_client.init()
return sasl_client
self._transport = thrift_sasl.TSaslClientTransport(sasl_factory, sasl_auth, socket)

if mode == "HTTP" and auth == "NONE":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason to re-check auth here? Also, there's no overlap with the SASL code, so this would be a bit cleaner in the outer if, like the if auth == 'NOSASL': branch.

nitpick: convention in this project is single quotes for code-like strings, double quotes for human language strings.

uri = "http://{}:{}/{}".format(host, port, http_path)
self._transport = thrift.transport.THttpClient.THttpClient(uri)
ap = "%s:%s" % (username,password)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do ap and cr stand for? I can't figure out if they're standard abbreviations or not.

cr = base64.b64encode(ap).strip()
self._transport.setCustomHeaders({"Authorization": "Basic "+cr})
else:
self._transport = thrift_sasl.TSaslClientTransport(sasl_factory, sasl_auth, socket)

else:
raise NotImplementedError(
"Only NONE, NOSASL, LDAP, KERBEROS "
Expand Down