Connectivity of PyIceberg to Hive Metastore on HPE #2892

@dev-sajal

Question

I am trying to connect PyIceberg to a Hive Metastore (HMS) on HPE, but the connection fails, while PySpark can connect to the same HMS on the same platform. We authenticate with a MapR keytab. Is this setup currently supported by PyIceberg?

Our debugging so far suggests that a Kerberized environment without HTTP (the default on HPE) isn't supported by thrift or PyIceberg.

from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "hive",
    **{
        "uri": config["THRIFT_SERVER"],
        "s3.endpoint": config["ENDPOINT_URL"],
        "s3.access-key-id": config["ACCESS_KEY_ID"],
        "s3.secret-access-key": config["SECRET_ACCESS_KEY"],
        "hive.kerberos-authentication": "true",
        "hive.metastore.sasl.enabled": "true",
        "hive.kerberos-service-name": "host"
    }
)
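
For reference while debugging: with Kerberos over SASL, the client normally requests a ticket for a service principal built from the configured service name and the metastore host, so the `hive.kerberos-service-name` value above has to match what the HMS is actually registered under. The sketch below only derives that `service/host` string from the two settings. The helper name and the example URI are hypothetical, and the `service/host` convention is an assumption about the usual GSSAPI behaviour, not something confirmed for this HPE setup.

```python
from urllib.parse import urlparse


def expected_service_principal(uri: str, service_name: str) -> str:
    """Return the 'service/host' name a GSSAPI client would typically target.

    Assumption: the principal follows the common service/hostname convention;
    the Kerberos realm is appended by the Kerberos library, not here.
    """
    parsed = urlparse(uri)
    if not parsed.hostname:
        raise ValueError(f"No host found in metastore URI: {uri!r}")
    return f"{service_name}/{parsed.hostname}"


# Hypothetical URI, used only for illustration.
print(expected_service_principal("thrift://hms.example.internal:9083", "host"))
```

Comparing this string with the principal the metastore runs as (for example, the one PySpark is configured with) may show whether the service name is the mismatch.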

Error log:

Traceback (most recent call last):
  File "/opt/hpe_venv/lib64/python3.11/site-packages/pyiceberg/catalog/hive.py", line 186, in __enter__
    self._transport.open()
  File "/opt/hpe_venv/lib64/python3.11/site-packages/thrift/transport/TTransport.py", line 397, in open
    raise TTransportException(
thrift.transport.TTransport.TTransportException: Bad SASL negotiation status: 3 (b'GSS initiate failed')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/hpe_venv/lib64/python3.11/site-packages/pyiceberg/catalog/hive.py", line 733, in list_tables
    with self._client as open_client:
  File "/opt/hpe_venv/lib64/python3.11/site-packages/pyiceberg/catalog/hive.py", line 190, in __enter__
    self._transport.open()
  File "/opt/hpe_venv/lib64/python3.11/site-packages/thrift/transport/TTransport.py", line 397, in open
    raise TTransportException(
thrift.transport.TTransport.TTransportException: Bad SASL negotiation status: 3 (b'GSS initiate failed')
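
`GSS initiate failed` is often reported when the client process has no usable Kerberos credentials, although nothing in this traceback confirms that as the cause here. A minimal sketch for ruling it out before calling `load_catalog`, assuming an MIT Kerberos `klist` binary is on the `PATH` (its `-s` flag runs silently and signals a valid credentials cache through the exit status). The function name is hypothetical, and a MapR environment may rely on a different ticket mechanism that this check does not cover.

```python
import subprocess


def has_valid_kerberos_ticket(klist_cmd: str = "klist") -> bool:
    """Return True if `klist -s` reports a valid credentials cache.

    Returns False when the ticket is missing or expired, or when the
    klist binary cannot be executed at all.
    """
    try:
        result = subprocess.run(
            [klist_cmd, "-s"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # Covers a missing binary (FileNotFoundError) or one that is not executable.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if not has_valid_kerberos_ticket():
        print("No valid Kerberos ticket found; obtain one from the keytab first.")
```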
