
Too large queries produce MaxRetryError #413

Closed
rth opened this issue Jul 11, 2024 · 5 comments · Fixed by #414

Comments

@rth

rth commented Jul 11, 2024

Previously, a too-large query returned 0 rows as output (as discussed in #383). With the changes in #405, it now produces a MaxRetryError, which is better, but the error message is misleading (and retrying so many times is also slow).

The minimal code I'm using is:

    import os

    from databricks import sql as databricks_sql

    db = databricks_sql.connect(
        server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"),
        http_path=os.getenv("DATABRICKS_HTTP_PATH"),
        access_token=os.getenv("DATABRICKS_TOKEN"),
        _tls_no_verify=True
    )
    cursor = db.cursor()
    cursor.execute("<my-query>")
    data = cursor.fetchall()

If the query is small, it works with no warnings.

If the query is too big, it produces the following MaxRetryError with a nested SSLError. Is there no way to detect a too-big query from the HTTP response status without retrying N times and hitting MaxRetryError? I also have the impression that _tls_no_verify is not passed through somewhere in this case, which produces those SSLErrors. cc @kravets-levko

urllib3.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "python3.10/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
  File "python3.10/site-packages/urllib3/connectionpool.py", line 873, in urlopen
    return self.urlopen(
  File "python3.10/site-packages/urllib3/connectionpool.py", line 873, in urlopen
    return self.urlopen(
  File "python3.10/site-packages/urllib3/connectionpool.py", line 873, in urlopen
    return self.urlopen(
  [Previous line repeated 2 more times]
  File "site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
  File "site-packages/urllib3/util/retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='xxxx.blob.core.windows.net', port=443): Max retries exceeded with url: /jobs/999999/sql/2024-07-11/14/results_2024-07-11T14:44:24Z_ef4c56f4-6fc6-43ca-b8dd-009eeb472cd4?sig=xxxx&se=2024-07-11T14%3A59%3A26Z&sv=2019-02-02&spr=https&sp=r&sr=b (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "fetch-databricks.py", line 13, in main
    cursor.execute("select * from sgg.site_cbs_coatify_view")
  File "databricks/sql/client.py", line 768, in execute
    execute_response = self.thrift_backend.execute_command(
  File "databricks/sql/thrift_backend.py", line 869, in execute_command
    return self._handle_execute_response(resp, cursor)
  File "databricks/sql/thrift_backend.py", line 966, in _handle_execute_response
    return self._results_message_to_execute_response(resp, final_operation_state)
  File "databricks/sql/thrift_backend.py", line 770, in _results_message_to_execute_response
    arrow_queue_opt = ResultSetQueueFactory.build_queue(
  File "databricks/sql/utils.py", line 84, in build_queue
    return CloudFetchQueue(
  File "databricks/sql/utils.py", line 175, in __init__
    self.table = self._create_next_table()
  File "databricks/sql/utils.py", line 238, in _create_next_table
    downloaded_file = self.download_manager.get_next_downloaded_file(
  File "databricks/sql/cloudfetch/download_manager.py", line 68, in get_next_downloaded_file
    file = task.result()
  File "python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "databricks/sql/cloudfetch/downloader.py", line 95, in run
    response = session.get(
  File "python3.10/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
  File "python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "python3.10/site-packages/requests/adapters.py", line 698, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='xxxxxx.blob.core.windows.net', port=443): Max retries exceeded with url: /jobs/123445/sql/2024-07-11/14/results_2024-07-11T14:44:24Z_ef4c56f4-6fc6-43ca-b8dd-009eeb472cd4?sig=xxxxxxxse=2024-07-11T14%3A59%3A26Z&sv=2019-02-02&spr=https&sp=r&sr=b (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)')))

Versions

requests                      2.32.3
urllib3                       2.2.2
databricks-sql-python  main
@kravets-levko
Contributor

@rth Can you please enable debug logging and share the log?

import logging
from databricks import sql

logging.basicConfig(level=logging.DEBUG)

# ... the rest of your code

Also, can you try to access that failed link using wget/curl/a browser? I'm curious whether there is some SSL issue on the server, or whether that's something on our side.

@kravets-levko
Contributor

kravets-levko commented Jul 11, 2024

+ additional question: do you use any kind of proxy, firewall, VPN, or anything else that may affect SSL cert validation?

@kravets-levko
Contributor

+ if you are able to patch databricks/sql on your machine, can you try this? Locate this file and line: https://github.com/databricks/databricks-sql-python/blob/main/src/databricks/sql/cloudfetch/downloader.py#L96 and add a verify=False parameter.
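For illustration, the call in question (per the traceback above) is a plain requests `session.get`; the suggested local patch just adds `verify=False` to it. A minimal, hypothetical sketch of what that patched call could look like; the function name, URL handling, and surrounding code are assumptions, not the library's actual implementation:

```python
import requests


def download_result_file(session: requests.Session, url: str) -> bytes:
    # Hypothetical sketch of the patched call in downloader.py: the only
    # change suggested above is adding verify=False, which skips TLS
    # certificate validation for this one request. Unsafe outside of
    # debugging behind a TLS-intercepting proxy.
    response = session.get(url, verify=False)
    response.raise_for_status()
    return response.content
```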

@rth
Author

rth commented Jul 11, 2024

Thanks for your feedback @kravets-levko !

Yes, I'm behind a corporate proxy that does SSL certificate rewriting, so SSLErrors by themselves are expected.
If I modify downloader.py#L96 to add verify=False, it actually works, even for a big query that previously failed. It's just confusing because I thought I had already disabled SSL verification, since smaller queries worked fine.

Any chance you could allow users to disable SSL verification in that code path without editing the code? For instance, either via the _tls_no_verify (or ssl_verify) parameter passed to connect, or, if that would be difficult, via monkeypatching some object in databricks.sql.cloudfetch.downloader.
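As an illustration of the monkeypatching route, here is a workaround sketch (my own assumption, not part of the library): it patches requests globally, so it affects every session in the process, including the ones the CloudFetch downloader creates.

```python
# Workaround sketch: force verify=False on every request made through a
# requests.Session, including those issued by the CloudFetch downloader.
# This disables TLS certificate verification process-wide; use only for
# debugging behind a TLS-intercepting proxy.
import requests

_orig_request = requests.Session.request

def _insecure_request(self, method, url, **kwargs):
    kwargs["verify"] = False  # override any per-call verify setting
    return _orig_request(self, method, url, **kwargs)

requests.Session.request = _insecure_request
```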

If it's still relevant, some of the debug logs from the first case where it failed are below:

DEBUG:databricks.sql.thrift_backend:retry parameter: _retry_delay_min given_or_default 1.0
DEBUG:databricks.sql.thrift_backend:retry parameter: _retry_delay_max given_or_default 60.0
DEBUG:databricks.sql.thrift_backend:retry parameter: _retry_stop_after_attempts_count given_or_default 30
DEBUG:databricks.sql.thrift_backend:retry parameter: _retry_stop_after_attempts_duration given_or_default 900.0
DEBUG:databricks.sql.thrift_backend:retry parameter: _retry_delay_default given_or_default 5.0
DEBUG:databricks.sql.thrift_backend:Sending request: OpenSession(<REDACTED>)
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): xxx.14.azuredatabricks.net:443
DEBUG:urllib3.connectionpool:https://xxxx.azuredatabricks.net:443 "POST /sql/protocolv1/o/xxxx/0605-141634-2g07xge9 HTTP/11" 200 171
DEBUG:databricks.sql.thrift_backend:Received response: TOpenSessionResp(<REDACTED>)
INFO:databricks.sql.client:Successfully opened session 37f9a404-d4f2-4f21-aafc-464e03cf22e0
DEBUG:databricks.sql.thrift_backend:Sending request: ExecuteStatement(<REDACTED>)
DEBUG:urllib3.connectionpool:https://xxxx.azuredatabricks.net:443 "POST /sql/protocolv1/o/xxxxx/0605-141634-2g07xge9 HTTP/11" 200 14371
DEBUG:databricks.sql.thrift_backend:Received response: TExecuteStatementResp(<REDACTED>)
DEBUG:databricks.sql.utils:Initialize CloudFetch loader, row set start offset: 0, file list:
DEBUG:databricks.sql.utils:- start row offset: 0, row count: 49152
DEBUG:databricks.sql.utils:- start row offset: 49152, row count: 49152
DEBUG:databricks.sql.utils:- start row offset: 98304, row count: 22480
DEBUG:databricks.sql.utils:- start row offset: 120784, row count: 49152
DEBUG:databricks.sql.utils:- start row offset: 169936, row count: 49152
DEBUG:databricks.sql.cloudfetch.download_manager:ResultFileDownloadManager: adding file link, start offset 0, row count: 49152
DEBUG:databricks.sql.cloudfetch.download_manager:ResultFileDownloadManager: adding file link, start offset 49152, row count: 49152
DEBUG:databricks.sql.cloudfetch.download_manager:ResultFileDownloadManager: adding file link, start offset 98304, row count: 22480
DEBUG:databricks.sql.cloudfetch.download_manager:ResultFileDownloadManager: adding file link, start offset 120784, row count: 49152
DEBUG:databricks.sql.cloudfetch.download_manager:ResultFileDownloadManager: adding file link, start offset 169936, row count: 49152
DEBUG:databricks.sql.utils:CloudFetchQueue: Trying to get downloaded file for row 0
DEBUG:databricks.sql.cloudfetch.download_manager:ResultFileDownloadManager: schedule downloads
DEBUG:databricks.sql.cloudfetch.download_manager:- start: 0, row count: 49152
DEBUG:databricks.sql.cloudfetch.downloader:ResultSetDownloadHandler: starting file download, offset 0, row count 49152
DEBUG:databricks.sql.cloudfetch.download_manager:- start: 49152, row count: 49152
DEBUG:databricks.sql.cloudfetch.downloader:ResultSetDownloadHandler: starting file download, offset 49152, row count 49152
DEBUG:databricks.sql.cloudfetch.download_manager:- start: 98304, row count: 22480
DEBUG:databricks.sql.cloudfetch.downloader:ResultSetDownloadHandler: starting file download, offset 98304, row count 22480
DEBUG:databricks.sql.cloudfetch.download_manager:- start: 120784, row count: 49152
DEBUG:databricks.sql.cloudfetch.downloader:ResultSetDownloadHandler: starting file download, offset 120784, row count 49152
DEBUG:databricks.sql.cloudfetch.download_manager:- start: 169936, row count: 49152
DEBUG:databricks.sql.cloudfetch.downloader:ResultSetDownloadHandler: starting file download, offset 169936, row count 49152
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): xxx.blob.core.windows.net:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): xxx.blob.core.windows.net:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): xxx.blob.core.windows.net:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): xxx.blob.core.windows.net:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): xxx.blob.core.windows.net:443
DEBUG:urllib3.util.retry:Incremented Retry for (url='/jobs/xxxx/sql/2024-07-11/15/results_2024-07-11T15:40:05Z_05e1b6a6-3439-440c-8522-b428f40d3b6f?sig=xxx%2FxxtrcDCT8U%3D&se=2024-07-11T15%3A55%3A07Z&sv=2019-02-02&spr=https&sp=r&sr=b'): Retry(total=4, connect=None, read=None, redirect=None, status=None)

@kravets-levko
Contributor

The thing is that CloudFetch is not used for smaller results; that's why you were able to get them. Thank you for all the feedback and for helping me debug this. A PR will come in a minute.
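A hedged sketch of the dispatch described above (names are loosely modeled on ResultSetQueueFactory.build_queue from the traceback; the actual implementation is assumed, not reproduced): small result sets arrive inline with the Thrift response, while large ones come back as presigned cloud-storage links that the client downloads itself, so only the large-result path makes the extra HTTPS requests that need their own TLS settings.

```python
# Illustrative sketch only (assumed structure, not the library's real code):
# the server either inlines rows in the response or returns presigned links.
def build_queue(rows, cloud_links=None):
    if cloud_links:
        # Large result: CloudFetch path. Each link is fetched over HTTPS by
        # the client itself, so the Thrift connection's TLS options do not
        # automatically apply to these downloads.
        return ("cloudfetch", list(cloud_links))
    # Small result: rows came inline with the Thrift response; no extra
    # HTTP requests, hence no separate TLS verification step.
    return ("inline", rows)
```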
