[KYUUBI #7106] Make response.results.columns optional #7107

Closed
wants to merge 1 commit into from

Conversation

fbertsch
Why are the changes needed?

Bugfix. Spark 3.5 is returning None for response.results.columns, while Spark 3.3 returned actual values.

The response here: https://github.com/apache/kyuubi/blob/master/python/pyhive/hive.py#L507

For a query that returns no rows (mine was an add jar s3://a/b/c.jar statement), here are the responses I received.

Spark 3.3:

TFetchResultsResp(status=TStatus(statusCode=0, infoMessages=None, sqlState=None, errorCode=None, errorMessage=None), hasMoreRows=False, results=TRowSet(startRowOffset=0, rows=[], columns=[TColumn(boolVal=None, byteVal=None, i16Val=None, i32Val=None, i64Val=None, doubleVal=None, stringVal=TStringColumn(values=[], nulls=b'\x00'), binaryVal=None)], binaryColumns=None, columnCount=None))

Spark 3.5:

TFetchResultsResp(status=TStatus(statusCode=0, infoMessages=None, sqlState=None, errorCode=None, errorMessage=None), hasMoreRows=False, results=TRowSet(startRowOffset=0, rows=[], columns=None, binaryColumns=None, columnCount=None))
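
The difference between the two responses can be illustrated with a minimal sketch. It is not the PyHive code itself: `RowSetStub` and `rows_from_columns` are hypothetical names that stand in for the Thrift `TRowSet` and for the column-to-row transposition, and the sketch assumes each column is already a plain sequence of values. It only shows that treating a missing `columns` field as an empty list lets both response shapes yield zero rows.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class RowSetStub:
    """Hypothetical stand-in for the `results` field of a fetch response."""
    columns: Optional[List[Sequence]] = None


def rows_from_columns(results: RowSetStub) -> List[Tuple]:
    """Transpose column-oriented values into rows.

    A `columns` value of None (as in the Spark 3.5 response) is treated
    the same as an empty list of columns.
    """
    columns = results.columns or []
    return list(zip(*columns))


# One empty column, roughly the shape of the Spark 3.3 response.
spark_33_like = RowSetStub(columns=[[]])
# No columns at all, the shape of the Spark 3.5 response.
spark_35_like = RowSetStub(columns=None)

print(rows_from_columns(spark_33_like))
print(rows_from_columns(spark_35_like))
```

Without the `or []` fallback, the `zip(*columns)` call would raise a `TypeError` when `columns` is None.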

How was this patch tested?

I tested by applying it locally and running my query against Spark 3.5. I was not able to get any unit tests running, sorry!

Was this patch authored or co-authored using generative AI tooling?

No.

@fbertsch
Author

Fixes #7106

@codecov-commenter
Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 0.00%. Comparing base (5237227) to head (13d1440).

Additional details and impacted files
@@          Coverage Diff           @@
##           master   #7107   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files         697     697           
  Lines       43214   43214           
  Branches     5855    5855           
======================================
  Misses      43214   43214           

@turboFei turboFei requested a review from Copilot June 18, 2025 20:21
@Copilot Copilot AI left a comment

Pull Request Overview

A bugfix to handle Spark 3.5 behavior when response.results.columns is None by making columns optional in the fetch results operation.

  • Introduces a new boolean flag (has_new_data) to check for actual returned data.
  • Safely updates the request state to finished when no new data is available.
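
The two points in this overview can be sketched roughly as follows. This is an illustrative assumption about the shape of the change, not the merged code: `FetchStateSketch`, `absorb`, and the state constants are hypothetical names, and only `has_new_data` and `new_data` come from the diff under review.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical state labels; the real cursor defines its own constants.
STATE_RUNNING = "running"
STATE_FINISHED = "finished"


class FetchStateSketch:
    """Hypothetical stand-in for the part of a cursor that buffers rows."""

    def __init__(self) -> None:
        self.data: List[Tuple] = []
        self.state = STATE_RUNNING

    def absorb(self, columns: Optional[List[Sequence]]) -> bool:
        """Buffer rows from one fetch response and report whether any arrived."""
        has_new_data = False
        if columns:  # skipped when columns is None or an empty list
            new_data = list(zip(*columns))
            self.data += new_data
            has_new_data = bool(new_data)
        if not has_new_data:
            # Nothing new came back, so mark the fetch as finished.
            self.state = STATE_FINISHED
        return has_new_data
```

Initializing the flag before the `columns` check keeps the state update reachable even when the server sends no columns at all.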

zip(response.results.columns, schema)]
new_data = list(zip(*columns))
self._data += new_data
has_new_data = (True if new_data else False)

Copilot AI Jun 18, 2025

[nitpick] Consider simplifying the assignment by using 'bool(new_data)' instead of the ternary operator to improve readability and clarity.

Suggested change
- has_new_data = (True if new_data else False)
+ has_new_data = bool(new_data)

Member

@turboFei turboFei left a comment

LGTM, thanks

@pan3793
Member

pan3793 commented Jun 19, 2025

@fbertsch thank you for fixing this issue. Do you happen to know which Spark PR causes this behavior change?

@fbertsch
Author

> @fbertsch thank you for fixing this issue. Do you happen to know which Spark PR causes this behavior change?

I haven't been able to confirm, but I believe it's this change: https://issues.apache.org/jira/browse/SPARK-39041

That redid all the HiveThriftServer responses, and probably also changed the column responses.

@fbertsch
Author

@turboFei are you able to release a new version of PyHive with this included?

@turboFei
Member

> @turboFei are you able to release a new version of PyHive with this included?

cc @pan3793

@pan3793 pan3793 closed this in b49ed02 Jun 23, 2025
@pan3793 pan3793 added this to the v1.11.0 milestone Jun 23, 2025
@pan3793
Member

pan3793 commented Jun 23, 2025

Thanks, merged to master.

> are you able to release a new version of PyHive with this included?

This is definitely on our TODO list. Hopefully we can make the first release in July.

4 participants