Allow Presto clients to receive query results in binary format #20932

Merged
mbasmanova merged 1 commit into prestodb:master from the binary-results branch on Sep 22, 2023

mbasmanova
Contributor

@mbasmanova mbasmanova commented Sep 21, 2023

Introduce a 'binaryResults=true' query parameter that requests query results in binary format: https://prestodb.io/docs/current/develop/serialized-page.html

With binaryResults=true, the result JSON will contain a 'binaryData' field: an array with one or
more pages of results. The 'data' field will not be included in the result JSON.
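
As a rough client-side sketch of the behavior described above: the request carries binaryResults=true, and each 'binaryData' entry is assumed to be a Base64-encoded string, as the review discussion suggests. The host, the class and method names, and the choice to append the parameter to the statement URI are illustrative assumptions, not the actual Presto client API. JSON parsing is omitted, and the decoded bytes would still have to be read as serialized pages.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical helper, not part of the Presto client.
class BinaryResultsClientSketch {
    // Appends binaryResults=true to a statement URI.
    // Assumption: the URI has no fragment and the server reads the parameter here.
    static URI withBinaryResults(URI statementUri) {
        String separator = (statementUri.getQuery() == null) ? "?" : "&";
        return URI.create(statementUri + separator + "binaryResults=true");
    }

    // Decodes each Base64 string taken from the 'binaryData' array into raw bytes.
    static List<byte[]> decodePages(List<String> binaryData) {
        Base64.Decoder decoder = Base64.getDecoder();
        List<byte[]> pages = new ArrayList<>();
        for (String encodedPage : binaryData) {
            pages.add(decoder.decode(encodedPage));
        }
        return pages;
    }

    public static void main(String[] args) {
        // localhost:8080 is a placeholder host.
        URI uri = withBinaryResults(URI.create("http://localhost:8080/v1/statement"));
        System.out.println(uri);

        // Placeholder bytes standing in for one serialized page.
        List<String> binaryData = List.of(Base64.getEncoder().encodeToString(new byte[] {1, 2, 3}));
        System.out.println(decodePages(binaryData).get(0).length + " bytes in first page");
    }
}
```

A client that does not send the parameter would keep receiving the 'data' field, which is the compatibility concern raised later in this thread.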

See #20886

== NO RELEASE NOTE ==

@mbasmanova mbasmanova requested a review from a team as a code owner September 21, 2023 11:16
@mbasmanova mbasmanova changed the title from "Allow Presto Client to return results in binary format" to "Allow Presto clients to receive query results in binary format" Sep 21, 2023
@mbasmanova mbasmanova force-pushed the binary-results branch 3 times, most recently from 5bca17d to 9955f8c September 21, 2023 14:36
@tdcmeehan
Contributor

Please consider adding either a query or path parameter to toggle the client results, as clients may set arbitrary headers, and without migration of clients to support this format, it will cause them to break unexpectedly (the data field will be null, which will indicate query completion).

@mbasmanova
Contributor Author

> Please consider adding either a query or path parameter to toggle the client results, as clients may set arbitrary headers, and without migration of clients to support this format, it will cause them to break unexpectedly (the data field will be null, which will indicate query completion).

@tdcmeehan Tim, thank you for reviewing. I updated the PR to use the query parameter binaryResults=true instead of the header. Would you please take another look?

Contributor

@tdcmeehan tdcmeehan left a comment

In general, whereas data is Object because we don't know the underlying types (primitives, lists, etc.), don't we know that binaryData is String since it's Base64 encoded?

data.addAll(queryResults.getBinaryData());
}
}
assertNull(queryResults.getError());
Contributor

Perhaps print out the error, in case this test becomes flaky.

@@ -442,25 +447,48 @@ private synchronized QueryResults getNextResult(long token, UriInfo uriInfo, Str
// last page is removed. If another thread observes this state before the response is cached
// the pages will be lost.
Iterable<List<Object>> data = null;
List<Object> binaryData = null;
Contributor

Nit: don't we know that the parameter of the List is String?

@mbasmanova
Contributor Author

@tdcmeehan Good points, Tim. Updated. Please take another look.

Contributor

@tdcmeehan tdcmeehan left a comment

LGTM % a few more nits, feel free to merge after addressing!

if (serializedPage == null) {
break;
if (binaryResults) {
binaryData = new ArrayList<>();
Contributor

Nit: use immutable list (consistent with below branch)?

DynamicSliceOutput sliceOutput = new DynamicSliceOutput(1000);
PagesSerdeUtil.writeSerializedPage(sliceOutput, serializedPage);

String encodedPage = Base64.getEncoder().encodeToString(sliceOutput.slice().byteArray());
Contributor

Nit: Add a constant?

Suggested change
- String encodedPage = Base64.getEncoder().encodeToString(sliceOutput.slice().byteArray());
+ String encodedPage = BASE64_ENCODER.encodeToString(sliceOutput.slice().byteArray());
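
For context, a minimal sketch of what the suggested constant could look like, combined with the earlier nit about returning an immutable list. The class name, method name, and the use of plain byte arrays in place of serialized pages are assumptions made for illustration; only BASE64_ENCODER comes from the suggestion itself. java.util.Base64.Encoder instances are documented as safe for concurrent use, so sharing one in a static field is fine.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical class; the byte arrays stand in for already-serialized pages.
class PageEncodingSketch {
    // Shared encoder constant, as proposed in the review suggestion.
    private static final Base64.Encoder BASE64_ENCODER = Base64.getEncoder();

    // Encodes each page to a Base64 string and returns an unmodifiable list.
    static List<String> encodePages(List<byte[]> serializedPages) {
        List<String> encoded = new ArrayList<>(serializedPages.size());
        for (byte[] page : serializedPages) {
            encoded.add(BASE64_ENCODER.encodeToString(page));
        }
        return List.copyOf(encoded);
    }

    public static void main(String[] args) {
        // "abc" is placeholder content, not a real page.
        List<byte[]> pages = List.of("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(encodePages(pages));
    }
}
```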

bytes += serializedPage.getSizeInBytes();

DynamicSliceOutput sliceOutput = new DynamicSliceOutput(1000);
PagesSerdeUtil.writeSerializedPage(sliceOutput, serializedPage);
Contributor

Nit: static import?

Suggested change
- PagesSerdeUtil.writeSerializedPage(sliceOutput, serializedPage);
+ writeSerializedPage(sliceOutput, serializedPage);

@mbasmanova
Contributor Author

@tdcmeehan Tim, thank you for the review and offline discussion. Addressed all comments and updated the PR.

@mbasmanova mbasmanova force-pushed the binary-results branch 2 times, most recently from c03ecc5 to 9550d77 September 21, 2023 18:05
@github-actions

github-actions bot commented Sep 21, 2023

Codenotify: Notifying subscribers in CODENOTIFY files for diff dd5b168...d218176.

Notify File(s)
@steveburnett presto-docs/src/main/sphinx/develop/client-protocol.rst

Contributor

@steveburnett steveburnett left a comment

LGTM! (docs)

@mbasmanova mbasmanova merged commit da04bac into prestodb:master Sep 22, 2023
55 checks passed
@mbasmanova mbasmanova deleted the binary-results branch September 22, 2023 17:14
.scheme(getScheme(xForwardedProto, uriInfo))
- .replacePath("/v1/statement/queued/")
+ .replacePath("/v1/statement/queued")
Contributor

Just a question: why was the trailing slash removed?

if (serializedPage == null) {
break;
if (binaryResults) {
ImmutableList.Builder<String> pages = ImmutableList.builder();
Contributor

Nit: How do you feel about merging these two loops?

Also, is it expensive to recreate DynamicSliceOutput(1000) on each iteration of the loop? No idea, so just asking :-)

}
}

if (queryResults.getError() != null) {
Contributor

Nit: Is there a Util class where we could put a helper method for this? The check-and-fail-descriptively pattern improves on assertNull(queryResults.getError()) and is now used in a couple of places :-)
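
One possible shape for such a helper, as a sketch only. The class name, the method name, and the ErrorInfo stand-in are hypothetical; the real test code would use the error type returned by queryResults.getError(), and nothing here is taken from the merged change.

```java
// Hypothetical test utility that fails with the error details instead of a bare assertNull.
class QueryAssertionsSketch {
    // Minimal stand-in for the error object returned by the client (assumption).
    static final class ErrorInfo {
        final String message;
        final int errorCode;

        ErrorInfo(String message, int errorCode) {
            this.message = message;
            this.errorCode = errorCode;
        }
    }

    // Passes silently when there is no error; otherwise fails with a descriptive message.
    static void assertNoError(ErrorInfo error) {
        if (error != null) {
            throw new AssertionError(
                    "Query failed: " + error.message + " (error code " + error.errorCode + ")");
        }
    }

    public static void main(String[] args) {
        assertNoError(null); // no error, so nothing happens
        try {
            assertNoError(new ErrorInfo("example failure", 1));
        }
        catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Keeping the message construction in one place means a flaky run reports the failure cause directly in the test output.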
