Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-33475: [Java][FlightRPC] Send prepared statement parameters #33961

Closed
wants to merge 26 commits into from

Conversation

avantgardnerio
Copy link

@avantgardnerio avantgardnerio commented Jan 31, 2023

Rationale for this change

Presently, the flight-sql-jdbc-driver does not send prepared statement parameter values to the server. This prevents Arrow FlightSql server implementors for receiving these values and users from being able to use prepared statements properly. This causes some significant issues, in our case not being able to performance test our database with TPC-C.

Will resolve: #33475

What changes are included in this PR?

  1. Parsing and saving the schema of statement parameters sent by the server
  2. (Hopefully soon) sending parameter values back to the server.

Are these changes tested?

Currently, only by an integration test against our proprietary, unreleased database.

Are there any user-facing changes?

Parameterized queries should work when this PR is completed.

@github-actions
Copy link

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

In the case of PARQUET issues on JIRA the title also supports:

PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

See also:

@avantgardnerio avantgardnerio changed the title Send prepared statement parameters [GH-33475]: [flight-sql] Send prepared statement parameters Jan 31, 2023
@lidavidm
Copy link
Member

Oh cool, you worked this out. I had half a PR but never finished it.

return false;
}

static int toJdbcType(ArrowType.ArrowTypeID type) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am uncertain where these methods should go.

case Duration:
return Timestamp.class.getCanonicalName();
}
return null;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FlightSql schema are quite the superset of JDBC, so the mapping of deeply nested structs and such should probably be a subsequent PR?

* limitations under the License.
*/

package org.apache.arrow.vector.table;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to delete this file?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Temporarily yes. I could not get it to compile and will put it back before I un-draft.

try(PreparedStatement ps = con.prepareStatement("select * from person where id=$1")) {
ParameterMetaData md = ps.getParameterMetaData();
assertEquals(1, md.getParameterCount());
assertEquals("Int", md.getParameterTypeName(1));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These assertions pass against our server implementation, which is part #1 of this PR.

ParameterMetaData md = ps.getParameterMetaData();
assertEquals(1, md.getParameterCount());
assertEquals("Int", md.getParameterTypeName(1));
ps.setInt(1, 1);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not seem to affect parameterBindingRoot so the parameters are not sent back to the server.

I am a bit lost and would appreciate some help following the flow of how those parameters should be sent and why they are not going through at present.

@lidavidm
Copy link
Member

In general you can look at the arrow-jdbc module for existing mapping between Arrow and JDBC types, as you've noted, not all types are easily handled

@@ -1,856 +0,0 @@
/*
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll put this back before I un-draft the PR. I could not get it to compile locally, so I deleted it for now.

@@ -133,6 +126,28 @@ public void testShouldConnectWhenProvidedWithValidUrl() throws Exception {
}
}

@Test
public void testParameters() throws Exception {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be happy to receive pointers on how to properly unit test this, since we will probably need to remove this integration test before submitting as a non-draft PR.

@alamb
Copy link
Contributor

alamb commented Jan 31, 2023

I also happen to be working on support for parameterized statements in InfluxDB IOx (WIP at https://github.com/influxdata/influxdb_iox/pull/6790) and I had just hit this. I plan to try and test / review this PR tomorrow

@lidavidm
Copy link
Member

note there was a prior PR in the issue description, I don't know if that answers any of your questions here

FieldVector v = new
}
VectorSchemaRoot parameters = new VectorSchemaRoot();
this.preparedStatement.setParameters();
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the missing piece. If we can translate the AvaticaParameters into a VectorSchemaRoot the server will get what it needs.

@lidavidm lidavidm changed the title [GH-33475]: [flight-sql] Send prepared statement parameters GH-33475: [Java][FlightRPC] Send prepared statement parameters Feb 1, 2023
@alamb
Copy link
Contributor

alamb commented Feb 1, 2023

Previous PR #14627

}
}
VectorSchemaRoot parameters = new VectorSchemaRoot(fields);
this.preparedStatement.setParameters(parameters);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Step #2 of this PR seems complete: now we can both:

  1. Get the schema from the server
  2. Set the parameter values to pass to the server

@avantgardnerio
Copy link
Author

Latest update: everything is working (if not a little ugly)

Unfortunately it's working against a proprietary server, so I'll begin the work to put it into the example server in arrow-rs.

Then I could use some help getting this PR ready for merge.

@lidavidm
Copy link
Member

lidavidm commented Feb 1, 2023

The prior PR uses a mocked server for unit testing - we can do that here, right?

@avantgardnerio
Copy link
Author

The prior PR uses a mocked server for unit testing - we can do that here, right?

Yes, I'll look into it. Happy to pull over those changes from your PR if that's the easiest way.

@lidavidm
Copy link
Member

lidavidm commented Mar 3, 2023

@avantgardnerio was there a problem with the PR?

@avantgardnerio
Copy link
Author

@avantgardnerio was there a problem with the PR?

Oh, you're right - I hadn't looked at this branch in so long I thought it was stale, but I do think this needs to go in.

@avantgardnerio avantgardnerio reopened this Mar 6, 2023
@amol-
Copy link
Member

amol- commented Mar 30, 2023

Closing because it has been untouched for a while, in case it's still relevant feel free to reopen and move it forward 👍

@amol- amol- closed this Mar 30, 2023
@sukwanlee
Copy link

Will prepared statements be supported anytime in the near future?

@lidavidm
Copy link
Member

lidavidm commented Sep 5, 2023

Contributions are welcome. I'm not aware of any active development on the JDBC driver.

@aiguofer
Copy link
Contributor

aiguofer commented Sep 5, 2023

This will be a priority for our team within the next few months. If no one has done it by then I will likely take this up.

@avantgardnerio
Copy link
Author

@aiguofer what I have in this branch basically works, I think it just needs to cleanup and tests to merge. Sorry I don't have time to help right now...

@aiguofer
Copy link
Contributor

aiguofer commented Sep 5, 2023

Sweet that's what I'm hoping for. I'd likely strip out all the metrics stuff since it's unrelated, but I'm hoping between this PR and @lidavidm's original PR that it should be pretty easy to put together. We have some other higher priorities right now but in a month or so I might be able to get to this (or if I somehow find myself extremely bored on a weekend before then haha).

lidavidm pushed a commit that referenced this pull request Nov 4, 2023
…C driver (#38404)

This PR is a combination of #33961 and #14627. The goal is to support parametrized queries through the Arrow Flight SQL JDBC driver.

An Arrow Flight SQL server returns a Schema for the `PreparedStatement` parameters. The driver then converts the `Field` list associated with the Schema into a list of `AvaticaParameter`. When the user sets values for the parameters, Avatica generates a list of `TypedValue`, which we then bind to each parameter vector. This conversion between Arrow and Avatica is handled by implementations of a `AvaticaParameterConverter` interface for each Arrow type. This interface which provides 2 methods:
- createParameter: Create an `AvaticaParameter` from the given Arrow `Field`.
- bindParameter: Cast the given `TypedValue` and bind it to the `FieldVector` at the specified index.

This PR purposely leaves out a few features:
- We currently naively cast the `TypedValue` values assuming users set the type correctly. If this cast fails, we raise an exception letting the user know that the cast is not supported. This could be improved in subsequent PRs to do smarter conversions from other types.
- We currently don't provide conversions for complex types such as List, Map, Struct, Union, Interval, and Duration. The stubs are there so they can be implemented as needed.
- Tests for specific types have not been implemented. I'm not very familiar with a lot of these JDBC types so it's hard to implement rigorous tets.

* Closes: #33475
* Closes: #35536

Authored-by: Diego Fernandez <aiguo.fernandez@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…in JDBC driver (apache#38404)

This PR is a combination of apache#33961 and apache#14627. The goal is to support parametrized queries through the Arrow Flight SQL JDBC driver.

An Arrow Flight SQL server returns a Schema for the `PreparedStatement` parameters. The driver then converts the `Field` list associated with the Schema into a list of `AvaticaParameter`. When the user sets values for the parameters, Avatica generates a list of `TypedValue`, which we then bind to each parameter vector. This conversion between Arrow and Avatica is handled by implementations of a `AvaticaParameterConverter` interface for each Arrow type. This interface which provides 2 methods:
- createParameter: Create an `AvaticaParameter` from the given Arrow `Field`.
- bindParameter: Cast the given `TypedValue` and bind it to the `FieldVector` at the specified index.

This PR purposely leaves out a few features:
- We currently naively cast the `TypedValue` values assuming users set the type correctly. If this cast fails, we raise an exception letting the user know that the cast is not supported. This could be improved in subsequent PRs to do smarter conversions from other types.
- We currently don't provide conversions for complex types such as List, Map, Struct, Union, Interval, and Duration. The stubs are there so they can be implemented as needed.
- Tests for specific types have not been implemented. I'm not very familiar with a lot of these JDBC types so it's hard to implement rigorous tets.

* Closes: apache#33475
* Closes: apache#35536

Authored-by: Diego Fernandez <aiguo.fernandez@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…in JDBC driver (apache#38404)

This PR is a combination of apache#33961 and apache#14627. The goal is to support parametrized queries through the Arrow Flight SQL JDBC driver.

An Arrow Flight SQL server returns a Schema for the `PreparedStatement` parameters. The driver then converts the `Field` list associated with the Schema into a list of `AvaticaParameter`. When the user sets values for the parameters, Avatica generates a list of `TypedValue`, which we then bind to each parameter vector. This conversion between Arrow and Avatica is handled by implementations of a `AvaticaParameterConverter` interface for each Arrow type. This interface which provides 2 methods:
- createParameter: Create an `AvaticaParameter` from the given Arrow `Field`.
- bindParameter: Cast the given `TypedValue` and bind it to the `FieldVector` at the specified index.

This PR purposely leaves out a few features:
- We currently naively cast the `TypedValue` values assuming users set the type correctly. If this cast fails, we raise an exception letting the user know that the cast is not supported. This could be improved in subsequent PRs to do smarter conversions from other types.
- We currently don't provide conversions for complex types such as List, Map, Struct, Union, Interval, and Duration. The stubs are there so they can be implemented as needed.
- Tests for specific types have not been implemented. I'm not very familiar with a lot of these JDBC types so it's hard to implement rigorous tets.

* Closes: apache#33475
* Closes: apache#35536

Authored-by: Diego Fernandez <aiguo.fernandez@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants