Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Prepared Statements are widely used by SQL clients when issuing queries to a database. Major use cases include improved transaction processing latency as well as preventing SQL injection attacks (parameterized query arguments are often implemented as a feature of prepared statements).
Supporting prepared statements will increase the number of client applications that can work with DataFusion.
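To illustrate the SQL injection point, here is a minimal sketch using Python's built-in `sqlite3` module rather than DataFusion. The `users` table and the `malicious` input are made up for the example. It contrasts splicing user input into the SQL text with passing it as a bind parameter.

```python
import sqlite3

# Hypothetical in-memory table, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

# Attacker-controlled input that tries to turn the WHERE clause into a tautology.
malicious = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text, so the OR clause is parsed as SQL.
unsafe_rows = conn.execute(
    f"SELECT name FROM users WHERE name = '{malicious}'"
).fetchall()

# Safe: the input is sent as a bind parameter and is only ever compared as a value.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

In the unsafe variant the query matches every row, while the parameterized variant matches none, because no user is literally named `nobody' OR '1'='1`.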
Task List
- feat: support prepare statement #4490 (support `PREPARE` for `SELECT`)
- Support `EXECUTE` for prepared statements: feat: basic support for executing prepared statements #13242
- Support PREPARE statements without explicit parameters: Allow place holders like `$1` in more types of queries. #13632
- Store Prepare Logical Plan #4549
- Convert a Prepare Logical Plan into a Logical Plan with all parameters replaced with values #4550
Background
Here is a schematic of how prepared statements work:
                                                  .─────────────.
                                                 (               )
╔═══════════╗  SELECT *                          │`─────────────'│
║           ║  FROM foo                          │               │
║  Client   ║  WHERE id = $1                     │   Database    │
║           ║━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━▶│    Server     │
║           ║                                    │               │
╚═══════════╝                                    │.─────────────.│
                                                 (               )
Step 1: Client sends parameterized query to       `─────────────'
        server to "prepare"

                 HANDLE: 0x.....
                 SCHEMA: (VARCHAR, INT, FLOAT)
                 PARAMS: {$1: INT}
             ◀━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Step 2: Server prepares to run query and sends back opaque
        "handle", result schema, and needed bind parameters to client

                 HANDLE: 0x.....
                 PARAMS: { $1 = 12345 }
             ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━▶
Step 3: Client returns the handle and values of "bind"
        parameters to the server

                 Results: [('Hi', 12345, 5423.13)]
             ◀━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Step 4: Server fills in bind parameters, and returns results
        as if the entire query had been supplied
Steps:
- The client sends the first of two messages to the server: a request to prepare the statement, with placeholders (called bind parameters) left in place of concrete values.
- The server responds with a handle for the client to identify the prepared query, the result schema, and needed parameters.
- The client sends a second message with the handle and bind parameter values.
- The server fills in the parameter values, executes the query and returns the results. It is typically possible to execute the same prepared statement multiple times using different bind parameters with a single additional message each.
Some protocols (such as the PostgreSQL Frontend/Backend protocol, FEBE) optionally allow both messages to be sent in the same transmission, which avoids a network round trip.
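The four steps above can be sketched as a toy, in-process model. This is not DataFusion's API: `ToyServer`, `prepare`, and `execute` are hypothetical names, SQLite (via Python's `sqlite3`) stands in for the query engine, and the sketch reports only the handle and the placeholders, not the result schema. The `foo` table contents are made up to match the schematic.

```python
import itertools
import re
import sqlite3


class ToyServer:
    """Hypothetical stand-in for a database server; not DataFusion's API."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE foo (name TEXT, id INTEGER, score REAL)")
        self.conn.executemany(
            "INSERT INTO foo VALUES (?, ?, ?)",
            [("Hi", 12345, 5423.13), ("Bye", 67890, 1.5)],
        )
        self._handles = itertools.count(1)
        self._prepared = {}  # handle -> (sql text, number of bind parameters)

    def prepare(self, sql):
        # Steps 1-2: store the statement, then return a handle and the
        # placeholders the client must bind. A naive regex is used here;
        # it would also match "$1" inside a string literal.
        placeholders = sorted(set(re.findall(r"\$(\d+)", sql)), key=int)
        handle = next(self._handles)
        # SQLite spells numbered parameters ?1, ?2, ... instead of $1, $2, ...
        sqlite_sql = re.sub(r"\$(\d+)", r"?\1", sql)
        self._prepared[handle] = (sqlite_sql, len(placeholders))
        return handle, ["$" + p for p in placeholders]

    def execute(self, handle, values):
        # Steps 3-4: bind the supplied values to the stored statement and run it.
        sql, n_params = self._prepared[handle]
        if len(values) != n_params:
            raise ValueError(f"expected {n_params} bind values, got {len(values)}")
        return self.conn.execute(sql, tuple(values)).fetchall()


server = ToyServer()
handle, params = server.prepare("SELECT * FROM foo WHERE id = $1")
rows = server.execute(handle, [12345])
print(handle, params, rows)
```

The same handle can be reused with different values, for example `server.execute(handle, [67890])`, which corresponds to the "single additional message" per execution described above.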
Describe the solution you'd like
We would like:
- Support for `PREPARE` statements.
- Support for `EXECUTE` statements that run prepared statements with bind parameters.
Both PREPARE and EXECUTE should have a basic implementation in SessionContext that other systems can extend (similar to CREATE VIEW and CREATE TABLE).
cc @NGA-TRAN