aitools: add 'tools statement' lifecycle commands#5095
Merged
simonfaltum merged 10 commits intomainfrom Apr 28, 2026
Merged
Conversation
This was referenced Apr 27, 2026
5be29eb to
ff58192
Compare
Contributor
|
🔍 Reviewed by nitpicker |
arsenyinfo
approved these changes
Apr 28, 2026
ff58192 to
a34f39e
Compare
Refactor `executeAndPoll` in `experimental/aitools/cmd/query.go` to extract a pure `pollStatement(ctx, api, resp)` helper. The helper polls until the statement reaches a terminal state and returns the response without any signal handling, spinner, or server-side cancellation; those concerns stay in `executeAndPoll` where they belong. Also pin `OnWaitTimeout: CONTINUE` explicitly on the `ExecuteStatement` call. The SDK default happens to be CONTINUE today, but relying on it is a hidden coupling: a server-side default flip would silently break the poll loop by killing the statement before our first GET. Behavior is unchanged for the existing `query` command. Follow-up PRs (parallel batch queries, statement lifecycle command tree) will reuse the helper. Co-authored-by: Isaac
Allow `databricks experimental aitools tools query` to accept several SQLs
in a single invocation and run them in parallel against the warehouse.
Pass multiple positional arguments and/or repeat `--file` to fan out:
databricks experimental aitools tools query \
--warehouse <wh> --output json \
"SELECT count(*) FROM t" \
"SELECT min(ts), max(ts) FROM t" \
"SELECT col, count(*) FROM t GROUP BY 1"
Multi-query output is always a JSON array of one object per input,
preserving input order. The shape is `{sql, statement_id, state,
elapsed_ms, columns, rows, error}`. Individual statement failures don't
abort siblings; each is encoded in the per-result `error` field, and the
exit code is non-zero when any statement failed.
A new `--concurrency` flag (default 8) caps in-flight statements. On
Ctrl+C the still-running statements are cancelled server-side via
CancelExecution before exit.
Single-query behavior is unchanged. The previous restriction that
forbade mixing `--file` and a positional SQL is lifted, since both now
contribute to the batch.
Co-authored-by: Isaac
Address two findings from a cursor PR review: 1. --concurrency was passed straight into errgroup.SetLimit. A value of 0 deadlocks (errgroup refuses to add goroutines), and a negative value silently removes the cap. Add a PreRunE check that rejects anything <= 0 with errInvalidBatchConcurrency, matching the shape used by cmd/fs/cp.go for the same flag. 2. The Long help previously said multi-query results come back "in input order", which was ambiguous when --file and positional SQLs are mixed. The actual behavior (already covered by TestResolveSQLsMixedFileAndPositional) is: --file inputs first in flag order, then positional SQLs in arg order. Tighten the help text to state that contract precisely. Adds two unit tests that verify --concurrency 0 and -1 are rejected before any API call. Co-authored-by: Isaac
… cases Two pairs of cobra-level tests were each testing one rejection code path with two flag values. Fold them into table-driven subtests so the shared assertion lives in one place: - TestQueryCommandBatchTextOutputRejected + ...CsvOutputRejected → TestQueryCommandBatchOutputRejection (text, csv subtests) - TestQueryCommandConcurrencyZeroRejected + ...NegativeRejected → TestQueryCommandConcurrencyRejection (0, -1 subtests) Same coverage, half the test functions. Co-authored-by: Isaac
Address Arseni's P2 finding on the batch PR. cancelInFlight (batch.go) and cancelStatement (query.go) used to derive the cancel-RPC ctx via context.WithTimeout(ctx, cancelTimeout). On the actual hot path (Ctrl+C or parent ctx cancelled), the inbound ctx is already cancelled by the time we reach the cancel sweep. The SDK then short-circuits on ctx.Err() and the cancel RPC never reaches the warehouse, leaving in-flight statements running server-side. Wrap with context.WithoutCancel(ctx) (Go 1.21+) so the timeout context keeps the caller's values but drops the cancellation signal. The cancel RPC now actually fires. Also tighten the existing tests: - TestExecuteBatchContextCancellationCancelsInFlight - TestExecuteAndPollCancelledContextCallsCancelExecution Both previously matched mock.Anything for the ctx argument, so they passed regardless of whether the bug was present. They now use mock.MatchedBy(c.Err() == nil) to assert the cancel-RPC ctx is alive. This is a regression guard; reverting the production fix makes the tests fail with "unexpected call" because the matcher no longer matches. Co-authored-by: Isaac
3e565e9 to
a1c5ca6
Compare
Adds a low-level command tree for asynchronous SQL statement management, complementing the synchronous 'tools query': databricks experimental aitools tools statement submit "SELECT ..." databricks experimental aitools tools statement get <statement_id> databricks experimental aitools tools statement status <statement_id> databricks experimental aitools tools statement cancel <statement_id> submit fires an ExecuteStatement with WaitTimeout=0s and OnWaitTimeout=CONTINUE, returning the statement_id immediately. get polls (via pollStatement from #5092) until terminal and emits rows on success or an error object on failure. status performs a single GET without polling. cancel sends CancelExecution. All four subcommands emit a uniform JSON shape {statement_id, state, warehouse_id, columns, rows, error} with omitempty so the payload only includes fields that subcommand has. Important UX nuance: 'statement get' Ctrl+C stops polling but does NOT cancel the server-side statement. Users that want server-side termination call 'statement cancel' explicitly. (This differs from 'tools query', which cancels server-side on Ctrl+C because the user invoked the synchronous path.) The pollStatement helper from #5092 is already designed to propagate ctx errors without touching the server, so 'get' inherits this behavior for free. Co-authored-by: Isaac
Address a cursor PR review finding: 'statement get' and 'statement status' previously only set info.Error when pollResp.Status.Error was non-nil. The Statements API can return a non-success terminal state (FAILED, CANCELED, CLOSED) with no Error payload, so the JSON contract "emits rows on success or an error object on failure" wasn't actually guaranteed. Skill consumers couldn't branch on `error == null` alone: they had to also inspect `state`. Especially bad for 'get', which exits non-zero on non-success terminal states without giving the caller structured failure detail. Add a shared helper, statementErrorFromStatus, that returns a batchResultError for any terminal non-success state, populated from the SDK's ServiceError when present and synthesizing "statement reached terminal state X" when the backend doesn't supply one. Mirrors the pattern already used by runOneBatchQuery in batch.go, so the contract is uniform across batch and single-statement paths. Both 'get' and 'status' now use the helper. PENDING and RUNNING still emit no error (legitimately mid-flight). New tests: - table-driven coverage of statementErrorFromStatus across nil, succeeded, running, pending, failed-with-error, failed-no-error, canceled-no-error, closed-no-error - getStatementResult with CLOSED state and no Error - getStatementResult with FAILED state and no Error - getStatementStatus with FAILED state and no Error - getStatementStatus with RUNNING state confirms no error is set Co-authored-by: Isaac
…tests Self-review pass on the test suite found ~8 functions worth trimming without losing coverage: Drop (cobra built-ins, not our contract): - TestStatementSubmitArgsBound: tests cobra's MaximumNArgs(1) - TestStatementGetRequiresStatementID: tests cobra's ExactArgs(1) - TestStatementCancelRequiresStatementID: tests cobra's ExactArgs(1) Drop (already covered by TestStatementErrorFromStatus, the table-driven helper test added with the cursor-fix commit): - TestGetStatementResultClosedTerminalSynthesizesError - TestGetStatementResultFailedWithoutBackendErrorSynthesizesError - TestGetStatementStatusFailedWithoutBackendErrorSynthesizesError - TestGetStatementStatusRunningHasNoError Fold: - TestRenderStatementInfo + TestRenderStatementInfoOmitsEmptyFields → one table-driven TestRenderStatementInfo with the full and minimal cases as subtests. Kept the validation we actually wrote (TestStatementSubmitRejectsMultipleSQLs) and the wiring tests that pin distinct contracts (TestGetStatementResultPolls, TestGetStatementResultFailedStateReportsError, TestGetStatementResultDoesNotCancelServerSideOnContextCancel, TestGetStatementStatusSinglePoll, TestGetStatementStatusReportsError, the cancel pair, and submit pair). Co-authored-by: Isaac
…put before auth Address two findings from Arseni's review. P2 (statement_get.go): getStatementResult used to return (info, err) when fetchAllRows failed after a SUCCEEDED state. RunE then discarded the populated info and surfaced only the raw Go error, so the user got an unstructured "fetch result chunk N: ..." string with no statement_id and no machine-readable error field. That contradicts the contract in the failed-terminal path two cases above, which renders JSON and returns root.ErrAlreadyPrinted. Now: on chunk-fetch failure, populate info.Error with the chunk-fetch message and return (info, nil). RunE renders the partial info as JSON and signals exit-non-zero based on info.Error != nil. The caller still gets statement_id and columns; the error field carries the failure detail. New test TestGetStatementResultChunkFetchFailureRendersPartialInfo locks this in. P3 (statement_submit.go): The PR description claims submit validates input before accessing WorkspaceClient. The code didn't actually deliver that: PreRunE was root.MustWorkspaceClient (auth/profile setup), then RunE did the resolveSQLs / "exactly one" checks. So a malformed invocation hit auth errors before ever surfacing the input error. Move resolveSQLs and the length check into a custom PreRunE that runs before root.MustWorkspaceClient, mirroring the pattern in query.go:113-118. The result is stashed in a closure variable (sqlStatement) for RunE to consume. Existing test TestStatementSubmitRejectsMultipleSQLs is renamed to ...BeforeWorkspaceClient and no longer needs to stub out PreRunE: the new ordering means a bad invocation gets the validation error without ever attempting workspace-client setup. Co-authored-by: Isaac
dba7285 to
9b52b65
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Stack
This PR is part of a 4-PR stack making
aitoolsdata exploration faster for ai-dev-kit. Each PR is independently reviewable; merge in order.main)Use
git diff <base>...HEADor set the comparison base in the GitHub UI to see only this PR's changes; the default "Files changed" diff againstmainincludes ancestor PRs.Why
Quentin's ai-dev-kit skill works against synchronous
tools query. That covers most cases, but there are workflows where the agent wants a server-side handle it can poll separately: long-running maintenance queries, parallel exploration where the agent does other work in between, and any "submit-now-harvest-later" pattern.tools querywith a single SQL is for "I want results now." This PR adds a low-level command tree,tools statement, for "I want a handle." Cleaner separation than overloadingquerywith--async/--cancelflags (which would be semantically forced — aqueryshouldn't manage someone else's statement_id).Changes
Four new subcommands under
databricks experimental aitools tools statement:Implementation notes:
statementInfoJSON shape:{statement_id, state, warehouse_id, columns, rows, error}withomitemptyon every field exceptstatement_id. Sosubmitdoesn't includecolumns/rows,canceldoesn't includewarehouse_id, etc. Consumer parsing is uniform.submitusesWaitTimeout: "0s"andOnWaitTimeout: CONTINUE(matching the helper from aitools: extract pollStatement helper and pin OnWaitTimeout #5092).getusespollStatement(from aitools: extract pollStatement helper and pin OnWaitTimeout #5092) and inherits its "ctx cancellation does NOT cancel server-side" semantics. This is the important UX difference fromtools query: hitting Ctrl+C ongetstops polling but leaves the statement running on the warehouse. Usecancelfor explicit termination. That asymmetry is intentional, sincegetis poll-only by design — the user already submitted async.statusdoes a singleGetStatementByStatementIdwith no polling.cancelcallsCancelExecutionand optimistically reportsstate=CANCELED. The Statements API returns no body on cancel; the actual server-side state transitions asynchronously. TheLonghelp points users atstatusif they need certainty.statementErrorFromStatuspopulates theerrorfield for every non-success terminal state (FAILED, CANCELED, CLOSED), even when the server returns noStatus.Errorpayload. So skill consumers can branch onerror == nullalone instead of inspectingstate.submitStatement,getStatementResult,getStatementStatus,cancelStatementExecution) extracted from the cobraRunE. Tests target the helpers directly with a mockStatementExecutionInterface.statement.goregisters the four subcommands and is wired intotools.gonext toquery,discover-schema, andget-default-warehouse.submitvalidates input (rejects mixed --file + positional) BEFORE accessingWorkspaceClient, so the error surfaces cleanly without an auth or warehouse roundtrip.Test plan
go test ./experimental/aitools/...passes.make checksclean.make fmtno drift.make lint0 issues.New tests cover:
submitreturns the statement_id and pinsOnWaitTimeout: CONTINUEsubmitwraps transport errors withexecute statement: ...getpolls until terminal and assembles rowsgetreports server-side errors in the JSON without raising a Go errorgetctx cancellation propagates without callingCancelExecution(the deliberate UX difference fromquery)getsynthesizeserrorfor terminal CLOSED / FAILED with no backend payloadstatusdoes a single GET, no pollingstatusreports server-side errors in the JSON; running/pending stay error-freestatussynthesizeserrorfor FAILED with no backend payloadcancelcallsCancelExecutionand reportsstate=CANCELEDcancelwraps API errorsstatementErrorFromStatustable-driven across nil, succeeded, running, failed-with-error, failed/canceled/closed-without-errorrenderStatementInfoJSON shape (full and minimal)submitrejects mixed --file + positional,submitenforces MaximumNArgs(1),getandcancelrequire a positional statement_idManual smoke against a real warehouse: