Summary
With CUBESQL_STREAM_MODE=true, any query whose backing BigQuery call returns an HTTP error mid-stream (e.g. No matching signature for operator = for argument types: TIMESTAMP, DATE) causes the cubejs-server / CubeSQL pod to die with a Node unhandled-rejection (processTicksAndRejections). Connected BI sessions are torn down. The underlying error never reaches the wire as a Postgres ErrorResponse — the client just sees server closed the connection unexpectedly. The exact same query with stream mode off returns a structured ErrorResponse carrying the verbatim BigQuery message and the server stays up.
Cube v1.6.46 (reproduces on master). Affects every BI tool talking to the SQL API.
Repro
Cube model with any cube whose generated SQL can trigger an execution-time error from the backing database. For BigQuery specifically, a type: time dimension queried via the SQL API with this shape hits #10643's DATE() → TIMESTAMP coercion gap and crashes the pod:
SELECT created_at, MEASURE(count)
FROM orders
WHERE ((CAST(created_at AS DATE) = DATE('2024-01-08')) IS NULL)
OR (CAST(created_at AS DATE) <> DATE('2024-01-08'))
OR (CAST(created_at AS DATE) = DATE('2024-01-08'))
GROUP BY 1 ORDER BY 1 LIMIT 5;
Real cause (only visible in container stderr until the fix):
No matching signature for operator = for argument types: TIMESTAMP, DATE
Signature: T1 = T1
Unable to find common supertype for templated argument <T1>:
Input types for <T1>: {TIMESTAMP, DATE}
Client observation:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lost
Process exits with status 1, crash signature processTicksAndRejections.
Root cause (revised after empirical verification)
The original revision of this issue claimed the panic site was an .unwrap() at packages/cubejs-backend-native/src/stream.rs:258. That was wrong. A sentinel eprintln! placed at the top of js_stream_push_chunk (with the native module rebuilt) never fires for the failing query, meaning the entire FFI bridge for streaming chunks is bypassed. The crash happens upstream.
The actual root cause is in packages/cubejs-bigquery-driver/src/BigQueryDriver.ts (stream method):
public async stream(query, values): Promise<StreamTableData> {
  const stream = await this.bigquery.createQueryStream({...});
  const rowStream = new HydrationStream();
  stream.pipe(rowStream); // ← only forwards 'data'/'end', NOT 'error'
  return { rowStream };
}
Mechanism:
- The Cube native bridge (packages/cubejs-backend-native/js/index.ts:325-328) listens on rowStream.on('error', ...) to surface streaming errors to the wire layer.
- Node's stream.pipe(destination) does not forward 'error' events to the destination. When the underlying bigquery.createQueryStream source emits 'error', that event has no listener on rowStream.
- But it also has no listener on the source itself in this code, so Node falls back to its default unhandled-'error' behavior: an uncaught exception and process death.
- The bridge's outer try { await fn(...) } catch (e) cannot catch this, because await fn(...) already resolved (with the { rowStream } object) before the BigQuery HTTP call asynchronously failed.
The non-streaming path (driver.query) avoids this because the rejection propagates through await and the bridge's try/catch catches it.
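Why the bridge's try/catch is powerless here can be shown with a hypothetical fakeDriverStream standing in for BigQueryDriver.stream (the name and shape are mine, for illustration only):

```typescript
import { PassThrough } from "node:stream";

// Hypothetical stand-in for BigQueryDriver.stream(): the promise resolves
// immediately with { rowStream }; the failure arrives asynchronously later,
// the way a mid-stream BigQuery HTTP error does.
async function fakeDriverStream() {
  const rowStream = new PassThrough({ objectMode: true });
  setImmediate(() => rowStream.emit("error", new Error("mid-stream HTTP error")));
  return { rowStream };
}

async function demo(): Promise<string> {
  let outcome = "no error observed";
  try {
    // Resolves fine; the try/catch is already satisfied at this point.
    const { rowStream } = await fakeDriverStream();
    // Only an explicit listener (what index.ts:325-328 does) can observe
    // the late failure:
    rowStream.on("error", (e) => { outcome = `listener saw: ${e.message}`; });
  } catch {
    outcome = "try/catch saw it"; // never reached for a post-resolve failure
  }
  // Let the deferred emit fire before reporting.
  await new Promise((resolve) => setImmediate(() => setImmediate(resolve)));
  return outcome;
}

demo().then(console.log); // "listener saw: mid-stream HTTP error"
```

In the buggy driver there is no such listener anywhere, so the same late emit kills the process instead.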
Fix
stream.pipe(rowStream) → pipeline(stream, rowStream, () => {}) from node:stream. This:
- Auto-forwards source errors by destroying rowStream with the same error: the bridge's rowStream.on('error', ...) fires, and the wire layer emits a structured Postgres ErrorResponse (XX000) carrying the verbatim BigQuery message.
- Propagates consumer-side cancellation back to the source: destroying rowStream now destroys the BigQuery source stream too. Without this, an aborted BI query left the driver paging into the void (a separate, also-real resource leak).
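A standalone sketch of the fixed wiring, showing both directions (wireStreams is my own illustrative helper, again with PassThrough stand-ins rather than the actual driver diff):

```typescript
import { PassThrough, Readable, pipeline } from "node:stream";

// pipeline() instead of pipe(): source errors destroy rowStream (surfacing
// through its 'error' event), and destroying rowStream tears down the source.
function wireStreams(source: Readable, rowStream: PassThrough): PassThrough {
  pipeline(source, rowStream, () => {
    // Errors already reach consumers via rowStream's 'error' event; this
    // callback only keeps pipeline's own error from going unhandled.
  });
  return rowStream;
}

// Direction 1: a source error now reaches rowStream instead of killing Node.
const errSource = new PassThrough({ objectMode: true });
const errRows = wireStreams(errSource, new PassThrough({ objectMode: true }));
errRows.on("error", (e) => console.log("surfaced:", e.message));
errSource.destroy(new Error("No matching signature for operator ="));

// Direction 2: destroying rowStream destroys the source, so an aborted BI
// query no longer leaves the driver paging into the void.
const cancelSource = new PassThrough({ objectMode: true });
const cancelRows = wireStreams(cancelSource, new PassThrough({ objectMode: true }));
cancelRows.destroy();
```

The empty callback is deliberate: pipeline reports the error there as well, and without a callback (or with the promise form left unawaited) the same error would resurface as an unhandled rejection.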
Verification
End-to-end against real BigQuery via cube SQL API + psql (cube v1.6.46 with patched BigQueryDriver.js overlaid):
| Path | Before | After |
| --- | --- | --- |
| Successful 100k-row stream | works | works (no regression) |
| BigQuery TIMESTAMP=DATE error | container exits 1, server closed the connection unexpectedly | ERROR: XX000: Database Execution Error: No matching signature for operator = ... Signature: T1 = T1 ...; container alive |
Two synthetic unit tests added (packages/cubejs-bigquery-driver/test/BigQueryDriverStreamError.test.ts) — verified to time out without the fix and pass with the fix:
- forwards source-stream errors to the returned rowStream
- propagates rowStream destruction back to the source stream
A separate defensive hardening
While investigating, I also noticed an unrelated FFI panic vector in packages/cubejs-backend-native/src/stream.rs:258: a raw .unwrap() on transform_response. It is not the cause of this BigQuery crash (verified by the sentinel above), but it is a latent panic-across-the-FFI-boundary that any future driver mis-shaping a chunk would detonate. The PR therefore also contains a second commit converting that .unwrap() to a clean reject() + wait_for_future_and_execute_callback path, with its own synthetic regression test (malformed-chunk fixture).
Happy to put up the PR; it will be filed against cube-js/cube shortly.
Related
- #10643: the DATE() → TIMESTAMP coercion gap. The proximate cause of the most common production hit, but this stream-mode crash would fire on any mid-stream execution error.
A note for maintainers
The original root-cause section of this issue (claiming the .unwrap() at stream.rs:258) has been retracted above. I left the defensive hardening for that .unwrap() in the PR because it's a real (if narrow) bug, but the actual production fix is the BigQueryDriver.stream pipeline() change. Apologies for the noisy revision history on this issue — the diagnosis sharpened over the course of the investigation.