
Implement Scheduler task with dual-channel coordination (Task ID: task-2.1-implement-scheduler) #356

Draft
eric-wang-1990 wants to merge 6 commits into main from stack/pr-phase2-core-pipeline

Conversation

@eric-wang-1990
Collaborator

@eric-wang-1990 eric-wang-1990 commented Mar 18, 2026

🥞 Stacked PR

Use this link to review incremental changes.


What's Changed

Please fill in a description of the changes here.

This contains breaking changes.

Closes #NNN.

@eric-wang-1990
Collaborator Author

[Critical] String-based 403/401 detection is fragile

worker.rs detects auth/expiry errors by checking error_str.contains("401") / contains("403"). This can false-positive on error messages that happen to contain those digit sequences (e.g. "error code 4035"), and will break silently if the error format ever changes.

Prefer a structured check. Since DatabricksErrorHelper::io() produces errors with the message "HTTP 403 - ...", at minimum scope the match more tightly:

let is_auth_error = error_str.contains("HTTP 401")
    || error_str.contains("HTTP 403");

Long-term, add a typed error variant (e.g. ErrorKind::HttpStatus(u16)) so callers can match on status code directly.
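The typed-variant idea can be sketched roughly as follows. This is a hypothetical shape, not the crate's actual API: `ErrorKind`, its variants, and `is_auth_error` are all placeholder names for illustration.

```rust
// Hypothetical sketch of a typed error kind; the real error type used by
// DatabricksErrorHelper may differ, so treat these names as placeholders.
#[derive(Debug, PartialEq)]
enum ErrorKind {
    HttpStatus(u16),
    Io(String),
}

// Callers match on the status code directly instead of substring-searching
// the rendered message.
fn is_auth_error(kind: &ErrorKind) -> bool {
    matches!(kind, ErrorKind::HttpStatus(401 | 403))
}

fn main() {
    assert!(is_auth_error(&ErrorKind::HttpStatus(403)));
    // A message containing "4035" can no longer false-positive,
    // because the status travels as structured data, not text.
    assert!(!is_auth_error(&ErrorKind::Io("error code 4035".to_string())));
    println!("ok");
}
```

With this in place, the `contains("HTTP 401")` checks become unnecessary and the worker is insulated from message-format changes.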


This comment was generated with GitHub MCP.

@eric-wang-1990
Collaborator Author

[High] Scheduler's bounded result_channel send has no timeout or cancellation escape

If the consumer stops reading (e.g. drops or panics mid-stream without cancelling), result_channel.send(handle).await in the scheduler will block forever since the channel is bounded. The scheduler never checks the cancellation token while blocked on this send.

Wrap the send in a tokio::select! with the cancellation token:

tokio::select! {
    res = result_channel.send(handle) => {
        if res.is_err() { /* consumer dropped */ break; }
    }
    _ = cancel_token.cancelled() => break,
}

This comment was generated with GitHub MCP.

@eric-wang-1990
Collaborator Author

[High] Empty-batch chunks cause an infinite loop in the consumer

In consumer.rs, when a chunk returns an empty batches vec the code logs a warning and continues the outer loop. But chunk_index is not advanced, so the consumer re-fetches the same chunk handle repeatedly, looping forever on CPU.

Either treat empty batches as an error, or advance chunk_index before continuing:

if batches.is_empty() {
    warn!("Chunk {} returned empty batches", chunk_index);
    return Err(DatabricksErrorHelper::invalid_state()
        .message(format!("Chunk {} returned no Arrow batches", chunk_index)));
}
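For the other option mentioned above, advancing `chunk_index` before continuing, here is a minimal stand-in sketch. It uses plain vectors in place of the real chunk handles, and `drain_chunks` is a hypothetical name, so this only illustrates the loop shape, not the actual consumer.

```rust
// Stand-in for the consumer loop: each inner Vec plays the role of one
// chunk's Arrow batches.
fn drain_chunks(chunks: &[Vec<u64>]) -> usize {
    let mut chunk_index = 0;
    let mut batches_seen = 0;
    while chunk_index < chunks.len() {
        let batches = &chunks[chunk_index];
        if batches.is_empty() {
            // The fix: advance past the empty chunk so the loop cannot
            // re-fetch the same handle forever.
            chunk_index += 1;
            continue;
        }
        batches_seen += batches.len();
        chunk_index += 1;
    }
    batches_seen
}

fn main() {
    // The middle chunk is empty; the loop still terminates and counts
    // the three non-empty batches.
    let chunks = vec![vec![1, 2], vec![], vec![3]];
    assert_eq!(drain_chunks(&chunks), 3);
    println!("ok");
}
```

Either way, the invariant to enforce is that every iteration of the outer loop makes progress on `chunk_index`.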

This comment was generated with GitHub MCP.

