feat: Add preflight check to destination smoke test tool by aaronsteers · Pull Request #1007 · airbytehq/PyAirbyte

Aaron ("AJ") Steers (aaronsteers) · 2026-03-26T19:27:45Z

Summary

Closes #1006.

Adds an automatic preflight check to the destination smoke test tool. Before running the user-requested scenarios, the tool now writes basic_types to the destination as a lightweight connectivity/credential validation. If the preflight fails, the tool returns early with a structured error instead of running all scenarios against a broken destination.

Also significantly improves error surfacing throughout the smoke test:

_sanitize_error() now resolves errors in priority order: (1) connector log file TRACE ERROR internal_message, (2) original_exception chain, (3) top-level exception. This means failures like bad credentials now show java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available... instead of the generic AirbyteConnectorWriteError: Connector failed.
Readback failures (e.g. cache connection issues that cause table_statistics: null) are now captured in a new warnings field instead of being silently logged.

Changes:

DestinationSmokeTestResult gets preflight_passed: bool | None (True/False/None=skipped) and warnings: list[str] | None
_run_preflight() runs basic_types write, returns (passed, sanitized_error)
_extract_trace_error_from_log() parses the connector's log file for TRACE ERROR JSON and extracts the internal_message field
_sanitize_error() uses a 3-tier resolution: log file → exception chain → top-level message
run_destination_smoke_test() accepts skip_preflight: bool = False
Readback exceptions are captured into warnings with sanitized messages
MCP tool and CLI both accept the new skip_preflight parameter
_prepare_destination_config() moved before preflight so both preflight and main run use the prepared config

Updates since last revision

Log file error extraction: Added _extract_trace_error_from_log() to parse the connector's log file for TRACE ERROR internal_message. Many Java-based connectors emit a generic message (e.g. "Something went wrong in the connector") while the actionable detail lives only in internal_message. This is now tried first in _sanitize_error().
Preflight error preserved: _run_preflight() returns tuple[bool, str | None] so the sanitized connector error is included in the result instead of being replaced by a generic wrapper.
Readback warnings: Readback failures are now captured in a warnings field on the result model, making it clear why table_statistics is null (e.g. PyAirbyteSecretNotFoundError: Secret not found. for Snowflake).

Local test results

Postgres (Docker container, destination-postgres):

✅ skip_preflight=True, basic_types — success, full table_statistics with column stats
✅ Preflight enabled, basic_types — preflight_passed: true, full table_statistics
✅ Bad credentials — preflight_passed: false, error: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 60001ms

Snowflake (GSM creds, destination-snowflake):

✅ skip_preflight=True, basic_types — success, 3 records delivered
✅ Preflight enabled — preflight_passed: true, 3 records delivered
⚠️ table_statistics: null with warning: PyAirbyteSecretNotFoundError: Secret not found. — pre-existing bug in _dest_to_cache.py where Snowflake credentials dict lacks a password attribute (outside scope of this PR, but now surfaced via warnings)

Review & Testing Checklist for Human

_extract_trace_error_from_log reads entire log file into memory: For long-running syncs with verbose logging, the log file could be large. The function reads all lines then scans in reverse. Confirm this is acceptable for expected log sizes.
_sanitize_error prefers log file over exception chain: The log file internal_message is tried first. If the log contains a TRACE ERROR from a different failure than the one that raised the exception, the wrong message could be surfaced. Verify this ordering is correct for the common case.
_run_preflight catches broad Exception: Intentional (preflight should never crash the main flow), but verify this aligns with the project's exception handling philosophy.
basic_types may be written twice: If the user's scenarios include basic_types (e.g. fast), the preflight writes it first, then it's written again. Confirm append semantics are acceptable.
Test with a real destination: Run pyab destination-smoke-test --destination=destination-postgres with good and bad credentials. Verify the error message in the bad-credentials case is actionable (not "Connector failed.").

Notes

No new automated tests added — this feature primarily needs integration testing with real destinations, which is credential-gated.
The Snowflake table_statistics: null issue is a pre-existing bug in snowflake_destination_to_cache() where DestinationSnowflake.credentials is parsed as a dict without a password attribute. This PR surfaces the issue via warnings but does not fix the root cause.
elapsed_seconds is 0.0 in the preflight-failure early return since the preflight is not timed separately.

Link to Devin session: https://app.devin.ai/sessions/e2af697f2eb64507b4bbbced88c9af6c
Requested by: Aaron ("AJ") Steers (@aaronsteers)

Summary by CodeRabbit

Release Notes

New Features
- Destination smoke tests now perform automatic preflight validation before main tests
- Added --skip-preflight CLI option to optionally disable preflight checks
- Tests now collect and report warnings for non-fatal issues encountered
Bug Fixes
- Improved error detection and reporting with enhanced error message extraction from connector logs

Important

Auto-merge enabled.

This PR is set to merge automatically when all requirements are met.

- Run basic_types scenario before requested scenarios to validate credentials and connectivity - Return early with structured error if preflight fails - Add skip_preflight parameter to core function, MCP tool, and CLI - Improve error surfacing by walking exception chain to find the actual connector error message instead of generic wrapper - Add preflight_passed field to DestinationSmokeTestResult model Closes #1006 Co-Authored-By: AJ Steers <aj@airbyte.io>

devin-ai-integration · 2026-03-26T19:27:49Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

github-actions · 2026-03-26T19:28:16Z

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1774553037-preflight-smoke-test' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1774553037-preflight-smoke-test'

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

/fix-pr - Fixes most formatting and linting issues
/uv-lock - Updates uv.lock file
/test-pr - Runs tests with the updated PyAirbyte
/prerelease - Builds and publishes a prerelease version to PyPI

📚 Show Repo Guidance

Helpful Resources

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.

📝 Edit this welcome message.

coderabbitai · 2026-03-26T19:31:48Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds an optional preflight (basic_types) write to destination smoke tests (skippable), surfaces connector internal errors from logs, returns preflight_passed and warnings in results, and exposes --skip-preflight CLI flag and MCP parameter.

Changes

Cohort / File(s)	Summary
Core Preflight & Result Model `airbyte/_util/destination_smoke_tests.py`	Introduced `PREFLIGHT_SCENARIO`/`PREFLIGHT_STREAM_NAME`, `_build_preflight_scenario(...)`, `_run_preflight(...)`; added `preflight_passed: bool
CLI Integration `airbyte/cli/pyab.py`	Added `--skip-preflight` click option and forwarded `skip_preflight` into `run_destination_smoke_test(...)`.
MCP Tool Integration `airbyte/mcp/local.py`	Added `skip_preflight: bool = False` parameter to `destination_smoke_test(...)` and forwarded it to `run_destination_smoke_test(...)` with updated help text.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant User
participant CLI_MCP as CLI/MCP
participant Runner as SmokeTestRunner
participant Destination
participant Logger

User->>CLI_MCP: invoke destination_smoke_test (maybe --skip-preflight)
CLI_MCP->>Runner: run_destination_smoke_test(skip_preflight)
Runner->>Runner: _prepare_destination_config()
alt preflight not skipped
    Runner->>Destination: write(PREFLIGHT_SCENARIO, cache=False, state_cache=False)
    Destination-->>Runner: success / exception
    alt exception
        Runner->>Runner: sanitize error (log trace extraction)
        Runner->>Logger: log preflight failure (exc_info)
        Runner-->>CLI_MCP: return { success:false, records_delivered:0, error, preflight_passed:false }
        CLI_MCP-->>User: output result
        return
    end
end
Runner->>Destination: write(requested scenarios)
Destination-->>Runner: records written / exception
Runner->>Destination: readback()
Destination-->>Runner: readback data / exception
Runner->>Runner: collect/sanitize warnings (including readback failures)
Runner->>Logger: log warnings (if any)
Runner-->>CLI_MCP: return result(success, records_delivered, preflight_passed, warnings)
CLI_MCP-->>User: output result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Would you like a brief changelog entry or an example CLI usage line for --skip-preflight, wdyt?

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly describes the main feature: adding a preflight check to the destination smoke test tool, which aligns with the primary changeset objective.
Linked Issues check	✅ Passed	The PR implements all coding requirements from issue `#1006`: preflight check with basic_types, early return on failure, skip_preflight parameter, improved error surfacing via trace logs, and warnings field for readback failures.
Out of Scope Changes check	✅ Passed	All changes are directly aligned with the preflight check feature scope: smoke test utility, CLI integration, MCP integration, and error sanitization improvements.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch devin/1774553037-preflight-smoke-test

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (1)

airbyte/_util/destination_smoke_tests.py (1)
269-283: Consider handling non-PyAirbyte exceptions in the chain?

The chain-walking logic is a nice improvement! One thing I noticed: if the deepest exception is a standard Python exception (like ValueError or ConnectionError) without get_message(), we skip it and fall back to the wrapper's message. This could lose the actual root cause message.

For example, with a chain like AirbyteConnectorWriteError(original_exception=ConnectionError("Connection refused")), we'd still get the generic wrapper message instead of "Connection refused".

What do you think about adding a fallback for non-PyAirbyte exceptions, something like this?
💡 Suggested enhancement
     # If we found a deeper exception with a specific message, use it
-    if deepest is not ex and hasattr(deepest, "get_message"):
+    if deepest is not ex:
+      if hasattr(deepest, "get_message"):
         deep_msg = deepest.get_message()
         # Only use the deep message if it's more specific than the wrapper
         if deep_msg and hasattr(ex, "get_message") and deep_msg != ex.get_message():
             return f"{type(ex).__name__}: {deep_msg}"
+      else:
+        # For standard exceptions, use str() to get the message
+        deep_msg = str(deepest)
+        if deep_msg and hasattr(ex, "get_message") and deep_msg != ex.get_message():
+            return f"{type(ex).__name__}: {deep_msg}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte/_util/destination_smoke_tests.py` around lines 269 - 283, The
chain-walking currently prefers deepest exceptions only if they implement
get_message(); update the logic in destination_smoke_tests.py around variables
ex and deepest so that if deepest lacks get_message() but str(deepest) yields a
non-empty, more specific message than ex.get_message(), you return
f"{type(ex).__name__}: {str(deepest)}" (i.e., treat standard Python exceptions
like ConnectionError/ValueError as valid root-cause messages); retain the
existing fallback to ex.get_message() and final fallback to str(ex) if no
get_message() exists.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@airbyte/_util/destination_smoke_tests.py`:
- Around line 269-283: The chain-walking currently prefers deepest exceptions
only if they implement get_message(); update the logic in
destination_smoke_tests.py around variables ex and deepest so that if deepest
lacks get_message() but str(deepest) yields a non-empty, more specific message
than ex.get_message(), you return f"{type(ex).__name__}: {str(deepest)}" (i.e.,
treat standard Python exceptions like ConnectionError/ValueError as valid
root-cause messages); retain the existing fallback to ex.get_message() and final
fallback to str(ex) if no get_message() exists.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9307797b-c09e-41ef-b161-6182b4909d09

📥 Commits

Reviewing files that changed from the base of the PR and between 65e1c03 and ab5c061.

📒 Files selected for processing (3)

airbyte/_util/destination_smoke_tests.py
airbyte/cli/pyab.py
airbyte/mcp/local.py

…walking - Add 'warnings' field to DestinationSmokeTestResult to surface readback failures (e.g. cache connection issues) that cause table_statistics to be null - Improve _sanitize_error to handle standard Python exceptions (ConnectionError, ValueError, etc.) in the exception chain, not just PyAirbyte exceptions with get_message() Co-Authored-By: AJ Steers <aj@airbyte.io>

github-actions · 2026-03-26T19:40:29Z

PyTest Results (Fast Tests Only, No Creds)

343 tests ±0 343 ✅ ±0 5m 38s ⏱️ +16s
1 suites ±0 0 💤 ±0
1 files ±0 0 ❌ ±0

Results for commit e100ed2. ± Comparison against base commit 65e1c03.

♻️ This comment has been updated with latest results.

_run_preflight() now returns (bool, str | None) so the caller includes the sanitized connector error in the structured result instead of a generic 'verify credentials' message. Co-Authored-By: AJ Steers <aj@airbyte.io>

github-actions · 2026-03-26T20:19:23Z

PyTest Results (Full)

413 tests ±0 395 ✅ ±0 40m 16s ⏱️ + 15m 4s
1 suites ±0 18 💤 ±0
1 files ±0 0 ❌ ±0

Results for commit e100ed2. ± Comparison against base commit 65e1c03.

♻️ This comment has been updated with latest results.

…ter error surfacing Co-Authored-By: AJ Steers <aj@airbyte.io>

…main run Co-Authored-By: AJ Steers <aj@airbyte.io>

The top-level import of PREDEFINED_SCENARIOS from airbyte.cli.smoke_test_source._scenarios caused a circular import that broke test collection. Replace with inline scenario data (mirrors the basic_types schema) to avoid the cycle. Co-Authored-By: AJ Steers <aj@airbyte.io>

airbyte/_util/destination_smoke_tests.py

Copilot

Pull request overview

Adds an automatic preflight connectivity/credentials check to the destination smoke test workflow and improves how failures and non-fatal readback issues are surfaced to callers (CLI + MCP).

Changes:

Introduces skip_preflight flag to CLI and MCP tool interfaces.
Adds preflight execution (writes a small basic_types-derived scenario) and returns early with a structured error on failure.
Enhances error sanitization (log TRACE internal_message extraction, exception-chain selection) and surfaces readback failures as warnings.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
`airbyte/mcp/local.py`	Adds `skip_preflight` parameter to the MCP tool and forwards it into the shared smoke test runner.
`airbyte/cli/pyab.py`	Adds `--skip-preflight` CLI flag and forwards it into the shared smoke test runner.
`airbyte/_util/destination_smoke_tests.py`	Implements preflight logic, extends result model with `preflight_passed`/`warnings`, and improves error/warning surfacing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

airbyte/_util/destination_smoke_tests.py

…tatic logger template Co-Authored-By: AJ Steers <aj@airbyte.io>

… file reading Co-Authored-By: AJ Steers <aj@airbyte.io>

…ions Co-Authored-By: AJ Steers <aj@airbyte.io>

devin-ai-integration bot assigned Aaron ("AJ") Steers (aaronsteers) Mar 26, 2026

coderabbitai bot reviewed Mar 26, 2026

View reviewed changes

coderabbitai bot approved these changes Mar 26, 2026

View reviewed changes

This comment was marked as resolved.

Sign in to view

fix: Surface actual connector error in preflight failure result

0c851c2

_run_preflight() now returns (bool, str | None) so the caller includes the sanitized connector error in the structured result instead of a generic 'verify credentials' message. Co-Authored-By: AJ Steers <aj@airbyte.io>

coderabbitai bot approved these changes Mar 26, 2026

View reviewed changes

feat: Extract TRACE ERROR internal_message from connector log for bet…

192a65c

…ter error surfacing Co-Authored-By: AJ Steers <aj@airbyte.io>

This comment was marked as resolved.

Sign in to view

devin-ai-integration bot and others added 2 commits March 26, 2026 20:50

fix: Use dedicated stream name for preflight to avoid collision with …

a89d3b6

…main run Co-Authored-By: AJ Steers <aj@airbyte.io>

Aaron ("AJ") Steers (aaronsteers) marked this pull request as ready for review March 26, 2026 20:56

Copilot AI review requested due to automatic review settings March 26, 2026 20:56

Aaron ("AJ") Steers (aaronsteers) commented Mar 26, 2026

View reviewed changes

airbyte/_util/destination_smoke_tests.py Show resolved Hide resolved

Copilot started reviewing on behalf of Aaron ("AJ") Steers (aaronsteers) March 26, 2026 20:57 View session

This comment was marked as resolved.

Sign in to view

Copilot AI reviewed Mar 26, 2026

View reviewed changes

airbyte/_util/destination_smoke_tests.py Outdated Show resolved Hide resolved

airbyte/_util/destination_smoke_tests.py Outdated Show resolved Hide resolved

airbyte/_util/destination_smoke_tests.py Outdated Show resolved Hide resolved

fix: Prevent preflight from expanding to all fast scenarios and use s…

7ed0018

…tatic logger template Co-Authored-By: AJ Steers <aj@airbyte.io>

This comment was marked as resolved.

Sign in to view

devin-ai-integration bot and others added 2 commits March 26, 2026 21:01

fix: Handle wrapper comparison for standard exceptions and harden log…

3a3aa2c

… file reading Co-Authored-By: AJ Steers <aj@airbyte.io>

docs: Use Markdown single backticks in docstrings per project convent…

e100ed2

…ions Co-Authored-By: AJ Steers <aj@airbyte.io>

coderabbitai bot approved these changes Mar 26, 2026

View reviewed changes

Aaron ("AJ") Steers (aaronsteers) enabled auto-merge (squash) March 26, 2026 21:11

Aaron ("AJ") Steers (aaronsteers) merged commit 3873093 into main Mar 26, 2026
22 checks passed

Aaron ("AJ") Steers (aaronsteers) deleted the devin/1774553037-preflight-smoke-test branch March 26, 2026 21:45

Conversation

Aaron ("AJ") Steers (aaronsteers) commented Mar 26, 2026 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Updates since last revision

Local test results

Review & Testing Checklist for Human

Notes

Summary by CodeRabbit

Release Notes

Uh oh!

devin-ai-integration bot commented Mar 26, 2026

🤖 Devin AI Engineer

Uh oh!

github-actions bot commented Mar 26, 2026

👋 Greetings, Airbyte Team Member!

Testing This PyAirbyte Version

PR Slash Commands

Helpful Resources

Community Support

Uh oh!

coderabbitai bot commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PyTest Results (Fast Tests Only, No Creds)

Uh oh!

This comment was marked as resolved.

Uh oh!

github-actions bot commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PyTest Results (Full)

Uh oh!

This comment was marked as resolved.

Uh oh!

Uh oh!

This comment was marked as resolved.

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

This comment was marked as resolved.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Aaron ("AJ") Steers (aaronsteers) commented Mar 26, 2026 •

edited by github-actions bot

Loading

coderabbitai bot commented Mar 26, 2026 •

edited

Loading

github-actions bot commented Mar 26, 2026 •

edited

Loading

github-actions bot commented Mar 26, 2026 •

edited

Loading