
feat(mcp): Add tools to modify stream sync settings and refresh catalog on existing connections#994

Open
Aaron ("AJ") Steers (aaronsteers) wants to merge 2 commits into main from
devin/1773770361-mcp-stream-sync-mode-refresh-catalog

Conversation

@aaronsteers Aaron ("AJ") Steers (aaronsteers) commented Mar 17, 2026

Summary

Adds two new MCP tools (and corresponding CloudConnection core methods) to support automated incremental stream testing and configuration workflows:

  1. refresh_connection_catalog — Triggers a discover operation on a connection's source (via withRefreshedCatalog: true on the Config API) and replaces the connection's catalog with the refreshed result. Equivalent to "Refresh source schema" in the UI.

  2. set_stream_sync_mode — Safely modifies the sync mode for a single stream in a connection's syncCatalog. Validates that the requested mode is in the stream's supportedSyncModes before applying. Optionally sets destinationSyncMode and cursorField.

Core logic lives in CloudConnection (connections.py) and api_util.py; MCP tools in cloud.py are thin wrappers per the presentation-layer pattern.
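
As a rough illustration of the validation described for set_stream_sync_mode, here is a minimal sketch that works on a plain catalog dict. The function name apply_stream_sync_mode, the use of ValueError, and the exact catalog shape (a "streams" list whose entries hold "stream" and "config" dicts) are assumptions made for the example, not the PR's actual implementation.

```python
from copy import deepcopy
from typing import Any, Optional


def apply_stream_sync_mode(
    catalog: dict[str, Any],
    stream_name: str,
    sync_mode: str,
    destination_sync_mode: Optional[str] = None,
    cursor_field: Optional[str] = None,
) -> dict[str, Any]:
    """Return a copy of `catalog` with one stream's sync settings updated.

    Hypothetical helper: the catalog layout is assumed, not taken from the PR.
    """
    updated = deepcopy(catalog)  # leave the caller's catalog untouched
    for entry in updated["streams"]:
        stream = entry.get("stream", {})
        if stream.get("name") != stream_name:
            continue
        supported = stream.get("supportedSyncModes", [])
        if sync_mode not in supported:
            # Reject modes the source did not advertise for this stream.
            raise ValueError(
                f"Sync mode {sync_mode!r} is not supported by stream "
                f"{stream_name!r}; supported: {supported}"
            )
        config = entry.setdefault("config", {})
        config["syncMode"] = sync_mode
        if destination_sync_mode is not None:
            config["destinationSyncMode"] = destination_sync_mode
        if cursor_field is not None:
            # The cursor is stored as a single-element list.
            config["cursorField"] = [cursor_field]
        return updated
    raise ValueError(f"Stream {stream_name!r} not found in catalog")
```

Returning a modified copy rather than mutating in place means a failed validation leaves the original catalog unchanged, which seems a reasonable property for a tool marked destructive.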

Closes #993

Review & Testing Checklist for Human

  • Namespace-qualified streams: set_stream_sync_mode matches streams by name only, not (name, namespace). Verify this is acceptable or if namespace matching is needed for multi-namespace sources.
  • refresh_catalog two-step flow: The method does a GET (with withRefreshedCatalog: true) then a separate POST to replace. Confirm this get-then-replace pattern is correct vs. a single update call.
  • End-to-end test: Pin a pre-release connector version to a test connection, call refresh_connection_catalog, then call set_stream_sync_mode to switch a stream from full_refresh to incremental. Verify the catalog updates persist and a sync succeeds.
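
To make the get-then-replace question in the checklist concrete, the two-step flow could be sketched as below. The two callables are hypothetical stand-ins for the Config API helpers, injected so the flow can run without network access; the sketch does not claim to match the PR's code.

```python
from typing import Any, Callable

Catalog = dict[str, Any]


def refresh_catalog(
    fetch_refreshed: Callable[[], Catalog],
    replace_catalog: Callable[[Catalog], None],
) -> int:
    """Fetch a freshly discovered catalog, then persist it in a second call.

    Both callables are assumptions for illustration only.
    """
    refreshed = fetch_refreshed()  # step 1: read with a forced re-discover
    streams = refreshed.get("streams")
    if not isinstance(streams, list):
        # Fail before writing anything if the response is malformed.
        raise ValueError("Refreshed catalog has no 'streams' list")
    replace_catalog(refreshed)  # step 2: overwrite the stored catalog
    return len(streams)


# In-memory fakes standing in for the two API calls.
stored: dict[str, Catalog] = {}
discovered: Catalog = {
    "streams": [{"stream": {"name": "users"}}, {"stream": {"name": "orders"}}]
}
stream_count = refresh_catalog(
    fetch_refreshed=lambda: discovered,
    replace_catalog=lambda catalog: stored.update(current=catalog),
)
```

Validating between the two calls means a malformed discover response is rejected before the write, so it cannot overwrite the stored catalog. A single update call would not offer that checkpoint.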

Notes

  • Both tools are gated by safe mode (check_guid_created_in_session) and marked destructive=True, consistent with other catalog-mutating tools.
  • No unit tests added: the core methods operate against the Config API, which requires integration-level testing.
  • The cursorField is set as a single-element list ([cursor_field]) to match the Airbyte catalog wire format.

Link to Devin session: https://app.devin.ai/sessions/dc6642a7916248b5ac6e92c493b8b870
Requested by: Aaron ("AJ") Steers (@aaronsteers)

Summary by CodeRabbit

  • New Features
    • Refresh source schema and catalog for connections to discover new or updated streams.
    • Per-stream configuration to set source sync mode, destination sync mode, and cursor field for incremental syncs.
    • New command/tool endpoints to trigger catalog refreshes and to update per-stream sync settings from the cloud management interface.


…og on existing connections

Adds two new MCP tools:
- refresh_connection_catalog: Triggers a discover operation on a connection's source
  and updates the catalog with latest stream definitions and sync modes
- set_stream_sync_mode: Safely changes the sync mode for a specific stream on a
  connection, with validation that the mode is supported

Core logic lives in CloudConnection (connections.py) and api_util.py, with MCP
tools as thin wrappers per the presentation layer pattern.

Closes #993

Co-Authored-By: AJ Steers <aj@airbyte.io>
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.


Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1773770361-mcp-stream-sync-mode-refresh-catalog' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1773770361-mcp-stream-sync-mode-refresh-catalog'

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /uv-lock - Updates uv.lock file
  • /test-pr - Runs tests with the updated PyAirbyte
  • /prerelease - Builds and publishes a prerelease version to PyPI

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.


@github-actions

github-actions bot commented Mar 17, 2026

PyTest Results (Fast Tests Only, No Creds)

343 tests  ±0   343 ✅ ±0   5m 49s ⏱️ +7s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit e6fce67. ± Comparison against base commit 71597ca.

♻️ This comment has been updated with latest results.

@aaronsteers Aaron ("AJ") Steers (aaronsteers) marked this pull request as ready for review March 17, 2026 18:14
@coderabbitai

coderabbitai bot commented Mar 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 7f2a65b1-c45d-49f4-ae6b-00cbcb2bcdda

📥 Commits

Reviewing files that changed from the base of the PR and between 6e19b02 and e6fce67.

📒 Files selected for processing (2)
  • airbyte/cloud/connections.py
  • airbyte/mcp/cloud.py

Walkthrough

Added a utility to request a refreshed connection catalog, CloudConnection methods to refresh a connection's catalog and to update a stream's sync settings, and MCP endpoints that expose catalog refresh and per-stream sync-mode modification (note: MCP functions are duplicated in the diff).

Changes

| Cohort / File(s) | Summary |
|---|---|
| API Utility Layer (airbyte/_util/api_util.py) | Added get_refreshed_connection_catalog(...) which calls the Config API with withRefreshedCatalog: true to return an updated connection catalog. |
| Connection Object Layer (airbyte/cloud/connections.py) | Added refresh_catalog() to trigger discover/refresh and replace a connection's catalog; added set_stream_sync_mode(...) to find a stream, validate supported sync modes and cursor requirements, modify the stream's config, and persist the updated catalog. |
| MCP Tool Integration (airbyte/mcp/cloud.py) | Added MCP endpoints refresh_connection_catalog(...) and set_stream_sync_mode(...) that validate context, resolve workspace, call the CloudConnection methods, and return human-readable confirmations. Note: both functions appear twice in the patch, producing duplicate declarations. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as MCP Client
    participant MCP as MCP Endpoint
    participant CloudConn as CloudConnection
    participant APIUtil as API Utility
    participant ConfigAPI as Config API

    rect rgba(100, 150, 200, 0.5)
    Note over Client,ConfigAPI: Refresh Catalog Flow
    Client->>MCP: refresh_connection_catalog(connection_id)
    MCP->>CloudConn: connection.refresh_catalog()
    CloudConn->>APIUtil: get_refreshed_connection_catalog(...)
    APIUtil->>ConfigAPI: GET /connections/get?withRefreshedCatalog=true
    ConfigAPI-->>APIUtil: Updated catalog
    APIUtil-->>CloudConn: Catalog dict
    CloudConn->>CloudConn: Validate & store catalog
    CloudConn-->>MCP: Confirmation message
    MCP-->>Client: Success with stream count & URL
    end

    rect rgba(150, 100, 200, 0.5)
    Note over Client,CloudConn: Set Stream Sync Mode Flow
    Client->>MCP: set_stream_sync_mode(connection_id, stream_name, sync_mode, ...)
    MCP->>CloudConn: connection.set_stream_sync_mode(...)
    CloudConn->>CloudConn: Locate stream in catalog
    CloudConn->>CloudConn: Validate sync_mode in supportedSyncModes
    CloudConn->>CloudConn: Update stream config
    CloudConn->>CloudConn: Save updated catalog
    CloudConn-->>MCP: Success
    MCP-->>Client: Confirmation with stream & connection details
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Out of Scope Changes check | ❓ Inconclusive | The summary notes that refresh_connection_catalog and set_stream_sync_mode appear duplicated in the diff within airbyte/mcp/cloud.py, which appears to be duplicate function declarations rather than intentional changes. | Please clarify whether the duplicate function declarations in airbyte/mcp/cloud.py are intentional or accidental; if accidental, one set should be removed to keep the implementation clean. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main changes: adding MCP tools for stream sync settings and catalog refresh on existing connections. |
| Linked Issues check | ✅ Passed | The PR successfully implements both core requirements from #993: refresh_connection_catalog triggers a discover operation with withRefreshedCatalog, and set_stream_sync_mode safely modifies sync settings with supportedSyncModes validation. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


coderabbitai[bot]

This comment was marked as resolved.

…am_sync_mode

Addresses CodeRabbit review feedback:
- Add stream_namespace parameter to disambiguate same-named streams in
  different namespaces. Raises PyAirbyteInputError when name is ambiguous.
- Add fail-fast guard when switching to incremental mode without a usable
  cursor field (no existing cursor, no default cursor, no source-defined cursor).

Co-Authored-By: AJ Steers <aj@airbyte.io>
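
The two guards described in this commit message might look roughly like the following sketch. The helper names are hypothetical, ValueError stands in for PyAirbyteInputError, and the field names defaultCursorField and sourceDefinedCursor are assumed from the Airbyte catalog format rather than confirmed by this PR.

```python
from typing import Any, Optional


def find_stream_entry(
    catalog: dict[str, Any],
    stream_name: str,
    stream_namespace: Optional[str] = None,
) -> dict[str, Any]:
    """Locate one stream entry by name, optionally narrowed by namespace."""
    matches = [
        entry
        for entry in catalog["streams"]
        if entry.get("stream", {}).get("name") == stream_name
        and (
            stream_namespace is None
            or entry.get("stream", {}).get("namespace") == stream_namespace
        )
    ]
    if not matches:
        raise ValueError(f"Stream {stream_name!r} not found")
    if len(matches) > 1:
        # Same name in several namespaces: refuse to guess.
        raise ValueError(
            f"Stream name {stream_name!r} is ambiguous; pass stream_namespace"
        )
    return matches[0]


def require_usable_cursor(
    entry: dict[str, Any], cursor_field: Optional[str] = None
) -> None:
    """Fail fast if incremental mode would have no cursor to work with."""
    stream = entry.get("stream", {})
    config = entry.get("config", {})
    has_cursor = bool(
        cursor_field  # explicitly supplied by the caller
        or config.get("cursorField")  # already configured on the stream
        or stream.get("defaultCursorField")  # assumed field name
        or stream.get("sourceDefinedCursor")  # assumed field name
    )
    if not has_cursor:
        raise ValueError("Incremental sync requires a cursor field")
```

Raising on an ambiguous name, instead of picking the first match, keeps a destructive tool from silently editing the wrong stream.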

@devin-ai-integration devin-ai-integration bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

@github-actions

PyTest Results (Full)

413 tests  ±0   395 ✅ ±0   24m 37s ⏱️ +5s
  1 suites ±0    18 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit e6fce67. ± Comparison against base commit 71597ca.

Copilot AI left a comment

Pull request overview

Adds new Airbyte Cloud MCP tools and core CloudConnection helpers to (1) refresh a connection’s catalog via a forced re-discover and (2) update per-stream sync mode settings safely.

Changes:

  • Add MCP tool refresh_connection_catalog to trigger a refreshed discover and replace the connection catalog.
  • Add MCP tool set_stream_sync_mode to update a single stream’s syncMode (optionally namespace-qualified) and related config.
  • Add Config API helper get_refreshed_connection_catalog and CloudConnection methods refresh_catalog / set_stream_sync_mode.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| airbyte/mcp/cloud.py | Adds two new MCP tool wrappers that call CloudConnection methods and return human-readable status strings. |
| airbyte/cloud/connections.py | Implements the core catalog refresh + per-stream sync mode update logic against the stored syncCatalog. |
| airbyte/_util/api_util.py | Adds a Config API helper to fetch a connection with withRefreshedCatalog: true. |

Comment on lines +2712 to +2722
    return (
        f"Successfully set sync mode for stream '{stream_name}' "
        f"on connection '{connection_id}' to '{sync_mode}'"
        + (
            f" with destination sync mode '{destination_sync_mode}'"
            if destination_sync_mode
            else ""
        )
        + (f" and cursor field '{cursor_field}'" if cursor_field else "")
        + f". URL: {connection.connection_url}"
    )

Reasonable observation. Including namespace in the success message when it was explicitly supplied would improve clarity for multi-namespace sources. Will add if the human reviewer agrees.
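
One way the namespace could be folded into the success message is sketched below. The helper name format_sync_mode_message and the "(namespace '...')" wording are assumptions for illustration; the rest of the message follows the snippet under review.

```python
from typing import Optional


def format_sync_mode_message(
    *,
    stream_name: str,
    connection_id: str,
    sync_mode: str,
    connection_url: str,
    stream_namespace: Optional[str] = None,
    destination_sync_mode: Optional[str] = None,
    cursor_field: Optional[str] = None,
) -> str:
    """Build the confirmation string, mentioning only the options supplied."""
    message = f"Successfully set sync mode for stream '{stream_name}'"
    if stream_namespace:
        # Hypothetical wording for the namespace clause.
        message += f" (namespace '{stream_namespace}')"
    message += f" on connection '{connection_id}' to '{sync_mode}'"
    if destination_sync_mode:
        message += f" with destination sync mode '{destination_sync_mode}'"
    if cursor_field:
        message += f" and cursor field '{cursor_field}'"
    return f"{message}. URL: {connection_url}"
```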

Comment on lines +681 to +688
    api_util.replace_connection_catalog(
        connection_id=self.connection_id,
        configured_catalog_dict=refreshed_catalog,
        api_root=self.workspace.api_root,
        client_id=self.workspace.client_id,
        client_secret=self.workspace.client_secret,
        bearer_token=self.workspace.bearer_token,
    )

Good suggestion. import_raw_catalog does the same replace_connection_catalog call with the same auth params. Switching to self.import_raw_catalog(refreshed_catalog) would reduce duplication. Will apply if the human reviewer agrees.

Comment on lines +742 to +745
    available_streams = [
        f"{e.get('stream', {}).get('namespace', '')}.{e.get('stream', {}).get('name', '')}"
        for e in catalog["streams"]
    ]

Valid observation: when namespace is None, this produces .users. However, this is a diagnostic context field on an error (not user-facing UI), so the dot-prefix still communicates "no namespace" clearly enough. Happy to clean this up if the human reviewer agrees it's worth addressing.
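
If the cleanup were taken up, a small helper along these lines could produce the labels. The name qualified_stream_name is hypothetical. The sketch also covers a namespace key that is present but set to None, a case where the expression under review would render "None.users", because dict.get only falls back to its default when the key is missing.

```python
from typing import Any


def qualified_stream_name(entry: dict[str, Any]) -> str:
    """Render 'namespace.name', or just 'name' when no namespace is set."""
    stream = entry.get("stream", {})
    name = stream.get("name", "")
    namespace = stream.get("namespace")  # may be missing, None, or empty
    return f"{namespace}.{name}" if namespace else name
```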

Comment on lines +774 to +781
    raise PyAirbyteInputError(
        message=(f"Sync mode '{sync_mode}' is not supported by stream '{stream_name}'."),
        context={
            "stream_name": stream_name,
            "requested_sync_mode": sync_mode,
            "supported_sync_modes": supported_sync_modes,
        },
    )

Fair point on consistency: connection_id is included in the other error contexts in this method. Will add if the human reviewer considers it worth a follow-up commit.

        config["destinationSyncMode"] = destination_sync_mode

    if cursor_field is not None:
        config["cursorField"] = [cursor_field]

This is by design: allowing cursor_field to be set during full_refresh enables pre-configuring the cursor before a subsequent switch to incremental mode (a common two-step workflow). The docstring could be clearer about this, but silently ignoring the parameter would be surprising. Deferring to human reviewer on whether to add a note in the docstring.
