@maxi297 maxi297 commented Jul 31, 2025

What

We want to start migrating some streams from DeclarativeStream to DefaultStream. However, the check operation has not been implemented for DefaultStream. Check and availability strategies seem to be intertwined, i.e. both the declarative and file-based code call them. I wanted to clean this up before actually implementing check for DefaultStream, to make sure we don't migrate anything that isn't necessary.

History Lesson

Availability strategies were initially added to be called during READ commands so that one stream would not break all the other streams in the sync: we would execute the availability strategy (most of the time by trying to fetch the first records) and, if it failed, simply skip that stream. This was problematic (we had to fetch some records that we would later fetch again during the actual sync) and could be solved by improving our error handling instead. Hence, we started deprecating it. However, the deprecation came a couple of months after we started the concurrent framework, so availability strategies were partially implemented there as well (see the Stream.check_availability method). When the cleanup started on the legacy CDK, the concurrent part was never cleaned up. A usage was also added in the File Based CDK, but this one seems legitimate, so it seems we need to keep it.
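To make the trade-off above concrete, here is a minimal sketch of the two approaches. It is not CDK code: `is_available`, `sync_with_probe`, `sync_with_error_handling` and the `ReadFn` alias are hypothetical names invented for illustration, and each stream is modelled as a plain callable that returns its records.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]
ReadFn = Callable[[], Iterable[Record]]  # hypothetical stand-in for a stream's read


def is_available(read: ReadFn) -> bool:
    """Availability-strategy style probe: try to fetch the first record."""
    try:
        next(iter(read()), None)
        return True
    except Exception:
        return False


def sync_with_probe(streams: Dict[str, ReadFn]) -> Tuple[List[Record], List[str]]:
    """Probe each stream first, then read it; unavailable streams are skipped."""
    records: List[Record] = []
    skipped: List[str] = []
    for name, read in streams.items():
        if not is_available(read):  # extra fetch, repeated by the real read below
            skipped.append(name)
            continue
        records.extend(read())
    return records, skipped


def sync_with_error_handling(streams: Dict[str, ReadFn]) -> Tuple[List[Record], List[str]]:
    """Read each stream once; a failure only affects the stream that raised."""
    records: List[Record] = []
    failed: List[str] = []
    for name, read in streams.items():
        try:
            records.extend(read())
        except Exception:
            failed.append(name)
    return records, failed
```

In this toy model both functions isolate a failing stream, but the probe variant calls `read` twice for every healthy stream, which is the redundant fetch described above.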

What it means for this PR

  • Remove the availability strategies from the concurrent CDK
    • This includes removing the counterpart in the File Based CDK
  • Remove the deprecation mention on the File Based CDK as it is used (rightfully so I guess) here
  • Keep the HttpAvailabilityStrategy as it is the default strategy for HttpStream. I assume this will be moved to a legacy package once we have the AbstractStream version of it and we are ready to do the breaking change

Summary by CodeRabbit

  • Refactor

    • Removed all availability strategy classes and related functionality from concurrent streaming modules.
    • Updated method signatures and removed deprecated decorators for improved consistency.
    • Eliminated unused classes and parameters to streamline codebase.
  • Tests

    • Removed tests related to availability strategy functionality and updated test scenarios to reflect code changes.
  • Chores

    • Added a minor comment for future reference in the codebase.

@maxi297 maxi297 requested a review from brianjlai July 31, 2025 13:37

maxi297 commented Jul 31, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formatting issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.


👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@maxi297/remove-availability-strategy-except-for-filebased#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch maxi297/remove-availability-strategy-except-for-filebased

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe <command> - Runs any poe command in the CDK environment

📝 Edit this welcome message.


github-actions bot commented Jul 31, 2025

PyTest Results (Fast)

3 690 tests   - 5   3 679 ✅  - 5   6m 31s ⏱️ -7s
    1 suites ±0      11 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 0f36dc5. ± Comparison against base commit 209cb22.

This pull request removes 5 tests.
unit_tests.sources.streams.concurrent.test_adapters ‑ test_availability_strategy_facade[test_stream_is_available0]
unit_tests.sources.streams.concurrent.test_adapters ‑ test_availability_strategy_facade[test_stream_is_available1]
unit_tests.sources.streams.concurrent.test_adapters ‑ test_availability_strategy_facade[test_stream_is_available_using_singleton]
unit_tests.sources.streams.concurrent.test_adapters.StreamFacadeTest ‑ test_check_availability_is_delegated_to_wrapped_stream
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_check_availability

♻️ This comment has been updated with latest results.


coderabbitai bot commented Jul 31, 2025

Walkthrough

This change removes all code related to availability strategies from the concurrent stream framework, including imports, class definitions, method signatures, and tests. Constructors and method signatures are updated to eliminate availability strategy parameters and logic. Related test cases and wrappers are also deleted to reflect the removal of this feature.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| Declarative Source Availability Strategy Removal: `airbyte_cdk/sources/declarative/concurrent_declarative_source.py` | Removed all usage and import of AlwaysAvailableAvailabilityStrategy from stream instantiation. |
| File-Based Availability Strategy Interface Cleanup: `airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py`, `airbyte_cdk/sources/file_based/availability_strategy/__init__.py` | Removed AbstractFileBasedAvailabilityStrategyWrapper class and exports; updated method signatures to make source optional. |
| File-Based Stream Decorator Removal: `airbyte_cdk/sources/file_based/stream/abstract_file_based_stream.py` | Removed @deprecated decorator from the availability_strategy property. |
| Concurrent File-Based Stream Adapter Cleanup: `airbyte_cdk/sources/file_based/stream/concurrent/adapters.py` | Removed import and usage of AbstractFileBasedAvailabilityStrategyWrapper in stream facade creation. |
| Availability Strategy Comment: `airbyte_cdk/sources/streams/availability_strategy.py` | Added a single comment before the AvailabilityStrategy class. |
| Concurrent Abstract Stream Interface Simplification: `airbyte_cdk/sources/streams/concurrent/abstract_stream.py` | Removed import and abstract method for availability checking. |
| Concurrent Stream Adapter Refactor: `airbyte_cdk/sources/streams/concurrent/adapters.py` | Removed all availability strategy references, the check_availability method, and the deprecated AvailabilityStrategyFacade class. |
| Default Stream Refactor: `airbyte_cdk/sources/streams/concurrent/default_stream.py` | Removed availability strategy parameter from the constructor, related imports, instance variable, and the check_availability method. |
| Concurrent Stream Scenario Test Cleanup: `unit_tests/sources/streams/concurrent/scenarios/thread_based_concurrent_stream_scenarios.py` | Removed all references to AlwaysAvailableAvailabilityStrategy from test scenarios. |
| Concurrent Stream Adapter Test Cleanup: `unit_tests/sources/streams/concurrent/test_adapters.py` | Removed all tests and imports related to AvailabilityStrategyFacade and availability delegation. |
| Default Stream Test Cleanup: `unit_tests/sources/streams/concurrent/test_default_stream.py` | Removed all usage of the availability strategy mock and the test for availability checking. |
| Removal of Entire Availability Strategy Module: `airbyte_cdk/sources/streams/concurrent/availability_strategy.py` | Deleted the entire module defining availability strategy interfaces, classes, and constants. |

Sequence Diagram(s)

sequenceDiagram
    participant TestSuite
    participant DefaultStream
    participant AvailabilityStrategy

    %% Previous flow (now removed)
    TestSuite->>DefaultStream: create with availability_strategy
    TestSuite->>DefaultStream: call check_availability()
    DefaultStream->>AvailabilityStrategy: check_availability()
    AvailabilityStrategy-->>DefaultStream: returns availability
    DefaultStream-->>TestSuite: returns availability

    %% New flow (simplified)
    TestSuite->>DefaultStream: create (no availability_strategy)
    %% No check_availability method exists

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • darynaishchenko

Would you like to consider adding a note in the changelog or documentation about the removal of availability strategy support from concurrent streams, just to help future maintainers? Wdyt?

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2d1e2f4 and 0f36dc5.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/concurrent_declarative_source.py (0 hunks)
💤 Files with no reviewable changes (1)
  • airbyte_cdk/sources/declarative/concurrent_declarative_source.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-google-drive
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-shopify
  • GitHub Check: Analyze (python)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte_cdk/sources/streams/availability_strategy.py (1)

17-18: Replace vague “FIXME” with actionable context?

The bare # FIXME this doesn’t communicate what needs fixing, by when, or why the line/class is still alive in a PR that otherwise removes availability-strategy usage. Could we either (a) upgrade the comment to a descriptive TODO: explaining the next step, or (b) delete the file entirely if it’s now unused—wdyt?

-# FIXME this
+# TODO(airbyte-team): Evaluate whether AvailabilityStrategy is still required.
+# If not referenced anywhere, schedule removal in the next major release.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4cbaaf and 76ac6f7.

📒 Files selected for processing (12)
  • airbyte_cdk/sources/declarative/concurrent_declarative_source.py (0 hunks)
  • airbyte_cdk/sources/file_based/availability_strategy/__init__.py (1 hunks)
  • airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py (1 hunks)
  • airbyte_cdk/sources/file_based/stream/abstract_file_based_stream.py (0 hunks)
  • airbyte_cdk/sources/file_based/stream/concurrent/adapters.py (0 hunks)
  • airbyte_cdk/sources/streams/availability_strategy.py (1 hunks)
  • airbyte_cdk/sources/streams/concurrent/abstract_stream.py (0 hunks)
  • airbyte_cdk/sources/streams/concurrent/adapters.py (0 hunks)
  • airbyte_cdk/sources/streams/concurrent/default_stream.py (0 hunks)
  • unit_tests/sources/streams/concurrent/scenarios/thread_based_concurrent_stream_scenarios.py (0 hunks)
  • unit_tests/sources/streams/concurrent/test_adapters.py (0 hunks)
  • unit_tests/sources/streams/concurrent/test_default_stream.py (0 hunks)
💤 Files with no reviewable changes (9)
  • airbyte_cdk/sources/file_based/stream/abstract_file_based_stream.py
  • airbyte_cdk/sources/declarative/concurrent_declarative_source.py
  • unit_tests/sources/streams/concurrent/test_default_stream.py
  • airbyte_cdk/sources/streams/concurrent/abstract_stream.py
  • airbyte_cdk/sources/file_based/stream/concurrent/adapters.py
  • airbyte_cdk/sources/streams/concurrent/default_stream.py
  • unit_tests/sources/streams/concurrent/scenarios/thread_based_concurrent_stream_scenarios.py
  • airbyte_cdk/sources/streams/concurrent/adapters.py
  • unit_tests/sources/streams/concurrent/test_adapters.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: the files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from ...
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.

Applied to files:

  • airbyte_cdk/sources/file_based/availability_strategy/__init__.py
  • airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py
📚 Learning: when code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repositor...
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.

Applied to files:

  • airbyte_cdk/sources/file_based/availability_strategy/__init__.py
  • airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py
📚 Learning: in the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-g...
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.

Applied to files:

  • airbyte_cdk/sources/file_based/availability_strategy/__init__.py
📚 Learning: when modifying the `yamldeclarativesource` class in `airbyte_cdk/sources/declarative/yaml_declarativ...
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.

Applied to files:

  • airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py
🧬 Code Graph Analysis (1)
airbyte_cdk/sources/file_based/availability_strategy/__init__.py (1)
airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py (1)
  • AbstractFileBasedAvailabilityStrategy (19-47)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Check: source-shopify
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (2)
airbyte_cdk/sources/file_based/availability_strategy/__init__.py (1)

1-7: No dangling references to AbstractFileBasedAvailabilityStrategyWrapper

I ran a global search for AbstractFileBasedAvailabilityStrategyWrapper and didn’t find any remaining occurrences—imports and exports are in sync with the cleanup. Nice work! wdyt?

airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py (1)

25-25: check_availability overrides now align with the new signature

I verified that all existing overrides of check_availability in both airbyte_cdk/sources/streams/... and airbyte_cdk/sources/file_based/availability_strategy/... accept the new source: Optional[Source] = None parameter, so this change is fully backward-compatible and shouldn’t break any implementations.

• AbstractFileBasedAvailabilityStrategy.check_availability (line 25) matches.
• DefaultFileBasedAvailabilityStrategy.check_availability uses _ for the source arg as expected.

One small follow-up—now that every override’s signature matches the base class, shall we remove the # type: ignore[override] comments? wdyt?
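As a rough illustration of why giving `source` a default of `None` keeps existing callers working, here is a minimal sketch. These are toy classes, not the CDK's: `Source`, `Stream`, `BaseAvailabilityStrategy` and `AlwaysAvailable` are hypothetical stand-ins, and the `(bool, Optional[str])` return shape is an assumption made for the example.

```python
import logging
from abc import ABC, abstractmethod
from typing import Optional, Tuple


class Source:
    """Hypothetical placeholder for a source object."""


class Stream:
    """Hypothetical placeholder for a stream object."""

    def __init__(self, name: str) -> None:
        self.name = name


class BaseAvailabilityStrategy(ABC):
    @abstractmethod
    def check_availability(
        self,
        stream: Stream,
        logger: logging.Logger,
        source: Optional[Source] = None,  # optional: callers may omit it
    ) -> Tuple[bool, Optional[str]]:
        """Return (is_available, reason_if_unavailable)."""


class AlwaysAvailable(BaseAvailabilityStrategy):
    # Same parameter names and defaults as the base class, so the override
    # matches the base signature for type checking.
    def check_availability(
        self,
        stream: Stream,
        logger: logging.Logger,
        source: Optional[Source] = None,
    ) -> Tuple[bool, Optional[str]]:
        logger.debug("stream %s assumed available", stream.name)
        return True, None
```

Callers that still pass a source and callers that no longer have one can both use the same method, for example `AlwaysAvailable().check_availability(Stream("users"), logging.getLogger("demo"))`.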


github-actions bot commented Jul 31, 2025

PyTest Results (Full)

3 693 tests   - 5   3 682 ✅  - 5   11m 50s ⏱️ +17s
    1 suites ±0      11 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 0f36dc5. ± Comparison against base commit 209cb22.

This pull request removes 5 tests.
unit_tests.sources.streams.concurrent.test_adapters ‑ test_availability_strategy_facade[test_stream_is_available0]
unit_tests.sources.streams.concurrent.test_adapters ‑ test_availability_strategy_facade[test_stream_is_available1]
unit_tests.sources.streams.concurrent.test_adapters ‑ test_availability_strategy_facade[test_stream_is_available_using_singleton]
unit_tests.sources.streams.concurrent.test_adapters.StreamFacadeTest ‑ test_check_availability_is_delegated_to_wrapped_stream
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_check_availability

♻️ This comment has been updated with latest results.

@maxi297 maxi297 requested a review from tolik0 July 31, 2025 14:26
@maxi297 maxi297 changed the title from "remove" to "chore: remove unused availability strategy stuff" Jul 31, 2025

@brianjlai brianjlai left a comment

🪓


maxi297 commented Aug 5, 2025

Will be merged as part of #686

@maxi297 maxi297 closed this Aug 5, 2025