-
Notifications
You must be signed in to change notification settings - Fork 30
chore: remove unused availability strategy stuff #685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: remove unused availability strategy stuff #685
Conversation
/autofix
|
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@maxi297/remove-availability-strategy-except-for-filebased#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch maxi297/remove-availability-strategy-except-for-filebased Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
PyTest Results (Fast)3 690 tests - 5 3 679 ✅ - 5 6m 31s ⏱️ -7s Results for commit 0f36dc5. ± Comparison against base commit 209cb22. This pull request removes 5 tests.
♻️ This comment has been updated with latest results. |
📝 WalkthroughWalkthroughThis change removes all code related to availability strategies from the concurrent stream framework, including imports, class definitions, method signatures, and tests. Constructors and method signatures are updated to eliminate availability strategy parameters and logic. Related test cases and wrappers are also deleted to reflect the removal of this feature. Changes
Sequence Diagram(s)sequenceDiagram
participant TestSuite
participant DefaultStream
participant AvailabilityStrategy
%% Previous flow (now removed)
TestSuite->>DefaultStream: create with availability_strategy
TestSuite->>DefaultStream: call check_availability()
DefaultStream->>AvailabilityStrategy: check_availability()
AvailabilityStrategy-->>DefaultStream: returns availability
DefaultStream-->>TestSuite: returns availability
%% New flow (simplified)
TestSuite->>DefaultStream: create (no availability_strategy)
%% No check_availability method exists
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
Would you like to consider adding a note in the changelog or documentation about the removal of availability strategy support from concurrent streams, just to help future maintainers? Wdyt? Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
💤 Files with no reviewable changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte_cdk/sources/streams/availability_strategy.py (1)
17-18
: Replace vague “FIXME” with actionable context?The bare
# FIXME this
doesn’t communicate what needs fixing, by when, or why the line/class is still alive in a PR that otherwise removes availability-strategy usage. Could we either (a) upgrade the comment to a descriptiveTODO:
explaining the next step, or (b) delete the file entirely if it’s now unused—wdyt?-# FIXME this +# TODO(airbyte-team): Evaluate whether AvailabilityStrategy is still required. +# If not referenced anywhere, schedule removal in the next major release.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py
(0 hunks)airbyte_cdk/sources/file_based/availability_strategy/__init__.py
(1 hunks)airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py
(1 hunks)airbyte_cdk/sources/file_based/stream/abstract_file_based_stream.py
(0 hunks)airbyte_cdk/sources/file_based/stream/concurrent/adapters.py
(0 hunks)airbyte_cdk/sources/streams/availability_strategy.py
(1 hunks)airbyte_cdk/sources/streams/concurrent/abstract_stream.py
(0 hunks)airbyte_cdk/sources/streams/concurrent/adapters.py
(0 hunks)airbyte_cdk/sources/streams/concurrent/default_stream.py
(0 hunks)unit_tests/sources/streams/concurrent/scenarios/thread_based_concurrent_stream_scenarios.py
(0 hunks)unit_tests/sources/streams/concurrent/test_adapters.py
(0 hunks)unit_tests/sources/streams/concurrent/test_default_stream.py
(0 hunks)
💤 Files with no reviewable changes (9)
- airbyte_cdk/sources/file_based/stream/abstract_file_based_stream.py
- airbyte_cdk/sources/declarative/concurrent_declarative_source.py
- unit_tests/sources/streams/concurrent/test_default_stream.py
- airbyte_cdk/sources/streams/concurrent/abstract_stream.py
- airbyte_cdk/sources/file_based/stream/concurrent/adapters.py
- airbyte_cdk/sources/streams/concurrent/default_stream.py
- unit_tests/sources/streams/concurrent/scenarios/thread_based_concurrent_stream_scenarios.py
- airbyte_cdk/sources/streams/concurrent/adapters.py
- unit_tests/sources/streams/concurrent/test_adapters.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: the files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from ...
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Applied to files:
airbyte_cdk/sources/file_based/availability_strategy/__init__.py
airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py
📚 Learning: when code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repositor...
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
Applied to files:
airbyte_cdk/sources/file_based/availability_strategy/__init__.py
airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py
📚 Learning: in the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-g...
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.
Applied to files:
airbyte_cdk/sources/file_based/availability_strategy/__init__.py
📚 Learning: when modifying the `yamldeclarativesource` class in `airbyte_cdk/sources/declarative/yaml_declarativ...
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Applied to files:
airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py
🧬 Code Graph Analysis (1)
airbyte_cdk/sources/file_based/availability_strategy/__init__.py (1)
airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py (1)
AbstractFileBasedAvailabilityStrategy
(19-47)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Check: source-shopify
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (2)
airbyte_cdk/sources/file_based/availability_strategy/__init__.py (1)
1-7
: No dangling references to AbstractFileBasedAvailabilityStrategyWrapperI ran a global search for
AbstractFileBasedAvailabilityStrategyWrapper
and didn’t find any remaining occurrences—imports and exports are in sync with the cleanup. Nice work! wdyt?airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py (1)
25-25
: check_availability overrides now align with the new signatureI verified that all existing overrides of
check_availability
in bothairbyte_cdk/sources/streams/...
andairbyte_cdk/sources/file_based/availability_strategy/...
accept the new
source: Optional[Source] = None
parameter, so this change is fully backward-compatible and shouldn’t break any implementations.• AbstractFileBasedAvailabilityStrategy.check_availability (line 25) matches.
• DefaultFileBasedAvailabilityStrategy.check_availability uses_
for the source arg as expected.One small follow-up—now that every override’s signature matches the base class, shall we remove the
# type: ignore[override]
comments? wdyt?
PyTest Results (Full)3 693 tests - 5 3 682 ✅ - 5 11m 50s ⏱️ +17s Results for commit 0f36dc5. ± Comparison against base commit 209cb22. This pull request removes 5 tests.
♻️ This comment has been updated with latest results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🪓
Will be merged as part of #686 |
What
We want to start migrating some streams from DeclarativeStream to DefaultStream. However, the check operation has not been implemented for DefaultStream. It seems like check and availability strategies are intertwined i.e. both the declarative and file-based stuff calls those. I wanted to clean this before actually implementing the check for DefaultStream to make sure we weren't migrating stuff that wasn't necessary.
History Lesson
Availability strategies were initially added to be called during
READ
commands in order for on stream not to break all the other streams in the sync so we would execute the availability strategy (i.e. most of the time trying to fetch the first records) and if it wasn't working, we would simply skip this stream. This caused problematic (we needed to fetch some records that we would later fetch again during the actual sync) and could be solved by improving our error handling instead. Hence, we started deprecating this. However, it's been deprecated a couple of months after we started the concurrent framework so this was implemented partially there as well (see [this](Stream.check_availability method)). Once the cleaning started on the legacy CDK, the concurrent part was never cleaned up. There was also a usage that has been added in the File Based CDK but this one seems legit so it seems we need to keep it.What it means for this PR
HttpAvailabilityStrategy
as it is the default strategy for HttpStream. I assume this will be moved to a legacy package once we have the AbstractStream version of it and we are ready to do the breaking changeSummary by CodeRabbit
Refactor
Tests
Chores