Skip to content

Conversation

darynaishchenko
Copy link
Contributor

@darynaishchenko darynaishchenko commented Jul 24, 2025

based on airbytehq/airbyte#43322

Summary by CodeRabbit

  • New Features

    • Client-side incremental pre-filtering for records and enhanced record filtering behavior.
    • Conditional lazy retrieval for certain streams to avoid work on empty/unchanged streams.
    • Declarative query-property resolution with configurable property chunking.
  • Refactor

    • Unified class-based component construction with dependency injection and shared construction flags.
    • Default component factory remapped to support DI; legacy factory helpers removed for clearer extensibility.

@darynaishchenko darynaishchenko self-assigned this Jul 24, 2025
@darynaishchenko darynaishchenko requested a review from tolik0 July 24, 2025 12:32
Copy link

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@daryna/low-code/refactor-model-to-component-factory#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch daryna/low-code/refactor-model-to-component-factory

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe <command> - Runs any poe command in the CDK environment

📝 Edit this welcome message.

@github-actions github-actions bot added the chore label Jul 24, 2025
@darynaishchenko
Copy link
Contributor Author

darynaishchenko commented Jul 24, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

Copy link
Contributor

coderabbitai bot commented Jul 24, 2025

📝 Walkthrough

Walkthrough

Introduces a ComponentConstructor pattern with AdditionalFlags for DI, migrates default declarative components (RecordFilter, QueryProperties, SimpleRetriever) to that pattern, updates ModelToComponentFactory to build ComponentConstructor-based defaults, and adds a ClientSideIncrementalRecordFilterDecorator for cursor-based pre-filtering.

Changes

Cohort / File(s) Summary of Changes
Component constructor core
airbyte_cdk/sources/declarative/parsers/component_constructor.py
Added AdditionalFlags dataclass and generic ComponentConstructor[M] with resolve_dependencies and build class methods to enable DI-style construction.
Factory wiring
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
Added _flags: AdditionalFlags; mapping values now may be ComponentConstructor classes or legacy callables; _create_component_from_model dispatches to ComponentConstructor.build(...) for component classes and retains legacy callable path; removed legacy create_* factory methods.
Record filtering
airbyte_cdk/sources/declarative/extractors/record_filter.py
RecordFilter now inherits ComponentConstructor[RecordFilterModel]; added resolve_dependencies and instance filter_records; added ClientSideIncrementalRecordFilterDecorator that pre-filters via a Cursor and delegates to base filter when a condition exists.
Query properties
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py
QueryProperties now inherits ComponentConstructor[QueryPropertiesModel]; added resolve_dependencies to resolve property_list, property_chunking, config, and parameters via dependency constructor and flags.
Simple retriever
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py
SimpleRetriever now inherits Retriever and ComponentConstructor[SimpleRetrieverModel]; added _should_use_lazy_simple_retriever, comprehensive resolve_dependencies assembling decoder/requester/paginator/stream_slicer/cursor and lazy-retriever criteria, plus build returning LazySimpleRetriever or SimpleRetriever.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant ModelToComponentFactory
    participant ComponentConstructor
    participant SimpleRetriever
    participant RecordFilter
    participant QueryProperties

    User->>ModelToComponentFactory: request component (model, config)
    ModelToComponentFactory->>ComponentConstructor: build(model, config, dependency_constructor, additional_flags)
    ComponentConstructor->>ComponentConstructor: resolve_dependencies(...)
    ComponentConstructor->>SimpleRetriever: instantiate (or return LazySimpleRetriever)
    ComponentConstructor->>RecordFilter: instantiate
    ComponentConstructor->>QueryProperties: instantiate
    ComponentConstructor-->>ModelToComponentFactory: constructed component
    ModelToComponentFactory-->>User: return component
Loading
sequenceDiagram
    participant Retriever
    participant RecordFilterDecorator
    participant Cursor
    participant BaseRecordFilter

    Retriever->>RecordFilterDecorator: filter_records(records, state, slice, token)
    RecordFilterDecorator->>Cursor: should_be_synced(record, state)
    Cursor-->>RecordFilterDecorator: keep/drop
    alt condition present
        RecordFilterDecorator->>BaseRecordFilter: filter_records(cursor_filtered_records, ...)
        BaseRecordFilter-->>RecordFilterDecorator: filtered output
        RecordFilterDecorator-->>Retriever: final records
    else no condition
        RecordFilterDecorator-->>Retriever: cursor_filtered_records
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

airbyte-python-cdk

Suggested reviewers

  • brianjlai
  • maxi297
  • tolik0

Want a short example showing how to call ModelToComponentFactory.build() for a RecordFilter and a SimpleRetriever, wdyt?

Pre-merge checks (2 passed, 1 warning)

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.78% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "chore(low code): refactor model to component factory" accurately and concisely captures the primary change in this PR — refactoring the model-to-component construction to a component-factory pattern and wiring dependency injection — and is specific to the updated modules like model_to_component_factory and ComponentConstructor. It is short, relevant, and avoids extraneous detail, so a reviewer can quickly understand the main intent.

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.

✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch daryna/low-code/refactor-model-to-component-factory

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
airbyte_cdk/sources/declarative/parsers/component_constructor.py (2)

34-62: Consider refining the return type annotation, wdyt?

The build method logic is solid, but the return type annotation ComponentConstructor[M] might be too restrictive. Since cls(**resolved_dependencies) returns an instance of the actual subclass (not the base class), would using Self or making it more generic be more accurate?

+from typing import Self  # Add to imports if using Python 3.11+
...
-    ) -> "ComponentConstructor[M]":
+    ) -> Self:

The orchestration logic itself looks great - first resolve dependencies, then build the component.


63-73: Potential code duplication - is this intentional, wdyt?

I notice this _json_schema_type_name_to_type method is identical to the one in model_to_component_factory.py (lines 1023-1032). If this is part of moving the functionality to a more appropriate location, that's great! But if it's unintentional duplication, we might want to consolidate them.

The implementation itself is solid - handles the None case gracefully and provides the correct type mappings.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)

628-630: Remove unused type variable D

The type variable D is defined but never used in the code. Consider removing it to keep the code clean, wdyt?

 M = TypeVar("M", bound=BaseModel)
-D = TypeVar("D", bound=BaseModel)

840-852: Fix variable type annotation

The component_instance variable holds the actual component instance returned by build(), not a ComponentConstructor. Consider updating the type annotation for clarity, wdyt?

 if inspect.isclass(component) and issubclass(component, ComponentConstructor):
     # Default components flow
-    component_instance: ComponentConstructor[BaseModel] = component.build(
+    component_instance = component.build(
         model=model,
         config=config,
         dependency_constructor=self._create_component_from_model,
         additional_flags=self._flags,
         **kwargs,
     )
     return component_instance
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 51cfea5 and 2d8129b.

📒 Files selected for processing (3)
  • airbyte_cdk/sources/declarative/extractors/record_filter.py (2 hunks)
  • airbyte_cdk/sources/declarative/parsers/component_constructor.py (1 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
airbyte_cdk/sources/declarative/parsers/component_constructor.py (2)

Learnt from: aaronsteers
PR: #174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py file, the strict module name checks in _get_class_from_fully_qualified_class_name (requiring module_name to be "components" and module_name_full to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.

Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the declarative_component_schema.py file is auto-generated from declarative_component_schema.yaml and should be ignored in the recommended reviewing order.

airbyte_cdk/sources/declarative/extractors/record_filter.py (1)

Learnt from: ChristoGrab
PR: #58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the YamlDeclarativeSource class in airbyte_cdk/sources/declarative/yaml_declarative_source.py, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (5)

Learnt from: aaronsteers
PR: #174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py file, the strict module name checks in _get_class_from_fully_qualified_class_name (requiring module_name to be "components" and module_name_full to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.

Learnt from: ChristoGrab
PR: #58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the YamlDeclarativeSource class in airbyte_cdk/sources/declarative/yaml_declarative_source.py, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.

Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the declarative_component_schema.py file is auto-generated from declarative_component_schema.yaml and should be ignored in the recommended reviewing order.

Learnt from: aaronsteers
PR: #58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in airbyte_cdk/cli/source_declarative_manifest/, including _run.py, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.

Learnt from: aaronsteers
PR: #58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in airbyte_cdk/cli/source_declarative_manifest/ is being imported from another repository, avoid suggesting modifications to it during the import process.

🧬 Code Graph Analysis (2)
airbyte_cdk/sources/declarative/parsers/component_constructor.py (3)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • ValueType (1407-1411)
airbyte_cdk/sources/declarative/extractors/record_filter.py (1)
  • resolve_dependencies (34-46)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
  • _json_schema_type_name_to_type (1024-1033)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)
airbyte_cdk/sources/declarative/parsers/component_constructor.py (2)
  • ComponentConstructor (17-73)
  • build (35-61)
airbyte_cdk/sources/declarative/extractors/record_filter.py (1)
  • RecordFilter (21-68)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (17)
  • RecordFilter (1266-1277)
  • Config (136-137)
  • Config (150-151)
  • Config (164-165)
  • Config (178-179)
  • Config (196-197)
  • Config (210-211)
  • Config (224-225)
  • Config (238-239)
  • Config (252-253)
  • Config (266-267)
  • Config (280-281)
  • Config (294-295)
  • Config (310-311)
  • Config (324-325)
  • Config (338-339)
  • Config (372-373)
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py

[error] 679-4107: Mypy type checking failed with 96 errors including missing type parameters, unexpected keyword arguments, incompatible argument types, incompatible return types, and attribute errors. See mypy output for details.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Check: source-shopify
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
🔇 Additional comments (9)
airbyte_cdk/sources/declarative/parsers/component_constructor.py (2)

1-14: LGTM! Clean imports and type setup.

The imports are well-organized and the type variable M bound to BaseModel provides good type safety for the generic component constructor.


16-32: Solid foundation for the component constructor pattern.

The generic dataclass design with the template method pattern is well-executed. The resolve_dependencies method provides a safe default while establishing a clear contract for subclasses to implement their specific dependency resolution logic.

airbyte_cdk/sources/declarative/extractors/record_filter.py (3)

5-5: Clean import additions for the new component constructor pattern.

The new imports are well-organized and necessary for integrating with the ComponentConstructor framework. The alias pattern for RecordFilterModel is a nice touch to avoid potential naming conflicts.

Also applies to: 13-16


21-21: Excellent integration with the component constructor pattern.

The inheritance change from a plain dataclass to ComponentConstructor[RecordFilterModel] cleanly integrates RecordFilter into the new declarative component construction framework while maintaining strong type safety.


33-46: Perfect implementation of the dependency resolution pattern.

The resolve_dependencies method cleanly extracts the necessary parameters from the model and configuration, with sensible defaults for optional fields. The returned mapping aligns perfectly with the RecordFilter's initialization requirements, and the method signature correctly follows the ComponentConstructor contract.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)

633-638: Great documentation addition!

The docstring clearly explains the factory's purpose and the separation between default and custom component implementations.


669-677: Clean implementation of reusable configuration flags

The _flags dictionary provides a nice way to pass commonly needed configuration to component constructors without repetition.


679-681: Correct type annotation for dual constructor support

The updated type annotation properly reflects support for both ComponentConstructor subclasses and legacy callable constructors, maintaining backward compatibility.


759-759: Simplified mapping for RecordFilter

Good change to directly use the RecordFilter class instead of a factory method, aligning with the new ComponentConstructor pattern.

Copy link

github-actions bot commented Jul 24, 2025

PyTest Results (Fast)

3 746 tests  +3   3 734 ✅ +3   6m 27s ⏱️ +4s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 2fca25a. ± Comparison against base commit fd3a330.

♻️ This comment has been updated with latest results.

Copy link

github-actions bot commented Jul 24, 2025

PyTest Results (Full)

3 749 tests  +3   3 737 ✅ +3   10m 49s ⏱️ -15s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 2fca25a. ± Comparison against base commit fd3a330.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

828-833: Return type annotation issue already identified.

This matches the previous review comment about the misleading return type annotation. The TODO comment indicates this is intentionally temporary during the transition period, which seems reasonable for now, wdyt?

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

828-858: Excellent dual-pattern support logic with a small typo to fix!

The logic perfectly handles both ComponentConstructor and legacy patterns. I love how it uses the recursive dependency_constructor for building nested components and passes the additional_flags for configuration sharing.

One tiny fix needed - there's a typo on line 857: "Instance shource be" should be "Instance should be", wdyt?

-            f"Instance shource be one of ComponentConstructor or method implemented in ModelToComponentFactory"
+            f"Instance should be one of ComponentConstructor or method implemented in ModelToComponentFactory"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2d8129b and e1a4e7e.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (5)

Learnt from: aaronsteers
PR: #174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py file, the strict module name checks in _get_class_from_fully_qualified_class_name (requiring module_name to be "components" and module_name_full to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.

Learnt from: ChristoGrab
PR: #58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the YamlDeclarativeSource class in airbyte_cdk/sources/declarative/yaml_declarative_source.py, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.

Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the declarative_component_schema.py file is auto-generated from declarative_component_schema.yaml and should be ignored in the recommended reviewing order.

Learnt from: aaronsteers
PR: #58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in airbyte_cdk/cli/source_declarative_manifest/, including _run.py, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.

Learnt from: aaronsteers
PR: #58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in airbyte_cdk/cli/source_declarative_manifest/ is being imported from another repository, avoid suggesting modifications to it during the import process.

🧬 Code Graph Analysis (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)
airbyte_cdk/sources/declarative/parsers/component_constructor.py (2)
  • ComponentConstructor (17-73)
  • build (35-61)
airbyte_cdk/sources/declarative/extractors/record_filter.py (1)
  • RecordFilter (21-68)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (17)
  • RecordFilter (1266-1277)
  • Config (136-137)
  • Config (150-151)
  • Config (164-165)
  • Config (178-179)
  • Config (196-197)
  • Config (210-211)
  • Config (224-225)
  • Config (238-239)
  • Config (252-253)
  • Config (266-267)
  • Config (280-281)
  • Config (294-295)
  • Config (310-311)
  • Config (324-325)
  • Config (338-339)
  • Config (372-373)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-shopify
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Analyze (python)
🔇 Additional comments (6)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6)

21-21: LGTM! Clean addition of generic type support.

The TypeVar import and generic type variables M and D are properly defined and bound to BaseModel, which aligns perfectly with the ComponentConstructor pattern. The ComponentConstructor import is also correctly placed. Nice work on maintaining type safety throughout the refactor!

Also applies to: 628-630, 450-450


633-638: Excellent documentation update!

The updated docstring clearly explains the architectural goal of decoupling standard and custom component construction approaches. This helps future maintainers understand the design decisions, wdyt?


669-677: Smart design pattern for configuration sharing!

Creating a centralized _flags dictionary to hold reusable factory configuration is an elegant solution. This ensures consistent access to factory settings across all ComponentConstructor instances and reduces the need to pass individual parameters repeatedly. Well thought out!


679-681: Perfect type annotation for the transition period!

The Union type correctly captures that the mapping can contain either new ComponentConstructor classes or legacy callable methods. This provides excellent type safety during the gradual migration while being flexible enough to handle the mixed approach, wdyt?


759-759: Great migration to the ComponentConstructor pattern!

Changing the mapping from a factory method to the RecordFilter class directly is exactly what we want for the new ComponentConstructor pattern. Since RecordFilter now inherits from ComponentConstructor and implements resolve_dependencies, this change makes perfect sense and promotes consistency, wdyt?


632-858: Outstanding refactor execution!

This is a textbook example of how to introduce a significant architectural change. The approach:

✅ Maintains backward compatibility during transition
✅ Provides clear type safety with Union types
✅ Centralizes configuration through the _flags pattern
✅ Uses proper dependency injection with recursive dependency_constructor
✅ Includes helpful documentation and TODO comments

The gradual migration starting with RecordFilter is smart - it allows testing the pattern before rolling it out broadly. Really well done, wdyt?

@darynaishchenko
Copy link
Contributor Author

darynaishchenko commented Jul 25, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1)

35-63: Nice implementation of the resolve_dependencies pattern!

The method cleanly handles both direct list and constructed property_list cases. One question though - I noticed that when constructing property_list via dependency_constructor (lines 47-49), the additional_flags parameter isn't passed along, unlike some other dependency constructor calls in the codebase. Is this intentional because property_list components don't need flags, or should we include them for consistency? wdyt?

airbyte_cdk/sources/declarative/parsers/component_constructor.py (1)

18-43: Consider using dataclass fields instead of manual init

The AdditionalFlags class is decorated with @dataclass but defines a manual __init__ method. This defeats some benefits of dataclasses like automatic __repr__, __eq__, and field declarations. Would it make sense to use dataclass fields directly? wdyt?

 @dataclass
 class AdditionalFlags:
-    def __init__(
-        self,
-        emit_connector_builder_messages: bool,
-        disable_retries: bool,
-        message_repository: MessageRepository,
-        connector_state_manager: ConnectorStateManager,
-        limit_pages_fetched_per_slice: Optional[int],
-        limit_slices_fetched: Optional[int],
-    ):
-        self.emit_connector_builder_messages = emit_connector_builder_messages
-        self.disable_retries = disable_retries
-        self.message_repository = message_repository
-        self.connector_state_manager = connector_state_manager
-        self.limit_pages_fetched_per_slice = limit_pages_fetched_per_slice
-        self.limit_slices_fetched = limit_slices_fetched
+    emit_connector_builder_messages: bool
+    disable_retries: bool
+    message_repository: MessageRepository
+    connector_state_manager: ConnectorStateManager
+    limit_pages_fetched_per_slice: Optional[int]
+    limit_slices_fetched: Optional[int]
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)

631-632: Question about the D TypeVar

I notice that D is declared but doesn't seem to be used anywhere in this file. The M TypeVar makes sense for the generic component pattern, but could we remove D if it's not needed here, wdyt? This would keep the code cleaner and avoid confusion.


832-862: Solid implementation of the dual construction pattern!

The logic flow correctly handles both ComponentConstructor subclasses and legacy callable constructors. The recursive dependency resolution is well implemented by passing self._create_component_from_model as the dependency_constructor.

One small suggestion: could we make the error message in the final ValueError a bit more specific about what types are expected, wdyt? Something like:

-        raise ValueError(
-            f"Unexpected component mapping type for {model.__class__}. "
-            f"Instance should be one of ComponentConstructor or method implemented in ModelToComponentFactory"
-        )
+        raise ValueError(
+            f"Unexpected component mapping type for {model.__class__}: {type(component)}. "
+            f"Expected ComponentConstructor subclass or callable method, got {component}"
+        )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2b569f5 and 3cdd426.

📒 Files selected for processing (5)
  • airbyte_cdk/sources/declarative/extractors/record_filter.py (2 hunks)
  • airbyte_cdk/sources/declarative/parsers/component_constructor.py (1 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6 hunks)
  • airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (3 hunks)
  • airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/sources/declarative/extractors/record_filter.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (5)
airbyte_cdk/sources/declarative/parsers/component_constructor.py (3)
  • AdditionalFlags (19-42)
  • ComponentConstructor (46-102)
  • build (64-90)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1)
  • QueryProperties (21-84)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (19)
  • QueryProperties (2770-2787)
  • RecordFilter (1266-1277)
  • SimpleRetriever (2806-2860)
  • Config (136-137)
  • Config (150-151)
  • Config (164-165)
  • Config (178-179)
  • Config (196-197)
  • Config (210-211)
  • Config (224-225)
  • Config (238-239)
  • Config (252-253)
  • Config (266-267)
  • Config (280-281)
  • Config (294-295)
  • Config (310-311)
  • Config (324-325)
  • Config (338-339)
  • Config (372-373)
airbyte_cdk/sources/declarative/extractors/record_filter.py (1)
  • RecordFilter (24-71)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (2)
  • SimpleRetriever (93-962)
  • build (409-432)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Pytest (Fast)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Analyze (python)
🔇 Additional comments (10)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1)

4-12: LGTM! Clean import organization

The imports are well-organized, properly distinguishing between model schemas and construction utilities.

airbyte_cdk/sources/declarative/parsers/component_constructor.py (2)

45-91: Clean template method pattern implementation!

The ComponentConstructor base class provides a nice abstraction for declarative component construction. The separation between resolve_dependencies and build is well thought out. One minor observation - the return type annotation on line 71 shows ComponentConstructor[M], but it actually returns an instance of the concrete subclass (e.g., QueryProperties, SimpleRetriever). This is technically correct due to inheritance, but might be worth noting in the docstring for clarity? wdyt?


92-103: Useful type conversion helper

The _json_schema_type_name_to_type method provides a clean mapping from JSON schema types to Python types. Good defensive programming with the None check.

airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (3)

93-93: Good adoption of the ComponentConstructor pattern

The SimpleRetriever now properly implements the ComponentConstructor interface, enabling standardized dependency resolution.


206-224: Clean closure implementation for URL retrieval

The _get_url closure elegantly handles the deferred URL resolution needed for lazy retriever scenarios.


272-328: Thorough query properties resolution with good error handling

The method carefully handles all possible sources of query properties configuration and provides clear error messages for conflicting definitions. The backward compatibility consideration (line 304) is appreciated.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)

450-453: LGTM on the new imports!

These imports are essential for the ComponentConstructor pattern being introduced. The imports are clean and properly organized within the existing import structure.


672-680: Nice centralization of flags!

The _flags initialization looks solid and follows the ComponentConstructor pattern well. All the necessary configuration flags are included and will be properly passed to component constructors for dependency resolution.


683-685: Great refactor to the ComponentConstructor pattern!

The type hint update to support both ComponentConstructor subclasses and callable factory methods is well thought out. The specific entries for QueryProperties, RecordFilter, and SimpleRetriever correctly point to their respective classes that now inherit from ComponentConstructor. This maintains backward compatibility while enabling the new pattern.

Also applies to: 762-763, 770-770


636-640: Excellent documentation of the architectural change!

The updated class docstring clearly explains the new dual construction pattern and the reasoning behind decoupling standard and custom component construction. This will help other developers understand the architectural direction.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (2)

135-135: Past review concern still applies regarding class-level state modification

This issue was previously identified - the class-level _should_use_lazy_simple_retriever attribute creates shared state problems across instances. The validation method modifies this class attribute, which could lead to race conditions. wdyt about making this an instance variable or passing it through the resolution process instead?


408-432: Critical issue from past review still exists - LazySimpleRetriever created but not returned

The build method has the same issue previously identified: when _should_use_lazy_simple_retriever is True, a LazySimpleRetriever is created on line 431 but discarded, and a regular SimpleRetriever is always returned instead. Shouldn't we return the lazy variant when the conditions are met? wdyt?

         if cls._should_use_lazy_simple_retriever:
-            LazySimpleRetriever(**resolved_dependencies)
-        return SimpleRetriever(**resolved_dependencies)
+            return LazySimpleRetriever(**resolved_dependencies)
+        return SimpleRetriever(**resolved_dependencies)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3cdd426 and 06e05b9.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-shopify
  • GitHub Check: Pytest (Fast)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Analyze (python)
🔇 Additional comments (5)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (5)

13-13: LGTM on the imports and class declaration changes!

The new imports and the updated class declaration to implement ComponentConstructor[SimpleRetrieverModel] properly support the component factory pattern refactor. The imports are well-organized and necessary for the new functionality.

Also applies to: 22-22, 26-27, 31-31, 33-67, 78-78, 80-80, 82-82, 93-93


178-206: Method signature and structure look well-designed

The resolve_dependencies method signature properly follows the ComponentConstructor pattern with comprehensive parameter support. The early validation for the required name parameter is good defensive programming. The method structure with helper functions promotes readability. Nice work!


206-256: Helper functions are well-crafted

The helper functions within resolve_dependencies are nicely designed:

  • The _get_url() closure provides lazy URL resolution
  • query_properties_in_request_parameters() handles detection logic cleanly
  • _get_log_formatter() appropriately respects the slice limiting flags

The single-responsibility principle is well-applied here. Great approach! wdyt?


272-328: Query properties resolution logic is robust

The query properties resolution handles the complexity of multiple potential sources very well. The conflict detection between fetch_properties_from_endpoint and request_parameters with clear error messages is excellent defensive programming. The deprecation note for the legacy approach is helpful for maintainers. Solid implementation! wdyt?


329-406: Component construction and dependency resolution is comprehensive

The component construction logic handles the various edge cases well:

  • Cursor assignment from stream_slicer is appropriate
  • The DatetimeBasedCursor special handling for request_options_provider makes sense
  • Conditional application of StreamSlicerTestReadDecorator respects the flags properly
  • The lazy URL evaluation in paginator construction is clever
  • Early return for lazy retriever case maintains the optimization

The final dependency dictionary construction is thorough and complete. Great work on this complex orchestration! wdyt?

@darynaishchenko
Copy link
Contributor Author

darynaishchenko commented Jul 25, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1)

422-429: Duplicate comment from previous review - lazy retriever creation issue was addressed.

This issue where LazySimpleRetriever was created but not returned has already been resolved in the current implementation. The build method now correctly returns the LazySimpleRetriever when conditions are met.

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1)

176-253: The closure pattern for URL retrieval is clever - wdyt about adding documentation?

The _get_url() closure and helper functions are well-designed for lazy evaluation scenarios. The logic handles both direct URL access and fallback to requester methods appropriately. Would you consider adding a bit more documentation to the _get_log_formatter function to clarify when the log formatter is applied vs when it returns None?

        def _get_log_formatter(
            log_formatter: Callable[[Response], Any] | None,
            name: str,
            additional_flags: AdditionalFlags,
        ) -> Callable[[Response], Any] | None:
+           """
+           Returns a log formatter for test slice limiting scenarios.
+           Only applies formatting when slices are being limited for testing.
+           """
            if additional_flags.should_limit_slices_fetched:
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7d06719 and 218dabb.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6 hunks)
  • airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (5 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (5)
airbyte_cdk/sources/declarative/parsers/component_constructor.py (3)
  • AdditionalFlags (19-42)
  • ComponentConstructor (46-102)
  • build (64-90)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1)
  • QueryProperties (21-84)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (19)
  • QueryProperties (2770-2787)
  • RecordFilter (1266-1277)
  • SimpleRetriever (2806-2860)
  • Config (136-137)
  • Config (150-151)
  • Config (164-165)
  • Config (178-179)
  • Config (196-197)
  • Config (210-211)
  • Config (224-225)
  • Config (238-239)
  • Config (252-253)
  • Config (266-267)
  • Config (280-281)
  • Config (294-295)
  • Config (310-311)
  • Config (324-325)
  • Config (338-339)
  • Config (372-373)
airbyte_cdk/sources/declarative/extractors/record_filter.py (1)
  • RecordFilter (24-71)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (2)
  • SimpleRetriever (93-959)
  • build (401-429)
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py

[error] 423-423: mypy: Argument "name" to "_should_use_lazy_simple_retriever" of "SimpleRetriever" has incompatible type "Any | None"; expected "str" [arg-type]

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Analyze (python)
🔇 Additional comments (10)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (5)

13-87: Import additions look well-structured for the new component factory pattern.

The new imports support the refactoring to use ComponentConstructor and related models. The imports are logically organized and all appear necessary for the new functionality. Good work on keeping the imports clean!


93-93: Clean inheritance from ComponentConstructor - nice architectural improvement!

The class now properly inherits from ComponentConstructor[SimpleRetrieverModel], which aligns perfectly with the component factory pattern described in the PR objectives. This sets up the foundation for standardized dependency resolution across declarative components.


136-174: The lazy retriever validation logic is comprehensive and well-structured.

The conditions for applying the lazy retriever are thorough - checking partition router type, stream state, lazy read pointers, incremental sync constraints, and decoder type. The error messages are clear and informative. This addresses the previous concerns about class-level state modification by returning a boolean instead of modifying class state.


255-331: Excellent dependency resolution for decoder, record selector, and query properties handling.

The logic for handling query properties from multiple locations (request_parameters, fetch_properties_from_endpoint, query_properties) is robust with good error handling for conflicting definitions. The fallback to JsonDecoder when no decoder is specified is sensible. The validation preventing multiple QueryProperties definitions is a great safeguard.


332-399: Smart handling of cursor and request options provider relationships.

The logic correctly identifies when to use the stream_slicer as the request_options_provider for custom DatetimeBasedCursor implementations, while defaulting appropriately for standard cases. The slice limiting logic for tests is well-integrated. The early return for lazy retrievers with base dependencies is a clean optimization.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (5)

21-21: LGTM! New imports and type variables are well-structured.

The addition of TypeVar import and the new ComponentConstructor/AdditionalFlags imports align perfectly with the refactoring goals. The type variables M and D are appropriately bound to BaseModel and follow good naming conventions, wdyt?

Also applies to: 450-453, 631-632


636-641: Nice documentation update!

The enhanced docstring clearly explains the separation between default and custom component construction patterns. This helps developers understand the architectural changes, wdyt?


672-680: Excellent abstraction with AdditionalFlags!

Creating a reusable container for common configuration flags is a clean design pattern. The selected flags (emit_connector_builder_messages, disable_retries, etc.) are exactly what components would need during construction. This should make dependency injection much cleaner, wdyt?


683-685: Perfect transition to the dual constructor pattern!

The updated type annotation correctly captures that the mapping now supports both ComponentConstructor subclasses and legacy factory methods. The specific changes for QueryPropertiesModel, RecordFilterModel, and SimpleRetrieverModel show a clean migration to the new pattern - mapping directly to classes instead of factory methods. This maintains backward compatibility while enabling the new approach, wdyt?

Also applies to: 762-763, 770-770


832-862: Excellent implementation of the dual construction pattern!

This refactored method elegantly handles both the new ComponentConstructor pattern and legacy factory methods. The logic flow is very clean:

  1. Type inspection to detect ComponentConstructor subclasses
  2. Calling .build() with proper dependency injection for new components
  3. Fallback to legacy factory methods
  4. Comprehensive error handling

The recursive dependency construction via self._create_component_from_model is particularly well-designed. The past review comment about the return type annotation was correctly addressed, wdyt?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1)

178-400: Comprehensive dependency resolution, but wondering about decomposition?

This method handles the dependency orchestration really well - I can see how it resolves all the complex interdependencies between components. The helper functions like _get_url and query_properties_in_request_parameters help organize the logic nicely.

The query properties resolution logic is particularly thorough, handling conflicts between different definition sources with clear error messages. The early return pattern for lazy retrievers is also clean.

That said, at ~220 lines, this method is doing quite a lot. Would you consider breaking it down into smaller, more focused methods? Something like _resolve_query_properties, _resolve_core_components, _resolve_pagination_components? It might make testing and maintenance easier. wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 218dabb and b212ed9.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-shopify
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Analyze (python)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (3)

13-93: LGTM! Clean imports and class signature update.

The import additions support the new component factory pattern nicely, and the class signature change to inherit from ComponentConstructor[SimpleRetrieverModel] aligns perfectly with the refactor objectives. The imports are well-organized and necessary for the new functionality.


136-177: Nice validation logic for lazy retriever conditions!

The method does a great job ensuring LazySimpleRetriever is only used when all constraints are met. The validation for incremental sync type, decoder type, and parent stream configurations looks comprehensive. I appreciate how it raises clear error messages when constraints are violated - this will help developers understand the limitations quickly. wdyt?


402-432: Clean build method implementation!

The build method follows the ComponentConstructor pattern perfectly - delegating to resolve_dependencies and then making the right decision about whether to return a LazySimpleRetriever or regular SimpleRetriever. The flow is clear and the implementation looks solid. wdyt?

@maxi297 maxi297 self-requested a review July 28, 2025 14:10
@darynaishchenko
Copy link
Contributor Author

darynaishchenko commented Aug 5, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change makes sense to me. Thanks for cleaning stuff up! Just a couple of questions/nits but I'll approve anyway

**kwargs,
)
return component_instance
if inspect.ismethod(component) or inspect.isfunction(component):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: elif here to clarify that both if clauses are mutually exclusive

**kwargs: Any,
) -> Mapping[str, Any]:
return {
"condition": model.condition or "",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to better understand the pattern: In an ideal world where this isn't a dataclass that exposes everything as public, would the logic of having the condition as InterpolatedBoolean be done in resolve_dependencies or would we still have this in the init?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for now we will still have this in __init__ just to have less thing to refactor and avoid missing something. But it's a good idea and I think we consider changing the place of interpolation here or in __init__

return cls(**resolved_dependencies)

@staticmethod
def _json_schema_type_name_to_type(value_type: Optional[ValueType]) -> Optional[Type[Any]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the usage of this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok! So in the PoC PR (or whatever we call this PR), we are using it but in this PR specifically, it is not used yet, is that it? I think I'm fine with having this here then. Thanks for the added context

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

831-861: Excellent dynamic dispatch implementation!

The _create_component_from_model refactor handles both patterns elegantly:

  • ComponentConstructor subclasses use the new build() method with dependency injection
  • Legacy callables maintain backward compatibility

The type checking with inspect.isclass and issubclass is robust, and the error messages are clear. One small suggestion - would it make sense to use elif on line 856 instead of if to clarify the mutual exclusivity, as mentioned in the past review comments?

-        if inspect.ismethod(component) or inspect.isfunction(component):
+        elif inspect.ismethod(component) or inspect.isfunction(component):

This would make the flow more explicit, wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b212ed9 and 5e71790.

📒 Files selected for processing (4)
  • airbyte_cdk/sources/declarative/extractors/record_filter.py (2 hunks)
  • airbyte_cdk/sources/declarative/parsers/component_constructor.py (1 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6 hunks)
  • airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/sources/declarative/parsers/component_constructor.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-01-14T00:20:32.310Z
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.

Applied to files:

  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
📚 Learning: 2024-12-11T16:34:46.319Z
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.

Applied to files:

  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
🧬 Code Graph Analysis (3)
airbyte_cdk/sources/declarative/extractors/record_filter.py (3)
airbyte_cdk/sources/declarative/interpolation/interpolated_boolean.py (1)
  • InterpolatedBoolean (29-66)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (17)
  • RecordFilter (1266-1277)
  • Config (136-137)
  • Config (150-151)
  • Config (164-165)
  • Config (178-179)
  • Config (196-197)
  • Config (210-211)
  • Config (224-225)
  • Config (238-239)
  • Config (252-253)
  • Config (266-267)
  • Config (280-281)
  • Config (294-295)
  • Config (310-311)
  • Config (324-325)
  • Config (338-339)
  • Config (372-373)
airbyte_cdk/sources/declarative/parsers/component_constructor.py (3)
  • AdditionalFlags (19-42)
  • ComponentConstructor (46-90)
  • resolve_dependencies (48-61)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (16)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (23)
  • JsonDecoder (864-865)
  • DatetimeBasedCursor (1801-1909)
  • HttpRequester (2542-2681)
  • IncrementingCountCursor (1779-1798)
  • QueryProperties (2772-2789)
  • SimpleRetriever (2808-2862)
  • SubstreamPartitionRouter (2971-2978)
  • Config (136-137)
  • Config (150-151)
  • Config (164-165)
  • Config (178-179)
  • Config (196-197)
  • Config (210-211)
  • Config (224-225)
  • Config (238-239)
  • Config (252-253)
  • Config (266-267)
  • Config (280-281)
  • Config (294-295)
  • Config (310-311)
  • Config (324-325)
  • Config (338-339)
  • Config (372-373)
airbyte_cdk/sources/declarative/decoders/json_decoder.py (1)
  • JsonDecoder (19-47)
airbyte_cdk/sources/declarative/incremental/datetime_based_cursor.py (2)
  • DatetimeBasedCursor (28-448)
  • get_stream_state (130-131)
airbyte_cdk/sources/declarative/incremental/resumable_full_refresh_cursor.py (2)
  • ResumableFullRefreshCursor (12-93)
  • get_stream_state (18-19)
airbyte_cdk/sources/declarative/requesters/http_requester.py (1)
  • HttpRequester (37-484)
airbyte_cdk/sources/declarative/partition_routers/substream_partition_router.py (2)
  • SubstreamPartitionRouter (80-422)
  • get_stream_state (397-418)
airbyte_cdk/sources/declarative/parsers/component_constructor.py (5)
  • AdditionalFlags (19-42)
  • ComponentConstructor (46-90)
  • resolve_dependencies (48-61)
  • should_limit_slices_fetched (37-42)
  • build (64-90)
airbyte_cdk/sources/declarative/retrievers/file_uploader/default_file_uploader.py (1)
  • DefaultFileUploader (32-97)
airbyte_cdk/sources/declarative/retrievers/retriever.py (1)
  • Retriever (13-54)
airbyte_cdk/sources/declarative/stream_slicers/stream_slicer_test_read_decorator.py (1)
  • StreamSlicerTestReadDecorator (14-28)
airbyte_cdk/sources/declarative/stream_slicers/stream_slicer.py (1)
  • StreamSlicer (15-25)
airbyte_cdk/sources/http_logger.py (1)
  • format_http_message (12-51)
airbyte_cdk/sources/declarative/manifest_declarative_source.py (1)
  • streams (300-332)
airbyte_cdk/sources/declarative/partition_routers/single_partition_router.py (2)
  • get_stream_state (59-63)
  • SinglePartitionRouter (13-63)
airbyte_cdk/sources/declarative/requesters/request_options/request_options_provider.py (1)
  • RequestOptionsProvider (13-79)
airbyte_cdk/sources/declarative/requesters/request_options/default_request_options_provider.py (1)
  • DefaultRequestOptionsProvider (15-60)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)
airbyte_cdk/sources/declarative/parsers/component_constructor.py (3)
  • AdditionalFlags (19-42)
  • ComponentConstructor (46-90)
  • build (64-90)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (19)
  • QueryProperties (2772-2789)
  • RecordFilter (1266-1277)
  • SimpleRetriever (2808-2862)
  • Config (136-137)
  • Config (150-151)
  • Config (164-165)
  • Config (178-179)
  • Config (196-197)
  • Config (210-211)
  • Config (224-225)
  • Config (238-239)
  • Config (252-253)
  • Config (266-267)
  • Config (280-281)
  • Config (294-295)
  • Config (310-311)
  • Config (324-325)
  • Config (338-339)
  • Config (372-373)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1)
  • QueryProperties (21-84)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (2)
  • SimpleRetriever (94-939)
  • build (403-431)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Pytest (Fast)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Analyze (python)
🔇 Additional comments (12)
airbyte_cdk/sources/declarative/extractors/record_filter.py (4)

8-14: LGTM! Imports align well with the refactoring pattern.

The imports properly support the component constructor pattern with AdditionalFlags and ComponentConstructor, and the RecordFilterModel alias creates clean separation between the model and implementation.


20-20: Component inheritance looks good!

The change from dataclass to ComponentConstructor[RecordFilterModel] inheritance properly implements the new factory pattern while maintaining the existing functionality.


32-45: Solid dependency resolution implementation!

The resolve_dependencies method cleanly extracts the required dependencies from the model and provides sensible defaults. The mapping structure is clear and follows the expected pattern.


70-109: Great addition of the decorator pattern!

The ClientSideIncrementalRecordFilterDecorator is well-implemented with proper cursor integration. The conditional chaining logic in filter_records (lines 102-108) nicely handles the case where both cursor filtering and condition filtering are needed. The cursor pre-filtering followed by optional condition filtering makes sense, wdyt?

airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (4)

94-94: Perfect inheritance update!

The multiple inheritance from both Retriever and ComponentConstructor[SimpleRetrieverModel] properly supports the new factory pattern while maintaining backward compatibility.


137-177: Comprehensive lazy retriever validation logic!

The _should_use_lazy_simple_retriever method has thorough validation with clear error messages. The conditions for lazy retriever usage are well-defined:

  • SubstreamPartitionRouter with lazy_read_pointer
  • No existing stream state
  • DatetimeBasedCursor only
  • JsonDecoder only
  • Single slice constraint

The validation provides helpful error messages for each constraint violation, which will make debugging easier for users, wdyt?


179-400: Excellent dependency resolution orchestration!

The resolve_dependencies method is comprehensive and well-structured. I particularly like:

  • The closure pattern for _get_url() to handle lazy URL resolution
  • The query properties conflict detection and resolution
  • The conditional logic for request_options_provider based on cursor types
  • The clean separation between lazy and non-lazy dependency sets

The method handles all the complex interdependencies between decoder, requester, query properties, and cursor components very elegantly.


402-431: Clean build method implementation!

The build method properly orchestrates the dependency resolution and correctly returns either LazySimpleRetriever or SimpleRetriever based on the lazy conditions. The flow is clear and follows the expected ComponentConstructor pattern.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)

630-639: Nice documentation and type variable additions!

The class docstring clearly explains the separation between default and custom component construction approaches. The type variables M and D provide better type safety for the generic patterns used throughout the factory.


671-679: Smart addition of AdditionalFlags integration!

The _flags attribute centralizes all the reusable configuration flags needed for dependency injection with default components. This creates a clean abstraction for passing runtime configuration down to component constructors.


682-684: Type annotation improvement looks good!

The updated mapping type Dict[Type[BaseModel], Union[Type[ComponentConstructor[Any]], Callable[..., Any]]] properly reflects that mappings can now point to either component classes or legacy callables.


761-762: Clean migration to class-based mappings!

The direct mapping to QueryProperties, RecordFilter, and SimpleRetriever classes (instead of factory methods) properly implements the new ComponentConstructor pattern. This creates cleaner separation and enables the dependency injection flow.

Also applies to: 769-769

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

1914-1933: Ensure parameter inference works for ComponentConstructor classes

When the mapping value is a class that implements ComponentConstructor, using getfullargspec on the class can miss keyword-only args accepted by build(...). Shall we detect that case and introspect build instead so $parameters and parent kwargs are forwarded correctly, wdyt?

Apply:

-                model_constructor = self.PYDANTIC_MODEL_TO_CONSTRUCTOR.get(parsed_model.__class__)
-                constructor_kwargs = inspect.getfullargspec(model_constructor).kwonlyargs
+                model_constructor = self.PYDANTIC_MODEL_TO_CONSTRUCTOR.get(parsed_model.__class__)
+                if inspect.isclass(model_constructor) and issubclass(model_constructor, ComponentConstructor):
+                    sig = inspect.signature(model_constructor.build)
+                    constructor_kwargs = [
+                        name for name, p in sig.parameters.items() if p.kind == inspect.Parameter.KEYWORD_ONLY
+                    ]
+                else:
+                    constructor_kwargs = inspect.getfullargspec(model_constructor).kwonlyargs
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1)

974-975: Fix infinite recursion in LazySimpleRetriever fallback path

self._read_pages(...) calls itself in the else branch, leading to infinite recursion when child_response is falsy. Should this delegate to the parent implementation via super(), wdyt?

Apply:

-        else:
-            yield from self._read_pages(records_generator_fn, stream_state, stream_slice)
+        else:
+            yield from super()._read_pages(records_generator_fn, stream_state, stream_slice)
♻️ Duplicate comments (2)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (2)

395-402: Guard and pass a string name to satisfy mypy and runtime intent

Let’s validate name before use and pass a guaranteed str to _should_use_lazy_simple_retriever, wdyt?

Apply:

-        if cls._should_use_lazy_simple_retriever(
+        name = kwargs.get("name")
+        if not isinstance(name, str):
+            raise ValueError("name must be a string")
+        if cls._should_use_lazy_simple_retriever(
             model=model,
             additional_flags=additional_flags,
             incremental_sync=kwargs.get("incremental_sync"),
-            name=kwargs.get("name"),
+            name=name,
         ):
             return LazySimpleRetriever(**resolved_dependencies)

133-141: Type mismatch causes mypy failure (name must be str)

name is Optional here, but it’s used to call get_stream_state(name, ...) which expects str. This is the mypy error flagged by CI. Can we make name a required str, wdyt?

Apply:

-        name: Optional[str] = None,
+        name: str,
🧹 Nitpick comments (5)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)

652-654: Drop unused TypeVar D

D is defined but unused. Remove to avoid dead code, wdyt?

Apply:

-M = TypeVar("M", bound=BaseModel)
-D = TypeVar("D", bound=BaseModel)
+M = TypeVar("M", bound=BaseModel)

2946-2953: Make wildcard check robust to InterpolatedString

lazy_read_pointer elements can be InterpolatedString. Converting to str avoids a possible type issue. Shall we cast to str, wdyt?

Apply:

-        if model.lazy_read_pointer and any("*" in pointer for pointer in model.lazy_read_pointer):
+        if model.lazy_read_pointer and any("*" in str(pointer) for pointer in model.lazy_read_pointer):
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (3)

195-206: Nit: grammar in error message

“more that one slice per parent” → “more than one slice per parent”. Quick copyedit, wdyt?

Apply:

-                        f"Found more that one slice per parent. LazySimpleRetriever only supports single slice read for stream - {name}."
+                        f"Found more than one slice per parent. LazySimpleRetriever only supports single slice read for stream - {name}."

183-184: Correct type for transformations

transformations is a list everywhere it’s passed/used. Shall we update the annotation to reflect that, wdyt?

Apply:

-        transformations: Optional[RecordTransformation] = None,
+        transformations: Optional[List[RecordTransformation]] = None,

227-246: Optional: preserve custom log_formatter when not limiting slices

Currently the helper returns None unless should_limit_slices_fetched is True, discarding a user-provided log_formatter. Do we want to return the provided formatter in the else branch to honor explicit configuration, wdyt?

Apply:

-            return None
+            return log_formatter
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5e71790 and 2fca25a.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6 hunks)
  • airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)
airbyte_cdk/sources/declarative/parsers/component_constructor.py (3)
  • AdditionalFlags (19-42)
  • ComponentConstructor (46-90)
  • build (64-90)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1)
  • QueryProperties (21-84)
airbyte_cdk/sources/declarative/extractors/record_filter.py (1)
  • RecordFilter (20-67)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (2)
  • SimpleRetriever (89-916)
  • build (374-402)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (7)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (23)
  • JsonDecoder (846-847)
  • CustomRequester (219-230)
  • HttpRequester (2520-2659)
  • QueryProperties (2750-2767)
  • SimpleRetriever (2786-2840)
  • SubstreamPartitionRouter (2949-2956)
  • Config (136-137)
  • Config (150-151)
  • Config (164-165)
  • Config (178-179)
  • Config (192-193)
  • Config (206-207)
  • Config (220-221)
  • Config (234-235)
  • Config (248-249)
  • Config (262-263)
  • Config (276-277)
  • Config (292-293)
  • Config (306-307)
  • Config (320-321)
  • Config (354-355)
  • Config (378-379)
  • NoPagination (1004-1005)
airbyte_cdk/sources/declarative/decoders/json_decoder.py (1)
  • JsonDecoder (19-47)
airbyte_cdk/sources/declarative/requesters/http_requester.py (4)
  • HttpRequester (37-484)
  • _get_url (145-179)
  • get_url (130-143)
  • get_url_base (181-193)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (2)
  • QueryProperties (21-84)
  • resolve_dependencies (36-62)
airbyte_cdk/sources/declarative/parsers/component_constructor.py (5)
  • AdditionalFlags (19-42)
  • ComponentConstructor (46-90)
  • resolve_dependencies (48-61)
  • should_limit_slices_fetched (37-42)
  • build (64-90)
airbyte_cdk/sources/declarative/partition_routers/single_partition_router.py (2)
  • SinglePartitionRouter (13-57)
  • get_stream_state (53-57)
airbyte_cdk/sources/declarative/requesters/request_options/default_request_options_provider.py (1)
  • DefaultRequestOptionsProvider (15-60)
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py

[error] 5-5: Ruff: Import block is unsorted or unformatted. (I001).

airbyte_cdk/sources/declarative/retrievers/simple_retriever.py

[error] 145-145: Mypy type-check failed. Command: poetry run mypy --config-file mypy.ini airbyte_cdk. Argument 1 to 'get_stream_state' of 'ConnectorStateManager' has incompatible type 'str | None'; expected 'str' [arg-type].

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-shopify
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)

780-789: Nice switch to DI-backed constructors

Mapping QueryPropertiesModel, RecordFilterModel, and SimpleRetrieverModel directly to their ComponentConstructor classes looks good and aligns with the new DI pattern. 👍


635-638: Fix Ruff import-order violation (I001)

Poetry wasn't available in the sandbox ("poetry: command not found"), so I couldn't run ruff here. Could you run one of these locally to auto-fix (or move the import into the existing airbyte_cdk.sources.declarative.* import block), wdyt?

  • poetry install && poetry run ruff check --fix airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
  • ruff check --fix airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py

File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (lines 635–638).

airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1)

145-145: Re-run mypy/ruff after Optional typing fix in get_stream_state

mypy failure is at: and not bool(additional_flags.connector_state_manager.get_stream_state(name, None)) — Optional[str] may be passed to get_stream_state. Local verification failed with "/bin/bash: line 3: poetry: command not found". Can you re-run the provided checks in an environment with poetry (or run python -m mypy --config-file mypy.ini airbyte_cdk/sources/declarative/retrievers/simple_retriever.py and ruff check airbyte_cdk/sources/declarative/retrievers/simple_retriever.py) to confirm green? wdyt?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants