Conversation

@maxi297 maxi297 commented Oct 22, 2025

What

https://github.com/airbytehq/airbyte-internal-issues/issues/14929

How

Add a cache within PropertiesFromEndpoint. Note that with this solution, each instance of PropertiesFromEndpoint has its own cache, so the same stream used as a child/main stream and as a parent stream may see different endpoint properties if a field is added between the reads of those streams. I don't see a case where this would be problematic, though.
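
To make the per-instance trade-off concrete, here is a minimal sketch. `FakeRetriever` and `CachedProperties` are hypothetical stand-ins rather than the CDK classes; only the lazy `None`-check idea comes from this PR, and the `"name"` key is an assumption made for the example.

```python
from typing import Any, Iterable, List, Mapping, Optional


class FakeRetriever:
    """Hypothetical stand-in for the retriever; counts how often it is read."""

    def __init__(self, records: List[Mapping[str, Any]]) -> None:
        self.records = records
        self.calls = 0

    def read_records(self) -> Iterable[Mapping[str, Any]]:
        self.calls += 1
        return list(self.records)


class CachedProperties:
    """Hypothetical simplification of the caching idea described above."""

    def __init__(self, retriever: FakeRetriever) -> None:
        self._retriever = retriever
        self._cached: Optional[List[str]] = None  # one cache per instance

    def get_properties(self) -> List[str]:
        if self._cached is None:
            self._cached = [str(r["name"]) for r in self._retriever.read_records()]
        return self._cached


retriever = FakeRetriever([{"name": "id"}, {"name": "email"}])
as_parent = CachedProperties(retriever)
first = as_parent.get_properties()
second = as_parent.get_properties()  # served from the cache, no second read

# A field is added on the API side before another instance reads.
retriever.records.append({"name": "phone"})
as_child = CachedProperties(retriever)
child_view = as_child.get_properties()
parent_view = as_parent.get_properties()  # still the older, cached view
```

In this sketch the two instances disagree after the field is added, which is the divergence the description accepts as harmless.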

Summary by CodeRabbit

  • Performance Improvements

    • Endpoint properties are cached to avoid redundant retrievals on repeated calls.
  • Bug Fixes

    • Property values are consistently returned as strings.
    • Chunking no longer mutates input property lists.
  • Improvements

    • Property chunking and property retrieval APIs simplified and use consistent list types.
  • Tests

    • New tests cover single-call caching, type coercion, and input immutability.
  • Documentation

    • Clarified that stream slices can't be interpolated from this retriever.

@maxi297 maxi297 requested a review from brianjlai October 22, 2025 13:37
@github-actions github-actions bot added the enhancement New feature or request label Oct 22, 2025
@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@maxi297/cache_properties_from_endpoint#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch maxi297/cache_properties_from_endpoint

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment


coderabbitai bot commented Oct 22, 2025

Walkthrough

PropertiesFromEndpoint now caches computed property names and returns them as a List[str]. The query-property APIs and their call sites drop the stream_slice parameter and tighten property_fields to List[str]. Tests and SimpleRetriever are updated to match the new signatures and to verify caching and type coercion.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Core Caching Implementation: airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py | Added _cached_properties: Optional[List[str]] = None. get_properties_from_endpoint() now returns List[str], computes properties on first call by invoking retriever.read_records() (no stream_slice), extracts values via new _get_property(property_obj: Mapping[str, Any]) -> str, caches and returns the list. |
| Type Annotation Tightening & Signature Changes: airbyte_cdk/sources/declarative/requesters/query_properties/property_chunking.py, airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py | Tightened property_fields type to List[str]. Removed stream_slice argument from QueryProperties.get_request_property_chunks() and adjusted internal typing to List[str]. Chunking logic unchanged. |
| Retriever Call-site Updates & Cleanup: airbyte_cdk/sources/declarative/retrievers/simple_retriever.py | Updated iteration over get_request_property_chunks() (no stream_slice arg). For each chunk, constructs StreamSlice with extra_fields={"query_properties": properties}. Removed an unused local variable in read_records. |
| Unit Tests, PropertiesFromEndpoint: unit_tests/sources/declarative/requesters/query_properties/test_properties_from_endpoint.py | Updated tests to use returned List[str] directly. Added tests asserting retriever.read_records is called only once across multiple invocations and that integer values are coerced to strings. |
| Unit Tests, Property Chunking: unit_tests/sources/declarative/requesters/query_properties/test_property_chunking.py | Removed iterator conversion in setup, added test ensuring get_request_property_chunks does not mutate the input property_fields when always_include_properties is provided; minor typing annotations added. |
| Unit Tests, QueryProperties: unit_tests/sources/declarative/requesters/query_properties/test_query_properties.py | Updated tests to call get_request_property_chunks() without stream_slice and to reflect tightened typing. |
| Schema doc: airbyte_cdk/sources/declarative/declarative_component_schema.yaml | Added note: "Note that stream_slices can't be interpolated from this retriever." No behavior change. |

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant PropertiesFromEndpoint
    participant Retriever

    Caller->>PropertiesFromEndpoint: get_properties_from_endpoint()
    alt cached not set
        PropertiesFromEndpoint->>Retriever: read_records(stream_slice=None)
        Retriever-->>PropertiesFromEndpoint: iterable records
        rect rgb(220,240,220)
            PropertiesFromEndpoint->>PropertiesFromEndpoint: map _get_property over records -> List[str]
            PropertiesFromEndpoint->>PropertiesFromEndpoint: set _cached_properties
        end
        PropertiesFromEndpoint-->>Caller: return List[str]
    else cached
        PropertiesFromEndpoint-->>Caller: return cached List[str]
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • brianjlai
  • darynaishchenko

Should cached properties ever be invalidated (e.g., if endpoint values can change per-run or per-slice), or is it acceptable to assume endpoint properties remain stable for the connector's lifetime — wdyt?

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title "feat: cache properties from endpoint" directly and accurately describes the primary objective of this changeset. The PR's main change is the addition of a caching mechanism (via the _cached_properties field) to the PropertiesFromEndpoint class, which stores computed property names and returns cached values on subsequent calls rather than re-fetching from the endpoint. The title is clear, concise, and uses proper conventional commit formatting. A teammate reviewing the git history would immediately understand that this PR implements in-instance caching for endpoint property retrieval. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
unit_tests/sources/declarative/requesters/query_properties/test_properties_from_endpoint.py (1)

1-1: Don't forget the formatting fix!

The pipeline is reporting formatting issues for this file too. Could you run ruff format to fix them, wdyt?

#!/bin/bash
cd unit_tests/sources/declarative/requesters/query_properties
ruff format test_properties_from_endpoint.py
🧹 Nitpick comments (1)
unit_tests/sources/declarative/requesters/query_properties/test_property_chunking.py (1)

108-128: Test logic is solid—just needs a formatting touch-up.

The test is excellent for validating the non-mutation guarantee. However, verification confirms the pipeline was right: the file is missing a final newline at the end.

Could you run ruff format unit_tests/sources/declarative/requesters/query_properties/test_property_chunking.py to add it, wdyt?
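
For readers unfamiliar with the non-mutation guarantee mentioned above, a rough sketch follows. `chunk_properties` is a hypothetical helper that chunks by count; it is not the CDK's `PropertyChunking` and its parameters are assumptions chosen for illustration.

```python
from typing import Iterator, List, Optional


def chunk_properties(
    property_fields: List[str],
    chunk_size: int,
    always_include: Optional[List[str]] = None,
) -> Iterator[List[str]]:
    """Yield chunks of at most `chunk_size` non-required fields, each prefixed
    with the `always_include` fields.

    Hypothetical helper: it builds new lists and never modifies its arguments.
    """
    required = list(always_include or [])
    # Filter into a new list instead of calling property_fields.remove(...).
    remaining = [field for field in property_fields if field not in required]
    for start in range(0, len(remaining), chunk_size):
        yield required + remaining[start : start + chunk_size]


fields = ["id", "name", "email", "phone"]
snapshot = list(fields)
chunks = list(chunk_properties(fields, chunk_size=2, always_include=["id"]))
```

A test in this style compares `fields` with `snapshot` after chunking, which is the kind of check the new unit test performs.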

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 20ae208 and ad730cf.

📒 Files selected for processing (5)
  • airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (2 hunks)
  • airbyte_cdk/sources/declarative/requesters/query_properties/property_chunking.py (1 hunks)
  • airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1 hunks)
  • unit_tests/sources/declarative/requesters/query_properties/test_properties_from_endpoint.py (3 hunks)
  • unit_tests/sources/declarative/requesters/query_properties/test_property_chunking.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
unit_tests/sources/declarative/requesters/query_properties/test_property_chunking.py (1)
airbyte_cdk/sources/declarative/requesters/query_properties/property_chunking.py (3)
  • PropertyChunking (25-71)
  • PropertyLimitType (14-21)
  • get_request_property_chunks (42-68)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (3)
airbyte_cdk/sources/declarative/interpolation/interpolated_string.py (1)
  • InterpolatedString (13-79)
airbyte_cdk/sources/types.py (1)
  • StreamSlice (75-169)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1)
  • read_records (512-554)
unit_tests/sources/declarative/requesters/query_properties/test_properties_from_endpoint.py (3)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (2)
  • get_properties_from_endpoint (34-37)
  • PropertiesFromEndpoint (15-44)
airbyte_cdk/sources/types.py (5)
  • StreamSlice (75-169)
  • cursor_slice (107-112)
  • partition (99-104)
  • Record (21-72)
  • data (35-36)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1)
  • read_records (512-554)
🪛 GitHub Actions: Linters
unit_tests/sources/declarative/requesters/query_properties/test_property_chunking.py

[error] 1-1: Ruff format check failed. 3 files would be reformatted. Run 'ruff format' to fix code style issues in this file.

airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py

[error] 1-1: Ruff format check failed. 3 files would be reformatted. Run 'ruff format' to fix code style issues in this file.

unit_tests/sources/declarative/requesters/query_properties/test_properties_from_endpoint.py

[error] 1-1: Ruff format check failed. 3 files would be reformatted. Run 'ruff format' to fix code style issues in this file.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-shopify
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Manifest Server Docker Image Build
🔇 Additional comments (6)
airbyte_cdk/sources/declarative/requesters/query_properties/property_chunking.py (1)

42-44: LGTM! Type tightening aligns with implementation.

The change from Iterable[str] to List[str] matches the actual usage pattern—the function converts to a list immediately on line 46 anyway. This makes the contract more explicit.
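
One general reason a concrete list is a safer contract than an arbitrary iterable, sketched in plain Python without any CDK code: a one-shot iterator is empty after its first traversal, while a list can be walked repeatedly. `count_twice` is a hypothetical helper written only for this illustration.

```python
from typing import Iterable, List


def count_twice(fields: Iterable[str]) -> List[int]:
    """Traverse `fields` twice and report how many items each pass saw."""
    first_pass = sum(1 for _ in fields)
    second_pass = sum(1 for _ in fields)
    return [first_pass, second_pass]


names = ["id", "name", "email"]
with_list = count_twice(names)            # a list can be traversed repeatedly
with_iterator = count_twice(iter(names))  # an iterator is exhausted after one pass
```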

airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1)

37-41: LGTM! Type annotation correctly tightened.

The local fields variable type now accurately reflects that get_properties_from_endpoint returns List[str], making the typing more precise.

airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (1)

39-44: Nice helper method for property extraction!

The _get_property method cleanly handles path evaluation and type coercion to string. The logic for handling both string and InterpolatedString nodes looks solid.

unit_tests/sources/declarative/requesters/query_properties/test_properties_from_endpoint.py (3)

47-50: LGTM! Test updated for new return type.

Removing the list() wrapper is correct since get_properties_from_endpoint now returns List[str] directly.


136-156: Good caching test, but consider testing different stream_slice values?

This test verifies that the retriever is called only once, which is great! However, it uses the same stream_slice for all three calls. Given my concern about the cache not accounting for different slices (see my comment on properties_from_endpoint.py), it might be valuable to add a test that uses different stream_slice values to verify the expected behavior, wdyt?


158-177: Excellent test for type coercion!

This test ensures that integer property values are correctly converted to strings, which is important for the consistent List[str] return type. Nice edge case coverage!
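
As an illustration of the coercion being tested, the sketch below walks a key path by hand and wraps the result in `str()`. `extract_property` and the record shape are hypothetical; the real helper resolves the path differently.

```python
from typing import Any, List, Mapping, Sequence


def extract_property(record: Mapping[str, Any], path: Sequence[str]) -> str:
    """Follow `path` through nested mappings and return the value as a string."""
    value: Any = record
    for key in path:
        value = value[key]
    return str(value)


records = [{"field": {"id": 123}}, {"field": {"id": "name"}}]
properties: List[str] = [extract_property(r, ["field", "id"]) for r in records]
```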

github-actions bot commented Oct 22, 2025

PyTest Results (Fast)

3 816 tests  +3   3 804 ✅ +3   6m 32s ⏱️ -13s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit d297e48. ± Comparison against base commit 26a9b98.

♻️ This comment has been updated with latest results.

maxi297 commented Oct 22, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formatting issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between ebac375 and f76ee08.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (1 hunks)
  • airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (0 hunks)
💤 Files with no reviewable changes (1)
  • airbyte_cdk/sources/declarative/retrievers/simple_retriever.py
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (3)
airbyte_cdk/sources/declarative/interpolation/interpolated_string.py (1)
  • InterpolatedString (13-79)
airbyte_cdk/sources/types.py (1)
  • StreamSlice (75-169)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1)
  • read_records (512-553)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Analyze (python)

github-actions bot commented Oct 22, 2025

PyTest Results (Full)

3 819 tests  +3   3 807 ✅ +3   11m 7s ⏱️ +3s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit d297e48. ± Comparison against base commit 26a9b98.

♻️ This comment has been updated with latest results.

@brianjlai brianjlai left a comment

Left a few notes, but I think this change makes sense and nothing here is blocking. Let me know how you feel about the notes I mentioned.

@maxi297 maxi297 requested a review from brianjlai October 23, 2025 19:23

maxi297 commented Oct 23, 2025

/autofix

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (1)

43-48: Missing property path returns literal string "[]" instead of empty string.

When dpath.get doesn't find the property path, it returns the default value [] (empty list), which str() then converts to the literal string "[]". This means properties missing from the endpoint response will appear as the string "[]" in the property list rather than an empty string or being filtered out.

Is this the intended behavior, or should missing properties yield an empty string (using default="") or be skipped entirely, wdyt?

If you'd prefer empty strings for missing properties, apply this diff:

-        return str(dpath.get(property_obj, path, default=[]))  # type: ignore # extracted will be a MutableMapping, given input data structure
+        result = dpath.get(property_obj, path, default="")
+        return str(result) if result else ""  # type: ignore # extracted will be a MutableMapping, given input data structure

Alternatively, if missing properties should be skipped, you could filter them out in get_properties_from_endpoint instead.
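
The "[]" outcome follows directly from how `str()` renders an empty list. The sketch below uses a hypothetical `lookup` helper that mimics a path lookup with a default; it is not the dpath API, just enough to show the effect of the two default values.

```python
from collections.abc import Mapping
from typing import Any, Sequence


def lookup(record: Mapping[str, Any], path: Sequence[str], default: Any) -> Any:
    """Hypothetical path lookup that returns `default` when a key is missing."""
    value: Any = record
    for key in path:
        if not isinstance(value, Mapping) or key not in value:
            return default
        value = value[key]
    return value


record = {"name": "email"}
found = str(lookup(record, ["name"], default=[]))
missing_with_list_default = str(lookup(record, ["label"], default=[]))
missing_with_str_default = str(lookup(record, ["label"], default=""))
```

With a list default, the missing case stringifies to the two-character text `[]`; with a string default it stays empty.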

🧹 Nitpick comments (1)
unit_tests/sources/declarative/requesters/query_properties/test_query_properties.py (1)

88-90: Mock return type should match the actual implementation.

The mock returns an iterator (iter([...])) but get_properties_from_endpoint now returns a concrete List[str]. While Python's duck typing makes this work in practice, the mock should match the actual type for accuracy and to catch potential type-related issues, wdyt?

Apply this diff to align the mock with the actual return type:

-    properties_from_endpoint_mock.get_properties_from_endpoint.return_value = iter(
-        ["alice", "clover", "dio", "k", "luna", "phi", "quark", "sigma", "tenmyouji"]
-    )
+    properties_from_endpoint_mock.get_properties_from_endpoint.return_value = [
+        "alice", "clover", "dio", "k", "luna", "phi", "quark", "sigma", "tenmyouji"
+    ]
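
A concrete reason to prefer the list, sketched with the standard library only: a `Mock` hands back the same `return_value` object on every call, so an iterator is already consumed the second time it is returned. `get_properties` is a hypothetical method name on a bare `Mock`, not the CDK method.

```python
from unittest.mock import Mock

mock = Mock()

# Iterator return value: every call hands back the same, already-consumed object.
mock.get_properties.return_value = iter(["alice", "clover"])
iterator_first = list(mock.get_properties())
iterator_second = list(mock.get_properties())

# List return value: every call sees the full data.
mock.get_properties.return_value = ["alice", "clover"]
list_first = list(mock.get_properties())
list_second = list(mock.get_properties())
```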
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between f76ee08 and 782a7b1.

📒 Files selected for processing (5)
  • airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (1 hunks)
  • airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1 hunks)
  • airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1 hunks)
  • unit_tests/sources/declarative/requesters/query_properties/test_properties_from_endpoint.py (3 hunks)
  • unit_tests/sources/declarative/requesters/query_properties/test_query_properties.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • unit_tests/sources/declarative/requesters/query_properties/test_properties_from_endpoint.py
  • airbyte_cdk/sources/declarative/retrievers/simple_retriever.py
🧰 Additional context used
🧬 Code graph analysis (3)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (3)
airbyte_cdk/sources/declarative/interpolation/interpolated_string.py (1)
  • InterpolatedString (13-79)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1)
  • read_records (510-551)
airbyte_cdk/sources/types.py (3)
  • StreamSlice (75-169)
  • partition (99-104)
  • cursor_slice (107-112)
unit_tests/sources/declarative/requesters/query_properties/test_query_properties.py (2)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1)
  • get_request_property_chunks (28-46)
airbyte_cdk/sources/declarative/requesters/query_properties/property_chunking.py (1)
  • get_request_property_chunks (42-68)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (3)
airbyte_cdk/sources/declarative/requesters/query_properties/property_chunking.py (1)
  • get_request_property_chunks (42-68)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (2)
  • PropertiesFromEndpoint (14-48)
  • get_properties_from_endpoint (33-41)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • PropertiesFromEndpoint (2771-2782)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (2)

25-26: LGTM! Cache field appropriately initialized.

The _cached_properties field is properly typed as Optional[List[str]] and initialized to None, enabling the lazy-load pattern implemented below.


33-41: LGTM! Caching implementation is sound.

The caching logic correctly checks for None on first call, populates the cache via map, and returns the cached result on subsequent calls. This ensures properties are fetched once per instance, as intended.
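
A small sketch of why the explicit `None` check matters for a list-valued cache: an empty list is falsy, so a truthiness check would treat "fetched, but empty" as "not fetched yet" and reload every time. `LazyList` is a hypothetical class written for this comparison only.

```python
from typing import Callable, List, Optional


class LazyList:
    """Hypothetical lazy holder that loads its list on first access."""

    def __init__(self, loader: Callable[[], List[str]], strict_none_check: bool) -> None:
        self._loader = loader
        self._strict = strict_none_check
        self._cached: Optional[List[str]] = None
        self.loads = 0

    def get(self) -> List[str]:
        # Strict mode reloads only when nothing was ever cached;
        # loose mode also reloads when the cached list happens to be empty.
        needs_load = (self._cached is None) if self._strict else (not self._cached)
        if needs_load:
            self.loads += 1
            self._cached = self._loader()
        return self._cached


strict = LazyList(lambda: [], strict_none_check=True)
loose = LazyList(lambda: [], strict_none_check=False)
for _ in range(3):
    strict.get()
    loose.get()
```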

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (1)

10-10: Minor: Consider removing unused StreamSlice import?

After removing the stream_slice parameter from get_properties_from_endpoint, the StreamSlice import on line 10 appears to be unused. Would you like to clean it up to keep imports minimal, wdyt?

#!/bin/bash
# Verify if StreamSlice is used anywhere in the file
rg -n "StreamSlice" airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py | grep -v "^10:"
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between d22170d and 2adf3ea.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1 hunks)
  • airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (2)
airbyte_cdk/sources/declarative/interpolation/interpolated_string.py (1)
  • InterpolatedString (13-79)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1)
  • read_records (512-553)
🔇 Additional comments (4)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

3328-3328: LGTM! Clear documentation of the stream_slice limitation.

The note about stream_slices not being interpolatable from this retriever is a helpful clarification that aligns with the API changes in properties_from_endpoint.py. This will prevent users from attempting to use stream_slice context where it's not supported.

airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (3)

25-26: Good addition of the cache field.

The _cached_properties field with Optional[List[str]] type hint and None initialization is a clean way to implement lazy caching. The private naming convention is appropriate since this is an internal implementation detail.


33-41: Clean caching implementation for property retrieval.

The caching logic is straightforward and effective:

  • On first call, retrieves records and maps them through _get_property
  • Subsequent calls return the cached list
  • Using stream_slice=None aligns with the discussions about properties being global to the stream

One question: Line 37's # type: ignore comment mentions that the return type of Retriever.read_records might need updating. Is this something that should be addressed at the interface level, or is the type: ignore acceptable here, wdyt?


43-48: Clarify the intended behavior for missing property fields.

The tests all cover records with complete data—none test scenarios where the property field path is missing from a record. When dpath.get doesn't find the path, it returns [] (the default), which str() converts to the literal string "[]". This means incomplete records would contribute "[]" to the cached properties list.

The gap in test coverage makes it unclear whether this behavior is intentional. Should records with missing property fields contribute empty strings ("") instead, or is the "[]" behavior by design? Could you verify this and add a test case for missing properties to clarify the expected behavior, wdyt?

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1)

388-395: Preserve base slice metadata when injecting query_properties

Overwriting stream_slice each iteration drops any existing extra_fields from the original slice. For substreams that rely on those keys (e.g., parent IDs), the follow-up requests end up missing required params, so the sync breaks as soon as chunking is enabled. Could we keep the original slice untouched and build a chunk_stream_slice that merges the existing extra_fields before calling _fetch_next_page, wdyt?

-                    for properties in self.additional_query_properties.get_request_property_chunks():
-                        stream_slice = StreamSlice(
-                            partition=stream_slice.partition or {},
-                            cursor_slice=stream_slice.cursor_slice or {},
-                            extra_fields={"query_properties": properties},
-                        )
-                        response = self._fetch_next_page(
-                            stream_state, stream_slice, next_page_token
-                        )
+                    base_stream_slice = stream_slice
+                    for properties in self.additional_query_properties.get_request_property_chunks():
+                        chunk_extra_fields = dict(base_stream_slice.extra_fields or {})
+                        chunk_extra_fields["query_properties"] = properties
+                        chunk_stream_slice = StreamSlice(
+                            partition=base_stream_slice.partition or {},
+                            cursor_slice=base_stream_slice.cursor_slice or {},
+                            extra_fields=chunk_extra_fields,
+                        )
+                        response = self._fetch_next_page(
+                            stream_state, chunk_stream_slice, next_page_token
+                        )
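
The merge proposed in the diff can be tried in isolation. The sketch below uses a hypothetical, simplified `Slice` dataclass instead of the CDK's `StreamSlice`, and `with_query_properties` is an illustrative helper name; it only shows the copy-then-add pattern, not the retriever's actual control flow.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class Slice:
    """Hypothetical, simplified stand-in for the CDK's StreamSlice."""

    partition: Dict[str, Any] = field(default_factory=dict)
    cursor_slice: Dict[str, Any] = field(default_factory=dict)
    extra_fields: Dict[str, Any] = field(default_factory=dict)


def with_query_properties(base: Slice, properties: List[str]) -> Slice:
    """Return a new slice that keeps the base extra_fields and adds the chunk."""
    merged = dict(base.extra_fields)  # copy so the base slice stays untouched
    merged["query_properties"] = properties
    return Slice(partition=base.partition, cursor_slice=base.cursor_slice, extra_fields=merged)


base = Slice(partition={"parent_id": 7}, extra_fields={"parent_name": "acme"})
chunk_slices = [with_query_properties(base, chunk) for chunk in (["id", "name"], ["id", "email"])]
```

Because each chunk slice is derived from `base` rather than from the previous chunk's slice, the parent metadata survives every iteration.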
🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1)

33-40: Refresh the docstring to match the new signature

The docstring still calls out a stream_slice parameter that no longer exists, which is confusing now that callers just invoke the method bare. Could we trim that paragraph so the docstring only describes the current behavior, wdyt?

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 2adf3ea and 656a806.

📒 Files selected for processing (6)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1 hunks)
  • airbyte_cdk/sources/declarative/requesters/query_properties/property_chunking.py (1 hunks)
  • airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (2 hunks)
  • airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1 hunks)
  • unit_tests/sources/declarative/requesters/query_properties/test_property_chunking.py (4 hunks)
  • unit_tests/sources/declarative/requesters/query_properties/test_query_properties.py (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • unit_tests/sources/declarative/requesters/query_properties/test_property_chunking.py
🧰 Additional context used
🧬 Code graph analysis (3)
unit_tests/sources/declarative/requesters/query_properties/test_query_properties.py (2)
airbyte_cdk/sources/declarative/requesters/query_properties/property_chunking.py (1)
  • get_request_property_chunks (42-73)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1)
  • get_request_property_chunks (33-69)
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (2)
airbyte_cdk/sources/declarative/requesters/query_properties/property_chunking.py (1)
  • get_request_property_chunks (42-73)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1)
  • get_request_property_chunks (33-69)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (4)
airbyte_cdk/sources/declarative/requesters/query_properties/property_chunking.py (1)
  • get_request_property_chunks (42-73)
airbyte_cdk/sources/declarative/requesters/query_properties/property_selector/json_schema_property_selector.py (1)
  • select (30-54)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (2)
  • PropertiesFromEndpoint (14-48)
  • get_properties_from_endpoint (33-41)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • PropertiesFromEndpoint (2796-2807)
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py

[error] 385-390: ruff format would reformat files. Run 'ruff format' to apply formatting changes.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Manifest Server Docker Image Build

@maxi297
Contributor Author

maxi297 commented Oct 24, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formatting issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

octavia-squidington-iii and others added 2 commits October 24, 2025 13:50
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1)

53-60: Consider simplifying the redundant conditional logic, wdyt?

The inner check for configured_properties is not None on line 56 is redundant since we're already inside the outer if configured_properties is not None: block. The logic could be streamlined for better readability.

Apply this diff to simplify the logic:

         if self.property_chunking:
             yield from self.property_chunking.get_request_property_chunks(
                 property_fields=fields,
                 always_include_properties=self.always_include_properties,
                 configured_properties=configured_properties,
             )
         else:
             if configured_properties is not None:
-                all_fields = (
-                    [field for field in fields if field in configured_properties]
-                    if configured_properties is not None
-                    else list(fields)
-                )
+                all_fields = [field for field in fields if field in configured_properties]
             else:
                 all_fields = list(fields)
 
             if self.always_include_properties:
                 all_fields = list(self.always_include_properties) + all_fields
 
             yield all_fields
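The simplified non-chunking branch suggested above can be sketched as a standalone function. This is only an illustration: `select_fields` is a hypothetical helper, not part of the CDK, and the parameter types are assumptions based on the names visible in the diff.

```python
from typing import Iterable, List, Optional, Set


def select_fields(
    fields: Iterable[str],
    configured_properties: Optional[Set[str]] = None,
    always_include_properties: Optional[Iterable[str]] = None,
) -> List[str]:
    """Hypothetical standalone version of the non-chunking branch."""
    if configured_properties is not None:
        # A single None check is enough: inside this branch the filter
        # can be applied unconditionally.
        all_fields = [field for field in fields if field in configured_properties]
    else:
        # No configured properties: keep every field, copied into a new list.
        all_fields = list(fields)

    if always_include_properties:
        # Always-included properties go first, matching the order in the diff.
        all_fields = list(always_include_properties) + all_fields

    return all_fields
```

Both branches build a new list, so the caller's `fields` input is never mutated.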
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cf06028 and d297e48.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (3)
airbyte_cdk/sources/declarative/requesters/query_properties/property_chunking.py (1)
  • get_request_property_chunks (42-73)
airbyte_cdk/sources/declarative/requesters/query_properties/property_selector/json_schema_property_selector.py (1)
  • select (30-54)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (2)
  • PropertiesFromEndpoint (14-48)
  • get_properties_from_endpoint (33-41)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-shopify
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/requesters/query_properties/query_properties.py (3)

33-37: LGTM! The signature change and docstring update look good.

The removal of the stream_slice parameter is consistent with the caching implementation in PropertiesFromEndpoint, and the docstring has been properly updated to reflect this change.


38-38: Nice addition of the explicit type annotation.

This makes the expected type clear and aligns well with the return type from PropertiesFromEndpoint.get_properties_from_endpoint() and the parameter type expected by PropertyChunking.get_request_property_chunks().


41-42: The updated call to get_properties_from_endpoint() looks correct.

Removing the stream_slice argument aligns with the new cached implementation in PropertiesFromEndpoint.
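As a rough illustration of the per-instance caching described in this PR, the sketch below fetches property names once and reuses them on later calls. `CachedPropertySource` and its `fetch` callable are hypothetical stand-ins, not the actual `PropertiesFromEndpoint` implementation, which may differ in detail.

```python
from typing import Callable, Iterable, List, Optional


class CachedPropertySource:
    """Hypothetical stand-in for a component that fetches property names once."""

    def __init__(self, fetch: Callable[[], Iterable[object]]) -> None:
        # `fetch` is an assumed callable that performs the expensive retrieval.
        self._fetch = fetch
        # The cache lives on the instance, so each instance fetches separately.
        self._cached_properties: Optional[List[str]] = None

    def get_properties(self) -> List[str]:
        if self._cached_properties is None:
            # Coerce every value to str so callers always receive strings.
            self._cached_properties = [str(prop) for prop in self._fetch()]
        # Return a copy so callers cannot mutate the cached list.
        return list(self._cached_properties)
```

Because the cache is an instance attribute, two instances pointing at the same endpoint each perform their own fetch, which matches the trade-off noted in the PR description.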

@maxi297 maxi297 merged commit 6ea924a into main Oct 24, 2025
28 of 29 checks passed
@maxi297 maxi297 deleted the maxi297/cache_properties_from_endpoint branch October 24, 2025 15:30
Labels

enhancement New feature or request

2 participants