Skip to content

Conversation

@minitriga
Copy link
Contributor

@minitriga minitriga commented Oct 7, 2025

Add data processing strategy for object files with range expansion support

- Introduced `strategy` field in object file specification to control data processing.
- Implemented `DataProcessor` classes for normal and range expansion strategies.
- Updated validation and processing methods to utilize the specified strategy.
- Enhanced tests to cover new strategy functionality and ensure correct behavior.

Summary by CodeRabbit

  • New Features

    • Added an optional strategy setting (normal, range_expand) to control data processing and range expansion. Default is normal.
  • Documentation

    • Updated guides with strategy usage, defaults, YAML examples, and error scenarios for range expansion.
  • Bug Fixes

    • Tightened range-detection to avoid unintended expansion of single-token or non-numeric brackets (e.g., Device[5], Device[expansion] remain literal).
  • Tests

    • Added/updated tests for strategy behaviors, no-expansion paths, multi-range expansion, mismatch errors; removed obsolete whitespace test.

…pport

- Introduced `strategy` field in object file specification to control data processing.
- Implemented `DataProcessor` classes for normal and range expansion strategies.
- Updated validation and processing methods to utilize the specified strategy.
- Enhanced tests to cover new strategy functionality and ensure correct behavior.
@coderabbitai
Copy link

coderabbitai bot commented Oct 7, 2025

Walkthrough

Introduces a strategy-driven object processing pipeline: adds ObjectStrategy (NORMAL, RANGE_EXPAND), an abstract DataProcessor with SingleDataProcessor and RangeExpandDataProcessor, and DataProcessorFactory. InfrahubObjectFileData gains a strategy field (default ObjectStrategy.NORMAL) and _get_processed_data; validation methods now accept and propagate strategy. Tightens range-detection regex to require a dash or comma inside brackets. Documentation and tests updated to reflect NORMAL vs RANGE_EXPAND behavior and changed expansion expectations.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly captures the primary focus of the changeset by highlighting enhancements to the range expansion feature, making its intent clear and directly related to the main modifications.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch atg-20251007-ifc-1878

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions bot added the type/documentation Improvements or additions to documentation label Oct 7, 2025
@cloudflare-workers-and-pages
Copy link

cloudflare-workers-and-pages bot commented Oct 7, 2025

Deploying infrahub-sdk-python with  Cloudflare Pages  Cloudflare Pages

Latest commit: df0f3db
Status: ✅  Deploy successful!
Preview URL: https://adc96c8c.infrahub-sdk-python.pages.dev
Branch Preview URL: https://atg-20251007-ifc-1878.infrahub-sdk-python.pages.dev

View logs

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (6)
docs/docs/python-sdk/topics/object_file.mdx (1)

63-66: Clarify when brackets do NOT expand; add Diataxis structure touch-ups.

  • Please add an explicit note/example that single-token brackets like Device[5] are treated as literals and won’t expand. The matcher only expands when a dash or comma is present inside brackets. Example to add:
    • Input: Device[5] → Device[5]
  • Consider a callout (e.g., Note) stating that only patterns containing “-” or “,” expand, and that spaces inside brackets prevent expansion.
  • For Topics (Explanation), consider adding brief “Concepts & definitions” and “Connections / Further reading” sections to align with our docs style. Based on guidelines

Also applies to: 70-80, 211-225, 236-244, 262-267, 288-301, 303-307

infrahub_sdk/spec/range_expansion.py (2)

4-4: Tightened MATCH_PATTERN is correct; consider minor robustness.

  • Good: requires “-” or “,” inside brackets.
  • Consider precompiling at module level (RE = re.compile(MATCH_PATTERN)) to avoid recompiling per call.

19-23: Edge cases in _char_range_expand and error policy.

  • The “no dash/comma” branch is now unreachable due to the tightened MATCH_PATTERN; consider removing it or guarding with a comment.
  • Raising ValueError for non-alphanumeric tokens (e.g., “[a,!]”) will bubble from range_expansion without being handled in tests. Decide whether to:
    • treat invalid tokens as literals, or
    • keep raising and add tests asserting the exception.

Would you like me to add tests for “Device[a,!]” and similar mixed-token cases and adjust behavior accordingly?

Also applies to: 52-56

tests/unit/sdk/spec/test_object.py (1)

43-44: Strategy usage in fixtures looks good; avoid shallow-copy pitfalls.

Use deepcopy for root_location copies to prevent accidental shared nested mutations across fixtures.

Apply in each fixture that does location = root_location.copy():

-    location = root_location.copy()
+    import copy
+    location = copy.deepcopy(root_location)

Also applies to: 57-59, 61-72, 86-87, 101-102

infrahub_sdk/spec/object.py (2)

176-205: Exception policy and consistency for expand_data_with_ranges.

  • Don’t raise ValidationError here; this type is used as a top-level aggregator. Prefer raising ValueError and converting to ObjectValidationError in validate_format (accumulate errors) or let DataProcessor propagate a ValueError that validate_format catches and wraps.
  • Narrow the exception from except Exception to except ValueError to avoid swallowing unrelated errors.

Please confirm ValidationError’s constructor supports a single string; if not, the current raise will fail. If needed, I can patch validate_format to catch ValueError from the processor and convert it to aggregated ValidationError with position context.


414-418: Consistency: strategy-aware processing applied here only.

You applied the processor only for MANY_OBJ_DICT_LIST. With the changes above, creation paths and MANY_OBJ_LIST_DICT validation paths will also respect strategy, ensuring uniform behavior.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 34fda2d and a346243.

📒 Files selected for processing (5)
  • docs/docs/python-sdk/topics/object_file.mdx (5 hunks)
  • infrahub_sdk/spec/object.py (10 hunks)
  • infrahub_sdk/spec/range_expansion.py (1 hunks)
  • tests/unit/sdk/spec/test_object.py (6 hunks)
  • tests/unit/sdk/test_range_expansion.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

When implementing Infrahub checks, subclass InfrahubCheck and override validate(data); do not implement or rely on a check() method

Files:

  • tests/unit/sdk/test_range_expansion.py
  • tests/unit/sdk/spec/test_object.py
  • infrahub_sdk/spec/range_expansion.py
  • infrahub_sdk/spec/object.py
tests/unit/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place and write unit tests under tests/unit/ (isolated component tests)

Files:

  • tests/unit/sdk/test_range_expansion.py
  • tests/unit/sdk/spec/test_object.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use the custom pytest plugin (infrahub_sdk.pytest_plugin) fixtures for clients, configuration, and Infrahub-specific testing

Files:

  • tests/unit/sdk/test_range_expansion.py
  • tests/unit/sdk/spec/test_object.py
docs/**/*.{md,mdx}

📄 CodeRabbit inference engine (CLAUDE.md)

docs/**/*.{md,mdx}: Follow the Diataxis framework and classify docs as Tutorials, How-to guides, Explanation, or Reference
Structure How-to guides with required frontmatter and sections: introduction, prerequisites, step-by-step steps, validation, related resources
Structure Topics (Explanation) docs with introduction, concepts & definitions, background & context, architecture & design, connections, further reading
Use a professional, concise, informative tone with consistent structure across documents
Use proper language tags on all code blocks
Include both async and sync examples where applicable using the Tabs component
Add validation steps to guides to confirm success and expected outputs
Use callouts for warnings, tips, and important notes
Define new terms on first use and use domain-relevant terminology consistent with Infrahub’s model/UI
Conform to markdown style rules in .markdownlint.yaml and Vale styles in .vale/styles/

Files:

  • docs/docs/python-sdk/topics/object_file.mdx
🧬 Code graph analysis (3)
tests/unit/sdk/test_range_expansion.py (1)
infrahub_sdk/spec/range_expansion.py (1)
  • range_expansion (88-118)
tests/unit/sdk/spec/test_object.py (1)
infrahub_sdk/spec/object.py (6)
  • spec (701-704)
  • ObjectFile (697-719)
  • ObjectStrategy (49-51)
  • RelationshipDataFormat (38-46)
  • validate_format (260-279)
  • validate_format (712-716)
infrahub_sdk/spec/object.py (2)
infrahub_sdk/schema/__init__.py (2)
  • get (228-256)
  • get (523-563)
infrahub_sdk/exceptions.py (1)
  • ObjectValidationError (132-139)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: integration-tests-latest-infrahub
  • GitHub Check: documentation
🔇 Additional comments (3)
tests/unit/sdk/test_range_expansion.py (2)

62-64: LGTM: single-token bracket now literal.

Expectation update is correct with the tightened pattern.


105-106: LGTM: non-numeric words in brackets remain literal.

Matches new semantics; good coverage.

tests/unit/sdk/spec/test_object.py (1)

144-153: Nice: verifies NORMAL (no expansion) path.

Covers strategy default behavior well.

@codecov
Copy link

codecov bot commented Oct 7, 2025

Codecov Report

❌ Patch coverage is 88.88889% with 5 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
infrahub_sdk/spec/object.py 88.63% 5 Missing ⚠️
@@             Coverage Diff             @@
##           develop     #571      +/-   ##
===========================================
+ Coverage    76.06%   76.10%   +0.04%     
===========================================
  Files          102      102              
  Lines         9073     9108      +35     
  Branches      1794     1798       +4     
===========================================
+ Hits          6901     6932      +31     
- Misses        1685     1689       +4     
  Partials       487      487              
Flag Coverage Δ
integration-tests 35.12% <77.77%> (+0.15%) ⬆️
python-3.10 48.72% <84.44%> (+0.17%) ⬆️
python-3.11 48.70% <84.44%> (+0.15%) ⬆️
python-3.12 48.68% <84.44%> (+0.13%) ⬆️
python-3.13 48.68% <84.44%> (+0.13%) ⬆️
python-3.9 47.35% <84.44%> (+0.15%) ⬆️
python-filler-3.12 24.76% <0.00%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
infrahub_sdk/spec/range_expansion.py 88.57% <100.00%> (ø)
infrahub_sdk/spec/object.py 86.00% <88.63%> (+0.29%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@ogenstad ogenstad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Perhaps check with Wim if he had any further thoughts around the wording of strategy here.

def get_processor(cls, strategy: ObjectStrategy) -> DataProcessor:
processor_class = cls._processors.get(strategy)
if not processor_class:
raise ValueError(f"Unknown strategy: {strategy}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to have a test that shows when we get here too.

Hopefully this should be easy to find within an object file. If it's a huge object file it might be nice to have some additional context as to where the event happened. I.e. if you're running infrahubctl object load I'm wondering what the error would look like if you have an invalid strategy. Not a requirement for this PR but perhaps you can test for yourself and see what it looks like and determine for yourself. It might be that we have something elsewhere that assists with showing some context that I can't determine from this PR alone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ive added some better catching of the raised errors: a03008a#diff-0da3f5a26233feba6664ea12048ae5aa3c72c548a1bc8b1532225c6021b636a1R241

We will get a some/path: 1 validation error for InfrahubObjectFileData strategy if the wrong strategy is used for the enum via pydantic which is good.

Added a test for that to.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (4)
infrahub_sdk/spec/object.py (4)

283-296: Critical: Strategy not propagated to create_node.

The process() method calls create_node() without passing the strategy parameter (lines 288-295). However, create_node() doesn't accept a strategy parameter in its current signature (lines 472-481), and neither does create_related_nodes() (lines 590-601).

This breaks the strategy-based processing chain and will cause range expansion to be applied incorrectly in create_related_nodes (line 634), which hardcodes expand_data_with_ranges instead of using the processor factory.

To fix this issue completely, you need to:

  1. Add strategy parameter to create_node signature and propagate it through all its calls to validate_object and create_related_nodes
  2. Add strategy parameter to create_related_nodes signature and use DataProcessorFactory.get_processor(strategy) instead of hardcoding expand_data_with_ranges at line 634
  3. Update all call sites to pass the strategy

See the detailed diffs in the previous review comment for the complete fix.


384-404: Strategy not passed in ONE_OBJ validation.

The validate_object call at lines 394-402 doesn't include the strategy parameter, which means nested object validation won't respect the strategy setting.

Apply this diff:

             errors.extend(
                 await cls.validate_object(
                     client=client,
                     position=position,
                     schema=peer_schema,
                     data=data["data"],
                     context=context,
                     branch=branch,
                     default_schema_kind=default_schema_kind,
+                    strategy=strategy,
                 )
             )

435-457: Strategy not passed in MANY_OBJ_LIST_DICT validation.

The validate_object call at lines 447-455 doesn't include the strategy parameter, which means nested object validation won't respect the strategy setting.

Apply this diff:

                 errors.extend(
                     await cls.validate_object(
                         client=client,
                         position=position + [idx + 1],
                         schema=peer_schema,
                         data=item["data"],
                         context=context,
                         branch=branch,
                         default_schema_kind=default_schema_kind,
+                        strategy=strategy,
                     )
                 )

634-634: Critical: Hardcoded range expansion instead of using processor factory.

Line 634 directly calls expand_data_with_ranges(), which means the MANY_OBJ_DICT_LIST relationship will always apply range expansion regardless of the strategy setting. This should use DataProcessorFactory.get_processor(strategy) to respect the strategy.

However, since create_related_nodes doesn't accept a strategy parameter yet (see previous comment), this needs to be fixed as part of the broader strategy propagation fix.

🧹 Nitpick comments (1)
infrahub_sdk/spec/object.py (1)

266-267: Consider removing state mutation from validation method.

Lines 266-267 mutate self.data within a validation method. Validation methods should typically be side-effect-free and only report errors, not modify state. Consider either:

  1. Renaming the method to reflect its dual purpose (e.g., validate_and_process_format)
  2. Performing the data processing in a separate step before/after validation

This mutation could cause unexpected behavior if validation is called multiple times or if code depends on validation being non-destructive.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7555333 and a03008a.

📒 Files selected for processing (2)
  • infrahub_sdk/spec/object.py (11 hunks)
  • tests/unit/sdk/spec/test_object.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/sdk/spec/test_object.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

When implementing Infrahub checks, subclass InfrahubCheck and override validate(data); do not implement or rely on a check() method

Files:

  • infrahub_sdk/spec/object.py
🧬 Code graph analysis (1)
infrahub_sdk/spec/object.py (4)
infrahub_sdk/yaml.py (5)
  • data (147-150)
  • kind (157-158)
  • validate_content (60-61)
  • validate_content (160-163)
  • InfrahubFileKind (20-22)
infrahub_sdk/schema/__init__.py (2)
  • get (228-256)
  • get (523-563)
infrahub_sdk/exceptions.py (2)
  • ObjectValidationError (132-139)
  • ValidationError (117-129)
infrahub_sdk/spec/menu.py (2)
  • spec (25-28)
  • validate_content (30-34)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: unit-tests (3.11)
  • GitHub Check: unit-tests (3.12)
  • GitHub Check: unit-tests (3.13)
  • GitHub Check: unit-tests (3.9)
  • GitHub Check: unit-tests (3.10)
  • GitHub Check: integration-tests-latest-infrahub
🔇 Additional comments (9)
infrahub_sdk/spec/object.py (9)

5-5: LGTM!

The new imports (ABC, abstractmethod, ClassVar) are correctly added to support the abstract DataProcessor class and the DataProcessorFactory implementation.

Also applies to: 7-7


49-52: LGTM!

The ObjectStrategy enum is well-defined with clear values for normal and range expansion processing modes.


177-177: LGTM!

The updated docstring correctly documents the equal expansion length requirement that is enforced by the validation at line 196.


207-227: LGTM!

The strategy pattern implementation is clean and follows best practices:

  • Abstract base class defines the interface
  • Two concrete implementations provide the different behaviors
  • Clear separation of concerns

229-250: LGTM!

The factory implementation is solid:

  • Class-level registry keeps processors organized
  • Clear error message for unknown strategies
  • register_processor method enables future extensibility

254-254: LGTM!

The additions to InfrahubObjectFileData are well-designed:

  • Default strategy of NORMAL ensures backward compatibility
  • _get_processed_data helper encapsulates the processor logic cleanly

Also applies to: 257-260


307-307: LGTM!

The validate_object method correctly accepts the strategy parameter and propagates it to validate_related_nodes.

Also applies to: 357-357


415-418: LGTM!

The MANY_OBJ_DICT_LIST validation path correctly uses the DataProcessorFactory to get the appropriate processor and passes strategy to nested validation calls.

Also applies to: 430-430


704-707: LGTM!

The added error handling around InfrahubObjectFileData initialization improves the user experience by:

  • Catching initialization errors from invalid strategy values or other spec issues
  • Providing clear error messages with the file location as context

Also applies to: 714-717

…expansion_processor and add additional test for invalid strategy handling
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/unit/sdk/spec/test_object.py (1)

233-249: Consider moving the import to the top of the file.

The import on line 238 is placed inside the test function. While acceptable for testing purposes, moving it to the module level would be more consistent with Python conventions.

Note: The test manipulates the internal _processors class variable to test error handling. While this tests implementation details, it's acceptable for ensuring robust error handling in the factory. The cleanup in the finally block is appropriate.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a03008a and df0f3db.

📒 Files selected for processing (1)
  • tests/unit/sdk/spec/test_object.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

When implementing Infrahub checks, subclass InfrahubCheck and override validate(data); do not implement or rely on a check() method

Files:

  • tests/unit/sdk/spec/test_object.py
tests/unit/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place and write unit tests under tests/unit/ (isolated component tests)

Files:

  • tests/unit/sdk/spec/test_object.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use the custom pytest plugin (infrahub_sdk.pytest_plugin) fixtures for clients, configuration, and Infrahub-specific testing

Files:

  • tests/unit/sdk/spec/test_object.py
🧬 Code graph analysis (1)
tests/unit/sdk/spec/test_object.py (1)
infrahub_sdk/spec/object.py (8)
  • spec (702-708)
  • ObjectFile (698-726)
  • ObjectStrategy (49-51)
  • RelationshipDataFormat (38-46)
  • get_relationship_info (111-173)
  • validate_format (262-281)
  • validate_format (719-723)
  • DataProcessorFactory (229-249)
🔇 Additional comments (5)
tests/unit/sdk/spec/test_object.py (5)

8-8: LGTM!

The import of ObjectStrategy is correct and necessary for the new strategy-based processing functionality.


43-43: LGTM!

The explicit setting of ObjectStrategy.RANGE_EXPAND in expansion fixtures correctly enables the range expansion behavior for these test scenarios.

Also applies to: 57-57, 86-86, 101-101


61-72: LGTM!

The new no_location_expansion fixture provides good test coverage for the NORMAL strategy, ensuring that range patterns are not expanded when expansion is disabled.


144-153: LGTM!

The test provides good coverage for the NORMAL strategy, verifying that range patterns in data are preserved without expansion.


251-257: LGTM!

The test properly validates that invalid strategy values are rejected with appropriate Pydantic validation errors.

@minitriga minitriga merged commit 7fbc329 into develop Oct 9, 2025
20 checks passed
@minitriga minitriga deleted the atg-20251007-ifc-1878 branch October 9, 2025 09:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type/documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants