-
Notifications
You must be signed in to change notification settings - Fork 7
fix(client): hangs when using batches with parallel or other batch #601
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ching relationships" This reverts commit de6cfbf.
WalkthroughThe changes refactor concurrent execution control in batch operations across the SDK. The initialization of Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Deploying infrahub-sdk-python with
|
| Latest commit: |
dd4cb6e
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://5923ec79.infrahub-sdk-python.pages.dev |
| Branch Preview URL: | https://fac-fix-create-batch.infrahub-sdk-python.pages.dev |
It would make clients hang when using batches within batches. Signed-off-by: Fatih Acar <fatih@opsmill.com>
8f0af3b to
dd4cb6e
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. @@ Coverage Diff @@
## stable #601 +/- ##
==========================================
- Coverage 75.80% 75.79% -0.01%
==========================================
Files 100 100
Lines 8918 8916 -2
Branches 1758 1758
==========================================
- Hits 6760 6758 -2
Misses 1677 1677
Partials 481 481
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
infrahub_sdk/client.py (1)
1517-1520: Add docstring to document batch behavior and concurrency semantics.The synchronous version (
InfrahubClientSync.create_batch, lines 2258-2266) includes a helpful docstring explaining thread pool execution and dependency constraints. The async version should have equivalent documentation explaining:
- That each batch creates its own semaphore with
max_concurrent_executionlimit- Concurrency implications when multiple batches execute simultaneously
- Any constraints around nesting batches or using batches with parallel operations
This maintains API consistency between async and sync clients per the dual client pattern.
Apply this diff to add documentation:
async def create_batch(self, return_exceptions: bool = False) -> InfrahubBatch: + """Create a batch to execute multiple queries concurrently. + + Each batch manages its own concurrency via a semaphore limited to max_concurrent_execution. + Multiple batches can run simultaneously, each respecting its own concurrency limit. + + Args: + return_exceptions: If True, exceptions from tasks are returned rather than raised. + + Returns: + InfrahubBatch: A batch object for scheduling and executing tasks concurrently. + """ return InfrahubBatch( max_concurrent_execution=self.max_concurrent_execution, return_exceptions=return_exceptions )Based on learnings
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
infrahub_sdk/client.py(1 hunks)infrahub_sdk/node/relationship.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
When implementing Infrahub checks, subclass InfrahubCheck and override validate(data); do not implement or rely on a check() method
Files:
infrahub_sdk/node/relationship.pyinfrahub_sdk/client.py
infrahub_sdk/client.py
📄 CodeRabbit inference engine (CLAUDE.md)
infrahub_sdk/client.py: Use HTTPX for transport with proxy support (single proxy or HTTP/HTTPS mounts)
Support authentication via API tokens or JWT with automatic refresh
Files:
infrahub_sdk/client.py
🧠 Learnings (1)
📚 Learning: 2025-08-24T13:35:12.998Z
Learnt from: CR
Repo: opsmill/infrahub-sdk-python PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T13:35:12.998Z
Learning: Maintain a dual client pattern: InfrahubClient (async) and InfrahubClientSync (sync) must provide identical interfaces
Applied to files:
infrahub_sdk/client.py
🧬 Code graph analysis (1)
infrahub_sdk/client.py (1)
infrahub_sdk/batch.py (1)
InfrahubBatch(55-89)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: unit-tests (3.12)
- GitHub Check: unit-tests (3.13)
- GitHub Check: integration-tests-latest-infrahub
- GitHub Check: Cloudflare Pages
🔇 Additional comments (2)
infrahub_sdk/client.py (1)
2258-2266: LGTM!The sync batch creation correctly instantiates
InfrahubBatchSyncwith themax_concurrent_executionparameter, and the docstring provides helpful guidance about thread pool execution and dependency constraints.infrahub_sdk/node/relationship.py (1)
168-178: Verify whether nested batching provides measurable benefit; consider making parallel=True optional based on relationship cardinality.The concern raised is valid:
parallel=Truedoes create nested batches. Thefilters()implementation confirms this—whenparallel=True, it internally creates its own batch (batch_process = await self.client.create_batch()) to parallelize pagination across result pages (seeprocess_batch()function).This results in:
- Outer batch (lines 168–178): parallelizes
filterscalls across node kinds- Inner batches (from
parallel=True): parallelizes pagination within each kindWhile this works correctly with per-batch semaphores, the nested structure may create more concurrent tasks than beneficial for typical relationship cardinalities. The trade-off between pagination parallelism and concurrent task overhead is unclear without profiling.
Consider:
- Profiling to measure if inner batch parallelization provides meaningful improvement for relationship sizes in typical use
- If benefit is marginal, making
parallel=Trueconditional (e.g., only when expected results exceed threshold)- Adding inline comment explaining the nested batching design choice
Summary by CodeRabbit