feat: exponential retry decorator #88
… normalize dict items in utils
…and configuration details
Pull Request Overview
This PR introduces a robust retry decorator with exponential backoff and rate-limit handling for both synchronous and asynchronous functions, enabling configurable retry behavior through environment variables and Kubernetes configuration.
- Implements a configurable retry decorator with exponential backoff, jitter, and rate-limit awareness
- Integrates retry settings into infrastructure configuration via Helm values and ConfigMaps
- Adds comprehensive test coverage for retry scenarios including rate-limit handling
Reviewed Changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| retry_decorator_test.py | Comprehensive test suite covering sync/async retry scenarios and rate-limit handling |
| utils.py | Utility functions for parsing rate-limit headers and extracting exception metadata |
| retry_decorator.py | Core retry decorator implementation with exponential backoff and rate-limit awareness |
| retry_decorator_settings.py | Pydantic settings model for configuring retry behavior via environment variables |
| pyproject.toml | Adds pytest-asyncio dependency for async test support |
| README.md | Documentation for retry decorator usage and configuration |
| values.yaml | Default retry configuration values for Helm deployment |
| configmap.yaml | ConfigMap template for retry decorator environment variables |
| deployment.yaml files | Integration of retry decorator ConfigMap into backend deployments |
| _helpers.tpl | Helm template helper for retry decorator ConfigMap naming |
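The table mentions utilities for parsing rate-limit headers. As a rough illustration only, the sketch below reads the standard HTTP `Retry-After` header, which may carry either a number of seconds or an HTTP date. Which headers the PR's `utils.py` actually parses is not stated here, so the function name `parse_retry_after` and its behavior are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(headers, now=None):
    """Return seconds to wait per a Retry-After header, or None if absent or invalid."""
    # Header names are case-insensitive, so compare in lowercase.
    value = next((v for k, v in headers.items() if k.lower() == "retry-after"), None)
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)  # delay-seconds form
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A retry loop could prefer this value over its computed backoff delay whenever the server supplies one.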
libs/rag-core-lib/src/rag_core_lib/impl/settings/retry_decorator_settings.py
libs/rag-core-lib/src/rag_core_lib/impl/utils/retry_decorator.py
…ecorator_settings.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…orator.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…cloud/rag-template into feat/exponential-retry-decorator
This pull request introduces enhanced configurability and reliability to the summarization workflow by adding granular retry and concurrency settings, and refactoring the summarizer to use them. The changes allow for more robust handling of transient failures and better control over resource usage.

**Configuration enhancements:**

* Added new retry-related fields (e.g., `max_retries`, `retry_base_delay`, `retry_max_delay`, `backoff_factor`, `attempt_cap`, `jitter_min`, `jitter_max`) to the `SummarizerSettings` class, allowing fine-grained control over retry behavior for summarization tasks. [[1]](diffhunk://#diff-ceade27a403894bb34e6c3c94bca8739203875d1037ce8896348b9eeb377dbcbL15-R31) [[2]](diffhunk://#diff-ceade27a403894bb34e6c3c94bca8739203875d1037ce8896348b9eeb377dbcbL26-R82)
* Fixed a typo in the `SummarizerSettings` field name from `maximum_concurrreny` to `maximum_concurrency`. [[1]](diffhunk://#diff-ceade27a403894bb34e6c3c94bca8739203875d1037ce8896348b9eeb377dbcbL15-R31) [[2]](diffhunk://#diff-ceade27a403894bb34e6c3c94bca8739203875d1037ce8896348b9eeb377dbcbL26-R82)

**Dependency injection and wiring:**

* Registered `RetryDecoratorSettings` in the dependency container and passed both summarizer and global retry settings to the `LangchainSummarizer` instance, enabling summarizer-specific overrides. [[1]](diffhunk://#diff-8b7c1816cb3e0a40b7965721c550eefdc184c5d914ec023e36527255613381e7R67) [[2]](diffhunk://#diff-8b7c1816cb3e0a40b7965721c550eefdc184c5d914ec023e36527255613381e7R90) [[3]](diffhunk://#diff-8b7c1816cb3e0a40b7965721c550eefdc184c5d914ec023e36527255613381e7L139-R143)

**Summarizer logic refactoring:**

* Refactored the summarization logic in `LangchainSummarizer` to:
  - Use asynchronous chunk summarization with concurrency control via a semaphore.
  - Implement retry logic with exponential backoff and jitter for chunk summarization, using the new settings for configuration.
  - Clean up error handling and remove redundant retry code in favor of the new decorator-based approach. [[1]](diffhunk://#diff-9793b1081628436dd7d5a0e37abc9d79ee5e25af3f5e784f99379249809ed8dbR3-R21) [[2]](diffhunk://#diff-9793b1081628436dd7d5a0e37abc9d79ee5e25af3f5e784f99379249809ed8dbR39-R47) [[3]](diffhunk://#diff-9793b1081628436dd7d5a0e37abc9d79ee5e25af3f5e784f99379249809ed8dbL68-R161)

These changes collectively improve the reliability, configurability, and maintainability of the summarization pipeline.

**Partly fixes the following issue:** #87

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
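The summarizer refactoring described in this commit message (semaphore-bounded asynchronous chunk summarization with retried calls) could look roughly like the sketch below. It is not the `LangchainSummarizer` code: `summarize_chunks` and `summarize_one` are hypothetical names, the defaults are illustrative, and `attempt_cap` is omitted because its exact meaning is not stated here.

```python
import asyncio
import random


async def summarize_chunks(chunks, summarize_one, maximum_concurrency=2,
                           max_retries=3, retry_base_delay=0.5, retry_max_delay=8.0,
                           backoff_factor=2.0, jitter_min=0.0, jitter_max=0.0):
    """Summarize chunks concurrently, at most `maximum_concurrency` at a time."""
    semaphore = asyncio.Semaphore(maximum_concurrency)

    async def guarded(chunk):
        # The slot is held while backing off, so a failing chunk keeps
        # throttling the overall request rate (a deliberate assumption here).
        async with semaphore:
            for attempt in range(max_retries + 1):
                try:
                    return await summarize_one(chunk)
                except Exception:
                    if attempt == max_retries:
                        raise  # retries exhausted
                    delay = min(retry_max_delay,
                                retry_base_delay * backoff_factor ** attempt)
                    await asyncio.sleep(delay + random.uniform(jitter_min, jitter_max))

    # gather preserves input order, so summaries line up with their chunks.
    return await asyncio.gather(*(guarded(chunk) for chunk in chunks))
```

Releasing the semaphore before sleeping would be an equally valid design if throughput matters more than easing pressure on the upstream service.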
This pull request introduces configurable retry behavior for the `StackitEmbedder`, allowing for fine-grained control of retry and backoff parameters via environment variables, Helm chart values, or code. The changes ensure that retry settings can be overridden per embedder instance, falling back to shared defaults when not specified. Documentation and dependency injection are updated to reflect this new flexibility.

**Embedder Retry Configuration**

* Added new optional retry-related fields (`max_retries`, `retry_base_delay`, `retry_max_delay`, `backoff_factor`, `attempt_cap`, `jitter_min`, `jitter_max`) to the `StackitEmbedderSettings` model, allowing per-embedder overrides of retry/backoff parameters. [[1]](diffhunk://#diff-0e502aa8b53287c8b12f5f4d053e9ae904620403c6c502df29f9673f4ae88d09R21-R34) [[2]](diffhunk://#diff-0e502aa8b53287c8b12f5f4d053e9ae904620403c6c502df29f9673f4ae88d09R46-R86)
* Updated the `StackitEmbedder` implementation to use a shared retry decorator with exponential backoff, resolving settings from both `StackitEmbedderSettings` and fallback `RetryDecoratorSettings`. The retry logic now handles OpenAI API errors and rate limits robustly. [[1]](diffhunk://#diff-7ebf8bf6adafb79699aea6bcd32de76398f9734e795f1d3c53cf524f1d69a5a1L4-R38) [[2]](diffhunk://#diff-7ebf8bf6adafb79699aea6bcd32de76398f9734e795f1d3c53cf524f1d69a5a1R63-R73) [[3]](diffhunk://#diff-7ebf8bf6adafb79699aea6bcd32de76398f9734e795f1d3c53cf524f1d69a5a1L72-R138)

**Dependency Injection and Configuration**

* Modified the dependency container to inject both `StackitEmbedderSettings` and `RetryDecoratorSettings` into the `StackitEmbedder`, supporting the new configuration pattern. [[1]](diffhunk://#diff-483b37f4ebbc24c973c3b170542171d90c65f3c6b68f1a6d598ce8964a94be7bR66) [[2]](diffhunk://#diff-483b37f4ebbc24c973c3b170542171d90c65f3c6b68f1a6d598ce8964a94be7bR93) [[3]](diffhunk://#diff-483b37f4ebbc24c973c3b170542171d90c65f3c6b68f1a6d598ce8964a94be7bL101-R103)
* Added corresponding environment variable keys to the Helm chart (`values.yaml`), enabling retry configuration via deployment configuration for both backend and adminBackend services. [[1]](diffhunk://#diff-673dd2d3d4e66a8fd4e45f9c1c9900711313f946bf8b6a89e96c954988fc14f3R195-R202) [[2]](diffhunk://#diff-673dd2d3d4e66a8fd4e45f9c1c9900711313f946bf8b6a89e96c954988fc14f3R325)

**Documentation Updates**

* Documented the new retry configuration mechanism in `libs/README.md`, explaining how override and fallback resolution works, and how to configure via environment variables and Helm chart values. [[1]](diffhunk://#diff-34194a117b05d75d22ca968cdb7d540839dc7a0eb33960fbca668b5a6ade87cbR11) [[2]](diffhunk://#diff-34194a117b05d75d22ca968cdb7d540839dc7a0eb33960fbca668b5a6ade87cbR103-R128)

**Addresses the following issue:** #87
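The override-and-fallback resolution mentioned in this commit message can be illustrated with a small sketch. The project uses Pydantic settings models; plain dataclasses stand in for them here so the example has no dependencies. The class names `RetryDefaults` and `EmbedderOverrides`, the helper `resolve_retry_settings`, and all default values are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class RetryDefaults:
    """Stand-in for the shared retry settings (values are illustrative)."""
    max_retries: int = 5
    retry_base_delay: float = 0.5
    retry_max_delay: float = 600.0
    backoff_factor: float = 2.0


@dataclass(frozen=True)
class EmbedderOverrides:
    """Stand-in for per-component settings; None means 'not overridden'."""
    max_retries: Optional[int] = None
    retry_base_delay: Optional[float] = None
    retry_max_delay: Optional[float] = None
    backoff_factor: Optional[float] = None


def resolve_retry_settings(overrides, defaults):
    """Prefer each per-component value; fall back to the shared default when None."""
    resolved = {}
    for field in fields(defaults):
        override = getattr(overrides, field.name, None)
        # Compare against None explicitly so a legitimate 0 still overrides.
        resolved[field.name] = getattr(defaults, field.name) if override is None else override
    return RetryDefaults(**resolved)
```

For example, `resolve_retry_settings(EmbedderOverrides(max_retries=2), RetryDefaults())` keeps the shared delays and factor while limiting that component to two retries.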
…ettings initialization
…try_decorator_settings
…r and StackitEmbedder settings
…and StackitEmbedder settings
This pull request introduces a robust, configurable retry decorator with exponential backoff and rate-limit handling, and integrates it across the RAG stack for both the embedder and summarizer components. The retry behavior is now centrally managed, with clear support for both global and per-component overrides via environment variables and Helm chart values. The documentation has been updated to explain configuration and usage, and the Helm templates and values have been extended to support the new settings.
**Retry decorator integration and configuration:**

* … (`retry_with_backoff`) in `rag-core-lib`, with support for both sync and async callables, rate-limit awareness, and extensive configuration via environment variables or Helm values. Documentation in `libs/README.md` details usage, configuration, and advanced features.
* … `infrastructure/rag/values.yaml` and related templates. [1] [2] [3] [4] [5] [6] [7]

**Embedder and summarizer retry logic:**

* `StackitEmbedder` (backend) and `LangchainSummarizer` (admin-backend) now both use the shared retry decorator, with per-component settings overriding global defaults as needed. This is documented in detail in `libs/README.md` and supported by new environment variable keys and Helm values. [1] [2]
* … (`DependencyContainer`) now wires the new `retry_decorator_settings` and passes it to the summarizer implementation, ensuring the retry logic is properly configured at runtime. [1] [2] [3]

**Documentation improvements:**

* … `libs/README.md` to include new sections describing the retry decorator, its configuration (including environment variables and Helm usage), and how the embedder and summarizer resolve their retry settings. [1] [2] [3] [4]
* … `libs/README.md`. [1] [2] [3] [4] [5]

**Settings and type improvements:**

* … `SummarizerSettings` to support optional retry-related fields, aligning with the new decorator's configuration model.

These changes centralize and standardize retry logic across the stack, making it easier to tune reliability and rate-limiting behavior per environment and per component.