perf: retry logic #138

Draft: wants to merge 5 commits into base: dev

Shreyas-Microsoft
Collaborator

Purpose

  • ...
    This pull request introduces retry logic and enhanced error handling for agent communication in the CommsManager class and integrates these improvements into the convert_script function. Key changes include adding retry mechanisms, improving JSON parsing for robustness, and updating logging and error reporting.

Enhancements to Communication and Retry Logic:

  • Introduced retry logic in CommsManager with configurable parameters such as max_retries, initial_delay, and backoff_factor. Added the async_invoke method to handle retries and error handling during agent communication. [1] [2]
  • Updated convert_script to use the new CommsManager for group chat operations, replacing direct calls to chat with comms_manager.group_chat. This ensures retry logic is applied to all communication. [1] [2] [3]
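As a rough illustration of the retry behavior described above, here is a minimal sketch of exponential backoff driven by max_retries, initial_delay, and backoff_factor. The class name CommsManagerSketch, the default values, and the signature of async_invoke are assumptions made for this example; the actual CommsManager in this PR may differ.

```python
import asyncio
import logging
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class CommsManagerSketch:
    """Hypothetical stand-in for illustration; not the CommsManager from this PR."""

    # Class-level logger, mirroring the description above.
    logger = logging.getLogger("CommsManagerSketch")

    def __init__(
        self,
        max_retries: int = 3,         # assumed default
        initial_delay: float = 1.0,   # seconds, assumed default
        backoff_factor: float = 2.0,  # assumed default
    ) -> None:
        self.max_retries = max_retries
        self.initial_delay = initial_delay
        self.backoff_factor = backoff_factor

    async def async_invoke(self, operation: Callable[[], Awaitable[T]]) -> T:
        """Await operation(), retrying failures with exponential backoff."""
        delay = self.initial_delay
        # One initial attempt plus up to max_retries retries.
        for attempt in range(self.max_retries + 1):
            try:
                return await operation()
            except Exception as exc:
                if attempt == self.max_retries:
                    self.logger.error("Giving up after %d attempts: %s", attempt + 1, exc)
                    raise
                self.logger.warning(
                    "Attempt %d failed (%s); retrying in %.2fs", attempt + 1, exc, delay
                )
                await asyncio.sleep(delay)
                delay *= self.backoff_factor
        # Only reachable if max_retries is negative, so no attempt was made.
        raise RuntimeError("max_retries must be >= 0")
```

With the assumed defaults, a call that keeps failing waits 1s, 2s, and 4s between its four attempts before the last exception is re-raised. A production version would likely catch only transient error types rather than every Exception.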

Improved Error Handling:

  • Added safe JSON parsing in convert_script to handle malformed or incomplete JSON responses from agents. Fallback values are used to avoid crashes.
  • Enhanced error logging and reporting in convert_script. Critical errors during communication now create logs in the batch service and send error status updates to the client.
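The safe JSON parsing mentioned above could look roughly like the sketch below. The helper name safe_json_loads and the fallback keys in the usage note are hypothetical; the PR does not show which fields convert_script expects or how it extracts them.

```python
import json
from typing import Any, Dict


def safe_json_loads(raw: str, fallback: Dict[str, Any]) -> Dict[str, Any]:
    """Parse a JSON object out of an agent reply, returning a copy of fallback on failure.

    Hypothetical helper: it keeps only the text between the first "{" and the
    last "}", so prose or code fences wrapped around the object are ignored.
    """
    if not raw:
        return dict(fallback)
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end <= start:
        # No complete object present (for example, a truncated reply).
        return dict(fallback)
    try:
        parsed = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return dict(fallback)
    # Defensive check: callers rely on receiving a dict.
    return parsed if isinstance(parsed, dict) else dict(fallback)
```

A caller might then write `result = safe_json_loads(reply_text, {"status": "unknown"})`, where both `reply_text` and the fallback key are placeholders. Returning a copy of the fallback keeps callers from mutating a shared default.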

Codebase Improvements:

  • Added utility imports and a class-level logger to CommsManager for better debugging and maintainability. [1] [2]
  • Removed temporary asyncio.sleep calls in both convert_script and process_batch_async as they are no longer needed with the new retry logic. [1] [2]

Does this introduce a breaking change?

  • Yes
  • No

Golden Path Validation

  • I have tested the primary workflows (the "golden path") to ensure they function correctly without errors.

Deployment Validation

  • I have validated the deployment process successfully and all services are running as expected with this change.

What to Check

Verify that the following are valid

  • ...

Other Information
