Fix/azure container signal handling 6611 #6668

Open: wants to merge 8 commits into base: main

Conversation

tejas-dharani
Contributor

Why are these changes needed?

This PR fixes a critical issue where AutoGen's gRPC runtime host did not stay running in Azure Container Apps and other container environments. Calling stop_when_signal() caused applications to exit immediately with "RuntimeError: Host runtime is not started" instead of waiting for a shutdown signal.
The root cause is a limitation of Python's asyncio.loop.add_signal_handler() in container environments, particularly Azure Container Apps, where signal delivery behaves differently.
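
For readers who want to see the general shape of a workaround, here is a minimal sketch of waiting for a stop signal with a fallback when asyncio.loop.add_signal_handler() cannot be used. It is not the code in this PR: the names install_stop_handlers and wait_for_stop_signal are hypothetical, and falling back to signal.signal() is an assumption about one reasonable approach rather than a confirmed fix for Azure Container Apps.

```python
import asyncio
import signal


def install_stop_handlers(loop, stop_event, signals=(signal.SIGTERM, signal.SIGINT)):
    """Arrange for stop_event to be set when one of `signals` arrives.

    Hypothetical helper for illustration. Returns "asyncio" if the event
    loop accepted the handlers, or "signal" if the plain signal module
    was used as a fallback.
    """

    def _request_stop():
        stop_event.set()

    try:
        # Preferred path: let the event loop own the signal handlers.
        for sig in signals:
            loop.add_signal_handler(sig, _request_stop)
        return "asyncio"
    except (NotImplementedError, RuntimeError, ValueError):
        # Assumed fallback: register with the signal module and hop back
        # onto the loop thread before touching the asyncio.Event.
        # signal.signal() itself only works from the main thread.
        for sig in signals:
            signal.signal(
                sig,
                lambda signum, frame: loop.call_soon_threadsafe(_request_stop),
            )
        return "signal"


async def wait_for_stop_signal():
    """Block until SIGTERM or SIGINT is received."""
    loop = asyncio.get_running_loop()
    stop_event = asyncio.Event()
    install_stop_handlers(loop, stop_event)
    await stop_event.wait()
```

A host would await wait_for_stop_signal() after starting and then shut down. The sketch only covers signal registration; it does not address why the runtime reports that it is not started.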

Related issue number

Closes #6611

Checks

…on guide
- Fix version format from 0.4.0-dev-1 to 0.4.0-dev.1 for all packages
- Remove reference to non-existent Microsoft.AutoGen.Extensions package
- Add correct extension packages: Aspire, MEAI, and SemanticKernel
- Fix typo: RuntimeGatewway -> RuntimeGateway
- Improve documentation structure with clear section headers

Fixes microsoft#6244

Fix issue microsoft#6277 where TextMessage was used but not imported in three code cells
of the custom agents documentation, causing NameError when users run the examples.

Changes:
- Add TextMessage to imports in ArithmeticAgent section
- Add TextMessage to imports in GeminiAssistantAgent section
- Add TextMessage to imports in Declarative GeminiAssistantAgent section

The CountDownAgent section already had the correct import.

Fixes microsoft#6277
@shreayan98c

Hi @tejas-dharani, thank you for your contribution and for addressing issue #6611 with the stop_when_signal() method in containerized environments. I appreciate the detailed explanation of the root cause and the proposed fix.

I tested the changes in your PR by setting the required environment variables (DOCKER_CONTAINER=true / CONTAINER_APP_NAME) to signal that the environment is a Docker container. I also deployed the updated code, ensuring that AutoGen was installed from the branch containing your fix (fix/azure-container-signal-handling-6611). However, the issue persists, and the runtime host still exits prematurely with the same error: RuntimeError: Host runtime is not started.
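
To make the environment check reproducible, a small diagnostic like the following can be run inside the container before the host starts. It is only a sketch: container_hints is a hypothetical helper, and treating any non-empty value of DOCKER_CONTAINER or CONTAINER_APP_NAME as a container hint is an assumption based on the variable names mentioned above, not a description of how the PR's detection works.

```python
import os

# Variable names taken from the test description above; how the PR
# actually interprets them is an assumption.
CONTAINER_ENV_HINTS = ("DOCKER_CONTAINER", "CONTAINER_APP_NAME")


def container_hints(environ=None):
    """Return the container-related variables that are set and non-empty."""
    env = os.environ if environ is None else environ
    return {name: env[name] for name in CONTAINER_ENV_HINTS if env.get(name)}


if __name__ == "__main__":
    # Print what the process can actually see, for inclusion in a bug report.
    print(container_hints())
```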

Here are some additional observations from my testing:

  1. The stop_when_signal() method still seems to behave inconsistently in the Azure Container App environment.
  2. I verified that the environment variables were correctly set and accessible within the container.
  3. The runtime host starts successfully but does not remain active to listen for events.

Could you please review the changes in the PR to ensure that all edge cases in containerized environments are handled? Additionally, if there are any specific steps or configurations I might have missed during testing, please let me know.

Development

Successfully merging this pull request may close these issues.

The Autogen core distributed group chat sample not working when app deployed to Azure Container App