Fix worker monitor Valkey connection with Secrets Manager auth #6
…s connection handling
- Implemented a function to retrieve the Valkey auth token from AWS Secrets Manager with caching.
- Updated Redis connection logic to include the auth token and handle connection timeouts and errors more gracefully.
- Modified CloudFormation template to allow access to the Secrets Manager for the Valkey auth token.
- Enhanced error handling in various functions to improve robustness and logging for Redis operations.
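The connection handling described in this commit could look roughly like the sketch below. This is an illustration, not the code from the PR: the helper names `build_connection_kwargs` and `connect_to_valkey`, the timeout values, and the TLS setting are all assumptions. It relies on redis-py's `redis.Redis` keyword arguments `password`, `ssl`, `socket_connect_timeout`, and `socket_timeout`, and imports the library lazily so the argument builder can be checked without a live server.

```python
from typing import Any, Dict


def build_connection_kwargs(host: str, auth_token: str, port: int = 6379) -> Dict[str, Any]:
    """Assemble keyword arguments for redis.Redis (values are illustrative)."""
    if not auth_token:
        # Fail early instead of attempting an unauthenticated connection.
        raise RuntimeError("Failed to get Valkey auth token")
    return {
        "host": host,
        "port": port,
        "password": auth_token,
        "ssl": True,  # assumption: in-transit encryption is enabled
        "socket_connect_timeout": 5,  # seconds, hypothetical value
        "socket_timeout": 5,  # seconds, hypothetical value
        "decode_responses": True,
    }


def connect_to_valkey(host: str, auth_token: str):
    """Open a connection and verify it with PING; raise on failure."""
    import redis  # redis-py, imported lazily

    client = redis.Redis(**build_connection_kwargs(host, auth_token))
    try:
        client.ping()
    except (redis.ConnectionError, redis.TimeoutError) as exc:
        # Surface a single, descriptive error to the Lambda handler.
        raise RuntimeError(f"Could not connect to Valkey at {host}") from exc
    return client
```

Explicit socket timeouts keep a Lambda invocation from hanging until its own timeout when the cache endpoint is unreachable.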
@claude please review this PR

Claude finished @jfrench9's task: PR Review
💡 Codex Review
Here are some automated review suggestions for this pull request.
bin/lambda/worker_monitor.py
    secret_name = f"robosystems/{ENVIRONMENT}/valkey"

    try:
        response = secretsmanager.get_secret_value(SecretId=secret_name)
        secret_data = json.loads(response["SecretString"])
        _valkey_auth_token = secret_data.get("AUTH_TOKEN")
        logger.info(f"Retrieved Valkey auth token from {secret_name}")
Fetch Valkey token under wrong key from Secrets Manager
The Valkey secret created by CloudFormation stores the token under the key VALKEY_AUTH_TOKEN (and other parts of the codebase look up that key), but this function retrieves secret_data.get("AUTH_TOKEN"). That lookup always returns None, so _valkey_auth_token is never populated and get_redis_connection immediately raises RuntimeError("Failed to get Valkey auth token"), preventing the Lambda from ever connecting to Valkey and breaking all monitoring actions. The key name should match the secret structure (VALKEY_AUTH_TOKEN).
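One way the suggested fix could look is sketched below. The helper name `extract_valkey_token` and the wording of the error message are hypothetical; only the key name `VALKEY_AUTH_TOKEN` and the JSON-encoded `SecretString` come from the review above.

```python
import json


def extract_valkey_token(secret_string: str) -> str:
    """Return the token stored under VALKEY_AUTH_TOKEN in a JSON secret."""
    secret_data = json.loads(secret_string)
    token = secret_data.get("VALKEY_AUTH_TOKEN")
    if not token:
        # List only key names, never values, to avoid leaking secrets in logs.
        available = ", ".join(sorted(secret_data)) or "<none>"
        raise KeyError(
            f"VALKEY_AUTH_TOKEN not found in secret; keys present: {available}"
        )
    return token
```

In the reviewed function this would be called as `extract_valkey_token(response["SecretString"])`, so a mismatched key fails loudly instead of silently yielding `None`.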
…ed error handling
- Implemented TTL-based caching for Valkey auth token to reduce unnecessary calls to Secrets Manager.
- Updated logging to provide clearer information on token retrieval and errors.
- Enhanced error messages for better debugging when the VALKEY_AUTH_TOKEN key is missing.
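A minimal sketch of TTL-based token caching follows. It is not taken from the PR: the TTL value, the module-level variable names, and the injected `fetch_secret_string` and `now` callables are assumptions, chosen so the caching logic can be exercised without AWS access.

```python
import json
import time
from typing import Callable, Optional

CACHE_TTL_SECONDS = 300  # hypothetical TTL; the PR's actual value is not shown

_cached_token: Optional[str] = None
_cached_at: float = 0.0


def get_valkey_auth_token(
    fetch_secret_string: Callable[[], str],
    now: Callable[[], float] = time.monotonic,
) -> str:
    """Return the cached token, refetching once the TTL has elapsed."""
    global _cached_token, _cached_at
    if _cached_token is not None and now() - _cached_at < CACHE_TTL_SECONDS:
        return _cached_token  # still fresh: skip the Secrets Manager call

    secret_data = json.loads(fetch_secret_string())
    token = secret_data.get("VALKEY_AUTH_TOKEN")
    if not token:
        raise RuntimeError("VALKEY_AUTH_TOKEN key missing from Valkey secret")

    _cached_token, _cached_at = token, now()
    return token
```

In the Lambda, `fetch_secret_string` might be `lambda: secretsmanager.get_secret_value(SecretId=secret_name)["SecretString"]`, using the boto3 client and secret name from the diff. Module-level state survives across warm invocations, which is what lets the cache reduce calls; a TTL bounds how long a rotated token can remain stale.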

Summary
This bugfix resolves connection issues in the worker monitor by implementing proper Valkey authentication through AWS Secrets Manager and enhancing Redis connection handling mechanisms.
Key Changes
Technical Improvements
Infrastructure Considerations
Testing Notes
This fix addresses critical connectivity issues that were preventing the worker monitor from maintaining stable connections to the Valkey backend.
🤖 Generated with Claude Code
Branch Info:
bugfix/worker-monitor-valkey-conn → main

Co-Authored-By: Claude <noreply@anthropic.com>