move RAG infrastructure into separate stack + fix general permissions issues#2

Merged
colinmxs merged 26 commits into main from develop
Jan 29, 2026

Conversation

@colinmxs
Contributor

No description provided.

…D pipeline

- Add GitHub Actions workflow for RAG ingestion stack build, test, and deployment
- Implement CDK stack for RAG ingestion infrastructure with ECS Fargate service
- Add comprehensive deployment guide and implementation documentation
- Create build and deployment scripts for Docker image and CDK stack management
- Add unit tests for CDK configuration and RAG ingestion stack
- Configure environment variables and secrets management for AWS deployment
- Implement Docker image build with ARM64 support for Lambda compatibility
- Add caching strategies for Python packages and node_modules in CI/CD pipeline
- Enable concurrent workflow management with proper concurrency controls
- Integrate AWS credential configuration and ECR image push capabilities
…flow

- Remove Python packages cache save step from GitHub Actions workflow
- Remove Python dependency installation logic from install.sh script
- Update install.sh description to reflect Node.js-only dependency installation
- Add note clarifying that Python dependencies are installed inside Docker container
- Simplify CI workflow by delegating Python setup to container build process
- Reduce CI execution time by eliminating redundant Python package caching
…validation

- Update Lambda handler verification to provide more detailed error messages
- Change handler check from informational to error-level when not found at expected path
- Add handler module import verification as fallback validation method
- Update Python package verification to check for boto3, docling, tiktoken, and transformers
- Add diagnostic output to identify which packages are missing when validation fails
- Improve error handling and reporting for better debugging of Docker image issues
- Update description to clarify validation checks structure rather than runtime
- Replace Lambda handler file existence check with Python3 runtime verification
- Simplify package validation by removing individual package import checks
- Update CMD configuration verification to use docker inspect instead of runtime execution
- Add note explaining Lambda function image validation approach differs from web services
- Reduce complexity and improve test reliability by focusing on image structure validation
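
The `docker inspect` check described above can be sketched roughly as follows. This is a minimal illustration, not the script from this PR: the function names and the handler string used in the usage note are hypothetical. It reads the image's CMD from metadata, so no container has to start.

```python
from __future__ import annotations

import json
import subprocess


def parse_cmd(inspect_output: str) -> list[str]:
    """Parse the JSON printed by `docker inspect --format '{{json .Config.Cmd}}'`."""
    cmd = json.loads(inspect_output)
    # Docker prints `null` when the image defines no CMD.
    return cmd or []


def image_cmd(image: str) -> list[str]:
    """Read an image's CMD from its metadata without running the image."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{json .Config.Cmd}}", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_cmd(result.stdout)


def cmd_matches(cmd: list[str], expected_handler: str) -> bool:
    """True when CMD is exactly the expected handler entry point."""
    return cmd == [expected_handler]
```

A validation step could then call `cmd_matches(image_cmd("my-image:tag"), "handler.lambda_handler")`, where both the image tag and the handler name are placeholders.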
…umentation

- Add comprehensive MIGRATION_GUIDE.md with step-by-step instructions for switching from old RAG resources to new RagIngestionStack
- Add MIGRATION_IMPLEMENTATION.md detailing technical implementation specifics and code changes required
- Add READY_TO_DEPLOY.md confirming deployment readiness and pre-flight checks
- Update app-api-stack.ts to import RAG resources from SSM parameters instead of hardcoded references
- Document environment variable mappings, IAM permission updates, and verification procedures
- Provide rollback and cleanup instructions for safe migration path
…eywords

- Change inclusion setting from "always" to "manual" for better content curation
- Add comprehensive keywords covering CI/CD, GitHub Actions, infrastructure, and deployment tools
- Include keywords for IaC tools (Terraform, CDK, CloudFormation) and AWS services (ECS, Fargate, Lambda, S3, CloudFront, ALB, VPC, SSM, ECR)
- Improve discoverability and categorization of DevOps documentation
…bleArn

- Replace fromTableAttributes with fromTableArn for cleaner DynamoDB table import
- Remove redundant tableName parameter from table import configuration
- Reduce boilerplate code while maintaining the same functionality
- Simplify the table reference to require only the ARN, which is the essential identifier
…tables

- Add explicit Query and Scan permissions for assistants table GSI indexes
- Add explicit Query and Scan permissions for RAG assistants table GSI indexes
- Grant permissions on index/* resources to enable GSI operations
- Fix missing GSI permissions that grantReadWriteData does not include by default
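
For context on the GSI fix: IAM treats a table's indexes as separate resources whose ARNs have the form `<table-arn>/index/<index-name>`, so a grant on the table ARN alone does not cover index queries. A minimal sketch of the extra statement, written as a plain policy dictionary (the ARN in the usage note is made up, and the stack in this PR is CDK TypeScript rather than Python):

```python
def gsi_statement(table_arn: str) -> dict:
    """Build an IAM statement allowing Query and Scan on every index of a table."""
    return {
        "Effect": "Allow",
        "Action": ["dynamodb:Query", "dynamodb:Scan"],
        # Indexes are addressed as sub-resources of the table ARN.
        "Resource": [f"{table_arn}/index/*"],
    }
```

For example, `gsi_statement("arn:aws:dynamodb:us-east-1:123456789012:table/assistants")` yields a statement scoped to `.../table/assistants/index/*`.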
…iStack

- Remove unused Lambda and S3 notifications imports
- Remove path and S3_GRANT_WRITE_WITHOUT_ACL imports no longer needed
- Delete DockerImageFunction for document ingestion pipeline
- Remove all associated IAM permissions and S3 event trigger configuration
- Remove Lambda function ARN output
- Add note that Lambda function is now created in RagIngestionStack
- Simplify AppApiStack by moving ingestion logic to a dedicated stack
…configuration

- Add IAM policy statement for DynamoDB users table access to runtime execution role
- Include permissions for GetItem, PutItem, UpdateItem, Query, and Scan operations
- Add GSI (Global Secondary Index) permissions for users table
- Import users table ARN from SSM parameter store (App API Stack reference)
- Add DYNAMODB_USERS_TABLE_NAME environment variable to Lambda runtime configuration
- Import users table name from SSM parameter store for runtime access
- Enable inference API Lambda to interact with users table for authentication and user data operations
…sponses

- Update docstrings to clarify return values and exception behavior for assistant retrieval and share access check methods
- Refactor error handling in `_get_assistant_cloud_without_ownership_check` to raise exceptions for real DynamoDB errors instead of silently returning None
- Refactor error handling in `check_share_access` to raise exceptions for real DynamoDB errors while still returning False for ResourceNotFoundException
- Distinguish between table not found (ResourceNotFoundException) and other DynamoDB errors (AccessDeniedException, etc.)
- Add assistant existence check in chat invocation endpoint to differentiate between 404 (not found) and 403 (access denied) responses
- Improve logging with contextual information including user email and assistant ID for better debugging
- Add emoji indicators in logs to quickly identify error types (❌ for 404, 🔒 for 403)
- Pass user email to access check function for more detailed error context
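
The error-handling pattern described in this commit can be sketched as below. This is an illustration under assumptions, not the code from `assistant_service.py`: `ClientError` here is a local stand-in that mirrors the `response` shape of `botocore.exceptions.ClientError`, and the `lookup` callable and `chat_status` helper are hypothetical.

```python
class ClientError(Exception):
    """Local stand-in mirroring botocore.exceptions.ClientError's `response` dict."""

    def __init__(self, error_response: dict, operation_name: str):
        super().__init__(f"{operation_name}: {error_response}")
        self.response = error_response
        self.operation_name = operation_name


def check_share_access(lookup, assistant_id: str, user_email: str) -> bool:
    """Return False when the table is missing; re-raise every other DynamoDB error."""
    try:
        return bool(lookup(assistant_id, user_email))
    except ClientError as err:
        if err.response["Error"]["Code"] == "ResourceNotFoundException":
            return False
        # AccessDeniedException and similar are real failures, not "no access".
        raise


def chat_status(assistant_exists: bool, has_access: bool) -> int:
    """Pick the HTTP status: 404 if the assistant is missing, 403 if access is denied."""
    if not assistant_exists:
        return 404
    if not has_access:
        return 403
    return 200
```

Checking existence before access is what lets the endpoint tell a missing assistant apart from one the caller may not use.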
…ccess

- Add IAM permissions for DynamoDB assistants table operations (GetItem, PutItem, UpdateItem, Query, Scan)
- Include GSI query permissions for assistants table indexes
- Add read-only S3 Vector Store permissions for RAG queries (GetVector, QueryVectors, GetIndex, ListIndexes)
- Add Bedrock permissions for generating query embeddings using Titan Embed Text v2 model
- Configure environment variables for assistants table name, vector store bucket, and vector index name
- Import assistants and vector store resources from RagIngestionStack via SSM parameters
- Document that S3 documents bucket access is not needed by inference API (only used during ingestion)
…ent configuration

- Add IAM policy statement for AppRoles table read permissions (GetItem, Query, Scan)
- Include GSI index permissions for AppRoles table queries
- Add DYNAMODB_APP_ROLES_TABLE_NAME environment variable to Lambda runtime
- Retrieve AppRoles table ARN and name from SSM Parameter Store
- Include clarifying comment that inference API has read-only access to tool definitions and roles
- Align with existing pattern of importing DynamoDB table configurations from other stacks
…anagement

- Add AGENTCORE_MEMORY_TYPE environment variable set to 'dynamodb'
- Add AGENTCORE_MEMORY_ID environment variable referencing memory table ID
- Configure AWS Bedrock AgentCore to use DynamoDB for session state persistence
- Enable proper session management through AgentCore memory integration
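
On the consuming side, the two variables named above might be read and validated like this. It is a sketch only: the variable names come from the commit message, while `MemoryConfig` and `load_memory_config` are hypothetical and the application's actual loading code is not shown in this PR.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class MemoryConfig:
    memory_type: str
    memory_id: str


def load_memory_config(env: Mapping[str, str] = os.environ) -> MemoryConfig:
    """Read the AgentCore memory settings, failing fast if either one is missing."""
    memory_type = env.get("AGENTCORE_MEMORY_TYPE", "")
    memory_id = env.get("AGENTCORE_MEMORY_ID", "")
    missing = [
        name
        for name, value in (
            ("AGENTCORE_MEMORY_TYPE", memory_type),
            ("AGENTCORE_MEMORY_ID", memory_id),
        )
        if not value
    ]
    if missing:
        raise ValueError(f"missing environment variables: {', '.join(missing)}")
    return MemoryConfig(memory_type=memory_type, memory_id=memory_id)
```

Passing the environment as a parameter keeps the function testable without mutating `os.environ`.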
…scovery

- Add 'bedrock-agentcore:GetMemory' action to MemoryAccess IAM policy statement
- Enable AgentCore to discover available memory strategies during session initialization
- Complements existing memory management permissions (CreateEvent, RetrieveMemory, ListEvents)
…ro title generation

- Add IAM policy statement granting bedrock:InvokeModel action to task role
- Configure access to Amazon Nova Micro foundation model for title generation
- Include both direct foundation model and inference profile ARNs for Nova Micro v1
- Enable AI-powered title generation capability for the application
- Add AGENTCORE_MEMORY_TYPE and AGENTCORE_MEMORY_ID environment variables to task definition
- Import memory configuration from InferenceApiStack via SSM parameters
- Grant bedrock-agentcore permissions to task role for memory operations
- Add support for GetMemory, CreateEvent, RetrieveMemory, ListEvents, ListMemorySessions, GetMemorySession, and DeleteMemorySession actions
- Enable app API to interact with AgentCore memory for session management and event tracking
- Document cross-service dependency issue between Inference API and App API
- Detail impact of tight coupling on deployment, testing, and maintenance
- List affected modules and their usage patterns
- Provide four recommended solutions with pros/cons analysis
- Include action items for architectural refactoring
- Document fixed issues (DynamoDB GSI permissions, silent failures)
- Establish process for tracking future technical debt
- assistant_service.py: Kept improved error handling with proper exception propagation
- inference_api routes.py: Combined better error messages (404 vs 403) with mark_share_as_interacted from main
- app-api-stack.ts: Kept GSI permissions and S3 Vectors permissions, removed duplicate RAG resource imports (now handled in RagIngestionStack)
- Add fast-check ^3.23.2 as dev dependency for property-based testing
- Add pure-rand ^6.1.0 as transitive dependency for random data generation
- Update package-lock.json with new dependency entries and checksums
- Enable property-based testing capabilities for improved test coverage
- assistant_service.py: Kept improved error handling with proper exception propagation
- inference_api routes.py: Combined better error messages (404 vs 403) with mark_share_as_interacted from main
- app-api-stack.ts: Kept GSI permissions and S3 Vectors permissions, removed duplicate RAG resource imports (now handled in RagIngestionStack)
…management

- Add bedrock-agentcore:DeleteEvent permission to inference API role
- Enable deletion of memory events in AgentCore memory management
- Complements existing GetMemory, CreateEvent, and RetrieveMemory permissions
…region access

- Change foundation model ARN from region-specific to wildcard (*) for global access
- Update inference profile ARN to use wildcard region to support cross-region invocations
- Improve flexibility by allowing Nova Micro model access from any AWS region
- Maintain account-scoped inference profile for security while enabling regional flexibility
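
The ARN shapes involved can be sketched as follows. Foundation model ARNs carry no account ID, while inference profile ARNs do, which is why only the latter stays account-scoped. The `us` profile prefix and the account ID in the usage note are assumptions for illustration; the commit does not state which profile is used.

```python
NOVA_MICRO_MODEL_ID = "amazon.nova-micro-v1:0"


def nova_micro_resources(account_id: str, profile_prefix: str = "us") -> list:
    """Return wildcard-region ARNs for the model and its inference profile."""
    return [
        # Foundation models are AWS-owned, so the account field is empty.
        f"arn:aws:bedrock:*::foundation-model/{NOVA_MICRO_MODEL_ID}",
        # Inference profiles live in the caller's account.
        f"arn:aws:bedrock:*:{account_id}:inference-profile/"
        f"{profile_prefix}.{NOVA_MICRO_MODEL_ID}",
    ]


def title_generation_statement(account_id: str) -> dict:
    """Build an IAM statement allowing InvokeModel on Nova Micro in any region."""
    return {
        "Effect": "Allow",
        "Action": ["bedrock:InvokeModel"],
        "Resource": nova_micro_resources(account_id),
    }
```

Calling `title_generation_statement("123456789012")` produces a statement whose resources use `*` in the region field while the inference profile stays tied to that account.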
…re memory access

- Add bedrock-agentcore:RetrieveMemoryRecords permission to app-api-stack
- Add bedrock-agentcore:RetrieveMemoryRecords permission to inference-api-stack
- Enable retrieval of detailed memory records from AgentCore memory sessions
- Complements existing RetrieveMemory permission for comprehensive memory access
…y and record management

- Add GetMemoryStrategies permission for memory configuration discovery in both stacks
- Add ListMemoryRecords permission for querying memory records in both stacks
- Add BatchDeleteMemoryRecords permission to app-api-stack for memory cleanup operations
- Remove DeleteMemorySession permission from inference-api-stack (runtime read-only access)
- Reorganize permissions with inline comments for clarity (memory config, event ops, retrieval, sessions)
- Align permission sets between app-api and inference-api stacks for consistency
- Enable more granular memory management capabilities for AgentCore integration
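
One way to keep the two permission sets from drifting is to declare them side by side and diff them. The sets below are reconstructed only from the action names mentioned in the commit messages on this page, so they are an assumption and may not match the final stacks.

```python
PREFIX = "bedrock-agentcore:"

# Actions the commit messages describe as granted in both stacks (assumed).
SHARED = {
    "GetMemory",
    "GetMemoryStrategies",
    "CreateEvent",
    "ListEvents",
    "RetrieveMemory",
    "RetrieveMemoryRecords",
    "ListMemoryRecords",
}

# Session management and cleanup are described only for the app API (assumed).
APP_API = SHARED | {
    "ListMemorySessions",
    "GetMemorySession",
    "DeleteMemorySession",
    "BatchDeleteMemoryRecords",
}

# DeleteEvent is described only for the inference API role (assumed).
INFERENCE_API = SHARED | {"DeleteEvent"}


def qualified(actions) -> list:
    """Prefix bare action names with the service namespace, sorted for stable output."""
    return sorted(PREFIX + action for action in actions)


def only_in(left: set, right: set) -> list:
    """List the actions granted in `left` but not in `right`."""
    return sorted(left - right)
```

A unit test asserting on `only_in(INFERENCE_API, APP_API)` would flag any unintended difference between the two roles.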
@colinmxs colinmxs merged commit 153cdf3 into main Jan 29, 2026
53 of 54 checks passed