A vendor-agnostic AI proxy service built with NestJS and TypeScript. Features retry mechanisms, structured logging, configuration validation, and vendor abstraction for seamless AI provider integration.
- Vendor-Agnostic Architecture - Provider Abstraction Layer (Strategy + Factory + Adapter) supports multiple AI providers (AWS, Azure, Google, OpenAI)
- Enterprise-Grade Reliability - Intelligent retry logic with exponential backoff and jitter
- Configuration Validation - Comprehensive schema validation with class-validator decorators
- Structured Logging - JSON-formatted logs with performance tracking and correlation IDs
- Enhanced Error Handling - Hierarchical exception system with retryability classification
- Async System Prompt Loading - File-based caching with mtime validation for optimal performance
- AWS Bedrock Integration - Support for Claude, Titan, and other foundation models
- Specialized Safety Analysis - Dedicated incident report analysis with expert safety prompts
- Model Discovery - Comprehensive endpoint for provider and model information with pricing
- Configurable Model Settings - Per-operation model, token, and temperature configuration
- Multi-Provider Support - Ready for Azure OpenAI, Google Vertex AI, and OpenAI integration
- Input Validation - Comprehensive DTO validation with security constraints
- Rate Limiting - Configurable throttling (100 requests/minute by default)
- Health Monitoring - Detailed health checks with provider status
- Security Hardening - CORS, Helmet, input sanitization, and XSS protection
- Performance Metrics - Operation tracking with success rates and latency percentiles
- API Documentation - Interactive Swagger/OpenAPI documentation
graph TB
Client[Client Applications] --> Controller[ProxyController]
Controller --> Factory[AIServiceFactory]
Factory --> |Provider Selection| BedrockSvc[BedrockService]
Factory --> |Future| AzureSvc[AzureOpenAIService]
Factory --> |Future| GoogleSvc[GoogleVertexAIService]
Factory --> |Future| OpenAISvc[OpenAIService]
BedrockSvc --> |AWS SDK| Bedrock[AWS Bedrock]
AzureSvc --> |Azure SDK| AzureAI[Azure OpenAI]
GoogleSvc --> |Google SDK| VertexAI[Vertex AI]
OpenAISvc --> |OpenAI SDK| OpenAI[OpenAI API]
Controller --> |Incident Analysis| SystemPrompt[System Prompt Loader]
SystemPrompt --> |Config File| PromptFile[incident-report-system-prompt.md]
subgraph "Cross-Cutting Concerns"
Logger[Enhanced Logger]
Retry[Retry Service]
ConfigValidator[Config Validator]
end
BedrockSvc -.-> Logger
BedrockSvc -.-> Retry
Factory -.-> ConfigValidator
This codebase intentionally implements the Adapter Pattern (combined with Strategy + Abstract Factory) to provide a uniform AI invocation interface while isolating vendor-specific SDK details. Although classes are named `*Service` to align with NestJS conventions and domain language, each concrete provider service is an adapter that translates between the unified contract and the vendor SDK:

| Pattern Role | Implementation |
|---|---|
| Target Interface | `AIServiceInterface` + abstract base `BaseAIService` |
| Adapters | `BedrockService` (and future `AzureOpenAIService`, `GoogleVertexAIService`, `OpenAIService`, `OllamaService`) |
| Adaptees | External provider SDKs / HTTP APIs (AWS Bedrock, Azure OpenAI, etc.) |
| Client | `ProxyController` (and any other application-layer components) |
| Factory / Creator | `AIServiceFactory` (selects + caches the appropriate adapter) |
- The controller receives a normalized request DTO (`PromptRequestDto`).
- `AIServiceFactory` returns the appropriate provider service (adapter) based on configuration or request metadata.
- The adapter maps the unified request into provider-specific SDK parameters (e.g., model IDs, token/temperature keys, safety settings).
- The provider response (which may vary widely in shape) is transformed back into a standardized `PromptResponse` structure with consistent `usage`, `modelId`, and metadata.
- Cross-cutting concerns (retry, logging, incident analysis prompt composition) are applied uniformly in `BaseAIService`.
- Follows the NestJS ecosystem convention (`SomethingService`) for readability and DI familiarity.
- Maintains domain-centric language ("Bedrock Service" vs. "Bedrock Adapter").
- Avoids redundancy: architectural intent is documented in DESIGN-PATTERNS.md and SERVICE-ARCHITECTURE.md.
- Leaves room for provider-specific enhancements beyond pure adaptation (e.g., pre-processing, safety tuning, cost controls).
- Create `NewProviderService extends BaseAIService`.
- Implement the abstract methods: `invokeModel`, `getProviderName`, `healthCheck`, plus optional overrides.
- Add a provider definition in `provider.types.ts` (`PROVIDER_DEFINITIONS`).
- Register an instantiation branch in `AIServiceFactory`.
- Add a model mapping file under `config/` if needed.
- Document usage and (optionally) add tests for the translation logic.
The combination of Strategy (runtime selection), Adapter (request/response translation), and Abstract Factory (centralized creation) yields a pluggable, testable, and vendor-neutral architecture.
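To make these steps concrete, here is a minimal sketch of what a new provider adapter could look like. Everything here is illustrative: the import paths and the `NewProviderService` name are hypothetical, and the abstract members shown are only those listed for `BaseAIService` in this README.

```typescript
import { Injectable } from '@nestjs/common';
import { BaseAIService } from './base-ai.service';
import { PromptRequestDto } from '../dto/prompt-request.dto';
import { PromptResponse } from '../dto/prompt-response.dto';
import { ProviderInfo, ProviderName } from '../types/provider.types';

@Injectable()
export class NewProviderService extends BaseAIService {
  getProviderName(): ProviderName {
    // The new name would also be added to PROVIDER_DEFINITIONS in provider.types.ts.
    return 'newprovider' as ProviderName;
  }

  getAvailableProviders(): ProviderInfo[] {
    // Would return this provider's model catalog (typically from a config/ mapping file).
    return [];
  }

  async invokeModel(request: PromptRequestDto): Promise<PromptResponse> {
    // 1. Translate the unified request into provider-specific SDK parameters.
    // 2. Call the provider SDK / HTTP API.
    // 3. Map the provider-specific response back into the standardized
    //    PromptResponse shape (response text, modelId, usage).
    throw new Error('Not implemented yet');
  }

  async healthCheck(): Promise<{ status: 'healthy' | 'unhealthy' }> {
    // A lightweight connectivity/credentials probe against the provider.
    return { status: 'healthy' };
  }
}
```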
src/
├── main.ts                              # Application entry point
├── app.module.ts                        # Root module with P2 enhancements
├── common/                              # Shared services and utilities
│   ├── logger.service.ts                # Custom logging service
│   ├── enhanced-logger.service.ts       # Structured logging with performance tracking
│   ├── retry.service.ts                 # Intelligent retry with exponential backoff
│   ├── config.schema.ts                 # Configuration validation schemas
│   └── logging.interceptor.ts           # HTTP request/response logging
├── config/                              # Configuration files
│   ├── incident-report-system-prompt.md # Expert safety analyst prompt
│   ├── aws-model-mappings.json          # AWS Bedrock model mappings
│   ├── azure-model-mappings.json        # Azure OpenAI model mappings (future)
│   ├── google-model-mappings.json       # Google Vertex AI mappings (future)
│   └── openai-model-mappings.json       # OpenAI model mappings (future)
└── proxy/
    ├── proxy.module.ts                  # Proxy feature module
    ├── proxy.controller.ts              # REST API endpoints
    ├── dto/                             # Data transfer objects
    │   ├── prompt-request.dto.ts        # General prompt request validation
    │   ├── prompt-response.dto.ts       # Standardized response interface
    │   ├── incident-report-feedback.dto.ts  # Incident report validation
    │   ├── providers-response.dto.ts    # Provider/model information
    │   └── health-response.dto.ts       # Health check responses
    ├── exceptions/                      # Custom exception hierarchy
    │   ├── bedrock.exceptions.ts        # Bedrock-specific exceptions
    │   └── provider.exceptions.ts       # Base provider exception system
    ├── interfaces/
    │   └── ai-service.interface.ts      # Provider abstraction interface
    ├── types/
    │   └── provider.types.ts            # Provider type definitions and unions
    └── services/
        ├── ai-service.factory.ts        # Provider factory with caching
        ├── base-ai.service.ts           # Abstract base service implementation
        ├── bedrock.service.ts           # AWS Bedrock integration
        └── system-prompt-loader.service.ts  # Async prompt loading with caching
- Node.js (v18 or higher)
- npm or yarn
- AWS Account with Bedrock access
- AWS credentials configured
- Clone the repository
git clone https://github.com/your-username/demo-ai-proxy-service.git
cd demo-ai-proxy-service
- Install dependencies
npm install
- Set up environment variables
cp .env.example .env
- Configure environment variables in `.env`:
# Core Application Configuration
PORT=3000
NODE_ENV=development
# Provider Configuration
AI_PROVIDER=aws # Provider selection (aws|azure|google|openai)
# AWS Configuration (when AI_PROVIDER=aws)
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key
AWS_SESSION_TOKEN=your-session-token-if-using-temporary-credentials
# Model Defaults
BEDROCK_MODEL_ID=anthropic.claude-3-sonnet-20240229-v1:0
BEDROCK_MAX_TOKENS=1000
BEDROCK_TEMPERATURE=0.7
BEDROCK_TIMEOUT_MS=30000
BEDROCK_MAX_RETRIES=3
# Incident Analysis Configuration
INCIDENT_ANALYSIS_MODEL_ID=anthropic.claude-3-sonnet-20240229-v1:0
INCIDENT_ANALYSIS_MAX_TOKENS=2000
INCIDENT_ANALYSIS_TEMPERATURE=0.3
INCIDENT_ANALYSIS_SYSTEM_PROMPT_PATH=config/incident-report-system-prompt.md
# Retry and Resilience Configuration
GLOBAL_TIMEOUT_MS=30000
GLOBAL_MAX_RETRIES=3
RETRY_DELAY_MS=1000
RETRY_BACKOFF_MULTIPLIER=2
# Logging Configuration
LOG_LEVEL=INFO # ERROR|WARN|INFO|DEBUG|VERBOSE
ENABLE_STRUCTURED_LOGGING=true
ENABLE_METRICS=false
ENABLE_TRACING=false
# Future Provider Configurations
# AZURE_OPENAI_KEY=your-azure-openai-key
# AZURE_OPENAI_ENDPOINT=your-azure-openai-endpoint
# GOOGLE_APPLICATION_CREDENTIALS=path/to/google-credentials.json
# OPENAI_API_KEY=your-openai-api-key
Development Mode (with hot reload)
npm run start:dev
Production Mode
npm run build
npm run start:prod
The service will be available at:
- Application: http://localhost:3000
- API Documentation: http://localhost:3000/api/docs
- API Endpoints: http://localhost:3000/api/proxy
Visit http://localhost:3000/api/docs for the complete Swagger/OpenAPI documentation where you can:
- View all available endpoints
- See request/response schemas
- Test endpoints directly in the browser
- View example requests and responses
GET /api/proxy/providers
Response:
{
"providers": [
{
"name": "Anthropic",
"description": "Claude family of large language models",
"website": "https://www.anthropic.com",
"models": [
{
"id": "anthropic.claude-3-5-sonnet-20240620-v1:0",
"name": "Claude 3.5 Sonnet",
"description": "Most intelligent model with balanced performance for complex tasks",
"maxTokens": 200000,
"supportsStreaming": true,
"inputCostPer1K": 0.003,
"outputCostPer1K": 0.015
}
]
}
],
"totalModels": 21,
"defaultModel": "anthropic.claude-3-sonnet-20240229-v1:0",
"timestamp": "2025-09-12T20:09:18.273Z"
}
POST /api/proxy/prompt
Content-Type: application/json
{
"prompt": "What is artificial intelligence?",
"modelId": "anthropic.claude-3-sonnet-20240229-v1:0",
"maxTokens": 1000,
"temperature": 0.7
}
Response:
{
"response": "Artificial intelligence (AI) refers to...",
"modelId": "anthropic.claude-3-sonnet-20240229-v1:0",
"usage": {
"inputTokens": 15,
"outputTokens": 128
}
}
POST /api/proxy/incident-report-feedback
Content-Type: application/json
{
"incidentReport": "A worker slipped on a wet floor in the warehouse. The employee was carrying boxes when they fell and injured their wrist. The floor was wet due to a leaking pipe that had not been reported."
}
Response:
{
"response": "**INCIDENT ANALYSIS REPORT**\n\n**Risk Classification:** Medium-High Risk\n\n**Primary Hazards Identified:**\n1. Workplace slip/fall hazard due to wet surfaces\n2. Inadequate hazard reporting systems\n3. Poor housekeeping and maintenance protocols...",
"modelId": "anthropic.claude-3-sonnet-20240229-v1:0",
"usage": {
"inputTokens": 1847,
"outputTokens": 1523
}
}
| Feature | `sendPrompt` | `processIncidentReportFeedback` |
|---|---|---|
| Purpose | General-purpose AI prompting | Specialized safety incident analysis |
| User Control | Full parameter customization | Predefined safety-optimized settings |
| Model Selection | User-configurable (`modelId` parameter) | Fixed: `anthropic.claude-3-sonnet-20240229-v1:0` |
| Temperature | User-configurable (0.0-1.0) | Fixed: 0.3 (focused analytical responses) |
| Max Tokens | User-configurable (up to 4096) | Fixed: 2000 (comprehensive safety analysis) |
| System Prompt | None (direct user input) | Expert safety analyst persona automatically prepended |
| Input Validation | Generic prompt validation | Specialized incident report validation (up to 50k chars) |
| Response Type | General AI response | Structured safety analysis with risk assessment |
| Use Case | Development, testing, general queries | Production safety management systems |
| Consistency | Varies based on user parameters | Standardized expert-level analysis |
Use `sendPrompt` for:
- Development and testing
- Custom AI applications requiring parameter control
- Experimental prompts with different models/settings
- General-purpose AI interactions
- Research and experimentation
Use `processIncidentReportFeedback` for:
- Production workplace safety systems
- Standardized incident analysis
- Compliance and regulatory reporting
- Consistent safety assessment across organization
- Emergency response and risk management
`sendPrompt` Request Flow:
- Validates user-provided parameters
- Sends prompt directly to specified Bedrock model
- Returns raw AI response
`processIncidentReportFeedback` Request Flow:
- Validates the incident report content (up to 50,000 characters)
- Loads the expert safety analyst system prompt from `config/incident-report-system-prompt.md`
- Combines the system prompt with the incident report
- Sends the result to the pre-optimized Bedrock model with safety-focused settings
- Returns a structured expert safety analysis
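The composition step can be pictured as a small pure function. The sketch below is illustrative only: the helper name and request fields are hypothetical, and the real logic lives inside `BaseAIService.processIncidentReportFeedback`.

```typescript
interface IncidentAnalysisSettings {
  modelId: string;     // e.g. INCIDENT_ANALYSIS_MODEL_ID
  maxTokens: number;   // e.g. INCIDENT_ANALYSIS_MAX_TOKENS
  temperature: number; // e.g. INCIDENT_ANALYSIS_TEMPERATURE
}

// Hypothetical helper: prepend the safety-analyst system prompt to the
// user-supplied incident report and attach the fixed analysis settings.
export function buildIncidentAnalysisRequest(
  systemPrompt: string,
  incidentReport: string,
  settings: IncidentAnalysisSettings,
) {
  return {
    prompt: `${systemPrompt}\n\n---\n\nIncident report:\n${incidentReport}`,
    ...settings,
  };
}
```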
GET /api/proxy/health
Response:
{
"status": "ok",
"timestamp": "2025-09-12T18:33:48.905Z",
"endpoints": [
"GET /api/proxy/health - This endpoint",
"GET /api/proxy/providers - Get all LLM providers and models",
"POST /api/proxy/health - Health check",
"POST /api/proxy/prompt - Send prompt to Bedrock",
"POST /api/proxy/incident-report-feedback - Analyze incident reports with expert safety feedback"
]
}
GET /api/proxy
The service includes comprehensive configuration validation using class-validator decorators:
import { IsEnum, IsNumber, IsOptional, IsPositive, IsString, Max, Min } from 'class-validator';

// Automatic validation of environment variables at startup
// prevents misconfiguration in production.
export class ConfigurationSchema {
@IsEnum(NodeEnvironment)
nodeEnv?: NodeEnvironment = NodeEnvironment.DEVELOPMENT;
@IsNumber()
@IsPositive()
@Min(1000)
@Max(65535)
port?: number = 3000;
@IsString()
@IsOptional()
region?: string = 'us-east-1';
}
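A typical way to enforce this at startup is to run the schema through `class-transformer` and `class-validator` before the application boots. The sketch below shows the idea; the project may wire it differently, for example via `ConfigModule`'s `validate` option.

```typescript
import { plainToInstance } from 'class-transformer';
import { validateSync } from 'class-validator';

export function validateConfig(env: Record<string, unknown>): ConfigurationSchema {
  // Convert raw process.env strings into a typed ConfigurationSchema instance.
  const config = plainToInstance(ConfigurationSchema, env, {
    enableImplicitConversion: true,
  });

  // Fail fast: any decorator violation aborts startup with a readable error.
  const errors = validateSync(config, { skipMissingProperties: false });
  if (errors.length > 0) {
    throw new Error(`Configuration validation failed:\n${errors.join('\n')}`);
  }
  return config;
}
```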
Ready for multiple AI providers with unified configuration:
// Provider abstraction allows seamless switching
export interface AIServiceProvider {
name: ProviderName;
description: string;
models: ModelInfo[];
defaultModel: string;
}
// Supports: 'aws' | 'azure' | 'google' | 'openai'
Built-in retry logic with exponential backoff and jitter:
export class RetryService {
// Automatically retries on:
// - Network errors (ECONNRESET, ETIMEDOUT, etc.)
// - HTTP 5xx server errors
// - HTTP 429 (Too Many Requests)
// - Provider-specific transient errors
async executeWithRetry<T>(
operation: () => Promise<T>,
retryCondition: RetryCondition = RetryConditions.default,
config: Partial<RetryConfig> = {}
): Promise<RetryResult<T>>
}
GLOBAL_MAX_RETRIES=3 # Maximum retry attempts
RETRY_DELAY_MS=1000 # Base delay between retries
RETRY_BACKOFF_MULTIPLIER=2 # Exponential backoff multiplier
GLOBAL_TIMEOUT_MS=30000 # Operation timeout
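With these settings, the delay before retry attempt `n` grows as `RETRY_DELAY_MS × RETRY_BACKOFF_MULTIPLIER^(n-1)`, with random jitter applied so that concurrent clients do not retry in lockstep. A minimal sketch of that calculation (not the actual `RetryService` internals):

```typescript
// Illustrative backoff-with-jitter calculation; the real RetryService layers
// retryability checks and the global timeout on top of this.
function computeRetryDelayMs(
  attempt: number,          // 1-based attempt number
  baseDelayMs = 1000,       // RETRY_DELAY_MS
  backoffMultiplier = 2,    // RETRY_BACKOFF_MULTIPLIER
  maxDelayMs = 30_000,      // bounded by GLOBAL_TIMEOUT_MS
): number {
  const exponential = baseDelayMs * Math.pow(backoffMultiplier, attempt - 1);
  const capped = Math.min(exponential, maxDelayMs);
  // Full jitter: pick a random point in [0, capped].
  return Math.random() * capped;
}

// With the defaults, attempts 1..3 draw delays from roughly
// [0, 1000], [0, 2000], and [0, 4000] milliseconds respectively.
```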
JSON-formatted logs with comprehensive metadata:
{
"timestamp": "2025-09-16T14:51:34.071Z",
"level": "INFO",
"message": "Incident report analysis completed",
"context": "BaseAIService",
"metadata": {
"requestId": "aws-1726498294071-xyz123",
"duration": 2847,
"modelId": "anthropic.claude-3-sonnet-20240229-v1:0",
"provider": "aws",
"responseLength": 1523
},
"traceId": "abc123def456",
"correlationId": "req-789xyz"
}
- `traceId`: Unique identifier for distributed tracing across service boundaries
- `correlationId`: Request-specific identifier for linking related log entries
Automatic performance metrics collection:
export class PerformanceTracker {
// Tracks operation duration, success rate, error classification
// Provides P95 latency, error breakdown, and throughput metrics
getPerformanceMetrics(): {
totalOperations: number;
successRate: number;
averageDuration: number;
p95Duration: number;
errorBreakdown: Record<string, number>;
}
}
ENABLE_STRUCTURED_LOGGING=true # JSON log format
ENABLE_METRICS=true # Performance metrics collection
ENABLE_TRACING=true # Request tracing with correlation IDs
Hierarchical exception handling with retry classification:
export class AIProviderError extends Error {
public readonly isRetryable: boolean;
public readonly errorCategory: 'network' | 'authentication' | 'rate_limit' | 'model' | 'input' | 'unknown';
public readonly metadata: Record<string, any>;
// Automatic determination of retryability based on error type
// Structured error metadata for debugging and monitoring
}
- `BedrockConnectionError` - Network/connection issues
- `BedrockAuthenticationError` - AWS credential problems
- `BedrockRateLimitError` - API throttling
- `BedrockModelError` - Model-specific issues
- `BedrockTimeoutError` - Request timeouts
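For example, a throttling error is classified as retryable while a credential error is not. The sketch below illustrates the idea; it assumes a base constructor that accepts the retryability and category fields, which may differ from the actual classes in `bedrock.exceptions.ts`.

```typescript
// Sketch only: assumes AIProviderError exposes a constructor taking these fields.
export class BedrockRateLimitError extends AIProviderError {
  constructor(message: string, metadata: Record<string, any> = {}) {
    // Throttling is transient, so the retry service may re-attempt the call.
    super(message, { isRetryable: true, errorCategory: 'rate_limit', metadata });
    this.name = 'BedrockRateLimitError';
  }
}

export class BedrockAuthenticationError extends AIProviderError {
  constructor(message: string, metadata: Record<string, any> = {}) {
    // Bad credentials never succeed on retry, so fail immediately.
    super(message, { isRetryable: false, errorCategory: 'authentication', metadata });
    this.name = 'BedrockAuthenticationError';
  }
}
```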
Optimized system prompt loading with file-based caching:
export class SystemPromptLoader {
// Features:
// - Async file operations (non-blocking)
// - mtime-based cache invalidation
// - Memory-efficient caching
// - Error resilience with fallbacks
async getIncidentPrompt(): Promise<string> {
// Automatically caches and invalidates based on file changes
}
}
INCIDENT_ANALYSIS_SYSTEM_PROMPT_PATH=config/incident-report-system-prompt.md
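The mtime-based invalidation boils down to re-reading the file only when its modification time changes. A minimal sketch of that idea, using a hypothetical `CachedFileLoader` (the real `SystemPromptLoader` adds fallbacks and logging):

```typescript
import { promises as fs } from 'fs';

export class CachedFileLoader {
  private cachedText?: string;
  private cachedMtimeMs?: number;

  constructor(private readonly filePath: string) {}

  async load(): Promise<string> {
    const { mtimeMs } = await fs.stat(this.filePath);

    // Cache hit: the file is unchanged since the last read.
    if (this.cachedText !== undefined && this.cachedMtimeMs === mtimeMs) {
      return this.cachedText;
    }

    // Cache miss or the file was edited: read asynchronously and refresh the cache.
    this.cachedText = await fs.readFile(this.filePath, 'utf8');
    this.cachedMtimeMs = mtimeMs;
    return this.cachedText;
  }
}
```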
Clean multi-provider abstraction combining:
- Strategy (runtime provider selection via unified interface)
- Abstract Factory (centralized creation in `AIServiceFactory` with caching)
- Adapter (each provider service normalizes its native SDK to the common contract)
This layered approach keeps controllers and business flows decoupled from vendor SDK details while enabling safe extension.
export abstract class BaseAIService implements AIServiceInterface {
// Unified interface for all providers
abstract invokeModel(request: PromptRequestDto): Promise<PromptResponse>;
abstract getAvailableProviders(): ProviderInfo[];
abstract getProviderName(): ProviderName;
abstract healthCheck(): Promise<{ status: 'healthy' | 'unhealthy' }>;
// Shared implementation for incident analysis
async processIncidentReportFeedback(incidentReport: string): Promise<PromptResponse> {
// Uses retry service, performance tracking, and enhanced logging
}
}
Provider instantiation with caching and dependency injection:
export class AIServiceFactory {
// Features:
// - Provider caching for performance
// - Race condition prevention
// - Dependency injection for cross-cutting concerns
// - Type-safe provider selection
async getService(providerName: ProviderName): Promise<AIServiceInterface>
}
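The caching and race-condition prevention typically reduce to memoizing one instance, or one in-flight creation promise, per provider. The generic sketch below shows the pattern; it is not the actual factory code, and the project's own types would be used for the key and value.

```typescript
// One shared creation promise per key prevents two concurrent callers from
// instantiating (and configuring) the same provider adapter twice.
class AsyncSingletonCache<K, V> {
  private readonly inFlight = new Map<K, Promise<V>>();

  getOrCreate(key: K, create: () => Promise<V>): Promise<V> {
    const existing = this.inFlight.get(key);
    if (existing) {
      return existing; // already created, or creation currently in progress
    }
    const created = create();
    this.inFlight.set(key, created);
    return created;
  }
}

// Usage inside a factory (illustrative):
// const service = await cache.getOrCreate(providerName, () => this.createService(providerName));
```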
The service supports various AWS Bedrock models:
- Anthropic Claude models (claude-3-sonnet, claude-3-haiku, etc.)
- Amazon Titan models
- Other Bedrock-compatible models
The service automatically handles different request/response formats for each model family.
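For instance, Anthropic Claude models on Bedrock expect a messages-style request body while Amazon Titan text models use an `inputText`/`textGenerationConfig` shape. The snippet below illustrates the kind of branching involved; the field names reflect commonly documented Bedrock payloads but should be verified against the current Bedrock model documentation rather than treated as this project's exact mapping.

```typescript
// Illustrative per-model-family request bodies; not the project's actual mapping code.
function buildBedrockBody(modelId: string, prompt: string, maxTokens: number, temperature: number) {
  if (modelId.startsWith('anthropic.')) {
    // Claude models (Messages-style API)
    return {
      anthropic_version: 'bedrock-2023-05-31',
      max_tokens: maxTokens,
      temperature,
      messages: [{ role: 'user', content: prompt }],
    };
  }
  if (modelId.startsWith('amazon.titan')) {
    // Amazon Titan text models
    return {
      inputText: prompt,
      textGenerationConfig: { maxTokenCount: maxTokens, temperature },
    };
  }
  throw new Error(`No request mapping defined for model: ${modelId}`);
}
```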
The application features a dual-logging system:
- Traditional log format for development
- Context-aware logging with service names
- Configurable log levels
- JSON-formatted output for log aggregation systems
- Performance tracking with operation metrics
- Correlation IDs and trace IDs for distributed tracing
- Metadata enrichment for debugging
Control logging behavior via environment variables:
LOG_LEVEL=INFO # ERROR|WARN|INFO|DEBUG|VERBOSE
ENABLE_STRUCTURED_LOGGING=true # Enable JSON log format
ENABLE_METRICS=true # Performance metrics collection
ENABLE_TRACING=true # Request correlation tracking
[9:51:34 AM] [INFO] [BedrockService] Loaded model mappings for aws provider
[9:51:34 AM] [INFO] [BedrockService] BedrockService initialized with region: us-east-1
[9:51:34 AM] [INFO] [NestApplication] Nest application successfully started
{
"timestamp": "2025-09-16T14:51:34.071Z",
"level": "INFO",
"message": "Incident report analysis completed",
"context": "BaseAIService",
"metadata": {
"requestId": "aws-1726498294071-abc123",
"duration": 2847,
"modelId": "anthropic.claude-3-sonnet-20240229-v1:0",
"provider": "aws",
"responseLength": 1523,
"success": true
},
"traceId": "def456ghi789",
"correlationId": "req-xyz123"
}
The enhanced logger automatically tracks:
- Operation Duration: Start-to-finish timing
- Success Rates: Success/failure ratios
- Error Classification: Categorized error types
- Latency Percentiles: P50, P95, P99 response times
- Throughput: Requests per second metrics
// Get metrics for specific operations
const metrics = enhancedLogger.getPerformanceMetrics('incident-report-analysis');
// Returns:
// {
// totalOperations: 100,
// successRate: 0.95,
// averageDuration: 2340,
// p95Duration: 4200,
// errorBreakdown: {
// "NetworkError": 3,
// "TimeoutError": 2
// }
// }
Every request gets a unique ID for end-to-end tracing:
Incoming POST /api/proxy/incident-report-feedback - IP: ::1 - User-Agent: curl/7.68.0
[aws-1726498294071-abc123] Processing incident report feedback request
[aws-1726498294071-abc123] System prompt loaded
[aws-1726498294071-abc123] Invoking expert analysis with optimized settings
[aws-1726498294071-abc123] Incident report analysis completed in 2847ms
POST /api/proxy/incident-report-feedback 200 - 2847ms
{
"timestamp": "2025-09-16T14:51:34.071Z",
"level": "ERROR",
"message": "Failed to invoke model after 3 attempts",
"context": "RetryService",
"metadata": {
"requestId": "aws-1726498294071-err456",
"attempts": 3,
"totalElapsedMs": 15234,
"lastError": "ThrottlingException",
"provider": "aws",
"modelId": "anthropic.claude-3-sonnet-20240229-v1:0"
},
"error": {
"name": "BedrockRateLimitError",
"message": "Request was throttled by AWS Bedrock",
"stack": "...",
"code": "ThrottlingException"
}
}
# Development
npm run start:dev # Start with hot reload
npm run start:debug # Start in debug mode
# Production
npm run build # Build the application
npm run start:prod # Start production build
# Code Quality
npm run format # Format code with Prettier
npm run lint # Run ESLint
npm run lint:fix # Fix ESLint issues
# Testing
npm run test # Run unit tests
npm run test:watch # Run tests in watch mode
npm run test:cov # Run tests with coverage
npm run test:e2e # Run end-to-end tests
| Variable | Description | Default |
|---|---|---|
| `AWS_REGION` | AWS region for Bedrock | `us-east-1` |
| `AWS_ACCESS_KEY_ID` | AWS access key | Required |
| `AWS_SECRET_ACCESS_KEY` | AWS secret key | Required |
| `AWS_SESSION_TOKEN` | AWS session token (for temporary credentials) | Optional |
| `BEDROCK_MODEL_ID` | Default Bedrock model ID | `anthropic.claude-3-sonnet-20240229-v1:0` |
| `BEDROCK_MAX_TOKENS` | Default max tokens | `1000` |
| `BEDROCK_TEMPERATURE` | Default temperature | `0.7` |
| `PORT` | Server port | `3000` |
| `NODE_ENV` | Environment | `development` |
| `LOG_LEVEL` | Logging level | `INFO` |
The application includes comprehensive error handling:
- Input validation errors (400 Bad Request)
- AWS service errors (500 Internal Server Error)
- Detailed error logging with stack traces
- Structured error responses with request IDs
- Environment variables for sensitive data
- Input validation and sanitization
- CORS enabled for cross-origin requests
- Sensitive data redaction in logs
- Request ID tracking for security auditing
- Response times for all endpoints
- AI model performance monitoring
- Token usage tracking for cost optimization
- Real-time error alerts with stack traces
- Request tracing with unique IDs
- Detailed context for debugging
npm run test
npm run test:e2e
# Check service health and provider status
curl -X GET http://localhost:3000/api/proxy/health
# Get all available providers and models with pricing
curl -X GET http://localhost:3000/api/proxy/providers
# Send a basic prompt
curl -X POST http://localhost:3000/api/proxy/prompt \
-H "Content-Type: application/json" \
-d '{"prompt": "Hello, how are you?"}'
# Send a prompt with custom parameters
curl -X POST http://localhost:3000/api/proxy/prompt \
-H "Content-Type: application/json" \
-d '{
"prompt": "Explain artificial intelligence in simple terms",
"modelId": "anthropic.claude-3-sonnet-20240229-v1:0",
"maxTokens": 500,
"temperature": 0.3
}'
# Analyze a workplace incident
curl -X POST http://localhost:3000/api/proxy/incident-report-feedback \
-H "Content-Type: application/json" \
-d '{
"incidentReport": "A worker slipped on a wet floor in the warehouse. The employee was carrying boxes when they fell and injured their wrist. The floor was wet due to a leaking pipe that had not been reported."
}'
# Test with a more complex incident
curl -X POST http://localhost:3000/api/proxy/incident-report-feedback \
-H "Content-Type: application/json" \
-d '{
"incidentReport": "During routine maintenance, an electrician received a minor shock while working on a control panel. The worker was wearing appropriate PPE but the circuit breaker was not properly locked out. The incident occurred at 2:30 PM on a Tuesday. No serious injuries occurred, but the worker was taken to the medical station for evaluation. The maintenance supervisor was notified immediately."
}'
# Test with invalid port (should fail startup)
PORT=99999 npm run start:dev
# Test with invalid log level (should fail startup)
LOG_LEVEL=INVALID npm run start:dev
# Test with missing required AWS credentials (should fail gracefully)
AWS_ACCESS_KEY_ID= npm run start:dev
# Test retry behavior with network simulation
# (Configure temporary network issues to see retry logs)
# Check retry statistics in application logs
# Look for messages like: "Operation succeeded after 2 attempts in 3847ms"
# Make multiple requests to generate performance metrics
for i in {1..10}; do
curl -X POST http://localhost:3000/api/proxy/prompt \
-H "Content-Type: application/json" \
-d '{"prompt": "Test prompt '$i'"}' &
done
# Check structured logs for performance data
# Look for JSON logs with duration, successRate, p95Duration fields
# Test with oversized prompt (should return 400 Bad Request)
curl -X POST http://localhost:3000/api/proxy/prompt \
-H "Content-Type: application/json" \
-d '{"prompt": "'$(python3 -c "print('x' * 50000)")'"}'
# Test with invalid model ID (should return appropriate error)
curl -X POST http://localhost:3000/api/proxy/prompt \
-H "Content-Type: application/json" \
-d '{"prompt": "Hello", "modelId": "invalid-model-id"}'
# Test with invalid temperature (should return validation error)
curl -X POST http://localhost:3000/api/proxy/prompt \
-H "Content-Type: application/json" \
-d '{"prompt": "Hello", "temperature": 5.0}'
# Watch logs in real-time during development
npm run start:dev | grep "INFO"
# Filter specific operation logs
npm run start:dev | grep "incident-report-analysis"
# Monitor error logs
npm run start:dev | grep "ERROR\|WARN"
# Test different provider configurations
AI_PROVIDER=aws npm run start:dev
AI_PROVIDER=azure npm run start:dev # (when implemented)
# Test retry configurations
GLOBAL_MAX_RETRIES=1 RETRY_DELAY_MS=500 npm run start:dev
# Test logging configurations
ENABLE_STRUCTURED_LOGGING=true ENABLE_METRICS=true npm run start:dev
# Enable verbose logging for performance analysis
LOG_LEVEL=VERBOSE ENABLE_METRICS=true ENABLE_TRACING=true npm run start:dev
# Monitor memory usage during load testing
node --max-old-space-size=4096 dist/main.js
- PII Detection: Automatic identification and redaction
- Toxicity Analysis: Content safety scoring and filtering
- Compliance Framework: GDPR, OSHA, SOX validation
- Custom Policies: Configurable content rules per organization
export class CircuitBreakerService {
// Prevent cascading failures
// Automatic failure detection and recovery
// Provider fallback mechanisms
}
export class ResponseCacheService {
// Intelligent response caching
// Cost optimization for repeated queries
// Cache invalidation strategies
}
export class AdvancedRateLimitService {
// Per-provider rate limiting
// User-based quotas
// Cost management controls
}
- Vendor Neutrality: All provider-specific logic isolated behind abstractions
- Fail-Safe Design: Graceful degradation and intelligent retry mechanisms
- Observability First: Comprehensive logging, metrics, and tracing
- Configuration Driven: Runtime behavior controlled via environment variables
- Type Safety: Leverage TypeScript for compile-time error prevention
- Extend the `BaseAIService` abstract class
- Implement the required abstract methods
- Add the provider to the `AIServiceFactory` instantiation logic
- Create provider-specific exception classes
- Add configuration schema validation
- Update model mappings in the `config/` directory
- Add comprehensive tests and documentation
- Input Validation: Class-validator decorators on all DTOs
- Rate Limiting: ThrottlerModule with configurable limits
- CORS Configuration: Cross-origin request handling
- Error Sanitization: Sensitive data exclusion in error responses
- Request ID Tracking: Audit trail for all operations
- Design Patterns - Adapter, Strategy, and Factory patterns overview
- Service Architecture - Detailed implementation guide for interfaces and services
- Implementation Guide - Step-by-step instructions for adding providers and features
- Testing Strategy - Comprehensive testing approaches for AI services
- API Reference - Complete endpoint documentation and usage examples
- Guardrails Implementation Plan - Content safety and compliance features
- Archives - Original documentation files for reference