aws-samples · ashakirin · Nov 18, 2025 · Nov 18, 2025
diff --git a/samples/quality-assurance/ai-agent/.gitignore b/samples/quality-assurance/ai-agent/.gitignore
@@ -0,0 +1,33 @@
+HELP.md
+target/
+.mvn/wrapper/maven-wrapper.jar
+!**/src/main/**/target/
+!**/src/test/**/target/
+
+### STS ###
+.apt_generated
+.classpath
+.factorypath
+.project
+.settings
+.springBeans
+.sts4-cache
+
+### IntelliJ IDEA ###
+.idea
+*.iws
+*.iml
+*.ipr
+
+### NetBeans ###
+/nbproject/private/
+/nbbuild/
+/dist/
+/nbdist/
+/.nb-gradle/
+build/
+!**/src/main/**/build/
+!**/src/test/**/build/
+
+### VS Code ###
+.vscode/
diff --git a/samples/quality-assurance/ai-agent/README.md b/samples/quality-assurance/ai-agent/README.md
@@ -0,0 +1,133 @@
+# Spring AI Agent
+
+An AI-powered agent built with the Spring AI framework, featuring weather forecasting capabilities and secure OAuth integration.
+
+## Related Documentation
+
+This project is part of a larger microservices ecosystem:
+
+- [Weather Service Documentation](../weather/README.md) - Weather forecast service with global coverage
+
+## Project Overview
+
+### Description
+
+The Spring AI Agent is a demonstration of how to build modern AI-powered applications using the Spring AI framework. It provides weather forecasting capabilities through:
+
+- Weather forecasts for any city worldwide
+- Integration with external weather APIs
+- Model Context Protocol (MCP) client for connecting to weather services
+- Secure OAuth authentication and authorization
+
+The application serves as the central component in a microservices architecture, connecting to the Weather service through the Model Context Protocol (MCP).
+
+### Purpose
+
+This application serves as:
+
+1. A reference implementation for Spring AI integration with weather services
+2. A demonstration of secure AI application patterns with OAuth
+3. A practical example of building weather assistants with Spring Boot
+4. A showcase for integrating with Amazon Bedrock and weather APIs
+
+### Technology Stack
+
+- **Java 21**: LTS release with modern language features
+- **Spring Boot 3.5.7**: Core framework for building the application
+- **Spring AI 1.0.3**: AI integration framework
+- **Spring Security**: OAuth 2.0 authentication and authorization
+- **Amazon Bedrock**: AI model provider (Claude Sonnet 4)
+- **Docker**: Containerization for the application
+
+## Security
+
+### OAuth 2.0 Integration
+
+The application implements OAuth 2.0 for secure authentication and authorization:
+
+- **Authorization Server**: Integrated OAuth 2.0 authorization server
+- **Resource Protection**: Secured API endpoints with JWT tokens
+- **Token Validation**: Automatic JWT token validation and user context
+
+## Getting Started
+
+### Prerequisites
+
+- Java 21 or higher
+- Maven 3.8 or higher
+- AWS account with Amazon Bedrock access
+
+### Prerequisites for Full Functionality
+
+Before starting the AI agent, ensure the required services are running:
+
+1. **Start Authorization Server** (port 9000):
+   ```bash
+   cd ../authorization-server/
+   mvn spring-boot:run
+   ```
+
+2. **Start Weather Service** (port 8083):
+   ```bash
+   cd ../weather/
+   mvn spring-boot:run
+   ```
+
+These services provide OAuth authentication and weather forecasting tools that the AI agent uses.
+
+#### Running the AI Agent
+
+```bash
+cd ai-agent/
+mvn spring-boot:run
+```
+
+This will:
+- Configure secure endpoints for weather data access
+- Connect to the weather service via MCP for authenticated users only
+- Connect to the authorization server for OAuth authentication
+- Start the application on port 8080
+
+#### Access Points
+
+Once all applications are running, you can access:
+
+- **Main Application**: `http://localhost:8080/`
+
+### AWS Configuration
+
+1. Configure AWS credentials:
+   ```bash
+   aws configure
+   ```
+
+2. Ensure you have access to Amazon Bedrock and the required models (Claude Sonnet 4).
+
+### Building and Running the Application
+
+1. **Standard Build and Run:**
+   ```bash
+   cd ai-agent/
+   mvn clean package
+   mvn spring-boot:run
+   ```
+
+2. The application will be available at:
+   ```
+   http://localhost:8080/
+   ```
+
+### Authentication Flow
+
+1. Navigate to `http://localhost:9000/` (authorization server)
+2. Authenticate with your credentials
+3. Use the authorization code to obtain an access token
+4. Access weather endpoints with the Bearer token
+
+## Contributing
+
+Contributions are welcome! Please feel free to submit a Pull Request.
+
+## License
+
+This project is licensed under the MIT License - see the LICENSE file for details.
diff --git a/samples/quality-assurance/ai-agent/deep-eval/Dockerfile b/samples/quality-assurance/ai-agent/deep-eval/Dockerfile
@@ -0,0 +1,11 @@
+FROM python:3.11-slim
+
+WORKDIR /app
+
+RUN pip install --no-cache-dir flask
+
+COPY deepeval_service.py .
+
+EXPOSE 8080
+
+CMD ["python", "deepeval_service.py"]
diff --git a/samples/quality-assurance/ai-agent/deep-eval/README.md b/samples/quality-assurance/ai-agent/deep-eval/README.md
@@ -0,0 +1,43 @@
+# DeepEval Service
+
+This folder contains the DeepEval evaluation service for testing AI responses.
+
+## Files
+- `Dockerfile` - Docker image definition for the DeepEval service
+- `deepeval_service.py` - Flask REST API service that scores response relevancy with keyword matching
+- `README.md` - This file
+
+## Build and Run
+
+```bash
+# Build the Docker image
+cd deep-eval
+docker build -t deepeval-service:latest .
+
+# Run manually (optional)
+docker run -p 8080:8080 deepeval-service:latest
+```
+
+## API Endpoints
+
+- `GET /health` - Health check
+- `POST /evaluate` - Evaluate response relevancy
+
+### Evaluate Request
+```json
+{
+  "question": "What is AI?",
+  "response": "AI is artificial intelligence...",
+  "threshold": 0.3
+}
+```
+
+### Evaluate Response
+```json
+{
+  "score": 0.85,
+  "success": true,
+  "threshold": 0.3,
+  "reason": "Response is relevant to the question"
+}
+```
diff --git a/samples/quality-assurance/ai-agent/deep-eval/deepeval_service.py b/samples/quality-assurance/ai-agent/deep-eval/deepeval_service.py
@@ -0,0 +1,83 @@
+from flask import Flask, request, jsonify
+import logging
+import json
+import re
+
+app = Flask(__name__)
+logging.basicConfig(level=logging.INFO)
+
+@app.route('/health', methods=['GET'])
+def health():
+    return jsonify({"status": "healthy"})
+
+@app.route('/evaluate', methods=['POST'])
+def evaluate():
+    try:
+        # Parse the body with Flask's get_json() when the Content-Type is JSON
+        if request.is_json:
+            data = request.get_json()
+        else:
+            # Fallback: manually parse with sanitization
+            raw_data = request.get_data(as_text=True)
+            sanitized_data = re.sub(r'[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]', '', raw_data)
+            data = json.loads(sanitized_data)
+
+        question = str(data.get('question', '')).strip()
+        response = str(data.get('response', '')).strip()
+        threshold = float(data.get('threshold', 0.3))
+
+        # Sanitize text using proper string methods
+        question = ''.join(char for char in question if ord(char) >= 32 or char in '\t\n\r')
+        response = ''.join(char for char in response if ord(char) >= 32 or char in '\t\n\r')
+
+        app.logger.info(f"Evaluating - Question: {question[:50]}..., Response: {response[:50]}...")
+
+        if not question or not response:
+            return jsonify({"error": "question and response are required"}), 400
+
+        # Keyword-overlap relevancy scoring (no model call involved)
+        question_words = set(re.findall(r'\w+', question.lower()))
+        response_words = set(re.findall(r'\w+', response.lower()))
+
+        # Remove common stop words
+        stop_words = {'the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for', 'of', 'with', 'by', 'is', 'are', 'was', 'were'}
+        question_words -= stop_words
+        response_words -= stop_words
+
+        if not question_words:
+            score = 0.5
+            reason = "No meaningful words in question"
+        else:
+            # Calculate relevancy score
+            exact_matches = len(question_words.intersection(response_words))
+            partial_matches = sum(1 for qw in question_words
+                                if any(qw in rw or rw in qw for rw in response_words))
+
+            # Scoring algorithm
+            exact_score = exact_matches / len(question_words)
+            partial_score = (partial_matches - exact_matches) / len(question_words) * 0.3
+            length_bonus = min(len(response.split()) / 20, 0.2)
+
+            score = min(exact_score + partial_score + length_bonus + 0.1, 1.0)
+            reason = f"Exact matches: {exact_matches}/{len(question_words)}, Partial matches: {partial_matches - exact_matches}"
+
+        success = score >= threshold
+
+        result = {
+            "score": round(score, 2),
+            "success": success,
+            "threshold": threshold,
+            "reason": reason,
+            "metric_type": "Enhanced Keyword Analysis",
+            "model": "keyword-based-evaluator"
+        }
+
+        app.logger.info(f"Evaluation result: score={result['score']}, success={result['success']}")
+        return jsonify(result)
+
+    except Exception as e:
+        app.logger.error(f"Evaluation error: {str(e)}")
+        return jsonify({"error": f"Evaluation failed: {str(e)}"}), 500
+
+if __name__ == '__main__':
+    app.run(host='0.0.0.0', port=8080, debug=False)  # debug mode must stay off when bound to all interfaces
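
Review note: the scoring arithmetic in `evaluate()` is easier to check once it is pulled out of the Flask handler. The sketch below mirrors that logic as a pure function using only the standard library. The names `keyword_relevancy` and `STOP_WORDS` are hypothetical and not part of this PR, and the function returns the rounded score only, whereas the service compares the unrounded score against the threshold.

```python
import re

# Same stop-word list as in deepeval_service.py
STOP_WORDS = {'the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to',
              'for', 'of', 'with', 'by', 'is', 'are', 'was', 'were'}


def keyword_relevancy(question: str, response: str) -> float:
    """Hypothetical helper that mirrors the scoring steps in evaluate()."""
    q_words = set(re.findall(r'\w+', question.lower())) - STOP_WORDS
    r_words = set(re.findall(r'\w+', response.lower())) - STOP_WORDS

    if not q_words:
        # The service falls back to a neutral score in this case
        return 0.5

    # Words present in both sets
    exact = len(q_words & r_words)
    # Question words that contain, or are contained in, any response word
    partial = sum(1 for qw in q_words
                  if any(qw in rw or rw in qw for rw in r_words))

    exact_score = exact / len(q_words)
    partial_score = (partial - exact) / len(q_words) * 0.3
    length_bonus = min(len(response.split()) / 20, 0.2)

    # The constant 0.1 is the base offset used by the service
    return round(min(exact_score + partial_score + length_bonus + 0.1, 1.0), 2)


if __name__ == '__main__':
    print(keyword_relevancy("What is AI?", "AI is artificial intelligence"))
    print(keyword_relevancy("weather Berlin", "you should buy fresh bread"))
```

Under this arithmetic, the base offset of 0.1 plus the length bonus (capped at 0.2) means a response of four or more words with no keyword overlap still reaches 0.3, which meets the default threshold. Tests that rely on the default may therefore pass for unrelated answers, so a higher threshold may be worth considering.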