A comprehensive AI chat system that combines the power of Llama language models with Model Context Protocol (MCP) tool execution, fronted by a modern React web interface. This project demonstrates how to build a complete AI assistant that can interact with the operating system, execute commands, manipulate files, and deliver intelligent responses through the web UI.
- Project Overview
- Architecture
- Components
- Quick Start
- Detailed Setup
- Usage Examples
- API Documentation
- Development Guide
- Troubleshooting
- Contributing
This project addresses the challenge of creating an AI assistant that can not only engage in intelligent conversation but also take concrete actions on behalf of the user. By combining:
- Llama Language Models for natural language understanding and generation
- Model Context Protocol (MCP) for secure, structured tool execution
- Spring Boot Services for robust backend architecture
- React Web Interface for modern user experience
We create a complete AI assistant capable of:
- Understanding natural language requests
- Executing file system operations
- Running system commands safely
- Providing intelligent analysis and responses
- Maintaining conversation context and history
- Tool-Aware AI: The AI automatically determines when and how to use tools based on user requests
- Secure Execution: MCP provides a secure abstraction layer for system operations
- Real-time Feedback: Users see exactly what tools are being executed and their results
- Conversation Persistence: Full conversation history with context preservation
- Service Architecture: Modular design allows independent scaling and updates
┌───────────────────┐   HTTP/REST    ┌───────────────────┐   HTTP/REST    ┌───────────────────┐
│     React UI      │ ─────────────► │    Llama Chat     │ ─────────────► │   MCP Streaming   │
│    (Port 3000)    │      JSON      │      Service      │      JSON      │      Service      │
│                   │                │    (Port 8081)    │                │    (Port 8080)    │
│ • Chat Feed       │                │ • Orchestration   │                │ • File Ops        │
│ • Conversations   │                │ • LLM Calling     │                │ • Commands        │
│ • Tool Results    │                │ • Tool Parsing    │                │ • Pattern Match   │
│ • Service Info    │                │                   │                │                   │
└───────────────────┘                └───────────────────┘                └───────────────────┘
          │                                    │ HTTP                               │
          │                                    ▼                                    ▼
          │                          ┌───────────────────┐                ┌───────────────────┐
          │                          │      Ollama       │                │    OS Services    │
          │                          │   (Port 11434)    │                │                   │
          └── Browser Rendering      │ • Llama Models    │                │ • File System     │
                                     │ • Inference       │                │ • Process Exec    │
                                     │ • Tool Support    │                │ • Search Ops      │
                                     └───────────────────┘                │ • Validation      │
                                                                          └───────────────────┘
Typical Chat Flow:
- User Input → React UI captures message
- API Request → UI sends POST to Chat Service
- Context Loading → Chat Service loads conversation history
- LLM Generation → Chat Service calls Ollama with tool definitions
- Tool Parsing → If tools are called, Chat Service extracts tool calls
- Tool Execution → Chat Service calls MCP Service for each tool
- System Operations → MCP Service executes file/command operations
- Result Integration → Chat Service integrates tool results
- Final Generation → Ollama generates final response with tool context
- Response Delivery → UI displays message with tool execution details
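The flow above can be sketched as a simple orchestration loop. This is an illustrative, self-contained sketch only: the class and method names below are hypothetical stand-ins, while the real services are reactive (Reactor `Mono`/`Flux`) and call Ollama and the MCP service over HTTP.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the chat orchestration loop (steps 1-10 above).
public class ChatOrchestrationSketch {

    record ToolCall(String tool, String argument) {}

    // Stand-in for the Ollama call: requests a tool on the first pass,
    // and returns no further tool calls once a tool result is in context.
    static List<ToolCall> generate(List<String> context) {
        boolean hasToolResult = context.stream().anyMatch(m -> m.startsWith("tool:"));
        return hasToolResult ? List.of() : List.of(new ToolCall("list_directory", "."));
    }

    // Stand-in for the MCP client: executes one tool and returns its result.
    static String executeTool(ToolCall call) {
        return "tool:" + call.tool() + " -> [file1.txt, file2.java]";
    }

    static String handleMessage(String userMessage) {
        List<String> context = new ArrayList<>();
        context.add("user:" + userMessage);           // steps 1-3: context preparation
        List<ToolCall> calls = generate(context);     // step 4: LLM generation
        for (ToolCall call : calls) {                 // steps 5-7: tool execution via MCP
            context.add(executeTool(call));
        }
        if (!calls.isEmpty()) {
            generate(context);                        // step 9: final generation with tool context
        }
        return String.join("\n", context);            // step 10: response delivery
    }

    public static void main(String[] args) {
        System.out.println(handleMessage("List the files here"));
    }
}
```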
Purpose: Provides secure, abstracted access to operating system primitives through a well-defined protocol.
Architecture:
src/main/java/com/example/mcpstreaming/
├── controller/
│   └── McpStreamingController.java   # REST API endpoints
├── service/
│   ├── FileOperationService.java     # File system operations
│   ├── CommandExecutionService.java  # System command execution
│   └── GrepService.java              # Pattern matching and search
├── model/
│   ├── McpRequest.java               # Request data models
│   ├── McpResponse.java              # Response data models
│   └── McpStreamChunk.java           # Streaming data chunks
├── websocket/
│   └── McpWebSocketHandler.java      # Real-time WebSocket interface
└── config/
    └── WebSocketConfig.java          # WebSocket configuration
Key Features:
- Secure Operations: Command validation, path sanitization, privilege controls
- Streaming Support: Large file operations with real-time progress
- WebSocket Interface: Real-time bidirectional communication
- Comprehensive Operations: Files, commands, search, pattern matching
- Safety First: Blacklisted dangerous commands, timeout enforcement
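The "safety first" checks can be pictured as a gate in front of every command. The sketch below is a hedged illustration under assumed rules: the real blacklist and validation logic live in `CommandExecutionService`, and the blocked entries here are examples, not the service's actual list.

```java
import java.util.List;

// Illustrative sketch of pre-execution command validation.
public class CommandValidatorSketch {

    // Hypothetical blacklist of destructive command prefixes.
    private static final List<String> BLOCKED_PREFIXES = List.of(
        "rm -rf /", "mkfs", "shutdown", "reboot", "dd if="
    );

    public static boolean isAllowed(String command) {
        String trimmed = command.trim();
        if (trimmed.isEmpty()) return false;
        // Reject anything that starts with a blacklisted destructive command.
        return BLOCKED_PREFIXES.stream().noneMatch(trimmed::startsWith);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("uptime"));    // benign command, allowed
        System.out.println(isAllowed("rm -rf /"));  // blocked
    }
}
```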
Available Operations:
- `list_directory`: List files and directories
- `read_file`: Read file contents with streaming support
- `create_file`: Create new files with content
- `edit_file`: Modify existing files
- `append_file`: Append content to files
- `execute_command`: Run system commands with validation
- `grep`: Search for patterns in files/directories
Purpose: Orchestrates conversation flow, integrates with Llama models, and manages tool calling logic.
Architecture:
chat-service/src/main/java/com/example/chatservice/
├── controller/
│   └── ChatController.java       # Chat API endpoints
├── service/
│   ├── ChatService.java          # Main conversation orchestration
│   ├── OllamaService.java        # Llama model integration
│   ├── McpClientService.java     # MCP service client
│   └── ConversationService.java  # Conversation history management
├── model/
│   ├── ChatMessage.java          # Chat message data model
│   ├── ChatRequest.java          # API request models
│   ├── ChatResponse.java         # API response models
│   ├── ToolCall.java             # Tool calling data structures
│   ├── ToolCallResult.java       # Tool execution results
│   └── OllamaModels.java         # Ollama API integration models
└── config/
    └── application.yml           # Service configuration
Key Features:
- Intelligent Tool Calling: Automatically determines when and how to use tools
- Conversation Management: Maintains context across multiple turns
- Multi-Model Support: Works with various Llama models via Ollama
- Error Recovery: Graceful handling of tool failures and service interruptions
- Performance Monitoring: Request timing, tool usage analytics
Processing Flow:
- Receive User Message: Parse and validate incoming chat requests
- Context Preparation: Load conversation history and prepare context
- LLM Generation: Send to Llama with available tool definitions
- Tool Execution: If tools are called, execute via MCP service
- Result Integration: Incorporate tool results into conversation
- Final Response: Generate final response with complete context
- History Storage: Save conversation for future context
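The "Context Preparation" step typically bounds how much history is sent to the model. The sketch below is an assumption-laden illustration, not the actual `ConversationService` implementation: it simply keeps the most recent N messages so the prompt stays within the model's context window.

```java
import java.util.List;

// Illustrative sketch of trimming conversation history before LLM generation.
public class ContextWindowSketch {

    // Keep only the last maxMessages entries of the history.
    public static List<String> trimHistory(List<String> history, int maxMessages) {
        if (history.size() <= maxMessages) return history;
        return history.subList(history.size() - maxMessages, history.size());
    }

    public static void main(String[] args) {
        List<String> history = List.of("m1", "m2", "m3", "m4", "m5");
        System.out.println(trimHistory(history, 3)); // most recent three messages
    }
}
```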
Purpose: Provides a modern, responsive web interface for interacting with the chat system.
Architecture:
chat-ui/src/
├── components/
│   ├── App.tsx            # Main application component
│   ├── ChatInterface.tsx  # Infinite scroll chat feed
│   ├── ChatMessage.tsx    # Individual message rendering
│   ├── MessageInput.tsx   # Smart input with auto-resize
│   └── Sidebar.tsx        # Conversation and service management
├── hooks/
│   └── useChat.ts         # Main state management hook
├── services/
│   └── chatApi.ts         # Backend API client
├── types/
│   └── chat.ts            # TypeScript type definitions
└── styles/
    └── App.css            # Tailwind CSS configuration
Key Features:
- Infinite Scroll Feed: Smooth scrolling with auto-scroll behavior
- Rich Message Rendering: Markdown support with syntax highlighting
- Tool Execution Visualization: Real-time display of tool calls and results
- Conversation Management: Create, switch, and delete conversations
- Service Monitoring: Live health status of all backend services
- Responsive Design: Works seamlessly on desktop and mobile
- Error Handling: Graceful degradation and recovery mechanisms
User Experience Flow:
- Service Connection: Automatically connects and monitors backend health
- Conversation Creation: Users can start new conversations or continue existing ones
- Message Input: Smart input field with keyboard shortcuts and auto-expand
- Real-time Feedback: Immediate visual feedback for message processing
- Tool Visualization: Clear display of which tools are being executed
- Result Integration: Tool results are seamlessly integrated into conversation flow
- History Management: Easy access to previous conversations and messages
Ollama Integration:
- Local LLM Serving: Runs Llama models locally for privacy and performance
- Model Management: Supports multiple model sizes and configurations
- Tool Calling Protocol: Structured function calling for reliable tool execution
- Performance Optimization: Optimized for local inference with reasonable hardware
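To make the tool calling protocol concrete, the sketch below builds the kind of JSON body a client sends to Ollama's `/api/chat` endpoint with a `tools` array. The tool schema follows Ollama's function-calling format, but this is a hedged sketch: the exact fields the chat service sends may differ.

```java
// Illustrative sketch of an Ollama /api/chat request body with one tool defined.
public class OllamaPayloadSketch {

    public static String buildChatRequest(String model, String userMessage) {
        // Text block with the model name and user message interpolated.
        return """
            {
              "model": "%s",
              "messages": [{"role": "user", "content": "%s"}],
              "stream": false,
              "tools": [{
                "type": "function",
                "function": {
                  "name": "list_directory",
                  "description": "List files and directories",
                  "parameters": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                    "required": ["path"]
                  }
                }
              }]
            }""".formatted(model, userMessage);
    }

    public static void main(String[] args) {
        System.out.println(buildChatRequest("llama3.2:latest", "List the files here"));
    }
}
```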
System Services:
- File System Access: Secure, validated file operations
- Command Execution: Sandboxed system command execution
- Pattern Matching: Efficient search across files and directories
- Process Management: Safe process creation and monitoring
# 1. Java 21+ (Amazon Corretto recommended)
java -version
# 2. Maven 3.8+
mvn -version
# 3. Node.js 18+ and npm
node -v && npm -v
# 4. Ollama with Llama model
ollama --version
# Install and start Ollama
brew install ollama
ollama serve # In one terminal
ollama pull llama3.2:latest # In another terminal
# Start complete system
git clone <repository>
cd java_mcp_streaming
./start-complete-stack.sh
- React UI: http://localhost:3000 (Main interface)
- Chat Service: http://localhost:8081 (API)
- MCP Service: http://localhost:8080 (Tools)
- Ollama: http://localhost:11434 (LLM)
# Set Java version (if using SDKMAN)
sdk use java 21.0.6-amzn
# Set JAVA_HOME (example for Amazon Corretto)
export JAVA_HOME=/Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home
# Verify Java configuration
java -version
echo $JAVA_HOME
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Start Ollama service
ollama serve
# Pull required models
ollama pull llama3.2:latest # Main model (4.3GB)
ollama pull llama3.2:3b # Smaller model (2.0GB)
# Verify installation
curl http://localhost:11434/api/tags
# Build MCP Service
mvn clean compile
# Build Chat Service
cd chat-service
mvn clean compile
cd ..
# Start services individually
./start-service.sh # MCP Service
cd chat-service && ./start-chat-service.sh # Chat Service
cd chat-ui
# Install dependencies
npm install
# Start development server
npm start
# or use the custom script
./start-ui.sh
MCP Service (src/main/resources/application.yml):
mcp:
streaming:
max-concurrent-streams: 10
default-timeout-seconds: 300
security:
validate-commands: true
allow-dangerous-commands: false
Chat Service (chat-service/src/main/resources/application.yml):
chat:
ollama:
base-url: http://localhost:11434
default-model: llama3.2:latest
temperature: 0.7
mcp:
service-url: http://localhost:8080
tools:
enabled: true
max-calls-per-turn: 5
React UI (chat-ui/.env):
REACT_APP_API_URL=http://localhost:8081/api/chat
REACT_APP_ENV=development
- Open http://localhost:3000
- Click "New Conversation"
- Try these example queries:
"List all files in the current directory"
→ Uses list_directory tool
→ Shows file listing in chat
"Read the README.md file and summarize it"
→ Uses read_file tool
→ AI provides intelligent summary
"Create a Python script that prints 'Hello World'"
→ Uses create_file tool
→ Creates actual file on system
"Show me the current system uptime and memory usage"
→ Uses execute_command tool
→ Runs system commands safely
"Find all Java files in this project"
→ Uses grep tool
→ Searches and lists matching files
"Check if port 8080 is in use"
→ Uses execute_command with netstat/lsof
→ Shows port usage information
"Analyze the structure of this Java project"
→ Uses multiple tools (list_directory, read_file, grep)
→ Provides comprehensive code analysis
"Find any TODO comments in the codebase"
→ Uses grep tool with pattern matching
→ Lists all TODO items found
Direct Chat Service Usage:
# Simple chat without tools
curl -X POST http://localhost:8081/api/chat/message \
-H "Content-Type: application/json" \
-d '{"message": "Hello! How are you?"}'
# Chat with tools enabled
curl -X POST http://localhost:8081/api/chat/message \
-H "Content-Type: application/json" \
-d '{"message": "List the files in the current directory", "enable_tools": true}'
# Continue conversation
curl -X POST http://localhost:8081/api/chat/message \
-H "Content-Type: application/json" \
-d '{"message": "Now read the first file", "conversation_id": "conv-123", "enable_tools": true}'
Direct MCP Service Usage:
# List directory via MCP
curl -X POST http://localhost:8080/api/mcp/request \
-H "Content-Type: application/json" \
-d '{"operation": "list_directory", "parameters": {"path": "."}}'
# Execute command via MCP
curl -X POST http://localhost:8080/api/mcp/request \
-H "Content-Type: application/json" \
-d '{"operation": "execute_command", "parameters": {"command": "uptime"}}'
POST /api/chat/message
Content-Type: application/json
{
"message": "Your message here",
"conversation_id": "optional-conversation-id",
"model": "llama3.2:latest",
"enable_tools": true,
"temperature": 0.7,
"max_tokens": 2000
}
Response:
{
"message": {
"id": "msg-123",
"role": "assistant",
"content": "AI response here",
"timestamp": "2024-01-01T12:00:00Z",
"tool_call_results": [
{
"id": "tool-456",
"tool_name": "list_directory",
"success": true,
"result": ["file1.txt", "file2.java"]
}
]
},
"conversation_id": "conv-789",
"model_used": "llama3.2:latest",
"processing_time_ms": 1250
}
# Get conversation history
GET /api/chat/conversation/{conversationId}/history
# Clear conversation
DELETE /api/chat/conversation/{conversationId}
# List active conversations
GET /api/chat/conversations
# Service health
GET /api/chat/health
# Service capabilities
GET /api/chat/capabilities
POST /api/mcp/request
Content-Type: application/json
{
"operation": "operation_name",
"parameters": {
"param1": "value1",
"param2": "value2"
},
"stream": false
}
| Operation | Parameters | Description |
|---|---|---|
| `list_directory` | `path` | List files and directories |
| `read_file` | `path` | Read file contents |
| `create_file` | `path`, `content` | Create new file |
| `edit_file` | `path`, `content` | Edit existing file |
| `append_file` | `path`, `content` | Append to file |
| `execute_command` | `command`, `working_directory`, `timeout_seconds` | Run system command |
| `grep` | `pattern`, `path`, `recursive`, `case_sensitive` | Search for patterns |
POST /api/mcp/stream
Content-Type: application/json
Accept: application/x-ndjson
{
"operation": "read_file",
"parameters": {"path": "/large/file.txt"},
"stream": true
}
const ws = new WebSocket('ws://localhost:8080/ws/mcp');
ws.onopen = () => {
ws.send(JSON.stringify({
operation: "list_directory",
parameters: {path: "."}
}));
};
ws.onmessage = (event) => {
const response = JSON.parse(event.data);
console.log('MCP Response:', response);
};
java_mcp_streaming/
├── src/main/java/com/example/mcpstreaming/   # MCP Streaming Service
│   ├── controller/                           # REST controllers
│   ├── service/                              # Business logic
│   ├── model/                                # Data models
│   ├── websocket/                            # WebSocket handlers
│   └── config/                               # Configuration
├── chat-service/                             # Llama Chat Service
│   └── src/main/java/com/example/chatservice/
│       ├── controller/                       # Chat API controllers
│       ├── service/                          # Chat business logic
│       ├── model/                            # Chat data models
│       └── config/                           # Chat configuration
├── chat-ui/                                  # React UI
│   ├── src/
│   │   ├── components/                       # React components
│   │   ├── hooks/                            # Custom hooks
│   │   ├── services/                         # API clients
│   │   └── types/                            # TypeScript types
│   └── public/                               # Static assets
├── start-service.sh                          # Start MCP service
├── start-all-services.sh                     # Start backend services
├── start-complete-stack.sh                   # Start everything
├── demo-chat.sh                              # Demo script
└── README.md                                 # This file
- Define Operation Logic (service/CustomOperationService.java):
@Service
public class CustomOperationService {
    public Mono<CustomResult> performCustomOperation(String param) {
        // Implementation goes here
        return Mono.empty();
    }
}
- Update Controller (controller/McpStreamingController.java):
case "custom_operation" -> {
String param = getStringParameter(request, "param");
yield customOperationService.performCustomOperation(param)
.map(result -> new McpResponse(request.getId(), result));
}
- Add to Operations List:
operations.put("custom_operation", Map.of(
"description", "Performs a custom operation",
"parameters", Map.of("param", "string - parameter description"),
"streaming", false
));
- Extend Chat Service (service/ChatService.java):
public Mono<CustomResponse> customChatFeature(CustomRequest request) {
    // Implementation goes here
    return Mono.empty();
}
- Update Controller (controller/ChatController.java):
@PostMapping("/custom-feature")
public Mono<ResponseEntity<CustomResponse>> customFeature(@RequestBody CustomRequest request) {
return chatService.customChatFeature(request)
.map(ResponseEntity::ok);
}
- Update Frontend (services/chatApi.ts):
async customFeature(request: CustomRequest): Promise<CustomResponse> {
return this.fetchWithErrorHandling(`${API_BASE_URL}/custom-feature`, {
method: 'POST',
body: JSON.stringify(request),
});
}
- Create Component (components/CustomComponent.tsx):
interface CustomComponentProps {
data: CustomData;
onAction: (action: string) => void;
}
const CustomComponent: React.FC<CustomComponentProps> = ({ data, onAction }) => {
return (
<div className="custom-component">
{/* Component implementation */}
</div>
);
};
- Update State Management (hooks/useChat.ts):
const [customState, setCustomState] = useState<CustomState>({});
const customAction = useCallback(async (param: string) => {
// Custom action logic
}, []);
return {
// ... existing state and actions
customState,
customAction,
};
Java Version Issues:
# Check Java version (must be 21+)
java -version
# If wrong version, install correct one
sdk install java 21.0.6-amzn
sdk use java 21.0.6-amzn
# Set JAVA_HOME
export JAVA_HOME=$(sdk home java 21.0.6-amzn)
Port Conflicts:
# Check what's using ports
lsof -i :8080 # MCP Service
lsof -i :8081 # Chat Service
lsof -i :3000 # React UI
lsof -i :11434 # Ollama
# Kill processes if needed
kill -9 <PID>
Service Not Running:
# Check if Ollama is running
curl http://localhost:11434/api/tags
# Start Ollama if not running
ollama serve
# Check available models
ollama list
# Pull models if missing
ollama pull llama3.2:latest
UI Won't Load:
# Check Node.js version
node -v # Should be 18+
# Clear npm cache
npm cache clean --force
# Reinstall dependencies
rm -rf node_modules package-lock.json
npm install
# Check for port conflicts
lsof -i :3000
Backend Services:
# MCP Service with debug
mvn spring-boot:run -Dspring-boot.run.arguments="--logging.level.com.example.mcpstreaming=DEBUG"
# Chat Service with debug
cd chat-service
mvn spring-boot:run -Dspring-boot.run.arguments="--logging.level.com.example.chatservice=DEBUG"
Frontend:
# React with debug info
REACT_APP_DEBUG=true npm start
Health Checks:
# Comprehensive health check
curl http://localhost:8080/api/mcp/health | jq .
curl http://localhost:8081/api/chat/health | jq .
curl http://localhost:11434/api/tags | jq .
# Service capabilities
curl http://localhost:8081/api/chat/capabilities | jq .
- Fork the repository
- Create a feature branch:
git checkout -b feature/amazing-feature
- Follow coding standards:
- Java: Google Java Style
- TypeScript: ESLint + Prettier
- Commit messages: Conventional Commits
- Add tests for new functionality
- Update documentation as needed
- Submit a pull request
Write Tests For:
- All public API endpoints
- Core business logic methods
- Error handling scenarios
- UI component interactions
- Integration between services
This project is provided as a demonstration of integrating Llama language models with MCP tool execution and modern web interfaces.
Use Case Scenarios:
- Development Tools: AI-powered development assistants
- System Administration: Intelligent system management interfaces
- Data Analysis: AI assistants for data exploration and analysis
- Educational: Learning about AI integration architectures
- Research: Foundation for AI agent research projects
Ready to dive in? Here's the fastest path to a working system:
# 1. Prerequisites check
java -version # Need 21+
node -v # Need 18+
ollama --version
# 2. Quick setup
git clone <this-repository>
cd java_mcp_streaming
# 3. Start Ollama (in separate terminal)
ollama serve
ollama pull llama3.2:latest
# 4. Start everything
./start-complete-stack.sh
# 5. Open browser
open http://localhost:3000
# 6. Try it out
# Type: "List the files in the current directory"
# Watch the AI use tools to complete your request!
Need help? Check the troubleshooting section or run ./demo-chat.sh to test your setup.
Questions? The system provides extensive logging and health checks to help debug any issues.
Want to extend it? Check the development guide for adding new features.
Welcome to the future of AI-powered system interaction!