A pragmatic, ready-to-run stack showcasing JVM services for AI/RAG with observability and auth.
- gateway/ – Spring Boot WebFlux app
  - /api/chat – SSE chat proxied to Ollama (default: llama3.1)
  - /api/embed/upsert – upsert a doc (text) with embeddings into PGVector
  - /api/search – vector similarity search (PGVector)
  - Micrometer Prometheus metrics + OTel tracing
- onnx-service/ – Minimal Spring Boot service using ONNX Runtime Java for image classification demo
- db/ – Postgres with pgvector extension and sample schema
- infra/
  - Keycloak realm & test client
  - Prometheus + Grafana (pre-provisioned Prometheus datasource)
  - OpenTelemetry Collector (OTLP gRPC) + Jaeger for traces
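To make "vector similarity search" concrete, here is a minimal in-memory sketch of top-k retrieval by cosine similarity. It is only an illustration: the class `VectorSearchSketch`, its method names, and the two-dimensional toy vectors are hypothetical, and the gateway itself delegates this work to pgvector in Postgres. Which distance operator the gateway's query uses is not stated here; cosine is assumed for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** In-memory illustration of top-k cosine-similarity search (not the gateway's code). */
final class VectorSearchSketch {

    /** Cosine similarity of two equal-length vectors; 0.0 if either has zero norm. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the ids of the k documents most similar to the query, best match first. */
    static List<String> topK(Map<String, double[]> docs, double[] query, int k) {
        List<Map.Entry<String, double[]>> entries = new ArrayList<>(docs.entrySet());
        // Sort by descending similarity (negate so the largest similarity comes first).
        entries.sort(Comparator.comparingDouble(
                (Map.Entry<String, double[]> e) -> -cosine(e.getValue(), query)));
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            ids.add(entries.get(i).getKey());
        }
        return ids;
    }

    public static void main(String[] args) {
        // Toy 2-d "embeddings"; real embedding vectors have hundreds of dimensions.
        Map<String, double[]> docs = Map.of(
                "doc-1", new double[] {1.0, 0.0},
                "doc-2", new double[] {0.0, 1.0},
                "doc-3", new double[] {0.7, 0.7});
        System.out.println(topK(docs, new double[] {1.0, 0.1}, 2)); // prints [doc-1, doc-3]
    }
}
```

In the real stack the same ranking would happen inside the database, where an index can avoid scanning every row; the linear scan above is only practical for small collections.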
Requirements: Docker 24+, Docker Compose plugin.
    docker compose up -d --build

Then:
- Gateway API: http://localhost:8080
- SSE Chat (browser-friendly): http://localhost:8080/demo.html
- Health: http://localhost:8080/actuator/health
- Metrics: http://localhost:8080/actuator/prometheus
- ONNX service: http://localhost:9090/actuator/health
- Postgres (pgvector): localhost:5432 (user: postgres, pass: postgres, db: ragdb)
- Prometheus: http://localhost:9091
- Grafana: http://localhost:3000 (login: admin/admin)
- Jaeger UI: http://localhost:16686
- Keycloak: http://localhost:8081 (login: admin/admin123)
- Ollama API: http://localhost:11434 (models pulled on first run)
Note on models: The first compose up will pull Ollama models (llama3.1 for chat, nomic-embed-text for embeddings). This can take a while depending on bandwidth. You can change models in docker-compose.yml or gateway/src/main/resources/application.yml.
Open in browser:
http://localhost:8080/demo.html
Or curl:
    curl -N "http://localhost:8080/api/chat?q=Say%20hello%20in%20one%20sentence"

Upsert a document, then search:

    curl -X POST http://localhost:8080/api/embed/upsert -H "Content-Type: application/json" -d '{"id":"doc-1","content":"Wi‑Fi 7 (802.11be) brings MLO and 320 MHz channels."}'
    curl "http://localhost:8080/api/search?q=What%20is%20Wi-Fi%207%3F&k=3"

ONNX classify:

    curl "http://localhost:9090/api/onnx/classify?imageUrl=https://raw.githubusercontent.com/pytorch/hub/master/images/dog.jpg"

The classify call returns the indices of the top-5 logits (no labels bundled).
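The upsert and search calls can also be issued from Java with the JDK's built-in `java.net.http` client. The sketch below is a hypothetical helper, not part of the repository: the class `GatewayClientSketch` and its methods are made up for illustration, the response format is not assumed (bodies are printed as raw strings), and the hand-rolled JSON escaping covers only quotes and backslashes, so a real client would use a JSON library. Its main point is that query text must be percent-encoded, including the trailing `?`.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical client for the gateway's upsert and search endpoints. */
final class GatewayClientSketch {
    static final String BASE = "http://localhost:8080";

    /** Percent-encodes a query value; URLEncoder emits '+' for spaces, so swap in %20. */
    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    /** Builds the /api/search URI with an encoded question and result count k. */
    static URI searchUri(String question, int k) {
        return URI.create(BASE + "/api/search?q=" + enc(question) + "&k=" + k);
    }

    /** Minimal JSON string escaping (backslash and double quote only). */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds the POST request for /api/embed/upsert with an {id, content} JSON body. */
    static HttpRequest upsertRequest(String id, String content) {
        String json = "{\"id\":\"" + escape(id) + "\",\"content\":\"" + escape(content) + "\"}";
        return HttpRequest.newBuilder(URI.create(BASE + "/api/embed/upsert"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Requires the stack to be running locally (docker compose up -d --build).
        HttpClient client = HttpClient.newHttpClient();

        HttpResponse<String> upsert = client.send(
                upsertRequest("doc-1", "Wi-Fi 7 (802.11be) brings MLO and 320 MHz channels."),
                HttpResponse.BodyHandlers.ofString());
        System.out.println("upsert: HTTP " + upsert.statusCode());

        HttpRequest search = HttpRequest.newBuilder(searchUri("What is Wi-Fi 7?", 3)).GET().build();
        HttpResponse<String> result = client.send(search, HttpResponse.BodyHandlers.ofString());
        System.out.println("search: " + result.body());
    }
}
```

Building the URI through `enc` avoids the two pitfalls of hand-written query strings: raw spaces and an unescaped `?` inside the question text.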
- Edit docker-compose.yml to switch models or credentials.
- gateway uses virtual threads and exports metrics/traces to Prometheus/OTel by default.
- Keycloak includes a jvm-ai realm and a public client gateway-local (for local dev).
- Rebuild just one service:
    docker compose build gateway && docker compose up -d gateway
- Tail logs:
    docker compose logs -f gateway onnx-service ollama postgres otel-collector
- This stack is for local demo/dev. Replace default passwords, harden configurations, and lock down ports before any real deployment.