codingkiddo/jvm-ai-stack

JVM × AI Stack (Docker Compose)

A pragmatic, ready-to-run stack showcasing JVM services for AI/RAG with observability and auth.

What’s inside

  • gateway/ – Spring Boot WebFlux app
    • /api/chat – SSE chat proxied to Ollama (default: llama3.1)
    • /api/embed/upsert – upsert a doc (text) with embeddings into PGVector
    • /api/search – vector similarity search (PGVector)
    • Micrometer Prometheus metrics + OTel tracing
  • onnx-service/ – Minimal Spring Boot service using ONNX Runtime Java for image classification demo
  • db/ – Postgres with pgvector extension and sample schema
  • infra/
    • Keycloak realm & test client
    • Prometheus + Grafana (pre-provisioned Prometheus datasource)
    • OpenTelemetry Collector (OTLP gRPC) + Jaeger for traces
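To make the "vector similarity search" behind /api/search concrete, here is a minimal sketch of cosine similarity between two embedding vectors in plain Java. It is only an illustration: in this stack the comparison runs inside Postgres through pgvector, and this README does not say which distance operator the gateway uses, so cosine is an assumption. The class name CosineSimilarity is hypothetical.

```java
final class CosineSimilarity {
    private CosineSimilarity() {}

    /** Returns the cosine of the angle between a and b, a value in [-1, 1]. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            // Accumulate in double to limit rounding error on long vectors.
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            throw new IllegalArgumentException("cosine is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

pgvector's `<=>` operator returns cosine distance, which is 1 minus this value, so a smaller distance means a closer match.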

Quick start

Requirements: Docker 24+, Docker Compose plugin.

docker compose up -d --build

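Once the containers are up, the gateway answers on http://localhost:8080 and the ONNX service on http://localhost:9090 (the ports used in the examples under Minimal usage).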

Note on models: The first compose up will pull Ollama models (llama3.1 for chat, nomic-embed-text for embeddings). This can take a while depending on bandwidth. You can change models in docker-compose.yml or gateway/src/main/resources/application.yml.

Minimal usage

1) Stream chat tokens (SSE)

Open in browser:

http://localhost:8080/demo.html

Or curl:

curl -N "http://localhost:8080/api/chat?q=Say%20hello%20in%20one%20sentence"
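For a JVM client, the same stream can be read with the JDK's built-in java.net.http.HttpClient (Java 11+). The sketch below assumes the endpoint emits standard SSE `data:` lines; the exact shape of each token payload is not documented here, so the payload is printed as-is. The class name ChatStreamClient is hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;

final class ChatStreamClient {
    private ChatStreamClient() {}

    /** Extracts the payload of one SSE "data:" line; empty for blank lines, comments, and other fields. */
    static Optional<String> dataPayload(String line) {
        if (!line.startsWith("data:")) {
            return Optional.empty();
        }
        String value = line.substring("data:".length());
        // The SSE format drops a single leading space after the colon.
        return Optional.of(value.startsWith(" ") ? value.substring(1) : value);
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://localhost:8080/api/chat?q=Say%20hello%20in%20one%20sentence"))
            .header("Accept", "text/event-stream")
            .build();
        // ofLines() hands over lines as they arrive, so tokens print incrementally.
        client.send(request, HttpResponse.BodyHandlers.ofLines())
            .body()
            .forEach(line -> dataPayload(line).ifPresent(System.out::print));
        System.out.println();
    }
}
```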

2) Upsert a doc → PGVector

curl -X POST http://localhost:8080/api/embed/upsert \
  -H "Content-Type: application/json" \
  -d '{"id":"doc-1","content":"Wi‑Fi 7 (802.11be) brings MLO and 320 MHz channels."}'
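The same upsert can be issued from Java. This sketch builds the request with java.net.http and mirrors the `id` and `content` fields of the curl example; the class and method names are hypothetical, and the hand-rolled JSON escaping is there only to keep the example free of dependencies.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class UpsertRequests {
    private UpsertRequests() {}

    /** Escapes the characters that may not appear raw inside a JSON string. */
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds the JSON body with the two fields used in the curl example. */
    static String body(String id, String content) {
        return "{\"id\":\"" + jsonEscape(id) + "\",\"content\":\"" + jsonEscape(content) + "\"}";
    }

    /** Builds (but does not send) the POST request; ofString encodes the body as UTF-8. */
    static HttpRequest build(String baseUrl, String id, String content) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/embed/upsert"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body(id, content)))
            .build();
    }
}
```

To send it, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. In a real project a JSON library such as Jackson would replace the manual escaping.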

3) Vector search

curl "http://localhost:8080/api/search?q=What%20is%20Wi-Fi%207%3F&k=3"
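Query text containing spaces, question marks, or non-ASCII characters has to be percent-encoded before it goes into the URL. A minimal sketch of building the search URI in Java follows; the `q` and `k` parameter names come from the curl example, while the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class SearchUris {
    private SearchUris() {}

    /** Builds the /api/search URI with a percent-encoded query and a result count k. */
    static URI searchUri(String baseUrl, String query, int k) {
        // URLEncoder produces form encoding, which writes spaces as '+'.
        // A literal '+' in the input is already encoded as %2B, so this replace is safe.
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8).replace("+", "%20");
        return URI.create(baseUrl + "/api/search?q=" + q + "&k=" + k);
    }
}
```

For example, `SearchUris.searchUri("http://localhost:8080", "What is Wi-Fi 7?", 3)` yields `http://localhost:8080/api/search?q=What%20is%20Wi-Fi%207%3F&k=3`.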

4) ONNX classification demo

curl "http://localhost:9090/api/onnx/classify?imageUrl=https://raw.githubusercontent.com/pytorch/hub/master/images/dog.jpg"

Returns the indices of the five highest logits (no class labels are bundled).
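To show what "top-5 logit indices" means, here is a small sketch that picks the indices of the k largest values from a raw logits array. How the service itself computes them is not shown in this README, so this is an illustration rather than its implementation; the class name TopKLogits is hypothetical.

```java
import java.util.Comparator;
import java.util.stream.IntStream;

final class TopKLogits {
    private TopKLogits() {}

    /** Returns the indices of the k largest logits, highest first; ties keep the lower index first. */
    static int[] topK(float[] logits, int k) {
        return IntStream.range(0, logits.length)
            .boxed()
            // Sort indices by their logit value, descending (the sort is stable).
            .sorted(Comparator.comparingDouble((Integer i) -> logits[i]).reversed())
            .limit(k)
            .mapToInt(Integer::intValue)
            .toArray();
    }
}
```

With logits `{0.1f, 2.5f, -1f, 2.0f, 0.3f}` and k = 3, the result is `[1, 3, 4]`. Turning an index into a class name requires a label file that matches the model.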

Env & config

  • Edit docker-compose.yml to switch models or credentials.
  • gateway uses virtual threads and, by default, exposes Micrometer metrics for Prometheus to scrape and exports traces to the OpenTelemetry Collector.
  • Keycloak includes a jvm-ai realm and a public client gateway-local (for local dev).
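As background on the virtual threads mentioned above, this sketch runs blocking tasks on one virtual thread each using the standard Java 21 API. It requires Java 21 or newer. How the gateway actually enables virtual threads is not stated here (Spring Boot 3.2+ offers the `spring.threads.virtual.enabled` property for this); the class name VirtualThreadDemo is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class VirtualThreadDemo {
    private VirtualThreadDemo() {}

    /** Runs n blocking tasks, one virtual thread each, and returns how many ran on a virtual thread. */
    static int runBlockingTasks(int n) {
        List<Future<Boolean>> results = new ArrayList<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < n; i++) {
                results.add(executor.submit(() -> {
                    Thread.sleep(10); // stands in for blocking I/O such as a DB or HTTP call
                    return Thread.currentThread().isVirtual();
                }));
            }
        } // close() waits for all submitted tasks to finish
        int virtual = 0;
        try {
            for (Future<Boolean> f : results) {
                if (f.get()) {
                    virtual++;
                }
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return virtual;
    }
}
```

Because a blocked virtual thread releases its carrier thread, many such tasks can wait concurrently without a large platform thread pool.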

Dev tips

  • Rebuild just one service:
    docker compose build gateway && docker compose up -d gateway
  • Tail logs:
    docker compose logs -f gateway onnx-service ollama postgres otel-collector

Security notes

  • This stack is for local demo/dev. Replace default passwords, harden configurations, and lock down ports before any real deployment.
