Semantic search engine for an e-commerce product catalog. Replaces traditional keyword search with vector embedding-based semantic search, allowing users to find products using natural language queries like "outfit for a rainy hike in Nairobi".
- Java 25
- Spring Boot 3.2
- LangChain4j — local AllMiniLmL6V2 quantized embedding model (384 dimensions, no external API required)
- Milvus — vector database for storing and querying embeddings with cosine similarity
- PostgreSQL — relational database for product catalog
- JUnit 5 + Testcontainers — integration testing with real containers
- Docker + Docker Compose — local infrastructure
- `controller/`: REST API endpoints
- `service/`: Application logic, orchestration
- `embedding/`: LangChain4j embedding generation
- `vector/`: Milvus vector store operations
- `repository/`: Spring Data JPA for PostgreSQL
- `entity/`: JPA product entity
- `dto/`: Request and response records
- `config/`: Spring beans for Milvus and embedding model
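
As an illustration of the `dto/` package, here is a minimal sketch of what a request record could look like. The field names mirror the JSON request body of `POST /api/products`. The validation rules in the compact constructor are assumptions made for this example, and the record is not a copy of the project's actual `ProductRequest` class.

```java
import java.math.BigDecimal;

// Hypothetical sketch of a request record; field names follow the JSON request body.
record ProductRequest(String name, String description, String category,
                      BigDecimal price, String location) {

    // Compact constructor. These validation rules are assumptions for illustration,
    // not rules taken from the project.
    ProductRequest {
        if (name == null || name.isBlank()) {
            throw new IllegalArgumentException("name must not be blank");
        }
        if (price == null || price.signum() < 0) {
            throw new IllegalArgumentException("price must be zero or positive");
        }
    }
}
```

Using `BigDecimal` for `price` avoids binary floating-point rounding on monetary values; whether the project does the same is not stated here.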
POST /api/products
Request body:
{
"name": "Waterproof Hiking Jacket",
"description": "Durable jacket designed for rainy mountain conditions",
"category": "Outdoor Apparel",
"price": 89.99,
"location": "Nairobi"
}

Response (201 Created):
{
"id": 1,
"name": "Waterproof Hiking Jacket",
"description": "Durable jacket designed for rainy mountain conditions",
"category": "Outdoor Apparel",
"price": 89.99,
"location": "Nairobi",
"createdTimestamp": "2026-03-11T10:00:00"
}

GET /api/products?page=0&size=20

GET /api/search?q=outfit+for+a+rainy+hike+in+Nairobi&topK=10

Response:
[
{
"product": {
"id": 1,
"name": "Waterproof Hiking Jacket",
...
},
"similarityScore": 0.94
}
]

Results are ranked by cosine similarity score (highest first).
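
For callers written in Java, the search endpoint can be reached with the JDK's built-in HTTP client. The sketch below only builds the request, so it runs without a live server. The class name `SearchRequests` and the base URL `http://localhost:8080` are assumptions for a local setup.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the project.
final class SearchRequests {

    // Assumed base URL for a locally running instance.
    static final String BASE_URL = "http://localhost:8080";

    // Builds a GET request for /api/search with a form-encoded query and topK.
    static HttpRequest search(String query, int topK) {
        // URLEncoder applies form encoding, so spaces become '+'.
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        URI uri = URI.create(BASE_URL + "/api/search?q=" + q + "&topK=" + topK);
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
    }
}
```

To send the request, pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and parse the returned JSON array with the JSON library of your choice.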
- Java 25
- Maven 3.9+
- Docker and Docker Compose
cd docker
docker compose up -d

Wait for all services to be healthy (Milvus takes approximately 30-60 seconds to initialize):
docker compose ps

./mvnw spring-boot:run

The application starts on port 8080 and automatically:
- Connects to PostgreSQL and creates the `products` table
- Connects to Milvus and creates the `product_embeddings` collection with a COSINE index
- Loads the embedding collection into memory for search
./scripts/seed-products.sh

This inserts 15 sample products across categories like outdoor apparel, footwear, and hiking equipment.
curl "http://localhost:8080/api/search?q=outfit+for+a+rainy+hike+in+Nairobi"
curl "http://localhost:8080/api/search?q=cheap+hiking+jacket"
curl "http://localhost:8080/api/search?q=lightweight+travel+backpack"
curl "http://localhost:8080/api/search?q=running+shoes+for+long+distance"

Mount the docker/application.yml as the Spring configuration to use Docker network hostnames:
./mvnw package -DskipTests
docker run --rm \
  --network docker_neural-search \
  -p 8080:8080 \
  -v "$(pwd)/docker/application.yml:/workspace/config/application.yml" \
  -e SPRING_CONFIG_LOCATION=file:/workspace/config/ \
  neural-search-java:0.0.1-SNAPSHOT

Integration tests use Testcontainers to spin up PostgreSQL and Milvus standalone containers automatically. No external infrastructure is required.
./mvnw verify

The test suite verifies:
- Product ingestion stores records in PostgreSQL
- Embedding generation stores vectors in Milvus
- Semantic search returns relevant results for natural language queries
- Search results are ranked by similarity score
| Property | Default | Description |
|---|---|---|
| `spring.datasource.url` | `jdbc:postgresql://localhost:5432/neuralsearch` | PostgreSQL connection URL |
| `spring.datasource.username` | `neuralsearch` | PostgreSQL username |
| `spring.datasource.password` | `neuralsearch` | PostgreSQL password |
| `milvus.host` | `localhost` | Milvus host |
| `milvus.port` | `19530` | Milvus gRPC port |
| `milvus.collection-name` | `product_embeddings` | Milvus collection name |
| `milvus.embedding-dimension` | `384` | Vector dimension (matches AllMiniLmL6V2) |
All properties support environment variable overrides using the ${ENV_VAR:default} syntax documented in application.yml.
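
To make the `${ENV_VAR:default}` behaviour concrete, the sketch below mimics the fallback rule in plain Java: use the environment value when it is set, otherwise the default after the first colon. It is a simplified illustration, not Spring's actual resolver (which also handles nested placeholders and multiple property sources). The variable names `MILVUS_HOST` and `DB_URL` are hypothetical; the real names are those listed in application.yml.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified illustration of ${NAME:default} resolution; not Spring's implementation.
final class PlaceholderSketch {

    // Matches ${NAME} or ${NAME:default}. The name stops at the first ':' so
    // defaults may themselves contain colons (for example a JDBC URL).
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    static String resolve(String value, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(1);
            String fallback = m.group(2); // null when no default was given
            String resolved = env.get(name);
            if (resolved == null) {
                if (fallback == null) {
                    // No value and no default: fail loudly rather than guess.
                    throw new IllegalArgumentException("Could not resolve placeholder: " + name);
                }
                resolved = fallback;
            }
            m.appendReplacement(out, Matcher.quoteReplacement(resolved));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Calling `PlaceholderSketch.resolve("${MILVUS_HOST:localhost}", System.getenv())` yields `localhost` unless the hypothetical `MILVUS_HOST` variable is set in the environment.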
- On product ingestion, the system concatenates `name + description + category` and passes the text to the AllMiniLmL6V2 quantized model (bundled in the JAR, no API key required), producing a 384-dimensional float vector.
- The vector is stored in Milvus under the `product_embeddings` collection alongside the product ID.
- On search, the user query is embedded with the same model, producing a query vector.
- Milvus performs an approximate nearest neighbor search using COSINE similarity and returns the top-K most similar product IDs.
- Product details are fetched from PostgreSQL by ID and returned alongside their similarity scores, ranked highest first.
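
The ranking step above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the project's code: it swaps Milvus's approximate index for an exact brute-force scan, and it joins the product fields with a single space, a separator the description does not specify. The class and record names are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: exact cosine ranking over an in-memory map of vectors.
final class CosineRankingSketch {

    record Scored(long productId, double score) {}

    // Text embedded at ingestion: name + description + category.
    // The single-space separator is an assumption.
    static String embeddingText(String name, String description, String category) {
        return String.join(" ", name, description, category);
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns NaN for a zero vector.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Exact top-K by descending similarity; Milvus answers the same question
    // approximately, using an index instead of a full scan.
    static List<Scored> topK(float[] query, Map<Long, float[]> vectors, int k) {
        return vectors.entrySet().stream()
                .map(e -> new Scored(e.getKey(), cosine(query, e.getValue())))
                .sorted(Comparator.comparingDouble(Scored::score).reversed())
                .limit(k)
                .toList();
    }
}
```

In the real pipeline the vectors have 384 dimensions and come from the embedding model; the returned product IDs would then be used to load product details from PostgreSQL.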
neural-search-java/
├── docker/
│   ├── docker-compose.yml          Infrastructure: PostgreSQL, etcd, Milvus standalone
│   └── application.yml             Spring config for containerised deployment
├── scripts/
│   └── seed-products.sh            Seed script for 15 sample products
├── src/
│   ├── main/java/com/neuralsearch/
│   │   ├── NeuralSearchApplication.java
│   │   ├── config/
│   │   │   ├── EmbeddingModelConfig.java
│   │   │   ├── MilvusConfig.java
│   │   │   └── MilvusProperties.java
│   │   ├── controller/
│   │   │   ├── ProductController.java
│   │   │   └── SearchController.java
│   │   ├── dto/
│   │   │   ├── PagedResponse.java
│   │   │   ├── ProductRequest.java
│   │   │   ├── ProductResponse.java
│   │   │   └── SearchResult.java
│   │   ├── embedding/
│   │   │   └── ProductEmbeddingService.java
│   │   ├── entity/
│   │   │   └── Product.java
│   │   ├── repository/
│   │   │   └── ProductRepository.java
│   │   ├── service/
│   │   │   ├── ProductService.java
│   │   │   └── SearchService.java
│   │   └── vector/
│   │       ├── MilvusVectorStore.java
│   │       └── VectorSearchResult.java
│   └── main/resources/
│       └── application.yml
└── src/test/java/com/neuralsearch/integration/
    ├── AbstractIntegrationTest.java
    ├── ProductIngestionIntegrationTest.java
    └── SemanticSearchIntegrationTest.java