HelloClipJ - Semantic Photo Search (Java + CLIP)

A modern semantic photo search engine built with Java 8, Spring Boot, and ONNX Runtime. It uses OpenAI's CLIP model to enable searching for images using natural language or other images.

Screenshot

How it Works

This project is a practical implementation of the concepts described in our detailed guide.

Features

  • Semantic Text Search: Search your photo collection using descriptive text (e.g., "sunset at the beach").
  • Search by Image: Upload an image to find visually and conceptually similar photos in your collection.
  • Spring Boot & Jetty: Lightweight web application served by an embedded Jetty server.
  • 3-in-a-Row Grid: Modern, responsive UI for browsing search results.

Prerequisites

  • JDK 8 or higher
  • Maven 3.6+
  • Python 3.8+ (for model conversion)

Getting Started

1. Model Preparation

First, you need to download and export the CLIP models to ONNX format. Use the provided Python script:

pip install -r requirements.txt
python export_clip.py

This will create the following files in src/main/resources/models/:

  • text_model.onnx
  • visual_model.onnx
  • tokenizer.json

Note

The models are exported as single, self-contained files (weights included) to allow loading from Java resources.
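Because the files are self-contained, the model bytes can be read straight from the classpath and handed to ONNX Runtime as a byte array. Here is a minimal sketch of that loading step; it is illustrative rather than this project's actual code, and the ModelLoader class name is an assumption:

import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtSession;

import java.io.ByteArrayOutputStream;
import java.io.InputStream;

public class ModelLoader {

    // Reads an ONNX model packaged under src/main/resources/models/ into memory.
    static OrtSession loadSession(OrtEnvironment env, String resource) throws Exception {
        try (InputStream in = ModelLoader.class.getResourceAsStream(resource)) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            for (int n; (n = in.read(chunk)) != -1; ) {
                buf.write(chunk, 0, n);
            }
            // This only works because the export produces single, self-contained
            // files; a model with external weight files cannot load from a byte[].
            return env.createSession(buf.toByteArray(), new OrtSession.SessionOptions());
        }
    }

    public static void main(String[] args) throws Exception {
        OrtEnvironment env = OrtEnvironment.getEnvironment();
        try (OrtSession textModel = loadSession(env, "/models/text_model.onnx")) {
            System.out.println("Text model inputs: " + textModel.getInputNames());
        }
    }
}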

2. Build and Run

Build the fat JAR and start the Spring Boot application:

mvn clean package
java -jar target/helloclipj-1.0-SNAPSHOT.jar

Alternatively, run directly with Maven:

mvn spring-boot:run

3. Usage

  • Add your photos to the src/main/resources/photos directory before starting the app so they are indexed at startup.
  • Open your browser and navigate to http://localhost:8080.
  • Use the text input for semantic search (see the ranking sketch after this list).
  • Use the "Search by Image" button to perform a reverse image search.
  • Click "Clear" to reset the search and view all photos.

Technical Details

  • CLIP (Contrastive Language-Image Pre-training): A neural network trained on a wide variety of (image, text) pairs; a sketch of its standard image preprocessing follows this list.
  • ONNX Runtime: Used for high-performance inference of the CLIP models in Java.
  • Spring Boot 2.7.18: Used for the backend API and serving the frontend, maintaining compatibility with Java 8.
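For reference, CLIP's visual encoder expects a 224x224 RGB image normalized per channel with CLIP's published mean and standard deviation, laid out as an NCHW float tensor. The sketch below assumes the standard OpenAI preprocessing constants and uses a naive resize; verify both against the exported model before relying on it.

import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class Preprocess {

    static final int SIZE = 224;
    // Per-channel mean/std published for OpenAI CLIP image preprocessing.
    static final float[] MEAN = {0.48145466f, 0.4578275f, 0.40821073f};
    static final float[] STD  = {0.26862954f, 0.26130258f, 0.27577711f};

    // Converts an image into a 1x3x224x224 NCHW float buffer for the visual model.
    static float[] toTensor(BufferedImage src) {
        BufferedImage img = new BufferedImage(SIZE, SIZE, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        g.drawImage(src, 0, 0, SIZE, SIZE, null); // naive stretch; CLIP proper uses bicubic resize + center crop
        g.dispose();

        float[] out = new float[3 * SIZE * SIZE];
        int plane = SIZE * SIZE;
        for (int y = 0; y < SIZE; y++) {
            for (int x = 0; x < SIZE; x++) {
                int rgb = img.getRGB(x, y);
                int idx = y * SIZE + x;
                out[idx]             = (((rgb >> 16) & 0xFF) / 255f - MEAN[0]) / STD[0]; // R
                out[plane + idx]     = (((rgb >> 8) & 0xFF)  / 255f - MEAN[1]) / STD[1]; // G
                out[2 * plane + idx] = ((rgb & 0xFF)         / 255f - MEAN[2]) / STD[2]; // B
            }
        }
        return out;
    }
}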

License

This project's code is licensed under the MIT License.

Photo Assets Notice

Important

The example images located in src/main/resources/photos are NOT covered by the MIT License. They are provided for demonstration purposes only and are restricted to non-commercial, educational use. (Photos by Valerii Konchin) See src/main/resources/photos/NOTICE.md for more details.
