The Multi-Modal Text/Image Search using CLIP project adds CLIP-powered search to Weaviate, allowing users to find images using natural-language descriptions. Because CLIP embeds text and images into the same vector space, searches are multi-modal: users can describe an image in words or provide an image directly for a contextual, example-based search. The demo ships with a simple, customizable frontend and supports various image formats.
This example application spins up a Weaviate instance using the multi2vec-clip module, imports a few sample images (you can add your own images, too!), and provides a very simple search frontend built in React using the Weaviate JS Client.
Model Credits: This demo uses the clip-ViT-B32-multilingual-v1 model from SBERT.net. Shoutout to Nils Reimers and his colleagues for the great Sentence Transformers models.
- Docker & Docker Compose: Required to set up the Weaviate instance.
- Bash: Necessary for executing the provided setup scripts.
- Node.js and npm/yarn: Optional for running the frontend locally.
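Before running the steps below, it may help to see roughly how Weaviate and the CLIP inference container are wired together. The repository ships its own `docker-compose.yml`; the sketch below is an assumption based on the Weaviate documentation for the multi2vec-clip module, not the repo's actual file:

```yaml
# Sketch of a docker-compose.yml pairing Weaviate with the CLIP inference module.
# Image tags and environment values are illustrative -- use the file in the repo.
services:
  weaviate:
    image: semitechnologies/weaviate
    ports:
      - "8080:8080"
    environment:
      DEFAULT_VECTORIZER_MODULE: multi2vec-clip
      ENABLE_MODULES: multi2vec-clip
      CLIP_INFERENCE_API: http://multi2vec-clip:8080
  multi2vec-clip:
    image: semitechnologies/multi2vec-clip:sentence-transformers-clip-ViT-B-32-multilingual-v1
    environment:
      ENABLE_CUDA: "0"
```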
- Start up Weaviate using

  ```shell
  docker-compose up -d
  ```

- Import the schema (the script will wait for Weaviate to be ready) using

  ```shell
  bash ./import/curl/create_schema.sh
  ```

- Import the images using

  ```shell
  bash ./import/curl/import.sh
  ```

- To run the frontend, navigate to the `./frontend` folder and run `yarn && yarn start`. Wait for your browser to open at `http://localhost:3000`.
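Under the hood, the frontend searches via Weaviate's GraphQL API. As a rough sketch of what a text-to-image query looks like, here is a `nearText` payload being built by hand (the class name `ClipImage` and property `filename` are assumptions made for illustration; check `./import/curl/create_schema.sh` for the names the demo actually uses):

```shell
# Build the GraphQL payload for a text -> image search (nearText).
# "ClipImage" and "filename" are assumed names, not taken from the repo.
QUERY='{ Get { ClipImage(nearText: {concepts: ["a dog on the beach"]}, limit: 3) { filename _additional { certainty } } } }'
# JSON-encode the query into the request body Weaviate's /v1/graphql endpoint expects:
PAYLOAD=$(printf '%s' "$QUERY" | python3 -c 'import json,sys; print(json.dumps({"query": sys.stdin.read()}))')
echo "$PAYLOAD"
# With Weaviate running, the search would be sent like this:
#   curl -s http://localhost:8080/v1/graphql -H 'Content-Type: application/json' -d "$PAYLOAD"
```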
Simply add your images to the `./images` folder before running the import script. The script looks for the `.jpg` file extension, but Weaviate supports other image types as well; you can adapt the script to use those if you like.
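For reference, the import script sends each image to Weaviate as a base64-encoded blob. A minimal sketch of that encoding step (the class and property names below are assumptions for illustration; the real logic lives in `./import/curl/import.sh`):

```shell
# Encode one image as base64 and wrap it in a JSON object of the shape that
# Weaviate's /v1/objects endpoint accepts. Names are assumed, not from the repo.
mkdir -p images
printf 'fake-jpeg-bytes' > images/example.jpg   # placeholder file so the snippet is self-contained
B64=$(base64 < images/example.jpg | tr -d '\n')
printf '{"class": "ClipImage", "properties": {"image": "%s", "filename": "example.jpg"}}\n' "$B64"
# With Weaviate running, this JSON would be POSTed to http://localhost:8080/v1/objects
```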
The images used in this demo are licensed as follows:
- Photo by Michael on Unsplash
- Photo by Bas Peperzak on Unsplash
- Photo by David Köhler on Unsplash
- Photo by eggbank on Unsplash
- Photo by John McArthur on Unsplash
It is a minimal example using only 5 images, but you can add any number of images yourself!