needlehaystack/needlestack

Needlestack

Needlestack is a distributed vector search microservice.

Features

  • gRPC server for kNN vector search
  • Shard vectors over multiple nodes
  • Replicate shards across multiple nodes
  • Retrieve vectors by ID
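When vectors are sharded across nodes, a kNN query has to be answered by each shard and the partial results combined. The sketch below illustrates that merge step in plain Python. It is not Needlestack's implementation: the function name `merge_top_k`, the `(distance, vector_id)` result shape, and the sample data are all assumptions made for illustration.

```python
import heapq
from typing import Iterable, List, Tuple

# Assumed result shape: (distance, vector_id); a smaller distance is a closer match.
Hit = Tuple[float, str]


def merge_top_k(shard_results: Iterable[List[Hit]], k: int) -> List[Hit]:
    """Combine per-shard kNN hits into one global top-k list, closest first."""
    all_hits = (hit for hits in shard_results for hit in hits)
    return heapq.nsmallest(k, all_hits)


# Hypothetical results from three searcher nodes, each holding one shard.
shard_a = [(0.12, "a7"), (0.40, "a2")]
shard_b = [(0.05, "b1"), (0.33, "b9")]
shard_c = [(0.21, "c4")]

top3 = merge_top_k([shard_a, shard_b, shard_c], k=3)
print(top3)  # [(0.05, 'b1'), (0.12, 'a7'), (0.21, 'c4')]
```

Each shard only needs to return its own k best hits, because the global top k cannot contain more than k results from any single shard.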

Limitations

The current beta builds have limitations that make them difficult to use in production. These should be addressed in future builds.

Caveats

  • Vectors must be manually sharded, indexed, and serialized to disk as protobufs
  • Faiss is the only kNN library currently supported
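Since sharding is manual, the vectors have to be assigned to shards before indexing. One possible approach, shown here as a minimal sketch, is to hash each vector ID and take it modulo the shard count. The names `shard_for` and `split_into_shards` are hypothetical, and the sketch stops before the Faiss indexing and protobuf serialization steps, which depend on Needlestack's own formats.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Vector = Tuple[str, List[float]]  # assumed shape: (vector_id, embedding)


def shard_for(vector_id: str, num_shards: int) -> int:
    """Map an ID to a shard number with a hash that is stable across runs."""
    # zlib.crc32 is used instead of hash() because Python randomizes
    # str hashes per process, which would reshuffle shards on every run.
    return zlib.crc32(vector_id.encode("utf-8")) % num_shards


def split_into_shards(vectors: List[Vector], num_shards: int) -> Dict[int, List[Vector]]:
    """Group vectors by their assigned shard number."""
    shards = defaultdict(list)
    for vector_id, embedding in vectors:
        shards[shard_for(vector_id, num_shards)].append((vector_id, embedding))
    return dict(shards)


# Dummy data: ten 2-dimensional vectors split over three shards.
vectors = [(f"doc-{i}", [float(i), float(i) * 0.5]) for i in range(10)]
shards = split_into_shards(vectors, num_shards=3)
for shard_id in sorted(shards):
    print(shard_id, [vector_id for vector_id, _ in shards[shard_id]])
```

A stable assignment also makes retrieval by ID straightforward, since the same function tells a client which shard holds a given vector.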

Quickstart

Get started with the examples in this repo!

Start the Docker containers that run the Needlestack services. This runs examples/run_merger.py and examples/run_searcher.py in containers.

docker-compose up merger-grpc searcher-grpc1 searcher-grpc2 searcher-grpc3

Create local index data and send it to the Needlestack services. This runs examples/indexing_job.py to create dummy data, then runs examples/add_collections.py to send it to the Needlestack services.

docker-compose run --rm make-test-data

Access the gRPC endpoints at localhost:50051
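Before pointing a gRPC client at the service, it can help to confirm that the port is accepting connections. The following is a minimal sketch using only the Python standard library. The helper name `endpoint_is_open` is hypothetical, and the check only verifies that a TCP connection succeeds, not that a Needlestack gRPC service is answering on it.

```python
import socket


def endpoint_is_open(host: str = "localhost", port: int = 50051, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if endpoint_is_open():
        print("localhost:50051 is accepting connections")
    else:
        print("localhost:50051 is not reachable; are the containers running?")
```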