What is Vald?

Vald is a highly scalable, distributed, fast approximate nearest neighbor (ANN) search engine for dense vectors.

Vald is designed and implemented on a Cloud-Native architecture.

It uses the fastest ANN algorithm, NGT, to search for neighbors.

Vald provides automatic vector indexing, index backup, and horizontal scaling, which make it suitable for searching across billions of feature vectors.

Vald is easy to use, feature-rich, and highly customizable to fit your needs.

Go to the Get Started page to try out Vald :)

(If you are interested in ANN benchmarks, please refer to the official website.)
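
To make "nearest neighbor search" concrete, here is a minimal Go sketch of exact (brute-force) search over dense vectors. It is not NGT and not Vald's implementation; it only shows the baseline result that an ANN algorithm approximates without scanning every vector. The names `neighbor`, `l2`, and `exactSearch` are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// neighbor is a hypothetical result type: a vector ID and its distance to the query.
type neighbor struct {
	ID       string
	Distance float64
}

// l2 returns the Euclidean distance between two vectors of equal length.
func l2(a, b []float64) float64 {
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// exactSearch scans every stored vector and returns the k closest ones.
// Vectors whose dimension differs from the query are skipped.
func exactSearch(index map[string][]float64, query []float64, k int) []neighbor {
	out := make([]neighbor, 0, len(index))
	for id, v := range index {
		if len(v) != len(query) {
			continue
		}
		out = append(out, neighbor{ID: id, Distance: l2(v, query)})
	}
	// Sort by distance, breaking ties by ID so the result is deterministic.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Distance != out[j].Distance {
			return out[i].Distance < out[j].Distance
		}
		return out[i].ID < out[j].ID
	})
	if k < 0 {
		k = 0
	}
	if k < len(out) {
		out = out[:k]
	}
	return out
}

func main() {
	index := map[string][]float64{
		"a": {0, 0},
		"b": {1, 0},
		"c": {5, 5},
	}
	// "b" is the closest vector to the query, followed by "a".
	for _, n := range exactSearch(index, []float64{0.9, 0}, 2) {
		fmt.Printf("%s %.3f\n", n.ID, n.Distance)
	}
}
```

The cost of this scan grows linearly with the number of stored vectors, which is the motivation for an approximate, graph-based index at the scale of billions of vectors.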

Main Features

  • Asynchronous Auto Indexing

    • Usually the graph must be locked during indexing, which causes a stop-the-world pause. Vald uses distributed index graphs, so it continues to work during indexing.
  • Customizable Ingress/Egress Filtering

    • Vald implements its own highly customizable Ingress/Egress filters, which can be configured to fit the gRPC interface.
      • Ingress Filter: vectorizes the request data through a filter before it is processed.
      • Egress Filter: reranks or filters the search results with your own algorithm.
  • Cloud-native based vector searching engine

    • Horizontally scalable on memory and CPU to meet your demand.
  • Auto Backup for Index data

    • Vald can store a backup of the index data using MySQL or Cassandra, which enables disaster recovery.
  • Distributed Indexing

    • Vald distributes the vector index across multiple agents; each agent stores a different index.
  • Index Replication

    • Vald stores each index in multiple agents, which provides index replicas.
    • Replicas are automatically rebalanced when a Vald agent goes down.
  • Easy to use

    • Vald can be easily installed in a few steps.
  • Highly customizable

    • You can configure the number of vector dimensions, the number of replicas, and so on.
  • Multi language supported

    • Client libraries are available for Go, Java, Clojure, Node.js, and Python.
    • The gRPC APIs can be called from any programming language that supports gRPC.
    • A REST API is also supported.
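
The Distributed Indexing and Index Replication features above can be pictured with a small placement sketch in Go. How Vald actually selects agents is not described in this README, so the hash-and-walk scheme below is an assumption made only for illustration, and `pickAgents` and the agent names are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickAgents returns up to `replicas` distinct agents for a vector ID.
// ASSUMPTION: placement hashes the ID and walks the agent list from that
// position. This is an illustrative scheme, not Vald's real strategy.
func pickAgents(id string, agents []string, replicas int) []string {
	if replicas > len(agents) {
		replicas = len(agents)
	}
	if replicas <= 0 {
		return nil
	}
	h := fnv.New32a()
	h.Write([]byte(id))
	start := int(h.Sum32() % uint32(len(agents)))

	out := make([]string, 0, replicas)
	for i := 0; i < replicas; i++ {
		out = append(out, agents[(start+i)%len(agents)])
	}
	return out
}

func main() {
	agents := []string{"agent-0", "agent-1", "agent-2", "agent-3"}
	fmt.Println("placement:", pickAgents("vector-42", agents, 2))

	// Simulate one agent going down: recomputing the placement over the
	// remaining agents restores the requested number of replicas.
	remaining := []string{"agent-0", "agent-2", "agent-3"}
	fmt.Println("after failure:", pickAgents("vector-42", remaining, 2))
}
```

Each vector lands on a different subset of agents, so every agent holds a different part of the index, while the replica count decides how many agents must fail before a vector becomes unavailable.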

Requirements

  • Kubernetes 1.17 or later
  • AVX2 instructions (required by Vald Agent NGT)
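
One way to check the AVX2 requirement on a node before deploying is to look at the CPU flags. The Go sketch below assumes an x86 Linux host, where `/proc/cpuinfo` lists flags on lines starting with `flags`; the helper name `hasCPUFlag` is hypothetical, and other platforms expose this information differently.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// hasCPUFlag reports whether the given flag appears on any "flags" line
// of /proc/cpuinfo-style text. It matches whole flags, so "avx" does not
// count as "avx2".
func hasCPUFlag(cpuinfo, flag string) bool {
	for _, line := range strings.Split(cpuinfo, "\n") {
		name, value, ok := strings.Cut(line, ":")
		if !ok || strings.TrimSpace(name) != "flags" {
			continue
		}
		for _, f := range strings.Fields(value) {
			if f == flag {
				return true
			}
		}
	}
	return false
}

func main() {
	// ASSUMPTION: running on Linux; the file does not exist elsewhere.
	data, err := os.ReadFile("/proc/cpuinfo")
	if err != nil {
		fmt.Println("cannot read /proc/cpuinfo:", err)
		return
	}
	fmt.Println("AVX2 supported:", hasCPUFlag(string(data), "avx2"))
}
```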

Get Started

Please refer to Get Started.

Installation

Using Helm

helm repo add vald https://vald.vdaas.org/charts
helm install vald-cluster vald/vald

If you use the default values.yaml, the nightly images will be installed.

Docker image tagging policy

  • nightly ... latest build of master branch
  • vX.X.X ... released versions
  • latest ... latest build of release versions
  • stable ... latest long-term supported version
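
The tagging policy above can be expressed as a small classifier, which is handy when validating an image tag in tooling. This Go sketch is only an illustration of the four rules listed; the function names `describeTag` and `isFixedRelease` are hypothetical and not part of Vald.

```go
package main

import (
	"fmt"
	"regexp"
)

// releaseTag matches tags of the form vX.X.X, for example v1.2.3.
var releaseTag = regexp.MustCompile(`^v\d+\.\d+\.\d+$`)

// describeTag maps an image tag to its meaning under the tagging policy.
func describeTag(tag string) string {
	switch {
	case tag == "nightly":
		return "latest build of master branch"
	case tag == "latest":
		return "latest build of release versions"
	case tag == "stable":
		return "latest long-term supported version"
	case releaseTag.MatchString(tag):
		return "released version"
	default:
		return "unknown tag"
	}
}

// isFixedRelease reports whether the tag names one specific release
// (vX.X.X) rather than a tag that follows the newest build.
func isFixedRelease(tag string) bool {
	return releaseTag.MatchString(tag)
}

func main() {
	for _, tag := range []string{"nightly", "v1.2.3", "latest", "stable"} {
		fmt.Printf("%-8s %-36s fixed=%v\n", tag, describeTag(tag), isFixedRelease(tag))
	}
}
```

Because the default values.yaml installs the nightly images, a vX.X.X tag is the only one of the four that always refers to the same release.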

Using Helm-operator

vald-helm-operator

Example

Write example here

Architecture Overview

More details of the architecture overview will be provided here in the future.

Development

Before your first commit to this repository, it is strongly recommended to run the commands below.

make init

Components

Component Docker image
Agent NGT
Agent Sidecar
Discoverer K8s
Gateway
Backup Manager (MySQL)
Backup Manager (Cassandra)
Compressor
Meta (Redis)
Meta (Cassandra)
Index Manager
Helm Operator
Load Test

Contribution

Please read the contribution guide

Contributors

Thanks goes to these wonderful people (emoji key):


  • Yusuke Kato: 💻 🤔 🚧 📆
  • Rintaro Okamura: 💻 📖 🚧 📦
  • Kosuke Morimoto: 💻 💡 🔧 ⚠️
  • Kiichiro YUKAWA: 📖 🚧 ⚠️ ✅
  • datelier: 💻 🤔
  • Kevin Diu: 📖 💡 ⚠️ ✅
  • Hiroto Funakoshi: 📖 🔧 ⚠️ ✅
  • taisho: 🎨 📖 💡
  • Pierre Grimaud: 📖

LICENSE

Vald is released under the Apache 2.0 license; refer to the LICENSE file.
