[Feature]: Add support for the vector data type #1505
Labels
priority: p1
This issue will be fixed in the next major/minor version
serverity: major
Major functionality but can wait
type: feature
Use Case
This issue proposes the addition of support for the vector data type in OceanBase. This feature will enhance the performance and scalability of using OceanBase in AI projects.
Background
While working on a project to integrate AI with OceanBase, I realized that OceanBase does not inherently support the vector data type. This limitation posed a challenge as the project involved transforming articles and user queries into vector representations for comparison and answer generation.
To work around this, I had to convert vectors into JSON for storage in OceanBase and then convert them back into vectors for comparison. This conversion process made the system slower and limited its scalability.
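For illustration, the workaround described above amounts to something like the following minimal sketch. The helper names `vector_to_json` and `json_to_vector` are hypothetical, and the actual database read/write calls are omitted; only the conversion step that a native vector type would make unnecessary is shown.

```python
import json

def vector_to_json(vec):
    """Serialize a vector (list of floats) to a JSON string for storage."""
    return json.dumps(vec)

def json_to_vector(text):
    """Parse a stored JSON string back into a list of floats."""
    return [float(x) for x in json.loads(text)]

# Every write pays for serialization, and every comparison pays for parsing.
stored = vector_to_json([0.1, 0.2, 0.3])   # what would go into a JSON/text column
restored = json_to_vector(stored)          # what the application compares against
```

Because the database only sees opaque JSON text, all comparison work has to happen in the application after this parsing step.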
Describe the solution you'd like
The proposed solution involves adding native support for the vector data type to OceanBase, inspired by the pg-vector extension in PostgreSQL. Just like we store numbers, text, or dates, we need to create a way to store vectors, which are essentially lists of numbers. This change will allow OceanBase to store and retrieve vectors efficiently.
The proposal will also impact how queries are processed in OceanBase. This means teaching OceanBase how to understand and perform operations on vectors. For example, it should be able to calculate the similarity between two vectors, which is a common operation in AI and machine learning applications.
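As an example of the kind of operation meant here, cosine similarity is one common similarity measure between two vectors. The sketch below is plain Python and only illustrates the arithmetic; the function name is an assumption, and it does not describe how OceanBase would implement or expose the operation.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Vectors pointing in the same direction score close to 1,
# orthogonal vectors score 0.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

With native support, a computation like this could run inside the database next to the data instead of in application code.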
To make searching through vectors efficient, we need to implement a system that can quickly find the most similar vectors in the database. This is crucial for many AI applications where you need to find the most similar item to a given input.
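To make the goal concrete, the sketch below shows the brute-force baseline: scan every stored vector and keep the `k` closest by Euclidean distance. The `(id, vector)` row layout and the function name `nearest_neighbors` are assumptions for illustration. A vector index would aim to return the same or approximately the same results without scanning every row.

```python
import heapq
import math

def nearest_neighbors(query, rows, k=1):
    """Return the k rows whose vectors are closest to query (Euclidean distance).

    rows is an iterable of (id, vector) pairs; this scans all of them.
    """
    return heapq.nsmallest(k, rows, key=lambda row: math.dist(query, row[1]))

rows = [
    ("a", [0.0, 0.0]),
    ("b", [1.0, 1.0]),
    ("c", [5.0, 5.0]),
]
closest = nearest_neighbors([0.9, 1.2], rows, k=2)
closest_ids = [row_id for row_id, _ in closest]
```

The cost of this scan grows linearly with the number of stored vectors, which is why an index structure is needed for large tables.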
Additional context
I have published an article detailing my experiment using OceanBase as a vector store in AI training. I will include the link in a comment to provide more context.