[API Proposal]: Expose Vector Datatype in SqlDbType #115148
Comments
Tagging subscribers to this area: @cheenamalhotra, @David-Engel
cc @roji
Tagging subscribers to this area: @roji, @ajcvickers
Vector is being supported across many different ADO.NET providers: https://github.com/pgvector/pgvector-dotnet, mysql-net/MySqlConnector#1549. Is there value in adopting a consistent approach for this data type (or even adding some System.Data types to support it)?
@bgrainger there's indeed some sort of vector support in most relational databases nowadays. However, the .NET type used to represent an embedding unfortunately varies considerably: pgvector has Vector/HalfVector/SparseVector (the first two are wrappers around ReadOnlyMemory, the last is a custom sparse vector format; there's no universal way yet to represent a sparse vector in .NET). There's a new extension package called Microsoft.Extensions.AI that's going to be released soon, which has an Embedding type which is a base class, and […]
So basically at this point I'm not sure there's anything feasible to do type-wise... We could also consider adding a DbType.Vector value to the enum, but that doesn't uniquely identify a vector type (because there are multiple: float32, float16, sparse...). And with the actual .NET type varying across providers, I'm not sure there's much use for a common DbType value...
What do you think? Am I missing any other ideas here?
Note: this is basically the same as #103925 (which was about adding SqlDbType.Json).
It would be nice to use a common .NET type to represent Embeddings (since that is almost always what a
Yes, I agree; I don't think there's anything to do in ADO.NET itself.
I'm not sure. One problem is that there really are different vector types. For one thing, vectors can have different element types (float32, float16, int8, bit). For another, they can be dense or sparse. For dense vectors, we're generally recommending to use […]
The Embedding types in Microsoft.Extensions.AI are intended more to be wrappers around the vector type (e.g. […])
If MySQL/MariaDB have vector search support, I'd advise adding support for that via […]
Looks good as proposed. Approved via email (trivial addition)
    namespace System.Data
    {
        // Specifies the SQL Server data type.
        public enum SqlDbType
        {
            Vector = 36,
        }
    }
Based on these comments: dotnet/runtime#115148 (comment). Signed-off-by: Bradley Grainger <bgrainger@gmail.com>
Background and motivation
The System.Data.SqlDbType enum represents the data types supported by SQL Server and is used with SqlParameter to specify the column type for SQL Server operations when executing a SqlCommand.
With the vector data type now supported in SQL Server, there is a need to support it in Microsoft.Data.SqlClient, the ADO.NET provider for SQL Server.
The proposal is to add an enum member called Vector, with value 36, to SqlDbType.
Once this member is available, Microsoft.Data.SqlClient (the SQL Server driver) can use it to allow vector operations through its APIs.
The version of Microsoft.Data.SqlClient targeting .NET 10 will be able to use SqlDbType.Vector to provide vector data type support.
API Proposal
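For illustration only, here is a minimal C# sketch of how calling code could refer to the proposed value. `VectorDbTypeSketch` and its members are hypothetical names invented for this example and are not part of System.Data or Microsoft.Data.SqlClient; the only thing taken from the proposal is the numeric value 36. The cast is an assumption made so the sketch also compiles on runtimes whose `SqlDbType` enum does not yet define `Vector`.

```csharp
using System;
using System.Data;

// Hypothetical helper (not a library type): shows how code could refer to
// the proposed enum value by its underlying number.
public static class VectorDbTypeSketch
{
    // 36 is the value the proposal assigns to SqlDbType.Vector.
    public const int ProposedVectorValue = 36;

    // Casting from the number works whether or not the runtime's
    // SqlDbType enum already has a named Vector member.
    public static SqlDbType VectorDbType => (SqlDbType)ProposedVectorValue;

    // True when the given SqlDbType carries the proposed vector value.
    public static bool IsVector(SqlDbType type) => (int)type == ProposedVectorValue;
}

public static class Program
{
    public static void Main()
    {
        SqlDbType vectorType = VectorDbTypeSketch.VectorDbType;
        Console.WriteLine(VectorDbTypeSketch.IsVector(vectorType));    // True
        Console.WriteLine(VectorDbTypeSketch.IsVector(SqlDbType.Int)); // False
    }
}
```

On .NET 10 or later, where the approved member exists, `SqlDbType.Vector` can be used directly in place of the cast.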
Alternative Designs
No response
Risks
No response