.Net: [MEVD] Collapse search methods to a single method accepting string, ROM<T> and Embedding<T> #11871

@roji

Description
  • Both current methods - SearchAsync and SearchEmbeddingAsync - accept a parameter of any type without any constraints.
    • This is because we can't limit the vector/embedding types since those vary from implementation to implementation, and input types can also be anything that the configured embedding generator supports.
    • As a result, the user has to know exactly what they can pass in. If the methods constrained their inputs, it would make sense to have two methods with different parameter constraints - but two methods with completely open parameters aren't useful. We may as well collapse them together.
  • We already don't distinguish between raw vector/embedding and string/DataContent in other contexts. Specifically, in the .NET type that the user maps to the database, they can have a string vector property (if they configure an embedding generator) or a raw vector. Since we treat these interchangeably in that context, we may as well treat them interchangeably on the search API as well.
  • Types we want to support:
    • We're not sure what exact types should be supported as vector/embedding types: Embedding<T> (or rather Embedding), ReadOnlyMemory<T> (for dense vector representations), possibly T[], List<T>, IList<T>, IEnumerable<T>...
    • All these are for "raw" operations - when an embedding generator isn't defined. When a generator is defined, the user may pass in anything as long as the resolved generator accepts it as input.
    • We'll likely start off with supporting Embedding and ReadOnlyMemory<float>, and consider doing more based on user feedback.
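
To make the proposal concrete, here is a minimal, self-contained sketch of what a single collapsed search method could look like. It is not the actual Microsoft.Extensions.VectorData API: `SketchCollection`, `SketchEmbedding`, the synchronous `Search` method, and the `Func<object, ReadOnlyMemory<float>>` generator delegate are all hypothetical stand-ins. The dispatch order (raw vector types first, then the generator) is likewise an assumption made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an embedding wrapper type; not the real Embedding<T>.
public sealed record SketchEmbedding(ReadOnlyMemory<float> Vector);

// Hypothetical in-memory collection, used only to illustrate the input dispatch.
public sealed class SketchCollection
{
    // Hypothetical generator: turns arbitrary input (string, etc.) into a vector.
    private readonly Func<object, ReadOnlyMemory<float>>? _generator;
    private readonly List<(string Key, float[] Vector)> _records = new();

    public SketchCollection(Func<object, ReadOnlyMemory<float>>? generator = null)
        => _generator = generator;

    public void Add(string key, float[] vector) => _records.Add((key, vector));

    // Single entry point: TInput is unconstrained apart from being non-null,
    // so the accepted types are decided at run time.
    public IReadOnlyList<string> Search<TInput>(TInput value, int top)
        where TInput : notnull
    {
        ReadOnlyMemory<float> query = value switch
        {
            // "Raw" inputs are used as-is (assumed to take priority).
            ReadOnlyMemory<float> raw => raw,
            SketchEmbedding embedding => embedding.Vector,
            // Anything else is handed to the generator, if one is configured.
            _ when _generator is not null => _generator(value),
            _ => throw new NotSupportedException(
                $"Input type {typeof(TInput).Name} requires an embedding generator.")
        };

        // Copy to an array so the lambda below can capture it.
        float[] q = query.ToArray();
        return _records
            .OrderByDescending(r => Dot(r.Vector, q))
            .Take(top)
            .Select(r => r.Key)
            .ToList();
    }

    // Plain dot product as a toy similarity score.
    private static float Dot(float[] a, float[] b)
    {
        float sum = 0f;
        int length = Math.Min(a.Length, b.Length);
        for (int i = 0; i < length; i++)
        {
            sum += a[i] * b[i];
        }
        return sum;
    }
}
```

In this sketch, a caller would pass a `ReadOnlyMemory<float>` or a `SketchEmbedding` when no generator is configured, and could pass a `string` (or anything else the generator understands) when one is. An unsupported input only fails at run time, which is the trade-off of leaving the parameter unconstrained.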

/cc @westey-m @stephentoub

See some previous discussion in #11701

Labels

.NET, Build, msft.ext.vectordata

Projects

Status

Sprint: Done
