Proposal: Add Retrieval Pipeline Abstractions (Microsoft.Extensions.DataRetrieval)
Summary
This issue proposes adding retrieval pipeline abstractions to dotnet/extensions, enabling .NET developers to compose advanced retrieval pipelines for Retrieval-Augmented Generation (RAG) using a consistent, pluggable model.
In a RAG application, retrieval is the step that finds relevant information to give to an LLM. These abstractions allow developers to add processing stages around vector search — transforming queries before searching and refining results after — without coupling to any specific vector store, LLM provider, or retrieval strategy.
These packages complement the existing Microsoft.Extensions.DataIngestion (for writing data into vector stores) by providing the symmetric read-side: pre-search query processing, vector search orchestration, and post-search result processing.
Motivation
What is a Retrieval Pipeline?
When a user asks a question in a RAG (Retrieval-Augmented Generation) application, the simplest approach is:
- Embed the question as a vector
- Find the closest document chunks in a vector store
- Pass those chunks to an LLM as context
This works for simple cases, but falls apart quickly:
- Ambiguous queries — "How do I configure it?" matches poorly because embeddings don't know what "it" refers to
- Vocabulary mismatch — The user says "set up auth" but the docs say "configure identity providers" — semantically similar but far apart in vector space
- Noise in results — The top-5 nearest vectors often include irrelevant chunks that happen to share keywords
- No quality signal — There's no way to know if the retrieved chunks actually answer the question before passing them to the LLM
A retrieval pipeline solves these by adding processing stages around the vector search:
```
[User Query] → [Query Processors] → [Vector Search] → [Result Processors] → [Final Results]
                      ↑                    ↑                    ↑
              expand, rewrite,        the actual          rerank, filter,
              or augment the          database lookup     validate relevance
              query before searching
```
Query processors (pre-search) transform the query to improve search quality. Result processors (post-search) refine what comes back. The pipeline orchestrates both around any vector store.
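To make the concept concrete, here is a minimal sketch of a query processor built on the `RetrievalQueryProcessor` base class proposed later in this issue. `AcronymExpander` is a hypothetical example type, not part of the proposal:

```csharp
// Hypothetical query processor: expands known shorthand before the vector
// search, addressing the vocabulary-mismatch problem described above.
public sealed class AcronymExpander : RetrievalQueryProcessor
{
    private static readonly Dictionary<string, string> _expansions = new()
    {
        ["auth"] = "authentication (configure identity providers)",
        ["k8s"] = "Kubernetes",
    };

    public override Task<RetrievalQuery> ProcessAsync(
        RetrievalQuery query, CancellationToken cancellationToken = default)
    {
        foreach (var (shorthand, expansion) in _expansions)
        {
            if (query.Text.Contains(shorthand, StringComparison.OrdinalIgnoreCase))
            {
                // Add an expanded variant; the pipeline can search all variants
                // and fuse the result sets.
                query.Variants.Add(query.Text.Replace(
                    shorthand, expansion, StringComparison.OrdinalIgnoreCase));
            }
        }

        return Task.FromResult(query);
    }
}
```

A real implementation would likely source its expansion table from configuration or a glossary index rather than a hard-coded dictionary.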
Why Abstractions?
Today, .NET developers implementing these patterns must:
- Write custom orchestration code for each retrieval strategy
- Tightly couple their logic to a specific vector store (Azure AI Search, Qdrant, Pinecone, etc.)
- Re-implement common patterns from research papers with no shared building blocks
The existing Microsoft.Extensions.DataIngestion packages solved the write-side (document → chunks → vector store). Microsoft.Extensions.DataRetrieval solves the read-side (query → search → process → results) with the same philosophy: thin abstractions that enable a rich ecosystem.
Symmetry with DataIngestion
| Concern | Write-Side | Read-Side |
|---|---|---|
| Package | `Microsoft.Extensions.DataIngestion` | `Microsoft.Extensions.DataRetrieval` |
| Pipeline | `IngestionPipeline<T>` | `RetrievalPipeline` |
| Pre-step processors | `IngestionDocumentProcessor` | `RetrievalQueryProcessor` |
| Post-step processors | `IngestionChunkProcessor<T>` | `RetrievalResultProcessor` |
| Primary method | `ProcessAsync` | `ProcessAsync` |
| Data flow | Documents → Chunks → Vector Store | Query → Search → Ranked Results |
Developers familiar with one side immediately understand the other.
Proposed API Surface
Package: Microsoft.Extensions.DataRetrieval.Abstractions
```csharp
namespace Microsoft.Extensions.DataRetrieval;

// Core data types
public sealed class RetrievalQuery
{
    public RetrievalQuery(string text);
    public string Text { get; }
    public IList<string> Variants { get; set; }
    public IDictionary<string, object?> Metadata { get; }
}

public sealed class RetrievalChunk
{
    public RetrievalChunk(string content, double score);
    public string Content { get; }
    public double Score { get; set; }
    public IDictionary<string, object?> Record { get; }
}

public sealed class RetrievalResults
{
    public IList<RetrievalChunk> Chunks { get; set; }
    public IDictionary<string, object?> Metadata { get; }
}

// Processor abstractions
public abstract class RetrievalQueryProcessor
{
    public abstract Task<RetrievalQuery> ProcessAsync(
        RetrievalQuery query, CancellationToken cancellationToken = default);
}

public abstract class RetrievalResultProcessor
{
    public abstract Task<RetrievalResults> ProcessAsync(
        RetrievalResults results, RetrievalQuery query,
        CancellationToken cancellationToken = default);
}

// Re-ranking interface
public interface IReranker
{
    Task<IReadOnlyList<RetrievalChunk>> RerankAsync(
        string query, IReadOnlyList<RetrievalChunk> chunks,
        CancellationToken cancellationToken = default);
}

// Retrieval interface for DI and testability
public interface IRetriever
{
    Task<RetrievalResults> RetrieveAsync(
        string query, int topK = 5,
        CancellationToken cancellationToken = default);
}
```
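As an illustration of the post-search side of these abstractions, a minimal result processor might drop low-scoring chunks so they never reach the LLM prompt. `ScoreThresholdFilter` is a hypothetical example, not a proposed type:

```csharp
// Hypothetical post-search processor: removes chunks whose similarity score
// falls below a configured threshold, giving the "quality signal" the
// Motivation section says raw vector search lacks.
public sealed class ScoreThresholdFilter : RetrievalResultProcessor
{
    private readonly double _minScore;

    public ScoreThresholdFilter(double minScore) => _minScore = minScore;

    public override Task<RetrievalResults> ProcessAsync(
        RetrievalResults results, RetrievalQuery query,
        CancellationToken cancellationToken = default)
    {
        // Keep only chunks at or above the threshold; mutate in place so the
        // next processor in the chain sees the filtered set.
        results.Chunks = results.Chunks
            .Where(c => c.Score >= _minScore)
            .ToList();
        return Task.FromResult(results);
    }
}
```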
Package: Microsoft.Extensions.DataRetrieval
```csharp
namespace Microsoft.Extensions.DataRetrieval;

public sealed class RetrievalPipeline : IDisposable
{
    public RetrievalPipeline(
        RetrievalPipelineOptions? options = null,
        ILoggerFactory? loggerFactory = null);

    public IList<RetrievalQueryProcessor> QueryProcessors { get; }
    public IList<RetrievalResultProcessor> ResultProcessors { get; }

    public Task<RetrievalResults> ProcessAsync<TKey, TRecord>(
        VectorStoreCollection<TKey, TRecord> collection,
        string query,
        int topK = 5,
        Func<TRecord, string>? contentSelector = null,
        CancellationToken cancellationToken = default)
        where TKey : notnull
        where TRecord : class;
}

public sealed class RetrievalPipelineOptions
{
    public string ActivitySourceName { get; set; }
}

public sealed class VectorStoreRetriever<TKey, TRecord> : IRetriever
    where TKey : notnull
    where TRecord : class
{
    public VectorStoreRetriever(
        RetrievalPipeline pipeline,
        VectorStoreCollection<TKey, TRecord> collection,
        Func<TRecord, string>? contentSelector = null);
}

// Extension method for discoverability
public static class RetrievalPipelineExtensions
{
    public static IRetriever AsRetriever<TKey, TRecord>(
        this RetrievalPipeline pipeline,
        VectorStoreCollection<TKey, TRecord> collection,
        Func<TRecord, string>? contentSelector = null)
        where TKey : notnull
        where TRecord : class;
}
```
Design Principles
- **Symmetry with DataIngestion.** Query processors mirror chunk processors; `RetrievalPipeline` mirrors `IngestionPipeline<T>`. Developers familiar with one immediately understand the other.
- **Composable pipelines.** Zero processors = raw vector search. Add one processor = single enhancement. Stack many = advanced multi-stage retrieval. No dead weight.
- **Vector store agnostic.** `ProcessAsync` accepts any `VectorStoreCollection<TKey, TRecord>` (from `Microsoft.Extensions.VectorData`). Works with Azure AI Search, Qdrant, Pinecone, in-memory, or any MEVD provider.
- **Observable.** Built-in `ActivitySource` + `ILogger` support. Each processor invocation is traced with structured log entries.
Usage Example
```csharp
var pipeline = new RetrievalPipeline(loggerFactory: loggerFactory);
pipeline.QueryProcessors.Add(new MultiQueryExpander(chatClient));
pipeline.ResultProcessors.Add(new LlmReranker(chatClient));

var results = await pipeline.ProcessAsync(
    collection,
    "What are the retention policies?",
    topK: 10,
    contentSelector: record => record.Content);
```
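The `MultiQueryExpander` and `LlmReranker` above are stand-ins for processors a consumer (or a satellite package) would supply. As a sketch of what such an expander might look like, assuming the `IChatClient`/`GetResponseAsync` API from Microsoft.Extensions.AI:

```csharp
// Illustrative sketch, not a shipped type: asks the chat model for paraphrases
// of the query and records them as variants, mitigating vocabulary mismatch
// between user phrasing and document phrasing.
public sealed class MultiQueryExpander : RetrievalQueryProcessor
{
    private readonly IChatClient _chatClient;

    public MultiQueryExpander(IChatClient chatClient) => _chatClient = chatClient;

    public override async Task<RetrievalQuery> ProcessAsync(
        RetrievalQuery query, CancellationToken cancellationToken = default)
    {
        var response = await _chatClient.GetResponseAsync(
            $"Rewrite this search query three different ways, one per line:\n{query.Text}",
            cancellationToken: cancellationToken);

        foreach (var variant in response.Text.Split('\n', StringSplitOptions.RemoveEmptyEntries))
        {
            query.Variants.Add(variant.Trim());
        }

        return query;
    }
}
```

A production version would constrain the model output more tightly (structured output, deduplication against the original query), but the shape of the contract is the point here.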
Relationship to Existing Packages
| Package | Role |
|---|---|
| `Microsoft.Extensions.AI` | LLM client abstractions (`IChatClient`, `IEmbeddingGenerator`) |
| `Microsoft.Extensions.VectorData` | Vector store abstractions (`VectorStoreCollection<TKey, TRecord>`) |
| `Microsoft.Extensions.DataIngestion` | Write-side pipeline (document → chunks → vector store) |
| `Microsoft.Extensions.DataRetrieval` | Read-side pipeline (query → search → process → results) |
Together these four packages provide a complete, composable RAG stack without coupling to any specific provider.
Implementation
A reference implementation exists on the feature/retrieval-abstractions branch of a fork of this repo, including:
- Both packages with full XML documentation
- OpenTelemetry tracing integration
- Reciprocal Rank Fusion for multi-query deduplication
- Tree-traversal search paradigm for hierarchical indices
- README documentation for each package
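The Reciprocal Rank Fusion (RRF) mentioned above is a standard technique for merging the ranked lists produced by searching each query variant. The core of it can be sketched as follows; this is an illustration of the technique, not the reference implementation, and it assumes `k = 60` (the constant from the original RRF paper) and deduplicates by chunk content rather than record key:

```csharp
// Illustrative Reciprocal Rank Fusion: each chunk scores sum(1 / (k + rank))
// across all ranked lists it appears in, so chunks near the top of several
// lists rise to the top of the fused list, and duplicates are merged.
public static List<RetrievalChunk> FuseByReciprocalRank(
    IEnumerable<IReadOnlyList<RetrievalChunk>> rankedLists, int k = 60)
{
    var fused = new Dictionary<string, (RetrievalChunk Chunk, double Score)>();

    foreach (var list in rankedLists)
    {
        for (int rank = 0; rank < list.Count; rank++)
        {
            var chunk = list[rank];
            double contribution = 1.0 / (k + rank + 1); // rank is 1-based in RRF
            fused[chunk.Content] = fused.TryGetValue(chunk.Content, out var existing)
                ? (existing.Chunk, existing.Score + contribution)
                : (chunk, contribution);
        }
    }

    return fused.Values
        .OrderByDescending(e => e.Score)
        .Select(e => { e.Chunk.Score = e.Score; return e.Chunk; })
        .ToList();
}
```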
Design Decisions (per Framework Design Guidelines audit)
| Decision | Rationale |
|---|---|
| Abstract classes for processors (not interfaces) | Allows non-breaking addition of new virtual members in future versions. Follows the Framework Design Guidelines (FDG): "DO prefer classes over interfaces." |
| `IReranker` as interface (not abstract class) | Single-method stateless contract. Types may implement both `RetrievalResultProcessor` AND `IReranker` (adapter pattern); single inheritance would prevent this if both were abstract classes. |
| Sealed data types (`RetrievalQuery`, `RetrievalChunk`, `RetrievalResults`) | Leaf DTOs not intended for inheritance. Prevents fragile base class problems. |
| Sealed `RetrievalPipeline` | Owns the `ActivitySource` lifetime; subclassing would create resource-management issues. |
| `IList<T>` for mutable collections | Pipeline processors need to add/remove/reorder entries. `IList<T>` (not `List<T>`) follows FDG. |
| `IDictionary<string, object?>` for metadata | Established pattern (`HttpContext.Items`, `Activity.Tags`). Allows extensibility without breaking changes. |
| `CancellationToken` last with `= default` | Standard .NET async method convention. |
| Abstractions package has zero non-polyfill dependencies | Consumers can reference abstractions without pulling heavy transitive deps. |
| `IRetriever` as interface in Abstractions | Data-source-agnostic retrieval contract. Enables DI, testability, and future implementations (web search, SQL, hybrid) without coupling to vector stores. |
| `RetrievalPipeline` does NOT implement `IRetriever` | The pipeline is a reusable processing engine that works with any collection per call. `VectorStoreRetriever` adapts pipeline + collection → `IRetriever` for single-endpoint DI scenarios. |
| `ProcessAsync` (pipeline) vs `RetrieveAsync` (`IRetriever`) | Establishes clear vocabulary: pipelines process queries through stages; retrievers retrieve results. Symmetric with `IngestionPipeline.ProcessAsync` — both pipelines use the same verb for their primary method. |
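The adapter-pattern rationale for `IReranker` can be shown concretely. The type below is a hypothetical illustration of why the interface/abstract-class split matters; nothing like it is part of the proposed API surface:

```csharp
// A reranker can plug directly into the pipeline by inheriting
// RetrievalResultProcessor AND implementing IReranker. If both contracts were
// abstract classes, single inheritance would make this type impossible.
public sealed class RerankerAdapter : RetrievalResultProcessor, IReranker
{
    public async Task<IReadOnlyList<RetrievalChunk>> RerankAsync(
        string query, IReadOnlyList<RetrievalChunk> chunks,
        CancellationToken cancellationToken = default)
    {
        // Placeholder ordering; a real reranker would score each chunk against
        // the query with a cross-encoder model or an LLM call here.
        await Task.Yield();
        return chunks.OrderByDescending(c => c.Score).ToList();
    }

    public override async Task<RetrievalResults> ProcessAsync(
        RetrievalResults results, RetrievalQuery query,
        CancellationToken cancellationToken = default)
    {
        // Adapt the standalone reranking contract into a pipeline stage.
        var reranked = await RerankAsync(
            query.Text, results.Chunks.ToList(), cancellationToken);
        results.Chunks = reranked.ToList();
        return results;
    }
}
```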
Open Questions
- Should a fluent `RetrievalPipelineBuilder` be included in the core package, or ship separately?
Design Note: IRetriever and VectorStoreRetriever
RetrievalPipeline intentionally does NOT implement IRetriever. The pipeline is a processing engine — it defines what transformations occur (query expansion, reranking, validation) but requires a vector store collection per-call. This enables one pipeline to serve multiple collections:
```csharp
// Same pipeline, different indices
var policyResults = await pipeline.ProcessAsync(policyCollection, query);
var faqResults = await pipeline.ProcessAsync(faqCollection, query);
```
VectorStoreRetriever<TKey, TRecord> bridges this gap. It captures a pipeline + collection + content selector, then exposes the simple IRetriever contract. This is the recommended DI registration pattern:
```csharp
services.AddSingleton<IRetriever>(sp =>
    new VectorStoreRetriever<string, Article>(
        sp.GetRequiredService<RetrievalPipeline>(),
        sp.GetRequiredService<VectorStoreCollection<string, Article>>(),
        record => record.Content));
```
For discoverability, RetrievalPipeline also offers an AsRetriever() extension method:
```csharp
// Direct usage — no DI needed
IRetriever retriever = pipeline.AsRetriever(collection, record => record.Content);
var results = await retriever.RetrieveAsync("What are the retention policies?");
```
The IRetriever abstraction is intentionally data-source-agnostic. Future implementations may include:
| Implementation | Data Source |
|---|---|
| `VectorStoreRetriever<TKey, TRecord>` | Any MEVD vector store provider |
| `WebSearchRetriever` | Bing, Google, or other search APIs |
| `DatabaseRetriever` | SQL/NoSQL query-based retrieval |
| `HybridRetriever` | Combines multiple `IRetriever` instances |
| `CachingRetriever` | Wraps another `IRetriever` with response caching |
This enables consumers to program against IRetriever regardless of backend, and swap implementations without changing calling code.
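The `CachingRetriever` row above is a good example of that composability. A sketch of such a decorator (hypothetical type, not part of the proposal) might look like:

```csharp
// Illustrative decorator over IRetriever: serves repeated queries from memory,
// showing how behavior composes against the interface without touching the
// backend. A real version would bound the cache and expire entries.
public sealed class CachingRetriever : IRetriever
{
    private readonly IRetriever _inner;
    private readonly ConcurrentDictionary<(string Query, int TopK), RetrievalResults> _cache = new();

    public CachingRetriever(IRetriever inner) => _inner = inner;

    public async Task<RetrievalResults> RetrieveAsync(
        string query, int topK = 5, CancellationToken cancellationToken = default)
    {
        if (_cache.TryGetValue((query, topK), out var cached))
        {
            return cached;
        }

        var results = await _inner.RetrieveAsync(query, topK, cancellationToken);
        _cache[(query, topK)] = results;
        return results;
    }
}
```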