The Quality Layer for RAG Data Pipelines. LLM-powered enrichment and quality assessment for document chunks.
FluxImprover is a specialized .NET library designed to enhance and validate the quality of document chunks before they are indexed into a RAG (Retrieval-Augmented Generation) system.
It acts as the quality assurance and value-add layer, leveraging Large Language Models (LLMs) to transform raw chunks into highly optimized assets for superior search and answer generation.
- Chunk Enrichment: Uses LLMs to create concise summaries and relevant keywords for each chunk
- Chunk Filtering: 3-stage LLM-based assessment with self-reflection and critic validation for intelligent retrieval filtering
- Query Preprocessing: Normalizes, expands, and classifies queries with synonym expansion and intent classification for optimal retrieval
- QA Pair Generation: Automatically generates Golden QA datasets from document chunks for RAG benchmarking
- Quality Assessment: Provides Faithfulness, Relevancy, and Answerability evaluators
- Question Suggestion: Generates contextual follow-up questions from content or conversations
- Decoupled Design: Works with any LLM through the `ITextCompletionService` abstraction
Install the main package via NuGet:

```bash
dotnet add package FluxImprover
```

FluxImprover requires an `ITextCompletionService` implementation to connect to your LLM provider (OpenAI, Azure, Anthropic, local models, etc.):
```csharp
public class OpenAICompletionService : ITextCompletionService
{
    private readonly HttpClient _httpClient;
    private readonly string _model;

    public OpenAICompletionService(string apiKey, string model = "gpt-4o-mini")
    {
        _model = model;
        _httpClient = new HttpClient
        {
            BaseAddress = new Uri("https://api.openai.com/v1/")
        };
        _httpClient.DefaultRequestHeaders.Add("Authorization", $"Bearer {apiKey}");
    }

    public async Task<string> CompleteAsync(
        string prompt,
        CompletionOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        // Your OpenAI API implementation
        throw new NotImplementedException();
    }

    public async IAsyncEnumerable<string> CompleteStreamingAsync(
        string prompt,
        CompletionOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        // Your streaming implementation
        yield break;
    }
}
```

Use the `FluxImproverBuilder` to create all services:
```csharp
using FluxImprover;
using FluxImprover.Services;

// Create your ITextCompletionService implementation
ITextCompletionService completionService = new OpenAICompletionService(apiKey);

// Build FluxImprover services
var services = new FluxImproverBuilder()
    .WithCompletionService(completionService)
    .Build();
```

Or use dependency injection:
```csharp
// Register your ITextCompletionService
services.AddSingleton<ITextCompletionService, OpenAICompletionService>();

// Add FluxImprover services
services.AddFluxImprover();

// Or with factory
services.AddFluxImprover(sp => new OpenAICompletionService(apiKey));
```

Add summaries and keywords to your document chunks:
```csharp
using FluxImprover.Models;

var chunk = new Chunk
{
    Id = "chunk-1",
    Content = "Paris is the capital of France. It is known for the Eiffel Tower."
};

// Enrich with summary and keywords
var enrichedChunk = await services.ChunkEnrichment.EnrichAsync(chunk);

Console.WriteLine($"Summary: {enrichedChunk.Summary}");
Console.WriteLine($"Keywords: {string.Join(", ", enrichedChunk.Keywords ?? [])}");
```

Create question-answer pairs for RAG testing:
```csharp
using FluxImprover.Options;

var context = "The solar system has eight planets. Earth is the third planet from the sun.";

var options = new QAGenerationOptions
{
    PairsPerChunk = 3,
    QuestionTypes = [QuestionType.Factual, QuestionType.Reasoning]
};

var qaPairs = await services.QAGenerator.GenerateAsync(context, options);

foreach (var qa in qaPairs)
{
    Console.WriteLine($"Q: {qa.Question}");
    Console.WriteLine($"A: {qa.Answer}");
}
```

Assess answer quality with multiple metrics:
```csharp
var context = "France is in Europe. Paris is the capital of France.";
var question = "What is the capital of France?";
var answer = "Paris is the capital of France.";

// Faithfulness: Is the answer grounded in the context?
var faithfulness = await services.Faithfulness.EvaluateAsync(context, answer);

// Relevancy: Does the answer address the question?
var relevancy = await services.Relevancy.EvaluateAsync(question, answer, context: context);

// Answerability: Can the question be answered from the context?
var answerability = await services.Answerability.EvaluateAsync(context, question);

Console.WriteLine($"Faithfulness: {faithfulness.Score:P0}");
Console.WriteLine($"Relevancy: {relevancy.Score:P0}");
Console.WriteLine($"Answerability: {answerability.Score:P0}");

// Access detailed information
foreach (var detail in faithfulness.Details)
{
    Console.WriteLine($"  {detail.Key}: {detail.Value}");
}
```

Use the QA Pipeline to generate and automatically filter low-quality pairs:
```csharp
using FluxImprover.QAGeneration;

var chunks = new[]
{
    new Chunk { Id = "1", Content = "Machine learning is a subset of AI..." },
    new Chunk { Id = "2", Content = "Neural networks mimic the human brain..." }
};

var pipelineOptions = new QAPipelineOptions
{
    GenerationOptions = new QAGenerationOptions { PairsPerChunk = 2 },
    FilterOptions = new QAFilterOptions
    {
        MinFaithfulness = 0.7,
        MinRelevancy = 0.7,
        MinAnswerability = 0.6
    }
};

var results = await services.QAPipeline.ExecuteFromChunksBatchAsync(chunks, pipelineOptions);

var totalGenerated = results.Sum(r => r.GeneratedCount);
var totalFiltered = results.Sum(r => r.FilteredCount);
var allQAPairs = results.SelectMany(r => r.QAPairs).ToList();

Console.WriteLine($"Generated: {totalGenerated}, Passed Filter: {totalFiltered}");
```

Use intelligent chunk filtering with self-reflection and critic validation:
```csharp
using FluxImprover.ChunkFiltering;
using FluxImprover.Options;

var chunk = new Chunk
{
    Id = "chunk-1",
    Content = "This is a detailed technical document about machine learning algorithms..."
};

var filterOptions = new ChunkFilteringOptions
{
    MinimumScore = 0.6,
    EnableSelfReflection = true,
    EnableCriticValidation = true
};

// Assess chunk quality with 3-stage evaluation
var assessment = await services.ChunkFiltering.AssessAsync(chunk, filterOptions);

Console.WriteLine($"Initial Score: {assessment.InitialScore:P0}");
Console.WriteLine($"Reflected Score: {assessment.ReflectedScore:P0}");
Console.WriteLine($"Final Score: {assessment.FinalScore:P0}");
Console.WriteLine($"Should Include: {assessment.ShouldInclude}");
Console.WriteLine($"Reasoning: {assessment.Reasoning}");
```

The 3-stage assessment process:
1. Initial Assessment: The LLM evaluates chunk quality and relevance
2. Self-Reflection: The LLM reviews its initial assessment for consistency
3. Critic Validation: An independent LLM evaluation validates the assessment
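How the three stage scores roll up into `FinalScore` and `ShouldInclude` is internal to the library, but the general shape of such an aggregation can be sketched. The weights, threshold, and `StageScoreSketch` type below are illustrative assumptions for demonstration, not FluxImprover's actual logic:

```csharp
using System;

// Illustrative only: one plausible way to fold three per-stage scores into a
// final decision. Later stages are weighted more heavily because each stage
// reviews the judgment of the one before it. These weights and the default
// threshold are assumptions, not FluxImprover internals.
public static class StageScoreSketch
{
    public static double Combine(double initial, double reflected, double critic)
        => 0.2 * initial + 0.3 * reflected + 0.5 * critic;

    public static bool ShouldInclude(
        double initial, double reflected, double critic, double minimumScore = 0.6)
        => Combine(initial, reflected, critic) >= minimumScore;
}
```

With scores of 0.8, 0.7, and 0.9, the combined score is 0.82, which clears a 0.6 `MinimumScore` like the one in the example above.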
Optimize queries before RAG retrieval with normalization, synonym expansion, and intent classification:
```csharp
using FluxImprover.QueryPreprocessing;
using FluxImprover.Options;

var query = "How do I implement auth config?";

var options = new QueryPreprocessingOptions
{
    UseLlmExpansion = true,
    ExpandTechnicalTerms = true,
    MaxSynonymsPerKeyword = 3
};

var result = await services.QueryPreprocessing.PreprocessAsync(query, options);

Console.WriteLine($"Original: {result.OriginalQuery}");
Console.WriteLine($"Normalized: {result.NormalizedQuery}");
Console.WriteLine($"Expanded: {result.ExpandedQuery}");
Console.WriteLine($"Intent: {result.Intent} (confidence: {result.IntentConfidence:P0})");
Console.WriteLine($"Strategy: {result.SuggestedStrategy}");
Console.WriteLine($"Keywords: {string.Join(", ", result.Keywords)}");
Console.WriteLine($"Expanded Keywords: {string.Join(", ", result.ExpandedKeywords)}");
```

Features:
- Query Normalization: Lowercase, trim, remove extra whitespace
- Synonym Expansion: LLM-based and built-in technical term expansion (e.g., "auth" -> "authentication")
- Intent Classification: Classifies queries into types (HowTo, Definition, Code, Search, etc.)
- Entity Extraction: Identifies file names, class names, method names in queries
- Search Strategy: Recommends optimal search strategy (Semantic, Keyword, Hybrid, MultiQuery)
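A consumer can branch on the recommended strategy when issuing the actual search. The sketch below is hypothetical consumer code, not part of FluxImprover: the `SearchStrategy` enum is redeclared locally (mirroring the four strategies listed above) so the snippet stands alone, and each branch is a placeholder for a call into your own retrieval stack:

```csharp
// Redeclared locally so the sketch is self-contained; FluxImprover's own
// enum lives in the library.
public enum SearchStrategy { Semantic, Keyword, Hybrid, MultiQuery }

public static class RetrievalRouter
{
    // Maps the suggested strategy to a retrieval plan. In real code each
    // branch would query your vector store and/or keyword index.
    public static string Route(SearchStrategy strategy) => strategy switch
    {
        SearchStrategy.Semantic   => "embedding-only vector search",
        SearchStrategy.Keyword    => "BM25 / full-text search",
        SearchStrategy.Hybrid     => "vector search + keyword search with score fusion",
        SearchStrategy.MultiQuery => "run each expanded query variant, then merge results",
        _                         => "fall back to hybrid search"
    };
}
```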
Generate contextual questions from content or conversations:
```csharp
using FluxImprover.QuestionSuggestion;
using FluxImprover.Options;

// From a conversation
var history = new[]
{
    new ConversationMessage { Role = "user", Content = "What is machine learning?" },
    new ConversationMessage { Role = "assistant", Content = "Machine learning is a subset of AI..." }
};

var options = new QuestionSuggestionOptions
{
    MaxSuggestions = 3,
    Categories = [QuestionCategory.DeepDive, QuestionCategory.Related]
};

var suggestions = await services.QuestionSuggestion.SuggestFromConversationAsync(history, options);

foreach (var suggestion in suggestions)
{
    Console.WriteLine($"[{suggestion.Category}] {suggestion.Text} (relevance: {suggestion.Relevance:P0})");
}
```

FluxImprover is designed to be language-agnostic. The underlying LLM automatically detects the input language and responds accordingly.
Any language supported by your LLM provider works with FluxImprover:
- English - Primary development and testing language
- Korean - Tested with technical documentation (e.g., ClusterPlex HA solution manuals)
- Other languages - Japanese, Chinese, German, French, etc. (depends on LLM capability)
- Use a capable LLM: Modern LLMs (GPT-4, Claude, Phi-4) have excellent multilingual support
- Domain terminology: The LLM will recognize domain-specific terms in any language
- Mixed content: Documents with mixed languages (e.g., Korean text with English technical terms) are handled naturally
```csharp
var chunk = new Chunk
{
    Id = "korean-1",
    // "ClusterPlex is a high-availability (HA) solution that provides a
    //  heartbeat-based failover mechanism."
    Content = "ClusterPlex는 고가용성(HA) 솔루션으로, 핫빗 기반의 페일오버 메커니즘을 제공합니다."
};

var enriched = await services.ChunkEnrichment.EnrichAsync(chunk);
// Summary and keywords will be generated in Korean
```

| Service | Description |
|---|---|
| `Summarization` | Generates concise summaries from text |
| `KeywordExtraction` | Extracts relevant keywords |
| `ChunkEnrichment` | Combines summarization and keyword extraction |
| `ChunkFiltering` | 3-stage LLM-based chunk assessment with self-reflection and critic validation |
| `QueryPreprocessing` | Normalizes, expands, and classifies queries for optimal retrieval |
| `Faithfulness` | Evaluates whether answers are grounded in the context |
| `Relevancy` | Evaluates whether answers address the question |
| `Answerability` | Evaluates whether questions can be answered from the context |
| `QAGenerator` | Generates question-answer pairs from content |
| `QAFilter` | Filters QA pairs by quality thresholds |
| `QAPipeline` | End-to-end QA generation with quality filtering |
| `QuestionSuggestion` | Suggests contextual follow-up questions |
| `ContextualEnrichment` | Document-level contextual retrieval (Anthropic pattern) |
| `ChunkRelationship` | Discovers relationships between chunks |
FluxImprover requires an implementation of `ITextCompletionService` to communicate with LLMs:

```csharp
public interface ITextCompletionService
{
    Task<string> CompleteAsync(
        string prompt,
        CompletionOptions? options = null,
        CancellationToken cancellationToken = default);

    IAsyncEnumerable<string> CompleteStreamingAsync(
        string prompt,
        CompletionOptions? options = null,
        CancellationToken cancellationToken = default);
}
```

Each call is configured through the `CompletionOptions` record:

```csharp
public record CompletionOptions
{
    public string? SystemPrompt { get; init; }
    public float? Temperature { get; init; }
    public int? MaxTokens { get; init; }
    public bool JsonMode { get; init; } = false;
    public string? ResponseSchema { get; init; }
    public IReadOnlyList<ChatMessage>? Messages { get; init; }
}
```

```
┌───────────────────────────────────────────────────────────────────┐
│                        FluxImproverBuilder                        │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │                    ITextCompletionService                     │ │
│ │              (Provided by consumer application)               │ │
│ │ ┌───────────────────────────────────────────────────────────┐ │ │
│ │ │       OpenAI, Azure, Anthropic, Local Models, etc.        │ │ │
│ │ └───────────────────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────────┘ │
│        │                     │                     │              │
│ ┌──────▼──────┐       ┌──────▼──────┐       ┌──────▼──────┐       │
│ │ Enrichment  │       │ Evaluation  │       │     QA      │       │
│ │  Services   │       │   Metrics   │       │ Generation  │       │
│ └─────────────┘       └─────────────┘       └─────────────┘       │
│                                                                   │
│ ┌────────────────────┐ ┌─────────────────┐ ┌───────────────────┐  │
│ │  Chunk Filtering   │ │ Query Preproc.  │ │ Question Suggest. │  │
│ │ (3-Stage Assess.)  │ │ (Expand/Intent) │ │                   │  │
│ └────────────────────┘ └─────────────────┘ └───────────────────┘  │
│                                                                   │
│ ┌────────────────────┐ ┌─────────────────┐                        │
│ │ Contextual Enrich. │ │ Chunk Relations │                        │
│ │ (Anthropic pattern)│ │    Discovery    │                        │
│ └────────────────────┘ └─────────────────┘                        │
└───────────────────────────────────────────────────────────────────┘
```
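For unit-testing code that depends on FluxImprover, a deterministic stub keeps real LLM calls out of the test suite. The sketch below is illustrative, not part of the library; the interface and a trimmed `CompletionOptions` (without `ResponseSchema` and `Messages`) are repeated from the definitions above so the snippet compiles standalone:

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Repeated (trimmed) from the definitions above so this sketch is standalone.
public record CompletionOptions
{
    public string? SystemPrompt { get; init; }
    public float? Temperature { get; init; }
    public int? MaxTokens { get; init; }
    public bool JsonMode { get; init; } = false;
}

public interface ITextCompletionService
{
    Task<string> CompleteAsync(
        string prompt,
        CompletionOptions? options = null,
        CancellationToken cancellationToken = default);

    IAsyncEnumerable<string> CompleteStreamingAsync(
        string prompt,
        CompletionOptions? options = null,
        CancellationToken cancellationToken = default);
}

// Test double: always returns canned text instead of calling an LLM.
public sealed class StubCompletionService : ITextCompletionService
{
    private readonly string _cannedResponse;

    public StubCompletionService(string cannedResponse) => _cannedResponse = cannedResponse;

    public Task<string> CompleteAsync(
        string prompt,
        CompletionOptions? options = null,
        CancellationToken cancellationToken = default)
        => Task.FromResult(_cannedResponse);

    public async IAsyncEnumerable<string> CompleteStreamingAsync(
        string prompt,
        CompletionOptions? options = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        await Task.Yield();
        // Stream the canned response as a single token.
        yield return _cannedResponse;
    }
}
```

Registering the stub in place of a real provider lets enrichment or evaluation code paths run in tests without network access.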
Check out the Console Demo for a complete example showing all features with OpenAI integration.
Contributions are welcome! Please feel free to submit a Pull Request.
MIT License - See LICENSE file