AiGeekSquad.AIContext

A C# library for managing context in AI applications. It combines semantic text chunking, Maximum Marginal Relevance (MMR) selection, a generic ranking engine, and context rendering to help you build RAG systems, search engines, and content recommendation platforms.

🏗️ Repository Structure

AiContext/
├── src/
│   ├── AiGeekSquad.AIContext/              # Main library package
│   │   ├── Chunking/                       # Semantic text chunking components
│   │   ├── Ranking/                        # MMR algorithm implementation
│   │   ├── ContextRendering/               # Context management and rendering
│   │   └── README.md                       # NuGet package documentation
│   ├── AiGeekSquad.AIContext.MEAI/         # Microsoft.Extensions.AI integration
│   ├── AiGeekSquad.AIContext.Tests/        # Unit tests
│   └── AiGeekSquad.AIContext.Benchmarks/   # Performance benchmarks
├── docs/                                   # Detailed documentation
├── examples/                               # Usage examples and demos
│   └── notebooks/                          # Jupyter notebooks with tutorials
└── README.md                               # This file (repository overview)

✨ Features

🧠 Semantic Text Chunking

  • Intelligent text splitting based on semantic similarity analysis
  • Configurable chunk sizes with token-aware boundaries
  • Multiple text splitters (sentence, custom regex patterns)
  • Embedding-based analysis using your choice of embedding providers
  • Fallback mechanisms ensuring robust chunk generation
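
The idea behind similarity-based splitting can be illustrated with a minimal sketch. This is not the library's SemanticTextChunker: the class name, the fixed similarity threshold, and the toy embeddings are assumptions made for illustration, and the real chunker also applies buffers, token limits, and fallbacks that are not shown here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of AiGeekSquad.AIContext.
public static class BreakpointSketch
{
    // Cosine similarity between two equal-length vectors (0 if either has zero norm).
    public static double Cosine(double[] a, double[] b)
    {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return na == 0 || nb == 0 ? 0 : dot / Math.Sqrt(na * nb);
    }

    // Groups consecutive segment indices, starting a new chunk whenever the
    // similarity between neighbouring segments drops below the threshold.
    public static List<List<int>> GroupSegments(IReadOnlyList<double[]> embeddings, double threshold)
    {
        var chunks = new List<List<int>>();
        if (embeddings.Count == 0) return chunks;

        var current = new List<int> { 0 };
        for (int i = 1; i < embeddings.Count; i++)
        {
            if (Cosine(embeddings[i - 1], embeddings[i]) < threshold)
            {
                chunks.Add(current);          // close the current chunk at the topic shift
                current = new List<int>();
            }
            current.Add(i);
        }
        chunks.Add(current);
        return chunks;
    }
}
```

With four segment embeddings where the first two and the last two point in similar directions, a threshold of 0.5 yields two chunks: segments 0 and 1, then segments 2 and 3.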

🎯 Maximum Marginal Relevance (MMR)

  • High-performance implementation of the MMR algorithm
  • Relevance-diversity balance for better search results
  • Optimized for large datasets with O(n²k) complexity
  • Comprehensive benchmarks with real performance data

⚖️ Generic Ranking Engine

  • Combines multiple scoring functions with weights (positive for similarity, negative for dissimilarity)
  • Multiple normalization strategies (MinMax, ZScore, Percentile) for score standardization
  • Multiple combination strategies, including weighted sum (WeightedSumStrategy)
  • Extensible architecture for custom scoring functions and strategies
  • Fully benchmarked with performance insights and optimization guidance

🧾 Context Rendering

  • Intelligent context management with time-based freshness weighting
  • MMR-powered context selection balancing relevance, diversity, and recency
  • Token budget management for optimal context window utilization
  • TimeProvider pattern for testable time-dependent operations

🔗 Microsoft.Extensions.AI Integration (MEAI)

  • Seamless integration with Microsoft's AI abstractions
  • Dependency injection ready for standard .NET DI containers
  • Embedding provider compatibility with Microsoft's ecosystem
  • Configuration patterns following Microsoft.Extensions standards

🛠️ Extensible Architecture

  • Dependency injection ready with clean interfaces
  • Custom text splitters for domain-specific requirements
  • Pluggable embedding generators for different AI models
  • Token counting with real tokenizer implementations

🚀 Getting Started

Prerequisites

  • .NET 9.0 SDK or later
  • Visual Studio 2022 or VS Code with C# extension

Building the Project

# Clone the repository
git clone https://github.com/AiGeekSquad/AIContext.git
cd AIContext

# Restore dependencies
dotnet restore

# Build the solution
dotnet build

# Build in Release mode
dotnet build --configuration Release

Running Tests

# Run all tests
dotnet test

# Run tests with coverage (generates both Cobertura and OpenCover formats)
dotnet test --collect:"XPlat Code Coverage" --results-directory ./TestResults/ -- DataCollectionRunSettings.DataCollectors.DataCollector.Configuration.Format=cobertura%2Copencover

# Run specific test projects
dotnet test src/AiGeekSquad.AIContext.Tests/
dotnet test --filter "SemanticChunkingTests"
dotnet test --filter "MaximumMarginalRelevanceTests"

Running Examples

# Run the basic chunking example
dotnet run --project examples/ --configuration Release BasicChunking

# Run the MMR demonstration
dotnet run --project examples/ --configuration Release MMR

# Or create a simple console app to test
dotnet new console -n MyAIContextTest
cd MyAIContextTest
dotnet add package AiGeekSquad.AIContext
# Copy examples from the repository and run

Running Benchmarks

# Run all benchmarks
dotnet run --project src/AiGeekSquad.AIContext.Benchmarks/ --configuration Release

# Run specific benchmarks
dotnet run --project src/AiGeekSquad.AIContext.Benchmarks/ --configuration Release -- --filter "*MMR*"
dotnet run --project src/AiGeekSquad.AIContext.Benchmarks/ --configuration Release -- --filter "*Chunking*"

# Export benchmark results
dotnet run --project src/AiGeekSquad.AIContext.Benchmarks/ --configuration Release -- --exporters json html

🧪 Testing

The library includes comprehensive test coverage:

  • 217 unit tests covering all core functionality
  • Real implementation testing (no mocks for core algorithms)
  • Edge case handling with robust fallback mechanisms
  • Performance testing with benchmarks

Test Categories

| Test Project | Coverage | Description |
| --- | --- | --- |
| SemanticChunkingTests | Core chunking logic | Text splitting, embedding analysis, chunk generation |
| SentenceTextSplitterTests | Text splitting | Sentence boundary detection, custom patterns |
| MaximumMarginalRelevanceTests | MMR algorithm | Relevance scoring, diversity optimization |

Running Specific Test Categories

# Semantic chunking tests
dotnet test --filter "SemanticChunkingTests"

# Text splitter tests
dotnet test --filter "SentenceTextSplitterTests"

# MMR algorithm tests
dotnet test --filter "MaximumMarginalRelevanceTests"

📊 Performance Benchmarks

The project includes comprehensive benchmarks in src/AiGeekSquad.AIContext.Benchmarks/ using BenchmarkDotNet for accurate performance measurement.

⚖️ Generic Ranking Engine Benchmarks

The RankingEngineBenchmarks.cs file provides comprehensive performance testing for the Generic Ranking Engine with multiple scoring functions, normalization strategies, and combination approaches.

🎯 MMR Algorithm Benchmarks

The MmrBenchmarks.cs file provides comprehensive performance testing for the Maximum Marginal Relevance algorithm.

Benchmark Parameters

| Parameter | Values Tested | Description |
| --- | --- | --- |
| Vector Count | 1,000 | Number of vectors in the dataset |
| Vector Dimensions | 100, 384 | Embedding dimensions (standard and OpenAI-compatible) |
| TopK | 10 | Number of results to return |
| Lambda | 0.0, 0.5, 1.0 | Relevance vs diversity balance |

MMR Benchmark Scenarios

| Benchmark Method | Purpose | Configuration |
| --- | --- | --- |
| ComputeMMR() | Main benchmark with parameter combinations | Uses [Params] for comprehensive testing |
| ComputeMMR_PureRelevance() | Pure relevance selection | Lambda = 1.0 (relevance only) |
| ComputeMMR_PureDiversity() | Pure diversity selection | Lambda = 0.0 (diversity only) |
| ComputeMMR_Balanced() | Balanced selection | Lambda = 0.5 (balanced approach) |
| ComputeMMR_MemoryFocused() | Memory allocation analysis | Includes forced GC for accurate measurement |

Performance Characteristics

  • Processing Time: ~2ms for 1,000 vectors (384 dimensions)
  • Memory Allocation: ~120KB per 1,000 vectors
  • Complexity: O(n²k) where n = vector count, k = topK
  • Optimization: Leverages .NET 9.0 with AVX-512 support
  • Reproducibility: Fixed seed (42) for consistent results
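
To make the role of lambda concrete, here is a minimal greedy MMR sketch. It is written for readability and is not the library's MaximumMarginalRelevance implementation: the class name, the use of plain double[] vectors, and the cosine helper are assumptions, and this naive loop makes no attempt to match the optimizations or complexity figures quoted above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of AiGeekSquad.AIContext.
public static class MmrSketch
{
    // Cosine similarity between two equal-length vectors (0 if either has zero norm).
    public static double Cosine(double[] a, double[] b)
    {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return na == 0 || nb == 0 ? 0 : dot / Math.Sqrt(na * nb);
    }

    // Returns the indices of up to topK vectors, chosen greedily by
    // score = lambda * relevance - (1 - lambda) * redundancy.
    public static List<int> Select(IReadOnlyList<double[]> vectors, double[] query, double lambda, int topK)
    {
        var selected = new List<int>();
        var used = new bool[vectors.Count];

        while (selected.Count < topK && selected.Count < vectors.Count)
        {
            int best = -1;
            double bestScore = double.NegativeInfinity;

            for (int i = 0; i < vectors.Count; i++)
            {
                if (used[i]) continue;

                double relevance = Cosine(vectors[i], query);

                // Redundancy: highest similarity to anything already selected (0 for the first pick).
                double redundancy = selected.Count == 0 ? 0 : double.NegativeInfinity;
                foreach (int j in selected)
                    redundancy = Math.Max(redundancy, Cosine(vectors[i], vectors[j]));

                double score = lambda * relevance - (1 - lambda) * redundancy;
                if (score > bestScore)
                {
                    bestScore = score;
                    best = i;
                }
            }

            used[best] = true;
            selected.Add(best);
        }

        return selected;
    }
}
```

With lambda = 1.0 the loop reduces to a plain top-k by relevance; lowering lambda penalizes candidates that resemble items already chosen, so a near-duplicate of the first result loses out to a less similar vector.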

🧠 Semantic Chunking Benchmarks

The SemanticChunkingBenchmarks.cs file provides comprehensive performance testing for semantic text chunking functionality.

Benchmark Parameters

| Parameter | Values Tested | Description |
| --- | --- | --- |
| Document Size | Short, Medium, Long | Text complexity and length variations |
| Max Tokens Per Chunk | 256, 512 | Chunk size configurations |
| Caching | Enabled, Disabled | Embedding cache impact |

Document Size Specifications

| Size | Content | Token Count (Approx.) |
| --- | --- | --- |
| Short | 5 simple sentences | ~50-100 tokens |
| Medium | 3 paragraphs with technical content | ~200-400 tokens |
| Long | 3 detailed paragraphs with complex topics | ~800-1200 tokens |

Semantic Chunking Benchmark Scenarios

| Benchmark Method | Purpose | Configuration |
| --- | --- | --- |
| SemanticChunking_Complete() | Baseline benchmark | Uses parameterized configurations |
| SemanticChunking_DefaultOptions() | Default configuration performance | Standard SemanticChunkingOptions.Default |
| SemanticChunking_OptimizedForSpeed() | Speed-optimized configuration | Buffer=1, Threshold=0.75, Cache=true |
| SemanticChunking_OptimizedForQuality() | Quality-optimized configuration | Buffer=3, Threshold=0.90, MaxTokens=1024 |
| SemanticChunking_SmallBuffer() | Buffer size impact (small) | BufferSize=1 |
| SemanticChunking_LargeBuffer() | Buffer size impact (large) | BufferSize=4 |
| SemanticChunking_CachingFirstPass() | Cache miss performance | Fresh chunker instance |
| SemanticChunking_NoCaching() | No caching baseline | Caching disabled |

Performance Characteristics

  • Streaming Processing: Uses IAsyncEnumerable for memory efficiency
  • Token-Aware: Real tokenization using Microsoft.ML.Tokenizers
  • Embedding Cache: LRU cache with configurable size (up to 1000 entries)
  • Mock Implementation: High-performance mocks for consistent benchmarking
  • Memory Efficient: Minimal allocations with streaming approach
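
For readers unfamiliar with LRU eviction, the following sketch shows the general technique behind an embedding cache of this kind. It is not the library's EmbeddingCache: the class name and API are assumptions, and this version is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of least-recently-used eviction; not the library's EmbeddingCache.
public sealed class LruCacheSketch<TKey, TValue> where TKey : notnull
{
    private readonly int _capacity;
    private readonly Dictionary<TKey, LinkedListNode<(TKey Key, TValue Value)>> _map = new();
    private readonly LinkedList<(TKey Key, TValue Value)> _order = new(); // front = most recently used

    public LruCacheSketch(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    public int Count => _map.Count;

    // Looks up a key and, on a hit, marks the entry as most recently used.
    public bool TryGet(TKey key, out TValue value)
    {
        if (_map.TryGetValue(key, out var node))
        {
            _order.Remove(node);
            _order.AddFirst(node);
            value = node.Value.Value;
            return true;
        }
        value = default!;
        return false;
    }

    // Inserts or updates an entry, evicting the least recently used one when full.
    public void Set(TKey key, TValue value)
    {
        if (_map.TryGetValue(key, out var existing))
        {
            _order.Remove(existing);
            _map.Remove(key);
        }
        else if (_map.Count >= _capacity)
        {
            var oldest = _order.Last!;
            _order.RemoveLast();
            _map.Remove(oldest.Value.Key);
        }
        _map[key] = _order.AddFirst((key, value));
    }
}
```

Keying such a cache by the segment text means a repeated sentence only needs one call to the embedding provider, which is the effect the caching benchmarks above are designed to measure.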

⚙️ Benchmark Configuration

The benchmarks use a custom BenchmarkConfig.cs with the following settings:

Runtime Configuration

| Setting | Value | Purpose |
| --- | --- | --- |
| Target Framework | .NET 9.0 | Latest performance optimizations |
| Platform | x64 | 64-bit architecture support |
| GC Modes | Server GC, Workstation GC | Compare garbage collection strategies |
| Memory Diagnostics | Enabled | Track allocations and memory usage |

Output Formats

  • Console: Real-time progress and summary
  • Markdown: GitHub-compatible tables
  • HTML: Interactive web reports
  • Statistical Analysis: Mean, Median, P95, Rankings

🚀 Running Benchmarks

Command Line Options

# Run MMR benchmarks only
dotnet run --project src/AiGeekSquad.AIContext.Benchmarks/ --configuration Release mmr

# Run semantic chunking benchmarks only
dotnet run --project src/AiGeekSquad.AIContext.Benchmarks/ --configuration Release semantic

# Run all benchmarks
dotnet run --project src/AiGeekSquad.AIContext.Benchmarks/ --configuration Release all

# Run with specific filters
dotnet run --project src/AiGeekSquad.AIContext.Benchmarks/ --configuration Release -- --filter "*MMR*"
dotnet run --project src/AiGeekSquad.AIContext.Benchmarks/ --configuration Release -- --filter "*Chunking*"

Export Options

# Export to JSON format
dotnet run --project src/AiGeekSquad.AIContext.Benchmarks/ --configuration Release -- --exporters json

# Export to HTML format
dotnet run --project src/AiGeekSquad.AIContext.Benchmarks/ --configuration Release -- --exporters html

# Export to multiple formats
dotnet run --project src/AiGeekSquad.AIContext.Benchmarks/ --configuration Release -- --exporters json html

📈 Benchmark Environment

Hardware Requirements

  • OS: Windows (for Windows-specific performance counters)
  • Runtime: .NET 9.0 SDK
  • Architecture: x64 recommended for optimal performance
  • Memory: Sufficient RAM for vector operations and caching

Test Data Generation

  • Reproducible: Fixed seed (42) for consistent results
  • Realistic: Vector values between -1 and 1
  • Fresh Data: Regenerated per iteration to avoid caching effects
  • Normalized: Proper vector normalization for semantic similarity

🔍 Performance Insights

The benchmarks help identify:

  • Scalability: How performance scales with vector count and dimensions
  • Memory Usage: Allocation patterns and memory efficiency
  • Parameter Impact: How lambda and TopK values affect MMR performance
  • Configuration Impact: How chunking options affect semantic processing
  • Caching Benefits: Performance gains from embedding caching
  • GC Behavior: Impact of different garbage collection strategies

🔧 Development Workflow

Project Setup for Contributors

  1. Fork the repository on GitHub
  2. Clone your fork locally:
    git clone https://github.com/YOUR-USERNAME/AIContext.git
    cd AIContext
  3. Create a feature branch:
    git checkout -b feature/your-feature-name
  4. Install dependencies:
    dotnet restore
  5. Make your changes and ensure tests pass:
    dotnet build
    dotnet test

Code Quality Standards

  • Code Coverage: Maintain >90% test coverage for new features
  • Performance: Run benchmarks for performance-critical changes
  • Documentation: Update relevant documentation for API changes
  • Coding Style: Follow existing C# conventions and patterns

Continuous Integration

The project uses GitHub Actions for continuous integration:

  • Automated builds on every commit
  • Test execution with comprehensive coverage
  • NuGet package generation and publishing
  • Version management with automatic build numbering

🧾 Context Rendering

The ContextRenderer provides intelligent context management for chat applications and RAG systems, balancing relevance, diversity, and freshness within token budgets.

Key Features

  • Time-based freshness weighting with configurable freshnessWeight parameter (0.0 to 1.0)
  • MMR-powered context selection for optimal relevance-diversity balance
  • Token budget management to respect context window limitations
  • TimeProvider pattern for testable time-dependent operations
  • Microsoft.Extensions.AI integration for chat message handling
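
The interaction between freshnessWeight and a token budget can be pictured with a simplified sketch. The formulas below are assumptions chosen for illustration (a linear freshness scale, a linear blend, and greedy packing); the README does not state how ContextRenderer computes these, and the real renderer selects items with MMR rather than a plain sort.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of AiGeekSquad.AIContext.
public static class ContextSelectionSketch
{
    // Assumed freshness scale: 0.0 for the oldest item, 1.0 for the newest, linear in between.
    public static double Freshness(DateTimeOffset timestamp, DateTimeOffset oldest, DateTimeOffset newest)
    {
        double span = (newest - oldest).TotalSeconds;
        return span <= 0 ? 1.0 : (timestamp - oldest).TotalSeconds / span;
    }

    // Assumed linear blend: freshnessWeight = 0 ignores recency, 1 ignores relevance.
    public static double Blend(double relevance, double freshness, double freshnessWeight) =>
        (1.0 - freshnessWeight) * relevance + freshnessWeight * freshness;

    // Greedy packing: take items in descending score order, skipping any that would exceed the budget.
    public static List<string> Pack(IEnumerable<(string Text, double Score, int Tokens)> items, int tokenBudget)
    {
        var picked = new List<string>();
        int used = 0;
        foreach (var item in items.OrderByDescending(i => i.Score))
        {
            if (used + item.Tokens > tokenBudget) continue;
            picked.Add(item.Text);
            used += item.Tokens;
        }
        return picked;
    }
}
```

Under these assumptions, a message with relevance 0.8 and freshness 0.2 scores 0.8 at freshnessWeight 0.0 and 0.2 at freshnessWeight 1.0, which is the trade-off the parameter range of 0.0 to 1.0 is meant to expose.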

Usage Example

using AiGeekSquad.AIContext.ContextRendering;
using AiGeekSquad.AIContext.Chunking;
using Microsoft.Extensions.AI;

// Create context renderer with dependencies
var tokenCounter = new MLTokenCounter();
var embeddingGenerator = CreateYourEmbeddingGenerator(); // Your implementation
var renderer = new ContextRenderer(tokenCounter, embeddingGenerator);

// Add chat messages over time
await renderer.AddMessageAsync(new ChatMessage(ChatRole.User, "What is machine learning?"));
await renderer.AddMessageAsync(new ChatMessage(ChatRole.Assistant, "Machine learning is..."));
await renderer.AddMessageAsync(new ChatMessage(ChatRole.User, "Tell me about deep learning"));

// Render context with freshness weighting
var context = await renderer.RenderContextAsync(
    query: "deep learning techniques",
    tokenBudget: 1000,           // Limit context size
    lambda: 0.7,                 // Favor relevance
    freshnessWeight: 0.3         // Give some priority to recent messages
);

// Use context for LLM prompt
var contextText = string.Join("\n", context.Select(item => item.Content));

TimeProvider Pattern for Testing

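// FakeTimeProvider ships in the Microsoft.Extensions.TimeProvider.Testing package
using Microsoft.Extensions.Time.Testing;

// Production code uses TimeProvider.System by default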
var renderer = new ContextRenderer(tokenCounter, embeddingGenerator);

// Testing with FakeTimeProvider for time control
var fakeTimeProvider = new FakeTimeProvider();
var testRenderer = new ContextRenderer(tokenCounter, embeddingGenerator, fakeTimeProvider);

await testRenderer.AddMessageAsync(new ChatMessage(ChatRole.User, "Old message"));
fakeTimeProvider.Advance(TimeSpan.FromMinutes(10)); // Simulate time passage
await testRenderer.AddMessageAsync(new ChatMessage(ChatRole.User, "Recent message"));

// Test freshness weighting behavior
var result = await testRenderer.RenderContextAsync("query", freshnessWeight: 0.8);
// Recent message should be prioritized due to high freshness weight

🔗 Microsoft.Extensions.AI Integration (MEAI)

The AiGeekSquad.AIContext.MEAI project provides seamless integration with Microsoft's AI abstractions, enabling AIContext components to work with the broader Microsoft.Extensions.AI ecosystem.

Purpose and Benefits

  • Ecosystem Integration: Use any embedding generator implementing Microsoft's IEmbeddingGenerator<TInput,TEmbedding> interface
  • Dependency Injection Ready: Full support for .NET's standard DI container patterns
  • Provider Flexibility: Switch between OpenAI, Azure OpenAI, and other providers without code changes
  • Future-Proof Architecture: Benefit from updates to both Microsoft's AI abstractions and AIContext libraries

Installation and Setup

# Install the MEAI integration package
dotnet add package AiGeekSquad.AIContext.MEAI

Basic Usage

using AiGeekSquad.AIContext.MEAI;
using AiGeekSquad.AIContext.Chunking;
using Microsoft.Extensions.AI;

// Use any Microsoft Extensions AI embedding generator
IEmbeddingGenerator<string, Embedding<float>> microsoftGenerator =
    CreateYourEmbeddingGenerator(); // Your specific implementation

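// Wrap with MEAI adapter (type fully qualified to avoid ambiguity with
// the IEmbeddingGenerator type in Microsoft.Extensions.AI)
AiGeekSquad.AIContext.Chunking.IEmbeddingGenerator aiContextGenerator =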
    new MicrosoftExtensionsAIEmbeddingGenerator(microsoftGenerator);

// Use with semantic chunking
var chunker = new SemanticTextChunker(
    embeddingGenerator: aiContextGenerator,
    tokenCounter: new MLTokenCounter(),
    similarityCalculator: new MathNetSimilarityCalculator(),
    textSplitter: new SentenceTextSplitter()
);

var chunks = await chunker.ChunkTextAsync("Your document text...");

Dependency Injection Configuration

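using AiGeekSquad.AIContext.Chunking;
using AiGeekSquad.AIContext.MEAI;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;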

var builder = Host.CreateApplicationBuilder(args);

// Register your Microsoft Extensions AI embedding generator
builder.Services.AddSingleton<IEmbeddingGenerator<string, Embedding<float>>>(provider =>
{
    return CreateYourOpenAIEmbeddingGenerator(); // Your implementation
});

// Register AIContext dependencies
builder.Services.AddSingleton<ITokenCounter, MLTokenCounter>();
builder.Services.AddSingleton<ISimilarityCalculator, MathNetSimilarityCalculator>();
builder.Services.AddSingleton<ITextSplitter, SentenceTextSplitter>();

// Register the MEAI adapter
builder.Services.AddSingleton<AiGeekSquad.AIContext.Chunking.IEmbeddingGenerator,
    MicrosoftExtensionsAIEmbeddingGenerator>();

// Register semantic chunker
builder.Services.AddSingleton<SemanticTextChunker>();

var app = builder.Build();

// Use the chunker
var chunker = app.Services.GetRequiredService<SemanticTextChunker>();

Supported Embedding Providers

The MEAI integration works with any embedding provider that implements Microsoft's AI abstractions:

  • OpenAI - Direct OpenAI API integration
  • Azure OpenAI - Azure-hosted OpenAI models
  • Local Models - Ollama, LocalAI, and other local inference servers
  • Cloud Providers - AWS Bedrock, Google Vertex AI (via compatible adapters)
  • Custom Providers - Any implementation of IEmbeddingGenerator<string, Embedding<float>>

💡 Examples & Use Cases

🚀 Complete Working Examples

The examples/ directory contains fully functional demonstrations:

Running Examples

# Run the enterprise RAG service demo
dotnet run --project examples/ --configuration Release EnterpriseRAGServiceDemo

# Run the MMR clustering problem demonstration
dotnet run --project examples/ --configuration Release MMRClusteringProblemDemo

# Interactive notebook (requires Jupyter)
jupyter notebook examples/notebooks/beyond-basic-rag-mmr-complete-demo.ipynb

🔍 RAG Systems (Retrieval Augmented Generation)

// Complete RAG pipeline example
public async Task<string> ProcessUserQuery(string question)
{
    // 1. Generate query embedding
    var queryEmbedding = await embeddingGenerator.GenerateEmbeddingAsync(question);
    
    // 2. Retrieve candidate chunks from vector database
    var candidates = await vectorDb.SearchSimilarAsync(queryEmbedding, topK: 20);
    
    // 3. Use MMR to select diverse, relevant context
    var selectedContext = MaximumMarginalRelevance.ComputeMMR(
        vectors: candidates.Select(c => c.Embedding).ToList(),
        query: queryEmbedding,
        lambda: 0.8,  // Balance relevance vs diversity
        topK: 5       // Limit for LLM context window
    );
    
    // 4. Generate response with selected context
    var contextText = string.Join("\n", selectedContext.Select(s => candidates[s.Index].Text));
    return await llm.GenerateResponseAsync(question, contextText);
}

📚 Document Processing Scenarios

  • Knowledge base chunking - Semantic splitting for enterprise search systems
  • Legal document analysis - Custom text splitters for numbered sections and clauses
  • Research paper processing - Academic content patterns with citation awareness
  • Technical documentation - Code-aware splitting that preserves syntax integrity

🎯 Content Recommendation Systems

// Avoid recommending similar content using MMR
var recommendations = MaximumMarginalRelevance.ComputeMMR(
    vectors: availableContent.Select(c => c.Embedding).ToList(),
    query: userInterestVector,
    lambda: 0.6,  // Near-balanced: gives diversity more weight than a relevance-heavy setting
    topK: 10
);

🔬 Research & Analytics Applications

  • Literature review systems - Diverse paper selection for comprehensive coverage
  • Market research - Balanced sampling from different data sources
  • Content analysis - Representative text selection for qualitative research

⚖️ Generic Ranking Engine for Multi-Criteria Ranking

using System;
using System.Collections.Generic;
using System.Linq;
using AiGeekSquad.AIContext.Ranking;
using AiGeekSquad.AIContext.Ranking.Normalizers;
using AiGeekSquad.AIContext.Ranking.Strategies;

// Example document class
public class Document
{
    public string Title { get; set; }
    public double RelevanceScore { get; set; }
    public int PopularityRank { get; set; }
    
    public Document(string title, double relevanceScore, int popularityRank)
    {
        Title = title;
        RelevanceScore = relevanceScore;
        PopularityRank = popularityRank;
    }
}

// Custom scoring functions
public class SemanticRelevanceScorer : IScoringFunction<Document>
{
    public string Name => "SemanticRelevance";
    public double ComputeScore(Document item) => item.RelevanceScore;
    public double[] ComputeScores(IReadOnlyList<Document> items) =>
        items.Select(ComputeScore).ToArray();
}

public class PopularityScorer : IScoringFunction<Document>
{
    public string Name => "Popularity";
    public double ComputeScore(Document item) => 1.0 / item.PopularityRank;
    public double[] ComputeScores(IReadOnlyList<Document> items) =>
        items.Select(ComputeScore).ToArray();
}

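// Create documents to rank
// (in a single-file program, top-level statements like these must appear before the type declarations above)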
var documents = new List<Document>
{
    new("AI Research Paper", relevanceScore: 0.9, popularityRank: 5),
    new("ML Tutorial", relevanceScore: 0.7, popularityRank: 1),
    new("Data Science Guide", relevanceScore: 0.8, popularityRank: 3)
};

// Create scoring functions with weights
var scoringFunctions = new List<WeightedScoringFunction<Document>>
{
    // Positive weight for similarity (relevance)
    new(new SemanticRelevanceScorer(), weight: 0.7)
    {
        Normalizer = new MinMaxNormalizer()
    },
    // Negative weight for dissimilarity (avoid popular but less relevant)
    new(new PopularityScorer(), weight: -0.3)
    {
        Normalizer = new ZScoreNormalizer()
    }
};

// Create ranking engine and rank documents
var engine = new RankingEngine<Document>();
var results = engine.Rank(documents, scoringFunctions, new WeightedSumStrategy());

foreach (var result in results)
{
    Console.WriteLine($"Rank {result.Rank}: {result.Item.Title} (Score: {result.FinalScore:F3})");
    Console.WriteLine($"  Relevance: {result.ComponentScores["SemanticRelevance"]:F3}");
    Console.WriteLine($"  Popularity: {result.ComponentScores["Popularity"]:F3}");
}
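
What the normalizers and the weighted-sum step do to raw scores can be shown with a small standalone sketch. These functions are not the library's MinMaxNormalizer, ZScoreNormalizer, or WeightedSumStrategy; the names, the use of population standard deviation, and the handling of constant inputs are assumptions, and inputs are assumed to be non-empty.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of AiGeekSquad.AIContext.
public static class NormalizationSketch
{
    // Rescales scores to [0, 1]; a constant input maps to all zeros.
    public static double[] MinMax(double[] scores)
    {
        double min = scores.Min();
        double range = scores.Max() - min;
        return scores.Select(s => range == 0 ? 0.0 : (s - min) / range).ToArray();
    }

    // Centers scores on the mean and divides by the population standard deviation.
    public static double[] ZScore(double[] scores)
    {
        double mean = scores.Average();
        double std = Math.Sqrt(scores.Select(s => (s - mean) * (s - mean)).Average());
        return scores.Select(s => std == 0 ? 0.0 : (s - mean) / std).ToArray();
    }

    // Combines per-function normalized scores: combined[i] = sum over f of weights[f] * normalized[f][i].
    public static double[] WeightedSum(double[][] normalized, double[] weights)
    {
        int n = normalized[0].Length;
        var combined = new double[n];
        for (int f = 0; f < normalized.Length; f++)
            for (int i = 0; i < n; i++)
                combined[i] += weights[f] * normalized[f][i];
        return combined;
    }
}
```

Normalizing first matters because raw scores from different functions rarely share a scale; a negative weight then subtracts a function's normalized contribution from the final score instead of adding it.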

🏗️ Architecture

Core Components

graph TB
    A[SemanticTextChunker] --> B[ITextSplitter]
    A --> C[IEmbeddingGenerator]
    A --> D[ITokenCounter]
    A --> E[ISimilarityCalculator]
    
    F[MaximumMarginalRelevance] --> G[Vector Operations]
    F --> H[Similarity Calculations]
    
    B --> I[SentenceTextSplitter]
    C --> J[Your Embedding Provider]
    D --> K[MLTokenCounter]
    E --> L[MathNetSimilarityCalculator]

Core Interfaces

// Implement for your embedding provider
public interface IEmbeddingGenerator
{
    IAsyncEnumerable<Vector<double>> GenerateBatchEmbeddingsAsync(
        IEnumerable<string> texts, 
        CancellationToken cancellationToken = default);
}

// Implement for custom text splitting
public interface ITextSplitter
{
    IAsyncEnumerable<TextSegment> SplitAsync(
        string text, 
        CancellationToken cancellationToken = default);
}

// Real token counting
public interface ITokenCounter
{
    Task<int> CountTokensAsync(string text, CancellationToken cancellationToken = default);
}

Built-in Implementations

  • MLTokenCounter - GPT-4 compatible tokenizer using Microsoft.ML.Tokenizers
    • Uses TiktokenTokenizer internally with GPT-4 tokenization standards
    • Provides accurate token counting for context window management
    • Seamlessly integrates with semantic chunking workflows
  • SentenceTextSplitter - Regex-based sentence splitting with customizable patterns
    • Default pattern is optimized for English text
    • Handles common English titles and abbreviations (Mr., Mrs., Ms., Dr., Prof., Sr., Jr.)
    • Prevents incorrect sentence breaks at abbreviations
  • MathNetSimilarityCalculator - Cosine similarity using MathNet.Numerics
  • EmbeddingCache - LRU cache for embedding storage with configurable capacity
  • ContextRenderer - Intelligent context management with time-based freshness weighting
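
The abbreviation handling described for SentenceTextSplitter can be approximated with a short regex sketch. The pattern below is an assumption written for illustration, not the library's default pattern: it splits on whitespace that follows sentence-ending punctuation unless that punctuation belongs to one of the listed titles.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of AiGeekSquad.AIContext.
public static class SentenceSplitSketch
{
    // Split at whitespace preceded by '.', '!' or '?', except right after a known title such as "Dr.".
    private static readonly Regex Boundary = new Regex(
        @"(?<=[.!?])(?<!\b(?:Mr|Mrs|Ms|Dr|Prof|Sr|Jr)\.)\s+",
        RegexOptions.Compiled);

    public static string[] Split(string text) =>
        Boundary.Split(text)
                .Where(s => s.Length > 0)
                .ToArray();
}
```

For the input "Dr. Smith arrived. He was late." this produces two sentences, "Dr. Smith arrived." and "He was late.", because the space after "Dr." is excluded by the negative lookbehind.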

📦 Dependencies

| Package | Version | Purpose |
| --- | --- | --- |
| MathNet.Numerics | v5.0.0 | Vector operations and similarity calculations |
| Microsoft.ML.Tokenizers | v0.22.0 | Real tokenization for accurate token counting |
| Microsoft.Extensions.AI.Abstractions | v9.10.0 | AI provider abstractions for MEAI integration |
| Microsoft.Bcl.TimeProvider | v9.0.10 | Time abstraction for testable time-dependent operations |
| .NET | 9.0 | Target framework for optimal performance |

📖 Documentation

🀝 Contributing

We welcome contributions! Here's how to get involved:

Types of Contributions

  • 🐛 Bug Reports - Submit detailed bug reports with reproduction steps
  • ✨ Feature Requests - Propose new features with use cases and examples
  • 📝 Documentation - Improve documentation, examples, and guides
  • 🔧 Code Contributions - Implement features, fix bugs, optimize performance

Contribution Process

  1. Check existing issues to avoid duplicates
  2. Create an issue to discuss major changes
  3. Fork and create a branch for your contribution
  4. Write tests for new functionality
  5. Ensure all tests pass and maintain code coverage
  6. Update documentation as needed
  7. Submit a pull request with clear description

Development Guidelines

  • Follow existing code style and patterns
  • Write comprehensive tests for new features
  • Update benchmarks for performance-critical changes
  • Document public APIs with XML comments
  • Keep commits focused and well-described

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🌟 Support

🙏 Acknowledgments

  • Carbonell, J. and Goldstein, J. (1998) - Original MMR algorithm
  • Microsoft - ML.NET tokenizers for accurate token counting
  • MathNet.Numerics - Excellent numerical computing library
  • Community contributors - Thank you for your feedback and contributions

Built with ❤️ for the AI community by AiGeekSquad
