Skip to content

Features/refine vector store #1123

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 21 commits into from
Aug 12, 2025

Conversation

iceljc
Copy link
Collaborator

@iceljc iceljc commented Aug 8, 2025

PR Type

Enhancement


Description

  • Refine vector storage operations with enhanced filtering and indexing

  • Add time range filtering to logs and conversation queries

  • Improve repository methods with async operations and better performance

  • Add payload index management for vector collections


Diagram Walkthrough

flowchart LR
  A["Vector Storage"] --> B["Enhanced Filtering"]
  A --> C["Payload Indexing"]
  A --> D["Query Options"]
  E["Repository Layer"] --> F["Async Operations"]
  E --> G["Time Range Filters"]
  B --> H["Filter Groups"]
  B --> I["Order By Support"]
  C --> J["Create/Delete Indexes"]
Loading

File Walkthrough

Relevant files
Enhancement
47 files
IKnowledgeService.cs
Add vector collection data query and index management methods
+7/-0     
SuccessFailResponse.cs
Add generic success/fail response model                                   
+7/-0     
ConversationStateKeysFilter.cs
Add time range filtering properties                                           
+2/-0     
InstructLogFilter.cs
Add time range filtering properties                                           
+2/-0     
InstructLogKeysFilter.cs
Add time range filtering properties                                           
+2/-0     
IBotSharpRepository.cs
Convert repository methods to async ValueTask                       
+6/-6     
IUserService.cs
Add batch user retrieval method                                                   
+1/-0     
Pagination.cs
Change Count property from int to long                                     
+1/-1     
IVectorDb.cs
Refactor vector database interface with options pattern   
+7/-4     
VectorCollectionIndexOptions.cs
Add vector collection index management models                       
+18/-0   
VectorFilter.cs
Replace search pairs with filter groups and order by         
+14/-6   
VectorFilterGroup.cs
Add vector filter group model for complex filtering           
+10/-0   
VectorQueryOptions.cs
Add vector query options model                                                     
+12/-0   
VectorSearchOptions.cs
Enhance search options with filter groups and defaults     
+14/-1   
VectorSort.cs
Add vector sorting model                                                                 
+10/-0   
CrontabService.cs
Update to use async repository method                                       
+1/-1     
ConversationService.cs
Update to use async repository method                                       
+1/-1     
LoggerService.Instruction.cs
Optimize instruction log retrieval with batch operations 
+4/-8     
FileRepository.AgentTask.cs
Convert to async ValueTask method                                               
+1/-1     
FileRepository.Conversation.cs
Add async operations and time range filtering                       
+9/-3     
FileRepository.Crontab.cs
Convert to async ValueTask method                                               
+2/-2     
FileRepository.KnowledgeBase.cs
Convert to async ValueTask method                                               
+1/-1     
FileRepository.Log.cs
Add async operations and time range filtering                       
+12/-2   
FileRepository.User.cs
Convert to async ValueTask method                                               
+1/-1     
AgentTaskService.cs
Update to use async repository method                                       
+1/-1     
UserService.cs
Add batch user retrieval and async operations                       
+9/-1     
ConversationController.cs
Optimize user data retrieval in conversations                       
+2/-2     
KnowledgeBaseController.cs
Add vector collection data query and index endpoints         
+32/-2   
QueryVectorDataRequest.cs
Add vector data query request model                                           
+8/-0     
SearchVectorKnowledgeRequest.cs
Add filter groups support to search request                           
+4/-0     
VectorCollectionIndexRequest.cs
Add vector collection index request models                             
+13/-0   
VectorKnowledgeCreateRequest.cs
Remove default data source value                                                 
+1/-1     
VectorKnowledgeViewModel.cs
Replace vector array with dimension count                               
+13/-3   
MemoryVectorDb.cs
Update to use VectorSearchOptions pattern                               
+6/-6     
KnowledgeService.Document.cs
Update error logging and async operations                               
+5/-5     
KnowledgeService.Graph.cs
Update error logging level                                                             
+1/-1     
KnowledgeService.Index.cs
Add vector collection payload index management                     
+74/-0   
KnowledgeService.Vector.cs
Enhance vector operations with better error handling         
+54/-18 
MongoRepository.AgentTask.cs
Add async operations with parallel execution                         
+16/-6   
MongoRepository.Conversation.cs
Add async operations and time range filtering                       
+22/-6   
MongoRepository.Crontab.cs
Add async operations with parallel execution                         
+14/-5   
MongoRepository.KnowledgeBase.cs
Add async operations with parallel execution                         
+14/-4   
MongoRepository.Log.cs
Add async operations and time range filtering                       
+30/-5   
MongoRepository.User.cs
Add async operations with parallel execution                         
+14/-5   
ChatCompletionProvider.cs
Add temperature defaults for new OpenAI models                     
+6/-1     
QdrantDb.cs
Major refactor with enhanced filtering and index support 
+199/-51
SemanticKernelMemoryStoreProvider.cs
Update to use VectorSearchOptions pattern                               
+5/-6     
Bug fix
1 files
BotSharpConversationSideCar.cs
Fix typo in log message                                                                   
+1/-1     
Miscellaneous
1 files
BotSharpDbContext.cs
Remove unused method implementations                                         
+0/-196 
Formatting
2 files
AgentController.cs
Minor code formatting cleanup                                                       
+1/-2     
AgentLlmConfigMongoElement.cs
Remove trailing comma formatting                                                 
+2/-2     
Dependencies
1 files
Directory.Packages.props
Update OpenAI and Qdrant.Client package versions                 
+2/-2     

@iceljc iceljc marked this pull request as draft August 8, 2025 20:28
Copy link

qodo-merge-pro bot commented Aug 8, 2025

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Issue

In GetPagedCollectionData, StartId is passed directly as a UUID string without validating Qdrant’s expected format; if StartId is not a valid UUID, request may fail. Consider guarding with Guid.TryParse before constructing PointId.

collectionName,
limit: (uint)filter.Size,
offset: !string.IsNullOrWhiteSpace(filter.StartId) ? new PointId { Uuid = filter.StartId } : null,
filter: queryFilter,
payloadSelector: payloadSelector,
vectorsSelector: filter.WithVector);
Not Implemented

GetCollectionData now has a new signature but still throws NotImplementedException; this will break new API endpoints relying on point retrieval from the in-memory provider.

public Task<IEnumerable<VectorCollectionData>> GetCollectionData(string collectionName, IEnumerable<Guid> ids, VectorQueryOptions? options = null)
{
    throw new NotImplementedException();
}
Data Conversion

Payload mapping adds DoubleValue support, but integer parsing in BuildQueryFilter uses long only; float/double filters won’t match unless parsed. Consider handling double parsing for numeric filters consistently.

private PayloadSchemaType ConvertPayloadSchemaType(string schemaType)
{
    PayloadSchemaType res;
    switch (schemaType.ToLower())
    {
        case "text":
            res = PayloadSchemaType.Text;
            break;
        case "keyword":
            res = PayloadSchemaType.Keyword;
            break;
        case "integer":
            res = PayloadSchemaType.Integer;
            break;
        case "float":
            res = PayloadSchemaType.Float;
            break;
        case "bool":
            res = PayloadSchemaType.Bool;
            break;
        case "geo":
            res = PayloadSchemaType.Geo;
            break;
        case "datetime":
            res = PayloadSchemaType.Datetime;
            break;
        case "uuid":
            res = PayloadSchemaType.Uuid;
            break;
        default:
            res = PayloadSchemaType.UnknownType;
            break;
    }

    return res;
}

Copy link

qodo-merge-pro bot commented Aug 8, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Validate and sanitize filters

The new filter-group feature passes raw strings straight into Qdrant Match
without centralized validation or sanitization; numeric/bool coercion is
attempted but lacks schema awareness and safe typing across providers. Add a
validation layer (schema-driven or per-collection payload index metadata) to
enforce allowed fields, types, and operators before building filters to prevent
mismatches, silent failures, or inefficient scans when indexes/types don't
align.

Examples:

src/Plugins/BotSharp.Plugin.Qdrant/QdrantDb.cs [539-601]
private Filter? BuildQueryFilter(IEnumerable<VectorFilterGroup>? filterGroups)
{
    Filter? queryFilter = null;

    if (filterGroups.IsNullOrEmpty())
    {
        return queryFilter;
    }

    var conditions = filterGroups.Where(x => !x.Filters.IsNullOrEmpty()).Select(x =>

 ... (clipped 22 lines)

Solution Walkthrough:

Before:

// In QdrantDb.BuildQueryFilter
foreach (var filter in filters) {
    var fieldCondition = new FieldCondition { Key = filter.Key };
    // Blindly attempt type coercion, defaulting to text search
    if (bool.TryParse(filter.Value, out var b)) {
        fieldCondition.Match = new Match { Boolean = b };
    } else if (long.TryParse(filter.Value, out var l)) {
        fieldCondition.Match = new Match { Integer = l };
    } else {
        fieldCondition.Match = new Match { Text = filter.Value };
    }
    // ... add condition
}

After:

// In QdrantDb.BuildQueryFilter, now with schema access
var schema = await GetPayloadIndexSchema(collectionName);

foreach (var filter in filters) {
    var fieldSchemaType = schema.Get(filter.Key); // e.g., "integer", "bool"
    var fieldCondition = new FieldCondition { Key = filter.Key };

    // Build match condition based on known schema type
    switch (fieldSchemaType) {
        case "bool":
            fieldCondition.Match = new Match { Boolean = bool.Parse(filter.Value) };
            break;
        case "integer":
            fieldCondition.Match = new Match { Integer = long.Parse(filter.Value) };
            break;
        // ... other types
        default:
            fieldCondition.Match = new Match { Text = filter.Value };
            break;
    }
    // ... add condition
}
Suggestion importance[1-10]: 9

__

Why: This suggestion correctly identifies a critical design flaw in the new filtering feature that could lead to incorrect query results, making it a high-impact issue.

High
General
Preserve payload value types

Preserve original numeric and boolean types instead of string-based parsing to
avoid locale issues and loss of precision. Use type checks and handle
arrays/objects appropriately so Qdrant payloads reflect true data types.

src/Plugins/BotSharp.Plugin.Qdrant/QdrantDb.cs [237-258]

 foreach (var item in payload)
 {
-    var value = item.Value?.ToString();
-    if (value == null || item.Key.IsEqualTo(KnowledgePayloadName.Text))
+    if (item.Key.IsEqualTo(KnowledgePayloadName.Text)) continue;
+    var v = item.Value;
+    switch (v)
     {
-        continue;
-    }
-
-    if (bool.TryParse(value, out var b))
-    {
-        point.Payload[item.Key] = b;
-    }
-    else if (long.TryParse(value, out var l))
-    {
-        point.Payload[item.Key] = l;
-    }
-    else if (double.TryParse(value, out var d))
-    {
-        point.Payload[item.Key] = d;
-    }
-    else
-    {
-        point.Payload[item.Key] = value;
+        case null:
+            continue;
+        case bool b:
+            point.Payload[item.Key] = b;
+            break;
+        case sbyte or byte or short or ushort or int or uint or long:
+            point.Payload[item.Key] = Convert.ToInt64(v);
+            break;
+        case float or double or decimal:
+            point.Payload[item.Key] = Convert.ToDouble(v, System.Globalization.CultureInfo.InvariantCulture);
+            break;
+        case IEnumerable enumerable when v is not string:
+            point.Payload[item.Key] = enumerable.Cast<object>().Select(x => x?.ToString()).ToArray();
+            break;
+        default:
+            point.Payload[item.Key] = v.ToString();
+            break;
     }
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies that converting all payload values to strings can cause loss of precision and type information, and it proposes a robust solution using type-switching to preserve original data types.

Medium
  • Update

@iceljc iceljc marked this pull request as ready for review August 9, 2025 02:55
Copy link

qodo-merge-pro bot commented Aug 9, 2025

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Issue

In GetPagedCollectionData, CountAsync returns CountResponse; ensure you're using the actual count value (e.g., .Result or .Count) and that the return type for Count aligns with the expected int. Also verify offset construction with string UUID matches Qdrant expectations.

var totalCountTask = client.CountAsync(collectionName, filter: queryFilter);
var dataResponseTask = client.ScrollAsync(
    collectionName,
    limit: (uint)filter.Size,
    offset: !string.IsNullOrWhiteSpace(filter.StartId) ? new PointId { Uuid = filter.StartId } : null,
    filter: queryFilter,
    payloadSelector: payloadSelector,
    vectorsSelector: filter.WithVector);

await Task.WhenAll([totalCountTask, dataResponseTask]);

var totalPointCount = totalCountTask.Result;
var response = dataResponseTask.Result;

var points = response?.Result?.Select(x => new VectorCollectionData
{
    Id = x.Id?.Uuid ?? string.Empty,
    Data = x.Payload.ToDictionary(p => p.Key, p => p.Value.KindCase switch
    {
        Value.KindOneofCase.StringValue => p.Value.StringValue,
        Value.KindOneofCase.BoolValue => p.Value.BoolValue,
        Value.KindOneofCase.IntegerValue => p.Value.IntegerValue,
        Value.KindOneofCase.DoubleValue => p.Value.DoubleValue,
        _ => new object()
    }),
    Vector = filter.WithVector ? x.Vectors?.Vector?.Data?.ToArray() : null
})?.ToList() ?? new List<VectorCollectionData>();

return new StringIdPagedItems<VectorCollectionData>
{
    Count = totalPointCount,
    NextId = response?.NextPageOffset?.Uuid,
    Items = points
Type Handling

Payload conversion adds support for DoubleValue, but integer parsing uses long in filters; confirm consistency with Qdrant schema and potential overflow when casting to int elsewhere. Validate that skipping KnowledgePayloadName.Text during upsert is intentional and won’t remove needed text in payload queries.

    var points = response?.Result?.Select(x => new VectorCollectionData
    {
        Id = x.Id?.Uuid ?? string.Empty,
        Data = x.Payload.ToDictionary(p => p.Key, p => p.Value.KindCase switch
        {
            Value.KindOneofCase.StringValue => p.Value.StringValue,
            Value.KindOneofCase.BoolValue => p.Value.BoolValue,
            Value.KindOneofCase.IntegerValue => p.Value.IntegerValue,
            Value.KindOneofCase.DoubleValue => p.Value.DoubleValue,
            _ => new object()
        }),
        Vector = filter.WithVector ? x.Vectors?.Vector?.Data?.ToArray() : null
    })?.ToList() ?? new List<VectorCollectionData>();

    return new StringIdPagedItems<VectorCollectionData>
    {
        Count = totalPointCount,
        NextId = response?.NextPageOffset?.Uuid,
        Items = points
    };
}


public async Task<IEnumerable<VectorCollectionData>> GetCollectionData(string collectionName, IEnumerable<Guid> ids, VectorQueryOptions? options = null)
{
    if (ids.IsNullOrEmpty())
    {
        return Enumerable.Empty<VectorCollectionData>();
    }

    var exist = await DoesCollectionExist(collectionName);
    if (!exist)
    {
        return Enumerable.Empty<VectorCollectionData>();
    }

    var client = GetClient();
    var pointIds = ids.Select(x => new PointId { Uuid = x.ToString() }).Distinct().ToList();
    var points = await client.RetrieveAsync(collectionName, pointIds, options?.WithPayload ?? false, options?.WithVector ?? false);
    return points.Select(x => new VectorCollectionData
    {
        Id = x.Id?.Uuid ?? string.Empty,
        Data = x.Payload?.ToDictionary(p => p.Key, p => p.Value.KindCase switch 
        { 
            Value.KindOneofCase.StringValue => p.Value.StringValue,
            Value.KindOneofCase.BoolValue => p.Value.BoolValue,
            Value.KindOneofCase.IntegerValue => p.Value.IntegerValue,
            Value.KindOneofCase.DoubleValue => p.Value.DoubleValue,
            _ => new object()
        }) ?? new(),
        Vector = x.Vectors?.Vector?.Data?.ToArray()
    });
Config Semantics

ReasoningEffortLevel is applied only when _model is in _defaultTemperature map; this silently disables the feature for other models. Confirm intended gating and whether temperature override for these models should be decoupled from reasoning effort support.

private ChatCompletionOptions InitChatCompletionOption(Agent agent)
{
    var state = _services.GetRequiredService<IConversationStateService>();

    var temperature = float.Parse(state.GetState("temperature", "0.0"));
    if (_defaultTemperature.ContainsKey(_model))
    {
        temperature = _defaultTemperature[_model];
    }

    var maxTokens = int.TryParse(state.GetState("max_tokens"), out var tokens)
                    ? tokens
                    : agent.LlmConfig?.MaxOutputTokens ?? LlmConstant.DEFAULT_MAX_OUTPUT_TOKEN;

    var level = state.GetState("reasoning_effort_level")
                     .IfNullOrEmptyAs(agent?.LlmConfig?.ReasoningEffortLevel ?? string.Empty)
                     .IfNullOrEmptyAs(LlmConstant.DEFAULT_REASONING_EFFORT_LEVEL);
    var reasoningEffortLevel = ParseReasoningEffortLevel(level);

    return new ChatCompletionOptions()
    {
        Temperature = temperature,
        MaxOutputTokenCount = maxTokens,
        ReasoningEffortLevel = reasoningEffortLevel
    };
}

private ChatReasoningEffortLevel? ParseReasoningEffortLevel(string? level)
{
    if (string.IsNullOrWhiteSpace(level) || !_defaultTemperature.ContainsKey(_model))
    {
        return null;
    }

    var effortLevel = ChatReasoningEffortLevel.Low;
    switch (level.ToLower())
    {
        case "medium":
            effortLevel = ChatReasoningEffortLevel.Medium;
            break;
        case "high":
            effortLevel = ChatReasoningEffortLevel.High;
            break;
        default:
            break;
    }

    return effortLevel;
}

public void SetModelName(string model)
{
    _model = model;

Copy link

qodo-merge-pro bot commented Aug 9, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Risky dependency downgrades

The PR downgrades OpenAI to a beta and changes Qdrant.Client versions while
introducing new API surfaces (reasoning effort, filter groups, payload indexes).
Verify compatibility: o3/gpt-5 reasoning features require specific SDKs, and
Qdrant 1.15 API changes must match your BuildQueryFilter/index calls. If these
SDKs don't support the added options, vector search and chat may fail at
runtime; pin stable versions or add capability checks/fallbacks.

Examples:

Directory.Packages.props [49-50]
<PackageVersion Include="OpenAI" Version="2.2.0-beta.4" />
src/Plugins/BotSharp.Plugin.OpenAI/Providers/Chat/ChatCompletionProvider.cs [507-517]
var level = state.GetState("reasoning_effort_level")
                 .IfNullOrEmptyAs(agent?.LlmConfig?.ReasoningEffortLevel ?? string.Empty)
                 .IfNullOrEmptyAs(LlmConstant.DEFAULT_REASONING_EFFORT_LEVEL);
var reasoningEffortLevel = ParseReasoningEffortLevel(level);

return new ChatCompletionOptions()
{
    Temperature = temperature,
    MaxOutputTokenCount = maxTokens,
    ReasoningEffortLevel = reasoningEffortLevel

 ... (clipped 1 lines)

Solution Walkthrough:

Before:

// In Directory.Packages.props
<PackageVersion Include="OpenAI" Version="2.2.0-beta.4" />
<PackageVersion Include="Qdrant.Client" Version="1.15.0" />

// In ChatCompletionProvider.cs
var level = ...;
var reasoningEffortLevel = ParseReasoningEffortLevel(level);
return new ChatCompletionOptions()
{
    ...
    ReasoningEffortLevel = reasoningEffortLevel
};

After:

// In Directory.Packages.props
<PackageVersion Include="OpenAI" Version="2.2.0" /> // Pinned to stable
<PackageVersion Include="Qdrant.Client" Version="1.15.0" /> // Keep upgrade after verification

// In ChatCompletionProvider.cs
var level = ...;
// Add capability check before using beta features
if (IsReasoningEffortSupported()) {
    var reasoningEffortLevel = ParseReasoningEffortLevel(level);
    options.ReasoningEffortLevel = reasoningEffortLevel;
}
return options;
Suggestion importance[1-10]: 9

__

Why: This suggestion correctly identifies a risky downgrade of the OpenAI package to a beta version and an upgrade of Qdrant.Client, and accurately links these to new features (ReasoningEffortLevel, FilterGroups) that could cause runtime failures if compatibility isn't ensured.

High
Possible issue
Correct reasoning capability gating
Suggestion Impact:The commit removed the reasoning effort level handling entirely (deleted retrieval, parsing, and the ParseReasoningEffortLevel method), thereby eliminating the flawed gating tied to the temperature map. While it did not introduce a separate capability set as suggested, it addressed the core issue by removing the coupled logic.

code diff:

-        var level = state.GetState("reasoning_effort_level")
-                         .IfNullOrEmptyAs(agent?.LlmConfig?.ReasoningEffortLevel ?? string.Empty)
-                         .IfNullOrEmptyAs(LlmConstant.DEFAULT_REASONING_EFFORT_LEVEL);
-        var reasoningEffortLevel = ParseReasoningEffortLevel(level);
-
         return new ChatCompletionOptions()
         {
             Temperature = temperature,
-            MaxOutputTokenCount = maxTokens,
-            ReasoningEffortLevel = reasoningEffortLevel
+            MaxOutputTokenCount = maxTokens
         };
-    }
-
-    private ChatReasoningEffortLevel? ParseReasoningEffortLevel(string? level)
-    {
-        if (string.IsNullOrWhiteSpace(level) || !_defaultTemperature.ContainsKey(_model))
-        {
-            return null;
-        }
-
-        var effortLevel = ChatReasoningEffortLevel.Low;
-        switch (level.ToLower())
-        {
-            case "medium":
-                effortLevel = ChatReasoningEffortLevel.Medium;
-                break;
-            case "high":
-                effortLevel = ChatReasoningEffortLevel.High;
-                break;
-            default:
-                break;
-        }
-
-        return effortLevel;
     }

Decouple reasoning-effort applicability from the temperature map. Using
_defaultTemperature.ContainsKey(_model) to gate reasoning will disable it for
supported models not listed. Introduce a separate check (e.g., a set of
reasoning-capable models) or remove the gate to honor provided levels when the
SDK accepts it.

src/Plugins/BotSharp.Plugin.OpenAI/Providers/Chat/ChatCompletionProvider.cs [18-541]

-private readonly Dictionary<string, float> _defaultTemperature = new()
+private readonly HashSet<string> _reasoningCapableModels = new(StringComparer.OrdinalIgnoreCase)
 {
-    { "o3", 1.0f },
-    { "o3-mini", 1.0f },
-    { "o4-mini", 1.0f },
-    { "gpt-5", 1.0f },
-    { "gpt-5-mini", 1.0f },
-    { "gpt-5-nano", 1.0f }
+    "o3", "o3-mini", "o4-mini", "gpt-5", "gpt-5-mini", "gpt-5-nano"
 };
 ...
 private ChatReasoningEffortLevel? ParseReasoningEffortLevel(string? level)
 {
-    if (string.IsNullOrWhiteSpace(level) || !_defaultTemperature.ContainsKey(_model))
+    if (string.IsNullOrWhiteSpace(level))
     {
         return null;
     }
-    var effortLevel = ChatReasoningEffortLevel.Low;
-    switch (level.ToLower())
+    if (!_reasoningCapableModels.Contains(_model))
     {
-        case "medium":
-            effortLevel = ChatReasoningEffortLevel.Medium;
-            break;
-        case "high":
-            effortLevel = ChatReasoningEffortLevel.High;
-            break;
-        default:
-            break;
+        return null;
     }
-    return effortLevel;
+    return level.ToLower() switch
+    {
+        "medium" => ChatReasoningEffortLevel.Medium,
+        "high" => ChatReasoningEffortLevel.High,
+        _ => ChatReasoningEffortLevel.Low
+    };
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 6

__

Why: This suggestion correctly identifies a design flaw where reasoning capability is coupled with temperature settings, and proposes a good refactoring that decouples them, improving code maintainability and robustness.

Low
General
Safely parse GUID list

Avoid using an implicitly scoped guid from TryParse in an anonymous projection;
some compilers may treat the out variable as shared across iterations, causing
incorrect IDs. Parse in a preliminary loop and add validated GUIDs to a list to
ensure correctness.

src/Plugins/BotSharp.Plugin.KnowledgeBase/Services/KnowledgeService.Vector.cs [322-325]

-var pointIds = ids.Select(x => new { Id = x, IsValid = Guid.TryParse(x, out var guid), ParseResult = guid })
-                  .Where(x => x.IsValid)
-                  .Select(x => x.ParseResult)
-                  .ToList();
+var pointIds = new List<Guid>();
+foreach (var id in ids)
+{
+    if (Guid.TryParse(id, out var gid))
+    {
+        pointIds.Add(gid);
+    }
+}
  • Apply / Chat
Suggestion importance[1-10]: 3

__

Why: The existing LINQ code is correct on modern C# compilers. While the suggestion's concern about out variable capturing was valid for older language versions, it's not an issue here. The proposed change is a more verbose but equally correct alternative.

Low
  • More

@@ -3,4 +3,5 @@ namespace BotSharp.Abstraction.Agents.Constants;
public static class LlmConstant
{
public const int DEFAULT_MAX_OUTPUT_TOKEN = 1024;
public const string DEFAULT_REASONING_EFFORT_LEVEL = "low";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change to "minimal"?

@@ -46,7 +46,7 @@
<PackageVersion Include="Whisper.net.Runtime" Version="1.8.1" />
<PackageVersion Include="NCrontab" Version="3.3.3" />
<PackageVersion Include="Azure.AI.OpenAI" Version="2.2.0-beta.5" />
<PackageVersion Include="OpenAI" Version="2.2.0" />
<PackageVersion Include="OpenAI" Version="2.2.0-beta.4" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Downgrade to beta?

@iceljc iceljc marked this pull request as draft August 12, 2025 02:21
@iceljc iceljc marked this pull request as ready for review August 12, 2025 14:26
Copy link

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Issue

The updated code assigns user from a pre-fetched list but the user variable isn't declared in the shown scope; ensure a local variable exists and that users retrieval matches expected shape (list vs paged) to avoid null reference or compile errors.

var agents = await agentService.GetAgentOptions(agentIds);

var userIds = list.Select(x => x.User.Id).ToList();
var users = await userService.GetUsers(userIds);

foreach (var item in list)
{
    user = users.FirstOrDefault(x => x.Id == item.User.Id);
    item.User = UserViewModel.FromUser(user);
    var agent = agents.FirstOrDefault(x => x.Id == item.AgentId);
Null Handling

Multiple catch blocks now return empty arrays [] or null; ensure these sentinel returns are acceptable by callers and do not break existing expectations (e.g., GetVectorCollections previously returned Enumerable.Empty<>).

    }
    catch (Exception ex)
    {
        _logger.LogError(ex, $"Error when getting vector db collections.");
        return [];
    }
}

public async Task<VectorCollectionDetails?> GetVectorCollectionDetails(string collectionName)
{
    try
    {
        if (string.IsNullOrWhiteSpace(collectionName)) return null;

        var db = _services.GetRequiredService<IBotSharpRepository>();
        var configs = db.GetKnowledgeCollectionConfigs(new VectorCollectionConfigFilter
        {
            CollectionNames = [collectionName]
        }).ToList();

        var vectorDb = GetVectorDb();
        var details = await vectorDb.GetCollectionDetails(collectionName);
        if (details != null)
        {
            details.BasicInfo = configs.FirstOrDefault();
        }
        return details;
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, $"Error when getting vector db collection details.");
        return null;
Query Filter Semantics

BuildQueryFilter coerces values using bool/long parsing; check that this matches Qdrant payload schema types and that mixing Must of grouped Should/Must conditions yields intended results across AND/OR groups.

private Filter? BuildQueryFilter(IEnumerable<VectorFilterGroup>? filterGroups)
{
    Filter? queryFilter = null;

    if (filterGroups.IsNullOrEmpty())
    {
        return queryFilter;
    }

    var conditions = filterGroups.Where(x => !x.Filters.IsNullOrEmpty()).Select(x =>
    {
        Filter filter;
        var innerConditions = x.Filters.Select(f =>
        {
            var field = new FieldCondition
            {
                Key = f.Key,
                Match = new Match { Text = f.Value }
            };

            if (bool.TryParse(f.Value, out var boolVal))
            {
                field.Match = new Match { Boolean = boolVal };
            }
            else if (long.TryParse(f.Value, out var intVal))
            {
                field.Match = new Match { Integer = intVal };
            }

            return new Condition { Field = field };
        });

        if (x.FilterOperator.IsEqualTo("and"))
        {
            filter = new Filter
            {
                Must = { innerConditions }
            };
        }
        else
        {
            filter = new Filter
            {
                Should = { innerConditions }
            };
        }

        return new Condition
        {
            Filter = filter
        };
    });

    queryFilter = new Filter
    {
        Must =
        {
            conditions
        }
    };

    return queryFilter;
}

Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Async API consistency and contracts

The PR converts many repository methods to async/ValueTask but leaves several
service/controller call sites and interface default implementations inconsistent
(e.g., IBotSharpRepository still throws NotImplementedException synchronously
for many methods, while services now await them). Ensure all repository
implementations (File, Mongo, DbContext) and consumers align on async signatures
and semantics, otherwise deadlocks or runtime exceptions will surface under
load. Additionally, the vector DB API was reshaped (search/filter/index/query
options); all providers (Memory, SemanticKernel, Qdrant) should fully implement
the new options consistently to avoid silent feature gaps in filtering,
ordering, and payload selection.

Examples:

src/Infrastructure/BotSharp.Core/Repository/BotSharpDbContext.cs [3-6]
public class BotSharpDbContext : Database, IBotSharpRepository
{
    public IServiceProvider ServiceProvider => throw new NotImplementedException();
}
src/Plugins/BotSharp.Plugin.KnowledgeBase/MemVecDb/MemoryVectorDb.cs [47-55]
public async Task<IEnumerable<VectorCollectionData>> Search(string collectionName, float[] vector, VectorSearchOptions? options = null)
{
    if (!_vectors.ContainsKey(collectionName))
    {
        return new List<VectorCollectionData>();
    }

    options ??= VectorSearchOptions.Default();
    var similarities = VectorHelper.CalCosineSimilarity(vector, _vectors[collectionName]);

Solution Walkthrough:

Before:

// IVectorDb interface defines new search options
public interface IVectorDb {
    Task<IEnumerable<Data>> Search(string collection, float[] vector, VectorSearchOptions? options);
}

// But some implementations ignore the new filtering capabilities
public class MemoryVectorDb : IVectorDb {
    public async Task<IEnumerable<Data>> Search(string collection, float[] vector, VectorSearchOptions? options) {
        // options.FilterGroups is ignored, leading to incorrect results.
        var results = ... // Logic only considers limit and vector.
        return results;
    }
}

After:

// IVectorDb interface defines new search options
public interface IVectorDb {
    Task<IEnumerable<Data>> Search(string collection, float[] vector, VectorSearchOptions? options);
}

// All implementations should fully support the new options
public class MemoryVectorDb : IVectorDb {
    public async Task<IEnumerable<Data>> Search(string collection, float[] vector, VectorSearchOptions? options) {
        var results = ...;
        if (options?.FilterGroups != null) {
            // Apply filtering logic based on FilterGroups
            results = ApplyFilters(results, options.FilterGroups);
        }
        return results;
    }
}
Suggestion importance[1-10]: 9

__

Why: This suggestion correctly identifies two critical, cross-cutting architectural inconsistencies that could lead to runtime exceptions and silent bugs, affecting both the data repository layer and the vector database abstraction.

High
Possible issue
Fix payload numeric parsing and culture issues

The current numeric parsing uses float.TryParse but assigns to DoubleValue,
which may cause precision loss and parsing failures in locales. Parse with
double.TryParse using invariant culture. Also handle arrays and nested objects
by serializing them to string to avoid runtime errors.

src/Plugins/BotSharp.Plugin.Qdrant/QdrantDb.cs [239-260]

 foreach (var item in payload)
 {
-    var value = item.Value?.ToString();
-    if (value == null || item.Key.IsEqualTo(KnowledgePayloadName.Text))
+    if (item.Key.IsEqualTo(KnowledgePayloadName.Text) || item.Value is null)
+        continue;
+
+    Value payloadValue;
+    switch (item.Value)
     {
-        continue;
+        case bool b:
+            payloadValue = new Value { BoolValue = b };
+            break;
+        case int i:
+            payloadValue = new Value { IntegerValue = i };
+            break;
+        case long l:
+            payloadValue = new Value { IntegerValue = l };
+            break;
+        case float f32:
+            payloadValue = new Value { DoubleValue = Convert.ToDouble(f32, CultureInfo.InvariantCulture) };
+            break;
+        case double f64:
+            payloadValue = new Value { DoubleValue = f64 };
+            break;
+        case string s:
+            if (bool.TryParse(s, out var b2))
+                payloadValue = new Value { BoolValue = b2 };
+            else if (long.TryParse(s, NumberStyles.Integer, CultureInfo.InvariantCulture, out var l2))
+                payloadValue = new Value { IntegerValue = l2 };
+            else if (double.TryParse(s, NumberStyles.Float | NumberStyles.AllowThousands, CultureInfo.InvariantCulture, out var d2))
+                payloadValue = new Value { DoubleValue = d2 };
+            else
+                payloadValue = new Value { StringValue = s };
+            break;
+        default:
+            // Fallback: serialize complex types to string
+            payloadValue = new Value { StringValue = JsonSerializer.Serialize(item.Value) };
+            break;
     }
 
-    if (bool.TryParse(value, out var b))
-    {
-        payloadValue = new Value { BoolValue = b };
-    }
-    else if (long.TryParse(value, out var l))
-    {
-        payloadValue = new Value { IntegerValue = l };
-    }
-    else if (float.TryParse(value, out var f))
-    {
-        payloadValue = new Value { DoubleValue = f };
-    }
-    else
-    {
-        payloadValue = new Value { StringValue = value };
-    }
-
-    payloadPairs.Add(item.Key, payloadValue);
+    payloadPairs[item.Key] = payloadValue;
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies and fixes multiple issues in payload processing, such as inefficient type conversion, potential data loss, and culture-specific parsing errors, by implementing a robust type-switching mechanism and using CultureInfo.InvariantCulture.

Medium
Use culture-invariant lowercasing

schemaType.ToLower() is culture-sensitive and can break in locales like Turkish.
Use ToLowerInvariant() to avoid culture bugs when mapping schema types.

src/Plugins/BotSharp.Plugin.Qdrant/QdrantDb.cs [637-672]

 private PayloadSchemaType ConvertPayloadSchemaType(string schemaType)
 {
-    PayloadSchemaType res;
-    switch (schemaType.ToLower())
+    switch ((schemaType ?? string.Empty).ToLowerInvariant())
     {
-        case "text":
-            res = PayloadSchemaType.Text;
-            break;
-        case "keyword":
-            res = PayloadSchemaType.Keyword;
-            break;
-        case "integer":
-            res = PayloadSchemaType.Integer;
-            break;
-        case "float":
-            res = PayloadSchemaType.Float;
-            break;
-        case "bool":
-            res = PayloadSchemaType.Bool;
-            break;
-        case "geo":
-            res = PayloadSchemaType.Geo;
-            break;
-        case "datetime":
-            res = PayloadSchemaType.Datetime;
-            break;
-        case "uuid":
-            res = PayloadSchemaType.Uuid;
-            break;
-        default:
-            res = PayloadSchemaType.UnknownType;
-            break;
+        case "text": return PayloadSchemaType.Text;
+        case "keyword": return PayloadSchemaType.Keyword;
+        case "integer": return PayloadSchemaType.Integer;
+        case "float": return PayloadSchemaType.Float;
+        case "bool": return PayloadSchemaType.Bool;
+        case "geo": return PayloadSchemaType.Geo;
+        case "datetime": return PayloadSchemaType.Datetime;
+        case "uuid": return PayloadSchemaType.Uuid;
+        default: return PayloadSchemaType.UnknownType;
     }
-
-    return res;
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 7

__

Why: The suggestion correctly points out a potential internationalization bug by replacing ToLower() with ToLowerInvariant(), which is a best practice for culture-agnostic string comparisons, preventing issues in locales like Turkish.

Medium
General
Ensure text saved in payload

The code ensures data_source exists but does not ensure the primary text is
persisted in payload. Downstream code expects payload[KnowledgePayloadName.Text]
for comparisons. Add the text into payload to keep consistency with updates and
searches.

src/Plugins/BotSharp.Plugin.KnowledgeBase/Services/KnowledgeService.Vector.cs [168-176]

 var payload = create.Payload ?? new();
 if (!payload.TryGetValue(KnowledgePayloadName.DataSource, out _))
 {
-    payload[KnowledgePayloadName.DataSource] = !string.IsNullOrWhiteSpace(create.DataSource) ?
-                                                create.DataSource : VectorDataSource.Api;
+    payload[KnowledgePayloadName.DataSource] = !string.IsNullOrWhiteSpace(create.DataSource)
+        ? create.DataSource
+        : VectorDataSource.Api;
 }
+// Ensure text is present for later comparisons/reads
+payload[KnowledgePayloadName.Text] = create.Text;
 
 return await db.Upsert(collectionName, guid, vector, create.Text, payload);
  • Apply / Chat
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that the text should be part of the payload for consistency, as seen in the UpsertVectorCollectionData method. While the QdrantDb.Upsert method already handles this, adding it here improves code clarity and consistency within the KnowledgeService.

Low
  • More

@iceljc iceljc merged commit 26ff3ce into SciSharp:master Aug 12, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants