Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The data type of the Id member in the ScoredPoint class may need to be changed to int. #794

Closed
AwesomeYuer opened this issue May 3, 2023 · 6 comments
Assignees

Comments

@AwesomeYuer
Copy link

AwesomeYuer commented May 3, 2023

Describe the bug
The data type of the Id member in the ScoredPoint class may need to be changed to int.

To Reproduce

    public async Task qdrant_SK_Http_HNSW_index_Cosine_50_ProcessAsync()
    {
        QdrantMemoryStore memoryStore =
                new QdrantMemoryStore
                            (
                                GlobalManager.SelfHostQdrantHttpConnectionString
                                , 6333
                                , vectorSize: 4
                                //, ConsoleLogger.Log
                            );
        IKernel kernel = Kernel.Builder
            //.WithLogger(ConsoleLogger.Log)
            .Configure
            (
                c =>
                {
                    c.AddOpenAITextCompletionService("text-davinci-003", "123");
                    c.AddOpenAITextEmbeddingGenerationService("text-embedding-ada-002", "123");
                }
            )
            .WithMemoryStorage(memoryStore)
            .Build();

        var MemoryCollectionName = "small";
       //bug here!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        var searchResults = kernel
                                .Memory
                                .SearchAsync
                                        (
                                            MemoryCollectionName
                                            , "My favorite color is orange"
                                            , limit: 20
                                            , minRelevanceScore: 0.8
                                            , withEmbeddings: true
                                        );

        await foreach (var item in searchResults)
        {
            //Console.WriteLine(item.Metadata.Text + " : " + item.Relevance);
            _ = item.Relevance;
        }
    }

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.
image

image

Desktop (please complete the following information):

  • OS: [e.g. Windows]
  • IDE: [e.g. Visual Studio, VS Code]
  • NuGet Package Version [e.g. 0.1.0]

Additional context
Add any other context about the problem here.

@tawalke
Copy link
Contributor

tawalke commented May 3, 2023

@AwesomeYuer Qdrant definition for Points object allows for both 64-bit integers and UUID strings

image

image

You can look at the ExtendedPointId object here:
https://github.com/qdrant/qdrant/blob/master/src/common/points.rs
image

@AwesomeYuer
Copy link
Author

AwesomeYuer commented May 4, 2023

@tawalke Thanks for your feedback in advance.
I think it would be better to support both int id and string uuid when deserializing as a string Id property from JSON.

Here is some sample code for reference, I have test it passed and succeed when invoke
QdrantVectorDbClient.FindNearestInCollectionAsync :

SearchVectorsResponse.cs

// Copyright (c) Microsoft. All rights reserved.

using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;
using System.Text.Json.Serialization;

namespace Microsoft.SemanticKernel.Connectors.Memory.Qdrant.Http.ApiSchema;

#pragma warning disable CA1812 // Avoid uninstantiated internal classes: Used for Json Deserialization
internal sealed class SearchVectorsResponse : QdrantResponse
{
    internal sealed class ScoredPoint
    {
        [JsonPropertyName("id")]
        // add by AwesomeYuer@Microshaoft @ 2023-05-04
        [JsonConverter(typeof(NumberToStringConverter))]
        public string Id { get; }

        [JsonPropertyName("version")]
        public int Version { get; set; }

        [JsonPropertyName("score")]
        public double? Score { get; }

        [JsonPropertyName("payload")]
        public Dictionary<string, object> Payload { get; set; }

        [JsonPropertyName("vector")]
        public IEnumerable<float>? Vector { get; }

        [JsonConstructor]
        public ScoredPoint(string id, double? score, Dictionary<string, object> payload, IEnumerable<float> vector)
        {
            this.Id = id;
            this.Score = score;
            this.Payload = payload;
            this.Vector = vector;
        }
    }

    [JsonPropertyName("result")]
    public IEnumerable<ScoredPoint> Results { get; set; }

    [JsonConstructor]
    public SearchVectorsResponse(IEnumerable<ScoredPoint> results)
    {
        this.Results = results;
    }

    #region private ================================================================================

    private SearchVectorsResponse()
    {
        this.Results = new List<ScoredPoint>();
    }

    #endregion
}
#pragma warning restore CA1812 // Avoid uninstantiated internal classes

// add by AwesomeYuer@Microshaoft @ 2023-05-04
public class NumberToStringConverter : JsonConverter<string>
{
    public override string Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        if (typeToConvert != typeof(string))
        {
            throw new NotSupportedException($"{nameof(typeToConvert)}: {typeToConvert}");
        }
        var result = string.Empty;
        var tokenType = reader.TokenType;
        if
            (
                tokenType == JsonTokenType.String
            )
        {
            result = reader.GetString();
        }
        else if
            (
                tokenType == JsonTokenType.Number
            )
        {
            reader.TryGetInt32(out var valueInt);
            result = valueInt.ToString("D", NumberFormatInfo.InvariantInfo);
        }
        else
        {
            throw new NotSupportedException($"{nameof(JsonTokenType)}.{tokenType}");
        }
        return result!;
    }

    public override void Write(Utf8JsonWriter writer, string @value, JsonSerializerOptions options)
    {
        writer.WriteStringValue(@value);
    }
}

@AwesomeYuer
Copy link
Author

good

@craigomatic
Copy link
Contributor

Another simple option we can consider:

[JsonPropertyName("id")]
[JsonNumberHandling(JsonNumberHandling.WriteAsString)]
public string Id { get; }

https://learn.microsoft.com/en-us/dotnet/api/system.text.json.serialization.jsonnumberhandlingattribute

lemillermicrosoft pushed a commit that referenced this issue May 18, 2023
…1075)

### Motivation and Context
Qdrant supports both string and int for the id of a ScoredPoint. This
minor change allows the serialiser to handle both of these scenarios
while retaining string as the type for SK.

Thank you to @AwesomeYuer for the heads up in #794 

### Description
Added a single attribute decorating the attribute that massages ints to
string and leaves string as-is.
@craigomatic
Copy link
Contributor

Closing as the PR has been merged. Please reopen if this issue persists!

shawncal pushed a commit to johnoliver/semantic-kernel that referenced this issue May 19, 2023
…icrosoft#1075)

### Motivation and Context
Qdrant supports both string and int for the id of a ScoredPoint. This
minor change allows the serialiser to handle both of these scenarios
while retaining string as the type for SK.

Thank you to @AwesomeYuer for the heads up in microsoft#794 

### Description
Added a single attribute decorating the attribute that massages ints to
string and leaves string as-is.
@AwesomeYuer
Copy link
Author

glad to hear that

adrianwyatt pushed a commit that referenced this issue Jun 2, 2023
### Motivation and Context
This PR fixes a regression introduced in #1075 and includes unit tests
to correctly validate that both string and integer property ids are
supported.

### Description
The changes are based on the code suggested by AwesomeYuer over in #794
- the JsonNumberHandling attribute behaved in an unexpected way and was
removed.
salmon131 pushed a commit to salmon131/semantic-kernel that referenced this issue Jun 3, 2023
…oft#1313)

### Motivation and Context
This PR fixes a regression introduced in microsoft#1075 and includes unit tests
to correctly validate that both string and integer property ids are
supported.

### Description
The changes are based on the code suggested by AwesomeYuer over in microsoft#794
- the JsonNumberHandling attribute behaved in an unexpected way and was
removed.
shawncal pushed a commit to shawncal/semantic-kernel that referenced this issue Jul 6, 2023
…oft#1313)

### Motivation and Context
This PR fixes a regression introduced in microsoft#1075 and includes unit tests
to correctly validate that both string and integer property ids are
supported.

### Description
The changes are based on the code suggested by AwesomeYuer over in microsoft#794
- the JsonNumberHandling attribute behaved in an unexpected way and was
removed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants