Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when using OrderBy on field with possible NULL value: System.NotSupportedException: Cannot execute cross partition order-by queries on mix types #733

Closed
NickSevens opened this issue Aug 23, 2019 · 15 comments

Comments

@NickSevens
Copy link

@NickSevens NickSevens commented Aug 23, 2019

Describe the bug
I'm performing this code:

var otherItems = _container.GetItemLinqQueryable<CosmosItem>(true)
    .Where(i => i.Type != "Unboxed")
    .OrderBy(i => i.OptionalField3_Sortable);

which results in the CosmosDB SQL query:
SELECT VALUE root FROM root WHERE (root["Type"] != "Unboxed") ORDER BY root["OptionalField3_Sortable"] ASC

Unfortunately, this query fails with:

Unhandled Exception: System.NotSupportedException: Cannot execute cross partition order-by queries on mix types. Consider using IS_STRING/IS_NUMBER to get around this exception. Expect type: Null. Actual type: String. Item value: Microsoft.Azure.Cosmos.Query.OrderByItem.
at Microsoft.Azure.Cosmos.Query.ParallelQuery.OrderByConsumeComparer.CheckTypeMatching(IList`1 items1, IList`1 items2)
at Microsoft.Azure.Cosmos.Query.ParallelQuery.OrderByConsumeComparer.CompareOrderByItems(IList`1 items1, IList`1 items2)
at Microsoft.Azure.Cosmos.Query.CosmosOrderByItemQueryExecutionContext.OrderByEqualityComparer.Equals(CosmosElement x, CosmosElement y)
at Microsoft.Azure.Cosmos.Query.ItemProducer.d__64.MoveNext()

Additional context
My container contains items which have OptionalFIeld3_Sortable be either String values, or null.
The container's partition key is set on Type

PS: I'm very new at using CosmosDB, so it's very possible I'm doing something wrong here... :)

@NickSevens NickSevens changed the title Error when using OrderBy on field with possible NULL value: Error when using OrderBy on field with possible NULL value: System.NotSupportedException: Cannot execute cross partition order-by queries on mix types Aug 23, 2019
@NickSevens

This comment has been minimized.

Copy link
Author

@NickSevens NickSevens commented Aug 23, 2019

PS: when running the same query in Azure Portal's data explorer, the query DOES work!

@j82w

This comment has been minimized.

Copy link
Contributor

@j82w j82w commented Aug 23, 2019

@bchong95 any ideas?

@j82w j82w added this to To do in .NET v3 Board via automation Aug 23, 2019
@j82w j82w added this to To do in Cosmos DB SDK team via automation Aug 23, 2019
@NickSevens

This comment has been minimized.

Copy link
Author

@NickSevens NickSevens commented Aug 30, 2019

@j82w @bchong95 the error also seems to occur when having only a single partition (partitionkey is equal for all data created in the container)
Any help would be greatly appreciated!

@kirankumarkolli

This comment has been minimized.

Copy link
Member

@kirankumarkolli kirankumarkolli commented Aug 30, 2019

@j82w can you please follow-up with Brandon offline and update next steps?

@j82w

This comment has been minimized.

Copy link
Contributor

@j82w j82w commented Sep 3, 2019

@NickSevens the exception message is pretty clear that IS_STRING needs to be added to the query. The issue is the Linq conversion does not support IS_STRING/IS_NUMBER. We are currently looking into the best way to add support for this.

The reason the query works in the Azure Portal's data explorer is the .NET SDK is more strict on the types to avoid undefined behavior.

To fix your query use the following:

SELECT VALUE root 
FROM root 
WHERE (root["Type"] != "Unboxed" and IS_STRING(root["OptionalField3_Sortable"]))
ORDER BY root["OptionalField3_Sortable"] AS
@j82w j82w added LINQ and removed needs-investigation labels Sep 3, 2019
@NickSevens

This comment has been minimized.

Copy link
Author

@NickSevens NickSevens commented Sep 3, 2019

@j82w I misunderstood your reply initially, sorry.
The problem is that IS_STRING filters out NULL values, which returns an incomplete data set. (e.g. for server side paged data)
Would it at least be possible to override the behaviour? For example by making

settable?

@wahyuen

This comment has been minimized.

Copy link

@wahyuen wahyuen commented Sep 5, 2019

I believe we are also hitting this same issue, specifically for queries where we have nullable values (in our case, DateTime? property). We are also seeing inconsistencies depending on if you query with an OrderBy vs an OrderByDescending call, however, I suspect this might just be an issue in the order in which the comparisons are run.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Cosmos.Fluent;
using Microsoft.Azure.Cosmos.Linq;
using Newtonsoft.Json;

namespace CosmosOrderTest
{
    public class Program
    {
        public const string LocalConnectionString =
            "AccountEndpoint=https://localhost:8081/;AccountKey=C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==;";

        public static async Task Main(string[] args)
        {
            var cosmosClient = new CosmosClientBuilder(LocalConnectionString).Build();

            try
            {
                var database = await cosmosClient.CreateDatabaseIfNotExistsAsync("testdb");
                var container = await database.Database.DefineContainer("testcontainer", "/id").CreateIfNotExistsAsync();

                await container.Container.CreateItemAsync(new TestObject("1"));
                await container.Container.CreateItemAsync(new TestObject("2", DateTime.Now));
                await container.Container.CreateItemAsync(new TestObject("3"));
                await container.Container.CreateItemAsync(new TestObject("4", DateTime.Now));
                await container.Container.CreateItemAsync(new TestObject("5"));

                var options = new QueryRequestOptions {MaxItemCount = 2};

                // no order by
                var noOrderByQuery = container.Container.GetItemLinqQueryable<TestObject>(false, null, options).ToFeedIterator();

                Console.WriteLine("No Order By Query");

                var items = new List<TestObject>();
                var pagedResults = await noOrderByQuery.ReadNextAsync();

                Console.WriteLine($"{pagedResults.Count} - {pagedResults.ContinuationToken}");

                items.AddRange(pagedResults);

                var continuationToken = pagedResults.ContinuationToken;

                while (continuationToken != null)
                {
                    pagedResults = await container.Container.GetItemLinqQueryable<TestObject>(false, continuationToken, options).ToFeedIterator().ReadNextAsync();

                    Console.WriteLine($"{pagedResults.Count} - {pagedResults.ContinuationToken}");

                    items.AddRange(pagedResults);

                    continuationToken = pagedResults.ContinuationToken;
                }

                Console.WriteLine("Results:");

                foreach (var testObject in items)
                {
                    Console.WriteLine($"{testObject.Id} - {testObject.Timestamp}");
                }

                Console.WriteLine($"Total Count:{items.Count}");
                Console.WriteLine();

                // order by asc
                var orderByAsc = container.Container.GetItemLinqQueryable<TestObject>(false, null, options).OrderBy(x => x.Timestamp).ToFeedIterator();

                Console.WriteLine("Order By Ascending");

                items = new List<TestObject>();
                pagedResults = await orderByAsc.ReadNextAsync();

                Console.WriteLine($"{pagedResults.Count} - {pagedResults.ContinuationToken}");

                items.AddRange(pagedResults);

                continuationToken = pagedResults.ContinuationToken;

                while (continuationToken != null)
                {
                    pagedResults = await container.Container.GetItemLinqQueryable<TestObject>(false, continuationToken, options).OrderBy(x => x.Timestamp).ToFeedIterator().ReadNextAsync();

                    Console.WriteLine($"{pagedResults.Count} - {pagedResults.ContinuationToken}");

                    items.AddRange(pagedResults);

                    continuationToken = pagedResults.ContinuationToken;
                }

                Console.WriteLine("Results:");

                foreach (var testObject in items)
                {
                    Console.WriteLine($"{testObject.Id} - {testObject.Timestamp}");
                }

                Console.WriteLine($"Total Count:{items.Count}");
                Console.WriteLine();

                // order by desc
                var orderByDesc = container.Container.GetItemLinqQueryable<TestObject>(false, null, options).OrderByDescending(x => x.Timestamp).ToFeedIterator();

                Console.WriteLine("Order By Descending");

                items = new List<TestObject>();
                pagedResults = await orderByDesc.ReadNextAsync();

                Console.WriteLine($"{pagedResults.Count} - {pagedResults.ContinuationToken}");

                items.AddRange(pagedResults);

                continuationToken = pagedResults.ContinuationToken;

                while (continuationToken != null)
                {
                    pagedResults = await container.Container.GetItemLinqQueryable<TestObject>(false, continuationToken, options).OrderByDescending(x => x.Timestamp).ToFeedIterator().ReadNextAsync();

                    Console.WriteLine($"{pagedResults.Count} - {pagedResults.ContinuationToken}");

                    items.AddRange(pagedResults);

                    continuationToken = pagedResults.ContinuationToken;
                }

                Console.WriteLine("Results:");

                foreach (var testObject in items)
                {
                    Console.WriteLine($"{testObject.Id} - {testObject.Timestamp}");
                }

                Console.WriteLine($"Total Count:{items.Count}");
                Console.WriteLine();

            }
            finally
            {
                var database = cosmosClient.GetDatabase("testdb");
                await database.DeleteAsync();
            }

            Console.WriteLine("Done!");
        }
    }

    public class TestObject
    {
        public TestObject()
        {
        }

        public TestObject(string id, DateTime? timestamp = null)
        {
            Id = id;
            Timestamp = timestamp;
        }

        [JsonProperty("id")]
        public string Id { get; set; }
        [JsonProperty("timestamp")]
        public DateTime? Timestamp { get; set; }
    }
}

Yielding a result of:
image

Note that in the OrderBy scenario, we seem to only obtain the values where the value IS null, and receive no values where they are not.

In the case of OrderByDescending we get no values and are simply returned the above mentioned exception of:

System.NotSupportedException: 'Cannot execute cross partition order-by queries on mix types. Consider using IS_STRING/IS_NUMBER to get around this exception. Expect type: String. Actual type: Null. Item value: Microsoft.Azure.Cosmos.Query.OrderByItem.'

Given the following statement from ORDER BY clause in Cosmos DB

ASC | DESC

Specifies that the values in the specified column should be sorted in ascending or descending order. ASC sorts from the lowest value to highest value. DESC sorts from highest value to lowest value. ASC is the default sort order. Null values are treated as the lowest possible values.

have I made the wrong assumption that we should expect that an OrderBy should return null values first then non null values?

@NickSevens

This comment has been minimized.

Copy link
Author

@NickSevens NickSevens commented Sep 23, 2019

@j82w revisiting this issue, I keep wondering why I keep receiving the error mentioned above.

The issue occurs because the types null and string for example do not match, which I get technically, but doesn't make much sense to me... I believe this is a common use-case to order by a nullable field.

Also, the error mentions a cross-partition query, but even when passing the PartitionKey in the QueryRequestOptions, the same error occurs. I believe there at least should be a check on that part then?

Hope you can help me figure this one out...
Thanks in advance.

@j82w

This comment has been minimized.

Copy link
Contributor

@j82w j82w commented Sep 23, 2019

@bchong95 can you take a look?

@MrMint

This comment has been minimized.

Copy link

@MrMint MrMint commented Sep 30, 2019

With both the portals data explorer and docs leading us to believe sorting nullable strings was possible, we built out a sorting feature and have run into this same issue in the V2 client. Any plans to resolve this issue in V3?

@wahyuen

This comment has been minimized.

Copy link

@wahyuen wahyuen commented Oct 31, 2019

Looks like still an issue in the v3.3.3.

Is there any indication this might be looked at in the short/near term future? We are at a stage where we are looking at relatively large workarounds to arbitrarily populate null fields with a value just so sorting works and would like to assess if its worth us investing the time for this or not.

@bchong95

This comment has been minimized.

Copy link
Contributor

@bchong95 bchong95 commented Oct 31, 2019

The fix is in this PR: #952

@wahyuen

This comment has been minimized.

Copy link

@wahyuen wahyuen commented Nov 5, 2019

@bchong95 Just tested this on v3.4.0 which should have #952 in it....unfortunately I still think this is an issue?

Ran the same code as above...while it now does not exception out, it still seems to remove results from the result set depending on the order of items. Please see screenshot below where we expect to always return 5 results...but in test case 2 it seems to have only returned the undefined values and not the ones with the DateTime values.

image

@bchong95

This comment has been minimized.

Copy link
Contributor

@bchong95 bchong95 commented Nov 5, 2019

If you are draining through continuation tokens then you are going to face issues with any form of mixed type ORDER BY (UNDEFINED and STRING in this case). This because ORDER BY undefined is defined but range filters on undefined are not. The backend will need to support key set continuation tokens for this to happen, which is further out.

You add IS_DEFINED to your filter to work around this.

@bchong95 bchong95 closed this Nov 17, 2019
Cosmos DB SDK team automation moved this from To do to Done Nov 17, 2019
.NET v3 Board automation moved this from To do to Done Nov 17, 2019
@wahyuen

This comment has been minimized.

Copy link

@wahyuen wahyuen commented Nov 17, 2019

@bchong95 is there another ticket to track the work you mentioned regarding the range filters? this is still going to cause anyone who requires filtering on nullable columns with any sort of pagination to implement arbitrary work arounds (setting default values). Simply filtering out values without a value defined isn't really a viable solution for us (and one might assume other customers) given they have modelled the data as being nullable in the first place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
.NET v3 Board
  
Done
6 participants
You can’t perform that action at this time.