New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to use camel case in JSON when C# types use pascal case #317

Open
Liversage opened this Issue Aug 10, 2017 · 36 comments

Comments

Projects
None yet
@Liversage

Liversage commented Aug 10, 2017

I want to use a custom JsonSerializerSettings to control the format of the JSON stored in CosmosDB. In particular I want to use CamelCasePropertyNamesContractResolver to ensure that the JSON is camel cased even though the C# types use pascal case.

With the latest releases of the Microsoft.Azure.DocumentDB NuGet package it is now possible to provide a JsonSerializerSettings when creating a DocumentClient. However, in some cases the camel case versus pascal case issue is still a problem.

I experience two distinct problems but I have taken the liberty to include both in this report.

I use two document classes:

public class TestDocument
{
    public string Id { get; set; }
    public int Number { get; set; }
    public EmbeddedDocument Embedded { get; set; }
}

public class EmbeddedDocument
{
    public string Text { get; set; }
}

Here is code to create an document in CosmosDB based on these classes:

var jsonSerializerSettings = new JsonSerializerSettings {
    ContractResolver = new CamelCasePropertyNamesContractResolver()
};
var client = new DocumentClient(
    new Uri("https://localhost:8081/"),
    "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==",
    jsonSerializerSettings);
var database = new Database { Id = "TestDatabase" };
await client.CreateDatabaseIfNotExistsAsync(database);
var collection = new DocumentCollection { Id = "TestCollection" };
var databaseUri = UriFactory.CreateDatabaseUri(database.Id);
await client.CreateDocumentCollectionIfNotExistsAsync(databaseUri, collection);
var testDocument = new TestDocument {
    Id = "Alpha",
    Number = 42,
    Embedded = new EmbeddedDocument { Text = "Beta" }
};
var collectionUri = UriFactory.CreateDocumentCollectionUri(database.Id, collection.Id);
await client.CreateDocumentAsync(collectionUri, testDocument);

The JSON in CosmosDB is properly camel cased as expected (I have removed the underscore prefixed properties for clarity):

{
    "id": "Alpha",
    "number": 42,
    "embedded": {
        "text": "Beta"
    }
}

My first problem is that querying using LINQ does not work:

var query = client.CreateDocumentQuery<TestDocument>(collectionUri)
    .Where(document => document.Number == 42)
    .AsDocumentQuery();
var results = new List<TestDocument>();
while (query.HasMoreResults)
    results.AddRange(await query.ExecuteNextAsync<TestDocument>());

At this point results contains no elements and not one element as expected. The reason is that the LINQ provider does not understand that the property to use in the query is number and not Number.

My second problem occurs when I want to do a partial update of a document. It is possible to avoid a complete deserialization and serializtion of the JSON document by using the following code:

var documentUri = UriFactory.CreateDocumentUri(database.Id, collection.Id, "Alpha");
var response = await client.ReadDocumentAsync(documentUri);
var embedded = response.Resource.GetPropertyValue<EmbeddedDocument>("embedded");
embedded.Text = "Gamma";
response.Resource.SetPropertyValue("embedded", embedded);
await client.ReplaceDocumentAsync(response.Resource);

Notice that only EmbeddedDocument is deserialized and then serialized again after it has been modified.

Unfortunately, the resulting JSON in CosmosDB after the code above has executed is the following:

{
    "id": "Alpha",
    "number": 42,
    "embedded": {
        "Text": "Gamma"
    }
}

Notice how the embedded.text property has changed case to embedded.Text (from camel case to pascal case). The reason is that the Document class which is the value of response.Resource does not use the custom JsonSerializerSettings.

@ealsur

This comment has been minimized.

Contributor

ealsur commented Aug 10, 2017

@Liversage Just to help others debug the issue, which version of the SDK are you using? 1.16.1?

@Liversage

This comment has been minimized.

Liversage commented Aug 10, 2017

Sorry, forgot to mention, but yes, I use the latest version which as of this time of writing is 1.16.1.

@kirankumarkolli

This comment has been minimized.

Contributor

kirankumarkolli commented Aug 14, 2017

Partial document update: This is an existing feature gap and is under review. One possible work-around is to use stored procedures. Stored procedure can read the document and patch it. This will avoid one extra network call.

@Liversage

This comment has been minimized.

Liversage commented Aug 15, 2017

@kirankumarkolli While partial updates certainly would be nice I'm describing a slightly different scenario where the entire document is retrieved, modified and then stored again. However, only part of the document is deserialized before it is modified and then serialized again. All the JSON that is not affected by the update is kept intact. This reduces the CPU and memory required on the client side.

I realize that I use the term "partial update" in my issue. However, I should probably have used "partial deserialization and serialization" instead.

@kirankumarkolli

This comment has been minimized.

Contributor

kirankumarkolli commented Aug 15, 2017

non-generic ReadDocumentAsync is somewhat on this path (not fully what you are looking for). This API does a first level de-serialization though.

@Liversage

This comment has been minimized.

Liversage commented Aug 15, 2017

@kirankumarkolli Thanks for the response.

Actually, the issue I am reporting is that when I use the non-generic ReadDocumentAsync (as you suggest) and then use ReplaceDocumentAsync to write the response resource returned by ReadDocumentAsync then the CamelCasePropertyNamesContractResolver is no longer used for serialization (second problem in my report).

I understand why CosmosDB has to control the serialization of the root of the document because it contains some CosmosDB specific properties. However, the behavior I experience where a CamelCasePropertyNamesContractResolver only is used in some cases is still surprising and essentially makes it impossible to use it.

@bentefay

This comment has been minimized.

bentefay commented Aug 16, 2017

@Liversage Regarding your first problem, I have found that adding JsonProperty attributes to all the properties you wish to query from LINQ causes the SDK to emit the correct query (with 1.17.0). Does this work for you?

public class TestDocument
{
    public string Id { get; set; }
    [JsonProperty("number")]
    public int Number { get; set; }
    public EmbeddedDocument Embedded { get; set; }
}

Obviously this is unnecessarily redundant though, so it would be great to have the SDK actually respect the ContractResolver when transpiling LINQ queries.

@kirankumarkolli

This comment has been minimized.

Contributor

kirankumarkolli commented Aug 16, 2017

@Liversage we will investigate ReplaceDocumentAsync not using serialization settings part.

@pendragon-andyh

This comment has been minimized.

pendragon-andyh commented Aug 17, 2017

@Liversage

This comment has been minimized.

Liversage commented Aug 17, 2017

@pendragon-andyh My issue contains the code that demonstrates the problem. You can create a console application, add the NuGet package, add the two class definitions and paste the remaining code into the Main method and add some using statements. That's all. Personally, I use LINQPad to do this for even lower friction.

@pendragon-andyh

This comment has been minimized.

pendragon-andyh commented Aug 17, 2017

@Liversage

This comment has been minimized.

Liversage commented Aug 17, 2017

Thank you @pendragon-andyh, that is a neat workaround. I get the desired result if I replace the last chunk of code in my issue with the following slightly modified code:

var documentUri = UriFactory.CreateDocumentUri(database.Id, collection.Id, "Alpha");
var response = await client.ReadDocumentAsync(documentUri);
var embedded = response.Resource.GetPropertyValue<EmbeddedDocument>("embedded");
embedded.Text = "Gamma";
// Workaround to force the serializer.
var jsonSerializer = JsonSerializer.Create(jsonSerializerSettings);
var jObject = JObject.FromObject(embedded, jsonSerializer);
response.Resource.SetPropertyValue("embedded", jObject);
await client.ReplaceDocumentAsync(response.Resource);

I believe that the incorrect serialization that you describe happens in the JsonSerializable.SaveTo method. Calling SetPropertyValue sets the value of the property in the JObject field of JsonSerializable used to represent the JSON document. However, the serialization is only done when it is actually required.

Now, if LINQ queries also worked with camel case without having to put JsonProperty attributes all over the place... 😉

@pendragon-andyh

This comment has been minimized.

pendragon-andyh commented Aug 17, 2017

Glad to have helped.

I'm interested @Liversage ... does your c# "make a change to the Resource object" perform much better than "fully deserializing the type, making the change, and then serializing the full object"? My gut feel for the API is that it would not be very different.

Also, at a wider level your code is currently doing:

  • Read from (probably a read-replica) server.
  • Do some stuff.
  • Write to the write-primary server.

This is 2 round-trips - and has a risk that the object has changed between the read and the write operations. Even if you use the Etag optimistic-concurrency trick there is a risk that you will get a consistency exception and then have to retry.

The API's seems to favour the use of a stored-procedure for this type of operation. In that case the entire read+change+write would happen as a single atomic unit against the write-primary server. This would be 1 network round-trip and has zero chance of a consistency error. My only problems with sprocs are they kind-of go against the whole schema-less thing (because you have to deploy them and consider versioning) and you have to faff around with another language.

Also interested in how much the LINQ thing is a deal-breaker to you? Note: I am trying to write a API over DocumentDB (aiming for "productive, safe, and doesn't kick your POCO's in the nuts"). I am currently using the SqlQuerySpec approach (I'm a Dapper instead of EF guy). For complex queries, people seem to be having a whole heap of grief because of the disparities between LINQ syntax and what Cosmos DB's syntax.

Regards
Andy

@Liversage

This comment has been minimized.

Liversage commented Aug 17, 2017

@pendragon-andyh, to answer your questions:

[...] does your c# "make a change to the Resource object" perform much better than "fully deserializing the type, making the change, and then serializing the full object"? My gut feel for the API is that it would not be very different.

I haven't measured but I don't think so. I just implemented it like this because it seemed more efficient, but when the NuGet package allowed me to specify JsonSerializerSettings I discovered the issues I am reporting.

[...] has a risk that the object has changed between the read and the write operations

Not in my application. It is actor based and every document is owned by a specific single threaded actor. In general you are right of course.

Also interested in how much the LINQ thing is a deal-breaker to you?

The LINQ problem is a deal breaker because I just want to write and read POCO classes to and from CosmosDB without the hassle of decorating these classes with attributes. My solution is to use pascal case in the JSON. In the future some of this JSON will be served to a browser where the front-end developer will have to live with this formatting.

For complex queries, people seem to be having a whole heap of grief because of the disparities between LINQ syntax and what Cosmos DB's syntax.

My LINQ requirements are simple: I have to be able to filter on and perform ordering based on property values and limit the number of results. I have no problems doing that with the available API.

@pendragon-andyh

This comment has been minimized.

pendragon-andyh commented Aug 17, 2017

@bfarrimo

This comment has been minimized.

bfarrimo commented Oct 3, 2017

The fact that the LINQ provider doesn't pick up the global JSON serializer settings is an issue, and using the JsonPropertyAttribute is at best a poor workaround with a few caveats as I'll demonstrate below.

In the case of generic methods, they don't always read the attribute that you (or at least I) would I expect:

Scenario 1
Given the call client.CreateDocumentQuery<TestEntity>(uri).Where(x => !x.Deleted) correctly looks at the attribute on my 'TestEntity'.

Scenario 2
Given the following code:

interface IDeletable
{
    bool Deleted { get; set; }
}

void Test<TEntity>(DocumentClient client, Uri uri) where TEntity : IDeletable
{
    client.CreateDocumentQuery<TEntity>(uri).Where(x => !x.Deleted);
}

Test<TEntity>(client, uri);

Here, the API looks for a JsonPropertyAttribute on the Deleted property of the IDeletable interface, rather on the TEntity (in this case TestEntity) type.

Scenario 3
Given the following code:

interface IDeletable
{
    bool Deleted { get; set; }
}

void PredicateTest<TEntity>(DocumentClient client, Uri uri, Expression<Func<TEntity, bool>> predicate) where TEntity : IDeletable
{
    client.CreateDocumentQuery<TEntity>(uri).Where(predicate);
}

PredicateTest<TestEntity>(client, uri, x => !x.Deleted);

The API again correctly reads the JsonPropertyAttribute on the Deleted property of the TestEntity type.

Therefore, the correct workaround currently to use the LINQ provider is to place JsonProperty attributes on each property in your class and any interfaces it inherits from in case it's referred to like that at any point.

This is also required for the Id properties on your documents, as DocumentDb requires a lower case 'id' field and any LINQ queries using said field will not work properly.

I would obviously much prefer to be able to simply set my JSON serializer settings globally (to be camel cased) and then have the LINQ provider respect this in all cases, rather than have to jump through so many hoops and find issues almost every time I do something new with DocumentDb.

@jroselle

This comment has been minimized.

jroselle commented Oct 6, 2017

I wanted to add to bfarrimo's comment by mentioning that JsonProperty attribute also does not work on inherited class properties with the override modifier.

@justinSelf

This comment has been minimized.

justinSelf commented Oct 11, 2017

I'm experiencing this issue with ExecuteStoredProcedureAsync as well. I've setup the client to use the CamelCasePropertyNamesContractResolver but the documents are still being serialized as Pascal.

Using Core Client 1.5.1

@bytelabsco

This comment has been minimized.

bytelabsco commented Feb 23, 2018

In addition to what @justinSelf said regarding stored procedures, if you specify custom serializer settings, any parameters you pass in will be lost by the time the stored procedure executes, and all parameters will have no value.

@reddy6ue

This comment has been minimized.

reddy6ue commented Feb 23, 2018

This decision to use only Pascal case and to not let the user set default serialization settings in DocumentDB seems to be a major design flaw. I've also had the same experience dealing with this quirk of CosmosDB.

The best solution in my experience is to let the users decide which serialization settings they want on any particular collection.

@FVilevski

This comment has been minimized.

FVilevski commented Feb 23, 2018

Is there going to be fix for this anytime soon ?

@Liversage

This comment has been minimized.

Liversage commented Feb 23, 2018

@reddy6ue

This decision to use only Pascal case and to not let the user set default serialization settings in DocumentDB seems to be a major design flaw. I've also had the same experience dealing with this quirk of CosmosDB.

CosmosDB does not prescribe anything about the format of the JSON that you store. This is a problem in .NET SDK that you use to access CosmosDB that has a problem fixing the mismatch between C# conventionally using pascal case and JSON conventionally using camel case.

I have considered not using this SDK and instead writing my own library to access the REST API directly to avoid this serialization issue. However, there is actually a lot going on in the SDK and the effort required is not justified if the goal is to fix a casing issue.

@reddy6ue

This comment has been minimized.

reddy6ue commented Feb 26, 2018

@Liversage You are right. It's not a cosmosdb serialization issue, but the the default Newtonsoft serializer that comes with the azure sdk. I still don't understand why it's so hard to allow users to inject a serializer of their choosing.

@creyke

This comment has been minimized.

creyke commented Feb 26, 2018

I'm using a PascalCase language to write and javascript to read, hence the need to enforce camelCase. Now I can't use Linq. Please prioritize this fix.

@moshegutman

This comment has been minimized.

moshegutman commented Feb 26, 2018

Yeah this is annoying. My current workaround is to build the query manually.

var parameters = new Microsoft.Azure.Documents.SqlParameterCollection()
    {
        new Microsoft.Azure.Documents.SqlParameter("@number", number)
    };

Microsoft.Azure.Documents.SqlQuerySpec querySpec = new Microsoft.Azure.Documents.SqlQuerySpec("SELECT * FROM root WHERE (root.number <= @number)", parameters);

var query = documentClient.CreateDocumentQuery<TestDocument>(UriFactory.CreateDocumentCollectionUri("test", "testdocuments"), querySpec).AsDocumentQuery();


while (query.HasMoreResults)
{
    var response = await query.ExecuteNextAsync<TestDocument>();
    testDocuments.AddRange(response);
}
              
@scott-lin

This comment has been minimized.

scott-lin commented Mar 10, 2018

+1, please prioritize fixing this issue.

@tklopl

This comment has been minimized.

tklopl commented Mar 20, 2018

Bump.

@dominikfoldi

This comment has been minimized.

dominikfoldi commented May 22, 2018

Any update on this? This feature has to be a basic thing in the SDK...

@WillRock19

This comment has been minimized.

WillRock19 commented Jul 5, 2018

Do you guys have any update on the subject? The JSON format still a problem...

@mnmr

This comment has been minimized.

mnmr commented Aug 7, 2018

Can we please open-source the SDK so people can fork the code and/or fix this issue without having to reinvent the wheel? It's quite a disgrace that this is still an open issue.

@moshegutman

This comment has been minimized.

moshegutman commented Aug 7, 2018

@arramac Can we get an update on this?

@alexandertsema

This comment has been minimized.

alexandertsema commented Aug 30, 2018

@arramac any updates?

@github-usr-name

This comment has been minimized.

github-usr-name commented Oct 26, 2018

    // See https://github.com/Azure/azure-cosmosdb-dotnet/issues/317 ... this code is a ridiculously brittle kludge that
    // works for now
    private static IQueryable<T> MakeCamelCase<T>(this IQueryable<T> query, IDocumentClient documentClient, Uri collUri, FeedOptions opts)
    {
        if (query.Expression.NodeType != System.Linq.Expressions.ExpressionType.Constant)
        {
            dynamic sqlQueryObject = JsonConvert.DeserializeObject(query.ToString());
            string sql = sqlQueryObject.query;
    // Regex is actually a private member w/ Compiled flag set to avoid overhead.  Not that it really matters,
    // the guts of MS.DocumentDB.Core use uncached reflection with abandon
            sql = new Regex("\\[\"(.+?)\"\\]").Replace(sql, match => $"[\"{match.Groups[1].Value.ToCamelCase()}\"]");
            query = documentClient.CreateDocumentQuery<T>(collUri, sql, opts);
        }

        return query;
    }
@A139008

This comment has been minimized.

A139008 commented Dec 6, 2018

Please prioritise this fix - causing too many people too much grief

@dominikfoldi

This comment has been minimized.

dominikfoldi commented Dec 6, 2018

They says on their forum that this issue is not planned for now, but come on, this is really a problem..

https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/31355917-linq-provider-should-respect-jsonserializersetting

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment