
Switch from Newtonsoft.Json to System.Text.Json #1160

Merged: 56 commits, Jun 28, 2024

Conversation

@symbiogenesis (Contributor) commented Apr 8, 2024

Benefits of System.Text.Json:

  1. It is faster than Newtonsoft.Json, especially when using the source generators (see the sketch after this list).

  2. The System.Text.Json DLL has a somewhat smaller binary size, and it is trimmable.

  3. Going forward, most projects will already be using System.Text.Json, so people making MAUI apps will get smaller app sizes by not having to bundle two serializers.
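
As an illustrative aside (not code from this PR; the Person type and AppJsonContext name are hypothetical), the source generation the first point refers to looks like this:

using System.Text.Json;
using System.Text.Json.Serialization;

public class Person
{
    public string Name { get; set; }
}

// The source generator emits serialization metadata and code at compile
// time, avoiding runtime reflection and keeping the output trimmable.
[JsonSerializable(typeof(Person))]
public partial class AppJsonContext : JsonSerializerContext
{
}

public static class Demo
{
    public static string Serialize() =>
        JsonSerializer.Serialize(new Person { Name = "Ada" }, AppJsonContext.Default.Person);
}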

@Mimetis
Copy link
Owner

Mimetis commented Apr 8, 2024

I'm using the DataContract, DataMember and IgnoreDataMember attributes (rather than the classic JsonProperty from Newtonsoft.Json) because they are compatible with many serializers (like MessagePackSerializer).

DMS allows you to use a serializer other than Newtonsoft.Json via the SerializerFactory on the web client / web server side.

This means you should be able to plug in your own serializer if the one used by default by DMS does not fit your needs (for any reason).

That's why we have an example that uses a MessagePackSerializer:

using System;
using System.IO;
using System.Threading.Tasks;
using MessagePack;
using MessagePack.Resolvers;

public class CustomMessagePackSerializer : ISerializer
{
    // Deserialize directly from the stream; the contractless resolver lets
    // MessagePack handle types without [MessagePackObject] annotations.
    public async Task<object> DeserializeAsync(Stream ms, Type type)
    {
        var val = await MessagePackSerializer.DeserializeAsync(type, ms, MessagePackSerializerOptions.Standard
                       .WithResolver(ContractlessStandardResolver.Instance));
        return val;
    }

    // Serialize to an in-memory buffer and return the raw bytes.
    public async Task<byte[]> SerializeAsync(object obj)
    {
        using var ms = new MemoryStream();
        await MessagePackSerializer.SerializeAsync(ms, obj, MessagePackSerializerOptions.Standard
                  .WithResolver(ContractlessStandardResolver.Instance));

        return ms.ToArray();
    }
}

And then use it in WebRemoteOrchestrator:

var webRemoteOrchestrator = new WebRemoteOrchestrator(serviceUri);
webRemoteOrchestrator.SerializerFactory = new CustomMessagePackSerializer();

var agent = new SyncAgent(clientProvider, webRemoteOrchestrator, options);

Allowing users to bring their own serializer implies your custom serializer must be able to handle the attributes used in DMS (for now [DataContract], [DataMember] and [IgnoreDataMember]).

Now that you know why we are using [DataContract], [DataMember] and [IgnoreDataMember], the open questions for migrating to System.Text.Json are:

  • First: should we continue to use [DataContract], [DataMember] and [IgnoreDataMember]?
  • Second: should we switch to the System.Text.Json attributes, like [JsonPropertyName] and [JsonIgnore]?

The first solution can work with System.Text.Json, since it lets us define custom contracts to handle arbitrary attributes.
We even have an example for [DataContract], [DataMember] and [IgnoreDataMember]: ZCS.DataContractResolver

The second solution would introduce a breaking change, as I'm not sure MessagePackSerializer can handle the new attributes [JsonPropertyName] and [JsonIgnore]. (Not sure about that; maybe it works?)
Honest opinion? I'm not sure anyone is actually using a custom serializer :)

A more complex solution would be to do the same kind of thing as System.Text.Json itself, by introducing a way to specify a DefaultJsonTypeInfoResolver, but that's a more advanced scenario... (a rough sketch of the idea follows below)
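
For illustration only, here is a minimal sketch (assuming .NET 7+ contract customization; DataContractModifier is a hypothetical name, not DMS code) of how a DefaultJsonTypeInfoResolver modifier could honor [DataMember] and [IgnoreDataMember]:

using System.Linq;
using System.Runtime.Serialization;
using System.Text.Json;
using System.Text.Json.Serialization.Metadata;

public static class DataContractModifier
{
    // Applied to every type's contract; honors [IgnoreDataMember] and
    // [DataMember(Name = ...)] (serialization side only in this sketch).
    public static void Apply(JsonTypeInfo typeInfo)
    {
        if (typeInfo.Kind != JsonTypeInfoKind.Object)
            return;

        foreach (var property in typeInfo.Properties)
        {
            if (property.AttributeProvider is null)
                continue;

            // Skip members marked [IgnoreDataMember].
            if (property.AttributeProvider.GetCustomAttributes(typeof(IgnoreDataMemberAttribute), true).Length > 0)
            {
                property.ShouldSerialize = (_, _) => false;
                continue;
            }

            // Rename members according to [DataMember(Name = ...)].
            var dataMember = property.AttributeProvider
                .GetCustomAttributes(typeof(DataMemberAttribute), true)
                .OfType<DataMemberAttribute>()
                .FirstOrDefault();

            if (dataMember?.Name != null)
                property.Name = dataMember.Name;
        }
    }
}

And wiring it up:

var options = new JsonSerializerOptions
{
    TypeInfoResolver = new DefaultJsonTypeInfoResolver { Modifiers = { DataContractModifier.Apply } }
};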

What do you think ?

@Mimetis (Owner) commented Apr 8, 2024

Another thing to mention:

We have a way to customize serialization between web client and web server using the ISerializer interface, but it's not the only serializer we use in DMS.

We also use a serializer called LocalJsonSerializer.
This one is not customizable, and it is not used to move data between client and server.
It is used internally to serialize data to the local disk, to limit memory consumption.
It can be backed by any serializer, as long as it's compatible with all platforms.
And it should be fast, with minimal memory overhead.
For sure System.Text.Json will be a great replacement for this one, I guess.

@symbiogenesis (Contributor, Author)

ZCS.DataContractResolver is nice. Their NuGet package doesn't support netstandard2.0, though not for any good reason.

I opened an issue over there.

But the whole thing is just 150 lines or so, and MIT-licensed. We can pull the file in, with attribution, and all will be well. I have updated this PR accordingly.

I like this approach best because it minimally affects the non-serialization-related code in DMS.

Pulling in this data contract resolver code also seems like a good idea because it would let me more easily add some performance optimizations later; their approach makes heavy use of reflection, LINQ, etc.

Of course, I could upstream some of these changes to them, if they turn out well.

But even if that happens and they add support, not taking on unnecessary NuGet dependencies reduces the supply-chain attack surface. Having the code live directly in this repo is the most secure option.

Alas, even after doing this the unit tests are still failing for an unknown reason.

@Mimetis (Owner) commented Apr 9, 2024

I'm trying to see what happens, and it seems the System.Text.Json serialization is not doing the same thing as Newtonsoft.Json.

For instance, here is a piece of JSON I've found, just by serializing a ScopeInfo:

Before:

{
    "t": [
        {
            "n": "ProductCategory",
            "s": "SalesLT",
            "op": "SqlSyncProvider, Dotmim.Sync.SqlServer.SqlSyncProvider",
            "c": [
                {
                    "n": "ProductCategoryID",
                    "dt": "17",
                    "iu": true,
                    "iuc": true,
                    "ml": 12,
                    "odb": "NVarChar",
                    "oty": "nvarchar",
                    "db": 16
                }

After:

{
    "$id": "1",
    "t": {
        "$id": "2",
        "$values": [
            {
                "$id": "3",
                "n": "ProductCategory",
                "s": "SalesLT",
                "op": "SqlSyncProvider, Dotmim.Sync.SqlServer.SqlSyncProvider",
                "c": {
                    "$id": "4",
                    "$values": [
                        {
                            "$id": "5",
                            "n": "ProductCategoryID",
                            "dt": "17",
                            "iu": true,
                            "iuc": true,
                            "ml": 12,
                            "odb": "NVarChar",
                            "oty": "nvarchar",
                            "db": 16
                         }

Still not sure this is the bug we are hitting, but for sure we are not compatible with previous versions.

I continue my research...

@Mimetis (Owner) commented Apr 9, 2024

Ok, removing ReferenceHandler.Preserve fixed this issue.

Still searching :)
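
For context, ReferenceHandler.Preserve is what emits the $id/$values metadata seen in the "After" output above. A minimal standalone repro (illustrative, not DMS code):

using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

var list = new List<string> { "a" };

// With Preserve, collections get wrapped in reference metadata:
var preserve = new JsonSerializerOptions { ReferenceHandler = ReferenceHandler.Preserve };
Console.WriteLine(JsonSerializer.Serialize(list, preserve));
// {"$id":"1","$values":["a"]}

// Without it, the output matches the Newtonsoft-era shape:
Console.WriteLine(JsonSerializer.Serialize(list));
// ["a"]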

@Mimetis (Owner) commented Apr 9, 2024

I have made some minor fixes in your code: removing ReferenceHandler.Preserve, adding a schema validation, and removing some environment newlines added during JSON generation.

That being said, while I was looking at LocalJsonSerializer, I saw that you are loading the whole JSON file when reading lines, in different methods:

public static (SyncTable schemaTable, int rowsCount, SyncRowState state) GetSchemaTableFromFile(string path)
{
    if (!File.Exists(path))
        return default;

    string tableName = null, schemaName = null;
    int rowsCount = 0;

    SyncTable schemaTable = null;
    SyncRowState state = SyncRowState.None;

    // Reads the entire file into a string before parsing:
    string jsonString = File.ReadAllText(path);
    using var doc = JsonDocument.Parse(jsonString);
    var root = doc.RootElement;
    // ...

public IEnumerable<SyncRow> GetRowsFromFile(string path, SyncTable schemaTable)
{
    if (!File.Exists(path))
        yield break;

    // Also reads the entire file into memory:
    using var stream = File.OpenRead(path);
    using var reader = new StreamReader(stream);
    var root = serializer.Deserialize<JsonElement>(reader.ReadToEnd());
    // ...

But the main purpose of LocalJsonSerializer is specifically to avoid reading the whole file into memory, to prevent memory pressure when batch files are huge (especially on mobile devices). That is the whole point of LocalJsonSerializer and local file serialization: limiting memory consumption.

It seems we can do the same following this doc: https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/use-utf8jsonreader (a rough sketch follows below).
But it seems a damn sight more difficult than with Newtonsoft.Json...
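
As a rough sketch of what that doc describes (illustrative only, not the DMS implementation; the property name "n" and the 4 KB buffer are arbitrary, and buffer growth for tokens larger than the buffer is omitted):

using System;
using System.IO;
using System.Text.Json;

public static class JsonStreamScanner
{
    public static int CountProperties(string path)
    {
        using var stream = File.OpenRead(path);

        // Read the file in 4 KB chunks instead of loading it all at once.
        var buffer = new byte[4096];
        int bytesInBuffer = stream.Read(buffer, 0, buffer.Length);
        var reader = new Utf8JsonReader(buffer.AsSpan(0, bytesInBuffer), isFinalBlock: false, state: default);

        int count = 0;
        while (true)
        {
            // Read() returns false when it runs out of buffered data.
            while (reader.Read())
            {
                if (reader.TokenType == JsonTokenType.PropertyName && reader.ValueTextEquals("n"))
                    count++;
            }

            if (reader.IsFinalBlock)
                break;

            // Shift the unconsumed tail to the front and refill the buffer.
            int consumed = (int)reader.BytesConsumed;
            int leftover = bytesInBuffer - consumed;
            Buffer.BlockCopy(buffer, consumed, buffer, 0, leftover);
            int read = stream.Read(buffer, leftover, buffer.Length - leftover);
            bytesInBuffer = leftover + read;
            reader = new Utf8JsonReader(buffer.AsSpan(0, bytesInBuffer), isFinalBlock: read == 0, reader.CurrentState);
        }

        return count;
    }
}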

For now I'm struggling with a locked file while reading a local file; still searching for why...

@symbiogenesis (Contributor, Author)

Great job solving the test failure!

Now that you've fixed it, it feels like an obvious oversight not to remove ReferenceHandler.Preserve when switching to this new approach. But it probably would have taken me a long time to find.

I switched from reading the whole file as JSON to reading from the UTF-8 file stream directly, as requested. That is clearly better, yes.

@symbiogenesis (Contributor, Author)

I see the locked files.

@Mimetis (Owner) commented Apr 9, 2024

Don't work too much on the byte stream reader; I'm currently on it.

@symbiogenesis (Contributor, Author)

You can fix the locking issue by adding this.sw.Dispose() to the CloseFile() method in LocalJsonSerializer. See the sketch below.
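
A minimal sketch of that suggestion (assuming sw is the writer LocalJsonSerializer keeps open on the batch file; the rest of the method body is hypothetical):

public void CloseFile()
{
    // ... write the closing JSON tokens ...

    // Dispose the writer to release the underlying file handle;
    // otherwise the file stays locked for subsequent reads.
    this.sw.Dispose();
}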

@symbiogenesis (Contributor, Author) commented Jun 24, 2024

@Mimetis fixed the conflicts. Tests passed.

Feel free to update the PR with the JsonExtensions functionality, or merge and make a new PR.

@symbiogenesis symbiogenesis changed the title [WIP] switch from Newtonsoft.Json to System.Text.Json Switch from Newtonsoft.Json to System.Text.Json Jun 24, 2024
@Mimetis (Owner) commented Jun 24, 2024

@symbiogenesis (Contributor, Author)

You are right about that, but somehow GitHub erroneously told me everything passed. We might need to update the pipelines to report test failures correctly.

I will look into the failure and also try to improve the failure reporting.

@Mimetis (Owner) commented Jun 27, 2024

I'm on it!

@symbiogenesis (Contributor, Author)

GitHub is having some delays today. I was able to manually merge your branch, and hopefully the Actions runs start later today.

https://www.githubstatus.com/

@Mimetis (Owner) commented Jun 28, 2024

OK, I guess we can merge it!!
This is a huge upgrade; I hope we won't have too many regressions with existing deployed DMS apps.

@Mimetis merged commit 3ca52e9 into Mimetis:master on Jun 28, 2024 (28 checks passed)
@symbiogenesis deleted the system-text-json branch on June 28, 2024 23:15