Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add json serialization/deserialization support for c++ generator #99

Merged

Conversation

mutagene
Copy link
Contributor

@mutagene mutagene commented Oct 1, 2021

Adapted from @hiennguyenle's work in https://github.com/hiennguyenle/finn.

Unlike Hien Nguyen Le's work, this PR does not introduce a JSON type which can be marshalled across language boundaries - in this PR, the JSON serialisation is strictly used to serialise and deserialise from the c++ datatypes.

Because my purpose in creating this PR was primarily to allow human-friendly representation of DJINNI datatypes for e.g. logging, enums are serialized to strings (which correspond to their names in the IDL). In keeping with Hien Ngyuyen Le's work, the deserialisation layer is also maintained - although I'm not as clear on its use case.

This PR does not include the json+extension.hpp required for std::optional support from Hien Nguyen Le's json PR as that would have to be included in the djinni-support-lib.

json+extension.hpp required for serializing std::optional, as in Hien Nguyen Le's implementation, is produced by the generator if the c++ json extension is enabled.

@a4z
Copy link
Contributor

a4z commented Oct 1, 2021

Awesome! Thanks a lot!!

It will take me some time to look through that but it seems nice on a first glance.

One question I have is, how tight is the coupling to nlohmann/json? I was never a fan of that library since it's slow in every aspect, and the only selling point was a modern interface back then when c++11 was modern.

I think the decision we make now for the serializer lib will stick with us, so I would at least highlight this question since we are going to introduce dependencies.

The other thing is, I wonder if we could get somehow some usage example.
Having test in expected generated text is good and needed.
But we need also to see how that actually works and see how to make functionality available for users. A hello world, because if there is no quick start that shows how to use a feature, it's like not having it

@mutagene
Copy link
Contributor Author

mutagene commented Oct 1, 2021

One question I have is, how tight is the coupling to nlohmann/json? I was never a fan of that library since it's slow in every aspect, and the only selling point was a modern interface back then when c++11 was modern.

Ahhhh, yes, well, the coupling is pretty tight. hiennguyenle was using nlohmann's adl_serializer in the nlohmann namespace for the record, so I did the same with the enum - and to be honest, we're using nlohmann json in our project already, so this would work out pretty well for us. I'm not sure how best to reduce this coupling while still retaining the benefits of "free" json serialization/deserialization (for consumers who are happy with nlohmann json).

If we used the more flexible approach of putting the to_json/from_json methods in the type's namespace, we could leave the actual json implementation templated - but we'd still be pretty much constrained by how nlohmann json is doing things.

I think the decision we make now for the serializer lib will stick with us, so I would at least highlight this question since we are going to introduce dependencies.

Fair enough.

The other thing is, I wonder if we could get somehow some usage example. Having test in expected generated text is good and needed. But we need also to see how that actually works and see how to make functionality available for users. A hello world, because if there is no quick start that shows how to use a feature, it's like not having it

Yes, I agree - of course I tested the generated code in my own toy project - but I wasn't sure where such a project should be included in the djinni-generator repo.

@paulocoutinhox
Copy link
Contributor

Hi,

My suggestion is to use quotes instead of "<...>" to follow other headers pattern.

Thanks.

@mutagene
Copy link
Contributor Author

mutagene commented Oct 4, 2021

My suggestion is to use quotes instead of "<...>" to follow other headers pattern.

Is this with reference to the <nlohmann/json.hpp> include?

@a4z
Copy link
Contributor

a4z commented Oct 4, 2021

I have a little bit of a high workload this week and not sure how much time I am able to invest into my favorite spare time projects.

But as a question and thought experiment:

Do you think it would be possible to have an implementations like

--cpp-json-serialization <None/nlohmann>

and if future we might have , for example

--cpp-json-serialization <None/nlohmann/rapidjson>

and kind of 'hotsap' the json detail implementation at generation time

We do not need an other implementation right now, but just conceptually preparing this kind of flexibility, so we can control that option in future and don't close the door for users that might use already other json frameworks in their project.

@a4z
Copy link
Contributor

a4z commented Oct 4, 2021

My suggestion is to use quotes instead of "<...>" to follow other headers pattern.

Is this with reference to the <nlohmann/json.hpp> include?

I do not think that should change since this is a 3rd party header and should be threaded as a system header

@mutagene
Copy link
Contributor Author

mutagene commented Oct 4, 2021

Do you think it would be possible to have an implementations like

--cpp-json-serialization <None/nlohmann>

This makes sense to me, and isn't too onerous to do now. If more json libraries are added, it seems to me that it will be a bit ugly to continue shove them also into CppGenerator.scala as I have done now with the nlohmann implementation, but I'm not prepared or equipped to refactor CppGenerator now.

@paulocoutinhox
Copy link
Contributor

Wow! It is nice!

@a4z
Copy link
Contributor

a4z commented Oct 5, 2021

Thanks for adding the flexibility option! Making it more pretty once we might have multiple json lib support is a future problem.

After a first glance, I think this looks good.
I need just find a bit more time to play with it, but I think we will pull that in and have json soon support. Thanks a lot!

It would be an awesome addition to have some doc with some small usage example, or any other way that can show people how to have an easy start into using that feature.

@a4z
Copy link
Contributor

a4z commented Oct 8, 2021

This looks good, @mutagene , thanks a lot!
Do you have a small hello djinni json world example, that we could reference from the docs ?

@mutagene
Copy link
Contributor Author

mutagene commented Oct 8, 2021

@a4z
I added the following to generated-code-usage.md


C++ JSON Serialization support

Serialization from C++ types to/from JSON is supported. This feature is currently only enabled for nlohmann/json, and if enabled creates to_json/from_json methods for all djinni records and enums.

#include "my_record.hpp"
#include "my_record+json.hpp"
#include <iostream>
#include <nlohmann/json.hpp>

void foo(const my_record& record) {
  // convert record to json object
  nlohmann::json j = record;
  // dump serialized string
  std::cerr << j.dump(4);
  // create new instance of record from json object
  my_record cloned = j.get<my_record>();
}

I'd be happy to add a working example except I feel that could be quite a rabbit hole as there is currently no infrastructure in this repo for adding c++ executable apps/tests.

@mutagene
Copy link
Contributor Author

mutagene commented Oct 8, 2021

@a4z
By the way, I've also tried adding the json+extension.hpp as a resource file instead of putting it inline with the code; which do you prefer? My preference is towards having it in src/resources/extension/json/nlohmann/json+extension.hpp (maybe a bit verbose..) rather than inline, but I'm not really fussy on this.

  def writeJsonExtensionFile(f: IndentWriter => Unit) {
    createFileOnce(spec.cppHeaderOutFolder.get, "json+extension.hpp", (w: IndentWriter) => {
      w.wl("// AUTOGENERATED FILE - DO NOT MODIFY!")
      f(w)
    })
  }
  def writeNlohmannJsonExtensionFile() {
    writeJsonExtensionFile(w => {
      w.wl("""#pragma once
             |
             |#include <optional>
             |#include <chrono>

vs

  def writeJsonExtensionFile(f: IndentWriter => Unit) {
    createFileOnce(spec.cppHeaderOutFolder.get, "json+extension.hpp", (w: IndentWriter) => {
      w.wl("// AUTOGENERATED FILE - DO NOT MODIFY!")
      f(w)
    })
  }
  def writeNlohmannJsonExtensionFile() {
    val extensionContent = Source.fromResource("extension/json/nlohmann/json+extension.hpp").getLines.mkString("\n")
    writeJsonExtensionFile(w => {
      w.wl(extensionContent)
    })
  }

@a4z
Copy link
Contributor

a4z commented Oct 8, 2021

hm, don't know, if the resource gets embedded into the deliverable, and we still have 1 file to distribute, I do not have strong opinion, but I could imagine it might the be better to do it like that (as embedded resource)

@mutagene
Copy link
Contributor Author

mutagene commented Oct 8, 2021

hm, don't know, if the resource gets embedded into the deliverable, and we still have 1 file to distribute, I do not have strong opinion, but I could imagine it might the be better to do it like that (as embedded resource)

Ok; done.

@mutagene
Copy link
Contributor Author

Just a note: I added the serialisation/deserialisation for flags. I had missed this before, and I guess it fell through the cracks in part because the record in the all_datatypes integration test doesn't have a flags field. I've added this for the json integration test and validated the implementation with clang 12.

Whereas enums translate c++ enums into strings, flags translate into an array of strings in the resulting json. As with enums, the purpose being to create something human readable.

@a4z
Copy link
Contributor

a4z commented Oct 10, 2021

I am sorry to tell you that there are now some merge conflicts.
This will look much wilder than it is, since it is due to the fact that we have now a tool formatting the source code.

I guess the best way to solve them is to just add the changes that add scalafmt into your codebase, run the formatter over the source code, and then merge the rest of the main branch.

If you need any help, please let me know.

Adapted from hiennguyenle's work in https://github.com/hiennguyenle/finn

* Allow json serializer to be specified in djinni options, with only nlohmann currrently implemented
* Update documentation for json serialization
* Add trivial usage example for cpp serialization/deserialization
* Update to generate json+extension.hpp for nlohmann/json
@mutagene mutagene force-pushed the add-json-serialization-modified-from-finn branch from afe7712 to 24f8017 Compare October 10, 2021 21:06
@mutagene
Copy link
Contributor Author

Is anything else required on my end for this PR / any requests or further reservations? I'm looking forward to hopefully getting this to replace a lot of handwritten to_json methods 🙇‍♂️

@a4z
Copy link
Contributor

a4z commented Oct 13, 2021

Sorry for being that slow at the moment @mutagene
I think the PR looks. fine, I would just need some time to check it out locally and test it, but I haven't had time so far.

Will try to catch up asap.

@jothepro
Copy link
Contributor

Same here, I'm planning to do a full review this week (hopefully)!

Copy link
Contributor

@a4z a4z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks a lot!

I did some quick tests, generating a simple record and enum, they seem to serialise and de-serialise nicely.
And I like that the enum value gets the name of the enum value in the output!

Please just check the generated records includes, I think there is no need to include the xml serialisation on that place.
If you do not use the json functionality, you should be able to not have any dependency to any json lib, only if you include the +json header, you should need that.
Otherwise I think it looks very good!

Copy link
Contributor

@a4z a4z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, and is also successfully running with some local sample code that is very close to the doc. good stuff!

It would be nice if we could find some solution for the time_point, but I think it's OK to solve that in future if we can't find something good now.

Thanks a lot for this great PR!

@paulocoutinhox
Copy link
Contributor

@a4z
Copy link
Contributor

a4z commented Oct 15, 2021

Thanks Paulo!

I think the topic has some potential for upfront thoughts, and this is why Alexis did right to not implement anything.

imho, the question is:
Do we just want to have the data serialized, and C++ know what to do with it
or do we want something more readable (json format) that would look more like something

let jsonStr = JSON.stringify({'now': new Date()}) ;   
console.log(jsonStr);

-> {"now":"2021-10-15T06:30:02.415Z"}

and we can take the human readable date time and transform it from and to C++
better readable but slower.

Until I we have an anser for this we will take the code as it is, except anyone has an opinion that instead of implementing nothing we should do a "N/A" or so.

I am fine with as it is now, and solve that in future when we might have more user feedback,
but I also wait for a second review to see if there are other opinions

@a4z
Copy link
Contributor

a4z commented Oct 15, 2021

@mutagene , a SwedenCpp user group colleague , @bigdavedev , shared that with me
https://godbolt.org/z/1hna1G1zM
Do you think that could be a log-friendly and reasonable good solution ?

@a4z a4z requested review from jothepro and freitass October 15, 2021 12:44
@mutagene
Copy link
Contributor Author

@mutagene , a SwedenCpp user group colleague , @bigdavedev , shared that with me https://godbolt.org/z/1hna1G1zM Do you think that could be a log-friendly and reasonable good solution ?

Ahh, nice - yes, that's quite lovely.

@bigdavedev
Copy link

I would like to point out that the snippet I provided relies on both nlohmann::json and hinnant::date which is yet another dependency to manage. It's a trade off, but I would encourage one to rely on Howard Hinnant's code (the de facto guru and inventor of chrono for C++11 and C++20) than attempt to parse dates and times oneself...

Copy link
Contributor

@jothepro jothepro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for this PR! I was sceptical at first but now that I tried it out and given the fact that it also supports deserialization I could even see myself (ab)using this new feature for more than just logging... ;)

I'm not sure if this has been discussed before, but I remember @freitass proposing to provide overwrites for operator<< as well, to make logging even more convenient? This doesn't necessarily need to happen in this PR, but what do you think about the idea in general?

docs/cli-usage.md Outdated Show resolved Hide resolved
s"--idl src/it/resources/${idlFile}.djinni --cpp-namespace custom_namespace --cpp-json-serialization nlohmann --cpp-out $outputPath/cpp --cpp-header-out $outputPath/cpp-headers"
)
Then(
"the generator should successfully generate just objc output and write all generate files to the given path, including headers"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"the generator should successfully generate just objc output and write all generate files to the given path, including headers"
"the generator should successfully generate just C++ output and write all generated files to the given path, including headers"

I'm not sure though that this is what you want to test here? The it call implies that this is about testing that the correct namespace is applied for the json extension?
To me it seems like testing that all expected files are generated (including the extension) has already been covered by it("Should generate json serializers for all data types")?

Can you try to reduce the scope of this test case or be more specific in the test description (it, when, then) what this test is about, if possible? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right - the only data types that had to be tested were record, enum and flag, as their fully qualified names should appear in their corresponding adl_serializers.

src/main/scala/djinni/Main.scala Outdated Show resolved Hide resolved
@mutagene mutagene force-pushed the add-json-serialization-modified-from-finn branch from be4db11 to 5d7fc07 Compare October 17, 2021 17:05
- check --cpp-json-serialization; report error if invalid

- narrow unit tests

- add json de/serialization for date using Howard Hinnant's date.h header-only library
@mutagene mutagene force-pushed the add-json-serialization-modified-from-finn branch from 5d7fc07 to 6ee9912 Compare October 17, 2021 17:08
@mutagene
Copy link
Contributor Author

I'm not sure if this has been discussed before, but I remember @freitass proposing to provide overwrites for operator<< as well, to make logging even more convenient? This doesn't necessarily need to happen in this PR, but what do you think about the idea in general?

That sounds pretty handy, but as I'm not using the << operator for logging I don't have a strong opinion :|

@a4z
Copy link
Contributor

a4z commented Oct 18, 2021

I really would like to merge that PR already, since I think it has been long enough in the review process and I find it good.

But I have one concern now, with the change that dropped in over the weekend.
And this is the dependency to the date library.

While it is nice, it might be unwanted.
I should not get this dependency if I do not use date in my IDL , but still I might want to use json.

Also, this is just 1 possible implementation. Paulo showed an other one, that might be preferred by some users, and there are for sure more. (E.G, setting a user defined time zone on the date)

Therefore I think the best place to provide the date implementation is the documentation, and have no json to date conversion at all.

If the user wants one, adding (a small header with) the required template specialization is an easy task.
And it can happen totally on user site, all a user needs ot have the adl_serializer before call the to/from json

All a user would have to do is basically add a

  template <>
    struct adl_serializer<std::chrono::system_clock::time_point>
    {
        static void to_json(json &j, const std::chrono::system_clock::time_point& tp) {
            j =  //. ... whatever the user wants/needs
        }

        static void from_json(const json &j, std::chrono::system_clock::time_point& value) {
           value = // .. whatever the user wants/needs
        }
    };

before the usage of it in , e.g. (from the docs)

void foo(const my_record& record) {
  // convert record to json object
  nlohmann::json j = record;
  // dump serialized string
  std::cerr << j.dump(4);
  // create new instance of record from json object
  my_record cloned = j.get<my_record>();
}

and it will just work. If that happens inline, or via a header ...

Therefore I seriously think the best way would be to have no implementation for std::chrono::system_clock::time_point because

  • by providing an implementation we will make it hard (to impossible) for the user having some other implementation
  • there is not 1 way of doing it right , there are unknown many ones
  • not pay with an dependency that might be not needed or unwanted
  • having one or two code sampels in the documentation is as good as an implementation

Therefore I think the date implementation is good, but should possible be in the docs.

Thoughts?

@mutagene
Copy link
Contributor Author

mutagene commented Oct 18, 2021

Thoughts?

I was thinking a bit the same when I made the last commit. On the other hand, if we don't provide an implementation at all, then projects using date with json serialization enabled simply won't build.

I guess one solution could be to take @paulo-coutinho's solution and make this the default. If someone wanted to override it with the date.h implementation, they could write their own adl_serializer as outlined in the documentation:

#include <nlohmann/json.hpp>
#include <date/date.h>

namespace nlohmann {
    template <>
    struct adl_serializer<std::chrono::system_clock::time_point>
    {
        static void to_json(json &j, const std::chrono::system_clock::time_point& tp) {
            j = date::format("%F %T %Z", tp);
        }

        static void from_json(const json &j, std::chrono::system_clock::time_point& value) {
            if (j.is_null()) {
                auto dur = std::chrono::milliseconds(0);
                value = std::chrono::time_point<std::chrono::system_clock>(dur);
            } else {
                std::istringstream json_time{j.get<std::string>()};
                std::chrono::system_clock::time_point parsed_time{};
                // Time saved in UTC, so no need to extract time zone
                json_time >> date::parse("%F %T", value);
            }
        }
    };
}

-- although this would have to be included before json+extension.hpp, which is a bit of a pain 💦

@a4z
Copy link
Contributor

a4z commented Oct 18, 2021

I was thinking a bit the same when I made the last commit.

I am super happy we have the same point of view!

On the other hand, if we don't provide an implementation at all, then projects using date with json serialization enabled simply won't build.

That could be fine , to get a compiler error, but maybe there is one more way

I guess one solution could be to take @paulo-coutinho's solution and make this the default. If someone wanted to override it with the date.h implementation, they could write their own adl_serializer as outlined in the documentation:

Since we are in C++ land, we can always decorate something with defines to remove it 😁

#include <nlohmann/json.hpp>
  
namespace nlohmann {

#ifndef DJINNI_CUSTOM_JSON_DATE
     template <>
     struct adl_serializer<std::chrono::system_clock::time_point>
     {
         /// .. some default implelemtation
     },
#endif  

and then this

-- although this would have to be included before json+extension.hpp, which is a bit of a pain 💦

is not a problem anymore

I will add that as review suggestions.

I think with such adoption we should be done. Great discussion leading to awesome improvements and a real powerful library feature. Thanks a lot !!

Copy link
Contributor

@a4z a4z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I drafted the change to make the default json date converter optional

* use simple date serializer as default, with compilation flag to disable and allow custom serializers
* provide custom sample serializer in docs which uses hinnant/date library for human readable dates
* add credit to hiennguyenle/finn in source
Copy link
Contributor

@a4z a4z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

I think we are fine, if something left we can add that as extra issues for the future.

Thank you so much @mutagene , great work and great collaboration!

@a4z a4z requested a review from jothepro October 19, 2021 06:34
Copy link
Contributor

@jothepro jothepro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks from me as well, @mutagene !
Your PR has been a pleasure to review. Looking forward to future collaboration! 😊

I've got some minor final feedback, but all in all this looks great!

docs/generated-code-usage.md Outdated Show resolved Hide resolved
src/main/scala/djinni/Main.scala Outdated Show resolved Hide resolved
@@ -16,6 +16,7 @@ Djinni generator parses an interface definition file and generates:
- Python implementation of types
- C++/CLI implementation of types
- C++ code to convert between C++ and Java over JNI
- C++ code to serialize/deserialize types to/from JSON
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI you may want to update the feature list on the gitpage as well: https://github.com/cross-language-cpp/cross-language-cpp.github.io/blob/main/docs/overview.md#main-features

This will show up here on the documentation website. :)

a4z and others added 2 commits October 19, 2021 16:11
Co-authored-by: jothepro <github@jothe.pro>
Co-authored-by: jothepro <github@jothe.pro>
@a4z a4z merged commit d1c8519 into cross-language-cpp:main Oct 19, 2021
@a4z
Copy link
Contributor

a4z commented Oct 19, 2021

I could not await to merge that, so I accelerated the last things and merged :-)
So happy we have that feature now on main !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants