Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

operator T() considered harmful #958

Open
theodelrieu opened this issue Feb 4, 2018 · 65 comments
Open

operator T() considered harmful #958

theodelrieu opened this issue Feb 4, 2018 · 65 comments

Comments

@theodelrieu
Copy link
Contributor

@theodelrieu theodelrieu commented Feb 4, 2018

operator T() considered harmful

There's two subjects of importance that I'd like to tackle.
This is the first one, I'll open another issue quickly (hopefully).

The problem

There's been several issues in the past related to the implicit conversion operator that the library provides.
The current Readme demonstrates the following use-case:

// conversion: json -> person
ns::person p2 = j;

This will call json::operator T() with T = ns::person, so no problems here.

You'd expect the following code to always work too:

ns::person p2;
p2 = j;

And it works! Sometimes.

If ns::person has a user-defined operator=, it will break with the following message:

t.cpp: In function ‘int main(int, const char**)’:
t.cpp:14:7: error: ambiguous overload for ‘operator=’ (operand types are ‘ns::person’ and ‘nlohmann::json {aka nlohmann::basic_json<>}’)
   p2 = j;
       ^
t.cpp:6:6: note: candidate: ns::person& ns::person::operator=(int)
   ns::person& operator=(int) { return *this; }
      ^~~~~~~~
t.cpp:3:8: note: candidate: constexpr ns::person& ns::person::operator=(const ns::person&)
 struct ns::person
        ^
t.cpp:3:8: note: candidate: constexpr ns::person& ns::person::operator=(ns::person&&)

Hard to understand that error, and it's not something that can be fixed by the library.
It's triggered by the compiler before any template machinery on our side.

Now with such code:

void need_precise_measurement(Milliseconds);

// later on
json j = json::parse(config);
need_precise_measurement(j);

It can fail in two cases:

  1. Someone adds an overload for need_precise_measurement. Compiler error.
  2. Someone changes the type of the argument to Nanoseconds. Runtime error at best.

Implicit conversions

Implicit conversions make the library pleasant to use. Especially when implicitely converting to JSON.
That's because we have a templated non-explicit constructor with lots of constraints to correctly handle user types.
There's also a single type to convert to: nlohmann::json, so everything works.

However, when converting implicitely from json, it's the complete opposite. If the compiler cannot decide which type it should implicit convert the nlohmann::json value to, you get a compiler error.

And it could happen in a codebase months after writing the code, (e.g. when adding an operator= to a type).

In the end the only consistent way to convert from a json value is to call json.get<T>.

Proposed change

I propose to simply and completely remove operator T().

It will obviously break code, so it can only be done in a major version.

Of course this change cannot be adopted right away, so I propose to add a deprecation warning inside operator T(), as we did for basic_json::basic_json(std::istream&) overloads.

Then it would be reasonable to remove it in a future major version.

@nlohmann
Copy link
Owner

@nlohmann nlohmann commented Feb 4, 2018

Thanks for opening a separate issue and separating this from #951.

I think I understand the problem, and this really may be something to tackle with 4.0.0. The question is whether there could be some middle ground, for instance a "whitelist" of supported types for which we would still have an implicit conversion. Or an extension point to let users add their desired implicit conversions. Or ...

@theodelrieu
Copy link
Contributor Author

@theodelrieu theodelrieu commented Feb 4, 2018

First thing I don't like about having a whitelist is that it prevents consistency. Everyone has to remember which types are supported and nobody will.

Secondly, if types that we support get a new operator=: Ka-Boom.

Thirdly, it doesn't solve the second example I've shown with Milliseconds and Nanoseconds.

Now for the extension point, the best way to do that is to define another constructor or operator= to user types.

But if users want to have an extension point for std::string, that's not possible. As I said, compilation fails before we can do anything. It will fail before the extension point mechanism gets called as well.

I don't think we can have a middle-ground, but I might be wrong.

@theodelrieu
Copy link
Contributor Author

@theodelrieu theodelrieu commented Feb 9, 2018

I think the customization point way can work, I'll try to experiment this weekend.

@theodelrieu
Copy link
Contributor Author

@theodelrieu theodelrieu commented Feb 9, 2018

I had some time to experiment before lunch:

#include <iostream>
#include <string>
#include <type_traits>

template <typename T, typename = void> struct custom;

class json;

template <> struct custom<std::string> {
  static std::string convert(json const &j) { return "works"; }
};

/* Uncommenting this will break with: use of overloaded operator '=' is ambiguous
template <> struct custom<const char*> {
  static const char* convert(json const &j) { return "works"; }
};
*/


class json {
public:
  template <typename T, std::enable_if_t<sizeof(custom<T>), int> = 0>
  operator T() {
    return custom<T>::convert(*this);
  }
};

int main(int argc, char const *argv[]) {
  json j;
  std::string s;

  s = j;
  std::cout << s << std::endl;
}

But the problem remains, customization point or not, it does not solve anything...

As for the whitelist, we already have a blacklist inside operator T(), where we explicitely disable const char*, std::initializer_list<...> etc... just to make std::string work.

But it does not make std::vector, nor any user type with custom operator= work.
About a macro that would conditionally disable operator T(), I find that very dangerous. Just imagine a LibA which uses a LibB, both have json as a dependency, but not with the same #define...

If LibB implementation relies on -DJSON_DISABLE_OPERATOR_T=ON, and LibA does not, everything will break.

My conclusion: I fear that the operator T() feature is broken by design.

@stale

This comment was marked as off-topic.

@stale stale bot added the state: stale label Mar 11, 2018
@theodelrieu

This comment was marked as off-topic.

@stale

This comment was marked as off-topic.

@stale stale bot added the state: stale label Apr 14, 2018
@nlohmann nlohmann removed the state: stale label Apr 19, 2018
@match41
Copy link

@match41 match41 commented Apr 24, 2018

I vote for the removal of the implicit converter.

This is the same reason that C++ std::string does not convert to const char* implicitly.

@scinart
Copy link
Contributor

@scinart scinart commented May 8, 2018

I have a case that supports such removal. Having operator ValueType() const makes the following code very confusing.

template <typename T>
void f(T x)
{
    nlohmann::json j = nlohmann::json::array();
    j.push_back(x);
    std::tuple<T> t = j;
}

int main()
{
    f<std::string>("123"); // OK
    f<int>(123);           // throws an exception
    return 0;
}

There is nothing wrong with from_json_tuple, but rather, tuple has this strange behavior:

  template<typename... _UElements, typename
       enable_if<
	  _TMC<_UElements...>::template
                _MoveConstructibleTuple<_UElements...>()
              && _TMC<_UElements...>::template
                _ImplicitlyMoveConvertibleTuple<_UElements...>()
              && (sizeof...(_Elements) >= 1),
    bool>::type=true>
    constexpr tuple(_UElements&&... __elements)
    : _Inherited(std::forward<_UElements>(__elements)...) { }

  template<typename... _UElements, typename
    enable_if<
	  _TMC<_UElements...>::template
                _MoveConstructibleTuple<_UElements...>()
              && !_TMC<_UElements...>::template
                _ImplicitlyMoveConvertibleTuple<_UElements...>()
              && (sizeof...(_Elements) >= 1),
    bool>::type=false>
    explicit constexpr tuple(_UElements&&... __elements)
: _Inherited(std::forward<_UElements>(__elements)...) {	}

So when it is int, it requires an json->int conversion by using the implicit tuple constructor.

Perhaps it would be something like json_cast<T>(...) that does the explicit conversion, and throws exception on cast error?

@stale

This comment was marked as off-topic.

@stale stale bot added the state: stale label Jun 7, 2018
@nlohmann nlohmann removed the state: stale label Jun 7, 2018
@maddanio
Copy link

@maddanio maddanio commented Jul 3, 2018

if anyone cares: i am all for removing the implicit conversion. implicit conversions are a very common root of evil and should be avoided as much as possible.

@nlohmann
Copy link
Owner

@nlohmann nlohmann commented Jul 4, 2018

I still think that implicit conversion is one of the central features of the library that allows to write code in a simple fashion. I do not like the way other libraries force the user to always spell out which type goes in and out. I do understand the problems that arise with implicit conversions (and we do have them as well!), but I am not yet convinced that the dangers outweigh the benefits. But since the removal would be a breaking change, we have time for a longer discussion.

@maddanio
Copy link

@maddanio maddanio commented Jul 4, 2018

Hmm. Not convinced. You can always save mentioning the type by means. For example:

auto a = mytype{j}.
template<typename T> from_json(const json& k, T&& t);
@stale
Copy link

@stale stale bot commented Aug 3, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the state: stale label Aug 3, 2018
@maddanio
Copy link

@maddanio maddanio commented Aug 4, 2018

yikes. this just bit me in the behind. having operator T like that really opens very...interesting cans of worms.
For my approach to error reporting i extended the constructors of basic_json with an additional source_location element. but now they become ambiguous once on the (size_t, json) constructor, because json converts to anything...including source location...even though i never implemented such a conversion...

@stale stale bot removed the state: stale label Aug 4, 2018
@RosettaCoder
Copy link

@RosettaCoder RosettaCoder commented Apr 11, 2019

I understand and find 0,1,2 all perfectly fine. I just violently oppose step 3,
especially when it comes to basic built-in types and cstdint etc.

Your comment above is an interesting idea but doesn't handle built-in types very well, I have no interest in repeating the type all over the place, it violates the DRY principle. Sure it's fine when declaring a new auto variable, but it's a nuisance when the variable already exists.

Please good sir, kindly allow me to continue shooting myself in the foot in the time honored tradition of C/C++.

@maddanio
Copy link

@maddanio maddanio commented Apr 11, 2019

I get what you mean. I think a define may be a bad idea though. Macros have a very weird scope. Something like a using namespace or such would be much cleaner imho

@maddanio
Copy link

@maddanio maddanio commented Apr 11, 2019

Interesting problem space. The question keeps coming up if c++ can be as easy to use as Java Script, as safe as rust, as expressive as Haskell, it’s difficult. But macros are rarely a good way to get there. You will end up with a race who defined it first before including the header. And if you win you may end up breaking included template implementations.

@theodelrieu
Copy link
Contributor Author

@theodelrieu theodelrieu commented Apr 12, 2019

I have no interest in repeating the type all over the place, it violates the DRY principle.

About the DRY principle: I only convert high-level types to/from json,the low-level (built-in) types are only used in from_json/to_json functions.
Thus, I never keep the JSON object alive very long in my code, and I only call get once.
Could you provide an example where you have to repeat yourself? It seems to me this code could be put in a from_json function.

Sure it's fine when declaring a new auto variable, but it's a nuisance when the variable already exists.

In this case you can use j.get_to(var).

Also, do not forget the main reason to remove the non-explicit conversion operator: it is broken.
The issues linked here, the comments are just a glimpse of what could happen to users.

I'd like to remind everyone that, in order to make the conversion to std::string "work", we have the following SFINAE trickery (just a part of it):

   not std::is_same<ValueType, typename string_t::value_type>::value and
#if defined(JSON_HAS_CPP_17) && (defined(__GNUC__) || (defined(_MSC_VER) and _MSC_VER <= 1914))
    and not std::is_same<ValueType, typename std::string_view>::value
#endif

The first line simply means one cannot convert json to char implicitly, then we must disable std::string_view, otherwise the conversion to std::string will fail to find an appropriate constructor.
Who knows what could break in future versions of the standard (not to mention user types), this is a maintenance bio-hazard.

The worst part is the interoperability with other libraries.

Let's take fmt as an example, I wrote a custom formatter for json objects, but implicit conversions triggered ambiguous overloads in fmt's internals...

The "fix" was to add the following in my own code:

namespace nlohmann
{
template <typename C>
fmt::internal::init<C, nlohmann::json const&, fmt::internal::custom_type>
make_value(nlohmann::json const& j)
{
  return {j};
}
}

It was painful to find, understand, and solve. It's as brittle as it can be, and will probably break at every fmt's new release.

Understanding why it happens is hard, I'm lucky to know the internals of the library well, but for someone that is new to it, and/or not very versed in template gory details it can just be a doorway to madness.

@RosettaCoder
Copy link

@RosettaCoder RosettaCoder commented Apr 14, 2019

All very solid arguments, why this feature should not be enabled by default and I think most users agree that it is a sensible way forwards. However I see no arguments why this feature could not remain opt-in, use at your own risk only, with disclaimers that it can fail at any moment.

One use-case for keeping the JSON object alive is to make small changes to it and then convert it to a binary format for sending across a socket.

get_to is a good addition, I use it when I have to, but it reads backwards compared to an assignment.
int x;
j.get_to(x);

jaredgrubb's proxy idea is acceptable to me.(but imho it has to be a member, not a free function) I don't know if it improves anything from your perspective though.

int x;
x = j.unwrap();

@theodelrieu
Copy link
Contributor Author

@theodelrieu theodelrieu commented Apr 15, 2019

However I see no arguments why this feature could not remain opt-in

Here is an additional one: Your app uses json and libA, which also uses json but does not opt-in to implicit conversion, whereas your app does. This is an ODR-violation of the conversion operator.

That's undefined behavior. But let's say libA is header-only, now you are very likely to get compiler errors since the code required to convert from a json value is incompatible depending on if you opt-in or not.

// only works with implicit conversions
std::string s = j;
std::vector<int> v = j;

// only works with explicit conversions
std::vector<int> v(j);

// works with both because of hacks in the library's code
std::string s(j);

// does not work with any of them (due to the same hacks)
char c = j;
char c(j);

I'd like to emphasize that this is a design bug, not a feature.

If the sole reason to keep it available to users is inconvenience linked to the additional verbosity, I do not think it is a meaningful one.

Could you please post a code example which would be hindered by explicit conversions?

@RosettaCoder
Copy link

@RosettaCoder RosettaCoder commented Apr 15, 2019

First of all this has nothing to do with verbosity.

// violates DRY
bool b = json.get<bool>();

// violates traditional assignent order (hampers readability)
json.get_to(b);

// verbose but perfectly fine API
b = json.implicitly_convert_to_anything;

ODR-violation

  1. Should be possible to solve with @maddanio's namespace idea instead of abusing macros.
  2. As an other alternative to opt-in, one could allow whitelisting of all basic types
    I can't imagine 'int' ever failing, not even in c++57.

Header Only

  1. Compilation errors are your friend, if you misuse the library and get a compilation errors,
    all is as it should.

The reason I chose this library is because it has archived perfection,
no DRY violation or other similar issues like I explained above.

Think of it as a bug if you will, but some companies(MS) actually support their old bugs for decades. If implicit conversion had not existed, I likely would have chosen some other library, maybe a faster one, maybe one that uses less memory, but NO. I valued this one perfect feature above all and now you want to end-of-life this killer feature, if you do, I simply can't upgrade.

Maybe that is one solution, similar to the python2/3 split, keep the old branch alive almost forever only backporting security fixes, fine by me as a mere user, I'm happy with the current feature set. If I was @nlohmann I wouldn't be thrilled by maintaining two branches though. I think the other alternatives are much easier...

@ibkevg
Copy link

@ibkevg ibkevg commented Apr 15, 2019

For another data point I'll just add that we too chose this library in part for the syntactic elegance reasons cited by RosettaCoder as well as cleanliness of it's C++ language integration. However, after wrestling with problems caused by implicit conversions we went with an "explicit everywhere" policy. It's not that misuse of the library causes these problems, it's more that normal use of C++ and the normal evolution of C++ code can lead to these problems. theodelrieu's original post gives good examples of situations that violate the principle of least surprise.

@gracicot
Copy link
Contributor

@gracicot gracicot commented Apr 15, 2019

@RosettaCoder there is multiple problems with both the current API and the solution to allow implicit conversion to some types.

  1. To be easy to learn, consistent and easy to understand, the API should be uniform for any supported types. This should be true for int, bool, as well as std::string, std::vector and user provided types like ns::person. We should not have special cases that help one use case or a small subset of users, since it reduces usability for all (every special case is something that induce cognitive overhead). Right now, the current API cannot do that. The very first post in this issue highlight the problem. It cannot work in a lot of cases. User defined constructors and assignement operator are widly use, yet we break on it.
  2. The library should not break when both user code and this library's code don't change. Sadly, it can happen quite easily. This could hinder the user from upgrading other libraries, language revision, or even compiler minor version when a standard library issue is fixed. I had trouble with this when adding support for std::string_view.

This is unfortunately far from perfect as an API. C++ don't provide a good enough implicit conversion mechanism to be able to fix these two points.

The price to pay is a slight change in syntax for assignement and conversion to be explicit. In some really rare cases, a user must wrap the json value to continue to reflect on constructors and parameter types. Yet this is not the intended use case of this library. I encouraged to delay the deprecation into the next version for this purpose.

It seem not everyone is yet convinced to deprecate the implicit conversions. I'm not sure I can add anything else or provide better arguments than this.

About that:

Compilation errors are your friend, if you misuse the library and get a compilation errors,
all is as it should.

I completely agree with this. But the contrary is also true: If you use the library as it should, then it should compile correctly. Right now it can happen quite often that it won't work and mutiple bug report have been opened because supposedly valid use case would suddently break. This hurt users of the library, as well as the maintainers.

If someone truely need implicit conversion everywhere as of right now, one can create a wrapper around the library json type:

struct implicit_json : nlohmann::json {
    using nlohmann::json::json;

    template<typename T, std::enable_if_t<!std::is_base_of_v<nlohmann::json, T> && std::is_constructible_v<nlohmann::json, T> && !std::is_assignable_v<nlohmann::json, T>, int> = 0>
    my_json& operator=(T&& value) {
        static_cast<nlohmann::json&>(*this) = nlohmann::json(std::forward<T>(value));
        return *this;
    }

    template<typename T, std::enable_if_t<!std::is_base_of_v<nlohmann::json, T> && std::is_assignable_v<nlohmann::json, T>, int> = 0>
    my_json& operator=(T&& value) {
        static_cast<nlohmann::json&>(*this) = std::forward<T>(value);
        return *this;
    }

    template<typename T, typename = decltype(std::declval<nlohmann::json>().get<T>())>
    operator T () const {
        return this->get<T>();
    }
};

Live at Compiler Explorer

This in fact can be shipped completely separately from the library.

Edit: Of course, a thinner wrapper like implicit(json) I wrote above would be preferred to that one in my opinion.

@maddanio
Copy link

@maddanio maddanio commented Apr 15, 2019

Hmm, so how about something like unwrap? Maybe the best middle ground?

@maddanio
Copy link

@maddanio maddanio commented Apr 15, 2019

Should be very simple to implement. Basically keep a const ref and expose an implicit conversion operator that calls the explicit one of the wrapped json object. mytype foo = unwrap(j). Or maybe better name? implicit? auto_cast, ...?

@RosettaCoder
Copy link

@RosettaCoder RosettaCoder commented Apr 15, 2019

@gracicot
Great, finally feels like we are getting somewhere, thanks!

We probably have to override items() also though?

@gracicot
Copy link
Contributor

@gracicot gracicot commented Apr 15, 2019

@RosettaCoder yes, this small code sample just show the idea is at least possible, I did not tried to make it complete.

@nlohmann
Copy link
Owner

@nlohmann nlohmann commented Apr 16, 2019

I like where this discussion is going! Unfortunately, I have little time in the moment, so I won’t be able to participate.

@maddanio
Copy link

@maddanio maddanio commented Apr 30, 2019

uh, this just bit me again hard. try this:
std::optional<nlohmann::json> foo{nlohmann::json{}};
instead of just constructing an optional with a value it will go trying to deserialize the json...

@theodelrieu
Copy link
Contributor Author

@theodelrieu theodelrieu commented May 2, 2019

How so? The code works fine on my side.

@maddanio
Copy link

@maddanio maddanio commented May 2, 2019

Hmm, then there must have been something special going on on my side.

@maddanio
Copy link

@maddanio maddanio commented May 2, 2019

Btw, are there any plans how to proceed here? Will operator T() be deprecated? Any chance the magic constructors for object/array could then also be deprecated?

@maddanio
Copy link

@maddanio maddanio commented May 2, 2019

Oh, and the templated implicit constructor (i.e. other direction)

@gracicot
Copy link
Contributor

@gracicot gracicot commented May 2, 2019

@maddanio The first step is to add a compile time switch to make the conversion operator explicitm. It hasn't been merged yet but I hope it will be before the next version. I don't know if there is any plan to change the constructor. I would like to, since I would be able to drop my wrapper.

@gracicot
Copy link
Contributor

@gracicot gracicot commented Jun 30, 2019

I recently updated my project to find out that every implicit conversion to std::string_view didn't compile anymore. Even if most place was using my explicit conversion wrapper, there is still places where one or two slipped past my attention.

I would really appreciate if a switch for explicit conversion was merged, since they still enable a somewhat nice syntax without being risky to use. Breakage are easily introduced with bug fixes, and it did happened for me in non major versions.

@chemphys
Copy link

@chemphys chemphys commented Oct 2, 2019

I don't know if this is gonna be useful or has been mentioned anywhere else, but there is a bypass for this issue (not elegant, but fixes temporary the issue)

Let's say we have something like this:

// Try to get box
    // Default: no box (empty vector)
    std::vector<double> box;
    try {
        box= j["MBX"]["box"];
        throw 1;
    } catch(int i) {
        box.clear();
    }
    box_ = box;

As mentioned in the issue description, it won't work due to the T() problem. However, if it is needed to keep the structure, an ugly fix is to do:

// Try to get box
    // Default: no box (empty vector)
    std::vector<double> box;
    try {
        std::vector<double> box2 = j["MBX"]["box"];
        box = box2;
        throw 1;
    } catch(int i) {
        box.clear();
    }
    box_ = box;

I am not sure if this will be helpful to anybody, but this makes it work keeping the scopes in the right place. If is irrelevant, please feel free to delete the message.

@abrownsword
Copy link

@abrownsword abrownsword commented May 6, 2020

I just got bitten by this -- a code change caused a json variable to be implicitly converted to a bool, and it would throw an exception (which was caught and handled a bit too silently). I didn't actually expect json -> bool to be implicit and had assumed the compiler would show me all the places where I had to update the code when making the code change. This is bad, and I vote for operatorT() to go the way of the dinosaur.

@Lord-Kamina
Copy link
Contributor

@Lord-Kamina Lord-Kamina commented May 26, 2020

I came for the conversions; I stayed for the drama.

(Just began using nlohmann-json on a project; subscribing to thread)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
4.0.0
Candidate
Linked pull requests

Successfully merging a pull request may close this issue.

You can’t perform that action at this time.