Use more stdint.h types and document disabling stdint.h #149

spl · 2014-09-16T20:00:20Z

Summary

Replaces Use standard 32-bit typedefs #145
Rename RAPIDJSON_NO_INT64DEFINE to RAPIDJSON_NO_STDINT and revise documentation to address @pah's concern about unconditionally using int32_t/uint32_t.

As I discussed in #145, I think this change improves several things:

Better type documentation
The name of the macro RAPIDJSON_NO_STDINT more accurately describes its purpose than RAPIDJSON_NO_INT64DEFINE did. It disables the #include <stdint.h> and doesn't only disable the definition of int64_t/uint64_t. Of course, this is more significant with the usage of int32_t/uint32_t.

@pah had the concern that the type definition int32_t implied the implementation type would be restricted to 32 bits on a 64-bit system. I'm certain all 64-bit systems will use the most efficient implementation type for int32_t. To check, I looked at stdint.h on Mac OS X and Linux as well as msinttypes/stdint.h: all of them use int.

miloyip · 2014-09-17T01:30:43Z

I am still unsure whether it is a good choice to make this change. It can be openly discussed.

But if it is to be changed, there will be a lot of places to be changed, including GenericDocument (some interfaces and member variables), Writer, PrettyWriter, SAX examples, etc.

spl · 2014-09-17T08:09:41Z

I'm happy to discuss it.

Thanks for pointing out the other places. I'll do those within the next few days, as I have time.

pah · 2014-09-17T11:18:56Z

@pah had the concern that the type definition int32_t implied the implementation type would be restricted to 32 bits on a 64-bit system. I'm certain all 64-bit systems will use the most efficient implementation type for int32_t. To check, I looked at stdint.h on Mac OS X and Linux as well as msinttypes/stdint.h: all of them use int.

First of all, I agree that RapidJSON currently assumes int to be exactly 32 bits, see e.g. GenericValue::Number::I. But as we currently don't support architectures where this is not the case, this is not a real problem for now. My only concern would be in that area.

I suggested to use INT_MAX and sizeof(int) in the corresponding places to avoid this implicit assumption. Or simply add a RAPIDJSON_STATIC_ASSERT(sizeof(int)==4) somewhere for now.

I still don't see the real benefit in using int32_t. It is harder to type and provides no added safety on all currently relevant platforms. IMHO, int and unsigned are perfectly reasonable types to use.

pah · 2014-09-17T15:07:41Z

Oh, and I should add, that this change may even lead to ambiguity problems, depending on the actual definition of int32_t. int32_t is usually a typedef to int (in which case it makes no difference anyways). But on some platforms/implementations (QNX, anyone?), it may well be a typedef to long, which would then lead to ambiguous (e.g. GenericValue constructor) overloads when called with a plain integer literal (which is of type int, not int32_t):

void foo(bool) {}
void foo(long) {}
// foo(1) -> foo(int) -> ambiguous!

Therefore, I would rather not change the current types. Still, adding a static assert to catch failing assumptions on the int size and layout sounds like a good idea.

spl · 2014-09-17T18:55:10Z

@pah I will try to understand and address your concerns.

int32_t is harder to type than int.

I don't see how this is a valid issue. It's 1 character shorter than unsigned: 7 vs. 8. You are willing to type int64_t but not int32_t? This seems like a relatively weak argument.
int32_t provides no added type safety over int.

But you mention QNX which uses long for fast 32-bit integers. So shouldn't the implementation use long instead of int and have the compiler check that?

Another important reason for using it is type documentation. int32_t shows the expected bounds of the type, whereas int doesn't indicate any bounds other than the system's. Documentation in code is better than documentation in a manual or comments.
int32_t may lead to ambiguous overloading when int32_t is long and the user provides an int literal (e.g. to the GenericValue constructor).
1. I think the obvious thing to do is cast the int to int32_t in that case. It also makes the user realize and check that they are actually using 32-bit values. This may be a minor annoyance, but I don't see it as a big problem.
2. Looking at the QNX link, they say that
  
  The types each specify a signed integer type that supports the fastest operations among those whose representation has at least eight, 16, 32, or 64 bits, respectively.
  
  Wouldn't you want to use the fastest 32-bit type, even if it is not int?

miloyip · 2014-09-18T02:08:55Z

In Google's C++ style guide, there are some suggestions which we can discuss and consider.

<stdint.h> defines types like int16_t, uint32_t, int64_t, etc. You should always use those in preference to short, unsigned long long and the like, when you need a guarantee on the size of an integer. Of the C integer types, only int should be used. When appropriate, you are welcome to use standard types like size_t and ptrdiff_t.

We use int very often, for integers we know are not going to be too big, e.g., loop counters. Use plain old int for such things. You should assume that an int is at least 32 bits, but don't assume that it has more than 32 bits. If you need a 64-bit integer type, use int64_t or uint64_t.

For integers we know can be "big", use int64_t.

You should not use the unsigned integer types such as uint32_t, unless there is a valid reason such as representing a bit pattern rather than a number, or you need defined overflow modulo 2^N. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this.

If your code is a container that returns a size, be sure to use a type that will accommodate any possible usage of your container. When in doubt, use a larger type rather than a smaller type.

Use care when converting integer types. Integer conversions and promotions can cause non-intuitive behavior.

spl · 2014-09-18T06:30:44Z

@miloyip Good idea to look at other style guides.

I think those guidelines make sense in general. For types that are not user-facing and/or explicitly sized – e.g. in loops – using int and not something like uint32_t makes sense. That's my interpretation.

While we're looking at Google code, protobuf uses stdint.h types.

And revise the documentation.

spl · 2014-09-28T09:08:03Z

I have updated the code to change the types wherever I thought was reasonable. Your feedback is appreciated.

spl · 2014-10-08T10:22:36Z

I'm just curious about the status on this. Do you want to discuss it or think it over more? Is there something I should change?

pah · 2014-10-08T14:53:22Z

I still think that this is an unreasonable change. Potentially having to cast(!) an int (literal) on some platforms is just a horribly bad user experience for no real benefit.

If we want to add something like this, we should probably rather use something like rapidjson::(u)int32 and rapidjson::(u)int64, typedef'd to (u)int(32|64)_t and user-overridable (similar to SizeType). A static assert to enforce the correct sizes as mentioned above should then be added as well.

My 2¢.

spl · 2014-10-08T19:43:36Z

Potentially having to cast(!) an int (literal) on some platforms is just a horribly bad user experience for no real benefit.

As I mentioned above, I think the benefits are clear. The usage of stdint.h types, esp. in cases of serialization, is pretty standard.

Also, I noticed itoa.h uses int32_t and uint32_t. Thus, the Writer is already implicitly casting an int to an int32_t and an unsigned to an uint32_t.

Anyway, I guess we'll have to agree to disagree. 😄

If we want to add something like this, we should probably rather use something like rapidjson::(u)int32 and rapidjson::(u)int64, typedef'd to (u)int(32|64)_t and user-overridable (similar to SizeType).

I'm not against this, though I don't really see the need for it. But I'm happy to go along. Do you want each of them to be separately overridable?

A static assert to enforce the correct sizes as mentioned above should then be added as well.

Definitely reasonable.

@miloyip Your thoughts?

pah · 2014-10-08T21:29:31Z

Also, I noticed itoa.h uses int32_t and uint32_t.

Yes, but not as part of an overload set with bool, as does GenericValue.

Anyway, I guess we'll have to agree to disagree. 😄

Agreed. 😺

I'm not against this, though I don't really see the need for it. But I'm happy to go along. Do you want each of them to be separately overridable?

It's probably fine to have a single switch for both (or rather all four) types.

As mentioned before: RapidJSON's assumption on int being exactly 32-bit is an implementation detail, not at all needed for the user-visible API. The currently existing int overload can be seen as a pure convenience (or performance) feature. Featurewise, having only 64-bit types would be sufficient for a JSON library.

And I just don't want to break user code due to the (potentially) changing overload set. At least, we should then provide a way to keep the old behaviour by allowing an override of the new "default" type selections.

But I guess we should wait for @miloyip's position on this topic.

lichray · 2014-10-23T20:36:10Z

I against the change. The interface should use standard types, and if the implementation relies on bitwise operations we can choose extended integer types. If rapidjson's implementation assume int to be exactly 32bits, then it should be changed to int32_t, but your PR include changes to interface also. Implicit conversion from interface types to implementation types is OK, since the conversion does not actually happen at runtime.

gnzlbg · 2015-02-20T08:03:41Z

Does using int, long, short, ... mean that a file written with rapidjson in one platform might not be readable in another platform?

pah · 2015-02-20T16:52:27Z

@gnzlbg: JSON is a text format, so it can be safely shared across platforms.

gnzlbg · 2015-02-20T17:01:59Z

@pah but can you accidentally write a 64bit int as ASCII number and then try to read it into a 32bit int in another platform because in one platform int = 64bit and in another int = 32bit?

pah · 2015-02-20T17:27:40Z

After parsing a JSON document, you need to validate the inputs anyway. And v.IsInt() would return false, if the numerical value doesn't fit into 32-bit.

gnzlbg · 2015-02-20T18:03:52Z

So does that mean that it is not possible to write 64bit ints?

pah · 2015-02-20T18:34:19Z

Of course you can read/write 64-bit integers, there's (Is,Get,Set)Int64 as well.
See the online documentation.

miloyip · 2015-04-11T07:05:06Z

Thank @spl very much for offering this PR. After quite a lot of discussion, I think it is quite a style issue rather than improving in practical situations. As it comes to an official release now, I do not want to potentially break the interface. I think this can be reconsidered for a next major version (which may change the API) in future.

spl added 2 commits September 28, 2014 10:29

Rename RAPIDJSON_NO_INT64DEFINE to RAPIDJSON_NO_STDINT

30787d5

And revise the documentation.

Change int->int32_t and unsigned->uint32_t

7a9b616

spl mentioned this pull request Sep 28, 2014

Use standard 32-bit typedefs #145

Closed

pah mentioned this pull request Feb 14, 2015

Compilation fixes #241

Merged

miloyip closed this Apr 11, 2015

spl deleted the no-stdint branch June 27, 2017 10:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use more stdint.h types and document disabling stdint.h #149

Use more stdint.h types and document disabling stdint.h #149

spl commented Sep 16, 2014

miloyip commented Sep 17, 2014

spl commented Sep 17, 2014

pah commented Sep 17, 2014

pah commented Sep 17, 2014

spl commented Sep 17, 2014

miloyip commented Sep 18, 2014

spl commented Sep 18, 2014

spl commented Sep 28, 2014

spl commented Oct 8, 2014

pah commented Oct 8, 2014

spl commented Oct 8, 2014

pah commented Oct 8, 2014

lichray commented Oct 23, 2014

gnzlbg commented Feb 20, 2015

pah commented Feb 20, 2015

gnzlbg commented Feb 20, 2015

pah commented Feb 20, 2015

gnzlbg commented Feb 20, 2015

pah commented Feb 20, 2015

miloyip commented Apr 11, 2015

Use more stdint.h types and document disabling stdint.h #149

Use more stdint.h types and document disabling stdint.h #149

Conversation

spl commented Sep 16, 2014

miloyip commented Sep 17, 2014

spl commented Sep 17, 2014

pah commented Sep 17, 2014

pah commented Sep 17, 2014

spl commented Sep 17, 2014

miloyip commented Sep 18, 2014

spl commented Sep 18, 2014

spl commented Sep 28, 2014

spl commented Oct 8, 2014

pah commented Oct 8, 2014

spl commented Oct 8, 2014

pah commented Oct 8, 2014

lichray commented Oct 23, 2014

gnzlbg commented Feb 20, 2015

pah commented Feb 20, 2015

gnzlbg commented Feb 20, 2015

pah commented Feb 20, 2015

gnzlbg commented Feb 20, 2015

pah commented Feb 20, 2015

miloyip commented Apr 11, 2015