Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use more stdint.h types and document disabling stdint.h #149

Closed
wants to merge 2 commits into from
Closed

Use more stdint.h types and document disabling stdint.h #149

wants to merge 2 commits into from

Conversation

spl
Copy link
Contributor

@spl spl commented Sep 16, 2014

Summary

  • Replaces Use standard 32-bit typedefs #145
  • Rename RAPIDJSON_NO_INT64DEFINE to RAPIDJSON_NO_STDINT and revise documentation to address @pah's concern about unconditionally using int32_t/uint32_t.

As I discussed in #145, I think this change improves several things:

  1. Better type documentation
  2. The name of the macro RAPIDJSON_NO_STDINT more accurately describes its purpose than RAPIDJSON_NO_INT64DEFINE did. It disables the #include <stdint.h> and doesn't only disable the definition of int64_t/uint64_t. Of course, this is more significant with the usage of int32_t/uint32_t.

@pah had the concern that the type definition int32_t implied the implementation type would be restricted to 32 bits on a 64-bit system. I'm certain all 64-bit systems will use the most efficient implementation type for int32_t. To check, I looked at stdint.h on Mac OS X and Linux as well as msinttypes/stdint.h: all of them use int.

@miloyip
Copy link
Collaborator

miloyip commented Sep 17, 2014

I am still unsure whether it is a good choice to make this change. It can be openly discussed.

But if it is to be changed, there will be a lot of places to be changed, including GenericDocument (some interfaces and member variables), Writer, PrettyWriter, SAX examples, etc.

@spl
Copy link
Contributor Author

spl commented Sep 17, 2014

I'm happy to discuss it.

Thanks for pointing out the other places. I'll do those within the next few days, as I have time.

@pah
Copy link
Contributor

pah commented Sep 17, 2014

@pah had the concern that the type definition int32_t implied the implementation type would be restricted to 32 bits on a 64-bit system. I'm certain all 64-bit systems will use the most efficient implementation type for int32_t. To check, I looked at stdint.h on Mac OS X and Linux as well as msinttypes/stdint.h: all of them use int.

First of all, I agree that RapidJSON currently assumes int to be exactly 32 bits, see e.g. GenericValue::Number::I. But as we currently don't support architectures where this is not the case, this is not a real problem for now. My only concern would be in that area.

I suggested to use INT_MAX and sizeof(int) in the corresponding places to avoid this implicit assumption. Or simply add a RAPIDJSON_STATIC_ASSERT(sizeof(int)==4) somewhere for now.

I still don't see the real benefit in using int32_t. It is harder to type and provides no added safety on all currently relevant platforms. IMHO, int and unsigned are perfectly reasonable types to use.

@pah
Copy link
Contributor

pah commented Sep 17, 2014

Oh, and I should add, that this change may even lead to ambiguity problems, depending on the actual definition of int32_t. int32_t is usually a typedef to int (in which case it makes no difference anyways). But on some platforms/implementations (QNX, anyone?), it may well be a typedef to long, which would then lead to ambiguous (e.g. GenericValue constructor) overloads when called with a plain integer literal (which is of type int, not int32_t):

void foo(bool) {}
void foo(long) {}
// foo(1) -> foo(int) -> ambiguous!

Therefore, I would rather not change the current types. Still, adding a static assert to catch failing assumptions on the int size and layout sounds like a good idea.

@spl
Copy link
Contributor Author

spl commented Sep 17, 2014

@pah I will try to understand and address your concerns.

  1. int32_t is harder to type than int.

    I don't see how this is a valid issue. It's 1 character shorter than unsigned: 7 vs. 8. You are willing to type int64_t but not int32_t? This seems like a relatively weak argument.

  2. int32_t provides no added type safety over int.

    But you mention QNX which uses long for fast 32-bit integers. So shouldn't the implementation use long instead of int and have the compiler check that?

    Another important reason for using it is type documentation. int32_t shows the expected bounds of the type, whereas int doesn't indicate any bounds other than the system's. Documentation in code is better than documentation in a manual or comments.

  3. int32_t may lead to ambiguous overloading when int32_t is long and the user provides an int literal (e.g. to the GenericValue constructor).

    1. I think the obvious thing to do is cast the int to int32_t in that case. It also makes the user realize and check that they are actually using 32-bit values. This may be a minor annoyance, but I don't see it as a big problem.

    2. Looking at the QNX link, they say that

      The types each specify a signed integer type that supports the fastest operations among those whose representation has at least eight, 16, 32, or 64 bits, respectively.

      Wouldn't you want to use the fastest 32-bit type, even if it is not int?

@miloyip
Copy link
Collaborator

miloyip commented Sep 18, 2014

In Google's C++ style guide, there are some suggestions which we can discuss and consider.

<stdint.h> defines types like int16_t, uint32_t, int64_t, etc. You should always use those in preference to short, unsigned long long and the like, when you need a guarantee on the size of an integer. Of the C integer types, only int should be used. When appropriate, you are welcome to use standard types like size_t and ptrdiff_t.

We use int very often, for integers we know are not going to be too big, e.g., loop counters. Use plain old int for such things. You should assume that an int is at least 32 bits, but don't assume that it has more than 32 bits. If you need a 64-bit integer type, use int64_t or uint64_t.

For integers we know can be "big", use int64_t.

You should not use the unsigned integer types such as uint32_t, unless there is a valid reason such as representing a bit pattern rather than a number, or you need defined overflow modulo 2^N. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this.

If your code is a container that returns a size, be sure to use a type that will accommodate any possible usage of your container. When in doubt, use a larger type rather than a smaller type.

Use care when converting integer types. Integer conversions and promotions can cause non-intuitive behavior.

@spl
Copy link
Contributor Author

spl commented Sep 18, 2014

@miloyip Good idea to look at other style guides.

I think those guidelines make sense in general. For types that are not user-facing and/or explicitly sized – e.g. in loops – using int and not something like uint32_t makes sense. That's my interpretation.

While we're looking at Google code, protobuf uses stdint.h types.

@spl spl mentioned this pull request Sep 28, 2014
@spl
Copy link
Contributor Author

spl commented Sep 28, 2014

I have updated the code to change the types wherever I thought was reasonable. Your feedback is appreciated.

@spl
Copy link
Contributor Author

spl commented Oct 8, 2014

I'm just curious about the status on this. Do you want to discuss it or think it over more? Is there something I should change?

@pah
Copy link
Contributor

pah commented Oct 8, 2014

I still think that this is an unreasonable change. Potentially having to cast(!) an int (literal) on some platforms is just a horribly bad user experience for no real benefit.

If we want to add something like this, we should probably rather use something like rapidjson::(u)int32 and rapidjson::(u)int64, typedef'd to (u)int(32|64)_t and user-overridable (similar to SizeType). A static assert to enforce the correct sizes as mentioned above should then be added as well.

My 2¢.

@spl
Copy link
Contributor Author

spl commented Oct 8, 2014

Potentially having to cast(!) an int (literal) on some platforms is just a horribly bad user experience for no real benefit.

As I mentioned above, I think the benefits are clear. The usage of stdint.h types, esp. in cases of serialization, is pretty standard.

Also, I noticed itoa.h uses int32_t and uint32_t. Thus, the Writer is already implicitly casting an int to an int32_t and an unsigned to an uint32_t.

Anyway, I guess we'll have to agree to disagree. 😄

If we want to add something like this, we should probably rather use something like rapidjson::(u)int32 and rapidjson::(u)int64, typedef'd to (u)int(32|64)_t and user-overridable (similar to SizeType).

I'm not against this, though I don't really see the need for it. But I'm happy to go along. Do you want each of them to be separately overridable?

A static assert to enforce the correct sizes as mentioned above should then be added as well.

Definitely reasonable.

@miloyip Your thoughts?

@pah
Copy link
Contributor

pah commented Oct 8, 2014

Also, I noticed itoa.h uses int32_t and uint32_t.

Yes, but not as part of an overload set with bool, as does GenericValue.

Anyway, I guess we'll have to agree to disagree. 😄

Agreed. 😺

I'm not against this, though I don't really see the need for it. But I'm happy to go along. Do you want each of them to be separately overridable?

It's probably fine to have a single switch for both (or rather all four) types.

As mentioned before: RapidJSON's assumption on int being exactly 32-bit is an implementation detail, not at all needed for the user-visible API. The currently existing int overload can be seen as a pure convenience (or performance) feature. Featurewise, having only 64-bit types would be sufficient for a JSON library.

And I just don't want to break user code due to the (potentially) changing overload set. At least, we should then provide a way to keep the old behaviour by allowing an override of the new "default" type selections.

But I guess we should wait for @miloyip's position on this topic.

@lichray
Copy link
Contributor

lichray commented Oct 23, 2014

I against the change. The interface should use standard types, and if the implementation relies on bitwise operations we can choose extended integer types. If rapidjson's implementation assume int to be exactly 32bits, then it should be changed to int32_t, but your PR include changes to interface also. Implicit conversion from interface types to implementation types is OK, since the conversion does not actually happen at runtime.

@pah pah mentioned this pull request Feb 14, 2015
@gnzlbg
Copy link

gnzlbg commented Feb 20, 2015

Does using int, long, short, ... mean that a file written with rapidjson in one platform might not be readable in another platform?

@pah
Copy link
Contributor

pah commented Feb 20, 2015

@gnzlbg: JSON is a text format, so it can be safely shared across platforms.

@gnzlbg
Copy link

gnzlbg commented Feb 20, 2015

@pah but can you accidentally write a 64bit int as ASCII number and then try to read it into a 32bit int in another platform because in one platform int = 64bit and in another int = 32bit?

@pah
Copy link
Contributor

pah commented Feb 20, 2015

After parsing a JSON document, you need to validate the inputs anyway. And v.IsInt() would return false, if the numerical value doesn't fit into 32-bit.

@gnzlbg
Copy link

gnzlbg commented Feb 20, 2015

So does that mean that it is not possible to write 64bit ints?

@pah
Copy link
Contributor

pah commented Feb 20, 2015

Of course you can read/write 64-bit integers, there's (Is,Get,Set)Int64 as well.
See the online documentation.

@miloyip
Copy link
Collaborator

miloyip commented Apr 11, 2015

Thank @spl very much for offering this PR. After quite a lot of discussion, I think it is quite a style issue rather than improving in practical situations. As it comes to an official release now, I do not want to potentially break the interface. I think this can be reconsidered for a next major version (which may change the API) in future.

@miloyip miloyip closed this Apr 11, 2015
@spl spl deleted the no-stdint branch June 27, 2017 10:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants