Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support ulong values for Json #1118

Merged
merged 1 commit into from Jun 3, 2015

Conversation

Projects
None yet
7 participants
@IgorStepanov
Copy link
Contributor

commented May 21, 2015

Hello, Vibe.D!

I've written about my problem on forum: http://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/thread/25482/

I use vibe.d on small commercial product, and two days ago I got this trouble.
One REST server (api.shodan.io) sent me a JSON like {..., serial: 17559991181826658461, ...}
I read the official JSON standard: http://json.org/
Unfortunately, it doesn't define bounds for a number value, thus 99999999999999999999999999 is a correct JSON value.
However the most real systems uses 64-bit (as maximum) signed and unsigned integers for describe integral values.
Thus, supporting of ulong is very needed for users.

I've added a new Type value: uint_, which represents ulong values.
I prefer int_ type instead of uint_ when it possible.
If the integral value x <= long.max, I write is to Json as int_ even if it hasn't the sign.

If this way is unacceptable, please suggest another way to parse ulong values.
We may convert it to long or to string.
Get an exception during parsing of correct JSON answer of a REST server is very, because I can't get the REST answer as string (using vibe.web.rest) and parse it manually.
Thanks.

@IgorStepanov IgorStepanov force-pushed the IgorStepanov:ulong-json branch 2 times, most recently from 0f12262 to abfdf9a May 21, 2015

@Geod24

This comment has been minimized.

Copy link
Contributor

commented May 21, 2015

Using ulong just pushes the limit, it doesn't really fix the problem. Me think supporting BigInt could definitely fix this. @s-ludwig ?

@@ -108,6 +110,7 @@ struct Json {
Null = null_, /// Compatibility alias - will be deprecated soon
Bool = bool_, /// Compatibility alias - will be deprecated soon
Int = int_, /// Compatibility alias - will be deprecated soon
Uint = uint_, /// Compatibility alias - will be deprecated soon

This comment has been minimized.

Copy link
@Geod24

Geod24 May 21, 2015

Contributor

I doubt we need to add a compatibility alias for a new feature.

@etcimon

This comment has been minimized.

Copy link
Contributor

commented May 21, 2015

I also vote for a BigInt here

@s-ludwig

This comment has been minimized.

Copy link
Member

commented May 21, 2015

BigInt would also have the advantage of better forward compatibility with std_data_json. So I'd also be in favor of that solution.

@IgorStepanov

This comment has been minimized.

Copy link
Contributor Author

commented May 21, 2015

So, you are all prefer BigInt? Ok. Should it replace the long value, or there should be both types?
In the second case, how should work parsing of int values?

@s-ludwig

This comment has been minimized.

Copy link
Member

commented May 21, 2015

It should work similar to the current ulong solution. So all integral numbers fitting in long should still be stored as a long and anything bigger as a BigInt. The only issue is that the naive implementation of this (convert string to BigInt and the BigInt to long if possible) will probably have a non-trivial performance impact. But for cases where that matters, we could introduce a version (VibeJsonDisableBigInt) that uses the old string -> long parsing mode.

@IgorStepanov

This comment has been minimized.

Copy link
Contributor Author

commented May 21, 2015

Ok, I'll do it.

@IgorStepanov IgorStepanov force-pushed the IgorStepanov:ulong-json branch 3 times, most recently from 759fdcb to a73efec May 21, 2015

@IgorStepanov

This comment has been minimized.

Copy link
Contributor Author

commented May 22, 2015

@s-ludwig done

@@ -92,28 +117,59 @@ struct Json {
}
}

version (VibeUseJsonBigInt) {
long bigIntToLong() inout

This comment has been minimized.

Copy link
@s-ludwig

s-ludwig May 22, 2015

Member

Missing private

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 22, 2015

Author Contributor

fixed

string string_;
Json[string] object;
Json[] array;
}

This comment has been minimized.

Copy link
@s-ludwig

s-ludwig May 22, 2015

Member

We need to keep the workaround unless the issues mentioned in the comment don't apply anymore.

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 23, 2015

Author Contributor

I missed this comment, sorry. And I don't understand it. Do you tell about replacing old void*[2] payload with this union?
We don't know the BigInt size (and we can't be sure, that it will not be changed in future), thus a union, I think, is the safest choise.

This comment has been minimized.

Copy link
@s-ludwig

s-ludwig May 25, 2015

Member

Yes exactly. I did the inverse some time ago to fix those issues.

You can do void*[max((BigInt.sizeof+(void*).sizeof-1)/(void*).sizeof, 2)] m_data; to guarantee correctness. We could try switching back to a union, but I know that I had severe issues with it in the past, so that we should be really sure that it doesn't reintroduce those.

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 25, 2015

Author Contributor

Why void*?
Why not ubyte[max(BigInt.sizeof, (void*).sizeof * 2)] m_data;?

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 25, 2015

Author Contributor

Ok, I've changed it as you say. Is it ok now?
BTW, what troubles with union do you have met?

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 25, 2015

Author Contributor

ldc raises a strange error with void*[] inside biguintcore. See test results.
I'm trying ubyte[], but I think, it will fail too.
Yes. error still raises inside BigUint invariant.
There are nothing strange, this field may contain a garbarge, instead of BigInt.init

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 25, 2015

Author Contributor

Ok, I've added explicit initialization of m_bigint before assigning. Now all works.
Rebased.
Is it ok now?

version (VibeUseJsonBigInt) {
BigInt opAssign(BigInt v)
{
if (v >= long.min && v <= long.max) {

This comment has been minimized.

Copy link
@s-ludwig

s-ludwig May 22, 2015

Member

I think making this choice only in the parser is a better idea. It could be surprising to have code like this trigger an error: void test(BigInt num) { Json j = num; assert(j.type == Json.Type.bigInt); }

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 22, 2015

Author Contributor

fixed

@s-ludwig

This comment has been minimized.

Copy link
Member

commented May 22, 2015

Thanks @IgorStepanov! I'm leaning towards making the version only apply to the parsing stage, but to keep general support for BigInt anyway. The current approach just seems to make it too likely to break foreign code (such as a final switch (json.type)). Thoughts?

Apart from my comments above, seems to look fine. I'll do a more thorough review later.

@IgorStepanov

This comment has been minimized.

Copy link
Contributor Author

commented May 22, 2015

I'm leaning towards making the version only apply to the parsing stage, but to keep general support for BigInt anyway. The current approach just seems to make it too likely to break foreign code (such as a final switch (json.type)). Thoughts?

Ok, I've changed the code. See the second commit.
However, If we talk only about parsing, we may simply check the number length in skipNumber, and construct BigInt only if it needed.
Note, JSON doesn't support non-trivial integer literals like octal, or hex thus parsing of JSON ints is so easy.
We may simply compute result while parsing a number (num *= 10; num += (s[idx]) = '0');) and check the overflow at each step. And no special version.

@marcioapm

This comment has been minimized.

Copy link
Contributor

commented May 22, 2015

Sorry for being late to the party, but what about supporting cent/ucent, instead? :)
At first sight it seems like a more natural extension than BigInt.

I'm not sure what the state of this is in DMD but I remember it was worked on recently for LDC.

Seems like a bad spot to be in for performance sensitive scenarios - deciding between not supporting values >= 2^63 and < 2^64 or using BigInts which require extra heap storage currently, whereas the cent would fit in the Json value's buffer - converting cent back to ulong is also pretty fast.

@IgorStepanov

This comment has been minimized.

Copy link
Contributor Author

commented May 22, 2015

@marcioapm cent isn't supported in dmd and it doesn't solve the general problem: json standard allows an unlimited numbers.
For the floating point numbers a trouble is less essential: all float types allow to represent any real number (-inf, +inf) with some error.
For an integer number json, current parser may reject a correct value which greater then long.max. I is not good.

@IgorStepanov

This comment has been minimized.

Copy link
Contributor Author

commented May 22, 2015

@s-ludwig I've add an overflow check to skipNumber, and removed version (VibeJsonDisableBigInt):
now I build a BigInt value only if number is greater then long.max.

@@ -108,12 +128,12 @@ struct Json {
Null = null_, /// Compatibility alias - will be deprecated soon
Bool = bool_, /// Compatibility alias - will be deprecated soon
Int = int_, /// Compatibility alias - will be deprecated soon
BigInt = bigint, /// Compatibility alias - will be deprecated soon

This comment has been minimized.

Copy link
@schuetzm

schuetzm May 24, 2015

Contributor

This is unnecessary.

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 24, 2015

Author Contributor

Ok, removed.

@IgorStepanov IgorStepanov force-pushed the IgorStepanov:ulong-json branch 2 times, most recently from 19d39e8 to 6496323 May 24, 2015

@Geod24

This comment has been minimized.

Copy link
Contributor

commented May 25, 2015

Could you squash your commits into a single one ?

@@ -17,4 +17,12 @@ void main()
Json parent = obj;
parent.remove("item1");
foreach (i; obj) writeln(i);

version (VibeJsonDisableBigInt) {
} else {

This comment has been minimized.

Copy link
@Geod24

Geod24 May 25, 2015

Contributor

This seem to be the only occurence of the version left.

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 26, 2015

Author Contributor

My bad. I'll remove it.

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 26, 2015

Author Contributor

done

import std.bigint;
auto obj2 = parseJsonString(`{"serial":17559991181826658461}`);
writeln("serial: ", obj2["serial"]);
assert(obj2["serial"] == BigInt(17559991181826658461UL));

This comment has been minimized.

Copy link
@Geod24

Geod24 May 25, 2015

Contributor

Also, it'd be good to have a test that test number outside of a size_t boundary, to ensure that this part is also working.

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 26, 2015

Author Contributor

Ok. Please wait few minutes.

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 26, 2015

Author Contributor

done

@@ -44,7 +44,7 @@ unittest {
public import vibe.data.serialization;

public import std.json : JSONException;
import std.algorithm : equal, min;
import std.algorithm : equal, min, max;

This comment has been minimized.

Copy link
@Geod24

Geod24 May 25, 2015

Contributor

Minor: I would suggest that we reduce, and even remove, any selective import at the top. Until 313 and 314 are fixed, those cause symbols leakage, and upgrading a version of Vibe.d which removes one of the import might break code. And when 313 / 314 are fixed, it'll definitely breaks a lot of code...

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 26, 2015

Author Contributor

Do you suggest to move std.algorithm : max inside struct Json?
I can't move/remove import std.algorithm : equal, min because it irrelevant with this PR.

This comment has been minimized.

Copy link
@Geod24

Geod24 May 26, 2015

Contributor

I'll just make it import std.algorithm and call it a day.

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 26, 2015

Author Contributor

done

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 26, 2015

Author Contributor

/tests/redis/source/app.d used equal from json. I've added an import to app.d

void initBigInt()
{
BigInt init_;
(cast(ubyte*)m_data.ptr)[0 .. BigInt.sizeof] = (cast(ubyte*)&init_)[0 .. BigInt.sizeof];

This comment has been minimized.

Copy link
@Geod24

Geod24 May 26, 2015

Contributor

So that's the LDC bug I guess. Do you have an issue number, or something ? I think anyone reading this in a 6+ months time would be like "WHAT? WHY?".

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 26, 2015

Author Contributor

I think it is not a bug. Simply dmd and gdc uses another BigInt implementation: with inline asm.
The general idea: BigInt.init contains a BigUInt instance, which initially should contatin a [0] array.
When we get BigInt via getDataAs, it may contain the garbarge (null or something worse) (and BigUInt invariant is fault). Thus, we should explicitly write BigInt.init to storage before the first using.

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 26, 2015

Author Contributor

I think anyone reading this in a 6+ months time would be like "WHAT? WHY?".

I should add the comment, yes?

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 26, 2015

Author Contributor

Comments have been added.

@IgorStepanov

This comment has been minimized.

Copy link
Contributor Author

commented May 26, 2015

Could you squash your commits into a single one ?

All commits represents different approaches. If the my last approach will be considered best, I squash commits. Otherwice, it will be much easy to rollback to previous version.

@@ -181,6 +199,8 @@ struct Json {
/// ditto
long opAssign(long v) { m_type = Type.int_; m_int = v; return v; }
/// ditto
BigInt opAssign(BigInt v) { m_type = Type.bigint; initBigInt(); m_bigint = v; return v; }

This comment has been minimized.

Copy link
@Geod24

Geod24 May 26, 2015

Contributor

IDK how BigInt are implemented, but I can see the cycle: 'blitting the default value' -> 'assigning data' -> 'Do it again' as potentially problematic: if BigInt use malloced memory, it's more likely that on assignement, either malloc or free (or both) happen.

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 26, 2015

Author Contributor

IDK how BigInt are implemented, but I can see the cycle

Default value may be only static value, because BigInt is a struct, and BigInt.init should be known at compile time.
Thus everytime, a, b and c in the following example has a identical (bitwise) value. Thus I simply write BigInt.init to the m_data.

    BigInt a;
    BigInt b;
    BigInt c;

This comment has been minimized.

Copy link
@Geod24

Geod24 May 26, 2015

Contributor

What I was refering to was, if someone calls opAssign with a BigInt value multiple time, the finalizers of all but the last 'instance' will not be run.

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 26, 2015

Author Contributor

This case is fixed.
initBigInt() is called only if old m_type in assign is not Type.bigint
However there are another situation:
Json j = BigInt(...);
j = 5;
I may add a special destructor for it, however BigInt doesn't have ~this, I've checked id.
What do you think?

@Geod24

This comment has been minimized.

Copy link
Contributor

commented May 26, 2015

All commits represents different approaches. If the my last approach will be considered best, I squash commits. Otherwice, it will be much easy to rollback to previous version.

That request is just to make the history easier to understand. I often find myself blameing a file, to see when the code was introduced, and why. Having to go through various modifications for the same P.R. just eats a lot of time (same goes for history).

As per the rollback, with git, you can easily checkout a branch that points to your current commit, and squash them together. git never destroys old objects - unless nothing points to it, and they are gc'ed, which is ~ 2 weeks later, but won't happen with another branch -.

@IgorStepanov IgorStepanov force-pushed the IgorStepanov:ulong-json branch 3 times, most recently from e9d66b6 to 02de141 May 26, 2015

BigInt opAssign(BigInt v)
{
if (m_type != Type.bigint)
initBigInt();

This comment has been minimized.

Copy link
@Geod24

Geod24 May 26, 2015

Contributor

IIUC, in the following code:

Json bi;
bi = BigInt("42");
bi = 42L;
bi = BigInt("42");

The bug is still present, isn't it ?

This comment has been minimized.

Copy link
@IgorStepanov

IgorStepanov May 26, 2015

Author Contributor

Ok, I've fixed it. (see the below description)

@s-ludwig

This comment has been minimized.

Copy link
Member

commented May 26, 2015

@IgorStepanov WRT union vs void*[...], the issues I had were memory leaks, but also it broke using std.algorithm.swap (and thus std.sort etc.). void* needs to be used instead of ubyte to tell the GC to look for pointers within the array. If we'd store for example a Json[] within an ubyte[] array, it may simply get collected even if it's still referenced.

@s-ludwig

This comment has been minimized.

Copy link
Member

commented May 26, 2015

@Geod24 The destructor issue is actually a good point. In fact, the same issue is in std_data_json/JSONNumber and nobody has noticed it yet (probably because it looks so natural with the union syntax). Maybe it would be worth to at least add a warning when a type with destructor is used within a union and that union is stored in a struct without destructors (same for postblit and possibly opAssign).

Also, I'll have to find the tagged union implementation for Phobos again that I had started some time ago and submit a PR. Algebraic is (not so) nice for many things, but often it's simply desirable to work with a type enum instead of TypeInfos and one shouldn't have to mess with this kind of low level stuff in regular code.

Igor Stepanov

@IgorStepanov IgorStepanov force-pushed the IgorStepanov:ulong-json branch from 02de141 to 5a3224b May 26, 2015

@IgorStepanov

This comment has been minimized.

Copy link
Contributor Author

commented May 26, 2015

@s-ludwig @Geod24 I've fixed BigInt destruct issue.
Now when we try to call opAssign for BigInt objects, and save non-BigInt to it, finiBigInt is called.
finiBigInt simply creates an automatic BigInt init_ variable, and swaps it with the stored BigInt value.
When finiBigInt is finished, old BigInt value will be destroyed with init_, and m_bigint now contains a BigInt.init value, which can be rewritten with any non-BigInt value without any possible leaks.

@s-ludwig

This comment has been minimized.

Copy link
Member

commented Jun 2, 2015

Looks good to me now. I'll just make some minor, mostly cosmetic, adjustments after merge. Does anyone else still see any issue?

@Geod24

This comment has been minimized.

Copy link
Contributor

commented Jun 2, 2015

Well... I look forward to std.data.json :D

@s-ludwig

This comment has been minimized.

Copy link
Member

commented Jun 3, 2015

Good thing is that this change improves compatibility with std.data.json.

Merging now. Thanks Igor for your work!

s-ludwig added a commit that referenced this pull request Jun 3, 2015

Merge pull request #1118 from IgorStepanov/ulong-json
support ulong values for Json

@s-ludwig s-ludwig merged commit 309ea82 into vibe-d:master Jun 3, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

s-ludwig added a commit that referenced this pull request Jun 3, 2015

Cosmetic changes. See #1118.
- Json.Type.bigint -> "bigInt" (to match "BigInt")
- Move private methods to the bottom of struct Json
- Move type check into finiBigInt and rename to runDestructors

s-ludwig added a commit that referenced this pull request Jun 3, 2015

@s-ludwig

This comment has been minimized.

Copy link
Member

commented Jun 3, 2015

Okay, got the changes in. The only interface change has been to rename Type.bigint to Type.bigInt.

@IgorStepanov

This comment has been minimized.

Copy link
Contributor Author

commented Jun 3, 2015

Okay, got the changes in. The only interface change has been to rename Type.bigint to Type.bigInt.

Thanks!
BTW, when do you plan to release the next vibe.d version (with this patch)? I mean, that when I'll able to fetch it with dub.
I use vibe.d in a commercial product, and if the next release will not be soon, I'll write temporary code to handle this issue until the next version will be released.

@s-ludwig

This comment has been minimized.

Copy link
Member

commented Jun 3, 2015

I could tag an alpha/beta release right away.

@s-ludwig

This comment has been minimized.

Copy link
Member

commented Jun 3, 2015

Okay 0.7.24-beta.1 is tagged now.

@IgorStepanov

This comment has been minimized.

Copy link
Contributor Author

commented Jun 3, 2015

Okay 0.7.24-beta.1 is tagged now.

Thanks a lot! :)

@Geod24 Geod24 referenced this pull request Jun 8, 2015

Merged

fix BigInt initialization #1126

@MartinNowak

This comment has been minimized.

Copy link
Contributor

commented Jul 9, 2015

???
Javascript doesn't have integer types at all, it uses double-precision (64-bit) float for any Number.
math - What is JavaScript's highest integer value that a Number can go to without losing precision? - Stack Overflow
Try alert(01234567890123456789);.

@s-ludwig

This comment has been minimized.

Copy link
Member

commented Jul 10, 2015

I don't think that this really matters, since this is JSON and not JavaScript. JSON doesn't specify anything about the numerical representation, so the implementation is basically free to choose whatever fits. Since large integers are used in practice, it would be quite arbitrary to not support them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.