[issue 15229] Range-ified BigInt's Ctor to Accept Ranges of Characters #3876
c7c92d7 to
4f1d199
Compare
|
Autotester is failing with some pretty nasty errors. Implementation bug, perhaps? |
|
@quickfur See the last two bullet points |
|
Does it help if you split the unittest into smaller chunks? Or is the code interdependent? |
|
I will check that when I get the chance, but it makes me worried and I would like to know the root cause if possible. |
|
I wasn't suggesting that as a solution, but more a way to narrow down what might be going wrong. |
|
This assert is the culprit: https://github.com/JackStouffer/phobos/blob/4f1d199538bbeaf47d14e700ae5915b3f9de8f69/std/bigint.d#L1151 Commenting it out makes all tests pass. The only reason I can think of for it throwing an out of memory error is the |
|
Thanks for the tip, and sorry for the delay.... I managed to reproduce the problem locally; I should be able to narrow it down to learn why it's happening. Definitely something fishy is going on with the code here... |
| return true; | ||
| } | ||
|
|
||
| auto len = (s.save.walkLength - first_non_zero + 15) / 4; |
This line is the cause of your bug. In the while loop just before this, you already consumed all the leading zeroes, so walkLength no longer includes the zeroes in the count. However, you still subtract first_non_zero. When the latter happens to be larger than the number of remaining digits, a size_t underflow happens and wraps around to a very large value (to be precise, 4611686018427387903), which the following line then tries to allocate a buffer for. So the out-of-memory error should not be surprising. :-)
The fix is trivial: there is no need to subtract first_non_zero at all, since you have already discarded those zeroes in the preceding while loop. Remove that, and everything should work as expected.
(Aren't you glad there's a unittest that just happens to have a test case that triggered this bug? It would have slipped in unnoticed otherwise...)
P.S. In fact, you probably don't even need to compute first_non_zero, since the zeroes are already consumed by the loop, and you can just work with the remaining digits directly.
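To make the wraparound concrete, here is a minimal sketch of the arithmetic only. The helper names `buggyLen` and `fixedLen` are hypothetical, and `ulong` stands in for `size_t` on a 64-bit target; the inputs (17 stripped zeroes, 1 remaining digit) are just one combination that reproduces the value quoted above.

```d
// Hypothetical helpers that mirror only the length arithmetic under review.
// ulong is used in place of size_t so the result is the same on every target.
ulong buggyLen(ulong remainingDigits, ulong firstNonZero)
{
    // remainingDigits no longer counts the leading zeroes, so subtracting
    // firstNonZero again can go below zero and wrap around.
    return (remainingDigits - firstNonZero + 15) / 4;
}

ulong fixedLen(ulong remainingDigits)
{
    // The zeroes were already consumed; nothing is left to subtract.
    return (remainingDigits + 15) / 4;
}

unittest
{
    // 1 - 17 + 15 wraps to 2^64 - 1, and (2^64 - 1) / 4 is the huge length.
    assert(buggyLen(1, 17) == 4611686018427387903UL);
    assert(fixedLen(1) == 4);
}
```

The block can be exercised with `dmd -unittest -main`.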
8792fd1 to
f02e87f
Compare
|
Thank you for finding the root cause of the bug. I have added a unit test for the decimal walk length bug; it used to fail but passes with the latest changes.
Thanks Don Clugston for your thorough unit tests! |
|
There is still the forward vs. bidirectional range issue I raised in the original post. Here is a good solution IMO:
Opinions? |
|
|
||
| uint hi = biguintFromDecimal(tmp, s[firstNonZero..$]); | ||
| auto predict_length = (18 * 2 + 2 * s.save.walkLength) / 19; |
Why are leading zeroes no longer stripped?
fixed
|
Hmm. In theory, there is no need to require bidirectional ranges for Once you know the actual number of hex digits, say let's call it |
|
P.S. in fact, counting actual hex digits (excluding |
|
Oh actually, just noticed that by the time you get to |
|
Could you elaborate, or direct me to a resource that does already, on what you mean by "hex digits per word"? |
|
Basically, To store a converted digit into the i'th digit of the j'th word, you just do So basically, the initial scan of the forward range will determine how many hex digits you have, then you compute which digit of which word corresponds with the first digit, and then scan the forward range the 2nd time, converting each digit and writing them into the result. |
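The two-pass idea described above could look roughly like the following. This is a sketch under assumptions, not the code from the PR: the inline expressions in the comment did not survive, so `hexToWords` is a hypothetical helper, the result is assumed to be little-endian 32-bit words (8 hex digits per word), and the input is assumed to have had its sign, `0x` prefix and underscores removed already.

```d
import std.conv : ConvException;
import std.range;

// Hypothetical helper: converts hex digits from any forward range into
// little-endian 32-bit words without needing a bidirectional range.
uint[] hexToWords(R)(R digits)
if (isForwardRange!R)
{
    enum digitsPerWord = 2 * uint.sizeof; // 8 hex digits fit in one uint

    // Pass 1: count the digits on a saved copy of the range.
    immutable n = digits.save.walkLength;
    auto words = new uint[]((n + digitsPerWord - 1) / digitsPerWord);

    // Pass 2: the first digit is the most significant one, so it has
    // position n - 1 counted from the least significant end.
    size_t pos = n;
    foreach (dchar c; digits)
    {
        --pos;
        uint v;
        if (c >= '0' && c <= '9') v = c - '0';
        else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
        else throw new ConvException("invalid digit");
        // Word index and digit-within-word both follow from the position.
        words[pos / digitsPerWord] |= v << (4 * (pos % digitsPerWord));
    }
    return words;
}

unittest
{
    import std.algorithm : filter;
    assert(hexToWords("FF") == [0xFFu]);
    assert(hexToWords("123456789") == [0x2345_6789u, 0x1u]);
    // A filtered string is a forward range but not an array.
    assert(hexToWords("12_34".filter!(c => c != '_')) == [0x1234u]);
}
```

The cost is one extra traversal of the input, which is the trade-off for dropping the bidirectional requirement.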
58460d4 to
a77d200
Compare
|
@quickfur I don't have the time anymore to enhance this, but I think this is definitely an improvement over the current state of the code, so I think this should be pulled as is. If this gets pulled, I will file an enhancement request on the issue tracker for the further improvements that should be made to this. |
|
ping random Phobos reviewers @andralex @yebblies @burner @DmitryOlshansky @schveiguy |
| // check for signs and if the string is a hex value | ||
| if (len > 1) | ||
| { | ||
| for (i = 0; i < 2; i++) |
I find this logic needlessly convoluted. Sign checking doesn't need to be interwoven with checking for 0x. It should be a very simple, 2-step process:
    // Check sign
    if (range.front == '+')
        range.popFront(); // skip '+'
    else if (range.front == '-')
    {
        neg = true;
        range.popFront();
    }
    // Check for hex prefix
    if (range.save.startsWith("0x"))
        // is hex
    else
        // is decimal
Furthermore, filtering out _ seems to be done prematurely. Do we really support literals of the form _-_0_x_1234_567A?? I would think that the negative sign and the 0x prefix should not allow any intervening _'s. So really, sign checking and hex prefix checking should be done first, and then _ gets filtered out.
The docs do state "Underscores are permitted in any location", but that was a lie anyway, so I will update that.
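As a runnable variant of the ordering suggested in this thread (sign first, then the hex prefix, and only then underscore filtering), one possible sketch is below. `Prefix` and `stripPrefix` are hypothetical names for illustration and are not the code that was merged; only a lowercase `0x` prefix is handled, as in the reviewer's snippet.

```d
import std.algorithm : equal, filter, skipOver;
import std.range;

// Hypothetical result type for illustration.
struct Prefix
{
    bool neg;   // true if a leading '-' was seen
    bool isHex; // true if "0x" directly followed the optional sign
}

// Consumes an optional sign and an optional "0x" from the front of the range.
Prefix stripPrefix(R)(ref R range)
if (isForwardRange!R)
{
    Prefix p;
    // Step 1: optional sign.
    if (!range.empty && range.front == '+')
        range.popFront();
    else if (!range.empty && range.front == '-')
    {
        p.neg = true;
        range.popFront();
    }
    // Step 2: hex prefix. skipOver advances the range only on a match.
    p.isHex = range.skipOver("0x");
    return p;
}

unittest
{
    string s = "-0x12_34";
    auto p = stripPrefix(s);
    assert(p.neg && p.isHex);
    // Step 3: underscores are dropped only after sign and prefix are gone.
    assert(s.filter!(c => c != '_').equal("1234"));
}
```

With this ordering, an input such as `_-_0_x_12` is not recognized as negative or hex, because the sign and prefix checks see the leading underscore first.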
|
@quickfur fixed |
|
Oh no, autotester is failing with "invalid digit" errors from biguintcore. :-( |
|
Works on my machine. Not quite sure what to do here. |
|
1731 contains nothing that would cause an exception: WTF? |
|
|
@quickfur @schveiguy Added back in supposedly redundant check, it's green now. |
|
64 bit tester now running out of memory. CC @braddr |
|
Nothing wrong with the tester host itself, as far as I can tell. Something's pushed the memory usage up over the threshold or close enough to it to fail periodically. |
|
Looks good now. Anything else need fixing? |
|
Ping @quickfur @schveiguy |
|
Ping. This has been sitting here ready to go for a while now. |
| import std.exception : enforce; | ||
| import std.conv : ConvException; | ||
|
|
||
| enforce!ConvException(!s.empty, "Can't initialize BigInt with an empty range"); |
Can't we use assert for such simple checks?
|
@wilzbach Fixed |
| // return true if OK; false if erroneous characters found | ||
| // FIXME: actually throws `ConvException` on error. | ||
| bool fromDecimalString(const(char)[] s) pure @trusted | ||
| /// return true if OK; false if erroneous characters found |
Should not be documented?
|
@DmitryOlshansky Fixed |
|
Auto-merge toggled on |
|
Can we change the title so that the dlang bot finds this? :) |
| return true; | ||
| } | ||
|
|
||
| auto len = (s.save.walkLength + 15) / 4; |
May I ask why we have the magic +15? If I read the below correctly (s.length/ 8) + 1 should work as we use at most 1/8 of tmp per char.
The usual trick of 16-1, so it rounds up correctly when divided by 16. However it is then multiplied by 4 so net total is /4
multiplied by 4 so net total is /4
Imho it's 8 - see:
++partcount;
if (partcount == 8)
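To compare the two sizing formulas discussed here, the following sketch assumes (as the `partcount == 8` check suggests) that each 32-bit word of the temporary buffer holds 8 hex digits. All three helper names are hypothetical and only the arithmetic is modelled.

```d
// Hypothetical helpers; n is the number of hex digits to convert.
size_t lenAsWritten(size_t n) { return (n + 15) / 4; } // formula in the diff
size_t lenSuggested(size_t n) { return n / 8 + 1; }    // reviewer's proposal
size_t wordsNeeded(size_t n)  { return (n + 7) / 8; }  // exact ceil(n / 8)

unittest
{
    foreach (size_t n; 0 .. 1000)
    {
        // Both formulas allocate enough words; the one in the diff
        // never allocates less than the suggested one.
        assert(wordsNeeded(n) <= lenSuggested(n));
        assert(lenSuggested(n) <= lenAsWritten(n));
    }
    // For 64 digits: 8 words needed, 9 suggested, 19 as written.
    assert(wordsNeeded(64) == 8);
    assert(lenSuggested(64) == 9);
    assert(lenAsWritten(64) == 19);
}
```

Under that assumption the formula in the diff is safe but allocates roughly twice what is needed, which is consistent with the reviewer's question.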
|
Thanks! |
|
This has caused a regression. |
- This contains the commit from "Added ReferenceBidirectionalRange to std.internal.test.dummyrange" #3874, which this PR requires and which was separated out for convenience
- This is technically a breaking change because the ctor no longer errors on parsing problems, but throws