New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #2902: Avoid Array allocation in ieee754tostring implementations #2917
Fix #2902: Avoid Array allocation in ieee754tostring implementations #2917
Conversation
@@ -4,6 +4,8 @@ import java.io.InvalidObjectException | |||
import java.util.Arrays | |||
import scala.util.control.Breaks._ | |||
|
|||
import scala.scalanative.runtime.ieee754tostring.ryu._ | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revealed by this PR, but caused by the changes of this PR.
To separate concerns, I recommend opening an Issue
stating that this class should be private[lang]
.
ref: https://www.oreilly.com/library/view/java-in-a/0596007736/re107.html
That way, the append0
used in this class will not leak into
the wild.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
Actually I saw that the constructor was already private[lang]
, so I stupidly thought that it was totally fine.
And I think things are a bit more complicated. If you look at trimToSize()
, it is expected to be public:
https://docs.oracle.com/javase/8/docs/api/java/lang/StringBuilder.html#trimToSize--
However, the SN StringBuilder doesn't implement an override, it is inherited from AbstractStringBuilder.
https://github.com/scala-native/scala-native/blob/main/javalib/src/main/scala/java/lang/StringBuilder.scala
Also what is sill not clear to me, is how SN actually hide all javalib classes/methods that are not part of the Java API. This magic is unclear to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to create 2 issues about the breakable for the AbstractStringBuilder class
I believe the practice is to make them some form of private
. For example, I have recently
been doing some work in java.net
There is a file there SocketHelpers.scala
which is
definitely not part of the Java API. I would have to check how the class & methods are declared.
In my own recent java.net
work, I have needed private constructors (to pass Array rather
that InetAddress containing Array, sound familiar?). I because of usage patterns I needed
to mark them private[net]
. I would have preferred a much more limited scope, but could
not accomplish that .
Does this make sense?
Let me know if there is a particular area you want me to look at. I do get let out of java.net
once in a great while. I am looking forward to Christmas parole.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that an "abstract class" having a private constructor should be totally fine and we can thus keep it like this.
However, what is most important I think, for consistency and readability, is to flag what methods are "private[lang]" (not part of Java API).
Don't you think?
Anyway, I think a new issue is required for this one, as it goes a bit further than the scope of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, divide & conquer, or at least have fewer concerns beating on us in the gauntlet as we try to run it.
object RyuDouble { | ||
|
||
final val RESULT_STRING_MAX_LENGTH = 24 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this constant is critical it would be nice to know where
it came from or was determined. Prior Ryu art?
Agreed, the other constants below it are not attributed,
but we did not add them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to put a name that is self-explicit.
If you look at the current code, this value is put as is in when doing new scala.Array[Char](24)
without additional information.
So how can we improve that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name is fine, I think.
My concern, as we have both had in other contexts, is "Where did this magic number come from?"
I suggest a one line comment similar to
"// Scala/Java magic number 24 is derived from original RYU C code magic number 25 (no NUL)."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a link/reference for the RYU C code magic numbers?
Maybe I could add this to the code as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good sleuthing!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a run at chasing the origin back to the IEEE754 standard itself but could not easily find a
baseline reference. Nothing derived in Wikipedia either.
Ryu describes itself as "shortest length" so if they figure 24 is the longest length, that is probably
as good as an attribution as we are going to get.
We probably concur and are encouraged by:
- At least we are not leaving a magic number without attribution
- It is not something that we guessed or made up.
Sometimes that is the best we get.
} | ||
|
||
@inline | ||
private def copyLitteralToCharArray( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean the double ''tt" in copyLitteral
if yes, does it
have a meaning? Did you mean (consistently) literal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, sorry for the typo.
It should be copyLiteralToCharArray
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind of you to leave a typo so:
- You know reviewer read that far in the code. Sorta like a college paper.
- Something for a reviewer to feel good about focusing on the small issues
and leaving the critical issues of the PR alone.
// Step 1: Decode the floating point number, and unify normalized and | ||
// subnormal cases. | ||
// First, handle all the trivial cases. | ||
// Handle all the trivial cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a matter of personal preference, style, and differing
estimates of how often NaN and +/- Infinity are going
to be encountered.
I would tend to delete this trivial check and keep the wrapper as thin as feasible. It is not wrong, just lengthens every doubleToString
with a redundant check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
The motivation behind the duplication is to be sure that:
- we don't do
new scala.Array[Char](_RyuDouble.RESULT_STRING_MAX_LENGTH)
for trivial cases - doubleToChars is doing these checks
What we could do is to have an intermediate private def _doubleToCharsNoCheck()
that would be called by doubleToString()
and doubleToChars()
.
This way the checks would be performed only once in the public methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't do new scala.ArrayChar for trivial cases.
As we so much of what we have been considering recently, so much depends upon workloads and we devos
have not real control over that. What is the expected ratio over all time & workloads of NaN, +/- Infinity versus
other IEEE7454 numbers, I hesitate to call them 'real' as in Pinocchio 'real boy' real for fear of confusion.
I wish I knew, or had the time to know, the compose sequence for the mathematicians symbol for the set of Real numbers.
Nan & friends need to be detected. I am willing to believe that the expected ration of Actual-to-trivial is almost
overwhelmingly in favor of the Actual numbers. To my thinking that path should be as fast/short as possible.
Nan and friends need to be detected and handled, but, to my thinking, they do not need to be any faster than any
Actual number if that slows down the "expected" path. Yes, the doing allocations on the "trivial" path is not
strictly necessary. And Yes, one can easily design "pathological" workloads which will trigger these "unnecessary"
allocations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As always, said in a friendly manner, Your PR, your call.
Getting the PR to "Done" is more important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I get your point sorry.
So I suggest to implement the intermediate private def _doubleToCharsNoCheck()
.
new String(result, 0, strLen) | ||
} | ||
|
||
@inline |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not affect the central idea of this PR.
Again, depending upon estimates of usage and personal preference, I would try without the "@inline". NaN & kin need to be correct but not necessarily lightening fast. People get concerned about code size. Although below I am going to suggest something which will increase code size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that since it's private + only two lines of code, it should be totally fine. Really against this?
However, I realized that we could still optimize String construction there. I think it would be better to use the internal String constructor:
def this(start: Int, length: Int, data: Array[Char]) = {
this()
value = data
offset = start
count = length
}
It would save an additional System.arraycopy()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| I thought that since it's private + only two lines of code, it should be totally fine. Really against this?
I am not really 'against` this and I do not want to hold this PR up.
As a fellow devo, I am leery that it is doing what you intend. I am also aware that my wanting to save
a few bytes is silly given the HUGE data tables declared in this file.
My working assumption is that the compiler can do better at figuring out when to inline than
I can on-the-fly. I am not sure how good that part of the SN compiler is but GCC and LLVM are
reputed to be excellent & getting better.
When something is inline'ed on an if
path, it increases the size of the gap that the 'branch-not-taken'
must jump over. I believe that on modern machines, both paths are fetched onto the instruction (I-)cache and
may be speculatively executed. Inline code takes up more of the I-cache.
If these were 'frequent" paths, as an analyst, I would expect @inline
to have a (small) actual
benefit (but would still try to measure). Since they are 'cold' paths I doubt they are worth it.
However, you may prefer a different end of the egg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def this(start: Int, length: Int, data: Array[Char]) = {
Indeed, that would save I think both an Array allocation and the arraycopy you mentioned.
Good catch!
You/we would need to change the scope of the called constructor away from private[lang]
to something like private[scalanative]
or private[scala.scalanative]
We would have to try
one or both to see if/what works. I think that having the constructor public is not an option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You/we would need to change the scope of the called constructor away from private[lang]
to something like private[scalanative] or private[scala.scalanative]
Good point. I fear it won't be possible to match both namespaces (scala.scalanative
and javalib
), so it's quite an issue.
Another option would be to just remove the entry points floatToString()/doubleToString()
.
This would also solve the previous point about _xxxToCharsNoCheck()
.
I remember you rose this concern in the issue #2902:
The SN implementation was announced a long time on the Ryu repository page as existing. I know of no direct users
but, I believe, we must behave as though at least one exists.
That means preserving the current method but also does not mean that an overload is not possible/feasible.
An other alternative could be to leave this code as is (non optimized) and to mark the floatToString()/doubleToString()
methods as @deprecated
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, my favorite way would be to have a @deprecated annotation that is not breaking compilation...
We could also implement alternative wrappers like:
def doubleToChars(
value: Double,
roundingMode: RyuRoundingMode
): (scala.Array[Char], Int) = { ... }
or
def doubleToString(
value: Double,
roundingMode: RyuRoundingMode,
createStringFn: (scala.Array[Char], Int) => String
): String = { ... }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am sitting second (supportive) chair here. Given a 5000 metre level view, I'd be concerned
at putting too much work into either deprecation or elaborate wrappers that will cause people
to have to change code anyway. That is, if there are any people using those two methods
anyway.
(I suspect that all that code may have been cloned, but I doubt that anybody in the wild
is using it. Famous last words before pain. If they are, we can document the 'thin wrapper')
Deleting the d2s and f2s routines, until we can no longer get away with it, vastly
simplifies the implementation of this PR as you suggested. That will almost
certainly lead to its being merged and beginning to pay back the effort invested
in it. (I seem to be on a big "earning its keep" theme today and/or recently).
A few days ago, or even yesterday, I would not have liked the "delete the methods" option
but I have warmed to it since you suggested it.
PS: if one wanted to be friendly, one could put a small "broken heart" comment
in each of the two files explaining that d2s/f2s had been removed because
the new toChararacter
are much faster and saying how the docs have
an example wrapper for those who can not change existing code.
That way, anyone who searches the file wondering where they went gets
the reasoning and a pointer to a ready made fix. Probably Ok for a 0.N
breaking version (as long as we document the breakage).
Field tests well tell us if our guesses are confirmed or way-off-base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
slightly off topic but for amusement, what errors where you getting when you tried "@deprecated"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleting the d2s and f2s routines, until we can no longer get away with it, vastly simplifies the implementation of this PR as you suggested.
Ok so let's do that now.
what errors where you getting when you tried "@deprecated"
[error] ./scala-native/javalib/src/main/scala/java/lang/Float.scala:323:14: method floatToString in object RyuFloat is deprecated: Use floatToChars() instead, as it requires no additional String/Array allocations
[error] RyuFloat.floatToString(f, RyuRoundingMode.Conservative)
[error] ^
[error] one error found
[error] (javalib2_11 / Compile / compileIncremental) Compilation failed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok so let's do that now.
I concur.
| RyuFloat.floatToString(f, RyuRoundingMode.Conservative)
Ah, the deprecation is doing exactly what it is supposed to do. Silly computers do what you tell them to do
not what you want. The missing piece of information, won by hard experience here, is that "Fatal Warnings"
are turned ON in the SN build. So, the deprecation Warning becomes a show stopping Error.
Tales out of school: I did that to myself 6 or 8 weeks ago when I deprecated vfork()
while javalib Process.scala
was still using it. The PR to avoid the use of vfork()
had not yet merged.
Guess I should not be sharing that story in public.
@@ -35,6 +35,11 @@ class StringBufferTest { | |||
assertEquals("2.5", newBuf.append(2.5f).toString) | |||
assertEquals("3.5", newBuf.append(3.5).toString) | |||
} | |||
|
|||
@Test def appendFloats(): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most importantly, having this test is an improvement.
Naming thing is always hard and a matter of personal preference, intent, & experience.
The scope of discussion being "floating point numbers" and
that having different, defined, and sometimes overloaded usages, the title appendFloats()
leads to be expect
a series of "f" (IEEE float) appends, with not explicit or implied "d" (IEEE double).
Since this uses, probably by design, both "f" and "d" floating
point, perhaps something like "testMixedFloatingPointAppends()`. You know your intent, so you may be able to come up with a better name (testMixedIEEE754Appends()?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed that this test is in StringBufferTest
.
It will test this PR there, but, perhaps provincially, I would have expected
it in (and had assumed, reading more quickly than a reviewer should...)
that it would be in StringBuilder
(as StringBuilder
is the more current
idiom).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The scope of discussion being "floating point numbers...
Actually I was also disturbed by the naming of def appendFloat()
, function that is doing stuffs with both Float and Double values, and which is just above the added test.
I stupidly decided to stick to the chosen naming, but this could be improved, your are right.
I just noticed that this test is in
StringBufferTest
.
Totally right, I should have added this test in both files, my mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I stupidly
Given the complexity of all of this tangle, following precedent is not stupid at all.
It is a good starting point, up until getting to the "Does what they were doing suit
present circumstances/needs point."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So shall we do the following renamings?
StringBufferTest.appendFloat()
-> StringBufferTest.mixedFloatingPointAppend()
StringBufferTest.appendFloats()
-> StringBufferTest.mixedFloatingPointAppends()
And add StringBuilderTest.mixedFloatingPointAppend()
/ StringBuilderTest.mixedFloatingPointAppends()
?
I could also reorganize code to have StringBxxxTest.appendFloats()
and StringBxxxTest.appendDoubles()
.
Any preference?
Mine would go for number 2 (reorganize tests).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for asking.
I think that testing both Float and Double is an improvement.
Given a finite personal & professional lifespan, I personally would put one set of tests.
into StringBuilder
with a comment that they exercise the method(s) inherited from
AbstractStringBuilder
.
The older, not quite deprecated, StringBuffer
could be left with either a deletion (i.e. nothing visible)
a comment that StringBuffer
inherits those methods from AbstractStringBuilder
and they
are exercised in StringBuilder
.
Normally, I would go the comment route in StringBuffer
, but here I think it invites confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's fine to you, I would go for the unit tests implemented in last commit (7983512)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
David,
Fine work! Thank you.
LGTM - Looks good to me.
I look forward to seeing & encourage your next PR.
It slightly lengthens the toString()
path but I think the overall IEEE754 conversion cost amortized
over a series of workloads will shrink quite a bit.
I am not sure if the SN compiler will recognize that
the new toString
is short and inline it, saving one call. If we
ever get to the point of measuring that, Life will be good.
@@ -4,6 +4,8 @@ import java.io.InvalidObjectException | |||
import java.util.Arrays | |||
import scala.util.control.Breaks._ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If any of the breakable
s are in the hot path that could affect your performance. Sometimes they are tricky to remove. Sometimes ScalaOps
can help, return
with maybe a var
or val
or perhaps recursion. It would be nice if they weren't in there at all in the javalib
as that is a Scala construct. See https://alvinalexander.com/scala/scala-break-continue-examples-breakable/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eric, rightly so. I created issue #2918 to capture your "removal" concern.
I created #2919 for the also rightly noted "performance" concern.
It is like living in the American South; touch something and all the bugs
crawl out.
I also reviewed the file AbstractStringBuilder.scala
. All the usages
of the problematic breakable
occur in methods not touched by this PR.
I propose that your concern be noted but that changing unrelated, and
now recorded, Issues not be required of this PR. Your thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concurred, most heartily
* add comment to describe RESULT_STRING_MAX_LENGTH magic numbers * add private method _xxxToCharsNoCheck() (e.g. isNaN) to avoid redundant checks when calling xxxToString() * mark xxxToString() methods as @deprecated (note: this line is commented, as it is currently leading to errors in tests) * add new unit tests in RyuDoubleTest and RyuFloatTest (not asked by the reviewer) * add unit test in StringBuilderTest * renamed and reorganize tests in StringBufferTest * fix typo: litteral -> literal
The last commit addresses most of the points discussed above. Note: I'm having difficulties to depreciate the methods xxxToString(), it leads to errors on my laptop. |
@@ -694,7 +696,9 @@ object RyuDouble { | |||
|
|||
// format: on | |||
|
|||
@noinline def doubleToString( | |||
// @deprecated("Use doubleToChars() instead, as it requires no additional String/Array allocations", "0.5.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes leaving a "usage" comment here.
Personal opinion, I would not @deprecate
, if for no other reason that we have to create entries in the 0.5.0 release notes file. There is heavy contention for that file, so it means weeks of waiting and having to come back in a few years and actually do
the removal. Who said that, not me!
I have several such deprecation announcements & removals in my backlog, and I am dreading having to do them. Not to spread discouragement, but to focus limited developer time.
Something like "Consider using doubleToChars() in new development. mumble."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, should I let this line like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|> So, should I let this line like this?
Sorry, I do not understand. Could you rephrase? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, do I let the following comment line as is?
// @deprecated("Use doubleToChars() instead, as it requires no additional String/Array allocations", "0.5.0")
Or should I rephrase this?
@@ -745,26 +739,49 @@ object RyuDouble { | |||
* new offset as: old offset + number of created chars (i.e. last modified | |||
* index + 1) | |||
*/ | |||
@noinline def doubleToChars( | |||
@noinline |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this @noinline
adds no value. You may want to remove it now that the code has been restructured.
The @noinline
in the long routine remains relevant although I can not imagine a compiler which would inline that much code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 3c79f90
@@ -694,7 +696,9 @@ object RyuDouble { | |||
|
|||
// format: on | |||
|
|||
@noinline def doubleToString( | |||
// @deprecated("Use doubleToChars() instead, as it requires no additional String/Array allocations", "0.5.0") | |||
@noinline |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment about no-longer-relevant @inline
above 'toChars` below.
Sorry, I entered the comment out-of-order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 3c79f90
} | ||
|
||
@noinline | ||
private def _doubleToCharsNoCheck( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR can certainly advance like this.
Your/our concern is that the doubleToChars()
path be as
fast as feasible. You are introducing a call
instruction
with all of its overhead in what is supposed to be a fast_path.
IIUC, this is to save an Array allocation in doubleToString
if
the argument is NaN, +/- Infinity.
With the continuing chant of "It depends on the workload" in the background, is it worth imposing the cost of a call
on every non-Special value in favor of making Special calls a bit faster and less expensive?
I show a lack of charity when I suggest that any workload with enough NaNs & Infinities to benefit probably has more problems that Array allocations. There is a song lyric something about "betray my ignorance in the moment that I speak:.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my SN experience, I can tell that I had mainly performance issues associated with GC pressure.
But as you said "It depends on the workload".
My conclusion is that, if doubleToString()
is removed (now or in the future) we can delete the intermediate _doubleToCharsNoCheck()
call.
To me it solves most issues: number of calls, number of allocations, internal String constructors stays isolated in javalib.
The only remaining issue is breaking code of someone else...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doubleToChars??? Did you mean doubleToString??
Having been on the other side of the issue, I also hate annoying people by breaking
working code. 0.5.n is announced as a breaking version, so we have a little more
latitude. Still "break at last resort" or "break only if you are sure of cost/benefit".
There should be a longer field test/RC test, especially if the multithread work go in.
If you are comfortable with the current iteration, I propose that you delete
the two "// @deprecated" lines and submit. That would get the central ideas
of this PR exercised and racking up experience hours/weeks.
Not to extend this set of changes to +/- Infinity, but the "deprecate/remove/leave_as_is"
decision will become a lot clearer in another (hypothetical) PR.
How does one say "Ship it" in French?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doubleToChars??? Did you mean doubleToString??
Yes, thanks, I corrected in my comment.
If you are comfortable with the current iteration, I propose that you delete
the two "// @deprecated" lines and submit. That would get the central ideas
of this PR exercised and racking up experience hours/weeks.
Performed in the two last commits :)
How does one say "Ship it" in French?
We would say "envoie le", but the deliverer would more idiomatically reply "et voilà" ;-)
@@ -37,6 +37,8 @@ package ieee754tostring.ryu | |||
|
|||
object RyuFloat { | |||
|
|||
// Scala/Java magic number 15 is derived from original RYU C code magic number 16 (which includes NUL terminator). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good, thank you.
@@ -37,6 +37,8 @@ package ieee754tostring.ryu | |||
|
|||
object RyuDouble { | |||
|
|||
// Scala/Java magic number 24 is derived from original RYU C code magic number 25 (which includes NUL terminator). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good, thank you.
Hidden somewhere in the stream above is David's suggestion that the (There are implementation reasons for not doing such a wrapper inside this PR). |
…d put simplified versions in RyuDoubleTest/RyuFloatTest
* re-implement Double.toString/Float.toString in order to use RyuDouble.doubleToChars()/RyuFloat.floatToChars() * simplify doubleToString()/floatToString() example wrappers in unit tests
Et voilà :) In the two last commits I removed the methods RyuDouble.doubleToString()/RyuFloat.floatToString() and I moved a simplified version in the unit tests, to provider wrapper examples as discussed previously. Double.toString and Float.toString have also been updated. LGTM, except if you have new comments |
David, the code looks very good. You have put a lot of thought and effort into this one. I think there is a small bit of housekeeping to do.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job David.
I, for one, am interested in using this code.
I think we are waiting for CI runs to get released. I do not have the privileges to do that.
I cloned this code, built, & ran the tests on my X86_64 Linux system and all the relevant
tests passed.
We may need to sort through any CI issues revealed there before final sign-off.
Fingers crossed 🤞 Thank you so much Lee for having pushed further the quality of this PR, your reviewer contribution is both remarkable and appreciated.
Housekeeping done :)
I'll take care of this ;-)
Not yet. I have recently focused on theoretical aspects. I need to find dedicated time for proper benchmarking. |
@WojciechMazur ready for final tests before ignition ;-) |
Perhaps this PR could be used for both |
Porting to |
@LeeTibbert @WojciechMazur I guess we are now fine with this one as well? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Thank you for this contribution! I have only 1 remaining comment about access modifier.
And thank you @ekrich and @LeeTibbert for the exhaustive code review!
@@ -106,6 +108,36 @@ abstract class AbstractStringBuilder private (unit: Unit) { | |||
count += 1 | |||
} | |||
|
|||
// Optimization: use `RyuFloat.floatToChars()` instead of `floatToString()` | |||
final def append0(f: scala.Float): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is some usage of append0
that would not allow for making these methods protected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not aware of the other PR that changed the access to methods of AbstractStringBuilder, now when it's merged we can rebase this PR and change the access of the added append0 methods
@david-bouyssie If you would have a bit of spare time,I would appreciate if you could release the merge conflicts created after merging other PRs. Thank you |
Merge is done. |
@WojciechMazur thanks for the merge. |
…ementations (scala-native#2917) * Fix scala-native#2902: Avoid Array allocation in ieee754tostring implementations * Address reviewer suggestions in PR scala-native#2917: * add comment to describe RESULT_STRING_MAX_LENGTH magic numbers * add private method _xxxToCharsNoCheck() (e.g. isNaN) to avoid redundant checks when calling xxxToString() * mark xxxToString() methods as @deprecated (note: this line is commented, as it is currently leading to errors in tests) * add new unit tests in RyuDoubleTest and RyuFloatTest (not asked by the reviewer) * add unit test in StringBuilderTest * renamed and reorganize tests in StringBufferTest * Remove unnecessary @noinline annotations * Remove methods RyuDouble.doubleToString()/RyuFloat.floatToString() and put simplified versions in RyuDoubleTest/RyuFloatTest * PR scala-native#2902 improvements: * re-implement Double.toString/Float.toString in order to use RyuDouble.doubleToChars()/RyuFloat.floatToChars() * simplify doubleToString()/floatToString() example wrappers in unit tests * Add a note to highlight `result.length - offset >= RESULT_STRING_MAX_LENGTH` (cherry picked from commit 9446fe6)
@david-bouyssie Sorry, I have no idea how could I skip that! So far we don't have multithreading so it's not that important. There might be more places that might need thread-safety fixes, so probably it would be better to dedicate a separate PR for this, along with a comparison of JDK specifications for these classes.. |
I just insisted on this point, because I understood that experimental multithreading would be introduced in 0.5.x |
* Update scalafmt to 3.5.9 (#2862) * Update scalafmt to 3.5.9 * Use .scalafmt.conf version in script * Remove quotes around version (cherry picked from commit 87361fb) * Adding missing `java.lang.Character` functionality (#2871) * Added highSurrogate and lowSurrofate to java.lang.Character extraced from toSurrogate * Added tests for highSurrogate and lowSurrogate (cherry picked from commit 9a043fd) * Update ReferenceQueue.scala (#2878) Remove stub annotation import (cherry picked from commit ff62fe4) * Fix header definition of Commix WeakRefGreyList (#2890) This addresses #2889 (cherry picked from commit ae27796) * Update Scala versions in the builds (#2896) * Add Scala 2.12.17 and Scala 2.13.10 * Update versions used in the CI * Remove depereactions in nscplugin * Update settings to handle deprecations * Update scalalib patches * Update partest files * Increase timeout for multiarch builds * Fix source incompatibilites for CommandLineParser * Fix ambigious import (Scala 2.12.17 fix) * Fix java release settings + disable source flag in tests * Add missing partest junit test files * Fix scripted tests config * Disable failing BigIntTest in unoptimzied code (cherry picked from commit 5438abb) * Fix I2892: Implement sys/select pselect method (#2895) (cherry picked from commit 427fea4) * Update sbt to 1.7.2 (#2859) * Update sbt to 1.7.2 (cherry picked from commit f509b38) * Towards an independent javalib #1 (#2887) * ScalaOps deprecations and removed methods * Update UnixProcess * Update WindowsProcess * Update Collections from Scala.js * Update AbstractMap and update ScalaOps * Minor cosmetic updates * Update File, remove unused private classes * Remove from generated docs * Remove compile errors and touch up toString * Revert javalib deprecation setting * Use scalaOps for a code simplification * Add CollectionsTest and adapt for use * Update AbstractMapTest, cosmetic only * Move AssertThrows to new package * Fix package for tests-ext * Missed one bad substitution (cherry picked from commit 48b3342) * Fix #2891: posixlib spawn is now implemented. (#2894) * Fix #2891: Implement posixlib spawn.scala * Job is not done until the documentation is updated * Delete include to keep Scala 2.12.16 happy (cherry picked from commit ba9ab71) * Port `j.u.ArrayDeque` from JSR 166 (#2898) * Port `ArrayDeque` from JSR 166 * Port JSR 166 tests * Add `Spliterator` interface (cherry picked from commit 8df8499) * Fix #1826: Add documentation for GC settings (#2910) * Add info for None GC and more detail * Fix #1826: Add documentation for GC settings (cherry picked from commit 40a2330) * Fix two sets of quality preception diminishing typos (#2914) (cherry picked from commit faba287) * Fix #2903: Avoid systematic checking of String integrity in IEEE754Helpers (#2907) (cherry picked from commit 8d491c5) * Fix #2921: Commix Heap.c now compies with Clang 15.0.3 (#2922) (cherry picked from commit 8c9b652) * Fix #2678: Provide examples of using NativeConfig (#2926) (cherry picked from commit b617b3e) * Fix #2927: Expunge non-JVM j.l.String#getValue() (#2928) (cherry picked from commit a6109a6) * Fix #I2925: A j.l.String constructor now yields immutable strings (#2929) * Fix #I2925: A j.l.String constructor now generates immutable strings * Add tests writted by David Bouyssie (cherry picked from commit e4705a9) * Don't unapply unecessary unboxing in lambdas (#2938) * Add reproducer for issues 2880 * Make sure no unnecessary unboxing would be applied to primitive types * Fix build to allow testing with Scala 2.11 * Revert "Fix build to allow testing with Scala 2.11" This reverts commit 28fe4b8. * Use experimantal mode in NIRcompiler for tests * Replace ammonite script with scala-cli due to depreacted node.js runner in actions (cherry picked from commit 207d2a7) * Fix CI deprecation warnings and multiach builds (#2939) * Update ations/cache to v3 * Replaced deprecated set-output commands * Replace ammonite script with scala-cli due to depreacted node.js runner in actions * Fix codegen bug for x86 architectures * Refactor multiarch CI builds (cherry picked from commit 864efc0) * Fix #2943: Provide smarter & faster ProcessTest method (#2944) (cherry picked from commit 2567ca9) * Fix #2935: Ensure StringBuilder does not alter existing child Strings (#2936) (cherry picked from commit 6cbe5cc) * Add additional javalib String immutability tests (#2934) (cherry picked from commit 3ab6eac) * Enable the use of memcpy in IEEE754Helpers.bytesToCString (#2941) * Some changes to enable use of memcpy in IEEE754Helpers.bytesToCString: * replace while loop by `memcpy()` call * throw NumberFormatException when String is empty (otherwise bytesToCString would throw ArrayIndexOutOfBoundsException) * replace `var res = f(cStr, end)` by `val res = f(cStr, end)` (var not needed there) * PR2941: use Ptr/Array conversion instead of Ptr arithmetic (cherry picked from commit 317ff28) * Optimize method `AbstractStringBuilder.append0(CharSequence, Int, Int)` (#2909) * Optimize method `AbstractStringBuilder.append0(CharSequence, Int, Int)` For reference, see equivalent Android libcore2 implementation (Apache 2 license) at: https://android.googlesource.com/platform/libcore2/+/master/luni/src/main/java/java/lang/AbstractStringBuilder.java#151 * Revert last commit * Optimize method `AbstractStringBuilder.append0(CharSequence, Int, Int)` For reference, see equivalent Android libcore2 implementation (Apache 2 license) at: https://android.googlesource.com/platform/libcore2/+/master/luni/src/main/java/java/lang/AbstractStringBuilder.java#151 * Update attribution to Android Luni in License.md * Mention Android Luni * Format code * Fix NPE * Changes relative to recent reviewer suggestions: * replace `string.length()` by `string.count` to improve code understanding * add unit tests in StringBufferTest.scala and StringBuilderTest.scala to ensure String immutability when doing SB.toString followed by SB modification * Another commit (WIP for issue #2930) (cherry picked from commit abf7dcf) * Install ping for ProcessTest (#2945) (cherry picked from commit 967c262) * Fix build after cherry-picking * Add Scala 3.2.1 to cross versions * Fix encoding of `scala.Nothing` and `scala.Null` in type signatures (#2949) * Reproduce issue #2858 * Fix encoding of phantom Scala types Nothing and Null * Revert spurious changes (cherry picked from commit 4ca323b) * Remove non local returns from java.net.URI (#2951) (cherry picked from commit 24c0039) * Partial fix #2822: remove non-local returns from j.math.Primality (#2948) (cherry picked from commit 57612fc) * Remove deprecated assert idiom from five regex Tests (#2954) * Remove deprecated assert idiom from two regex Tests * Remove now unnecessary import of ThrowsHelper * Remove deprecated assert from three more files (cherry picked from commit e225653) * Remove deprecated assert idiom from three java.lang Tests (#2953) * Remove deprecated idiom from three java.lang Tests * Remove now unnecessary import (cherry picked from commit 4872c8b) * Handle passing null for unboxed arguments of extern methods (#2950) * Reproduce issue #2866 * Remove redundant build settings * clenup (cherry picked from commit 688c0ff) * Report error when referencing local state in CFuncPtr (#2957) * Report error upon usage of local state in CFuncPtr * Fix failing tests and improve Scala 2.11 detections (cherry picked from commit 6fb8352) * Use unsafe array ops in `IEEE754Helpers` (#2960) (cherry picked from commit b9c3d94) * Fix lost information about value class instance (#2959) * Reproduce issue `AnyVal` causes segfault #2712 * Make sure to recreate value class instance upon returning from lambda (cherry picked from commit 584d5c2) * Use `memcpy` in `{to,from}CString` (#2962) * Use `memcpy` in `{to,from}CString` * Guard against 0-length strings * Fix empty string * Reuse val (cherry picked from commit 3150626) * Remove shebang from scala-cli scripts due to failures in scalafmt (#2961) * Remove sheband from scala-cli script * update scalafmt overrides * run scalafmt in non-interactive mode to limit ammont of logs * Revert "run scalafmt in non-interactive mode to limit ammont of logs" This reverts commit 421ace2. * try fix fialing scalafmt test (cherry picked from commit 76bec79) * Encode main class name to match outputs of the compiler (#2955) * Reproduce issue #2790 * Encode main class to match the outputs of the compiler * Use scala.reflect.NameTransformer directly instead of reimplementing its logic * Include package name encoding with exception of dots * Make sure to create mirror class even if module is not a candidate for forwarders (match JVM codegen in scala compiler) * Fix failing tools test (cherry picked from commit 02f38ff) * Partial fix #2963: Add missing SIGTERM & kin (#2964) (cherry picked from commit 570e7b4) * IEEE754Helpers optimizations: stackalloc instead of Zone + String.getBytes removal (#2965) (cherry picked from commit d8fd989) * Get rid of compilation warnings in tests and tools using Scala 3 (#2966) * Extract UnboxEntry class from local scope of Hashtable.entrySet * Extract local class from ArrayTest * Reenable fatal warnigns for JVM tests, accept warnings when compiling native tests * Fix handling deprecations in tools for Scala3 * Supress more warning in tests using Scala3 (cherry picked from commit 396c163) * Fix #2902: Avoid Array allocation in ieee754tostring implementations (#2917) * Fix #2902: Avoid Array allocation in ieee754tostring implementations * Address reviewer suggestions in PR #2917: * add comment to describe RESULT_STRING_MAX_LENGTH magic numbers * add private method _xxxToCharsNoCheck() (e.g. isNaN) to avoid redundant checks when calling xxxToString() * mark xxxToString() methods as @deprecated (note: this line is commented, as it is currently leading to errors in tests) * add new unit tests in RyuDoubleTest and RyuFloatTest (not asked by the reviewer) * add unit test in StringBuilderTest * renamed and reorganize tests in StringBufferTest * Remove unnecessary @noinline annotations * Remove methods RyuDouble.doubleToString()/RyuFloat.floatToString() and put simplified versions in RyuDoubleTest/RyuFloatTest * PR #2902 improvements: * re-implement Double.toString/Float.toString in order to use RyuDouble.doubleToChars()/RyuFloat.floatToChars() * simplify doubleToString()/floatToString() example wrappers in unit tests * Add a note to highlight `result.length - offset >= RESULT_STRING_MAX_LENGTH` (cherry picked from commit 9446fe6) * Upgrade to 3.6.1 (#2968) (cherry picked from commit 6bbcfed) * CI: Add publishing flow and centralize release logic (#2967) * Move building aggreation to Build, remove helper bash scripts * Add publish flow * add pgp passphrase * Setup pgp key hanlding * Set verionSchema for published projects (cherry picked from commit bcdb50a) * Fix #2893: Implement posixlib wait.scala (#2969) * Fix #2893: Implement posixlib wait.scala * Silly Rabbit! Do not run POSIX tests on Windows (cherry picked from commit b9a220a) * Remove -XX:MaxMetaspaceSize flag in scripted tests due to CI failures A few java.net.Inet*Address fixes (#2877) * Workaround missing gai_strerror() on Windows * Re-work InetAddressTest isReachable to avoid Windows specific condition * Explore getnameinfo() SEGV problem * Better handling of numeric host storage in getAllByName() (cherry picked from commit 75b2d6f) Fix mima binary incompatibilites Run scalafmt and fix workflows * Don't allow to execute `runMain` task (#2971) * Configurable Interflow optimizer (#2819) * make sure one method only done once; optimize processor.advance * add inlineDepth and callersize into config * add optimizerConfig * make optimizer configuatable Co-authored-by: yuly <yuly16@mails.tsinghua.edu.cn> Co-authored-by: Wojciech Mazur <wojciech.mazur95@gmail.com> (cherry picked from commit ca81ba6) * Allow to link as dynamic library (#2145) * Bootstrap support for building shared library in config * Allow linking to shared library * Add shared library scripted test * Add compile time checks and tests * Sync Scala 3 compiler plugin with Scala 2 * Generate dllexport modifiers for exported symbols on Windows * Generate gxx personality for externs * Restrict generation of dllexport only to Sig.Extern names * Check if new compilation flags fix exception on mac * Add mixxing isMac method in scirpted Platform * Generate dylib constructor to init ScalaNative * Set lib prefix for library outputs * Allow to produce static libraries * Build static library using llvm-ar directly * Don't allow to generate '.ll' file with empty basename * Upgrade al-cheb/configure-pagefile-action to remove CI warnings Co-authored-by: adampauls <adpauls@microsoft.com> (cherry picked from commit a4df2de) * Add JDK9 constructors to `j.m.BigInteger` (#2974) * Add JDK9 ctors to `j.m.BigInteger` * Port ctor tests from Scala.js * Add JDK9 tests * Add error checking (cherry picked from commit 1a3981d) * Remove last two Scala 3.2 build warnings (#2977) (cherry picked from commit 38cef3e) * Implement `java.lang.Math.fma` (#2979) * Fix JDK compliance CI tests Co-authored-by: Eric K Richardson <ekrichardson@gmail.com> Co-authored-by: Jamie Willis <J_mie@hotmail.co.uk> Co-authored-by: Mark Hammons <markehammons@gmail.com> Co-authored-by: LeeTibbert <LeeTibbert@users.noreply.github.com> Co-authored-by: Arman Bilge <armanbilge@gmail.com> Co-authored-by: David Bouyssié <6719996+david-bouyssie@users.noreply.github.com>
Fix #2902
The issue ticket explain the rationale of the modifications.
The main goal of the PR is to provide a new API for ieee754tostring ops (
RyuDouble.doubleToChars()/RyuFloat.floatToChars()
) that minimizes the number of String/Array allocations.There is a long discussion below about the pros and cons of keeping the old and the new APIs together, versus removing the old one. We finally opted for the second option.
The removal will have to be documented in the 0.5.0 Release Notes (could be treated in PR #2924).
Notes relative to
0.4.x
backporting: using commits up to 3c79f90 should be fine.Summary of the different modifications
For performance reasons, and also to force the use of this API in both SN and external world, we concluded that providing a single, but optimized API, is a better solution.
Thus
Double.toString/Float.toString
were re-implemented in order to usedoubleToChars()/floatToChars()
.Simplified versions (without input integrity checks) of
doubleToString()/floatToString()
were put inRyuDoubleTest/RyuFloatTest
, providing thus working examples of plugin replacements fordoubleToString()/floatToString()
.This PR also introduces a documented constant RESULT_STRING_MAX_LENGTH (in both RyuDouble and RyuFloat), which represents the maximum characters that can be produced during the conversion. It is considered as a magic number derived from the C implementation (see discussion below).
Important: the caller must ensure that
Array.length - offset >= RESULT_STRING_MAX_LENGTH
.This constant may thus be used when creating the Array passed to
doubleToChars()/floatToChars()
.See examples in
RyuDoubleTest/RyuFloatTest
andDouble.toString/Float.toString
.Other notable changes: