BigDecimal vs Rational #6
For stability, some thoughts on Rational from the draft README, explaining my current point of view. Many languages in the Lisp tradition include fractions of arbitrary-size integers as a basic data type, alongside IEEE 754 64-bit binary floating-point numbers. We're not proposing fractions as a built-in type for JavaScript for a couple of reasons: limited expressiveness, efficiency, and trailing zeros (each quoted and discussed below). The draft also compares the most popular decimal libraries against fraction / rational libraries.
I would consider striking "still limited expressiveness" from the reasoning, since it is a concern for literally any finite-sized representation of real numbers and thus could be used to shut down any alternative (including, symmetrically, this one). Fractions may be limited relative to, say, arbitrary polynomial solutions, but they are still a strict superset of decimals (trailing zeros aside).

"Efficiency" is also an interesting concern when viewed relative to the efficiency of implementing the type as a library. It seems to me that BigDecimal is effectively BigInt + dot position (where the dot position is preserved on addition and added on multiplication), and it's trivial to, e.g., store dollars as 100s of cents, while BigRational would require a user-space GCD implementation that is not able to introspect the BigInt numerator/denominator values.

"Trailing zeros" is a valid concern if this is an important aspect of the semantics, but conversely it raises a lot of semantic questions (already around arithmetic, comparisons, and so on).
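To make the "BigInt + dot position" observation concrete, here is a minimal userland sketch (the representation and names are mine, purely illustrative, not anything from the proposal):

```js
// A decimal as coefficient * 10^(-scale), e.g. 1.25 -> { coeff: 125n, scale: 2 }.
function make(coeff, scale) {
  return { coeff, scale };
}

// Addition: align both values to the larger scale (the dot position is preserved).
function add(a, b) {
  const scale = Math.max(a.scale, b.scale);
  const ac = a.coeff * 10n ** BigInt(scale - a.scale);
  const bc = b.coeff * 10n ** BigInt(scale - b.scale);
  return make(ac + bc, scale);
}

// Multiplication: multiply coefficients, add scales (the dot position is added).
function mul(a, b) {
  return make(a.coeff * b.coeff, a.scale + b.scale);
}

const price = make(125n, 2); // 1.25
const tax   = make(8n, 2);   // 0.08
console.log(add(price, mul(price, tax))); // { coeff: 13500n, scale: 4 }, i.e. 1.3500
```

Note that no GCD or reduction step appears anywhere, which is the efficiency point being made; a rational implementation would need one to keep numerators and denominators from growing unboundedly.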
I should add that, anecdatally, I've found need for arbitrary-precision fractions in the past (most recently for exact probability calculations), but I've not yet found a need for arbitrary-precision decimals that wasn't satisfied by something similar to counting cents instead of dollars. That said, I'm acutely aware that there are more programmers out there than me, and there are use cases that I haven't predicted or appreciated, so consider this a single vote rather than an attempt at an authoritative statement.
What about rationals as not two, but three big integers under the hood? Numerator, denominator, and precision, to store significant trailing zeroes. The precision would always be the larger of the precisions of the BigDecimal values participating in a computation, and could be reduced by rounding. In literals, it would be determined by the number of fractional digits, including trailing zeroes. The precision value would be used to determine formatting with toString(), or to cut off non-terminating divisions at that point (with no rounding).
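A sketch of how the described representation might behave (all names hypothetical, negative values ignored for brevity; only the rules stated above are implemented):

```js
// A rational backed by three BigInts: numerator, denominator, and a decimal
// precision used for formatting and for cutting off non-terminating divisions.
function rat(num, den, precision) {
  return { num, den, precision };
}

// Division keeps the exact num/den; precision is the larger of the operands'.
function div(x, y) {
  return rat(x.num * y.den, x.den * y.num, Math.max(x.precision, y.precision));
}

// toString cuts at `precision` fractional digits with no rounding;
// BigInt division conveniently truncates.
function toString({ num, den, precision }) {
  const scaled = (num * 10n ** BigInt(precision)) / den;
  const s = scaled.toString().padStart(precision + 1, "0");
  return precision === 0 ? s : s.slice(0, -precision) + "." + s.slice(-precision);
}

const one   = rat(10n, 10n, 1);   // the literal 1.0: one fractional digit
const three = rat(3n, 1n, 1);     // 3 at the same precision
console.log(toString(div(one, three)));                          // "0.3" (cut, not rounded)
console.log(toString(div(rat(150n, 100n, 2), rat(2n, 1n, 0))));  // "0.75"
```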
IMHO, this is intuitive from a user's point of view, and addresses the trailing-zeroes problem (not sure about the "limited expressiveness"), as well as the division operator. I agree with @LeszekSwirski that "efficiency" should not be a deterrent if implementing this in userland would be measurably slower.
We would prefer rationals for our case. We build an application that can execute user-defined mathematical equations (think of it a little like Excel). Users expect these calculations to be carried out precisely.
The issues flagged above for rationals aren't a problem for us. Trailing zeros are a non-issue, and the number of chained operations in our use case is small enough that the performance impacts of rationals are not problematic for us. Any sort of online calculator that executes user-defined mathematical operations probably has a similar use for rationals.
Any decimal can be represented as a rational with a power-of-10 denominator. So having rationals that preserve the denominator across math operations would give decimals for free.
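A tiny sketch of that correspondence (hypothetical helper names): a decimal string maps to a rational whose denominator is a power of 10, and addition over such denominators never forces a reduction, so the decimal character of the value, trailing zeros included, survives the operation.

```js
// "0.25" -> { num: 25n, den: 100n }; the fractional digit count picks the power of 10.
function fromDecimal(s) {
  const [int, frac = ""] = s.split(".");
  return { num: BigInt(int + frac), den: 10n ** BigInt(frac.length) };
}

// Addition over power-of-10 denominators: rescale to the larger one, no GCD needed.
function add(a, b) {
  const den = a.den > b.den ? a.den : b.den;
  return { num: a.num * (den / a.den) + b.num * (den / b.den), den };
}

console.log(add(fromDecimal("0.25"), fromDecimal("0.50"))); // { num: 75n, den: 100n }, i.e. "0.75"
```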
Web Codecs would have liked, or will have, a use for a rational type. w3c/webcodecs#122 (comment) is a good summary of the situation by @chcunningham, but @sandersdan has also been thinking about this a lot. I distinctly remember @jernoble talking about this in standards meetings more than once in the context of media APIs.

It's very common to use rationals when dealing with multimedia content because, for example, the frame rate of video in the NTSC standard is defined to be precisely 30000/1001 frames per second. Time in professional video software (and in APIs intended to be used to implement those pieces of software) is often represented as a frame count plus a time base (see the rational number types in ffmpeg, Microsoft's Windows Media Foundation, and Apple's Core Media, to cite what is used on the three major desktop platforms).

Because videos can be very long, precision matters: depending on various factors, rounding issues can cause audio/video desynchronization, dropped video frames, and other confusion, such as the audio track ending up shorter than the video track in the same media file, or uncertainty about the temporal position of a certain event in a video. And this is certainly not an exhaustive list.
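To illustrate the precision point: with a frame count plus a rational time base (the pattern behind ffmpeg's AVRational and Core Media's CMTime), timestamps stay exact, whereas accumulating a floating-point frame duration drifts. A hypothetical sketch:

```js
// NTSC video runs at exactly 30000/1001 frames per second,
// so each frame lasts 1001/30000 seconds.
const FRAME = { num: 1001n, den: 30000n }; // seconds per frame

// Exact timestamp of frame n: n * 1001 / 30000, kept as a rational.
function frameTime(n) {
  return { num: BigInt(n) * FRAME.num, den: FRAME.den };
}

const frames = 3_000_000; // roughly 28 hours of NTSC video
const exact = frameTime(frames); // 3003000000/30000 = exactly 100100 seconds
console.log(Number(exact.num) / Number(exact.den)); // 100100

// The same accumulation in binary floating point drifts:
let t = 0;
for (let i = 0; i < frames; i++) t += 1001 / 30000;
console.log(t); // close to, but not exactly, 100100
```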
Quoting @boenrobot:

We could do the same with non-rational …
@scottfr Regarding the first point above, it seems like we could address your issue if we were to allow an "unlimited" option for specifying the number of decimal digits? |
@jessealama What would that mean? |
That's a bad idea, because (correct me if I'm wrong) an "unlimited" number of digits would require infinite memory for non-terminating results such as 1/3.

Unless... trailing zeros in the literal set the precision of the calculation (that's how Unix …):

```js
const lil_third = 1.0m / 3.0m // 0.3m
const BIG_third = 1.000000000000000000000000000m / 3.000000000000000000000000000m // 0.333333333333333333333333333m
```

But that is no longer "unbounded"; it's literally "arbitrary-precision". This still begs the question of what to do if the precision of each operand differs:

```js
const third = 1.0m / 3.00m * 1.000m // 0.333m
```
@Rudxain All good points. I was speaking imprecisely. I had only division in mind (that was the use case mentioned by @scottfr), and was tacitly thinking of cases where the result can be exactly represented (no infinite repeating digits as in 1/3). In that case, it seems to me that you could achieve what I am (perhaps sloppily) calling "unlimited mode" by specifying an upper limit on the number of digits in the result. It seems to me that, for division, you can get an exact result (again, under the assumption that there is a finite, exactly representable result) by taking the sum of the number of digits of the two arguments. Would that work for you, Scott? (Maybe I'm wrong and you might need a bit more than the sum of the digits, but it does seem to me that there ought to be a bound.)

With that aside, I ought to step back from the suggestion of "unlimited mode" (whatever that means). Indeed, I was confusing "arbitrary precision" with "unlimited". The former is what's under discussion here, whereas the latter (insofar as it makes sense at all) isn't what the Decimal proposal is about. Sorry for the confusion.
Wait, you may be onto something there. That's actually a good idea (if implemented carefully)! I think what you mean is to tell the interpreter "hey, use as much precision as you can if the result needs finite memory, otherwise fall back to the default limit". It would work something like this:

```js
1.0m / 3.0m     // 0.3m
1m / 256m       // 0.00390625m
1m / 255m       // 0m
1.000m / 255m   // 0.003m
```

It seems reasonable enough to me, but I expect it would be VERY confusing for beginners.
Yes, that can be done. It would slow things down a little, though I'm not sure how much, in practice, the normalization and exact-representability checks would cost for two given big decimals. The GCD check is annoying, but checking divisibility by 2 and 5 only ought to be pretty fast.

I think if one were aiming for fast decimals (again, thinking only of division here), it should be possible to work with a given limit (in which case, you would just start doing long division, stopping when the desired number of digits has been computed). One interesting class of cases would be those where an exact representation is available, but you don't see it because the computation gets truncated, because one specified "too few" digits.

(In my examples, I'm thinking less of literals and more of an object with operations that construct big decimals, such as addition and division, possibly with optional arguments. I agree that REPL-style examples, written out in terms of literals, may indeed be confusing, at least initially.)
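For what it's worth, the divisibility test needs no general GCD at all; a sketch of the kind of check described (plain functions, not proposal API):

```js
// Does a/b (BigInts, b > 0) have a finite decimal expansion?
// Strip all factors of 2 and 5 from the denominator; the quotient
// terminates exactly when whatever remains divides the numerator.
function terminates(a, b) {
  while (b % 2n === 0n) b /= 2n;
  while (b % 5n === 0n) b /= 5n;
  return a % b === 0n;
}

console.log(terminates(1n, 256n)); // true  (1/256 = 0.00390625)
console.log(terminates(1n, 255n)); // false (255 = 3 * 5 * 17)
```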
Edit: I re-read your comment, and now I get it. That seems efficient.

Speaking of that, we could allow repeating decimals to use a precision equal to some multiple of the period. Examples:

```js
1m / 7m          // 0.142857m
1.0000000m / 7m  // 0.142857142857m
```

Irrationals would still need to be truncated to the limit. But we must take into account that it's not always possible to determine whether a number is irrational, for reasons similar to the Halting Problem.
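For reference, the period can be computed by stripping factors of 2 and 5 from the denominator and then finding the multiplicative order of 10 modulo what remains; a sketch (this can be slow, since the period of 1/n may be as long as n - 1):

```js
// Length of the repeating part of 1/n in decimal (n a positive BigInt).
function period(n) {
  while (n % 2n === 0n) n /= 2n;
  while (n % 5n === 0n) n /= 5n;
  if (n === 1n) return 0; // terminating decimal: no repeating part
  let p = 1, r = 10n % n; // find the smallest p with 10^p = 1 (mod n)
  while (r !== 1n) {
    r = (r * 10n) % n;
    p++;
  }
  return p;
}

console.log(period(7n));   // 6 (0.142857 142857 ...)
console.log(period(3n));   // 1 (0.333...)
console.log(period(256n)); // 0 (1/256 terminates)
```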
I totally agree with that! Especially for future-proofing. Maybe we shouldn't allow literals at all. When a BigDecimal is constructed, the user would be allowed to specify the radix (default 10).

I think it's a bad decision to only support base 10 (#75), even if it's simpler/easier to implement (see tech debt). If someone needs base 2 for performance, or base 20 because of a country, or base 12 because of a potential future where some humans switch to that radix, BigDecimal would become almost useless.

However, I'm concerned about "equivalent bases". If a user chooses base 100, it would have the same behavior as base 10; the only difference would be performance (see packed BCD). The same applies to binary: all radices that are powers of 2 are equivalent, but higher radices have better performance (until 2^256, where we get diminishing returns, because 64-bit CPUs have max efficiency at 128 bits).

Perhaps engines could be allowed to optimize radix-B "requests" by internally representing them in the highest B^N that the CPU operates on most efficiently, but only if B isn't a perfect power. If B is a perfect power and it's "too big", the engine may optimize it to a lower (but equivalent) base, the only difference being performance.
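A sketch of the perfect-power normalization mentioned above (purely illustrative; nothing like this is specified anywhere): digits in base b^k are just groups of k digits in base b, so a perfect-power radix can always be reduced to its smallest root.

```js
// If B = b^k for some integers b >= 2, k >= 2, return the smallest such b;
// otherwise return B itself. Assumes B is a safe integer.
function minimalBase(B) {
  for (let k = Math.floor(Math.log2(B)); k >= 2; k--) {
    const b = Math.round(B ** (1 / k));
    if (b >= 2 && b ** k === B) return b; // B has an exact k-th root
  }
  return B; // not a perfect power
}

console.log(minimalBase(100)); // 10 (base 100 is packed base 10, like BCD digit pairs)
console.log(minimalBase(16));  // 2  (all power-of-2 radices are equivalent)
console.log(minimalBase(12));  // 12 (not a perfect power)
```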
I think I focused on relatively poor arguments at the top of this post. The main reason I currently prefer decimal over rational is semantic: on the decimal data type, a base-10-oriented "round" operation makes sense and is common, whereas this is neither typical nor especially sensible on a rational type.
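The asymmetry is easy to see in code: on a coefficient-and-scale decimal like the one sketched earlier, rounding to a number of fractional digits is a single truncating division, whereas a rational such as 1/3 has no distinguished decimal positions until you pick a scale for it. A sketch (round-half-up, not any proposed semantics; negative coefficients would need a sign adjustment):

```js
// Round a { coeff, scale } decimal to `digits` fractional places, half-up.
// Natural here because the representation is already base 10.
function roundTo({ coeff, scale }, digits) {
  if (scale <= digits) return { coeff, scale };
  const drop = 10n ** BigInt(scale - digits);
  return { coeff: (coeff + drop / 2n) / drop, scale: digits };
}

console.log(roundTo({ coeff: 12345n, scale: 3 }, 2)); // { coeff: 1235n, scale: 2 }, i.e. 12.35
```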
In #2, the README describes advantages of BigDecimal over Rational. Let's continue the discussion in this thread. Are there further factors we should consider?