all: inf.dec -> apd.Decimal #13551

maddyblue · 2017-02-13T05:29:09Z

Notes for docs:

this changes/explains the behavior for ROUND. See the discussion around here: https://reviewable.io/reviews/cockroachdb/cockroach/13551#-KcwrN33JECvMtotq8K2:-KdAZqtbFIOrYhpom-Gc:bq7n2cp
There's a table in there which compares what other databases do. May or may not be usable.
we have something uncanny going on for LOG/LN. Consider checking out issue sql: float log and sqrt are inconsistent in their error handling #13642 for details.

This change is

knz · 2017-02-14T14:09:21Z

Looks good overall, although I still have an open concern about the default precision in string representations.
Minor nit to the error checking in math functions.

I did not review the changes to the encoding package because that's an area I do not understand well (yet). Perhaps @nvanbenschoten is the better person to look at this.

Reviewed 38 of 42 files at r1, 1 of 1 files at r2.
Review status: 39 of 43 files reviewed at latest revision, 13 unresolved discussions, some commit checks failed.

pkg/sql/parser/aggregate_builtins.go, line 308 at r1 (raw file):

			// TODO(mjibson): allow this function to return an error in case of
			// div-by-zero.
			panic(err)

Is division by zero the only possible error returned by Quo()? If so, please remove the check for err entirely: the definition of the sumAggregate ensures that this can never be called with a zero divisor.

pkg/sql/parser/aggregate_builtins.go, line 556 at r1 (raw file):

			a.tmpDec.SetCoefficient(t)
			// TODO(mjibson): handle this error
			_, _ = ExactCtx.Add(&a.decSum.Decimal, &a.decSum.Decimal, &a.tmpDec)

What error can that be? (Please clarify in "TODO" message).

pkg/sql/parser/aggregate_builtins.go, line 594 at r1 (raw file):

	t := datum.(*DDecimal)
	// TODO(mjibson): handle this error
	_, _ = ExactCtx.Add(&a.sum, &a.sum, &t.Decimal)

ditto

pkg/sql/parser/aggregate_builtins.go, line 754 at r1 (raw file):

	ed.Sub(&a.tmp, d, &a.mean)
	ed.Add(&a.sqrDiff, &a.sqrDiff, ed.Mul(&a.delta, &a.delta, &a.tmp))
	// TODO(mjibson): check for and handle ed.Err().

ditto

pkg/sql/parser/aggregate_builtins.go, line 765 at r1 (raw file):

	dd := &DDecimal{}
	ed.Quo(&dd.Decimal, &a.sqrDiff, &a.tmp)
	// TODO(mjibson): check for and handle ed.Err().

ditto

pkg/sql/parser/aggregate_builtins.go, line 798 at r1 (raw file):

		return NewDFloat(DFloat(math.Sqrt(float64(*t))))
	case *DDecimal:
		// TODO(mjibson): handle this error.

same as sum above, this value can never be negative.

pkg/sql/parser/builtins.go, line 1255 at r1 (raw file):

			ReturnType: fixedReturnType(TypeDecimal),
			fn: func(_ *EvalContext, args Datums) (Datum, error) {
				// TODO(mjibson): make sure this fits in an int32.

In the interim, please add a phrase in the Info field below about this ("if the 2nd value to round is larger than ... then the results are undefined")

pkg/sql/parser/builtins.go, line 1307 at r1 (raw file):

	"sqrt": {
		floatBuiltin1(func(x float64) (Datum, error) {
			// TODO(mjibson): this should probably be removed to match log/ln.

It's the other way around: the other two must be extended to check their argument and return an error when it's invalid.
(At least to match pg. Later we can have error modes.)
Perhaps you can add the check to the other two since it's rather straightforward.

pkg/sql/parser/eval_test.go, line 66 at r1 (raw file):

		{`4:::int // 2.1:::decimal`, `1`},
		// Division is always done on floats or decimals.
		{`4 / 5`, `0.8`},

See my comments about sqrt in alias_typesn

pkg/sql/testdata/aggregate, line 475 at r1 (raw file):

SELECT AVG(k), AVG(v), SUM(k), SUM(v) FROM kv
----
5 2.8 30 14

Same question as for sqrt in alias_types

pkg/sql/testdata/alias_types, line 59 at r1 (raw file):

----
1.414213562373095   3
4   3

I'm not sure about this. Is this the precision we want for sqrt? Given that the result is actually exact to many more decimals, there would be more information in 4.0000000000 than there is in just 4. I'm open to a different view.

pkg/sql/testdata/builtin_function, line 663 at r1 (raw file):

# round up for decimals
# These results are indeed different than floats. Compare with postgres.
# Float rounding uses banker, decimal rounding uses half up.

I checked with Oracle and mssql too. Both use half-up rounding. Please add a note to this effect ("it's rather common ...") both here and in the built-in definition for round().

pkg/sql/testdata/window, line 50 at r1 (raw file):

----
5
5

Same question as for sqrt in alias_types

Comments from Reviewable

nvanbenschoten · 2017-02-14T17:42:19Z

I like how this is turning out so far! I'm working my way through all the changes and saving the encoding stuff for last. I mirror @knz's concern about the string representation, as well as a few minor issues about our use of APD and the interface it exposes (ie. questions about SetString, Floor, etc.).

Reviewed 32 of 42 files at r1.
Review status: 39 of 43 files reviewed at latest revision, 25 unresolved discussions, some commit checks failed.

glide.lock, line 407 at r1 (raw file):

  - tap
  - transport
- name: gopkg.in/inf.v0

Can we remove this yet?

pkg/sql/parser/aggregate_builtins.go, line 308 at r1 (raw file):

Previously, knz (kena) wrote…

Is division by zero the only possible error returned by Quo()? If so, please remove the check for err entirely: the definition of the sumAggregate ensures that this can never be called with a zero divisor.

Remove the error or the comment? We should definitely prefer panicing over ignoring errors

pkg/sql/parser/aggregate_builtins.go, line 556 at r1 (raw file):

			a.tmpDec.SetCoefficient(t)
			// TODO(mjibson): handle this error
			_, _ = ExactCtx.Add(&a.decSum.Decimal, &a.decSum.Decimal, &a.tmpDec)

errcheck is going to yell at you whenever you ignore errors. I'd just panic for now.

pkg/sql/parser/aggregate_builtins.go, line 765 at r1 (raw file):

	dd := &DDecimal{}
	ed.Quo(&dd.Decimal, &a.sqrDiff, &a.tmp)
	// TODO(mjibson): check for and handle ed.Err().

ditto to panicing for now instead of avoiding the check.

pkg/sql/parser/builtins.go, line 45 at r1 (raw file):

	"github.com/cockroachdb/cockroach/pkg/util/timeutil"
	"github.com/cockroachdb/cockroach/pkg/util/uuid"
	"github.com/lib/pq/oid"

move this guy up

pkg/sql/parser/builtins.go, line 1240 at r1 (raw file):

		}, "Rounds `val` to the nearest integer."),
		decimalBuiltin1(func(x *apd.Decimal) (Datum, error) {
			return roundDecimal(x, 0)

How does c.Quantize(d, 0) compare to c.Round(d)?

pkg/sql/parser/builtins.go, line 1336 at r1 (raw file):

			dd := &DDecimal{}
			frac := new(apd.Decimal)
			x.Modf(&dd.Decimal, frac)

Trunc seems like a good addition to the library itself.

Also, why not make Modf support nil fractions so trunc could simply be x.Modf(d, nil)?

pkg/sql/parser/datum.go, line 491 at r1 (raw file):

	// and the precision to 2000 places. Any rounding or other inexact conversion
	// will result in an error.
	dec, res, err := HighPrecisionCtx.NewFromString(s)

Didn't we add a SetString method to Context?

pkg/sql/parser/datum.go, line 556 at r1 (raw file):

// Format implements the NodeFormatter interface.
func (d *DDecimal) Format(buf *bytes.Buffer, f FmtFlags) {
	buf.WriteString(d.Decimal.ToStandard())

Consider adding FormatStandard(buf *bytes.Buffer) and related method to the library.

pkg/sql/parser/normalize.go, line 689 at r1 (raw file):

// DecimalZero represents the constant 0 as DECIMAL.
var DecimalZero DDecimal

Do we still need this?

pkg/sql/pgwire/types.go, line 542 at r1 (raw file):

		case formatText:
			dd := &parser.DDecimal{}
			dec, res, err := parser.HighPrecisionCtx.NewFromString(string(b))

Same point about SetString

pkg/sql/pgwire/types_test.go, line 261 at r1 (raw file):

}

func BenchmarkDecodeBinaryDecimal(b *testing.B) {

What do these Decimal benchmarks say?

pkg/sql/testdata/alias_types, line 59 at r1 (raw file):

Previously, knz (kena) wrote…

I'm not sure about this. Is this the precision we want for sqrt? Given that the result is actually exact to many more decimals, there would be more information in 4.0000000000 than there is in just 4. I'm open to a different view.

4.0000000000000000 is what Postgres returns. I agree with what @knz said, but I'm curious if we're doing this for a certain reason.

pkg/util/encoding/decimal_test.go, line 485 at r1 (raw file):

}

func BenchmarkEncodeDecimalSmall(b *testing.B) {

Could you include all of these benchmarks in the commit message?

Comments from Reviewable

nvanbenschoten · 2017-02-14T21:00:46Z

Reviewed 10 of 42 files at r1, 1 of 1 files at r2.
Review status: all files reviewed at latest revision, 28 unresolved discussions, some commit checks failed.

pkg/sql/testdata/decimal, line 3 at r2 (raw file):

# The following tests have results equivalent to Postgres (differences
# in string representation and number of decimals returned, but otherwise
# the same). These do not pass using the inf.Dec package.

Let's make this comment more timeless. Add some context about how inf.Dec used to be the decimal library we used.

pkg/util/encoding/decimal_test.go, line 72 at r1 (raw file):

	}
	for _, c := range testCases {
		// Deal with the issue that Dec.SetString can't handle exponent notation.

SetString supports exponent notation now, right?

pkg/util/encoding/encoding_test.go, line 885 at r1 (raw file):

func (rd randData) decimal() *apd.Decimal {
	d := &apd.Decimal{}

Is this just apd.New?

Comments from Reviewable

RaduBerinde · 2017-02-15T23:26:52Z

pkg/sql/testdata/alias_types, line 59 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

4.0000000000000000 is what Postgres returns. I agree with what @knz said, but I'm curious if we're doing this for a certain reason.

In this case, since the input is just 16 it seems fine to me. It would probably make sense to preserve at least as many decimals as the input. Meaning sqrt(16) = 4 but sqrt(16.000) = 4.000.

By the way, preserving the number of decimals doesn't really work across encoding/decoding cycles (#13384). Maybe it's better to just always remove trailing 0s in the interim, as we would at least be consistent.

Comments from Reviewable

maddyblue · 2017-02-17T06:59:54Z

Updated with some perf improvements and benchmarks. Looking into why some decoding benchmarks are slower.

Review status: 25 of 43 files reviewed at latest revision, 28 unresolved discussions.

glide.lock, line 407 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Can we remove this yet?

Done.

pkg/sql/parser/aggregate_builtins.go, line 308 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Remove the error or the comment? We should definitely prefer panicing over ignoring errors

All decimal operations can error for a variety of reasons when setting exponents including overflow and underflow errors. Quo can further error when the precision is 0 or too high (guaranteed not to happen here though) and on divide by zero. I've updated the comment to be more general though.

pkg/sql/parser/aggregate_builtins.go, line 556 at r1 (raw file):

Previously, knz (kena) wrote…

What error can that be? (Please clarify in "TODO" message).

In this case overflow or adding two decimals of ridiculously differing exponents (unlikely but possible), and all of the various set exponent errors alluded above. Documenting each possible error is probably not practical because there are so many kinds that it's easy to miss one and difficult to keep accurate.

pkg/sql/parser/builtins.go, line 45 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

move this guy up

Done.

pkg/sql/parser/builtins.go, line 1240 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

How does c.Quantize(d, 0) compare to c.Round(d)?

Round doesn't care about the exponent, just the number of digits. Quantize puts its value in terms of the exponent of the second argument.

pkg/sql/parser/builtins.go, line 1255 at r1 (raw file):

Previously, knz (kena) wrote…

In the interim, please add a phrase in the Info field below about this ("if the 2nd value to round is larger than ... then the results are undefined")

Done.

pkg/sql/parser/builtins.go, line 1307 at r1 (raw file):

Previously, knz (kena) wrote…

It's the other way around: the other two must be extended to check their argument and return an error when it's invalid.
(At least to match pg. Later we can have error modes.)
Perhaps you can add the check to the other two since it's rather straightforward.

ln and log perform those checks themselves. The check here is not needed, except to produce a specific error message (apd returns the same error, just with different text). The TODO is there because not all the decimal function display errors in exactly the same way, but they are always handled.

pkg/sql/parser/builtins.go, line 1336 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Trunc seems like a good addition to the library itself.

Also, why not make Modf support nil fractions so trunc could simply be x.Modf(d, nil)?

The ToIntegral method does this, but requires the rounding mode to be always down to get the same results as truncate. The modf method here is just an optimization because we know it's safe. I've added a todo to move this method to apd.

pkg/sql/parser/datum.go, line 491 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Didn't we add a SetString method to Context?

There's a SetString on Decimal. But we need it to go through the context because it does bounds checking, which Decimal.SetString does not.

pkg/sql/parser/datum.go, line 556 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Consider adding FormatStandard(buf *bytes.Buffer) and related method to the library.

Filed an issue for this.

pkg/sql/parser/normalize.go, line 689 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do we still need this?

Removed.

pkg/sql/testdata/alias_types, line 59 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

4.0000000000000000 is what Postgres returns. I agree with what @knz said, but I'm curious if we're doing this for a certain reason.

This is tricky because internally apd knows this result is exact, meaning the trailing zeros are not needed, so it drops them. Unclear what to do here, but I'm inclined to leave it.

pkg/sql/testdata/builtin_function, line 663 at r1 (raw file):

Previously, knz (kena) wrote…

I checked with Oracle and mssql too. Both use half-up rounding. Please add a note to this effect ("it's rather common ...") both here and in the built-in definition for round().

Both oracle and mssql use half-up for decimals and banker for floats? Just trying to make sure I get the docs correct.

pkg/sql/testdata/decimal, line 3 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Let's make this comment more timeless. Add some context about how inf.Dec used to be the decimal library we used.

Done.

pkg/util/encoding/decimal_test.go, line 72 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

SetString supports exponent notation now, right?

Yes, this block has been removed.

pkg/util/encoding/encoding_test.go, line 885 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this just apd.New?

Done.

Comments from Reviewable

maddyblue · 2017-02-17T07:20:45Z

I found the slowdown in those functions and fixed it. Pretty much everything's better now: faster or less allocs.

Review status: 25 of 43 files reviewed at latest revision, 28 unresolved discussions.

Comments from Reviewable

knz · 2017-02-17T10:44:14Z

Reviewed 18 of 18 files at r3.
Review status: all files reviewed at latest revision, 29 unresolved discussions, some commit checks failed.

pkg/sql/parser/aggregate_builtins.go, line 308 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

All decimal operations can error for a variety of reasons when setting exponents including overflow and underflow errors. Quo can further error when the precision is 0 or too high (guaranteed not to happen here though) and on divide by zero. I've updated the comment to be more general though.

Please insert a reference to issue #13640 in that TODO thanks.

pkg/sql/parser/aggregate_builtins.go, line 556 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

In this case overflow or adding two decimals of ridiculously differing exponents (unlikely but possible), and all of the various set exponent errors alluded above. Documenting each possible error is probably not practical because there are so many kinds that it's easy to miss one and difficult to keep accurate.

Ok, please refer to issue #13640 here instead then.

pkg/sql/parser/builtins.go, line 45 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

Done.

I don't see anything changed.

pkg/sql/parser/builtins.go, line 1255 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

Done.

It's not exactly what I meant. Doc readers are unlikely to understand this because nowhere else in our docs we explain what an "int32" is.

I suggest to go for: "If decimal_accuracy is not in the range -2^31...(2^31-1), the results are undefined."

Also that said, now I can see two additional issues popping up:

How is the algorithm implemented again? If the processing is linear in either time or space with this argument, we should clamp it down.
regarding the other things that happen in the apd library: what is the story w.r.t memory usage? We should do something here which we hadn't done so far with decimal, and that is equipping the thing with a memory budget. What are your thoughts on this?

pkg/sql/parser/builtins.go, line 1307 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

ln and log perform those checks themselves. The check here is not needed, except to produce a specific error message (apd returns the same error, just with different text). The TODO is there because not all the decimal function display errors in exactly the same way, but they are always handled.

Oh I trust that apd returns errors for this.
I meant to add the check for the float versions. Currently the float logs return NaN on invalid argument, whereas sqrt(-1) returns an error.

I mean we can either do one or the other, but they indeed need to be consistent.

Your call: either address this now or leave a TODO both here and also in the log functions.
Also please include a reference to issue #13642 in both places.

pkg/sql/parser/builtins.go, line 1336 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

The ToIntegral method does this, but requires the rounding mode to be always down to get the same results as truncate. The modf method here is just an optimization because we know it's safe. I've added a todo to move this method to apd.

Mention cockroachdb/apd#24 in the TODO.

pkg/sql/testdata/alias_types, line 59 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

This is tricky because internally apd knows this result is exact, meaning the trailing zeros are not needed, so it drops them. Unclear what to do here, but I'm inclined to leave it.

So the issue I see is not whether apd knows the result is exact, but the fact that users of a decimal type will derive their understanding of the precision of a result based on the number of decimals they see.
I know there are fancy arguments to go either way, and that this discussion is highly sensitive to people for which DECIMAL matter (in particular people who represent and compute money amounts). However this is not my area of specialty and I simply forgot what those arguments are.
Yet I would be wary to take a decision on this without doing some homework first. What are the arguments and what do we want to do with them.

Hence the question, is there a mathematician in the room?
Perhaps Toby would know (he was a mathematician!) and otherwise Arjun may know (he knows about money and numbers!)

pkg/sql/testdata/builtin_function, line 663 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

Both oracle and mssql use half-up for decimals and banker for floats? Just trying to make sure I get the docs correct.

I hadn't checked what they do for floats. Also, I made a mistake; and your comment is not saying the same thing as your code. See below.

So I went for a second inspection of the docs:

Oracle: https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions135.htm -- I hadn't read well, it's not half up, it's half away from zero for decimal. Round to nearest even for float, specified per doc.
mssql: https://msdn.microsoft.com/en-us/library/ms175003.aspx -- looks like half up, but then there is no example with negative number so it could be half away from zero.
sybase: http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc38151.1600/doc/html/san1278453136695.html -- half away from zero by default, can be customized to round to even instead
postgres (based on tests): half away from zero for decimal, round to nearest even for floats (that's because they use the std math lib)
mysql: https://dev.mysql.com/doc/refman/5.7/en/mathematical-functions.html#function_round: half away from zero for decimal, std math lib for float, which is often round to even

Both MySQL and Oracle says explicitly that float gets a different treatment. MySQL says explicitly they use the underlying math lib, which happens to do round to nearest even; Oracle says round to nearest even always.

I think the case is rather clear: half away from zero for decimal seems to be nearly unanimous.
Also, based on your test results this is also what you have implemented, but this is not what your comment says. "Round half up" and "Round half away from zero" are not the same thing.

Also for compat with pg we're good to use the math lib for float.

pkg/util/encoding/decimal.go, line 622 at r3 (raw file):

		i := bi.Uint64()
		modified := false
		for i >= 10000 && i%10000 == 0 {

How did you choose this constant 10000?

Comments from Reviewable

RaduBerinde · 2017-02-17T14:02:02Z

pkg/util/encoding/decimal.go, line 622 at r3 (raw file):

Previously, knz (kena) wrote…

How did you choose this constant 10000?

I suggested it in similar code in cockroachdb/apd#20.

2^64 is ~2e19 and 4 is around sqrt(19), meaning that we might get around 4 iterations here and 4 iterations in the next loop (in general if the maximum value is 10^N we would want 10^sqrt(N) here).

Of course for such small numbers one would find this empirically, and multiple steps (e.g. 1,000,000, 10000, 100, 10) are probably better. But I assumed it's not such a critical piece of code.

Comments from Reviewable

RaduBerinde · 2017-02-17T14:07:55Z

pkg/sql/testdata/alias_types, line 59 at r1 (raw file):

Previously, knz (kena) wrote…

So the issue I see is not whether apd knows the result is exact, but the fact that users of a decimal type will derive their understanding of the precision of a result based on the number of decimals they see.
I know there are fancy arguments to go either way, and that this discussion is highly sensitive to people for which DECIMAL matter (in particular people who represent and compute money amounts). However this is not my area of specialty and I simply forgot what those arguments are.
Yet I would be wary to take a decision on this without doing some homework first. What are the arguments and what do we want to do with them.

Hence the question, is there a mathematician in the room?
Perhaps Toby would know (he was a mathematician!) and otherwise Arjun may know (he knows about money and numbers!)

My vote is to let this in now (with trimming zeros) and we can have a discussion and plan the necessary work in #13384. My motivation is a bit selfish, because I am having trouble with logic tests with DistSQL because depending on how the decimals travel over the wire they can lose trailing 0s.

There is no way to make this better with DistSQL until we have an encoding that preserves the 0s.

Comments from Reviewable

maddyblue · 2017-02-18T08:44:02Z

Review status: 41 of 43 files reviewed at latest revision, 29 unresolved discussions, all commit checks successful.

pkg/sql/parser/aggregate_builtins.go, line 556 at r1 (raw file):

Previously, knz (kena) wrote…

Ok, please refer to issue #13640 here instead then.

Done.

pkg/sql/parser/aggregate_builtins.go, line 594 at r1 (raw file):

Previously, knz (kena) wrote…

ditto

Done.

pkg/sql/parser/aggregate_builtins.go, line 754 at r1 (raw file):

Previously, knz (kena) wrote…

ditto

Done.

pkg/sql/parser/aggregate_builtins.go, line 765 at r1 (raw file):

Previously, knz (kena) wrote…

ditto

Done.

pkg/sql/parser/aggregate_builtins.go, line 765 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

ditto to panicing for now instead of avoiding the check.

Done.

pkg/sql/parser/aggregate_builtins.go, line 798 at r1 (raw file):

Previously, knz (kena) wrote…

same as sum above, this value can never be negative.

Done.

pkg/sql/parser/builtins.go, line 45 at r1 (raw file):

Previously, knz (kena) wrote…

I don't see anything changed.

Done.

pkg/sql/parser/builtins.go, line 1255 at r1 (raw file):

Previously, knz (kena) wrote…

It's not exactly what I meant. Doc readers are unlikely to understand this because nowhere else in our docs we explain what an "int32" is.

I suggest to go for: "If decimal_accuracy is not in the range -2^31...(2^31-1), the results are undefined."

Also that said, now I can see two additional issues popping up:

How is the algorithm implemented again? If the processing is linear in either time or space with this argument, we should clamp it down.

regarding the other things that happen in the apd library: what is the story w.r.t memory usage? We should do something here which we hadn't done so far with decimal, and that is equipping the thing with a memory budget. What are your thoughts on this?

This uses Quantize, which may do a multiply, and always rounds. However the input and results are limited to 2000 digits in the mantissa and exponents of +/- 2000, so it'd be pretty hard for this to be slow.
Memory usage is limited by cockroach's use of HighPrecisionCtx, which limits mantissa and exponent as described above. There are some operations that ignore the mantissa limits (like addition), though, in order to fully preserve exactness. I'm not opposed to equipping this with a memory budget, although I think it should be another PR.

pkg/sql/parser/builtins.go, line 1307 at r1 (raw file):

Previously, knz (kena) wrote…

Oh I trust that apd returns errors for this.
I meant to add the check for the float versions. Currently the float logs return NaN on invalid argument, whereas sqrt(-1) returns an error.

I mean we can either do one or the other, but they indeed need to be consistent.

Your call: either address this now or leave a TODO both here and also in the log functions.
Also please include a reference to issue #13642 in both places.

Done.

pkg/sql/parser/builtins.go, line 1336 at r1 (raw file):

Previously, knz (kena) wrote…

Mention cockroachdb/apd#24 in the TODO.

Done.

pkg/sql/testdata/alias_types, line 59 at r1 (raw file):
sqrt and cbrt are special: they are the only apd functions that trim zeros after their work. All the other functions keep them:

> SELECT 1.00::DECIMAL + 2.0000::DECIMAL;
+--------------------------------+
| 1.00::DECIMAL + 2.0000::DECIMAL |
+--------------------------------+
|                         3.0000 |
+--------------------------------+

So internal calculations will retain zeros nearly all of the time. (Anything written/read from disk will not have trailing zeros retained, though, because that's how the encoding package works. If we want to change that it'd be in another PR.)

@knz is this fine? I think you are correct in general, but sqrt and cbrt are bad examples.

pkg/sql/testdata/builtin_function, line 663 at r1 (raw file):

Previously, knz (kena) wrote…

I hadn't checked what they do for floats. Also, I made a mistake; and your comment is not saying the same thing as your code. See below.

So I went for a second inspection of the docs:

Oracle: https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions135.htm -- I hadn't read well, it's not half up, it's half away from zero for decimal. Round to nearest even for float, specified per doc.

mssql: https://msdn.microsoft.com/en-us/library/ms175003.aspx -- looks like half up, but then there is no example with negative number so it could be half away from zero.

sybase: http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc38151.1600/doc/html/san1278453136695.html -- half away from zero by default, can be customized to round to even instead

postgres (based on tests): half away from zero for decimal, round to nearest even for floats (that's because they use the std math lib)

mysql: https://dev.mysql.com/doc/refman/5.7/en/mathematical-functions.html#function_round: half away from zero for decimal, std math lib for float, which is often round to even

Both MySQL and Oracle says explicitly that float gets a different treatment. MySQL says explicitly they use the underlying math lib, which happens to do round to nearest even; Oracle says round to nearest even always.

I think the case is rather clear: half away from zero for decimal seems to be nearly unanimous.
Also, based on your test results this is also what you have implemented, but this is not what your comment says. "Round half up" and "Round half away from zero" are not the same thing.

Also for compat with pg we're good to use the math lib for float.

"half up" as specified by apd doesn't care about sign, so it's the same as "half away from zero" if you care about sign. I think we are talking about the same thing, I was just using a different definition.

I've added docs and comments about this.

Comments from Reviewable

knz · 2017-02-19T12:49:51Z

Reviewed 4 of 4 files at r4.
Review status: all files reviewed at latest revision, 23 unresolved discussions, some commit checks failed.

pkg/sql/parser/builtins.go, line 1255 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

This uses Quantize, which may do a multiply, and always rounds. However the input and results are limited to 2000 digits in the mantissa and exponents of +/- 2000, so it'd be pretty hard for this to be slow.

Memory usage is limited by cockroach's use of HighPrecisionCtx, which limits mantissa and exponent as described above. There are some operations that ignore the mantissa limits (like addition), though, in order to fully preserve exactness. I'm not opposed to equipping this with a memory budget, although I think it should be another PR.

ok, excellent

pkg/sql/parser/builtins.go, line 1241 at r4 (raw file):

		decimalBuiltin1(func(x *apd.Decimal) (Datum, error) {
			return roundDecimal(x, 0)
		}, "Rounds `val` to the nearest integer using half away from zero rounding."),

While the reference to "Banker's rounding" above is sufficient for a reader to find something online to explain, the sentence here is too poor in details to understand what's really going on. What about: "Rounds val to the nearest integer, half away from zero: ROUND(+/-2.4) = +/-2, ROUND(+/-2.5) = +/-3"

pkg/sql/testdata/alias_types, line 59 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

sqrt and cbrt are special: they are the only apd functions that trim zeros after their work. All the other functions keep them:
> SELECT 1.00::DECIMAL + 2.0000::DECIMAL;
+--------------------------------+
| 1.00::DECIMAL + 2.0000::DECIMAL |
+--------------------------------+
|                         3.0000 |
+--------------------------------+
So internal calculations will retain zeros nearly all of the time. (Anything written/read from disk will not have trailing zeros retained, though, because that's how the encoding package works. If we want to change that it'd be in another PR.)

@knz is this fine? I think you are correct in general, but sqrt and cbrt are bad examples.

I think I'm ok with this for now, but please clarify (and add a test for) what is intended to happen if the result of sqrt/cbrt is not exact.

pkg/util/encoding/decimal.go, line 622 at r3 (raw file):

Previously, RaduBerinde wrote…

I suggested it in similar code in cockroachdb/apd#20.

2^64 is ~2e19 and 4 is around sqrt(19), meaning that we might get around 4 iterations here and 4 iterations in the next loop (in general if the maximum value is 10^N we would want 10^sqrt(N) here).

Of course for such small numbers one would find this empirically, and multiple steps (e.g. 1,000,000, 10000, 100, 10) are probably better. But I assumed it's not such a critical piece of code.

Oh, this is fine. But the same sentence that you wrote above would be nice in a comment in the code here.

Comments from Reviewable

maddyblue · 2017-02-21T06:10:31Z

Review status: all files reviewed at latest revision, 22 unresolved discussions, some commit checks failed.

pkg/sql/parser/builtins.go, line 1241 at r4 (raw file):

Previously, knz (kena) wrote…

While the reference to "Banker's rounding" above is sufficient for a reader to find something online to explain, the sentence here is too poor in details to understand what's really going on. What about: "Rounds val to the nearest integer, half away from zero: ROUND(+/-2.4) = +/-2, ROUND(+/-2.5) = +/-3"

Done.

pkg/sql/testdata/alias_types, line 59 at r1 (raw file):

Previously, knz (kena) wrote…

I think I'm ok with this for now, but please clarify (and add a test for) what is intended to happen if the result of sqrt/cbrt is not exact.

There is such an example in sql/testdata/decimal. If sqrt is inexact, it will contain 16 decimal digits, including possible trailing zeros.

pkg/util/encoding/decimal.go, line 622 at r3 (raw file):

Previously, knz (kena) wrote…

Oh, this is fine. But the same sentence that you wrote above would be nice in a comment in the code here.

Added this as a comment.

Comments from Reviewable

nvanbenschoten · 2017-02-21T07:22:16Z

Reviewed 6 of 18 files at r3, 2 of 4 files at r4, 14 of 14 files at r5.
Review status: all files reviewed at latest revision, 12 unresolved discussions, all commit checks successful.

pkg/sql/parser/datum.go, line 491 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

There's a SetString on Decimal. But we need it to go through the context because it does bounds checking, which Decimal.SetString does not.

Why don't we add one to Context then? Seems like it would be useful in a number of places and would avoid at least one allocation in places like these:

dd := &DDecimal{}
HighPrecisionCtx.SetString(&dd.Decimal)

Comments from Reviewable

knz · 2017-02-21T16:20:10Z

Perhaps you could add "Fixes #13051" to the PR description (and add a test?)

maddyblue · 2017-02-21T19:31:17Z

Done.

Review status: 32 of 43 files reviewed at latest revision, 13 unresolved discussions.

pkg/sql/parser/datum.go, line 491 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why don't we add one to Context then? Seems like it would be useful in a number of places and would avoid at least one allocation in places like these:
dd := &DDecimal{}
HighPrecisionCtx.SetString(&dd.Decimal)

Done.

Comments from Reviewable

knz · 2017-02-21T22:50:29Z

Reviewed 10 of 14 files at r5, 11 of 11 files at r6.
Review status: all files reviewed at latest revision, 12 unresolved discussions, all commit checks successful.

pkg/sql/testdata/decimal, line 55 at r6 (raw file):

0.00000000000000000001000000000000000

# inf panics

nit: I usually put the issue number that motivates a regression test in the comment, so that later the person curious about why this is there can check the reason out.

Comments from Reviewable

nvanbenschoten · 2017-02-21T23:07:54Z

Reviewed 1 of 4 files at r4, 1 of 11 files at r6.
Review status: all files reviewed at latest revision, 10 unresolved discussions, all commit checks successful.

Comments from Reviewable

Differences: - decimal rounding is now consistent with postgres (note that it is different than float rounding). The util/decimal package has been completely removed. There were some non-inf uses of it to compute 10^X, but those have been hard coded or removed. Fixes #13051 name old time/op new time/op delta EncodeDecimalSmall-12 302ns ± 1% 307ns ± 0% +1.72% (p=0.016 n=5+4) DecodeDecimalSmall-12 400ns ± 1% 400ns ± 0% ~ (p=0.595 n=5+5) EncodeDecimalMedium-12 290ns ± 1% 298ns ± 1% +2.76% (p=0.008 n=5+5) DecodeDecimalMedium-12 392ns ± 0% 398ns ± 1% +1.48% (p=0.008 n=5+5) EncodeDecimalLarge-12 300ns ± 1% 308ns ± 0% +2.47% (p=0.008 n=5+5) DecodeDecimalLarge-12 401ns ± 1% 402ns ± 0% ~ (p=0.587 n=5+5) PeekLengthDecimal-12 35.1ns ± 0% 35.9ns ± 0% +2.28% (p=0.008 n=5+5) NonsortingEncodeDecimal-12 165ns ± 3% 116ns ± 1% -30.02% (p=0.008 n=5+5) NonsortingDecodeDecimal-12 147ns ± 2% 143ns ± 1% -2.72% (p=0.008 n=5+5) EncodeDecimalValue-12 164ns ± 4% 108ns ± 1% -33.98% (p=0.008 n=5+5) DecodeDecimalValue-12 167ns ± 1% 172ns ± 2% +3.12% (p=0.016 n=5+5) WriteTextDecimal-12 508ns ± 0% 517ns ± 5% ~ (p=0.611 n=5+5) WriteBinaryDecimal-12 509ns ± 0% 509ns ± 1% ~ (p=0.778 n=4+5) DecodeBinaryDecimal-12 3.78µs ± 1% 3.82µs ± 1% ~ (p=0.111 n=5+5) name old alloc/op new alloc/op delta EncodeDecimalSmall-12 74.0B ± 0% 74.0B ± 0% ~ (all equal) DecodeDecimalSmall-12 128B ± 0% 128B ± 0% ~ (all equal) EncodeDecimalMedium-12 74.0B ± 0% 74.0B ± 0% ~ (all equal) DecodeDecimalMedium-12 128B ± 0% 128B ± 0% ~ (all equal) EncodeDecimalLarge-12 74.0B ± 0% 74.0B ± 0% ~ (all equal) DecodeDecimalLarge-12 128B ± 0% 128B ± 0% ~ (all equal) PeekLengthDecimal-12 0.00B 0.00B ~ (all equal) NonsortingEncodeDecimal-12 23.4B ± 3% 23.0B ± 0% ~ (p=0.333 n=5+4) NonsortingDecodeDecimal-12 96.0B ± 0% 96.0B ± 0% ~ (all equal) EncodeDecimalValue-12 7.00B ± 0% 7.60B ± 8% ~ (p=0.238 n=4+5) DecodeDecimalValue-12 96.0B ± 0% 96.0B ± 0% ~ (all equal) WriteTextDecimal-12 288B ± 0% 240B ± 0% -16.67% (p=0.008 n=5+5) WriteBinaryDecimal-12 288B ± 0% 240B ± 0% -16.67% (p=0.008 n=5+5) DecodeBinaryDecimal-12 464B ± 0% 464B ± 0% ~ (all equal) name old allocs/op new allocs/op delta EncodeDecimalSmall-12 2.00 ± 0% 2.00 ± 0% ~ (all equal) DecodeDecimalSmall-12 3.00 ± 0% 3.00 ± 0% ~ (all equal) EncodeDecimalMedium-12 2.00 ± 0% 2.00 ± 0% ~ (all equal) DecodeDecimalMedium-12 3.00 ± 0% 3.00 ± 0% ~ (all equal) EncodeDecimalLarge-12 2.00 ± 0% 2.00 ± 0% ~ (all equal) DecodeDecimalLarge-12 3.00 ± 0% 3.00 ± 0% ~ (all equal) PeekLengthDecimal-12 0.00 0.00 ~ (all equal) NonsortingEncodeDecimal-12 0.00 0.00 ~ (all equal) NonsortingDecodeDecimal-12 2.00 ± 0% 2.00 ± 0% ~ (all equal) EncodeDecimalValue-12 0.00 0.00 ~ (all equal) DecodeDecimalValue-12 2.00 ± 0% 2.00 ± 0% ~ (all equal) WriteTextDecimal-12 6.00 ± 0% 5.00 ± 0% -16.67% (p=0.008 n=5+5) WriteBinaryDecimal-12 6.00 ± 0% 5.00 ± 0% -16.67% (p=0.008 n=5+5) DecodeBinaryDecimal-12 21.0 ± 0% 21.0 ± 0% ~ (all equal)

maddyblue · 2017-02-22T07:31:02Z

Review status: 42 of 43 files reviewed at latest revision, 10 unresolved discussions.

pkg/sql/testdata/decimal, line 55 at r6 (raw file):

Previously, knz (kena) wrote…

nit: I usually put the issue number that motivates a regression test in the comment, so that later the person curious about why this is there can check the reason out.

Done.

Comments from Reviewable

knz requested review from knz and nvanbenschoten February 14, 2017 14:10

This was referenced Feb 17, 2017

sql: support error handling during result aggregations #13640

Closed

Implement Trunc in apd cockroachdb/apd#24

Closed

knz added the docs-todo label Feb 19, 2017

RaduBerinde mentioned this pull request Feb 21, 2017

sql,distsql: inconsistent aggregation results with DistSQL #13689

Closed

maddyblue merged commit 8fe4d6d into cockroachdb:master Feb 22, 2017

maddyblue deleted the apd branch February 22, 2017 08:32

nvanbenschoten mentioned this pull request Mar 24, 2017

util/decimal: Investigate EricLagergren/decimal package #6009

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

all: inf.dec -> apd.Decimal #13551

all: inf.dec -> apd.Decimal #13551

maddyblue commented Feb 13, 2017 •

edited by knz

Loading

knz commented Feb 14, 2017

nvanbenschoten commented Feb 14, 2017

nvanbenschoten commented Feb 14, 2017

RaduBerinde commented Feb 15, 2017

maddyblue commented Feb 17, 2017

maddyblue commented Feb 17, 2017

knz commented Feb 17, 2017

RaduBerinde commented Feb 17, 2017

RaduBerinde commented Feb 17, 2017

maddyblue commented Feb 18, 2017

knz commented Feb 19, 2017

maddyblue commented Feb 21, 2017

nvanbenschoten commented Feb 21, 2017

knz commented Feb 21, 2017

maddyblue commented Feb 21, 2017

knz commented Feb 21, 2017

nvanbenschoten commented Feb 21, 2017

maddyblue commented Feb 22, 2017

all: inf.dec -> apd.Decimal #13551

all: inf.dec -> apd.Decimal #13551

Conversation

maddyblue commented Feb 13, 2017 • edited by knz Loading

knz commented Feb 14, 2017

nvanbenschoten commented Feb 14, 2017

nvanbenschoten commented Feb 14, 2017

RaduBerinde commented Feb 15, 2017

maddyblue commented Feb 17, 2017

maddyblue commented Feb 17, 2017

knz commented Feb 17, 2017

RaduBerinde commented Feb 17, 2017

RaduBerinde commented Feb 17, 2017

maddyblue commented Feb 18, 2017

knz commented Feb 19, 2017

maddyblue commented Feb 21, 2017

nvanbenschoten commented Feb 21, 2017

knz commented Feb 21, 2017

maddyblue commented Feb 21, 2017

knz commented Feb 21, 2017

nvanbenschoten commented Feb 21, 2017

maddyblue commented Feb 22, 2017

maddyblue commented Feb 13, 2017 •

edited by knz

Loading