New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pgwire text encoding fixes #32144

Merged
merged 11 commits into from Nov 27, 2018

Conversation

Projects
None yet
4 participants
@mjibson
Copy link
Member

mjibson commented Nov 4, 2018

Most commits here are bug fixes or test improvements. Arrays/tuples and intervals got significant changes to match postgres (but they weren't broken before), and are benchmarked below.

Array encoding delta:

name                                                     old time/op  new time/op  delta
Encodings/array['']/text-8                                511ns ± 8%   475ns ± 4%     ~     (p=0.095 n=5+5)
Encodings/array['\x0d53e338548082'::BYTEA]/text-8         846ns ±10%   604ns ± 9%  -28.57%  (p=0.008 n=5+5)
Encodings/array['test_with_spaces'::BYTEA]/text-8         883ns ± 3%   719ns ±10%  -18.66%  (p=0.008 n=5+5)
Encodings/array['name'::NAME]/text-8                      893ns ± 6%   599ns ±11%  -32.95%  (p=0.008 n=5+5)
Encodings/array[NULL]::text[]/text-8                      615ns ± 9%   523ns ± 2%  -14.99%  (p=0.008 n=5+5)
Encodings/array['']::text[]/text-8                        587ns ± 1%   553ns ± 9%     ~     (p=0.151 n=5+5)
Encodings/array['test']::text[]/text-8                    651ns ± 3%   579ns ± 8%  -10.97%  (p=0.008 n=5+5)
Encodings/array['test_with_spaces']::text[]/text-8        837ns ± 9%   688ns ± 7%  -17.79%  (p=0.008 n=5+5)
Encodings/array[e'\f']::text[]/text-8                     634ns ± 2%   550ns ± 8%  -13.27%  (p=0.008 n=5+5)
Encodings/array[e'\uFEFF']::text[]/text-8                1.18µs ± 4%  0.57µs ± 3%  -51.94%  (p=0.008 n=5+5)
Encodings/array[e'\u2603']::text[]/text-8                 799ns ± 3%   573ns ± 7%  -28.25%  (p=0.008 n=5+5)
Encodings/(1::int8,2::int8,3::int8,4::int8,null)/text-8   488ns ± 7%   619ns ± 3%  +26.99%  (p=0.008 n=5+5)
Encodings/('test_with_spaces'::BYTEA,null)/text-8         720ns ±27%   460ns ± 8%  -36.12%  (p=0.008 n=5+5)
Encodings/('test_with_spaces'::TEXT,null)/text-8          517ns ± 9%   437ns ±10%  -15.51%  (p=0.008 n=5+5)
Encodings/('name'::NAME,null)/text-8                      525ns ± 6%   325ns ± 7%  -38.02%  (p=0.008 n=5+5)
Encodings/('false'::JSONB,null)/text-8                   1.04µs ± 7%  0.63µs ± 6%  -39.00%  (p=0.008 n=5+5)
Encodings/('{"a":_[]}'::JSONB,null)/text-8               1.25µs ± 6%  0.75µs ±13%  -40.44%  (p=0.008 n=5+5)

Interval encoding delta:

name                                                    old time/op  new time/op  delta
Encodings/'10y10mon'::interval/text-8                    836ns ± 4%  1055ns ± 4%   +26.19%  (p=0.008 n=5+5)
Encodings/'1y1mon'::interval/text-8                      830ns ± 6%  1019ns ±15%   +22.78%  (p=0.008 n=5+5)
Encodings/'23:12:34'::interval/text-8                   1.00µs ± 5%  1.10µs ± 9%    +9.18%  (p=0.032 n=5+5)
Encodings/'-23:00:00'::interval/text-8                   791ns ± 7%  1016ns ± 5%   +28.56%  (p=0.008 n=5+5)
Encodings/'-10d'::interval/text-8                        798ns ±12%   958ns ± 3%   +20.03%  (p=0.008 n=5+5)
Encodings/'1ms'::interval/text-8                         765ns ± 3%  1418ns ± 6%   +85.36%  (p=0.008 n=5+5)
Encodings/'.2ms'::interval/text-8                        777ns ± 3%  1387ns ± 4%   +78.39%  (p=0.008 n=5+5)
Encodings/'-1mon1m'::interval/text-8                     898ns ± 5%  1386ns ± 9%   +54.29%  (p=0.008 n=5+5)
Encodings/'-1y1m'::interval/text-8                       900ns ± 7%  1380ns ± 1%   +53.34%  (p=0.008 n=5+5)
Encodings/'3y4mon5d6ms'::interval/text-8                1.13µs ± 2%  2.23µs ±13%   +97.24%  (p=0.008 n=5+5)
Encodings/'296537y20d15h30m7s'::interval/text-8         1.35µs ± 8%  1.77µs ± 8%   +31.03%  (p=0.008 n=5+5)
Encodings/'-2965y_-20d_-15h_-30m_-7s'::interval/text-8  1.39µs ± 6%  1.67µs ± 4%   +19.92%  (p=0.008 n=5+5)
Encodings/'00:00:00'::interval/text-8                    566ns ± 3%   583ns ± 3%      ~     (p=0.175 n=5+5)

@mjibson mjibson requested review from bobvawter and BramGruneir Nov 4, 2018

@mjibson mjibson requested review from cockroachdb/sql-rest-prs as code owners Nov 4, 2018

@cockroach-teamcity

This comment has been minimized.

Copy link
Member

cockroach-teamcity commented Nov 4, 2018

This change is Reviewable

@mjibson

This comment has been minimized.

Copy link
Member

mjibson commented Nov 4, 2018

Hold off on reviewing until I fix these tests.

@mjibson mjibson force-pushed the mjibson:pgwire-text branch from 3521590 to 0105893 Nov 5, 2018

@mjibson mjibson requested review from cockroachdb/sql-async-prs as code owners Nov 5, 2018

@mjibson mjibson force-pushed the mjibson:pgwire-text branch 2 times, most recently from aafc4d2 to be7e0ca Nov 6, 2018

@mjibson mjibson requested review from cockroachdb/cli-prs as code owners Nov 7, 2018

@mjibson mjibson force-pushed the mjibson:pgwire-text branch 2 times, most recently from a8ff714 to b3a9dcb Nov 7, 2018

@mjibson

This comment has been minimized.

Copy link
Member

mjibson commented Nov 7, 2018

Ok PTAL when you have time, no rush. Also I'm going to be out until the 19th so feel free to do whatever you want including merging or waiting until I come back to finish this up.

@mjibson mjibson requested a review from knz Nov 7, 2018

@mjibson mjibson force-pushed the mjibson:pgwire-text branch 2 times, most recently from 21b272d to 100e2ae Nov 7, 2018

@mjibson

This comment has been minimized.

Copy link
Member

mjibson commented Nov 8, 2018

Rebased onto master with the previous PR commits now included, so all commits here need review. Thanks.

@BramGruneir
Copy link
Member

BramGruneir left a comment

For the first commit. It seems wrong to me to first commit a bunch of failing tests and then the fixes for them. Could you reorder this so the tests go in after they should work. Or add in a skip for now that you remove once the tests pass.
The general rule is that each commit should move the state of the system from one good state to a better one. This breaks that by adding failing tests.

I love the commit (the 3rd one) that moves this to a list of data files. And I'm trusting that these contain the full list suite of tests from the original. But I do think that could have been pull out into its own PR.

The 4th PR should be re-reviewed by @bobvawter, since he's now the expert on dates. Also, does this change it so we have to specify dates and interval in the postgres way? Or can we still also use the go way? Would be a shame to not be able to do both.

Got through all but the last commit so far. Will continue tomorrow.

Reviewed 3 of 3 files at r1, 3 of 3 files at r2, 37 of 37 files at r3, 36 of 36 files at r4, 2 of 2 files at r5, 1 of 1 files at r6, 3 of 3 files at r7, 4 of 30 files at r8.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/cli/testdata/dump/row, line 93 at r4 (raw file):

	(NULL, 'NaN', NULL, NULL, NULL, NULL, NULL, NULL, NULL, 'NaN', NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL);
----
----

Was the extra empty line purposefully removed?


pkg/sql/logictest/testdata/logic_test/time, line 194 at r4 (raw file):

SELECT '11:59:59':::TIME - '12:00:00':::TIME
----
-00:00:01

How do we handle -00:00:00? Is it the same as 00:00:00? Please add tests for both of them.


pkg/sql/pgwire/types.go, line 197 at r6 (raw file):

	case *tree.DBitArray:
		words, lastBitsUsed := v.EncodingParts()
		if len(words) == 0 {

Can you add a test for this? Also, a comment here would be nice.


pkg/sql/sem/tree/testdata/eval/overflow, line 1 at r3 (raw file):

eval

no first line with an explanation


pkg/sql/sem/tree/testdata/eval/row, line 1 at r3 (raw file):

# Row

most of these have a space after the first comment


pkg/sql/sem/tree/testdata/eval/unary, line 1 at r3 (raw file):

eval

No first line comment


pkg/util/duration/duration.go, line 174 at r4 (raw file):

	wrote := false
	wrotePrev := func() {

I like this way of dealing with adding spaces.

@knz
Copy link
Member

knz left a comment

Overall LGTM on the changes, modulo resolution of the open comments.

However I agree with Bram that the sequence of commits should be amenable to a git bisect, where the test suite is always expected to succeed.

Also I get that the change in the last comment is useful/important, but the extra work performed is quite the bummer. Could you please include a before/after benchmark of the encoding throughput for arrays/tuples on that commit.

Reviewed 3 of 3 files at r1, 3 of 3 files at r2, 37 of 37 files at r3, 36 of 36 files at r4, 2 of 2 files at r5, 1 of 1 files at r6, 3 of 3 files at r7, 29 of 30 files at r8.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/cmd/cmp-protocol/main.go, line 68 at r7 (raw file):

					sqlbase.ColumnType_OIDVECTOR,
					sqlbase.ColumnType_OID,         // our 8-byte ints are usually out of range for pg
					sqlbase.ColumnType_FLOAT,       // slight rounding differences at the end
  1. Is there a way to include them, but have some kind of function which masks the last digits during the comparison to assert whether the test successful?

  2. I'd like to see tests for DECIMAL and FLOAT which exercise the encodings for infinities and NaN. Are there any already? Perhaps somewhere else?


pkg/sql/sem/tree/interval.go, line 562 at r4 (raw file):

// or ".123" to "123000000". This function takes care to round correctly
// when a naive conversion would incorrectly truncate due to floating point
// inaccuracies.

Maybe add a comment to explain this rounding should not be tweaked unless care is taken it is the same as pg's
also add a reference to where the rounding occurs in pg (I sent you a link earlier)


pkg/sql/sem/tree/pgwire_encode.go, line 37 at r8 (raw file):

	ctx.WriteByte('(')
	for i, v := range d.D {
		if i > 0 {

the comma thing was better - more readable, also more efficient code (avoids a conditional; these are strictly worse than an extra assignment of a constant).


pkg/sql/sem/tree/pgwire_encode.go, line 46 at r8 (raw file):

		case *DCollatedString:
			pgwireFormatStringInTuple(ctx.Buffer, dv.Contents)
		case *DBytes:

Why is a separate case needed for DBytes? Please add a comment to explain/show why the default case below is not adequate.


pkg/sql/sem/tree/pgwire_encode.go, line 72 at r8 (raw file):

			buf.WriteByte(byte(r))
		} else {
			buf.WriteRune(r)

Are you sure it's save to use WriteRun instead of EncodeChar? If there was an invalid utf-8 sequence, r will be invalid.


pkg/sql/sem/tree/pgwire_encode.go, line 94 at r8 (raw file):

	ctx.WriteByte('{')
	for i, v := range d.Array {
		if i > 0 {

ditto my previous comment about comma


pkg/sql/sem/tree/pgwire_encode.go, line 104 at r8 (raw file):

		case *DCollatedString:
			pgwireFormatStringInArray(ctx.Buffer, dv.Contents)
		case *DBytes:

ditto my previous comment about DBytes


pkg/sql/sem/tree/pgwire_encode.go, line 133 at r8 (raw file):

func pgwireQuoteStringInArray(in string) bool {
	return in == "" || strings.ToLower(in) == "null" || arrayQuoteSet.in(in)

Do not call ToLower here. It forces an allocation, also if in is very long you're doing much too much work to determine what is needed. Better compare the 4 characters manually.


pkg/util/duration/duration.go, line 174 at r4 (raw file):

Previously, BramGruneir (Bram Gruneir) wrote…

I like this way of dealing with adding spaces.

It would be even better to avoid the allocation of a closure and define wrotePrev := func(wrote) bool { .... return wrote}

then below write wrote= wrotePrev(wrote) every time.

@mjibson mjibson force-pushed the mjibson:pgwire-text branch from 100e2ae to 82c44cc Nov 20, 2018

@mjibson
Copy link
Member

mjibson left a comment

Added a skip to the initial test commit and removed it in the last commit.

Bram: we still accept dates/intervals in all the previous formats we did before. This change encodes them on the wire in the standard way.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/cli/testdata/dump/row, line 93 at r4 (raw file):

Previously, BramGruneir (Bram Gruneir) wrote…

Was the extra empty line purposefully removed?

I used the -rewrite flag to rewrite the file and it did this, so I think it's correct now.


pkg/cmd/cmp-protocol/main.go, line 68 at r7 (raw file):

Previously, knz (kena) wrote…
  1. Is there a way to include them, but have some kind of function which masks the last digits during the comparison to assert whether the test successful?

  2. I'd like to see tests for DECIMAL and FLOAT which exercise the encodings for infinities and NaN. Are there any already? Perhaps somewhere else?

  1. The difficulty with float is that sometimes it's not just the last digit, since a .99999 could change to a 1.0. Also since this test just compares bytes and doesn't attempt to decode into datums, even doing that would be tricky.

  2. Added a commit for float/decimal nan/inf.


pkg/sql/logictest/testdata/logic_test/time, line 194 at r4 (raw file):

Previously, BramGruneir (Bram Gruneir) wrote…

How do we handle -00:00:00? Is it the same as 00:00:00? Please add tests for both of them.

Tests added for both of these cases. (They are the same since -0 doesn't exist as an integer.)


pkg/sql/pgwire/types.go, line 197 at r6 (raw file):

Previously, BramGruneir (Bram Gruneir) wrote…

Can you add a test for this? Also, a comment here would be nice.

There was a test for this already, and this commit fixed it.


pkg/sql/sem/tree/interval.go, line 562 at r4 (raw file):

Previously, knz (kena) wrote…

Maybe add a comment to explain this rounding should not be tweaked unless care is taken it is the same as pg's
also add a reference to where the rounding occurs in pg (I sent you a link earlier)

Done.


pkg/sql/sem/tree/pgwire_encode.go, line 37 at r8 (raw file):

Previously, knz (kena) wrote…

the comma thing was better - more readable, also more efficient code (avoids a conditional; these are strictly worse than an extra assignment of a constant).

I'm genuinely surprised that it is more efficient, but a benchmark confirms. Reverted. Thanks for the insight.


pkg/sql/sem/tree/pgwire_encode.go, line 46 at r8 (raw file):

Previously, knz (kena) wrote…

Why is a separate case needed for DBytes? Please add a comment to explain/show why the default case below is not adequate.

Done.


pkg/sql/sem/tree/pgwire_encode.go, line 72 at r8 (raw file):

Previously, knz (kena) wrote…

Are you sure it's save to use WriteRun instead of EncodeChar? If there was an invalid utf-8 sequence, r will be invalid.

This function is called in two ways: 1) with a string from a DString or CollatedString, 2) from a datum formatted into a string. My current understanding is that none of those can ever produce an invalid utf-8 sequence. If we can come up with some SQL that causes postgres to do the equivalent of EncodeChar then I would reconsider. But until then I would prefer to keep this code as is. That is, unless we can come up with a test that exhibits this behavior in pg, I think it makes more sense to keep it like this.


pkg/sql/sem/tree/pgwire_encode.go, line 94 at r8 (raw file):

Previously, knz (kena) wrote…

ditto my previous comment about comma

Done.


pkg/sql/sem/tree/pgwire_encode.go, line 104 at r8 (raw file):

Previously, knz (kena) wrote…

ditto my previous comment about DBytes

Done.


pkg/sql/sem/tree/pgwire_encode.go, line 133 at r8 (raw file):

Previously, knz (kena) wrote…

Do not call ToLower here. It forces an allocation, also if in is very long you're doing much too much work to determine what is needed. Better compare the 4 characters manually.

Done.


pkg/sql/sem/tree/testdata/eval/overflow, line 1 at r3 (raw file):

Previously, BramGruneir (Bram Gruneir) wrote…

no first line with an explanation

Done.


pkg/sql/sem/tree/testdata/eval/row, line 1 at r3 (raw file):

Previously, BramGruneir (Bram Gruneir) wrote…

most of these have a space after the first comment

Done.


pkg/sql/sem/tree/testdata/eval/unary, line 1 at r3 (raw file):

Previously, BramGruneir (Bram Gruneir) wrote…

No first line comment

Done.


pkg/util/duration/duration.go, line 174 at r4 (raw file):

Previously, knz (kena) wrote…

It would be even better to avoid the allocation of a closure and define wrotePrev := func(wrote) bool { .... return wrote}

then below write wrote= wrotePrev(wrote) every time.

Thanks for the suggestion. Using a func defined there that doesn't depend on arguments from the surrounding func performed 10% faster. Interestingly, if I just move that func definition to the package level, it gets 10% slower. Weird.

@knz knz added this to Triage in SQL Front-end, Lang & Semantics via automation Nov 21, 2018

@knz knz moved this from Triage to Current milestone in SQL Front-end, Lang & Semantics Nov 21, 2018

@mjibson mjibson force-pushed the mjibson:pgwire-text branch 2 times, most recently from 2047a27 to 27617ba Nov 23, 2018

@mjibson

This comment has been minimized.

Copy link
Member

mjibson commented Nov 26, 2018

Added benchmarks to the array/tuple commit (since it's the only one with significant perf differences). There's overall speedup because (I think) of no longer calling into a unicode function per char. Ready for a final look.

@knz
Copy link
Member

knz left a comment

code LGTM still

  1. regarding this:

Added benchmarks to the array/tuple commit (since it's the only one with significant perf differences).

The part "it's the only one with signififcant perf differnces" -- how did you ascertain this? How can you explain me why you think this is true? Would you perhaps like to show me a copy of the benchmarks you used to form this opinion? If there were benchmarks, it wouldn't hurt including them in the PR description or commit message. If there were no benchmarks, you should still include the evidence that enables you to hold that opinion.

  1. @BramGruneir please check whether your comments have been suitably addressed

  2. for the newly added telemetry consider adding a test that the telemetry gets activated. See TestErrorCounts for an example.

Reviewed 94 of 94 files at r31, 3 of 3 files at r32, 37 of 37 files at r33, 36 of 36 files at r34, 2 of 2 files at r35, 1 of 1 files at r36, 3 of 3 files at r37, 32 of 32 files at r38, 1 of 1 files at r39, 3 of 3 files at r40, 1 of 1 files at r41.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

@BramGruneir
Copy link
Member

BramGruneir left a comment

:lgtm:

Just some new tests that I think were missing still, but I'm not sure.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/sql/logictest/testdata/logic_test/time, line 194 at r4 (raw file):

Previously, mjibson (Matt Jibson) wrote…

Tests added for both of these cases. (They are the same since -0 doesn't exist as an integer.)

I don't see the tests, but this is a huge diff, maybe I just missed it.

@mjibson mjibson force-pushed the mjibson:pgwire-text branch 2 times, most recently from 7541c6f to ab66959 Nov 26, 2018

@mjibson

This comment has been minimized.

Copy link
Member

mjibson commented Nov 26, 2018

Added a telemetry test and benchmarks for intervals. The other encoding changes were bug fixes so I'm not going to benchmark them.

The interval benchmarks are ~30% slower. I think this is acceptable due to postgres parity.

@knz

This comment has been minimized.

Copy link
Member

knz commented Nov 27, 2018

Thanks for adding the benchmark. If you could add the before/after comparison at the top level PR description that would be swell.

mjibson added some commits Nov 1, 2018

pgwire: enable text encoding tests in generate-binary
These don't pass as of this commit and are thus skipped.

Release note: None
pgwire: remove time output from dates
Release note (bug fix): dates no longer have a time component in their
text encoding over the wire.
pgwire: use postgres instead of Go interval encoding for pgwire
These changes match postgres for all tested inputs.

Benchmarks are slower across the board which makes sense due to the
div/mod nanosecond section that is now not conditional.

name                                                    old time/op  new time/op  delta
Encodings/'10y10mon'::interval/text-8                    836ns ± 4%  1055ns ± 4%   +26.19%  (p=0.008 n=5+5)
Encodings/'1y1mon'::interval/text-8                      830ns ± 6%  1019ns ±15%   +22.78%  (p=0.008 n=5+5)
Encodings/'23:12:34'::interval/text-8                   1.00µs ± 5%  1.10µs ± 9%    +9.18%  (p=0.032 n=5+5)
Encodings/'-23:00:00'::interval/text-8                   791ns ± 7%  1016ns ± 5%   +28.56%  (p=0.008 n=5+5)
Encodings/'-10d'::interval/text-8                        798ns ±12%   958ns ± 3%   +20.03%  (p=0.008 n=5+5)
Encodings/'1ms'::interval/text-8                         765ns ± 3%  1418ns ± 6%   +85.36%  (p=0.008 n=5+5)
Encodings/'.2ms'::interval/text-8                        777ns ± 3%  1387ns ± 4%   +78.39%  (p=0.008 n=5+5)
Encodings/'-1mon1m'::interval/text-8                     898ns ± 5%  1386ns ± 9%   +54.29%  (p=0.008 n=5+5)
Encodings/'-1y1m'::interval/text-8                       900ns ± 7%  1380ns ± 1%   +53.34%  (p=0.008 n=5+5)
Encodings/'3y4mon5d6ms'::interval/text-8                1.13µs ± 2%  2.23µs ±13%   +97.24%  (p=0.008 n=5+5)
Encodings/'296537y20d15h30m7s'::interval/text-8         1.35µs ± 8%  1.77µs ± 8%   +31.03%  (p=0.008 n=5+5)
Encodings/'-2965y_-20d_-15h_-30m_-7s'::interval/text-8  1.39µs ± 6%  1.67µs ± 4%   +19.92%  (p=0.008 n=5+5)
Encodings/'00:00:00'::interval/text-8                    566ns ± 3%   583ns ± 3%      ~     (p=0.175 n=5+5)

Release note (bug fix): intervals now match postgres in their text
encoding over the wire.

Release note (bug fix): intervals no longer sometimes lose 1ns of
precision. This only happened rarely due to floating point inaccuracy.
tree: change TestEval to use a data directory
Using the giant list is cumbersome and makes fixing output
difficult. Using datadriven gives us some additional organization and
allows us to use the -rewrite flag when the format of a datum changes.
pgwire: test timezones and BC for timestamp
timestamptz differs slightly but isn't broken, which is why it's
commented out.
pgwire: verify encoding length instead of ignoring it in TestEncodings
Improved output and smarter type handling as well.
pgwire: fix array and tuple encoding
Match postgres encoding much more closely for tuples and arrays.

Implement binary tuple encoding. It's pretty simple: number of elements
then each element's OID and binary encoding.

Text encoding for arrays and tuples has some fairly simple quoting rules:
whitespace, quotes, or commas. We were also encoding unicode runes more
than needed.

Unwray datums to remove any OID oddities. This is already done in
non-array/tuple encoding.

Change the encoding tests to specify exactly the bytes the Text format
wants. We still keep the text format around in the encodings.json file
just for human consumption. Since text editors have problems displaying
weirdo things like the byte order mark it's safer to just print out
the expected bytes as numbers instead of hoping there's a value there
that our editors can't show. But in any case switch to %q for the Text
display to try to improve the debugability anyway.

The benchmark below is faster in cases where the type in the array
or tuple is a string. I think this is because of the removed call to
unicode.IsGraphic on each rune. The slower benchmark is mostly ints,
which now go through the same escape and quote check logic as strings,
which would explain the slowdown. We could special case those to improve
performance further.

name                                                     old time/op  new time/op  delta
Encodings/array['']/text-8                                511ns ± 8%   475ns ± 4%     ~     (p=0.095 n=5+5)
Encodings/array['\x0d53e338548082'::BYTEA]/text-8         846ns ±10%   604ns ± 9%  -28.57%  (p=0.008 n=5+5)
Encodings/array['test_with_spaces'::BYTEA]/text-8         883ns ± 3%   719ns ±10%  -18.66%  (p=0.008 n=5+5)
Encodings/array['name'::NAME]/text-8                      893ns ± 6%   599ns ±11%  -32.95%  (p=0.008 n=5+5)
Encodings/array[NULL]::text[]/text-8                      615ns ± 9%   523ns ± 2%  -14.99%  (p=0.008 n=5+5)
Encodings/array['']::text[]/text-8                        587ns ± 1%   553ns ± 9%     ~     (p=0.151 n=5+5)
Encodings/array['test']::text[]/text-8                    651ns ± 3%   579ns ± 8%  -10.97%  (p=0.008 n=5+5)
Encodings/array['test_with_spaces']::text[]/text-8        837ns ± 9%   688ns ± 7%  -17.79%  (p=0.008 n=5+5)
Encodings/array[e'\f']::text[]/text-8                     634ns ± 2%   550ns ± 8%  -13.27%  (p=0.008 n=5+5)
Encodings/array[e'\uFEFF']::text[]/text-8                1.18µs ± 4%  0.57µs ± 3%  -51.94%  (p=0.008 n=5+5)
Encodings/array[e'\u2603']::text[]/text-8                 799ns ± 3%   573ns ± 7%  -28.25%  (p=0.008 n=5+5)
Encodings/(1::int8,2::int8,3::int8,4::int8,null)/text-8   488ns ± 7%   619ns ± 3%  +26.99%  (p=0.008 n=5+5)
Encodings/('test_with_spaces'::BYTEA,null)/text-8         720ns ±27%   460ns ± 8%  -36.12%  (p=0.008 n=5+5)
Encodings/('test_with_spaces'::TEXT,null)/text-8          517ns ± 9%   437ns ±10%  -15.51%  (p=0.008 n=5+5)
Encodings/('name'::NAME,null)/text-8                      525ns ± 6%   325ns ± 7%  -38.02%  (p=0.008 n=5+5)
Encodings/('false'::JSONB,null)/text-8                   1.04µs ± 7%  0.63µs ± 6%  -39.00%  (p=0.008 n=5+5)
Encodings/('{"a":_[]}'::JSONB,null)/text-8               1.25µs ± 6%  0.75µs ±13%  -40.44%  (p=0.008 n=5+5)

Fixes #31847

Release note (bug fix): Correct pgwire encoding for arrays and tuples.
pgwire: encoding tests/fixes for decimal and float nan/inf
Release note (bug fix): Correct binary decimal encoding for NaN.
pgwire: add encoding benchmarks
These use the same machinery as the encoding tests, so factor out some
common code.

Release note: None

@mjibson mjibson force-pushed the mjibson:pgwire-text branch from ab66959 to 8c7a60f Nov 27, 2018

@mjibson

This comment has been minimized.

Copy link
Member

mjibson commented Nov 27, 2018

bors r+

craig bot pushed a commit that referenced this pull request Nov 27, 2018

Merge #32144
32144: pgwire text encoding fixes r=mjibson a=mjibson

Most commits here are bug fixes or test improvements. Arrays/tuples and intervals got significant changes to match postgres (but they weren't broken before), and are benchmarked below.

Array encoding delta:

```
name                                                     old time/op  new time/op  delta
Encodings/array['']/text-8                                511ns ± 8%   475ns ± 4%     ~     (p=0.095 n=5+5)
Encodings/array['\x0d53e338548082'::BYTEA]/text-8         846ns ±10%   604ns ± 9%  -28.57%  (p=0.008 n=5+5)
Encodings/array['test_with_spaces'::BYTEA]/text-8         883ns ± 3%   719ns ±10%  -18.66%  (p=0.008 n=5+5)
Encodings/array['name'::NAME]/text-8                      893ns ± 6%   599ns ±11%  -32.95%  (p=0.008 n=5+5)
Encodings/array[NULL]::text[]/text-8                      615ns ± 9%   523ns ± 2%  -14.99%  (p=0.008 n=5+5)
Encodings/array['']::text[]/text-8                        587ns ± 1%   553ns ± 9%     ~     (p=0.151 n=5+5)
Encodings/array['test']::text[]/text-8                    651ns ± 3%   579ns ± 8%  -10.97%  (p=0.008 n=5+5)
Encodings/array['test_with_spaces']::text[]/text-8        837ns ± 9%   688ns ± 7%  -17.79%  (p=0.008 n=5+5)
Encodings/array[e'\f']::text[]/text-8                     634ns ± 2%   550ns ± 8%  -13.27%  (p=0.008 n=5+5)
Encodings/array[e'\uFEFF']::text[]/text-8                1.18µs ± 4%  0.57µs ± 3%  -51.94%  (p=0.008 n=5+5)
Encodings/array[e'\u2603']::text[]/text-8                 799ns ± 3%   573ns ± 7%  -28.25%  (p=0.008 n=5+5)
Encodings/(1::int8,2::int8,3::int8,4::int8,null)/text-8   488ns ± 7%   619ns ± 3%  +26.99%  (p=0.008 n=5+5)
Encodings/('test_with_spaces'::BYTEA,null)/text-8         720ns ±27%   460ns ± 8%  -36.12%  (p=0.008 n=5+5)
Encodings/('test_with_spaces'::TEXT,null)/text-8          517ns ± 9%   437ns ±10%  -15.51%  (p=0.008 n=5+5)
Encodings/('name'::NAME,null)/text-8                      525ns ± 6%   325ns ± 7%  -38.02%  (p=0.008 n=5+5)
Encodings/('false'::JSONB,null)/text-8                   1.04µs ± 7%  0.63µs ± 6%  -39.00%  (p=0.008 n=5+5)
Encodings/('{"a":_[]}'::JSONB,null)/text-8               1.25µs ± 6%  0.75µs ±13%  -40.44%  (p=0.008 n=5+5)
```

Interval encoding delta:

```
name                                                    old time/op  new time/op  delta
Encodings/'10y10mon'::interval/text-8                    836ns ± 4%  1055ns ± 4%   +26.19%  (p=0.008 n=5+5)
Encodings/'1y1mon'::interval/text-8                      830ns ± 6%  1019ns ±15%   +22.78%  (p=0.008 n=5+5)
Encodings/'23:12:34'::interval/text-8                   1.00µs ± 5%  1.10µs ± 9%    +9.18%  (p=0.032 n=5+5)
Encodings/'-23:00:00'::interval/text-8                   791ns ± 7%  1016ns ± 5%   +28.56%  (p=0.008 n=5+5)
Encodings/'-10d'::interval/text-8                        798ns ±12%   958ns ± 3%   +20.03%  (p=0.008 n=5+5)
Encodings/'1ms'::interval/text-8                         765ns ± 3%  1418ns ± 6%   +85.36%  (p=0.008 n=5+5)
Encodings/'.2ms'::interval/text-8                        777ns ± 3%  1387ns ± 4%   +78.39%  (p=0.008 n=5+5)
Encodings/'-1mon1m'::interval/text-8                     898ns ± 5%  1386ns ± 9%   +54.29%  (p=0.008 n=5+5)
Encodings/'-1y1m'::interval/text-8                       900ns ± 7%  1380ns ± 1%   +53.34%  (p=0.008 n=5+5)
Encodings/'3y4mon5d6ms'::interval/text-8                1.13µs ± 2%  2.23µs ±13%   +97.24%  (p=0.008 n=5+5)
Encodings/'296537y20d15h30m7s'::interval/text-8         1.35µs ± 8%  1.77µs ± 8%   +31.03%  (p=0.008 n=5+5)
Encodings/'-2965y_-20d_-15h_-30m_-7s'::interval/text-8  1.39µs ± 6%  1.67µs ± 4%   +19.92%  (p=0.008 n=5+5)
Encodings/'00:00:00'::interval/text-8                    566ns ± 3%   583ns ± 3%      ~     (p=0.175 n=5+5)
```

Co-authored-by: Matt Jibson <matt.jibson@gmail.com>
@craig

This comment has been minimized.

Copy link

craig bot commented Nov 27, 2018

Build succeeded

@craig craig bot merged commit 8c7a60f into cockroachdb:master Nov 27, 2018

3 checks passed

GitHub CI (Cockroach) TeamCity build finished
Details
bors Build succeeded
Details
license/cla Contributor License Agreement is signed.
Details

@mjibson mjibson deleted the mjibson:pgwire-text branch Nov 27, 2018

@knz knz moved this from Current milestone to Finished (m2.2-2) in SQL Front-end, Lang & Semantics Dec 3, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment