Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exec: Handle NULLS in TopK sorter #38654

Merged
merged 1 commit into from Jul 8, 2019

Conversation

@rohany
Copy link
Contributor

rohany commented Jul 3, 2019

This commit fixes NULLs in the TopK sorter by avoiding use
of the vec copy method, which has a bug. Instead, we add
a set method to the vec comparator, and use the templatized
comparator to perform the sets that the TopK sorter needs.

To facilitate this, we add an UnsetNull method to the Nulls
object. However, use of this method results in HasNull()
maybe returning true even if the vector doesn't have nulls.
This behavior already occurs when selection vectors are used.
Based on discussions with @solongordon and @asubiotto, this behavior
is OK, and future PR's will attempt to make this behavior better, and address
the bugs within the Vec Copy method.

@rohany rohany requested review from solongordon, asubiotto and cockroachdb/sql-execution Jul 3, 2019
@cockroach-teamcity

This comment has been minimized.

Copy link
Member

cockroach-teamcity commented Jul 3, 2019

This change is Reviewable

Copy link
Contributor

solongordon left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @rohany, and @solongordon)


pkg/sql/exec/vec_comparators.go, line 26 at r1 (raw file):

	// set sets the value of the vector at dstVecIdx at index dstIdx to the value
	// at the vector at srcVecIdx at index srcIdx.
	set(srcVecIdx, dstVecIdx int, srcIdx, dstIdx uint16)

Nit: How about srcValIdx and dstValIdx to match the argument names in compare?


pkg/sql/exec/vec_comparators_tmpl.go, line 82 at r1 (raw file):

func (c *_TYPEVecComparator) set(srcVecIdx, dstVecIdx int, srcIdx, dstIdx uint16) {
	if c.nulls[srcVecIdx].HasNulls() && c.nulls[srcVecIdx].NullAt(srcIdx) {

I think it's possible for nulls[srcVecIdx] or nulls[dstVecIdx] to be nil if they don't have nulls. Could you test this and handle those cases if necessary?


pkg/sql/exec/coldata/nulls.go, line 156 at r1 (raw file):

// UnsetNull64 unsets the ith values of the column.
func (n *Nulls) UnsetNull64(i uint64) {
	n.nulls[i/8] |= ^flippedBitMask[i%8]

^flippedBitMask is just bitMask.

@rohany rohany force-pushed the rohany:topknulls branch from 2665a78 to 88801c1 Jul 3, 2019
Copy link
Contributor Author

rohany left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto and @solongordon)


pkg/sql/exec/vec_comparators_tmpl.go, line 82 at r1 (raw file):

Previously, solongordon (Solon) wrote…

I think it's possible for nulls[srcVecIdx] or nulls[dstVecIdx] to be nil if they don't have nulls. Could you test this and handle those cases if necessary?

Done.


pkg/sql/exec/coldata/nulls.go, line 156 at r1 (raw file):

Previously, solongordon (Solon) wrote…

^flippedBitMask is just bitMask.

Done.

Copy link
Contributor

yuzefovich left a comment

Nice work!

[nit]: I think a release note is not necessary for this fix (and a missing period if you decide to keep it).

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto and @solongordon)

This commit fixes NULLs in the TopK sorter by avoiding use
of the vec copy method, which has a bug. Instead, we add
a set method to the vec comparator, and use the templatized
comparator to perform the sets that the TopK sorter needs.

To facilitate this, we add an UnsetNull method to the Nulls
object. However, use of this method results in HasNull()
maybe returning true even if the vector doesn't have nulls.
This behavior already occurs when selection vectors are used.

Release note: none
@rohany rohany force-pushed the rohany:topknulls branch from 88801c1 to 3c0aa0b Jul 3, 2019
@rohany

This comment has been minimized.

Copy link
Contributor Author

rohany commented Jul 3, 2019

alright, I removed the release note. Thanks for the look!

@rohany rohany requested a review from cockroachdb/sql-execution Jul 3, 2019
Copy link
Contributor

solongordon left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)

@rohany

This comment has been minimized.

Copy link
Contributor Author

rohany commented Jul 8, 2019

bors r+

craig bot pushed a commit that referenced this pull request Jul 8, 2019
38557: exec: protect against unset syncFlowConsumer r=jordanlewis a=asubiotto

This should never happen since it implies that the receiver isn't
connected correctly. These happen when a node sends a SetupFlow request
to a remote node where the spec specifies that the response is on that
remote node. We don't see panics in the row execution engine due to
wrapping the syncFlowConsumer with a copyingRowReceiver, but this state
can cause setupVectorized to panic.

This commit protects against this state pending further investigation.

Release note: None

38654: exec: Handle NULLS in TopK sorter r=rohany a=rohany

This commit fixes NULLs in the TopK sorter by avoiding use
of the vec copy method, which has a bug. Instead, we add
a set method to the vec comparator, and use the templatized
comparator to perform the sets that the TopK sorter needs.

To facilitate this, we add an UnsetNull method to the Nulls
object. However, use of this method results in HasNull()
maybe returning true even if the vector doesn't have nulls.
This behavior already occurs when selection vectors are used.
Based on discussions with @solongordon and @asubiotto, this behavior
is OK, and future PR's will attempt to make this behavior better, and address
the bugs within the Vec Copy method.

38710: errors: fix the formatting with %+v r=knz a=knz

(found by @RaduBerinde; needed to complete #38570)

The new library `github.com/cockroachdb/errors` was not implementing
`%+v` formatting properly for assertion and unimplemented errors.
The wrong implementation was hiding the details of the cause
of these two error types from the formatting logic.

Fixing this bug comprehensively required completing the investigation
of the Go 2 / `xerrors` error proposal. This revealed that the
implementation of `fmt.Formatter` for wrapper errors (a `Format()`
method) is required in all cases, at least until Go's stdlib
learns about `errors.Formatter`. More details at
golang/go#29934 and this commit message: cockroachdb/errors@78b6caa.

This patch bumps the dependency `github.com/cockroachdb/errors` to
pick up the fixes to assertion failures and unimplemented errors.

The new definition of `errors.FormatError()` subsequently required
re-implemening `Format)` for `pgerros.withCandidateCode`, which is
also done here.

Finally, this patch also picks up `errors.As()` and the new
streamlined `fmt.Formatter` / `errors.Formatter` interaction, so this
patch also simplifies a few custom error types in CockroachDB
accordingly.

Release note: None

Co-authored-by: Alfonso Subiotto Marqués <alfonso@cockroachlabs.com>
Co-authored-by: Rohan Yadav <rohany@alumni.cmu.edu>
Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
@craig

This comment has been minimized.

Copy link

craig bot commented Jul 8, 2019

Build failed (retrying...)

craig bot pushed a commit that referenced this pull request Jul 8, 2019
38654: exec: Handle NULLS in TopK sorter r=rohany a=rohany

This commit fixes NULLs in the TopK sorter by avoiding use
of the vec copy method, which has a bug. Instead, we add
a set method to the vec comparator, and use the templatized
comparator to perform the sets that the TopK sorter needs.

To facilitate this, we add an UnsetNull method to the Nulls
object. However, use of this method results in HasNull()
maybe returning true even if the vector doesn't have nulls.
This behavior already occurs when selection vectors are used.
Based on discussions with @solongordon and @asubiotto, this behavior
is OK, and future PR's will attempt to make this behavior better, and address
the bugs within the Vec Copy method.

38710: errors: fix the formatting with %+v r=knz a=knz

(found by @RaduBerinde; needed to complete #38570)

The new library `github.com/cockroachdb/errors` was not implementing
`%+v` formatting properly for assertion and unimplemented errors.
The wrong implementation was hiding the details of the cause
of these two error types from the formatting logic.

Fixing this bug comprehensively required completing the investigation
of the Go 2 / `xerrors` error proposal. This revealed that the
implementation of `fmt.Formatter` for wrapper errors (a `Format()`
method) is required in all cases, at least until Go's stdlib
learns about `errors.Formatter`. More details at
golang/go#29934 and this commit message: cockroachdb/errors@78b6caa.

This patch bumps the dependency `github.com/cockroachdb/errors` to
pick up the fixes to assertion failures and unimplemented errors.

The new definition of `errors.FormatError()` subsequently required
re-implemening `Format)` for `pgerros.withCandidateCode`, which is
also done here.

Finally, this patch also picks up `errors.As()` and the new
streamlined `fmt.Formatter` / `errors.Formatter` interaction, so this
patch also simplifies a few custom error types in CockroachDB
accordingly.

Release note: None

Co-authored-by: Rohan Yadav <rohany@alumni.cmu.edu>
Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
@craig

This comment has been minimized.

Copy link

craig bot commented Jul 8, 2019

Build failed (retrying...)

@rohany

This comment has been minimized.

Copy link
Contributor Author

rohany commented Jul 8, 2019

bors r+

@craig

This comment has been minimized.

Copy link

craig bot commented Jul 8, 2019

Already running a review

craig bot pushed a commit that referenced this pull request Jul 8, 2019
37199: storage: propagate errors from contentionQueue, catch stalls in roachtest r=nvanbenschoten a=nvanbenschoten

Informs #36089.

The PR is split into a series of commits. The first fixes part of a bug that was causing #36089  to fail (thanks to #36748) and the second improves the test to have a more obvious failure condition for this class of bug in the future. The third, fifth, and sixth clean up code. Finally, the fourth fixes another bug that could cause issues with #36089.

Before the first commit, requests could get stuck repeatedly attempting to push a transaction only to repeatedly find that they themselves were already aborted. The error would not propagate up to the transaction coordinator and the request would get stuck. This commit fixes this behavior by correctly propagating errors observed by the `contentionQueue`.

The second commit bumps the TxnLivenessThreshold for clusters running `kv/contention/nodes=4` to 10 minutes. This is sufficiently large such that if at any point a transaction is abandoned then all other transactions will begin waiting for it, throughput will drop to 0 for 10 straight minutes, and the test will fail to achieve its minimum QPS requirement.

The fourth commit instructs pushers in the `txnwait.Queue` to inform all other pushers that are waiting for the same transaction when it observes an ABORTED transaction. I never saw this cause issues with #36089, but it seems very possible that it could given frequent tscache rotations.

38397: sql: deflake TestLogic//crdb_internal/max_retry_counter r=knz a=knz

Fixes #38062.

Release note: None

38654: exec: Handle NULLS in TopK sorter r=rohany a=rohany

This commit fixes NULLs in the TopK sorter by avoiding use
of the vec copy method, which has a bug. Instead, we add
a set method to the vec comparator, and use the templatized
comparator to perform the sets that the TopK sorter needs.

To facilitate this, we add an UnsetNull method to the Nulls
object. However, use of this method results in HasNull()
maybe returning true even if the vector doesn't have nulls.
This behavior already occurs when selection vectors are used.
Based on discussions with @solongordon and @asubiotto, this behavior
is OK, and future PR's will attempt to make this behavior better, and address
the bugs within the Vec Copy method.

38725: cli/dump: more clearly inform the user upon tables with no visible columns r=knz a=knz

Informs #37768.
Informs #28948.
This is coming up quite often on support, lately again on gitter and forum https://forum.cockroachlabs.com/t/error-while-dumping-core-backup/2901/3.

This PR aims to lessen the burden on support and propose a clear "next action" for the user.

Before:

```
kena@kenax ~/cockroach % ./cockroach dump --insecure defaultdb
CREATE TABLE t (,
        FAMILY "primary" (rowid)
);
Error: pq: at or near "from": syntax error
Failed running "dump"
```

After:

```
kena@kenax ~/cockroach % ./cockroach dump --insecure defaultdb
CREATE TABLE t (,
        FAMILY "primary" (rowid)
);
Error: table "defaultdb.public.t" has no visible columns
HINT: To proceed with the dump, either omit this table from the list of tables to dump, drop the table, or add some visible columns.
--
See: #37768
Failed running "dump"
```

Release note (cli change): `cockroach dump` will now more clearly
refer to issue #37768 when it encounters a table with no visible
columns, which (currently) cannot be dumped successfully.

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
Co-authored-by: Rohan Yadav <rohany@alumni.cmu.edu>
@craig

This comment has been minimized.

Copy link

craig bot commented Jul 8, 2019

Build succeeded

@craig craig bot merged commit 3c0aa0b into cockroachdb:master Jul 8, 2019
5 checks passed
5 checks passed
GitHub CI (Cockroach) TeamCity build finished
Details
bors Build succeeded
Details
license/cla Contributor License Agreement is signed.
Details
security/snyk - Gopkg.lock (Cockroach Labs) Test PR is disabled for this project
security/snyk - pkg/ui/package.json (Cockroach Labs) Test PR is disabled for this project
craig bot pushed a commit that referenced this pull request Jul 8, 2019
38669: exec: Update NULL handling in Vec Copy r=rohany a=rohany

The copy method in Vec previously unset ALL NULL values in the
destination vector, and then proceeded to set the appropriate
NULL values from the source vector, which is incorrect. To address
this problem, this commit adds a UnsetNullRange operation to
the Nulls object. The operation unsets all null values within an
input range, but is unable to smartly/efficiently identify when
the vector no longer contains any null values. The best way
to address this problem can be left to future PR's and discussion.

Depends on #38654. Until that is landed, look at the latest commit to review.


Co-authored-by: Rohan Yadav <rohany@alumni.cmu.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.