Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: don't panic on Iterator.Close #84449

Merged
merged 1 commit into from
Jul 15, 2022
Merged

Conversation

jbowens
Copy link
Collaborator

@jbowens jbowens commented Jul 14, 2022

Don't panic if Iterator.Close returns a non-nil error. Iteration errors should
already be surfaced through (*pebble.Iterator).Error and
(storage.EngineIterator).Valid. Panicking may crash the node for retriable,
ephemeral I/O errors, especially when reading from sstables over the network.

In a follow up, we should make sure we're still appropriately panicking for
non-retriable errors.

Fix #84396.

Release note (bug fix): fix bug where an ephemeral I/O error could crash a
node.

@jbowens jbowens requested a review from a team as a code owner July 14, 2022 20:16
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

panic(err)
}
_ = p.iter.Close()
// TODO(jackson): The error returned by iter.Close() may be an ephemeral
Copy link
Contributor

@erikgrinaker erikgrinaker Jul 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should call out that if Valid() returns an error, then that same error will also be returned by Close(). In most cases, we do not care about that error during Close, since the caller may already have handled it appropriately (e.g. a context cancellation) but still wants to close the iterator via e.g. defer.

Don't panic if Iterator.Close returns a non-nil error. Iteration errors should
already be surfaced through (*pebble.Iterator).Error and
(storage.EngineIterator).Valid. Panicking may crash the node for retriable,
ephemeral I/O errors, especially when reading from sstables over the network.

In a follow up, we should make sure we're still appropriately panicking for
non-retriable errors.

Fix cockroachdb#84396.

Release note (bug fix): fix bug where an ephemeral I/O error could crash a
node.
Copy link
Collaborator

@nicktrav nicktrav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jbowens)

@jbowens
Copy link
Collaborator Author

jbowens commented Jul 15, 2022

TFTRs!

bors r=nicktrav,erikgrinaker

@craig
Copy link
Contributor

craig bot commented Jul 15, 2022

Build succeeded:

@craig craig bot merged commit e9ee218 into cockroachdb:master Jul 15, 2022
nicktrav added a commit to nicktrav/cockroach that referenced this pull request Jul 25, 2022
In cockroachdb#84449, the behavior of the `pebbleIterator` was altered to ignore
all errors when closing an iterator. This was done to avoid panic-ing
when the iterator encounters an ephemeral (i.e. retriable) error that
remains pinned to the iterator until it is closed (see cockroachdb#84396 for the
motivation).

Partially address the TODO added in cockroachdb#84449 by panic-ing on errors that
are known to be fatal (such as corruption). Addressing the TODO in its
entirety would likely require enumerating known ephemeral errors and
allowing such errors to be ignored on close, as well as implementing
cockroachdb/pebble#1811 to avoid pinning errors to the iterator even
after they have been handled.

Fix cockroachdb#84479.

Release note: None.
craig bot pushed a commit that referenced this pull request Jul 25, 2022
84590: *: upgrade to go 1.18.4 r=knz a=rickystewart

* [ ] Adjust the Pebble tests to run in new version.
* [x] Adjust version in Docker image ([source](./builder/Dockerfile)).
* [x] Adjust version in the TeamCity agent image ([setup script](./packer/teamcity-agent.sh))
* [ ] Rebuild and push the Docker image (following [Basic Process](#basic-process))
* [x] Download ALL the archives (`.tar.gz`, `.zip`) for the new Go version from https://golang.org/dl/ and mirror them in the `public-bazel-artifacts` bucket in the `Bazel artifacts` project in GCP (sub-directory `go`, next to the other Go SDK's).
* [x] Bump the version in `WORKSPACE` under `go_download_sdk`. You may need to bump [rules_go](https://github.com/bazelbuild/rules_go/releases). Also edit the filenames listed in `sdks` and update all the hashes to match what you mirrored in the step above.
* [x] Run `./dev generate bazel` to refresh `distdir_files.bzl`, then `bazel fetch `@distdir//:archives`` to ensure you've updated all hashes to the correct value.
* [ ] Bump the version in `builder.sh` accordingly ([source](./builder.sh#L6)).
* [x] Bump the version in `go-version-check.sh` ([source](./go-version-check.sh)), unless bumping to a new patch release.
* [ ] Bump the go version in `go.mod`. You may also need to rerun `make vendor_rebuild` if vendoring has changed.
* [x] Bump the default installed version of Go in `bootstrap-debian.sh` ([source](./bootstrap/bootstrap-debian.sh)).
* [x] Replace other mentions of the older version of go (grep for `golang:<old_version>` and `go<old_version>`).
* [ ] Update the `builder.dockerImage` parameter in the TeamCity [`Cockroach`](https://teamcity.cockroachdb.com/admin/editProject.html?projectId=Cockroach&tab=projectParams) and [`Internal`](https://teamcity.cockroachdb.com/admin/editProject.html?projectId=Internal&tab=projectParams) projects.
* [ ] Ask the Developer Infrastructure team to deploy new TeamCity agent images according to [packer/README.md](./packer/README.md)

Release note (build change): Upgrade to golang 1.18.4

84723: opt: build UDF expressions r=mgartner a=mgartner

#### opt: set opt tester search path to empty

`OptTester` now sets its `SemaContext`'s `SearchPath` to
`EmptySearchPath`, instead of `nil`, to avoid nil pointer exceptions
when resolving unknown functions.

Release note: None

#### opt: build UDF expressions

This commit adds basic support for building UDFs in optbuilder. Only
scalar, nullary (arity of zero) functions with a single statement in the
body are supported. Support for more types of UDFs will follow in future
commits. Note that this commit does not add support for execution of
UDFs, only building them within an optimizer expression.

Release note: None

84913: storage: panic on iterator close when encountering corruption r=jbowens a=nicktrav

In #84449, the behavior of the `pebbleIterator` was altered to ignore
all errors when closing an iterator. This was done to avoid panic-ing
when the iterator encounters an ephemeral (i.e. retriable) error that
remains pinned to the iterator until it is closed (see #84396 for the
motivation).

Partially address the TODO added in #84449 by panic-ing on errors that
are known to be fatal (such as corruption). Addressing the TODO in its
entirety would likely require enumerating known ephemeral errors and
allowing such errors to be ignored on close, as well as implementing
cockroachdb/pebble#1811 to avoid pinning errors to the iterator even
after they have been handled.

Fix #84479.

Release note: None.

84917: outliers: rename to insights r=matthewtodd a=matthewtodd

The UI design that includes outliers also features other insights about
sql execution. We will expand this outliers subsystem to support making
those insights as well (probably with more detectors, changing the
signature / return type of `isOutlier`). Renaming now is the first step
in that direction.

Release note: None

84987: kvserver: send range GC requests when point requests are absent r=aliher1911 a=aliher1911

Previously range GC requests were not sent if point GC keys were not requested as well.
This patch fixes the problem by checking that any of parameters could be specified.

Release note: None

85011: sql: remove a reference to issue 77733 r=ajwerner a=knz

Fixes  #77733.

Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Nick Travers <travers@cockroachlabs.com>
Co-authored-by: Matthew Todd <todd@cockroachlabs.com>
Co-authored-by: Oleg Afanasyev <oleg@cockroachlabs.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

storage: pebbleIterator panics too eagerly on Close()
4 participants