Skip to content

Commit

Permalink
Browse files Browse the repository at this point in the history
105474: concurrency: hoist TxnMeta from {,un}replicatedLockInfo into the holder  r=nvanbenschoten a=arulajmani

Previously, we were storing the TxnMeta separately for both
{,un}replicatedLockInfo. The lock table is quite dumb when it comes to
replicated locks -- for good reason. As a result, tracking TxnMeta's
for replicated locks isn't of much use, as we don't make use of it,
and this patch does exactly that. This also allows us to hoist the
TxnMeta object one level higher, into the holder struct.

Epic: none

Release note: None

107211: ci: add some retries for `git fetch`es r=rail a=rickystewart

Rarely these can fail. Add retries.

Closes #107087

Epic: none
Releae note: None

107231: server: fix recently introduced bug r=yuzefovich a=yuzefovich

Over in 609230c on `TestTenant.TracerI` we returned the function rather than the underlying tracer.

Epic: None

Release note: None

107249: storage: fix PebbleFileRegistry bug that drops entry on rollover r=jbowens a=sumeerbhola

The writeToRegistryFile method first writes the new batch, containing file mappings, to the registry file, and then if the registry file is too big, creates a new registry file. The new registry file is populated with the contents of the map, which doesn't yet contain the edits in the batch, resulting in a loss of these edits when the file registry is reopened. This PR changes the logic to first rollover if the registry file is too big, and then writes the batch to the new file.

This bug has existed since the record writer based registry was implemented 239377a. When it leads to a loss of a file mapping in the registry, it will be noticed by Pebble as a corruption (so not a silent failure) since the file corresponding to the mapping will be assumed to be unencrypted, but can't be successfully read as an unencrypted file. Since we have not seen this occur in production settings, we suspect that an observable mapping loss is rare because compactions typically rewrite the files in those lost mappings before the file registry is reopened.

Epic: none

Fixes: #106617

Release note: None

107259: changefeedccl: Treat drop descriptors as terminal r=miretskiy a=miretskiy

Prior to this change, changefeed would sometimes treat droped descriptors (ErrDroppedDescriptor) as a terminal error, and sometimes it would be treated as a retryable error (though, upon retry, the error would be upgraded to terminal).

This PR cleans up this logic and ensures that any
"dropped descriptor" error is treated as terminal.

Informs https://github.com/cockroachlabs/support/issues/2408 
Release note: None

Co-authored-by: Arul Ajmani <arulajmani@gmail.com>
Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: sumeerbhola <sumeer@cockroachlabs.com>
Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
  • Loading branch information
6 people committed Jul 20, 2023
6 parents 5acb999 + 2e8016d + c096bd7 + ab7aa17 + 0e5d38a + 18ac60a commit 261e785
Show file tree
Hide file tree
Showing 52 changed files with 1,183 additions and 1,007 deletions.
10 changes: 9 additions & 1 deletion build/teamcity-support.sh
Expand Up @@ -228,7 +228,15 @@ get_upstream_branch() {
}

changed_go_pkgs() {
git fetch --quiet origin
n=0
until git fetch --quiet origin; do
n=$((n+1))
if [ "$n" -ge 3 ]; then
echo "Could not fetch from GitHub"
exit 1
fi
sleep 5
done
upstream_branch=$(get_upstream_branch)
# Find changed packages, minus those that have been removed entirely. Note
# that the three-dot notation means we are diffing against the merge-base of
Expand Down
10 changes: 9 additions & 1 deletion build/teamcity/cockroach/ci/tests/maybe_stress.sh
Expand Up @@ -8,7 +8,15 @@ source "$dir/teamcity-support.sh" # For $root, would_stress
source "$dir/teamcity-bazel-support.sh" # For run_bazel

if would_stress; then
git fetch origin master
n=0
until git fetch origin master; do
n=$((n+1))
if [ "$n" -ge 3 ]; then
echo "Could not fetch from GitHub"
exit 1
fi
sleep 5
done
tc_start_block "Run stress tests"
run_bazel env BUILD_VCS_NUMBER="$BUILD_VCS_NUMBER" build/teamcity/cockroach/ci/tests/maybe_stress_impl.sh stress
tc_end_block "Run stress tests"
Expand Down
10 changes: 9 additions & 1 deletion build/teamcity/cockroach/ci/tests/maybe_stressrace.sh
Expand Up @@ -8,7 +8,15 @@ source "$dir/teamcity-support.sh" # For $root, would_stress
source "$dir/teamcity-bazel-support.sh" # For run_bazel

if would_stress; then
git fetch origin master
n=0
until git fetch origin master; do
n=$((n+1))
if [ "$n" -ge 3 ]; then
echo "Could not fetch from GitHub"
exit 1
fi
sleep 5
done
tc_start_block "Run stress tests"
run_bazel env BUILD_VCS_NUMBER="$BUILD_VCS_NUMBER" build/teamcity/cockroach/ci/tests/maybe_stress_impl.sh stressrace
tc_end_block "Run stress tests"
Expand Down
5 changes: 5 additions & 0 deletions pkg/ccl/changefeedccl/cdcevent/rowfetcher_cache.go
Expand Up @@ -109,6 +109,11 @@ func refreshUDT(
tableDesc, err = collection.ByIDWithLeased(txn).WithoutNonPublic().Get().Table(ctx, tableID)
return err
}); err != nil {
if errors.Is(err, catalog.ErrDescriptorDropped) {
// Dropped descriptors are a bad news.
return nil, changefeedbase.WithTerminalError(err)
}

// Manager can return all kinds of errors during chaos, but based on
// its usage, none of them should ever be terminal.
return nil, changefeedbase.MarkRetryableError(err)
Expand Down
3 changes: 3 additions & 0 deletions pkg/ccl/changefeedccl/changefeed_dist.go
Expand Up @@ -154,6 +154,9 @@ func fetchTableDescriptors(
})
}
if err := sql.DescsTxn(ctx, execCfg, fetchSpans); err != nil {
if errors.Is(err, catalog.ErrDescriptorDropped) {
return nil, changefeedbase.WithTerminalError(err)
}
return nil, err
}
return targetDescs, nil
Expand Down
9 changes: 8 additions & 1 deletion pkg/ccl/changefeedccl/schemafeed/schema_feed.go
Expand Up @@ -658,7 +658,14 @@ func (tf *schemaFeed) validateDescriptor(
}
// If a interesting type changed, then we just want to force the lease
// manager to acquire the freshest version of the type.
return tf.leaseMgr.AcquireFreshestFromStore(ctx, desc.GetID())
if err := tf.leaseMgr.AcquireFreshestFromStore(ctx, desc.GetID()); err != nil {
err = errors.Wrapf(err, "could not acquire type descriptor %d lease", desc.GetID())
if errors.Is(err, catalog.ErrDescriptorDropped) { // That's pretty fatal.
err = changefeedbase.WithTerminalError(err)
}
return err
}
return nil
case catalog.TableDescriptor:
if err := changefeedvalidators.ValidateTable(tf.targets, desc, tf.tolerances); err != nil {
return err
Expand Down

0 comments on commit 261e785

Please sign in to comment.