31013: kv: try next replica on RangeNotFoundError r=nvanbenschoten,bdarnell a=tschottdorf

Previously, if a Batch RPC came back with a RangeNotFoundError, we would
immediately stop trying to send to more replicas, evict the range
descriptor, and start a new attempt after a back-off.

This new attempt could end up using the same replica, so if the
RangeNotFoundError persisted for some amount of time, so would the
unsuccessful retries for requests to that range, since DistSender doesn't
aggressively shuffle the replicas.

It turns out that there are such situations, and the election-after-restart
roachtest spuriously hit one of them:

1. new replica receives a preemptive snapshot and the ConfChange
2. cluster restarts
3. now the new replica is in this state until the range wakes up, which
   may not happen for some time
4. the first request to the range runs into the above problem

@nvanbenschoten: I think there is an issue to be filed about the tendency
of DistSender to get stuck in unfortunate configurations.
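To illustrate the change, here is a minimal, self-contained sketch of the retry loop; the type and function names are hypothetical stand-ins, not the actual DistSender code in pkg/kv:

```go
package main

import "fmt"

// Illustrative stand-ins for the real types in roachpb/kv; this is a sketch,
// not the actual DistSender implementation.
type replica struct{ id int }

type rangeNotFoundError struct{ r replica }

func (e *rangeNotFoundError) Error() string {
	return fmt.Sprintf("range not found on replica %d", e.r.id)
}

// sendToReplicas sketches the changed retry behavior: a RangeNotFoundError
// from one replica no longer aborts the whole attempt (evicting the range
// descriptor and backing off); instead the next replica is tried.
func sendToReplicas(replicas []replica, send func(replica) error) error {
	var lastErr error
	for _, r := range replicas {
		err := send(r)
		if err == nil {
			return nil // this replica served the request
		}
		if _, ok := err.(*rangeNotFoundError); ok {
			// Previously: return immediately, evict the descriptor, back off,
			// and possibly retry the very same replica. Now: move on.
			lastErr = err
			continue
		}
		return err // other errors keep their existing handling
	}
	return lastErr
}

func main() {
	replicas := []replica{{id: 1}, {id: 2}, {id: 3}}
	err := sendToReplicas(replicas, func(r replica) error {
		if r.id == 1 {
			// Replica 1 got a preemptive snapshot but hasn't woken up yet.
			return &rangeNotFoundError{r: r}
		}
		return nil
	})
	fmt.Println("error:", err) // error: <nil> — replica 2 handled the request
}
```

The point is that a RangeNotFoundError from one replica is now treated like any other per-replica failure: the loop moves on instead of evicting the descriptor and retrying after a back-off that could land on the same replica.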

Fixes #30613.

Release note (bug fix): Avoid repeatedly trying a replica that was found to
be in the process of being added.

31187: roachtest: add synctest r=bdarnell a=tschottdorf

This new roachtest sets up charybdefs on a single (Ubuntu) node and runs
the `synctest` CLI command against a nemesis that injects random I/O
errors.

The synctest command is new. It simulates a Raft log and can be directed at a
filesystem that is being hit with random failures.

The workload essentially writes ascending keys (flushing each one to disk
synchronously) until an I/O error occurs, at which point it re-opens the
instance to verify that all persisted writes are still there. If the
RocksDB instance was permanently corrupted, it switches to a new, pristine
directory.
This is used in the roachtest, but it is also useful to run manually in user
deployments where we suspect that data is not being persisted to disk.
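A rough, runnable sketch of the workload's loop shape, assuming a plain append-only file as a stand-in for the RocksDB instance (the real command goes through the engine package):

```go
package main

import (
	"encoding/binary"
	"fmt"
	"log"
	"os"
)

// A minimal sketch of the synctest loop shape. The real `cockroach debug
// synctest` command drives a RocksDB instance; here a plain append-only file
// stands in for it, and the "keys" are just ascending uint64s.
func main() {
	const path = "synctest.data"
	var next uint64 // next key to write == number of keys considered durable

	for {
		f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR|os.O_APPEND, 0644)
		if err != nil {
			// The real tool would switch to a new, pristine directory here.
			log.Fatal(err)
		}
		// On (re-)open, verify that every key acknowledged as synced is still there.
		if info, err := f.Stat(); err == nil {
			if persisted := uint64(info.Size()) / 8; persisted < next {
				log.Fatalf("lost writes: %d keys acknowledged, %d found", next, persisted)
			}
		}
		// Write ascending keys, syncing each one, until an I/O error occurs.
		for {
			var buf [8]byte
			binary.BigEndian.PutUint64(buf[:], next)
			if _, err := f.Write(buf[:]); err != nil {
				break
			}
			if err := f.Sync(); err != nil {
				break // only a successful sync makes the key count as durable
			}
			next++
		}
		fmt.Printf("I/O error after %d durable keys; re-opening to verify\n", next)
		_ = f.Close()
	}
}
```

On a healthy filesystem the inner loop never breaks; under the charybdefs nemesis, an injected error triggers the re-open-and-verify step.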

This hasn't found anything yet, but it's fun to watch and also surfaces a
number of errors that we know and love from Sentry.

Release note: None

31215: storage: deflake TestStoreRangeMergeWatcher r=tschottdorf a=benesch

This test could deadlock if the LHS replica on store2 was shut down
before it processed the split at "b". Teach the test to wait for the LHS
replica on store2 to process the split before blocking Raft traffic to
it.
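A hedged sketch of the wait-before-blocking pattern, where `succeedsSoon` mirrors the shape of `testutils.SucceedsSoon` and the split-processed check is a hypothetical stand-in for inspecting the replica's descriptor:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// succeedsSoon retries fn until it returns nil or the timeout expires; this
// mirrors the shape of testutils.SucceedsSoon used in CockroachDB tests.
func succeedsSoon(timeout time.Duration, fn func() error) error {
	deadline := time.Now().Add(timeout)
	for {
		err := fn()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("condition never met: %v", err)
		}
		time.Sleep(10 * time.Millisecond)
	}
}

func main() {
	// Hypothetical stand-in for "has store2's LHS replica processed the split
	// at 'b'?"; the real test inspects the replica's range descriptor.
	checks := 0
	splitProcessed := func() bool {
		checks++
		return checks >= 3 // pretend the split lands on the third check
	}

	if err := succeedsSoon(5*time.Second, func() error {
		if !splitProcessed() {
			return errors.New("store2 LHS replica has not processed the split yet")
		}
		return nil
	}); err != nil {
		panic(err)
	}
	// Only once the split has been processed is it safe to block Raft traffic
	// to store2's LHS replica without risking the deadlock described above.
	fmt.Println("split processed; blocking Raft traffic is now safe")
}
```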

Fixes #31096.
Fixes #31149.
Fixes #31160.
Fixes #31167.

Release note: None

31217: importccl: add explicit default to mysql testdata timestamp r=dt a=dt

This makes the testdata work on MySQL 8.0.2+, where the TIMESTAMP type no longer has implicit defaults.

Release note: none.

31221: cluster: Create final cluster version for 2.1 r=bdarnell a=bdarnell

Release note: None

Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
Co-authored-by: Nikhil Benesch <nikhil.benesch@gmail.com>
Co-authored-by: David Taylor <tinystatemachine@gmail.com>
Co-authored-by: Ben Darnell <ben@bendarnell.com>
5 people committed Oct 10, 2018
6 parents e9ed974 + 857b9c0 + 01d6bda + 1c7b427 + 7bcc94e + 19f8d4b commit 3835e08
Showing 33 changed files with 838 additions and 287 deletions.
2 changes: 1 addition & 1 deletion docs/generated/settings/settings.html
@@ -79,6 +79,6 @@
<tr><td><code>trace.debug.enable</code></td><td>boolean</td><td><code>false</code></td><td>if set, traces for recent requests can be seen in the /debug page</td></tr>
<tr><td><code>trace.lightstep.token</code></td><td>string</td><td><code></code></td><td>if set, traces go to Lightstep using this token</td></tr>
<tr><td><code>trace.zipkin.collector</code></td><td>string</td><td><code></code></td><td>if set, traces go to the given Zipkin instance (example: '127.0.0.1:9411'); ignored if trace.lightstep.token is set.</td></tr>
<tr><td><code>version</code></td><td>custom validation</td><td><code>2.0-14</code></td><td>set the active cluster version in the format '<major>.<minor>'.</td></tr>
<tr><td><code>version</code></td><td>custom validation</td><td><code>2.1</code></td><td>set the active cluster version in the format '<major>.<minor>'.</td></tr>
</tbody>
</table>
2 changes: 1 addition & 1 deletion pkg/ccl/importccl/mysql_testdata_helpers_test.go
@@ -274,7 +274,7 @@ func genMysqlTestdata(t *testing.T, dump func()) {
dt DATETIME NOT NULL DEFAULT '2000-01-01 00:00:00',
d DATE,
ts TIMESTAMP,
ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
t TIME,
-- TODO(dt): fix parser: for YEAR's length option
-- y YEAR,
72 changes: 36 additions & 36 deletions pkg/ccl/importccl/testdata/mysqldump/db.sql

Large diffs are not rendered by default.

Binary file modified pkg/ccl/importccl/testdata/mysqldump/db.sql.bz2
Binary file not shown.
Binary file modified pkg/ccl/importccl/testdata/mysqldump/db.sql.gz
Binary file not shown.
12 changes: 6 additions & 6 deletions pkg/ccl/importccl/testdata/mysqldump/everything.sql
@@ -1,13 +1,13 @@
-- MySQL dump 10.13 Distrib 5.7.23, for osx10.14 (x86_64)
-- MySQL dump 10.13 Distrib 8.0.12, for osx10.14 (x86_64)
--
-- Host: localhost Database: cockroachtestdata
-- ------------------------------------------------------
-- Server version 5.7.23
-- Server version 8.0.12

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
SET NAMES utf8mb4 ;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
@@ -21,7 +21,7 @@

DROP TABLE IF EXISTS `everything`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
SET character_set_client = utf8mb4 ;
CREATE TABLE `everything` (
`i` int(11) NOT NULL,
`c` char(10) NOT NULL,
@@ -61,7 +61,7 @@ CREATE TABLE `everything` (

LOCK TABLES `everything` WRITE;
/*!40000 ALTER TABLE `everything` DISABLE KEYS */;
INSERT INTO `everything` VALUES (1,'c','this is s\'s default value',NULL,'Small',_binary 'bin\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0',NULL,NULL,'2000-01-01 00:00:00',NULL,'2018-10-09 22:55:39',NULL,NULL,NULL,-12.345,-2,NULL,5,NULL,NULL,NULL,-1.5,NULL,NULL,NULL,NULL,NULL,'{\"a\": \"b\", \"c\": {\"d\": [\"e\", 11, null]}}'),(2,'c2','this is s\'s default value',NULL,'Large',_binary 'bin2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0',NULL,NULL,'2000-01-01 00:00:00',NULL,'2018-10-09 22:55:39',NULL,NULL,NULL,12.345,3,NULL,5,NULL,NULL,NULL,1.2,NULL,NULL,NULL,NULL,NULL,'{}');
INSERT INTO `everything` VALUES (1,'c','this is s\'s default value',NULL,'Small',_binary 'bin\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0',NULL,NULL,'2000-01-01 00:00:00',NULL,'2018-10-10 15:04:47',NULL,NULL,NULL,-12.345,-2,NULL,5,NULL,NULL,NULL,-1.5,NULL,NULL,NULL,NULL,NULL,'{\"a\": \"b\", \"c\": {\"d\": [\"e\", 11, null]}}'),(2,'c2','this is s\'s default value',NULL,'Large',_binary 'bin2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0',NULL,NULL,'2000-01-01 00:00:00',NULL,'2018-10-10 15:04:47',NULL,NULL,NULL,12.345,3,NULL,5,NULL,NULL,NULL,1.2,NULL,NULL,NULL,NULL,NULL,'{}');
/*!40000 ALTER TABLE `everything` ENABLE KEYS */;
UNLOCK TABLES;
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;
@@ -74,4 +74,4 @@ UNLOCK TABLES;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

-- Dump completed on 2018-10-09 22:55:39
-- Dump completed on 2018-10-10 15:04:47
10 changes: 5 additions & 5 deletions pkg/ccl/importccl/testdata/mysqldump/second.sql
@@ -1,13 +1,13 @@
-- MySQL dump 10.13 Distrib 5.7.23, for osx10.14 (x86_64)
-- MySQL dump 10.13 Distrib 8.0.12, for osx10.14 (x86_64)
--
-- Host: localhost Database: cockroachtestdata
-- ------------------------------------------------------
-- Server version 5.7.23
-- Server version 8.0.12

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
SET NAMES utf8mb4 ;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
@@ -21,7 +21,7 @@

DROP TABLE IF EXISTS `SECOND`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
SET character_set_client = utf8mb4 ;
CREATE TABLE `SECOND` (
`i` int(11) NOT NULL,
`k` int(11) DEFAULT NULL,
@@ -51,4 +51,4 @@ UNLOCK TABLES;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

-- Dump completed on 2018-10-09 22:55:39
-- Dump completed on 2018-10-10 15:04:47
12 changes: 6 additions & 6 deletions pkg/ccl/importccl/testdata/mysqldump/simple.sql

Large diffs are not rendered by default.

45 changes: 30 additions & 15 deletions pkg/cli/debug.go
@@ -31,7 +31,7 @@ import (
"time"

"github.com/cockroachdb/cockroach/pkg/cli/debug"
"github.com/cockroachdb/cockroach/pkg/cli/synctest"
"github.com/cockroachdb/cockroach/pkg/cli/syncbench"
"github.com/cockroachdb/cockroach/pkg/config"
"github.com/cockroachdb/cockroach/pkg/gossip"
"github.com/cockroachdb/cockroach/pkg/keys"
@@ -99,9 +99,23 @@ func parseRangeID(arg string) (roachpb.RangeID, error) {
return roachpb.RangeID(rangeIDInt), nil
}

// OpenEngineOptions tunes the behavior of OpenEngine.
type OpenEngineOptions struct {
ReadOnly bool
MustExist bool
}

// OpenExistingStore opens the rocksdb engine rooted at 'dir'.
// If 'readOnly' is true, opens the store in read-only mode.
func OpenExistingStore(dir string, stopper *stop.Stopper, readOnly bool) (*engine.RocksDB, error) {
return OpenEngine(dir, stopper, OpenEngineOptions{ReadOnly: readOnly, MustExist: true})
}

// OpenEngine opens the RocksDB engine at 'dir'. Depending on the supplied options,
// an empty engine might be initialized.
func OpenEngine(
dir string, stopper *stop.Stopper, opts OpenEngineOptions,
) (*engine.RocksDB, error) {
cache := engine.NewRocksDBCache(server.DefaultCacheSize)
defer cache.Release()
maxOpenFiles, err := server.SetOpenFileLimitForOneStore()
@@ -113,8 +127,8 @@ func OpenExistingStore(dir string, stopper *stop.Stopper, readOnly bool) (*engin
Settings: serverCfg.Settings,
Dir: dir,
MaxOpenFiles: maxOpenFiles,
MustExist: true,
ReadOnly: readOnly,
MustExist: opts.MustExist,
ReadOnly: opts.ReadOnly,
}

if PopulateRocksDBConfigHook != nil {
@@ -1011,28 +1025,28 @@ func runTimeSeriesDump(cmd *cobra.Command, args []string) error {
}
}

var debugSyncTestCmd = &cobra.Command{
Use: "synctest [directory]",
var debugSyncBenchCmd = &cobra.Command{
Use: "syncbench [directory]",
Short: "Run a performance test for WAL sync speed",
Long: `
`,
Args: cobra.MaximumNArgs(1),
Hidden: true,
RunE: MaybeDecorateGRPCError(runDebugSyncTest),
RunE: MaybeDecorateGRPCError(runDebugSyncBench),
}

var syncTestOpts = synctest.Options{
var syncBenchOpts = syncbench.Options{
Concurrency: 1,
Duration: 10 * time.Second,
LogOnly: true,
}

func runDebugSyncTest(cmd *cobra.Command, args []string) error {
syncTestOpts.Dir = "./testdb"
func runDebugSyncBench(cmd *cobra.Command, args []string) error {
syncBenchOpts.Dir = "./testdb"
if len(args) == 1 {
syncTestOpts.Dir = args[0]
syncBenchOpts.Dir = args[0]
}
return synctest.Run(syncTestOpts)
return syncbench.Run(syncBenchOpts)
}

var debugUnsafeRemoveDeadReplicasCmd = &cobra.Command{
@@ -1207,12 +1221,12 @@ func removeDeadReplicas(
func init() {
DebugCmd.AddCommand(debugCmds...)

f := debugSyncTestCmd.Flags()
f.IntVarP(&syncTestOpts.Concurrency, "concurrency", "c", syncTestOpts.Concurrency,
f := debugSyncBenchCmd.Flags()
f.IntVarP(&syncBenchOpts.Concurrency, "concurrency", "c", syncBenchOpts.Concurrency,
"number of concurrent writers")
f.DurationVarP(&syncTestOpts.Duration, "duration", "d", syncTestOpts.Duration,
f.DurationVarP(&syncBenchOpts.Duration, "duration", "d", syncBenchOpts.Duration,
"duration to run the test for")
f.BoolVarP(&syncTestOpts.LogOnly, "log-only", "l", syncTestOpts.LogOnly,
f.BoolVarP(&syncBenchOpts.LogOnly, "log-only", "l", syncBenchOpts.LogOnly,
"only write to the WAL, not to sstables")

f = debugUnsafeRemoveDeadReplicasCmd.Flags()
@@ -1243,6 +1257,7 @@ var debugCmds = append(DebugCmdsForRocksDB,
debugSSTDumpCmd,
debugGossipValuesCmd,
debugTimeSeriesDumpCmd,
debugSyncBenchCmd,
debugSyncTestCmd,
debugUnsafeRemoveDeadReplicasCmd,
debugEnvCmd,
