engine: TestConcurrentBatch is flaky (locally) #15604
Comments
Were you seeing this even before disk syncing got turned on by default in #15366? |
|
FWIW, this has been happening to me too! |
Happened to me too:
That's... a very long write. I wasn't doing anything noteworthy other than |
Hmm, I'm seeing the slow batches as well. I don't recall seeing these in the past. Time to bisect. |
The finger is pointing at 4d72e12:
@benesch Did we change compiler flags in the move away from the |
First suspicion was that we're now passing |
@petermattis I think you might be running without SSE4.2 in the "after"
case? In short, our development builds don't use SSE4.2, but they used to.
Does that account for the speed difference?
edit: heh, guess you checked that.
…On Mon, May 15, 2017 at 4:12 PM, Peter Mattis ***@***.***> wrote:
The finger is pointing at 4d72e12:
Author: Nikhil Benesch ***@***.***>
Date: Fri Apr 7 16:39:06 2017 -0400
build: minimize cgo's role in the build
This commit excises cgo's role in building our dependencies, limiting it
to the link phase only. The c-protobuf, c-jemalloc, c-snappy, and
c-rocksdb packages are removed in favor of Make rules that use these
packages' native build systems. This has a few implications:
1. We no longer need to maintain platform-specific headers for each
platform we support; we can rely on our dependency's build systems
to do feature detection as necessary.
2. Windows needs no special treatment.
3. We no longer require the parallel cgo patch for fast compilation
of C/C++ dependencies.
4. We, unfortunately, now have a build dependency on CMake. We could
instead vendor CMake itself, but building CMake from source takes
about five minutes on my machine and will have serious
implications for CI times. Binary distributions of CMake are
readily available, and the specific version of CMake used is not
particularly critical, as it is with the other C dependencies we
vendor.
5. We're no longer compatible with `go get`; you *must* bootstrap
using Make.
This also adds Windows binaries to the usual deployment scripts.
Closes https://github.com/cockroachlabs/support/issues/33.
Closes #14406.
~/Development/go/src/github.com/cockroachdb/cockroach/pkg (4d72e12...)|BISECTING make test PKG=./storage/engine/ TESTS=TestConcurrentBatch TESTFLAGS='-v -count 10' 2>&1 | grep '^--- PASS'
--- PASS: TestConcurrentBatch (3.12s)
--- PASS: TestConcurrentBatch (3.11s)
--- PASS: TestConcurrentBatch (3.08s)
--- PASS: TestConcurrentBatch (3.11s)
--- PASS: TestConcurrentBatch (3.06s)
--- PASS: TestConcurrentBatch (3.05s)
--- PASS: TestConcurrentBatch (3.05s)
--- PASS: TestConcurrentBatch (3.09s)
--- PASS: TestConcurrentBatch (3.10s)
--- PASS: TestConcurrentBatch (3.11s)
~/Development/go/src/github.com/cockroachdb/cockroach/pkg (ef10b4d...)|BISECTING make test PKG=./storage/engine/ TESTS=TestConcurrentBatch TESTFLAGS='-v -count 10' 2>&1 | grep '^--- PASS'
--- PASS: TestConcurrentBatch (1.50s)
--- PASS: TestConcurrentBatch (1.49s)
--- PASS: TestConcurrentBatch (1.46s)
--- PASS: TestConcurrentBatch (1.28s)
--- PASS: TestConcurrentBatch (1.45s)
--- PASS: TestConcurrentBatch (1.45s)
--- PASS: TestConcurrentBatch (1.45s)
--- PASS: TestConcurrentBatch (2.01s)
--- PASS: TestConcurrentBatch (1.31s)
--- PASS: TestConcurrentBatch (1.45s)
@benesch <https://github.com/benesch> Did we change compiler flags in the
move away from the c-* repos?
|
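To put a number on the two bisect runs quoted above, the `--- PASS` lines can be averaged per commit. The following is a minimal sketch, not part of the original thread; the function name `mean_pass_seconds` is hypothetical, and it assumes the `go test -v` line format shown in the quoted output.

```cpp
#include <regex>
#include <sstream>
#include <string>

// Hypothetical helper: returns the mean duration, in seconds, of all lines
// shaped like "--- PASS: TestConcurrentBatch (3.12s)" in `output`.
// Returns 0 if no line matches.
double mean_pass_seconds(const std::string& output) {
    // Capture group 1 is the decimal number of seconds inside the parentheses.
    static const std::regex pass_line(
        R"re(--- PASS: \S+ \(([0-9]+\.[0-9]+)s\)\s*)re");
    std::istringstream in(output);
    std::string line;
    double total = 0.0;
    int count = 0;
    while (std::getline(in, line)) {
        std::smatch m;
        // regex_match requires the whole line to match the pattern.
        if (std::regex_match(line, m, pass_line)) {
            total += std::stod(m[1].str());
            ++count;
        }
    }
    return count == 0 ? 0.0 : total / count;
}
```

Fed the ten lines from each run above, this gives a mean of roughly 3.09s at 4d72e12 and roughly 1.49s at ef10b4d, so the test takes about twice as long at 4d72e12.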
Yeah, that commit set WITH_SSE42=ON. |
@tamird See my message (which crossed yours in the ether). I tried |
I'd be terrified if this were the problem but... what if you pass |
I don't think that's it, |
Here's a sample compiler invocation while compiling RocksDB on
...it's missing |
Yes, that might be it.
…On Mon, May 15, 2017 at 4:37 PM Nikhil Benesch ***@***.***> wrote:
Here's a sample compiler invocation while compiling RocksDB on 4d72e12.
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -DJEMALLOC_NO_DEMANGLE -DOS_MACOSX -DROCKSDB_JEMALLOC -DROCKSDB_LIB_IO_POSIX -DROCKSDB_PLATFORM_POSIX -DSNAPPY -I/Users/benesch/go/native/x86_64-apple-darwin16.5.0/jemalloc/include -I/Users/benesch/go/src/github.com/cockroachdb/cockroach/c-deps/snappy.src -I/Users/benesch/go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb.src -I/Users/benesch/go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb.src/include -isystem /Users/benesch/go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb.src/third-party/gtest-1.7.0/fused-src -msse4.2 -W -Wextra -Wall -Wsign-compare -Wshadow -Wno-unused-parameter -Wno-unused-variable -Woverloaded-virtual -Wnon-virtual-dtor -Wno-missing-field-initializers -std=c++11 -O2 -fno-omit-frame-pointer -momit-leaf-frame-pointer -Werror -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk -mmacosx-version-min=10.12.4 -o CMakeFiles/rocksdb.dir/utilities/compaction_filters/remove_emptyvalue_compactionfilter.cc.o -c /Users/benesch/go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb.src/utilities/compaction_filters/remove_emptyvalue_compactionfilter.cc
...it's missing -DNDEBUG. Could that be it?
|
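For readers wondering why a missing `-DNDEBUG` could matter this much: the standard `assert` macro evaluates its argument unless `NDEBUG` is defined, so any expensive check inside an assertion runs on every call. The sketch below illustrates that mechanism only; the invariant and the names `is_sorted_checked`, `append_sorted`, and `checks_run` are hypothetical and are not taken from RocksDB.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times the (hypothetical) invariant check actually ran.
static std::size_t g_checks_run = 0;

// Hypothetical O(n) invariant check, standing in for the kind of internal
// consistency check a library might place inside an assert.
bool is_sorted_checked(const std::vector<int>& v) {
    ++g_checks_run;
    for (std::size_t i = 1; i < v.size(); ++i) {
        if (v[i - 1] > v[i]) return false;
    }
    return true;
}

// Appends x (the caller promises x >= the current last element) and asserts
// the invariant. When NDEBUG is defined, the assert expands to nothing and
// is_sorted_checked is never called.
void append_sorted(std::vector<int>& v, int x) {
    v.push_back(x);
    assert(is_sorted_checked(v));
}

// Reports whether this translation unit was compiled with assertions on.
bool assertions_enabled() {
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}

std::size_t checks_run() { return g_checks_run; }
```

Compiled without `-DNDEBUG`, every `append_sorted` call walks the whole vector, so n appends cost O(n²) work in total. Compiled with `-DNDEBUG`, the check disappears and the same loop is O(n).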
Oops, yes it is. Latest master:
With the following patch:
diff --git a/build/common.mk b/build/common.mk
index 71c7d8b56..6b6da7ef5 100644
--- a/build/common.mk
+++ b/build/common.mk
@@ -380,7 +380,8 @@ $(ROCKSDB_DIR)/Makefile: $(C_DEPS_DIR)/rocksdb-rebuild $(ROCKSDB_SRC_DIR)/.extra
cd $(ROCKSDB_DIR) && cmake $(CMAKE_FLAGS) $(ROCKSDB_SRC_DIR) \
$(if $(findstring release,$(TYPE)),,-DWITH_$(if $(findstring mingw,$(TARGET_TRIPLE)),AVX2,SSE42)=OFF) \
-DSNAPPY_LIBRARIES=$(SNAPPY_DIR)/.libs/libsnappy.a -DSNAPPY_INCLUDE_DIR=$(SNAPPY_SRC_DIR) -DWITH_SNAPPY=ON \
- $(if $(USE_STDMALLOC),,-DJEMALLOC_LIBRARIES=$(JEMALLOC_DIR)/lib/libjemalloc.a -DJEMALLOC_INCLUDE_DIR=$(JEMALLOC_DIR)/include -DWITH_JEMALLOC=ON)
+ $(if $(USE_STDMALLOC),,-DJEMALLOC_LIBRARIES=$(JEMALLOC_DIR)/lib/libjemalloc.a -DJEMALLOC_INCLUDE_DIR=$(JEMALLOC_DIR)/include -DWITH_JEMALLOC=ON) \
+ -DCMAKE_CXX_FLAGS=-DNDEBUG
$(SNAPPY_DIR)/Makefile: $(C_DEPS_DIR)/snappy-rebuild $(SNAPPY_SRC_DIR)/.extracted
mkdir -p $(SNAPPY_DIR)
diff --git a/c-deps/rocksdb-rebuild b/c-deps/rocksdb-rebuild
index 81dc26025..621a785ba 100644
--- a/c-deps/rocksdb-rebuild
+++ b/c-deps/rocksdb-rebuild
@@ -1,4 +1,4 @@
Bump the version below when changing rocksdb CMake flags. Search for "BUILD
ARTIFACT CACHING" in build/common.mk for rationale.
-1
+2
Welp. I guess that'll make for a nice performance boost in 1.0.1. |
@petermattis want me to prepare a PR with that patch? |
@benesch Yes. |
I'm curious why that isn't the default when using |
The CMake build system was historically Windows-only, so it's rather undertested on not-Windows. Have you seen https://github.com/cockroachdb/cockroach/blob/master/c-deps/rocksdb-0014-cmake-o2.patch? The good news is the CMake build is now a first-class citizen on all platforms. I'll be sending a patch upstream, too, to fix this for everyone. |
This also explains why we've seen a handful of assertion failures from RocksDB. Guess I should have paid more attention to that. |
Yup, it would. That's worrying for a rather different reason, though, no? |
Depends on the assertion. The ones that fired could be invalid. I seem to recall one in the skiplist code that I convinced myself wasn't a problem. |
Also:
|
🤦‍♂️ |
In the transition to the new Make-based build system for C dependencies (4d72e12), RocksDB assertions were inadvertently enabled for all builds. Restore the old behavior, which only enabled assertions for race builds (cockroachdb/c-rocksdb#27). Fixes cockroachdb#15604.
TestConcurrentBatch has recently (in the last couple of weeks, I think) become flaky on my laptop. There are a bunch of slow batches, each taking longer than the last (this is a partial log):
When the test passes, it takes about 4 seconds on this laptop (a MacBook Pro with FileVault enabled).
I think this is probably a matter of other stuff going on on the machine. It fails a large fraction of the time when I run make test and there is stuff to rebuild, but it passes reliably when run in isolation or when there is no rebuilding going on. So it's not really a problem with the test per se, but I'd like to see it be more robust against concurrent activity (or for us to figure out a way for make test to not cause so much concurrent I/O).