ARROW-4520: [C++] use voidified expr to ignore DCHECK() custom messages in NDEBUG #3599


bkietz
Member

@bkietz bkietz commented Feb 9, 2019

In release builds, DCHECK(x) << y << z; currently expands to:

((void)(x)); 
while (false) ::arrow::util::ArrowLogBase() << y << z;

The streaming statement is unreachable, which clang-7 reports as an error.

@bkietz changed the title from "add ArrowLogIgnore and use for release mode DCHECK*" to "ARROW-4520: [C++] add ArrowLogIgnore and use for release mode DCHECK*" on Feb 9, 2019
@wesm
Member

wesm commented Feb 9, 2019

Is it a warning or an error?

I think we should try to follow closely what's in glog

https://github.com/google/glog/blob/master/src/glog/logging.h.in#L1004

@pitrou
Member

pitrou commented Feb 11, 2019

Is the PR missing something? AFAICT it's only replacing a while (false) with another while (false)...

@bkietz changed the title from "ARROW-4520: [C++] add ArrowLogIgnore and use for release mode DCHECK*" to "ARROW-4520: [C++] use voidified expr to ignore DCHECK() custom messages in NDEBUG" on Feb 11, 2019
@bkietz
Member Author

bkietz commented Feb 11, 2019

@pitrou Sorry, I didn't rename the PR after changing my approach

@pitrou
Member

pitrou commented Feb 11, 2019

So unreachability wasn't the issue?

@bkietz
Member Author

bkietz commented Feb 11, 2019

Apparently the warning (error with -Werror) isn't emitted if the unreachable expression is cast to void, which is what's in glog
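
A minimal sketch of the voidify idea described above, loosely modeled on the glog pattern. Every name here (SketchNullLog, SketchVoidify, SKETCH_IGNORE_EXPR, SKETCH_DCHECK, CountedMessage) is hypothetical and is not Arrow's or glog's actual code. It only illustrates how the trailing "<< message" operands can be folded into a single void-typed expression inside the while (false) statement; whether a given compiler then stays quiet about unreachable code is an assumption based on the comment above.

```cpp
#include <cassert>

// Hypothetical stand-in for a log object whose stream operands are discarded.
struct SketchNullLog {
  template <typename T>
  SketchNullLog& operator<<(const T&) {
    return *this;
  }
};

// Hypothetical "voidifier": operator& binds less tightly than operator<<,
// so it consumes the whole chained stream expression and yields void.
struct SketchVoidify {
  void operator&(const SketchNullLog&) {}
};

// Evaluate an expression and discard its value.
#define SKETCH_IGNORE_EXPR(expr) ((void)(expr))

// Release-mode flavor: the condition is evaluated and discarded, and any
// trailing "<< message" operands sit in a never-executed, void-typed statement.
#define SKETCH_DCHECK(condition) \
  SKETCH_IGNORE_EXPR(condition); \
  while (false) SketchVoidify() & SketchNullLog()

// Hypothetical helper that records whether a message operand was evaluated.
inline int CountedMessage(int* evaluations) {
  ++*evaluations;
  return *evaluations;
}

// Returns how many times the message operand was evaluated (expected: none,
// because the streaming statement is the body of while (false)).
inline int MessageEvaluationsInSketch() {
  int evaluations = 0;
  SKETCH_DCHECK(1 + 1 == 2) << "never formatted: " << CountedMessage(&evaluations);
  return evaluations;
}
```

Because operator<< has higher precedence than operator&, the voidifier sees the result of the complete message chain, which is why the extra messages do not have to be macro arguments.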

@pitrou
Member

pitrou commented Feb 11, 2019

Isn't it what ARROW_IGNORE_EXPR was supposed to do?

@bkietz
Member Author

bkietz commented Feb 11, 2019

ARROW_IGNORE_EXPR was used to cast the condition/operands of checks to void, but the unreachable expression was ArrowLogBase() << extra_messages... which was not cast to void.

@bkietz
Member Author

bkietz commented Feb 11, 2019

We can't pass that to ARROW_IGNORE_EXPR because extra_messages are not arguments to the DCHECK macros.

@pitrou
Member

pitrou commented Feb 11, 2019

Another suggestion: could we simply use ARROW_CHECK(true) without the while false?

@pitrou
Member

pitrou commented Feb 11, 2019

How about the following:

#define DCHECK(condition)       \
  ARROW_IGNORE_EXPR(condition); \
  ARROW_CHECK(true)

#define DCHECK_EQ(val1, val2) \
  ARROW_IGNORE_EXPR(val1);    \
  ARROW_IGNORE_EXPR(val2);    \
  ARROW_CHECK(true)

// [etc.]

Hopefully that can please all compilers.

@bkietz
Member Author

bkietz commented Feb 11, 2019

I'm fine with it; I was just following @wesm's recommendation to use glog's solution

@wesm
Member

wesm commented Feb 12, 2019

@pitrou your proposed solution seems OK. You might want to run a benchmark that involves DCHECK in debug builds to make sure the compiler is optimizing away the dead code. I was thinking there's no reason to deviate strongly from what is in glog, since someone at Google would have noticed by now if it was wrong.

@pitrou
Member

pitrou commented Feb 12, 2019

No strong opinion. We can go with this and revisit another time if another compiler complains.

@pitrou
Member

pitrou commented Feb 12, 2019

That said, it seems the manylinux crash is specific to this PR. It's also one of the rare CI entries that compiles in release mode...
@bkietz can you rebase?

@bkietz force-pushed the ARROW-4520-ignore-custom-messages-for-release-builds branch from 1ca97cf to c0a2121 on February 12, 2019 15:57
@bkietz
Member Author

bkietz commented Feb 12, 2019

@pitrou rebased and I still get the segfault in release mode.

truncated stack trace:

  1. Decimal128::ToStringNegativeScale out_of_range error in str.substr
  2. Decimal128::ToString
  3. Decimal128Array::FormatValue
  4. DecimalValue.as_py

full stack trace:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff7805801 in __GI_abort () at abort.c:79
#2  0x00007fffe679d469 in __gnu_cxx::__verbose_terminate_handler () at /home/rdonnelly/mc/conda-bld/compilers_linux-64_1534627447954/work/.build/x86_64-conda_cos6-linux-gnu/src/gcc/libstdc++-v3/libsupc++/vterminate.cc:95
#3  0x00007fffe679bb1b in __cxxabiv1::__terminate (handler=<optimized out>) at /home/rdonnelly/mc/conda-bld/compilers_linux-64_1534627447954/work/.build/x86_64-conda_cos6-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:47
#4  0x00007fffe679bb54 in std::terminate () at /home/rdonnelly/mc/conda-bld/compilers_linux-64_1534627447954/work/.build/x86_64-conda_cos6-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:57
#5  0x00007fffe679bd36 in __cxxabiv1::__cxa_throw (obj=obj@entry=0x555555fe1be0, tinfo=0x7fffe6832e78 <typeinfo for std::out_of_range>, dest=0x7fffe67a76d6 <std::out_of_range::~out_of_range()>)
    at /home/rdonnelly/mc/conda-bld/compilers_linux-64_1534627447954/work/.build/x86_64-conda_cos6-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:93
#6  0x00007fffe67b5b49 in std::__throw_out_of_range_fmt (__fmt=<optimized out>) at /home/rdonnelly/mc/conda-bld/compilers_linux-64_1534627447954/work/.build/x86_64-conda_cos6-linux-gnu/src/gcc/libstdc++-v3/src/c++11/functexcept.cc:96
#7  0x00007fffe5926a0b in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_check (__pos=2, __s=0x7fffffff6920 "", this=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.3.0/../../../../include/c++/7.3.0/bits/basic_string.h:302
#8  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::substr (__pos=2, __n=18446744073709551615, this=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.3.0/../../../../include/c++/7.3.0/bits/basic_string.h:2807
#9  arrow::ToStringNegativeScale (adjusted_exponent=<optimized out>, str=..., is_negative=<optimized out>) at ../../src/arrow/util/decimal.cc:107
#10 arrow::Decimal128::ToString[abi:cxx11](int) const (this=<optimized out>, scale=<optimized out>) at ../../src/arrow/util/decimal.cc:127
#11 0x00007fffe585614c in arrow::Decimal128Array::FormatValue[abi:cxx11](long) const (this=0x555555f15900, i=<optimized out>) at ../../src/arrow/array.cc:350
#12 0x00007fffe5f9f926 in __pyx_pw_7pyarrow_3lib_12DecimalValue_1as_py(_object*, _object*) () from /home/bkietz/arrow/python/pyarrow/lib.cpython-36m-x86_64-linux-gnu.so

@pitrou
Member

pitrou commented Feb 12, 2019

@bkietz Do you want me to take a look?

You can also try to debug this in debug mode simply by tweaking the #ifdef NDEBUG in logging.h...

You may also want to look at what happens to the DCHECK macro in Decimal128::ToIntegerString.

@bkietz
Member Author

bkietz commented Feb 12, 2019

@pitrou That would be appreciated, thanks. I'll try tweaking logging.h too

@pitrou
Member

pitrou commented Feb 13, 2019

@bkietz Looks like my intuition was right. I fixed the issue here with the following diff:

diff --git a/cpp/src/arrow/util/logging.h b/cpp/src/arrow/util/logging.h
index dfdfc71d..38ada2fd 100644
--- a/cpp/src/arrow/util/logging.h
+++ b/cpp/src/arrow/util/logging.h
@@ -80,22 +80,24 @@ enum class ArrowLogLevel : int {
 #ifdef NDEBUG
 #define ARROW_DFATAL ::arrow::util::ArrowLogLevel::ARROW_WARNING
 
-#define DCHECK(condition) \
-  while (false) ARROW_CHECK(condition)
-#define DCHECK_OK(status) \
-  while (false) ARROW_CHECK_OK(status)
+#define DCHECK(condition)       \
+  ARROW_IGNORE_EXPR(condition); \
+  ARROW_CHECK(true)
+
+#define DCHECK_OK(status)    \
+  ARROW_IGNORE_EXPR(status); \
+  ARROW_CHECK(true)
+
 #define DCHECK_EQ(val1, val2) \
-  while (false) ARROW_CHECK((val1) == (val2))
-#define DCHECK_NE(val1, val2) \
-  while (false) ARROW_CHECK((val1) != (val2))
-#define DCHECK_LE(val1, val2) \
-  while (false) ARROW_CHECK((val1) <= (val2))
-#define DCHECK_LT(val1, val2) \
-  while (false) ARROW_CHECK((val1) < (val2))
-#define DCHECK_GE(val1, val2) \
-  while (false) ARROW_CHECK((val1) >= (val2))
-#define DCHECK_GT(val1, val2) \
-  while (false) ARROW_CHECK((val1) > (val2))
+  ARROW_IGNORE_EXPR(val1);    \
+  ARROW_IGNORE_EXPR(val2);    \
+  ARROW_CHECK(true)
+
+#define DCHECK_NE DCHECK_EQ
+#define DCHECK_LE DCHECK_EQ
+#define DCHECK_LT DCHECK_EQ
+#define DCHECK_GE DCHECK_EQ
+#define DCHECK_GT DCHECK_EQ
 
 #else
 #define ARROW_DFATAL ::arrow::util::ArrowLogLevel::ARROW_FATAL

@bkietz force-pushed the ARROW-4520-ignore-custom-messages-for-release-builds branch from c0a2121 to ce91f53 on February 18, 2019 22:17
@bkietz
Member Author

bkietz commented Feb 18, 2019

@wesm @pitrou DCHECK_OK(s) provided the guarantee that s will be evaluated, so the failures were caused by some places where it is being used as a synonym for ABORT_NOT_OK. This seems like an error to me: I would expect debug check expressions to be unevaluated in release mode. I suggest refactoring these usages of DCHECK* to have side-effect-free conditions, maybe by introducing a new macro.

@wesm
Member

wesm commented Feb 19, 2019

@bkietz these cases should not abort in release builds. I think what can be done in such cases is this:

Status s = $STATEMENT;
DCHECK_OK(s);

@pitrou
Member

pitrou commented Feb 19, 2019

I do think they should either bubble an error, or abort. If we are not bubbling an error, it's usually because the function doesn't return an error code, so we have to abort. Ignoring errors is extremely bad.

@pitrou
Member

pitrou commented Feb 19, 2019

Also I have no opinion on whether expressions in debug checks should be always evaluated or not.

@wesm
Member

wesm commented Feb 19, 2019

> I do think they should either bubble an error, or abort.

Well, I think the idea here is that these particular function calls "can't fail" unless something is implemented wrong by an Arrow developer. It might be a good idea to refactor the code that is being used into two variants:

  • A "can't fail" version that does not return Status
  • A "can fail" version that returns Status, for example to guard against bad user input
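
As a rough illustration of the two-variant split suggested above, here is a small sketch for integer division. The names SketchStatus, DivideUnchecked and DivideChecked are hypothetical and do not correspond to Arrow's Decimal128 API; the enum merely stands in for a Status return value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical result code standing in for a Status return value.
enum class SketchStatus { kOk, kDivideByZero, kOverflow };

// "Can't fail" variant: the caller guarantees valid operands, so violations
// are programming errors that assert() catches in debug builds only.
inline std::int64_t DivideUnchecked(std::int64_t dividend, std::int64_t divisor,
                                    std::int64_t* remainder) {
  assert(divisor != 0);
  assert(!(dividend == std::numeric_limits<std::int64_t>::min() && divisor == -1));
  *remainder = dividend % divisor;
  return dividend / divisor;
}

// "Can fail" variant: validates untrusted input and reports problems to the
// caller instead of aborting.
inline SketchStatus DivideChecked(std::int64_t dividend, std::int64_t divisor,
                                  std::int64_t* quotient, std::int64_t* remainder) {
  if (divisor == 0) return SketchStatus::kDivideByZero;
  if (dividend == std::numeric_limits<std::int64_t>::min() && divisor == -1) {
    return SketchStatus::kOverflow;
  }
  *quotient = DivideUnchecked(dividend, divisor, remainder);
  return SketchStatus::kOk;
}
```

With this split, internal callers that pass known-good constants can use the unchecked form without a Status to inspect, while code handling user input goes through the checked form.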

@wesm
Member

wesm commented Feb 19, 2019

The way that DCHECK is used in practice, it is expected that the statements inside have no cost in release builds. I think that DCHECK_OK might have been a special case.

@bkietz
Member Author

bkietz commented Feb 19, 2019

We could use ARROW_LOG(DFATAL), which is FATAL in debug and WARNING in release
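
A small sketch of how a build-dependent severity like DFATAL can be selected, mirroring the ARROW_DFATAL definitions visible in the diff earlier in this thread. The enum, its numeric values, and the helper names are hypothetical; only the NDEBUG switch follows the source.

```cpp
#include <cassert>

// Hypothetical severity levels; the numeric values are illustrative only.
enum class SketchLogLevel : int { kWarning = 1, kFatal = 3 };

// DFATAL-style alias: a warning when NDEBUG is defined (release builds),
// fatal otherwise (debug builds).
#ifdef NDEBUG
constexpr SketchLogLevel kSketchDFatal = SketchLogLevel::kWarning;
#else
constexpr SketchLogLevel kSketchDFatal = SketchLogLevel::kFatal;
#endif

// Hypothetical policy: only fatal messages terminate the process.
constexpr bool AbortsProcess(SketchLogLevel level) {
  return level == SketchLogLevel::kFatal;
}

// True in debug builds, false in release builds.
constexpr bool kAbortsOnDFatal = AbortsProcess(kSketchDFatal);
```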

@wesm
Member

wesm commented Feb 19, 2019

Here's the code

  DCHECK_OK(Divide(kTenTo36, &top, &remainder));

Divide has failure modes that simply do not apply here at all

  • Divide by zero
  • BuildFromArray overflowing

If either of these cases occurs, there is a flaw in the implementation that can be caught by DCHECK in debug builds. It should not be possible to trigger these error conditions in release builds if the tests pass in debug builds.

@pitrou
Member

pitrou commented Feb 19, 2019

Right, so either we don't check at all, or we abort on error (since it's a critical error). I don't think warning makes sense at all.

ARROW_IGNORE_EXPR(val1); \
while (false) ::arrow::util::ArrowLogBase()
while (false) ARROW_CHECK((val1) > (val2))
#define DABORT_NOT_OK(expr) ARROW_IGNORE_EXPR(expr)
Member


It seems this does not guarantee that the statements are executed in release builds

Member Author


ARROW_IGNORE_EXPR(expr) evaluates the expression but casts it to void, so the compiler won't warn about a Status return value that isn't examined.
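
To make the "evaluated, then discarded" behavior concrete, here is a tiny sketch. SKETCH_IGNORE_EXPR is a hypothetical macro written in the same ((void)(x)) form quoted at the top of this thread; NextId and EvaluateAndDiscard are illustrative helpers, not Arrow functions.

```cpp
#include <cassert>

// Hypothetical equivalent of an ignore-expression macro: the operand is
// still evaluated, only its value is thrown away.
#define SKETCH_IGNORE_EXPR(expr) ((void)(expr))

// Illustrative function with a visible side effect.
inline int NextId(int* counter) { return ++*counter; }

// Returns the counter after the ignored call, showing the side effect happened.
inline int EvaluateAndDiscard() {
  int counter = 0;
  SKETCH_IGNORE_EXPR(NextId(&counter));  // evaluated; result discarded
  return counter;
}
```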

@wesm
Member

wesm commented Feb 19, 2019

Yes, I think the correct option is:

  • Don't check in release builds: execute the statement, but ignore the Status code
  • DCHECK(s.ok()) in debug builds

So the implementation of DCHECK_OK should be

do {
  Status s = STATEMENT;
  DCHECK(s.ok());
} while (0)

If STATEMENT has side effects then it will be executed in both debug and release builds
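
The do/while suggestion above can be sketched as a self-contained macro. This is an illustration under assumptions, not the code that was merged: the Status class is a hypothetical minimal stand-in for arrow::Status, SKETCH_DCHECK_OK and Increment are made-up names, and plain assert() plays the role of DCHECK.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical minimal Status, standing in for arrow::Status.
class Status {
 public:
  Status() = default;
  explicit Status(std::string msg) : ok_(false), msg_(std::move(msg)) {}
  static Status OK() { return Status(); }
  bool ok() const { return ok_; }
  const std::string& message() const { return msg_; }

 private:
  bool ok_ = true;
  std::string msg_;
};

// The statement always runs; its Status is inspected only when NDEBUG is
// not defined (debug builds).
#ifdef NDEBUG
#define SKETCH_DCHECK_OK(expr)       \
  do {                               \
    Status sketch_status_ = (expr);  \
    (void)sketch_status_;            \
  } while (0)
#else
#define SKETCH_DCHECK_OK(expr)       \
  do {                               \
    Status sketch_status_ = (expr);  \
    assert(sketch_status_.ok());     \
  } while (0)
#endif

// Hypothetical operation with a side effect that reports success.
inline Status Increment(int* counter) {
  ++*counter;
  return Status::OK();
}
```

Wrapping the body in do { ... } while (0) keeps the macro a single statement, so it behaves as expected after an unbraced if.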

@bkietz
Member Author

bkietz commented Feb 20, 2019

@wesm @pitrou how does that look?

@bkietz
Member Author

bkietz commented Feb 20, 2019

Also: what happened with AppVeyor? It shows green here on the PR (for me at least), but MinGW seems to have failed.

@pitrou
Member

pitrou commented Feb 20, 2019

@bkietz The MinGW 32-bit failures are expected for now, that's why they don't fail the build ;-)

@pitrou
Member

pitrou commented Feb 20, 2019

I find this solution confusing. If DCHECK_OK always evaluates its argument, I would expect other DCHECK macros to do as well. Otherwise it's too error-prone.

I'm not strongly attached to one or the other alternative (either we check in release mode or not), but it should be consistent across the board.

@pitrou
Member

pitrou commented Feb 20, 2019

But, really, since this PR started from the desire to fix a compilation failure, I think the simplest thing is to fix the compilation failure without changing the current semantics. We can revise the semantics in another JIRA / PR but that sounds a bit low-priority to me.

@bkietz force-pushed the ARROW-4520-ignore-custom-messages-for-release-builds branch from 72d0930 to 455ac88 on February 22, 2019 15:32
@bkietz
Member Author

bkietz commented Feb 22, 2019

@bkietz
Member Author

bkietz commented Feb 22, 2019

This seems spurious: https://travis-ci.org/apache/arrow/jobs/497095148#L3065

@wesm force-pushed the ARROW-4520-ignore-custom-messages-for-release-builds branch from 455ac88 to e508537 on February 26, 2019 06:32
@wesm
Member

wesm commented Feb 26, 2019

rebased

@bkietz
Member Author

bkietz commented Feb 26, 2019

@wesm the CI failure is unrelated to this PR, see ARROW-4684

@wesm closed this in 37d9d3d on Feb 27, 2019
@bkietz deleted the ARROW-4520-ignore-custom-messages-for-release-builds branch on February 27, 2019 18:38