Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ABI may depend on NDEBUG #9947

Closed
pitrou opened this issue May 11, 2022 · 16 comments
Closed

ABI may depend on NDEBUG #9947

pitrou opened this issue May 11, 2022 · 16 comments
Assignees

Comments

@pitrou
Copy link
Contributor

pitrou commented May 11, 2022

In changeset ab4585a, the InternalMetadata destructor changed to the following:

#if defined(NDEBUG) || defined(_MSC_VER)
  ~InternalMetadata() {
    if (HasMessageOwnedArenaTag()) {
      delete reinterpret_cast<Arena*>(ptr_ - kMessageOwnedArenaTagMask);
    }
  }
#else
  ~InternalMetadata();
#endif

This causes ABI problems between debug and non-debug in one specific case: when the -fvisibility-inlines-hidden compiler option is being used. According to the GCC wiki, that option has the following effect:

This causes all inlined class member functions to have hidden visibility, causing significant export symbol table size & binary size reductions

However, when libprotobuf.so was compiled in non-debug mode with -fvisibility-inlines-hidden enabled, any debug build of a third-party application will fail looking up ~InternalMetadata()

What version of protobuf and what language are you using?
Version: v3.20.1
Language: C++

What runtime / compiler are you using (e.g., python version or gcc version)
gcc 10.3.0

@pitrou
Copy link
Contributor Author

pitrou commented May 11, 2022

One possible fix would look like this:

  ~InternalMetadata() {
#if defined(NDEBUG) || defined(_MSC_VER)
    if (HasMessageOwnedArenaTag()) {
      delete reinterpret_cast<Arena*>(ptr_ - kMessageOwnedArenaTagMask);
    }
#else
    ActuallyDestroy();
#endif
  }

 private:
  // Non-inlined destructor for debug builds,
  // always defined for ABI reasons.
  void ActuallyDestroy();

@devjgm devjgm added the c++ label May 11, 2022
@devjgm
Copy link
Contributor

devjgm commented May 11, 2022

@pitrou, thanks for the issue report and possible fix. It looks like the addition of the NDEBUG check in our header introduced the possibility of ODR violations, such as when libprotobuf is installed with NDEBUG defined, and later used when NDEBUG is undefined. The problem surfaced for you as a linking issue that you found with -fvisibility-inlines-hidden, but the ODR violation is there even without that flag.

Your proposed fix still violates the ODR, but I believe that in practice it may avoid the linker issue you reported and may appear to work.

Generally speaking, we cannot guarantee that our C++ code will continue to work properly when the flags used to compile and install the library differ from the flags used to compile/link with it. However, we may want to special case NDEBUG and try to keep things working at least on a best-effort basis. I need to talk with others on the team who are more experienced on this project than I am, but we'll get back to you.

@thierryba
Copy link

For the record, my current workaround is to build both release and debug and in my app's CMakeLists.txt I have to do:

if (NOT WIN32) ## this is a workaround for #9947
set(Protobuf_LIBRARY optimized ${myproject_SOURCE_DIR}/3rdparty/lib/libprotobuf.a
debug ${myproject_SOURCE_DIR}/3rdparty/lib/libprotobufd.a
)
endif(NOT WIN32)

@fowles
Copy link
Member

fowles commented May 16, 2022

Supporting mixed compilation modes is not something we intend to do, as avoiding ODR violations under this kind of a context comes with significant performance costs that we are unwilling to pay.

@fowles fowles closed this as completed May 16, 2022
@thierryba
Copy link

As I'm used to compiling under windows where you almost always have to compile things twice (1 debug, 1 release), I'm not shocked. The only nice thing that would help is then if FindProtobuf would find both release and debug libs on its own.

@pitrou
Copy link
Contributor Author

pitrou commented May 17, 2022

I understand the desire not to support situations which are stricto sensu undefined behaviour.

It also makes protobuf more difficult to use in situations where pre-built release binaries are available (e.g. Debian/Ubuntu packages), but no debug ones.

@eyalfink
Copy link

eyalfink commented Jun 5, 2022

@fowles
The current code is broken, it assumes that the definition of NDEBUG when compiling the library is the same as when you use it (since the .cc file check the value of NDEBUG on library compilation, while in the h file this value is check during the code compilation)

am I missing something?

@coryan
Copy link
Contributor

coryan commented Jun 6, 2022

Adding my voice here: I think it is a deviation from common practice to change the ABI based on whether NDEBUG is defined or not. Or put another way: it is common practice on Linux to install a library using the system package manager, and then use this library (which naturally was compiled in release mode) to compile your application in debug mode [*]

[*] I am sure it is also common practice to install different libraries for debug and release. Both practices can be "common" at the same time.

@cwharris
Copy link

cwharris commented Jun 8, 2022

Supporting mixed compilation modes is not something we intend to do, as avoiding ODR violations under this kind of a context comes with significant performance costs that we are unwilling to pay.

Where can we see the performance metrics this decision seems to be based on?

@akihikodaki
Copy link

@fowles

As @cwharris wonders, I don't think fixing this without ODR violation costs such significant performance costs either. Modern processors with sophisticated memory hierarchy and branch predictors should be able to handle the branch depending on the debug switch in a few cycles, which is much cheaper than the core deallocation logic. Is there really no chance such costs can be tolerated?

@fowles
Copy link
Member

fowles commented Jul 19, 2022

@cwharris the performance measurements we are using are only available inside google. They are essentially websearch performance tests. Even a half a percent of websearches overall performance foot print turns out to be a very large number.

@coryan
Copy link
Contributor

coryan commented Jul 19, 2022

Sounds like the code is important for performance inside Google, but a hassle outside Google. Consider the following alternatives:

  • Add a macro to enable the dangerous (or at least "only safe inside Google") optimizations:
#if (defined(NDEBUG) || defined(_MSC_VER)) && defined(PROTOBUF_DANGEROUS_GOOGLE_INTERNAL_OPTIMIZATIONS)
  ~InternalMetadata() {
    if (HasMessageOwnedArenaTag()) {
      delete reinterpret_cast<Arena*>(ptr_ - kMessageOwnedArenaTagMask);
    }
  }
#else 
  ~InternalMetadata();
#endif 

The documentation of this macro can explain why it is a bad idea to set it outside Google, and you can have it always set in the Google builds.

  • Use something other than NDEBUG, maybe GOOGLE_ABI_BREAKING_NDEBUG, and change the Google builds to turn on that macro for debug builds.

  • Always use the optimized path:

#if defined(_MSC_VER)
  ~InternalMetadata() {
      if (HasMessageOwnedArenaTag()) {
        delete reinterpret_cast<Arena*>(ptr_ - kMessageOwnedArenaTagMask);
      }
   }
#else
   ~InternalMetadata();
#endif
  }

@fowles
Copy link
Member

fowles commented Jul 19, 2022

@coryan that is only one of quite a few things that have ABI dependencies

@pitrou
Copy link
Contributor Author

pitrou commented Jul 19, 2022

It's the only problem that seems to be breaking third-party downstream projects, though, so probably worth tackling?

@fowles
Copy link
Member

fowles commented Jul 19, 2022

If someone has the cycles to create a PR, I can review it

pitrou added a commit to pitrou/protobuf that referenced this issue Jul 19, 2022
@fowles
Copy link
Member

fowles commented Jul 19, 2022

Re-opening since we would be willing to review a PR, but we cannot commit to fixing this in the near term (we are too busy with other commitments).

@fowles fowles reopened this Jul 19, 2022
@fowles fowles closed this as completed in 0574167 Jul 19, 2022
fowles added a commit that referenced this issue Jul 19, 2022
Fix #9947: make the ABI identical between debug and non-debug builds
acozzette pushed a commit to acozzette/protobuf that referenced this issue Jul 19, 2022
gcjenkinson pushed a commit to CTSRD-CHERI/protobuf that referenced this issue Feb 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants