Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: CommandLineInterfaceTest failing with "SEH exception" on windows with v25.2 #15436

Closed
h-vetinari opened this issue Jan 13, 2024 · 4 comments
Labels

Comments

@h-vetinari
Copy link
Contributor

h-vetinari commented Jan 13, 2024

In conda-forge, all 151 tests in CommandLineInterfaceTest fail with the following error:

[ RUN      ] CommandLineInterfaceTest.BasicOutput
unknown file: error: SEH exception with code 0xc0000005 thrown in the test fixture's constructor.
Stack trace:


[  FAILED  ] CommandLineInterfaceTest.BasicOutput (1 ms)

Looking for "SEH exception ...", this seems to be coming from windows directly, and is possibly related due to having some kind of ABI mismatch (alignment; MD vs MT etc.).

Given that background, it's probably worth mentioning that we're building a shared version of libprotobuf also on windows - it's conceivable that something about the CLI is always built against a static runtime?

In any case, this setup used to work without issues up until v24.4.

More context in this CI run.

What version of protobuf and what language are you using?
Version: v25.2
Language: C++

What operating system (Linux, Windows, ...) and version?

Windows (GH image windows-2022)

What runtime / compiler are you using (e.g., python version or gcc version)

VS2019

Build environment
    bzip2:           1.0.8-hcfcfb64_5        conda-forge
    ca-certificates: 2023.11.17-h56e8100_0   conda-forge
    cmake:           3.28.1-hf0feee3_0       conda-forge
    krb5:            1.21.2-heb0366b_0       conda-forge
    libcurl:         8.5.0-hd5e4a3a_0        conda-forge
    libexpat:        2.5.0-h63175ca_1        conda-forge
    libssh2:         1.11.0-h7dfc565_0       conda-forge
    libuv:           1.44.2-hcfcfb64_1       conda-forge
    libzlib:         1.2.13-hcfcfb64_5       conda-forge
    ninja:           1.11.1-h91493d7_0       conda-forge
    openssl:         3.2.0-hcfcfb64_1        conda-forge
    ucrt:            10.0.22621.0-h57928b3_0 conda-forge
    vc:              14.3-hcf57466_18        conda-forge
    vc14_runtime:    14.38.33130-h82b7239_18 conda-forge
    vs2015_runtime:  14.38.33130-hcb4865c_18 conda-forge
    vs2019_win-64:   19.29.30139-he1865b1_18 conda-forge
    vswhere:         3.1.4-h57928b3_0        conda-forge
    xz:              5.2.6-h8d14728_0        conda-forge
    zstd:            1.5.5-h12be248_0        conda-forge
Host environment
    gtest:           1.14.0-h91493d7_1           conda-forge
    jsoncpp:         1.9.5-h2d74725_1            conda-forge
    libabseil:       20230802.1-cxx17_h63175ca_0 conda-forge
    libabseil-tests: 20230802.1-cxx17_h63175ca_0 conda-forge
    libzlib:         1.2.13-hcfcfb64_5           conda-forge
    ucrt:            10.0.22621.0-h57928b3_0     conda-forge
    vc:              14.3-hcf57466_18            conda-forge
    vc14_runtime:    14.38.33130-h82b7239_18     conda-forge
    vs2015_runtime:  14.38.33130-hcb4865c_18     conda-forge
    zlib:            1.2.13-hcfcfb64_5           conda-forge

What did you do?

mkdir build
cd build
cmake -G "Ninja" ^
    -DCMAKE_BUILD_TYPE=Release ^
    -DCMAKE_CXX_STANDARD=17 ^
    -DCMAKE_INSTALL_PREFIX=%LIBRARY_PREFIX% ^
    -DCMAKE_PREFIX_PATH=%LIBRARY_PREFIX% ^
    -Dprotobuf_ABSL_PROVIDER="package" ^
    -Dprotobuf_BUILD_SHARED_LIBS=ON ^
    -Dprotobuf_JSONCPP_PROVIDER="package" ^
    -Dprotobuf_MSVC_STATIC_RUNTIME=OFF ^
    -Dprotobuf_USE_EXTERNAL_GTEST=ON ^
    -Dprotobuf_WITH_ZLIB=ON ^
    ..
if %ERRORLEVEL% neq 0 exit 1

cmake --build .
if %ERRORLEVEL% neq 0 exit 1

ctest --progress --output-on-failure
if %ERRORLEVEL% neq 0 exit 1

What did you expect to see

Passing test suite

What did you see instead?

See above

Anything else we should know about your project / environment

@h-vetinari h-vetinari added the untriaged auto added to all issues by default when created. label Jan 13, 2024
@h-vetinari
Copy link
Contributor Author

Looking at the blame of the respective file (and the logs noting something about the fixture constructor), my first suspicion was that this is related to

void SetMockGeneratorTestCase(absl::string_view name) {
#ifdef _WIN32
::_putenv(absl::StrCat("TEST_CASE", "=", name).c_str());
#else
::setenv("TEST_CASE", name.data(), 1);
#endif
}

_putenv on windows requires <stdlib.h> (though that's probably satisfied by some transitive include), but more importantly, it's explicitly not threadsafe. The docs note:

The _putenv and _getenv families of functions are not thread-safe. _getenv could return a string pointer while _putenv is modifying the string, causing random failures. Make sure that calls to these functions are synchronized.

However, even with including that header and running the test suite in parallel, the failures persist. They also persist if we build the static variant of protobuf.

@h-vetinari
Copy link
Contributor Author

Couple of other things I noticed:

  • _putenv is soft-deprecated in favour of _putenv_s, a "more secure" variant

  • Both variants have a table that describe which function flavour to use depending on some global symbols related to Unicode and multibyte strings, reproduced for _putenv_s below:

    TCHAR.H routine _UNICODE and _MBCS not defined _MBCS defined _UNICODE defined
    _tputenv_s _putenv_s _putenv_s _wputenv_s

    This is actually a pretty good candidate for an ABI break IMO.

  • Protobuf will set the _UNICODE symbol based on protobuf_UNICODE

    if (protobuf_UNICODE)
    target_compile_definitions("${target}" PRIVATE -DUNICODE -D_UNICODE)
    endif ()

  • The windows tests in appveyor.bat only test the protobuf_UNICODE=ON case.

  • Another gem in the MSFT docs about UNICODE vs. shared builds:

    Polling _environ in a Unicode context is meaningless when /MD or /MDd linkage is used. For the CRT DLL, the type (wide or multibyte) of the program is unknown. Only the multibyte type is created because that is the most likely scenario.

    "Most likely" indeed... 🤦

@h-vetinari
Copy link
Contributor Author

I guess it's also possible that something could be happening related to the following hunk from e897bcf

@@ -242,9 +252,13 @@ class CommandLineInterfaceTest::NullCodeGenerator : public CodeGenerator {
 // ===================================================================

 CommandLineInterfaceTest::CommandLineInterfaceTest() {
+  // Reset the mock generator's test case environment variable.
+  SetMockGeneratorTestCase("");
+
   // Register generators.
-  RegisterGenerator("--test_out", "--test_opt",
-                    std::make_unique<MockCodeGenerator>("test_generator"),
+  auto mock_generator = std::make_unique<MockCodeGenerator>("test_generator");
+  mock_generator_ = mock_generator.get();
+  RegisterGenerator("--test_out", "--test_opt", std::move(mock_generator),
                     "Test output.");
   RegisterGenerator("-t", std::make_unique<MockCodeGenerator>("test_generator"),
                     "Test output.");

It looks strange to me that we're holding a persistent reference in mock_generator_ to something that's then moved (this is also not done consistently between --test_out and -t, which should otherwise be the same AFAICT).

@h-vetinari
Copy link
Contributor Author

There's also a couple of leaks around mocked objects which happen in other tests; not sure if that's an unrelated issue or somehow connected:

D:\bld\libprotobuf-split_1705373215324\work\src\google\protobuf\io\coded_stream_unittest.cc(1494): ERROR: this mock object (used in test CodedStreamTest.InputOver2G) should be deleted but never is. Its address is @000000B6EBEFE170.
D:\bld\libprotobuf-split_1705373215324\work\src\google\protobuf\text_format_unittest.cc(2107): ERROR: this mock object (used in test TextFormatParserTest.PrintErrorsToStderr) should be deleted but never is. Its address is @000000B6EBEFE2A0.
D:\bld\libprotobuf-split_1705373215324\work\src\google/protobuf/message_unittest.inc(225): ERROR: this mock object (used in test MessageTest.ParseFailsIfSubmessageNotInitialized) should be deleted but never is. Its address is @000000B6EBEFE4E0.
D:\bld\libprotobuf-split_1705373215324\work\src\google/protobuf/message_unittest.inc(206): ERROR: this mock object (used in test MessageTest.ParseFailsIfNotInitialized) should be deleted but never is. Its address is @000000B6EBEFE520.
D:\bld\libprotobuf-split_1705373215324\work\src\google\protobuf\io\coded_stream_unittest.cc(1365): ERROR: this mock object (used in test CodedStreamTest.TotalBytesLimit) should be deleted but never is. Its address is @000000B6EBEFE540.
D:\bld\libprotobuf-split_1705373215324\work\src\google/protobuf/message_unittest.inc(249): ERROR: this mock object (used in test MessageTest.ParseFailsIfExtensionNotInitialized) should be deleted but never is. Its address is @000000B6EBEFE570.
D:\bld\libprotobuf-split_1705373215324\work\src\google\protobuf\descriptor_unittest.cc(6188): ERROR: this mock object (used in test ValidationErrorTest.ErrorsReportedToLogError) should be deleted but never is. Its address is @000000B6EBEFE5B0.
D:\bld\libprotobuf-split_1705373215324\work\src\google\protobuf\descriptor_unittest.cc(11238): ERROR: this mock object (used in test DatabaseBackedPoolTest.ErrorWithoutErrorCollector) should be deleted but never is. Its address is @000000B6EBEFE5D0.
ERROR: 8 leaked mock objects found at program exit. Expectations on a mock object are verified when the object is destructed. Leaking a mock means that its expectations aren't verified, which is usually a test bug. If you really intend to leak a mock, you can suppress this error using testing::Mock::AllowLeak(mock_object), or you may use a fake or stub instead of a mock.

@zhangskz zhangskz added 25.x and removed untriaged auto added to all issues by default when created. labels Jan 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants