Build issue using Musl rather than glibc #503

Closed
PureTryOut opened this issue Jul 31, 2018 · 32 comments

@PureTryOut

PureTryOut commented Jul 31, 2018

I'm trying to compile Mir as part of my effort to port Unity8 to postmarketOS. Being based on Alpine Linux, postmarketOS uses Musl rather than glibc. This hadn't caused any major problems until Mir, which uses the function dlvsym (here), a glibc extension that is not described by POSIX (see here).
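For readers following along, a minimal sketch of one way to keep such a lookup portable: use dlvsym only when building against glibc and fall back to plain dlsym elsewhere. The helper name `lookup_symbol` is hypothetical and this is not the code from Mir or from the patch mentioned in this thread. It also assumes `_GNU_SOURCE` is defined so that glibc declares dlvsym (g++ does this by default).

```cpp
#include <cassert>
#include <dlfcn.h>

// Hypothetical helper, not Mir's actual code: look up `name` in `handle`,
// honouring `version` only where the C library provides dlvsym().
void* lookup_symbol(void* handle, char const* name, char const* version)
{
    assert(name != nullptr);
#if defined(__GLIBC__)
    // glibc extension: resolve a specific symbol version when one is requested.
    if (version != nullptr)
        return dlvsym(handle, name, version);
#else
    // Other libcs (e.g. musl): no dlvsym(), so the version is ignored.
    (void)version;
#endif
    return dlsym(handle, name);
}
```

The trade-off is that on non-glibc systems the version argument is silently dropped, so whichever definition the dynamic linker finds first is returned.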

When patching that out (using this patch), it fails later on lines 59 to 65 with the following:

/home/pmos/build/src/mir-0.32.1/src/common/posix_rw_mutex.cpp: In constructor 'mir::PosixRWMutex::PosixRWMutex(mir::PosixRWMutex::Type)':
/home/pmos/build/src/mir-0.32.1/src/common/posix_rw_mutex.cpp:59:28: error: 'PTHREAD_RWLOCK_DEFAULT_NP' was not declared in this scope
             pthread_type = PTHREAD_RWLOCK_DEFAULT_NP;
                            ^~~~~~~~~~~~~~~~~~~~~~~~~
/home/pmos/build/src/mir-0.32.1/src/common/posix_rw_mutex.cpp:62:28: error: 'PTHREAD_RWLOCK_PREFER_READER_NP' was not declared in this scope
             pthread_type = PTHREAD_RWLOCK_PREFER_READER_NP;
                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pmos/build/src/mir-0.32.1/src/common/posix_rw_mutex.cpp:65:28: error: 'PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP' was not declared in this scope
             pthread_type = PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP;
                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pmos/build/src/mir-0.32.1/src/common/posix_rw_mutex.cpp:92:60: error: 'pthread_rwlockattr_setkind_np' was not declared in this scope
     err = pthread_rwlockattr_setkind_np(&attr, pthread_type);
                                                            ^

I'm able to fix that using this patch.
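As an illustration only (the linked patch may well do something different), the non-portable `_NP` pieces can be fenced behind a glibc check so that other libcs simply get the default read-write lock behaviour. The enum `RWLockPreference` and the function `init_rwlock` are made-up names for this sketch.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical stand-in for mir::PosixRWMutex::Type.
enum class RWLockPreference { platform_default, prefer_reader, prefer_writer_nonrecursive };

// Initialise `lock`, applying the reader/writer preference only on glibc,
// where the PTHREAD_RWLOCK_*_NP constants and setkind_np() exist.
int init_rwlock(pthread_rwlock_t& lock, RWLockPreference preference)
{
    pthread_rwlockattr_t attr;
    int err = pthread_rwlockattr_init(&attr);
    if (err != 0)
        return err;

#if defined(__GLIBC__)
    int kind = PTHREAD_RWLOCK_DEFAULT_NP;
    if (preference == RWLockPreference::prefer_reader)
        kind = PTHREAD_RWLOCK_PREFER_READER_NP;
    else if (preference == RWLockPreference::prefer_writer_nonrecursive)
        kind = PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP;
    err = pthread_rwlockattr_setkind_np(&attr, kind);
#else
    (void)preference;  // No portable equivalent: keep the libc's default policy.
#endif

    if (err == 0)
        err = pthread_rwlock_init(&lock, &attr);

    int const destroy_err = pthread_rwlockattr_destroy(&attr);
    assert(destroy_err == 0);
    (void)destroy_err;
    return err;
}
```

Note that on Musl this means the requested preference is ignored rather than emulated.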

Then it fails on line 87-88 here, as dev_t was not declared in this scope. Full error:

In file included from /home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.cpp:19:0:
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.h:87:24: error: 'dev_t' was not declared in this scope
     std::unordered_map<dev_t, std::future<std::unique_ptr<mir::Device>>> pending_devices;
                        ^~~~~
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.h:87:72: error: template argument 1 is invalid
     std::unordered_map<dev_t, std::future<std::unique_ptr<mir::Device>>> pending_devices;
                                                                        ^
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.h:87:72: error: template argument 3 is invalid
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.h:87:72: error: template argument 4 is invalid
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.h:87:72: error: template argument 5 is invalid
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.h:88:24: error: 'dev_t' was not declared in this scope
     std::unordered_map<dev_t, std::unique_ptr<mir::Device>> device_watchers;
                        ^~~~~
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.h:88:58: error: template argument 1 is invalid
     std::unordered_map<dev_t, std::unique_ptr<mir::Device>> device_watchers;
                                                          ^~
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.h:88:58: error: template argument 3 is invalid
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.h:88:58: error: template argument 4 is invalid
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.h:88:58: error: template argument 5 is invalid
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.cpp: In lambda function:
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.cpp:295:45: error: request for member 'count' in '((mir::input::evdev::Platform*)this)->mir::input::evdev::Platform::pending_devices', which is of non-class type 'int'
                         if (pending_devices.count(workaround_device->devnum()) > 0 ||
                                             ^~~~~
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.cpp:296:45: error: request for member 'count' in '((mir::input::evdev::Platform*)this)->mir::input::evdev::Platform::device_watchers', which is of non-class type 'int'
                             device_watchers.count(workaround_device->devnum()) > 0)
                                             ^~~~~
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.cpp:302:41: error: request for member 'emplace' in '((mir::input::evdev::Platform*)this)->mir::input::evdev::Platform::pending_devices', which is of non-class type 'int'
                         pending_devices.emplace(
                                         ^~~~~~~
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.cpp:328:49: error: request for member 'erase' in '((mir::input::evdev::Platform*)this)->mir::input::evdev::Platform::device_watchers', which is of non-class type 'int'
                                 device_watchers.erase(device.devnum());
                                                 ^~~~~
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.cpp: In member function 'void mir::input::evdev::Platform::device_added(libinput_device*)':
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.cpp:396:21: error: request for member 'emplace' in '((mir::input::evdev::Platform*)this)->mir::input::evdev::Platform::device_watchers', which is of non-class type 'int'
     device_watchers.emplace(devnum, pending_devices.at(devnum).get());
                     ^~~~~~~
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.cpp:396:53: error: request for member 'at' in '((mir::input::evdev::Platform*)this)->mir::input::evdev::Platform::pending_devices', which is of non-class type 'int'
     device_watchers.emplace(devnum, pending_devices.at(devnum).get());
                                                     ^~
/home/pmos/build/src/mir-0.32.1/src/platforms/evdev/platform.cpp:397:21: error: request for member 'erase' in '((mir::input::evdev::Platform*)this)->mir::input::evdev::Platform::pending_devices', which is of non-class type 'int'
     pending_devices.erase(devnum);
                     ^~~~~

That was fixed using this patch.
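For context, POSIX defines dev_t in <sys/types.h>. With glibc that header tends to be pulled in indirectly by other includes, which is presumably why the code built there. A small self-contained sketch of the explicit include follows. `Device` and `DeviceRegistry` are hypothetical stand-ins for the Mir types, not the real classes.

```cpp
#include <sys/types.h>  // dev_t: include explicitly instead of relying on other headers

#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>

struct Device { int fd = -1; };  // hypothetical stand-in for mir::Device

// Hypothetical, reduced version of the maps keyed by device number.
class DeviceRegistry
{
public:
    void add(dev_t devnum, std::unique_ptr<Device> device)
    {
        assert(device != nullptr);
        device_watchers.emplace(devnum, std::move(device));
    }

    bool contains(dev_t devnum) const { return device_watchers.count(devnum) > 0; }

    void remove(dev_t devnum) { device_watchers.erase(devnum); }

private:
    std::unordered_map<dev_t, std::unique_ptr<Device>> device_watchers;
};
```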

@AlanGriffiths
Contributor

We're not in a position to actively support a port to Musl, but would be happy to consider PRs.

@PureTryOut
Author

Yeah I'm slowly working through the errors myself, and will PR everything that fixes the issues I encounter.

@AlanGriffiths
Contributor

@PureTryOut I've seen the patches at https://gitlab.com/postmarketOS/pmaports/tree/feature/unity8/unity8/mir and wondered if you're ready to PR some fixes?

How well is Mir working for you?

@wmww
Contributor

wmww commented Sep 26, 2018

This morning I helped z3ntu get Mir master to compile for pmOS. They're packaging gtest now AFAIK because it's not on Alpine yet.

@PureTryOut
Author

PureTryOut commented Sep 26, 2018

Huh, Mir master already compiled, why did he need help?

@AlanGriffiths most fixes can be upstreamed imo, but at least one is not good enough. Basically it just removes -Werror from the C and CXX flags, which works fine for us but should be done properly if upstreamed.

I could do the rest separately though, we at least got it running!

@z3ntu
Contributor

z3ntu commented Sep 26, 2018

Apparently protobuf was upgraded in Alpine between the times you and I compiled it, and 7c426ea is not in 0.32.1.

@z3ntu
Contributor

z3ntu commented Sep 30, 2018

So I hit this error message when basically doing anything with the compiled libmirserver.so file (ldd, run an executable linked to it, etc):

Error relocating bin/../lib/libmirserver.so.47: _ZN5boost4asio6detail15keyword_tss_ptrINS1_10call_stackINS1_14thread_contextENS1_16thread_info_baseEE7contextEE6value_E: symbol not found

After spending a lot of time on this, and fortunately being told by @PureTryOut that v0.32.1 worked, I wrote a git bisect script, which produced this:

c9f2846a30fbf26f059ecdecef2b87e35f653e3a is the first bad commit
commit c9f2846a30fbf26f059ecdecef2b87e35f653e3a
Author: Alan Griffiths <alan@octopull.co.uk>
Date:   Tue Jul 24 12:40:01 2018 +0100

    Default linker to "gold"

:100644 100644 41744e8f5b626affe169fe67d1274ad04cf07049 83c43f0faefbba643844484a40b0aa3a917b60d8 M      CMakeLists.txt
bisect run success

EDIT: After reverting that commit on master (plus my patches), the shared library seems to work properly. I'm now trying to get miral-shell to work.

@z3ntu
Contributor

z3ntu commented Oct 31, 2018

Tests look surprisingly good.

$ bin/mir_integration_tests

.... many lines ....

[==========] 164 tests from 35 test cases ran. (2002 ms total)
[  PASSED  ] 164 tests.

$ bin/mir_unit_tests --gtest_filter=-ProbingClientPlatformFactory.*:MesaKMS/MesaClientPlatformTest.*:MesaClientPlatformTest.*:MesaKMS/ClientPlatformTest.*:MesaX11/ClientPlatformTest.*

.... many lines ....

[==========] 1491 tests from 177 test cases ran. (50179 ms total)
[  PASSED  ] 1484 tests.
[  FAILED  ] 7 tests, listed below:
[  FAILED  ] SharedLibrary.load_nonexistent_function_fails_with_useful_info
[  FAILED  ] MultiThreadedCompositor.names_compositor_threads
[  FAILED  ] BasicConnector.names_ipc_threads
[  FAILED  ] ThreadedSnapshotStrategyTest.names_snapshot_thread
[  FAILED  ] BasicThreadPool.executes_on_preferred_thread
[  FAILED  ] BasicThreadPool.recycles_threads
[  FAILED  ] ThreadedDispatcherTest.sets_thread_names_appropriately

 7 FAILED TESTS
  YOU HAVE 3 DISABLED TESTS

(most failures are caused by the patches linked below)

The acceptance tests have more problems though :)

Using https://github.com/z3ntu/mir/commits/e9f0a81d6fa96417783a6cfc69bb483ae02b07d6

@AlanGriffiths
Contributor

> Using https://github.com/z3ntu/mir/commits/e9f0a81d6fa96417783a6cfc69bb483ae02b07d6

@z3ntu browsing through those commits I see that you change #! /bin/bash to #!/usr/bin/env sh.

I understand that bash isn't always at /bin/bash, and that the /usr/bin/env ... hack finds it using $PATH. But the scripts use bash-specific stuff, so using env to find sh (or even just /bin/sh) breaks on, for example, Ubuntu.

Do you have bash in your setup? I.e. does #!/usr/bin/env bash work for you?

@PureTryOut
Author

We can add it as a dependency, so that would work yes.

@AlanGriffiths
Contributor

> We can add it as a dependency, so that would work yes.

OK, proposed that: #637

@z3ntu
Contributor

z3ntu commented Jan 9, 2019

I'm happy to report that when merging the following branches onto master (they all have an open PR currently) and applying the tiny patch from #696, Mir compiles on Alpine Linux:
origin/dlvsym z3ntu/pthread_getname_np z3ntu/musl-tests z3ntu/poll_h

Tests are a different story though ^^ (and #503 (comment) is still valid too)

@z3ntu
Contributor

z3ntu commented Jan 12, 2019

@AlanGriffiths I get the following warning/error now:

/home/pmos/build/src/mir-1.1.0/src/server/frontend_wayland/xdg_shell_v6.cpp: In constructor 'mir::frontend::XdgPopupV6::XdgPopupV6(wl_client*, wl_resource*, uint32_t, mir::frontend::XdgSurfaceV6*, mir::frontend::XdgSurfaceV6*, wl_resource*, mir::frontend::WlSurface*)':
/home/pmos/build/src/mir-1.1.0/src/server/frontend_wayland/xdg_shell_v6.cpp:261:68: error: 'parent_role' may be used uninitialized in this function [-Werror=maybe-uninitialized]
         specification->parent_id = parent_role.value()->surface_id();
                                                                    ^

@AlanGriffiths
Contributor

@z3ntu that's just weird. It is initialized on creation:

    auto parent_role = parent_surface->window_role();

    specification->type = mir_window_type_freestyle;
    specification->placement_hints = mir_placement_hints_slide_any;
    if (parent_role)
        specification->parent_id = parent_role.value()->surface_id();

@z3ntu
Contributor

z3ntu commented Jan 13, 2019

I also find it weird, because I didn't see that error when building in a shell in an Alpine VM, but I do get it when building for a postmarketOS package (which technically should use exactly the same packages, but something is different).

Build commands:

cmake \
   -DCMAKE_INSTALL_PREFIX:PATH=/usr \
   -DCMAKE_INSTALL_LIBDIR=lib \
   -DMIR_USE_LD=ld
make

Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-alpine-linux-musl/8.2.0/lto-wrapper
Target: x86_64-alpine-linux-musl
Configured with: /home/buildozer/aports/main/gcc/src/gcc-8.2.0/configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --build=x86_64-alpine-linux-musl --host=x86_64-alpine-linux-musl --target=x86_64-alpine-linux-musl --with-pkgversion='Alpine 8.2.0' --enable-checking=release --disable-fixed-point --disable-libstdcxx-pch --disable-multilib --disable-nls --disable-werror --disable-symvers --enable-__cxa_atexit --enable-default-pie --enable-default-ssp --enable-cloog-backend --enable-languages=c,c++,objc,fortran,ada --disable-libssp --disable-libmpx --disable-libmudflap --disable-libsanitizer --enable-shared --enable-threads --enable-tls --with-system-zlib --with-linker-hash-style=gnu
Thread model: posix
gcc version 8.2.0 (Alpine 8.2.0)

gcc command that triggers that warning/error:

cd /home/pmos/build/src/mir-1.1.0/src/server/frontend_wayland && /usr/lib/ccache/bin/c++  -DLOG_NDEBUG=1 -DMESA_EGL_NO_X11_HEADERS -DMIR_DRMMODEADDFB_HAS_CONST_SIGNATURE -DMIR_LOG_COMPONENT_FALLBACK=\"mirserver\" -DMIR_SERVER_EGL_OPENGL_API=EGL_OPENGL_ES_API -DMIR_SERVER_EGL_OPENGL_BIT=EGL_OPENGL_ES2_BIT -DMIR_SERVER_GLEXT_H="<GLES2/gl2ext.h>" -DMIR_SERVER_GL_H="<GLES2/gl2.h>" -DMIR_SERVER_GRAPHICS_PLATFORM_VERSION=\"MIR_GRAPHICS_PLATFORM_0.32\" -DMIR_SERVER_INPUT_PLATFORM_VERSION=\"MIR_INPUT_PLATFORM_0.27\" -DMIR_SERVER_PLATFORM_PATH=\"/usr/lib/mir/server-platform\" -DMIR_VERSION=\"1.1.0\" -DMIR_VERSION_MAJOR=1 -DMIR_VERSION_MICRO=0 -DMIR_VERSION_MINOR=1 -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -I/home/pmos/build/src/mir-1.1.0/include/core -I/home/pmos/build/src/mir-1.1.0/include/common -I/home/pmos/build/src/mir-1.1.0/include/cookie -I/home/pmos/build/src/mir-1.1.0/src/include/common -I/home/pmos/build/src/mir-1.1.0/src/capnproto -I/home/pmos/build/src/mir-1.1.0/src/protobuf -I/home/pmos/build/src/mir-1.1.0/include/platform -I/home/pmos/build/src/mir-1.1.0/include/client -I/home/pmos/build/src/mir-1.1.0/include/server -I/home/pmos/build/src/mir-1.1.0/include/renderer -I/home/pmos/build/src/mir-1.1.0/include/renderers/gl -I/home/pmos/build/src/mir-1.1.0/include/renderers/sw -I/home/pmos/build/src/mir-1.1.0/src/include/platform -I/home/pmos/build/src/mir-1.1.0/src/include/client -I/home/pmos/build/src/mir-1.1.0/src/include/server -I/home/pmos/build/src/mir-1.1.0/src/include/cookie -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/gio-unix-2.0 -I/home/pmos/build/src/mir-1.1.0/src/server/frontend_wayland/../frontend_xwayland  -Os -fomit-frame-pointer -pthread -g -std=c++14 -Werror -Wall -fno-strict-aliasing -pedantic -Wnon-virtual-dtor -Wextra -fPIC -Wno-psabi   -std=c++14 -o CMakeFiles/mirfrontend-wayland.dir/xdg_shell_v6.cpp.o -c /home/pmos/build/src/mir-1.1.0/src/server/frontend_wayland/xdg_shell_v6.cpp

@wmww
Contributor

wmww commented Jan 15, 2019

I replaced your absolute paths with the ones on my system and replaced /usr/lib/ccache/bin/c++ with g++, and I reproduced the weird warning.

Things that didn't remove the warning:

  • Declaring the variable with the full type (std::experimental::optional<WindowWlSurfaceRole*>) instead of auto
  • Switching to brace initialization
  • Changing the variable name

Things that did:

  • Declaring the variable with a type and then setting it on the next line
  • (EDIT) Unwrapping the optional immediately with a .value_or(nullptr), and then removing the .value() where we get the ID
  • (EDIT) Moving two unrelated lines below the if block (see below)
  • (EDIT) Moving auto parent_role = parent_surface->window_role(); to below the two lines that set specification properties
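To make the .value_or(nullptr) variant from the list above concrete, here is a reduced sketch that puts the original shape and the unwrapped shape side by side. All types are simplified, hypothetical stand-ins (`WindowRole`, `Specification`), and it uses C++17 std::optional rather than the std::experimental::optional from the thread, so it is not expected to reproduce the warning itself.

```cpp
#include <optional>

struct WindowRole  // hypothetical stand-in for WindowWlSurfaceRole
{
    int id;
    int surface_id() const { return id; }
};

struct Specification  // hypothetical, heavily reduced
{
    int type = 0;
    int placement_hints = 0;
    int parent_id = -1;
};

// Shape of the code that triggered the warning in the thread.
void fill_original(Specification& spec, std::optional<WindowRole*> parent_role)
{
    spec.type = 1;
    spec.placement_hints = 2;
    if (parent_role)
        spec.parent_id = parent_role.value()->surface_id();
}

// Workaround shape: unwrap once, then only ever test the raw pointer.
void fill_unwrapped(Specification& spec, std::optional<WindowRole*> maybe_role)
{
    WindowRole* const parent_role = maybe_role.value_or(nullptr);
    spec.type = 1;
    spec.placement_hints = 2;
    if (parent_role)
        spec.parent_id = parent_role->surface_id();
}
```

One behavioural difference worth noting: an optional that is engaged but holds a null pointer would be dereferenced by the first version and skipped by the second.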

@wmww
Contributor

wmww commented Jan 15, 2019

Interesting. When I replace the body of the if with

auto parent_role_real = parent_role.value();
specification->parent_id = parent_role_real->surface_id();

I still get the warning xdg_shell_v6.cpp:263:65: error: ‘parent_role’ may be used uninitialized in this function [-Werror=maybe-uninitialized], but it's on the second line (a line that doesn't even mention the variable it's erroring over).

@wmww
Contributor

wmww commented Jan 15, 2019

Moving

specification->type = mir_window_type_freestyle;
specification->placement_hints = mir_placement_hints_slide_any;

to below the if/else block fixes it, wtf?

@z3ntu
Contributor

z3ntu commented Jan 15, 2019

Wtf
But it looks like you're having fun with it :P

@AlanGriffiths
Contributor

This is clearly a compiler bug. How about coming up with a minimal example and reporting it?

@wmww
Contributor

wmww commented Jan 16, 2019

It will take some amount of work to get a minimal example (especially since many minor changes make the bug go away). I want to do it eventually, but it's low on my priorities. It may be useful to know that xdg_shell_stable.cpp has the same issue, even though the code is slightly different.

@z3ntu
Contributor

z3ntu commented Jan 26, 2019

@wmww Did you make any progress on that issue?

@z3ntu
Contributor

z3ntu commented Jan 27, 2019

It might be this bug report https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86465 ?

@wmww
Contributor

wmww commented Jan 27, 2019

@z3ntu Haven't been back to it, but that bug report does appear to be about the issue.

@AlanGriffiths
Contributor

@z3ntu this is clearly a compiler issue.

As there is no obvious rephrasing of the code that works around it, could you try detecting the compiler+version and disabling maybe-uninitialized with set_source_files_properties()?
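The suggestion above is a build-system fix. A different, in-source technique (named plainly as an alternative, not what was proposed) is to scope the suppression with GCC's diagnostic pragmas around the affected function. The sketch below uses hypothetical names (`Role`, `parent_id_or`) and applies to any GCC version. Narrowing it to a specific version with `__GNUC__` would be a further assumption about which releases are affected.

```cpp
#include <cassert>
#include <optional>

struct Role { int id; };  // hypothetical stand-in for the window role type

// Suppress -Wmaybe-uninitialized for this region only, and only on real GCC
// (clang also defines __GNUC__, so it is excluded explicitly).
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
#endif

int parent_id_or(std::optional<Role*> parent_role, int fallback)
{
    if (parent_role)
    {
        // An engaged optional is expected to hold a non-null role.
        assert(*parent_role != nullptr);
        return parent_role.value()->id;
    }
    return fallback;
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic pop
#endif
```

Because this warning is produced after inlining, a pragma placed around one function may not cover every location GCC attributes it to, so this may be less reliable than a per-file compile flag.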

@z3ntu
Contributor

z3ntu commented Jan 28, 2019

I've changed the Alpine package to use clang instead of gcc for now

@AlanGriffiths
Contributor

> It might be this bug report https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86465 ?

I came across this and thought of this problem.

@z3ntu
Contributor

z3ntu commented Mar 26, 2019

This issue is basically solved, except that the tests don't work (which doesn't really annoy me too much).

What I would appreciate is if the CI were set up to also compile everything on Alpine Linux, to make sure there are no "compile-time regressions" :)

@Saviq
Collaborator

Saviq commented Mar 27, 2019

@z3ntu while it's a noble goal, it's basically impossible to cover every distro / toolchain in CI… If you have infrastructure onto which we could deploy this, we could try.

I suppose we could also try and include a musl build on Ubuntu, that would be much lower maintenance and should cover Alpine for the most part?

@z3ntu
Contributor

z3ntu commented Mar 28, 2019

Sure, a musl on Ubuntu CI would suffice.
It's just that it's "easy" for a change to break musl support (like a missing include), and it took quite some time to fix all the little things that we came across while getting Mir running on Alpine.

@AlanGriffiths
Contributor

> Sure, a musl on Ubuntu CI would suffice.

Raised that as a specific issue. Can we close this issue now? (Possibly first raise a new issue for the tests?)

@z3ntu
Contributor

z3ntu commented Mar 29, 2019

Opened #778. Sure, let's close this issue as Mir now compiles and runs fine on Alpine/postmarketOS.

Saviq closed this as completed Mar 29, 2019