www-client/ungoogled-chromium: 109.0.5414.74: crashes on load #197

Closed
baconsalad opened this issue Jan 9, 2023 · 89 comments
Labels
bug Something isn't working

Comments

@baconsalad

On load the browser window comes up as crashed, continuously tries to reload in the background and never succeeds.

[23957:23957:0109/175547.856369:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[23957:23957:0109/175547.873953:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[23957:23957:0109/175547.892552:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[23957:23957:0109/175547.909860:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[23957:23957:0109/175547.926557:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[23957:23957:0109/175547.943069:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.

current build flags

www-client/ungoogled-chromium::pf4public -cfi clang -cups custom-cflags -enable-driver -hangouts hevc official optimize-thinlto optimize-webui proprietary-codecs -screencast -suid -system-ffmpeg -system-harfbuzz -system-icu -system-jsoncpp -system-libevent -system-libvpx -system-openh264 -system-openjpeg tcmalloc thinlto vaapi vdpau -widevine
baconsalad added the bug label Jan 9, 2023
@PF4Public
Owner

Mine is still building and I'm afraid that's too little information to work with.

@baconsalad
Author

I've already rolled back. When the next version comes out I'll build that and see if it's still happening.

(Accidentally closed this, reopen if you want)

@PF4Public
Owner

worksforme :)
image

🤷

@baconsalad
Author

What are your use flags?

@PF4Public
Owner

USE="X clang convert-dict cups custom-cflags hevc js-type-check official optimize-thinlto optimize-webui pgo proprietary-codecs pulseaudio qt5 system-av1 system-ffmpeg system-harfbuzz system-icu system-jsoncpp system-libevent system-libusb system-openh264 system-openjpeg system-png system-re2 system-snappy thinlto vaapi -cfi -debug -enable-driver -gtk4 -hangouts -headless -kerberos -pic -screencast (-selinux) -suid -system-libvpx -vdpau -wayland -widevine"

I doubt it has anything to do with flags. You'd better look into dmesg for any segfaults or anything suspicious.

@baconsalad
Author

Our use flags are almost identical. It gave me some goodies this time in dmesg.

[57962.562576] Chrome_ChildIOT[21844]: segfault at 0 ip 0000561296bdd45a sp 00007f13d41fa720 error 4 in chrome[561292096000+b00d000] likely on CPU 1 (core 1, socket 0)
[57962.562582] Code: b1 01 4c 89 f2 0f b6 f9 48 89 c6 4c 89 f1 e8 cd 02 4c 06 48 ff 45 d8 48 83 c3 02 4c 39 fb 74 49 bf 28 00 00 00 e8 36 86 c2 ff <0f> b7 0b 48 8b 7d c0 89 48 20 48 85 ff 74 c7 0f 1f 80 00 00 00 00
[57962.578833] Chrome_ChildIOT[21854]: segfault at 0 ip 0000563ecaa2445a sp 00007fda20dfa720 error 4 in chrome[563ec5edd000+b00d000] likely on CPU 0 (core 0, socket 0)
[57962.578839] Code: b1 01 4c 89 f2 0f b6 f9 48 89 c6 4c 89 f1 e8 cd 02 4c 06 48 ff 45 d8 48 83 c3 02 4c 39 fb 74 49 bf 28 00 00 00 e8 36 86 c2 ff <0f> b7 0b 48 8b 7d c0 89 48 20 48 85 ff 74 c7 0f 1f 80 00 00 00 00
[57962.595370] Chrome_ChildIOT[21864]: segfault at 0 ip 00005564a4d1b45a sp 00007f29a5dfa720 error 4 in chrome[5564a01d4000+b00d000] likely on CPU 1 (core 1, socket 0)
[57962.595375] Code: b1 01 4c 89 f2 0f b6 f9 48 89 c6 4c 89 f1 e8 cd 02 4c 06 48 ff 45 d8 48 83 c3 02 4c 39 fb 74 49 bf 28 00 00 00 e8 36 86 c2 ff <0f> b7 0b 48 8b 7d c0 89 48 20 48 85 ff 74 c7 0f 1f 80 00 00 00 00
[57962.611575] Chrome_ChildIOT[21875]: segfault at 0 ip 000055f8df23f45a sp 00007f9f85bf9720 error 4 in chrome[55f8da6f8000+b00d000] likely on CPU 5 (core 5, socket 0)
[57962.611581] Code: b1 01 4c 89 f2 0f b6 f9 48 89 c6 4c 89 f1 e8 cd 02 4c 06 48 ff 45 d8 48 83 c3 02 4c 39 fb 74 49 bf 28 00 00 00 e8 36 86 c2 ff <0f> b7 0b 48 8b 7d c0 89 48 20 48 85 ff 74 c7 0f 1f 80 00 00 00 00
[57962.628167] Chrome_ChildIOT[21884]: segfault at 0 ip 000055bd6992045a sp 00007f9f42dfa720 error 4 in chrome[55bd64dd9000+b00d000] likely on CPU 1 (core 1, socket 0)
[57962.628172] Code: b1 01 4c 89 f2 0f b6 f9 48 89 c6 4c 89 f1 e8 cd 02 4c 06 48 ff 45 d8 48 83 c3 02 4c 39 fb 74 49 bf 28 00 00 00 e8 36 86 c2 ff <0f> b7 0b 48 8b 7d c0 89 48 20 48 85 ff 74 c7 0f 1f 80 00 00 00 00
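
The segfault lines above are easier to compare once each is reduced to an offset inside the chrome mapping, because the absolute addresses change on every run. A minimal sketch, assuming the usual Linux x86 message layout (`segfault at <addr> ip <ip> sp <sp> error <code> ... in <name>[<base>+<size>]`); the function and variable names are made up for illustration:

```python
import re

# Pulls the fault address, instruction pointer, page-fault error code and
# the mapping base out of a kernel "segfault at ..." line.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp [0-9a-f]+ "
    r"error (?P<err>[0-9a-f]+).* in (?P<name>\S+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return a dict describing one segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)  # x86 page-fault error code bits
    return {
        "binary": m["name"],
        "fault_addr": int(m["addr"], 16),
        # Offset of the faulting instruction within the reported mapping.
        "offset": int(m["ip"], 16) - int(m["base"], 16),
        "page_present": bool(err & 1),   # bit 0: 0 means the page was not mapped
        "access": "write" if err & 2 else "read",  # bit 1
        "user_mode": bool(err & 4),      # bit 2
    }


# Two of the lines from this report, timestamps dropped.
SAMPLE_LINES = [
    "Chrome_ChildIOT[21844]: segfault at 0 ip 0000561296bdd45a sp 00007f13d41fa720 "
    "error 4 in chrome[561292096000+b00d000] likely on CPU 1 (core 1, socket 0)",
    "Chrome_ChildIOT[21854]: segfault at 0 ip 0000563ecaa2445a sp 00007fda20dfa720 "
    "error 4 in chrome[563ec5edd000+b00d000] likely on CPU 0 (core 0, socket 0)",
]

if __name__ == "__main__":
    for line in SAMPLE_LINES:
        info = parse_segfault(line)
        print(info["binary"], hex(info["offset"]), info["access"], hex(info["fault_addr"]))
```

Both sample lines reduce to the same offset, 0x4b4745a, and `error 4` decodes to a user-mode read of an unmapped page at address 0, which points at a null pointer read at one fixed spot in the binary. The offset is relative to the mapping the kernel reports, which is not necessarily the load base of the ELF file, so it is best treated as a fingerprint for comparing crashes rather than a ready-made symbol address.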

@PF4Public
Owner

Have you tried starting with an empty profile? Have you tried googling your issue? A quick search shows me some potentially related issues, but I cannot tell how applicable they are to your case. If all that fails, I would suggest getting the full backtrace from the crash and then analyzing what could cause it.

Since 74 goes into stable, I'll reopen this issue. Maybe someone else also encounters this and / or could help further.

PF4Public reopened this Jan 11, 2023
@baconsalad
Author

I've been building this for a while now and occasionally I have one that won't build or run correctly. If I wait a release or two it will just resolve itself.

As far as googling related issues, what I have found doesn't seem to apply to me.

@perfect7gentleman
Contributor

i got it too.

chromium
ATTENTION: default value of option mesa_glthread overridden by environment.
ATTENTION: default value of option mesa_glthread overridden by environment.
ATTENTION: default value of option mesa_glthread overridden by environment.
MESA-INTEL: warning: Haswell Vulkan support is incomplete
ATTENTION: default value of option mesa_glthread overridden by environment.
[3088:3088:0113/204826.970719:ERROR:gpu_init.cc(523)] Passthrough is not supported, GL is egl, ANGLE is 
ATTENTION: default value of option mesa_glthread overridden by environment.
[3047:3047:0113/204826.990911:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[3047:3047:0113/204827.023210:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
Segmentation fault

@PF4Public
Owner

Guys, I don't have even the slightest idea why this may happen. GDB should help. You might need to rebuild it with debug info for this to work.

@perfect7gentleman
Contributor

Any clues?

@baconsalad
Author

Might as well throw my error on.

[32663:0112/014844.623405:ERROR:angle_platform_impl.cc(43)] Display.cpp:997 (initialize): ANGLE Display::initialize error 12289: Could not create a backing OpenGL context.
ERR: Display.cpp:997 (initialize): ANGLE Display::initialize error 12289: Could not create a backing OpenGL context.
[32663:0112/014844.623457:ERROR:gl_display.cc(508)] EGL Driver message (Critical) eglInitialize: Could not create a backing OpenGL context.
[32663:0112/014844.623474:ERROR:gl_display.cc(920)] eglInitialize OpenGL failed with error EGL_NOT_INITIALIZED, trying next display type
[32663:0112/014844.656255:ERROR:angle_platform_impl.cc(43)] Display.cpp:997 (initialize): ANGLE Display::initialize error 12289: Could not create a backing OpenGL context.
ERR: Display.cpp:997 (initialize): ANGLE Display::initialize error 12289: Could not create a backing OpenGL context.
[32663:0112/014844.656286:ERROR:gl_display.cc(508)] EGL Driver message (Critical) eglInitialize: Could not create a backing OpenGL context.
[32663:0112/014844.656300:ERROR:gl_display.cc(920)] eglInitialize OpenGLES failed with error EGL_NOT_INITIALIZED
[32663:0112/014844.656315:ERROR:gl_ozone_egl.cc(23)] GLDisplayEGL::Initialize failed.
[32663:0112/014844.657012:ERROR:viz_main_impl.cc(186)] Exiting GPU process due to errors during initialization
[32703:0112/014844.661261:ERROR:gpu_init.cc(521)] Passthrough is not supported, GL is disabled, ANGLE is

@thubble

thubble commented Jan 20, 2023

I'm getting what appears to be the same error:

[254408.994239] Chrome_ChildIOT[17425]: segfault at 0 ip 0000556cd65ca73b sp 0000556cc89fa810 error 4 cpu 12 in chrome[556cd1b2c000+ac10000] likely on CPU 12 (core 12, socket 0)
[254408.994244] Code: 01 4c 89 f2 0f b6 f9 48 89 c6 4c 89 f1 e8 fd 66 56 fb 48 ff 44 24 28 48 83 c3 02 4c 39 fb 74 48 bf 28 00 00 00 e8 f5 bf c5 ff <0f> b7 0b 89 48 20 48 8b 7c 24 10 48 85 ff 74 c5 0f 1f 44 00 00 48

(exact same error repeated dozens of times)

Here's the very weird part: I also get the same error when rebuilding 108.0.5359.124_p1 with my current system! But if I restore 108.0.5359.124_p1 using the binary package I fortunately remembered to build, it works fine!!

I did switch to a new machine since the last time I built 108 (old Haswell-E 5930k, now it's a Zen 4 7950x, I just swapped in my SSD). However, I don't think that's the issue because the old build works fine, and building using -march=haswell instead of -march=native results in the same breakage in both 108 and 109. I'm also not seeing any other errors on the new system.

I did almost get it to work by building with libc++ (I needed to use EXTRA_GN="${EXTRA_GN} use_custom_libcxx=true" and hack the ebuild to use bundled libxml and libxslt). This restores my session and loads pages properly, but then the whole thing still crashes randomly after 5-10 seconds.

I've tried downgrading llvm/clang from 15.0.7 to 15.0.6 (the last version I built successfully with). When building with libstdc++, I downgraded gcc to the earlier snapshot I was using for the last successful build. No luck with any of that.

For those who are (and aren't) getting the error - are you using libc++ or libstdc++? On a related note, is it possible to build chromium with libc++ without using it systemwide? I did the hack to use use_custom_libcxx=true but I think that builds with a bundled copy of libc++ rather than the system one (which I don't even have installed).

@perfect7gentleman
Contributor

I use libc++.

@PF4Public
Owner

PF4Public commented Jan 20, 2023

Have you tried disabling all system-* dependencies? Maybe that could help?

PS: I've also taken some changes from Gentoo, namely: new gcc patchset. This might also help.

PF4Public added the help wanted label Jan 20, 2023
@perfect7gentleman
Contributor

I wasn't able to build UGC with use_custom_libcxx=true.

@joecool1029
Contributor

joecool1029 commented Jan 21, 2023

For what it's worth, I've been hitting the same crash with electron 21 and 22. I changed a lot of useflags on my system, so it'll be tough to nail down. Could be recent clang/llvm issues; I'll try dropping all the system-* out next.

Update: building with -system-* didn't help and -system-av1 is broken on electron-22, looks like it was missing something being bundled.

@perfect7gentleman
Contributor

building with -system-* did not help.

@thubble

thubble commented Jan 21, 2023

I'm still getting the same error with 108.0.5359.124_p1. I did try building with full debug info with both libc++ (use_custom_libcxx=true) and libstdc++ - and the crash stack trace was exactly the same for both, and ends in unique_ptr.
Libstdc++:
#0 std::unique_ptr<cc::ResourcePool::PoolResource, std::default_delete<cc::ResourcePool::PoolResource> >::~unique_ptr() (this=0x1cbc012fb580) at /usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/unique_ptr.h:396
Libc++:
#0 std::Cr::unique_ptr<cc::ResourcePool::PoolResource, std::Cr::default_delete<cc::ResourcePool::PoolResource> >::reset[abi:v160000](cc::ResourcePool::PoolResource*) (this=0x1ff00132e6e0, __p=0x0) at ../../buildtools/third_party/libc++/trunk/include/__memory/unique_ptr.h:281

So I don't think the C++ library is the issue.

I also disabled the system-* flags and no change.

Has anyone tried rebuilding 108 with a recent toolchain? I haven't actually tried 109 since my first attempt, since I wanted to make sure I could build a known-working version first.

I'm probably going to try building with gcc next.

@thubble

thubble commented Jan 21, 2023

OK, 108 works fine when built with gcc (and libstdc++, which is what I normally use). I did have to apply this patch to get it to build: https://chromium-review.googlesource.com/c/chromium/src/+/3963839

So obviously something has broken the clang build - and it must be a recent change in Gentoo packages. I'm out of ideas, though. I guess I can just build with gcc for now, although I'd prefer to have thinlto/pgo.

@PF4Public
Owner

> crash stack trace

Full back-trace might give some clues!

Oh, BTW, I see you're using gcc-12! That might be the problem! I'm still with gcc-11.

Others, do you also have gcc-12?

@baconsalad
Author

gcc version 12.2.1 20221231 (Gentoo 12.2.1_p20221231 p8)

@thubble

thubble commented Jan 21, 2023

I'm using gcc-12 as well (latest 12.2.1 snapshot). I'll try installing gcc-11 and rebuilding with clang. The build with gcc-12 actually did work, but maybe there's something in libstdc++ or some other gcc library/header that's breaking things when built with clang.

Here's the backtrace, for the libstdc++ version. The libc++ version is identical except for the different STL header locations.

396               get_deleter()(std::move(__ptr));
(gdb) bt
#0  std::unique_ptr<cc::ResourcePool::PoolResource, std::default_delete<cc::ResourcePool::PoolResource> >::~unique_ptr() (this=0x1cbc012fb580) at /usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/unique_ptr.h:396
#1  base::internal::VectorBuffer<std::unique_ptr<cc::ResourcePool::PoolResource, std::default_delete<cc::ResourcePool::PoolResource> > >::DestructRange<std::unique_ptr<cc::ResourcePool::PoolResource, std::default_delete<cc::ResourcePool::PoolResource> >, 0>(std::unique_ptr<cc::ResourcePool::PoolResource, std::default_delete<cc::ResourcePool::PoolResource> >*, std::unique_ptr<cc::ResourcePool::PoolResource, std::default_delete<cc::ResourcePool::PoolResource> >*) (this=this@entry=0x1cbc01228388, begin=0x1cbc012fb580, end=0x1cbc012fb580) at ../../base/containers/vector_buffer.h:112
#2  0x000055555c7a7748 in base::circular_deque<std::unique_ptr<cc::ResourcePool::PoolResource, std::default_delete<cc::ResourcePool::PoolResource> > >::erase(base::internal::circular_deque_const_iterator<std::unique_ptr<cc::ResourcePool::PoolResource, std::default_delete<cc::ResourcePool::PoolResource> > >, base::internal::circular_deque_const_iterator<std::unique_ptr<cc::ResourcePool::PoolResource, std::default_delete<cc::ResourcePool::PoolResource> > >) (this=0x1cbc01228388, first=..., last=...) at ../../base/containers/vector_buffer.h:85
#3  0x000055555ae97b06 in base::OnceCallback<void (base::FilePath const&, bool)>::Run(base::FilePath const&, bool) && (this=<optimized out>, args=..., args=<optimized out>) at ../../base/functional/callback.h:145
#4  base::internal::FunctorTraits<base::OnceCallback<void (base::FilePath const&, bool)>, void>::Invoke<base::OnceCallback<void (base::FilePath const&, bool)>, base::FilePath, bool>(base::OnceCallback<void (base::FilePath const&, bool)>&&, base::FilePath&&, bool&&)
    (callback=<optimized out>, args=..., args=<optimized out>) at ../../base/functional/bind_internal.h:750
#5  base::internal::InvokeHelper<false, void, 0ul, 1ul>::MakeItSo<base::OnceCallback<void (base::FilePath const&, bool)>, std::tuple<base::FilePath, bool>>(base::OnceCallback<void (base::FilePath const&, bool)>&&, std::tuple<base::FilePath, bool>&&) (functor=<optimized out>, bound=<optimized out>)
    at ../../base/functional/bind_internal.h:826
#6  base::internal::Invoker<base::internal::BindState<base::OnceCallback<void (base::FilePath const&, bool)>, base::FilePath, bool>, void ()>::RunImpl<base::OnceCallback<void (base::FilePath const&, bool)>, std::tuple<base::FilePath, bool>, 0ul, 1ul>(base::OnceCallback<void (base::FilePath const&, bool)>&&, std::tuple<base::FilePath, bool>&&, std::integer_sequence<unsigned long, 0ul, 1ul>) (functor=<optimized out>, bound=<optimized out>, seq=...) at ../../base/functional/bind_internal.h:920
#7  base::internal::Invoker<base::internal::BindState<base::OnceCallback<void (base::FilePath const&, bool)>, base::FilePath, bool>, void ()>::RunOnce(base::internal::BindStateBase*) (base=<optimized out>) at ../../base/functional/bind_internal.h:871
#8  0x000055555c719df2 in base::OnceCallback<void ()>::Run() && (this=0x1cbc01251880) at ../../base/functional/callback.h:145
#9  viz::ClientResourceProvider::ReceiveReturnsFromParent(std::vector<viz::ReturnedResource, std::allocator<viz::ReturnedResource> >)Python Exception <class 'gdb.error'>: value has been optimized out
 (this=0x1cbc0063a260, resources=) at ../../components/viz/client/client_resource_provider.cc:283
#10 0x000055555c7e1171 in cc::LayerTreeHostImpl::ReclaimResources(std::vector<viz::ReturnedResource, std::allocator<viz::ReturnedResource> >)Python Exception <class 'gdb.error'>: value has been optimized out
 (this=0x1cbc0063a000, resources=) at ../../cc/trees/layer_tree_host_impl.cc:2181
#11 0x000055555cdc80aa in non-virtual thunk to cc::mojo_embedder::AsyncLayerTreeFrameSink::ReclaimResources(std::vector<viz::ReturnedResource, std::allocator<viz::ReturnedResource> >) () at ../../cc/mojo_embedder/async_layer_tree_frame_sink.cc:293
#12 0x0000555558593137 in viz::mojom::CompositorFrameSinkClientStubDispatch::Accept(viz::mojom::CompositorFrameSinkClient*, mojo::Message*) (impl=0x1cbc011a8ed8, message=0x7fffffffc120) at gen/services/viz/public/mojom/compositing/compositor_frame_sink.mojom.cc:1758
#13 0x000055555b6447cf in mojo::InterfaceEndpointClient::HandleValidatedMessage(mojo::Message*) (this=0x1cbc011a2a80, message=0x7fffffffc120) at ../../mojo/public/cpp/bindings/lib/interface_endpoint_client.cc:994
#14 0x000055555b649f0b in mojo::MessageDispatcher::Accept(mojo::Message*) (this=0x1cbc011a2b78, message=0x7fffffffc120) at ../../mojo/public/cpp/bindings/lib/message_dispatcher.cc:43
#15 0x000055555b645c8a in mojo::InterfaceEndpointClient::HandleIncomingMessage(mojo::Message*) (this=<optimized out>, message=0x1cbc012fb580) at ../../mojo/public/cpp/bindings/lib/interface_endpoint_client.cc:693
#16 0x000055555b64d918 in mojo::internal::MultiplexRouter::ProcessIncomingMessage(mojo::internal::MultiplexRouter::MessageWrapper*, mojo::internal::MultiplexRouter::ClientCallBehavior, base::SequencedTaskRunner*)
     (this=this@entry=0x1cbc01192400, message_wrapper=message_wrapper@entry=0x7fffffffc220, client_call_behavior=client_call_behavior@entry=mojo::internal::MultiplexRouter::ALLOW_DIRECT_CLIENT_CALLS, current_task_runner=0x1cbc00220fc0) at ../../mojo/public/cpp/bindings/lib/multiplex_router.cc:1102
#17 0x000055555b64d27f in mojo::internal::MultiplexRouter::Accept(mojo::Message*) (this=0x1cbc01192400, message=0x7fffffffc480) at ../../mojo/public/cpp/bindings/lib/multiplex_router.cc:716
#18 0x000055555b649f0b in mojo::MessageDispatcher::Accept(mojo::Message*) (this=0x1cbc01192430, message=0x7fffffffc480) at ../../mojo/public/cpp/bindings/lib/message_dispatcher.cc:43
#19 0x000055555b642655 in mojo::Connector::DispatchMessage(mojo::ScopedHandleBase<mojo::MessageHandle>) (this=this@entry=0x1cbc01192460, handle=...) at ../../mojo/public/cpp/bindings/lib/connector.cc:561
#20 0x000055555b642fd4 in mojo::Connector::ReadAllAvailableMessages() (this=0x1cbc01192460) at ../../mojo/public/cpp/bindings/lib/connector.cc:618
#21 0x0000555558384804 in base::RepeatingCallback<void (int, int)>::Run(int, int) const & (this=<optimized out>, args=19903872, args=19903872) at ../../base/functional/callback.h:267
#22 0x000055555b65e503 in base::RepeatingCallback<void (unsigned int, mojo::HandleSignalsState const&)>::Run(unsigned int, mojo::HandleSignalsState const&) const & (this=0x7fffffffc6f0, args=0, args=...) at ../../base/functional/callback.h:267
#23 mojo::SimpleWatcher::OnHandleReady(int, unsigned int, mojo::HandleSignalsState const&) (this=0x1cbc00bb5e00, watch_id=<optimized out>, result=0, state=...) at ../../mojo/public/cpp/system/simple_watcher.cc:278
#24 0x000055555b2d10f0 in base::OnceCallback<void ()>::Run() && (this=0x1cbc00249e00) at ../../base/functional/callback.h:145
#25 base::TaskAnnotator::RunTaskImpl(base::PendingTask&) (this=<optimized out>, pending_task=...) at ../../base/task/common/task_annotator.cc:133
#26 0x000055555b2e771f in base::TaskAnnotator::RunTask<base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWorkImpl(base::LazyNow*)::$_0>(perfetto::StaticString, base::PendingTask&, base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWorkImpl(base::LazyNow*)::$_0&&) (this=0x1cbc002f86d0, pending_task=..., event_name=..., args=<optimized out>) at ../../base/task/common/task_annotator.h:72
#27 base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWorkImpl(base::LazyNow*) (this=this@entry=0x1cbc002f8500, continuation_lazy_now=continuation_lazy_now@entry=0x7fffffffca40) at ../../base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:441
#28 0x000055555b2e7175 in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWork() (this=0x1cbc002f8500) at ../../base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:297
#29 0x000055555b2e7e88 in non-virtual thunk to base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWork() () at ../../third_party/abseil-cpp/absl/types/optional.h:483
#30 0x000055555b28d91b in base::MessagePumpGlib::HandleDispatch() (this=0x1cbc00258a80) at ../../base/message_loop/message_pump_glib.cc:374
#31 base::(anonymous namespace)::WorkSourceDispatch(_GSource*, int (*)(void*), void*) (source=<optimized out>, unused_func=<optimized out>, unused_data=<optimized out>) at ../../base/message_loop/message_pump_glib.cc:127
#32 0x00005555553ee55d in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0
#33 0x0000555555479b04 in  () at /usr/lib64/libglib-2.0.so.0
#34 0x00005555553ea4ca in g_main_context_iteration () at /usr/lib64/libglib-2.0.so.0
#35 0x000055555b28d6e9 in base::MessagePumpGlib::Run(base::MessagePump::Delegate*) (this=0x1cbc00258a80, delegate=<optimized out>) at ../../base/message_loop/message_pump_glib.cc:400
#36 0x000055555b2e8197 in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::Run(bool, base::TimeDelta) (this=0x1cbc002f8500, application_tasks_allowed=true, timeout=...) at ../../base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:600
#37 0x000055555b2b0d25 in base::RunLoop::Run(base::Location const&) (this=0x1cbc00e0fc00, location=...) at ../../base/run_loop.cc:141
#38 0x0000555559697f7a in content::BrowserMainLoop::RunMainMessageLoop() (this=<optimized out>) at ../../content/browser/browser_main_loop.cc:1048
#39 0x0000555559699c45 in content::BrowserMainRunnerImpl::Run() (this=0x1cbc00358780) at ../../content/browser/browser_main_runner_impl.cc:162
#40 0x0000555559695886 in content::BrowserMain(content::MainFunctionParams) (parameters=...) at ../../content/browser/browser_main.cc:30
#41 0x000055555adf13f0 in content::RunBrowserProcessMain(content::MainFunctionParams, content::ContentMainDelegate*) (main_function_params=..., delegate=0x7fffffffd680) at ../../content/app/content_main_runner_impl.cc:712
#42 0x000055555adf2680 in content::ContentMainRunnerImpl::RunBrowser(content::MainFunctionParams, bool) (this=this@entry=0x1cbc00264180, main_params=..., start_minimal_browser=false) at ../../content/app/content_main_runner_impl.cc:1253
#43 0x000055555adf247f in content::ContentMainRunnerImpl::Run() (this=0x1cbc00264180) at ../../content/app/content_main_runner_impl.cc:1108
#44 0x000055555adef4b8 in content::RunContentProcess(content::ContentMainParams, content::ContentMainRunner*) (params=..., content_main_runner=0x1cbc00264180) at ../../content/app/content_main.cc:342
#45 0x000055555adefc57 in content::ContentMain(content::ContentMainParams) (params=...) at ../../content/app/content_main.cc:370
#46 0x0000555557606236 in ChromeMain(int, char const**) (argc=<optimized out>, argv=0x7fffffffd888) at ../../chrome/app/chrome_main.cc:175
#47 0x0000555551a2c2b7 in __libc_start_call_main (main=main@entry=0x555557606120 <main(int, char const**)>, argc=argc@entry=3, argv=argv@entry=0x7fffffffd888) at ../sysdeps/nptl/libc_start_call_main.h:58
#48 0x0000555551a2c375 in __libc_start_main_impl (main=0x555557606120 <main(int, char const**)>, argc=3, argv=0x7fffffffd888, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffd878) at ../csu/libc-start.c:381
#49 0x0000555557606021 in _start ()

@PF4Public
Owner

I'm leaning towards some weird incompatibility between any of: 1) libstdc++ from gcc-12; 2) chromium internals; 3) clang's libc++, all of which might be in play here. Also, I recall this happening in the past at least once before…

Looking forward to @thubble's rebuild with gcc-11.

@thubble

thubble commented Jan 22, 2023

gcc-11 made no difference. I'm out of ideas :(

@arbitrary-dev

Hey, using debug flag causes the build to fail

Share the stacktrace you get in terminal.

@preed

preed commented Feb 2, 2023

I, too, was experiencing this problem; I worked around it thusly: https://bugs.gentoo.org/892537#c3

@thubble

thubble commented Feb 2, 2023

> I, too, was experiencing this problem; I worked around it thusly: https://bugs.gentoo.org/892537#c3

Looks like this was my issue as well. I commented out -fstack-clash-protection in /etc/clang/gentoo-hardened.cfg and 109.0.5414.119 is now working fine, with no extra workarounds and PartitionAlloc re-enabled.

Excellent find, thanks!
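
For anyone wanting to check whether their own clang configuration still passes the flag, here is a rough sketch. It assumes clang's config-file rules (options separated by whitespace, a line whose first non-blank character is `#` is a comment) and the `/etc/clang` location mentioned in this thread; the helper names are hypothetical and the sample text in the usage note is invented, not the real contents of gentoo-hardened.cfg:

```python
from pathlib import Path

FLAG = "-fstack-clash-protection"


def active_flag_lines(text, flag=FLAG):
    """Return (line_number, line) pairs where flag appears and is not commented out."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Blank lines and whole-line comments carry no options.
        if not stripped or stripped.startswith("#"):
            continue
        if flag in stripped.split():
            hits.append((number, stripped))
    return hits


def scan_config_dir(directory="/etc/clang"):
    """Map each *.cfg file in directory to its active occurrences of FLAG."""
    results = {}
    for path in sorted(Path(directory).glob("*.cfg")):
        hits = active_flag_lines(path.read_text())
        if hits:
            results[str(path)] = hits
    return results


if __name__ == "__main__":
    found = scan_config_dir()
    if not found:
        print(f"{FLAG} is not active in any /etc/clang/*.cfg file")
    for path, hits in found.items():
        for number, line in hits:
            print(f"{path}:{number}: {line}")
```

Calling `active_flag_lines("# hardening\n-fstack-clash-protection\n# -fstack-clash-protection\n")` reports only line 2, since the third line is commented out. The scan does not follow `@file` includes between config files; it only looks at each `*.cfg` file on its own.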

@fordfrog

fordfrog commented Feb 3, 2023

i have -fstack-clash-protection enabled in /etc/clang/gentoo-hardened.cfg and i don't have the issue. my chromium is configured to use proxy. here are my use flags:
[ebuild R ] www-client/ungoogled-chromium-110.0.5481.77_p1::pf4public USE="X clang convert-dict cups hevc js-type-check official optimize-thinlto optimize-webui pgo proprietary-codecs pulseaudio qt5 system-av1 system-ffmpeg system-harfbuzz system-icu system-jsoncpp system-libevent system-libusb system-openh264 system-openjpeg system-png system-re2 system-snappy thinlto vaapi vdpau widevine -cfi -custom-cflags -debug -enable-driver -gtk4 -hangouts -headless -kerberos -pic -screencast (-selinux) -suid -system-libvpx -wayland" L10N="cs -af -am -ar -bg -bn -ca -da -de -el -en-GB -es -es-419 -et -fa -fi -fil -fr -gu -he -hi -hr -hu -id -it -ja -kn -ko -lt -lv -ml -mr -ms -nb -nl -pl -pt-BR -pt-PT -ro -ru -sk -sl -sr -sv -sw -ta -te -th -tr -uk -ur -vi -zh-CN -zh-TW" 0 KiB

@PF4Public
Owner

> www-client/ungoogled-chromium-110.0.5481.77

Does it work? Mine crashes just after start :)

@fordfrog

fordfrog commented Feb 3, 2023

it works for me fine. but i'm not sure why. maybe the code is not triggered as i use proxy, or i evaded the buggy code by using more system libs instead of the bundled ones.

@PF4Public
Owner

> it works for me fine

I'm asking because 110 has yet another issue with libstdc++. But if ebuild works for you, this should mean that one of my local patches is to blame for me.

@fordfrog

fordfrog commented Feb 3, 2023

> > it works for me fine
>
> I'm asking because 110 has yet another issue with libstdc++. But if ebuild works for you, this should mean that one of my local patches is to blame for me.

yeah, it compiles and works fine for me, at least as fine as the version before. no custom patches applied here.

@baconsalad
Author

Removing -fstack-clash-protection fixed it for me with 110.0.5481.77_p1

Sneaky gentoo.

@perfect7gentleman
Contributor

My way.
1 - removed @gentoo-hardened.cfg line in gentoo-common.cfg
2 - rebuilt llvm toolchain
3 - rebuilt nodejs
4 - ...
5 - profit.
Version 110.0.5481.77 (Official Build, ungoogled-chromium) (64-bit)

@thesamesam

thesamesam commented Feb 4, 2023

> My way.
> 1 - removed @gentoo-hardened.cfg line in gentoo-common.cfg
> 2 - rebuilt llvm toolchain
> 3 - rebuilt nodejs
> 4 - ...
> 5 - profit.
> Version 110.0.5481.77 (Official Build, ungoogled-chromium) (64-bit)

This is essentially the same as saying that using GCC "fixes" the problem. The problematic line has already been identified above.

@PF4Public
Owner

> I commented out -fstack-clash-protection

I wonder why my systems never got that flag in hardened.cfg?

@thubble

thubble commented Feb 4, 2023

> > I commented out -fstack-clash-protection
>
> I wonder why my systems never got that flag in hardened.cfg?

Do you recall if you ever did etc-update merging with that file? (It's part of sys-devel/clang-common and is config-protected).

@PF4Public
Owner

> Do you recall if you ever did etc-update merging with that file?

Nope. Maybe I just went long enough without updating clang that this never happened to me :)

@joecool1029
Contributor

Are we tracking multiple issues here? The removal of -fstack-clash-protection seems to fix a different chromium crash that happens later than initial start. Maybe since electron is building with chromium 108 it doesn't help.

I still see the immediate crash/segfault on electron even after commenting that flag and rebuilding clang/llvm and then electron. It is not possible to build electron with gcc (at least not with gcc12).

@thesamesam

thesamesam commented Feb 5, 2023

I think that's possible and it'd explain the inconsistent results, backtraces, and observations in the thread. It's probably worth chalking this bug up to -fstack-clash-protection and forking that other issue into a new one.

@perfect7gentleman
Contributor

@joecool1029, try to rebuild nodejs too.

@joecool1029
Contributor

Already did as suggested, no change. Pretty sure we're looking at chromium fixing part of the problem in later versions combined with also needing this flag removed.

@thubble

thubble commented Feb 5, 2023

I re-built qtwebengine (latest 5.15 git, Chromium 87-based with security backports) and re-enabled my clang/thinlto and -fomit-frame-pointer hacks, and it's working perfectly. So I'm convinced that the Gentoo clang-hardening changes were the source of all of my issues.

It was really frustrating that this happened at exactly the time I upgraded to a completely new architecture, and was also experimenting with overclocking/undervolting - so thanks everyone for the help getting this sorted out!

I don't build Electron from source, and I always build nodejs with gcc, so unfortunately I'm not sure what the problems are there.

@PF4Public
Owner

@joecool1029 Do you use chromium on your system? Does it compile and work? Do you remember whether you built chromium 108, and if so, did it work? If electron crashes for you, so should chromium.

@joecool1029
Contributor

No, I have edge as a backup but I use firefox as my browser. I think next attempt I'll drop all the system useflags in case dependencies were built with bad clang flags. Electron forces some of the GN flags so I can't test as many configs with it as plain chromium (like changing allocator).

PF4Public removed the help wanted label Feb 7, 2023
arbitrary-dev added a commit to arbitrary-dev/configs that referenced this issue Feb 10, 2023
@Techwolf

I am getting the same thing here. Currently doing a full debug build, which will take a few hours. Will post results here later. Due to using packages, I have a local binary fallback of the previous version to use when I need to. I have mostly switched over to librewolf now.

@Techwolf

An update.

Building a debug build with clang took 6+ hours. Sadly, I cannot load the debug build in gdb because it requires over 32G of RAM.

Building a debug build with gcc also required over 32G of RAM. I have only 32G of RAM.

Building a normal build with USE="-clang" took 3+ hours and fixed the crashing problem for me.

@thubble

thubble commented Feb 18, 2023

> I am getting the same thing here. Currently doing a full debug build, which will take a few hours. Will post results here later. Due to using packages, I have a local binary fallback of the previous version to use when I need to. I have mostly switched over to librewolf now.

@Techwolf Can you post your /etc/clang/*.cfg (most importantly, /etc/clang/gentoo-hardened.cfg)? Having -fstack-clash-protection anywhere in there seems to be what caused this issue.

@PF4Public
Owner

I'll close this issue since the culprit was identified.

@joecool1029 feel free to open a separate issue regarding electron if you're willing to investigate it further.

@joecool1029
Contributor

Sounds good, I'll start to look into the electron issue some more this week.
