Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

px_proxy_factory_get_proxies not thread-safe #68

Closed
thiagomacieira opened this issue Jul 20, 2017 · 7 comments · Fixed by #128
Closed

px_proxy_factory_get_proxies not thread-safe #68

thiagomacieira opened this issue Jul 20, 2017 · 7 comments · Fixed by #128

Comments

@thiagomacieira
Copy link

The documentation says:

This method always blocks and may be called in a separate thread (is thread-safe)

But it isn't thread-safe. webkitgtk plugin can only be used from one thread, ever, since javascriptcoregtk will find a running GMainLoop and attach timers to it. If the PAC runner is then later run on another thread, bad things will happen.

I recommend that that plugin start a thread of its own and communicate with it.

Backtrace:

 #0  0x00007f745f0ac1d8 in JSC::HeapTimer::timerDidFire() () at /usr/lib64/libjavascriptcoregtk-4.0.so.18
 #1  0x00007f745f0ac287 in  () at /usr/lib64/libjavascriptcoregtk-4.0.so.18
 #2  0x00007f748e5ae9c5 in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0
 #3  0x00007f748e5aed88 in  () at /usr/lib64/libglib-2.0.so.0
 #4  0x00007f748e5aee1c in g_main_context_iteration () at /usr/lib64/libglib-2.0.so.0
 #5  0x00007f7494f4268f in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Core.so.5
 #6  0x00007f7494eeb35a in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Core.so.5
 #7  0x00007f7494d1b31a in QThread::exec() () at /usr/lib64/libQt5Core.so.5
 #8  0x00007f7494d1fd2e in  () at /usr/lib64/libQt5Core.so.5
 #9  0x00007f74913174e7 in start_thread () at /lib64/libpthread.so.0
@DimStar77
Copy link
Contributor

Time to revive the dbusify branch...

@mcatanzaro
Copy link
Contributor

mcatanzaro commented Apr 16, 2018

Oh wow, I'm seeing a very different variant of this crash:

https://retrace.fedoraproject.org/faf/reports/1728624/

Note the nearly 27,000 crash reports... that's an order of magnitude more than the worst I've seen reported in Fedora before. Wow. OK. Anyway, the problem is caused by glib-networking's libproxy-based GProxyResolver calling px_proxy_factory_get_proxies() in a thread. But here we are using the default mozjs38 backend. It just crashes in a similar way because both JS libraries have similar threading requirements.

So this crash can be triggered by any application using GSockets. (Including WebKitGTK+ itself, even when libproxy is using the mozjs backend. Nice cyclic dependency there.)

GNOME users almost never notice this because glib-networking goes to pains to ensure that libproxy is only used when running outside of GNOME. The exception is if proxy autoconfig is enabled. So I guess almost everyone hitting this crash will be people using GNOME apps in KDE or other desktops.

OK, tangent time:

Time to revive the dbusify branch...

No kidding.

On a related note, I'm seriously thinking that glib-networking needs to move its libproxy use beyond another D-Bus boundary. E.g. consider #71, now any gjs application is going to load two incompatible versions of mozjs if glib-networking dlopens libproxy (i.e. when running outside GNOME). I can't imagine that'd be good. Also consider https://blogs.gnome.org/mcatanzaro/2018/02/28/on-setenv-and-explosions/ where my inability to figure out how to safely configure libproxy is the motivation for the blog post. (That theoretical crasher is still unfixed, btw.) Of course, this is all at the glib-networking level. So let's say we create a glib-libproxy-helper binary that would be spoken to via D-Bus, and libproxy creates its own libproxy-javascript-helper binary that it would itself speak to via D-Bus... I guess the end result IPC would be:

application -> (D-Bus) -> glib-libproxy-helper -> (D-Bus) -> libproxy-javascript-helper

A wonderful special case of that, when using libproxy's JSC backend, would be:

WebKitGTK+ -> (D-Bus) -> glib-libproxy-helper -> (D-Bus) -> libproxy-javascript-helper (WebKitGTK+)

If that's not wild enough for you, consider the case where we don't create either of those helper processes -- the status quo -- where the application is using a newer or older version of the WebKitGTK+ API. That will result in a sure explosion due to the symbol conflicts, similar to the current case where gjs uses a different version of mozjs than libproxy.

We've really got to think this through and stop dlopening each other unexpectedly. The root of the problem is that libproxy isn't prepared at all for the way it's used by GLib. It's not prepared at all to be dlopened by arbitrary applications, applications that don't know anything about libproxy, applications whose maintainers use GNOME and such don't realize that their applications dlopen libproxy when running outside of GNOME.

To fix all the problems listed above, we'd need two new helper binaries at the glib-networking level (one for glib-pacrunner to spawn to run setenv(), one for GLibproxyResolver to spawn to use libproxy), in addition to the one that's already there (glib-pacrunner), plus one at the libproxy level (for using JavaScript). Spare me.

@mcatanzaro
Copy link
Contributor

To fix all the problems listed above, we'd need two new helper binaries at the glib-networking level (one for glib-pacrunner to spawn to run setenv()

Of course, that one could be avoidable by adding some new libproxy API to tell it to ignore all other configuration and just do proxy autoconfig.

@mcatanzaro
Copy link
Contributor

dbusify would avoid #76 as well

@mcatanzaro
Copy link
Contributor

BTW this issue should be renamed to "px_proxy_factory_get_proxies not thread-safe"

@DimStar77 DimStar77 changed the title px_proxy_factory_get_proxies not thread-safe when using webkitgtk px_proxy_factory_get_proxies not thread-safe Apr 16, 2018
@mcatanzaro
Copy link
Contributor

To fix all the problems listed above, we'd need two new helper binaries at the glib-networking level (one for glib-pacrunner to spawn to run setenv(), one for GLibproxyResolver to spawn to use libproxy), in addition to the one that's already there (glib-pacrunner), plus one at the libproxy level (for using JavaScript). Spare me.

Nope, we only need the libproxy one. All the problems glib-networking has with libproxy go away if libproxy moves the JavaScript out of process.

However, libproxy would need to add new API to handle proxy autoconfig, since glib-pacrunner would no longer be possible (because it works currently by setting and unsetting environment variables, but the libproxy service would not see any of those changes).

@mcatanzaro
Copy link
Contributor

It's reported fixed by this patch:

diff -up libproxy-0.4.11/libproxy/extension_pacrunner.cpp.crash libproxy-0.4.11/libproxy/extension_pacrunner.cpp
--- libproxy-0.4.11/libproxy/extension_pacrunner.cpp.crash	2010-07-29 08:14:59.000000000 -0400
+++ libproxy-0.4.11/libproxy/extension_pacrunner.cpp	2013-11-11 15:23:56.987266457 -0500
@@ -22,20 +22,10 @@ using namespace libproxy;
 
 pacrunner::pacrunner(string, const url&) {}
 
-pacrunner_extension::pacrunner_extension() {
-	this->pr = NULL;
-}
+pacrunner_extension::pacrunner_extension() {}
 
-pacrunner_extension::~pacrunner_extension() {
-	if (this->pr) delete this->pr;
-}
+pacrunner_extension::~pacrunner_extension() {}
 
 pacrunner* pacrunner_extension::get(string pac, const url& pacurl) throw (bad_alloc) {
-	if (this->pr) {
-		if (this->last == pac)
-			return this->pr;
-		delete this->pr;
-	}
-
-	return this->pr = this->create(pac, pacurl);
+	return this->create(pac, pacurl);
 }
diff -up libproxy-0.4.11/libproxy/proxy.cpp.crash libproxy-0.4.11/libproxy/proxy.cpp
--- libproxy-0.4.11/libproxy/proxy.cpp.crash	2013-11-11 15:25:27.309271353 -0500
+++ libproxy-0.4.11/libproxy/proxy.cpp	2013-11-11 15:25:31.569271584 -0500
@@ -416,7 +416,9 @@ void proxy_factory::run_pac(url &realurl
 
 		/* Run the PAC, but only try one PACRunner */
 		if (debug) cerr << "Using pacrunner: " << typeid(*pacrunners[0]).name() << endl;
-		string pacresp = pacrunners[0]->get(this->pac, this->pacurl->to_string())->run(realurl);
+		pacrunner* runner = pacrunners[0]->get(this->pac, this->pacurl->to_string());
+		string pacresp = runner->run(realurl);
+		delete runner;
 		if (debug) cerr << "Pacrunner returned: " << pacresp << endl;
 		format_pac_response(pacresp, response);
 	}

which Fedora used for a while in the past. See:

https://bugzilla.redhat.com/show_bug.cgi?id=1459779
https://bugzilla.gnome.org/show_bug.cgi?id=795287

DimStar77 added a commit that referenced this issue Nov 10, 2020
Remove nonfunctional and crashy pacrunner caching
janbrummer added a commit to janbrummer/libproxy that referenced this issue Mar 27, 2023
In order to reduce the number of environment manipulation for testing
purpose, introduce a new config-option property to set config files for
tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants