Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xdg-open not working properly #7

Closed
89luca89 opened this issue Jul 15, 2022 · 22 comments
Closed

xdg-open not working properly #7

89luca89 opened this issue Jul 15, 2022 · 22 comments
Labels
bug Something isn't working upstream

Comments

@89luca89
Copy link
Contributor

Doing something like:

host-spawn xdg-open https://google.com

should open host's browser pointing at that page.
Right now, flatpak-spawn works, while host-spawn does not. This seems not related to anything in the env, right now doing

vimdiff <(host-spawn env | sort -u) <(flatpak-spawn --host env | sort -u)

shows no difference in the environment

@1player
Copy link
Owner

1player commented Jul 15, 2022

Yep, I can reproduce it as well. Not sure what's going on, it's worth checking with strace what's the difference between the two invocations.

@89luca89
Copy link
Contributor Author

Attaching them now, I'll try to check when I have some time 😄

strace-flatpak-spawn.log
strace-host-spawn.log

@1player
Copy link
Owner

1player commented Jul 15, 2022

I think you probably did strace host-spawn xdg-open, while it would help to have the output of host-spawn strace xdg-open instead, to see what's going on on the host side of the invocation.

I'll check what's going on as well.

EDIT: nevermind, xdg-open is a bash script. No need for any strace.

@1player
Copy link
Owner

1player commented Jul 15, 2022

OK, I've trimmed this down to gio behaving differently.

host-spawn gio open http://google.com doesn't do anything, while it works with flatpak-spawn --host

@89luca89
Copy link
Contributor Author

Interesting, might be worth comparing the strace for both then

@1player
Copy link
Owner

1player commented Jul 15, 2022

Here's what I've found so far: gio open URL doesn't work with host-spawn because there's a pty attached. gio open FILE works, and the URL invocation works if I disable the pty but just pass stdin, stdout and stderr as-is to the process, like flatpak-spawn --host does.

A broken invocation shows this line on the host journal:

Started app-glib-org.mozilla.firefox-28359.scope - Application launched by gio open.

while a working invocation shows:

Started app-glib-org.mozilla.firefox-28359.scope - Application launched by gio open.
Started app-flatpak-org.mozilla.firefox-28359.scope.

so in fact Firefox gets spawned in all cases, but the second scope doesn't get started, for some reason. I am doubtful this is a host-spawn bug itself, or rather if it is a gio, a firefox or a flatpak bug.

I struggle to understand why attaching a pty would cause a problem with that program only.

@1player
Copy link
Owner

1player commented Jul 15, 2022

@89luca89 is your browser installed as a Flatpak? Do you have the same problem if you configure your DE to default to another/ a non-Flatpak browser?

host-spawn gio open FILE works and spawns gnome-text-editor on my host, which is a flatpak. So I don't know if it's browser related or what.

@89luca89
Copy link
Contributor Author

89luca89 commented Jul 15, 2022

Yep all flatpaks, but if that works with gedit (which is flatpak) it shouldn't be flatpak related
It works with a native app (nautilus - host-spawn gio open .) and a flatpak app (gedit - host-spawn gio open README.md)

EDIT: Interestingly it does not work for videos or images either

@89luca89
Copy link
Contributor Author

89luca89 commented Jul 15, 2022

Indeed, removing pty attachment, makes it work
Might be worth considering a --pty flag to make it optional?

Better yet, it's the fds setup in line 76 that is causing this issue
EDIT: specifically, it's the STDIN, setting it to host's stdin, breaks a possible shell, but makes gio work

	fds := map[uint32]dbus.UnixFD{
		0: dbus.UnixFD(os.Stdin.Fd()),
		1: dbus.UnixFD(os.Stdout.Fd()),
		2: dbus.UnixFD(os.Stderr.Fd()),
	}

with this gio/xdg-open works, but spawning a shell will not work

@1player
Copy link
Owner

1player commented Jul 15, 2022

I keep thinking this is an upstream issue. host-spawn works fine with host-spawn strace -ff gio open http://combo.cc, which makes me think this is a timing issue, as strace following every single fork() would slow down the gio process considerably.

There is indeed a long standing bug about GLib/DBus dropping messages if the sender terminates too quickly. In this case the sender is gio, which spawns your browser through DBus, but terminates very quickly so the spawning doesn't complete. Why this happens when we pass a pty, is anyone's guess and I'm really inexperienced with both dbus and glib to know exactly what's going on.

These might be relevant:

In fact I tried a small hack by keeping gio open around instead of immediately returning (by literally duplicating this line https://github.com/GNOME/glib/blob/28fd2e4e67513fbbc45395ca717ae4d6dab268b6/gio/gio-tool-open.c#L133). In this case host-spawn works as intended.

We need someone that knows dbus and/or glib pretty well to figure out what's going on, but as those issues mentioned above point out, there are corner cases in which the sender of a dbus message quits early and everything breaks.

@1player
Copy link
Owner

1player commented Jul 15, 2022

Opened an upstream issue: https://gitlab.gnome.org/GNOME/glib/-/issues/2695

@1player
Copy link
Owner

1player commented Jul 15, 2022

Might be worth considering a --pty flag to make it optional?

I would add a --no-pty optional flag instead. This seems to be a problem strictly with gio, and unless we can reproduce with another non-GLib tool, I think defaulting to creating a pty for the child process to be much more beneficial.

Still, I hate the idea of requiring one to pass --no-pty only for this corner case to work. Let's sleep on this for a while to see if we can track down exactly what's going on.

EDIT: more test cases:

  • host-spawn sh -c 'gio open http://google.com' doesn't work
  • host-spawn sh -c 'gio open http://google.com; sleep 1' works

Ugh.

@89luca89
Copy link
Contributor Author

Yea as a workaround I was trying adding some "sleep" to make it wait for gio:

adding at line 36:

	command := strings.Join(args[:], " ") + "; sleep 0.5"
	args = append([]string{"sh", "-c"}, command)

makes it work as expected, but it's a really ugly workaround, probably it's better a --no-pty flag for now

89luca89 added a commit to 89luca89/host-spawn that referenced this issue Jul 15, 2022
This works around some problems with GIO, as seen in issue 1player#7

Specifying --no-pty will make `host-spawn` behave like `flatpak-spawn --host`.

Signed-off-by: Luca Di Maio <luca.dimaio1@gmail.com>
@89luca89
Copy link
Contributor Author

@1player for now I've opened a PR to add the --no-pty flag, at least as an host-spawn user I have a way to work around it 😄

@1player
Copy link
Owner

1player commented Jul 15, 2022

More debugging log: the reason it fails with host-spawn and its pty is because somewhere down the line after gio has spawned a browser, something closes the pty file descriptor and a sh process terminated with SIGHUP. Of course doesn't happen with flatpak-spawn because SIGHUP is sent iff a tty is closed.

How to reproduce:

  1. On the host attach a gdb instance to your flatpak-session-helper process: gdb -p $(pgrep -f flatpak-session-helper)
  2. gdb: set follow-fork-mode child and then continue
  3. Run host-spawn on another shell

With host-spawn you'll get this output:

[Attaching after Thread 0x7f8cb3323c40 (LWP 3380) fork to child process 19442]
[New inferior 2 (process 19442)]
[Detaching after fork from parent process 3380]
[Inferior 1 (process 3380) detached]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
process 19442 is executing new program: /usr/bin/gio
[...cut a dozen lines...]
process 19455 is executing new program: /usr/bin/bash

Thread 4.1 "sh" received signal SIGHUP, Hangup.
[Switching to process 19455]

Whereas with flatpak-spawn --host, since they don't pass a pty, the SIGHUP will never happen. The question now is figuring out who and why is our pty fd closed as that stage, causing that SIGHUP to be fired.

@89luca89
Copy link
Contributor Author

89luca89 commented Jul 15, 2022

I think there might be some problems with the pty creation/initialization then
because If I move it outside of the runCommandAsync function, like this in the main:

	pty, err := createPty()
	if err != nil {
		log.Fatalln(err)
	}
	pty.Start()
	defer pty.Terminate()

	_, exitCode, exited := runCommandSync(os.Args[1:], pty)

it completely scrambles the host's terminal

@1player
Copy link
Owner

1player commented Jul 15, 2022

No, the pty creation is fine. The scrambling is due to stdin set to RAW mode in pty.Start(), which is in fact required to some host programs to work correctly.

Alright I've spent most of my day on this and I conclude that it's not a host-spawn bug yet I don't even know if it's possible to fix.

What is going on

  1. The host command gio/xdg-open gets spawned, with its stdin/stdout and stderr attached to a pty. gio becomes controlling process of that pty.
  2. gio/xdg-open returns immediately, but behind the scenes it's still spawning processes, still attached to those pty file descriptors.
  3. Inside the container we have no way of knowing this, we get the HostCommandExited signal and we terminate.
  4. As we terminate, all our fds get closed, including the pty master fd. This causes the kernel to send a SIGHUP to whoever is the controlling process of that pty (should be gio/xdg-open or one of its child processes, but I'm not sure at this point)
  5. SIGHUP is not handled nor ignored by this controlling process, so the process is killed.

Why flatpak-spawn --host is fine

They do not allocate a pty, so when HostCommandExited is sent prematurely, they too close their fds, but as they do not allocate a pty, the host side couldn't become the controlling process, so no SIGHUP is sent by the kernel.

Who exactly is at fault

I suspect the fault is of gio, that terminates before it has actually spawned the web browser process. We have no way of dealing with this from inside the container. The only way we can mitigate it is with that --no-pty hack.

Additional reading: http://www.linusakesson.net/programming/tty/index.php


I will merge your PR, but leave this open because there is a bug somewhere, but it's not on this side.

@1player 1player added upstream bug Something isn't working labels Jul 15, 2022
@89luca89
Copy link
Contributor Author

Fine for me, having a workaround for now, is also good in optic of upstream fix (that takes quite a while to come downstream)

@1player
Copy link
Owner

1player commented Jul 20, 2022

@89luca89 this just dawned on me, but why are you running xdg-open through host-spawn? It runs fine as-is inside distrobox, since the container shares the same DBus session as the host machine.

So, instead of doing host-spawn xdg-open http://google.com, you can just do xdg-open http://google.com.

For this reason I'm closing this issue. There is no need to use host-spawn for that command.

@1player 1player closed this as completed Jul 20, 2022
@89luca89
Copy link
Contributor Author

@1player i'm fine with this, being an upstream bug, at least having the no-pty flag suffices 👍

89luca89 added a commit to 89luca89/distrobox that referenced this issue Jul 20, 2022
This workaround is needed because of a bug in gio (used by xdg-open) where
a race condition happens when allocating a pty, leading to the command
being killed before having time to be executed.

https://gitlab.gnome.org/GNOME/glib/-/issues/2695
1player/host-spawn#7

As an (ugly) workaround, we will not allocate a pty for those commands.

Signed-off-by: Luca Di Maio <luca.dimaio1@gmail.com>
@1player
Copy link
Owner

1player commented Nov 22, 2022

BTW, I was following the upstream gio thread about this issue, and it might've been fixed with a recent release (which would fix it for most apps using GLib).

https://gitlab.gnome.org/GNOME/glib/-/issues/2695

I haven't had the chance to test yet.

@89luca89
Copy link
Contributor Author

Interesting, let's see how fast it trickles down to downstream 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working upstream
Projects
None yet
Development

No branches or pull requests

2 participants