mosh doesn't work on PowerPC OS X 10.5 (Leopard) #479

Closed
deutrino opened this issue Dec 16, 2013 · 12 comments

Comments

@deutrino

Built 1.2.4 from source after trying Tigerbrew's 1.2.4. Both seem to connect and then instantly exit, saying

[mosh is exiting.]

after a couple of line feeds. I tried a couple of different mosh servers, x86 and ARM, both Debian or close enough.

@andersk
Member

andersk commented Dec 16, 2013

If you apply #452 on the client side, you might get a more informative error message.

@andersk
Member

andersk commented Dec 16, 2013

Also, this might be related to #424? (I’m not a Mac user.)

@deutrino
Author

I'll happily try it - it may take me a day or two.

@deutrino
Author

I manually applied #452 to 1.2.4 as downloaded by Tigerbrew using 'brew install --interactive'. There's still no useful error output after switching to alternate screen and back; it just does that and prints "[mosh is exiting.]" after a couple line feeds as before.

@deutrino
Author

deutrino commented Feb 3, 2014

What can I do to help get this working on PowerPC? Now that I'm used to mosh, its lack is very limiting on my PPC laptop.

@keithw
Member

keithw commented Feb 3, 2014

What happens when you run mosh-server and mosh-client separately (as described at http://mosh.mit.edu)?

@deutrino
Author

Okay, I've installed mosh again on my G4 with Tigerbrew (see mistydemeo/tigerbrew#87 for related ticket). This installs 1.2.4, downloading the tarball straight from mosh.mit.edu. This is on OS X 10.5.8 running on a PPC 7450 (1.25GHz G4). Compilation was mostly accomplished with 'cc1plus' - just watched 'top' as it was building. Here's the build output, if it's useful:

$ brew install mosh
==> Downloading http://mosh.mit.edu/mosh-1.2.4.tar.gz
Already downloaded: /Library/Caches/Homebrew/mobile-shell-1.2.4.tar.gz
==> ./configure --prefix=/usr/local/Cellar/mobile-shell/1.2.4 --enable-completio
==> make install
==> Caveats
Bash completion has been installed to:
  /usr/local/etc/bash_completion.d
==> Summary
/usr/local/Cellar/mobile-shell/1.2.4: 13 files, 1.6M, built in 4.7 minutes

So, I run mosh-server on the remote host:

$ mosh-server

MOSH CONNECT 60024 6FsVVXcwd4wSMJRAQuAonQ

mosh-server (mosh 1.2.3)
Copyright 2012 Keith Winstein <mosh-devel@mit.edu>
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

[mosh-server detached, pid = 3546]
$

And then I run mosh-client on the G4, and it exits in probably less than half a second with an interesting message in the top-of-screen bar (more on that in a sec). Here's how I executed it (IP redacted):

$ MOSH_KEY=6FsVVXcwd4wSMJRAQuAonQ mosh-client 111.222.111.222 60024

mosh did not make a successful connection to 111.222.111.222:60024.
Please verify that UDP port 60024 is not firewalled and can reach the server.

(By default, mosh uses a UDP port between 60000 and 61000. The -p option
selects a specific UDP port number.)

[mosh is exiting.]

First order of business: I'm connecting from my home here with the G4. All other machines using mosh - OS X Intel, Linux, and Linux on Raspberry Pi - have never had any problems connecting to this remote host through my NAT, and have been doing so for months, multiple times a day. This problem also occurs on the G4 on friends' networks, coffee shops, and so on, so I'm fairly confident that it's not a networking problem external to the G4. I've also never observed these symptoms on anything but the G4.

Now to the more interesting bits. First, when I attempted to connect with 'mosh-client' and it rapidly exited, I noticed a flash of the top-of-screen bar, so I ran it multiple times so that I'd have enough time to read it. The bar read roughly as follows, but with the "without contact" time landing somewhere different between about 20:00 and 55:00 on every execution.

mosh: Timed out waiting for server... (29:51 without contact)

Finally, when I looked for PID 3546 on the remote host, it did not exist.

I hope this is useful! Let me know if there are more steps I can take.

@deutrino
Author

Building with GCC 4.8 (built from Tigerbrew) does not solve this problem on PPC 7450 ("G4e") running 10.5.8.

@deutrino
Author

This is broken the same way on OS X 10.5.8 on a G5 (PPC 970), so it's not just confined to PPC 7450.

@mistydemeo

I can repro on a PowerBook G4 with OS X 10.5.8. I'll try building on Intel OS X 10.5.8 to see if it occurs there as well.

@mistydemeo

I tested on an Intel Leopard machine, and a command that fails on PowerPC succeeds there. I suspect there may be an endianness issue here.

@mistydemeo

So it turns out this is a simple endian issue, in the timestamp code - thanks to @keithw for suggesting it was the timestamp.

The timestamp code has a Darwin-specific code path that uses mach_absolute_time(). While that's known to have somewhat different behaviour between PPC and Intel, that turned out to be a red herring - it was something much simpler.

mach_absolute_time() returns a uint64_t value that's an "absolute time unit" which is not a second; to convert into seconds, a mach_timebase_info_data_t struct is provided with numerator and denominator members that can be used to calculate the value in a fraction of seconds. mosh converts to milliseconds like so:

millis_cache = ((mach_absolute_time() * s_timebase_info.numer) / (1000000 * s_timebase_info.denom));

The division with a different integer type produces a wrong result on PowerPC, which is big-endian, and that's the cause of the hugely wrong timestamps @gordon-morehouse was seeing. Creating a uint64_t 1000000 and dividing by that produces correct results.
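For anyone who wants to poke at this without PowerPC hardware, here is a minimal sketch of the two divisions as standalone functions. It is not mosh's code: the function names are made up, and `ticks`, `numer` and `denom` are passed in as plain parameters instead of coming from mach_absolute_time() and mach_timebase_info(). It assumes the divisor in the original expression is evaluated in 32-bit unsigned arithmetic, which is what `1000000 * denom` does when `int` is 32 bits wide and `denom` is a `uint32_t`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the original expression. The divisor is kept in
// 32 bits, so it wraps modulo 2^32 whenever 1000000 * denom does not fit.
uint64_t millis_narrow_divisor(uint64_t ticks, uint32_t numer, uint32_t denom) {
    const uint32_t wrapped_divisor = 1000000u * denom;  // well-defined unsigned wrap
    assert(wrapped_divisor != 0);
    return (ticks * numer) / wrapped_divisor;
}

// Same conversion with the 1000000 constant widened to uint64_t, so the
// divisor is computed in 64 bits and cannot wrap for any 32-bit denom.
uint64_t millis_wide_divisor(uint64_t ticks, uint32_t numer, uint32_t denom) {
    assert(denom != 0);
    const uint64_t ns_per_ms = 1000000;
    return (ticks * numer) / (ns_per_ms * denom);
}
```

With the timebase values reported in a commit message later in this thread (numer == 1000000000, denom == 18431683), 18431683 ticks correspond to one second: the wide version returns 1000 for that input and the narrow version does not. With numer == denom == 1 both versions agree, which fits the bug staying hidden on machines that report that timebase. Note that `ticks * numer` can still overflow `uint64_t` for large tick counts in both versions.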

That said, it seems unnecessary to have Darwin-specific timestamp code given that gettimeofday() exists and works as expected on, AFAIK, every version of Darwin going way way back.
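A rough sketch of what a gettimeofday()-based millisecond timestamp could look like is below. The helper name is hypothetical and this is not the code from any commit; it only uses the standard POSIX gettimeofday() call and struct timeval.

```cpp
#include <cstdint>
#include <sys/time.h>

// Hypothetical helper: wall-clock milliseconds since the Unix epoch.
// Returns 0 if gettimeofday() fails, which a real caller would want to
// handle more carefully.
uint64_t millis_from_gettimeofday() {
    struct timeval tv;
    if (gettimeofday(&tv, nullptr) != 0) {
        return 0;
    }
    // Widen before multiplying so the arithmetic is done in 64 bits.
    return static_cast<uint64_t>(tv.tv_sec) * 1000
         + static_cast<uint64_t>(tv.tv_usec) / 1000;
}
```

One trade-off to keep in mind: gettimeofday() reports wall-clock time, so the value can jump if the system clock is adjusted, whereas mach_absolute_time() counts ticks that are not affected by clock changes.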

mistydemeo added a commit to mistydemeo/mosh that referenced this issue Apr 18, 2014
freeze_timestamp() was previously using the mach_absolute_time()
function on Darwin to determine time. This isn't really necessary,
since Darwin also supports gettimeofday(). (mach_absolute_time() also
introduced a minor endian bug that caused implausible timestamps
on PowerPC, which is easily fixed but doesn't happen with
gettimeofday() anyway.)

Fixes mobile-shell#479.
mistydemeo added a commit to mistydemeo/mosh that referenced this issue Apr 18, 2014
Multiplying a uint64_t by an int was producing wrong results on big-
endian architectures (e.g. PowerPC). This resulted in implausible
timestamps that caused mosh to exit instantly on starting up.

Fixes mobile-shell#479.
mistydemeo added a commit to mistydemeo/mosh that referenced this issue Apr 18, 2014
When converting the value of mach_absolute_time() into milliseconds,
multiplying a uint64_t by an int was producing wrong results on big-
endian architectures (e.g. PowerPC) due to the larger value of
s_timebase_info.denom on that platform. This resulted in implausible
timestamps that caused mosh to exit instantly on starting up.

Fixes mobile-shell#479.
andersk added a commit to andersk/mosh that referenced this issue Apr 18, 2014
A Darwin PPC 32-bit user observes huge values numer == 1000000000 and
denom == 18431683 returned from mach_timebase_info().  For these
values, mach_absolute_time() * numer overflows uint64_t every 1000.82
seconds, and 1000000 * denom always overflows uint32_t, with the
effect of making time run backwards at -11190660 times its usual
speed.

This bug was masked on Darwin x86 64-bit, where numer == denom == 1.

Fix it by doing the conversion with double arithmetic instead.

Closes mobile-shell#479.

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
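
The overflow described in this commit message can be reproduced off-device with a small sketch. The code below is an illustration under stated assumptions, not the actual patch: the function names are hypothetical, and `ticks`, `numer` and `denom` are plain parameters standing in for mach_absolute_time() and the mach_timebase_info() fields.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: convert ticks to milliseconds in double arithmetic,
// so neither the product nor the divisor can wrap around.
uint64_t millis_via_double(uint64_t ticks, uint32_t numer, uint32_t denom) {
    assert(denom != 0);
    const double ms_per_tick = static_cast<double>(numer) / (1000000.0 * denom);
    return static_cast<uint64_t>(static_cast<double>(ticks) * ms_per_tick);
}

// Integer conversion with a 64-bit divisor, for comparison. The divisor no
// longer wraps, but ticks * numer still overflows uint64_t for large ticks.
uint64_t millis_via_uint64(uint64_t ticks, uint32_t numer, uint32_t denom) {
    assert(denom != 0);
    const uint64_t ns_per_ms = 1000000;
    return (ticks * numer) / (ns_per_ms * denom);
}
```

Using the values quoted above, 18431683 * 2000 ticks is 2000 seconds of uptime, which is past the roughly 1000.82 second point where the integer product overflows. The double version returns about 2000000 ms for that input (truncation can leave it one millisecond short), while the integer version returns something else entirely.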
4 participants