Unix domain sockets return spurious data from getsockopt(SO_PEERCRED) if the peer has already departed, which breaks ssh-agent forwarding for Ubuntu 18.04 #3183

Closed
0xabu opened this issue May 11, 2018 · 48 comments

Comments

@0xabu

0xabu commented May 11, 2018

Since upgrading from Ubuntu 16.04 to 18.04, I've noticed that my ssh-agent setup no longer works reliably. For the purposes of debugging, I ran ssh-agent as:

/usr/bin/ssh-agent -s -d -a /tmp/.sshagent.sock

After doing this, I can add keys to the agent, and make exactly one SSH connection. Later, trying to connect to the agent fails with Error connecting to agent: Connection refused. Before the SSH connection (when everything still works), getsockopt appears to behave as expected, returning the pid/uid/gid of the process connecting to the socket:

accept(3</unknown>, {sa_family=AF_UNIX}, [110->2]) = 4</unknown>
getsockopt(4</unknown>, SOL_SOCKET, SO_PEERCRED, {pid=369, uid=1000, gid=1000}, [12]) = 0
getuid()                                = 1000

Later, however, getsockopt returns pid=0 and -1 for uid/gid, which ssh-agent rejects as a uid mismatch (the euid it logs, truncated below, is -1 read as an unsigned value):

accept(3</unknown>, {sa_family=AF_UNIX}, [110->2]) = 4</unknown>
getsockopt(4</unknown>, SOL_SOCKET, SO_PEERCRED, {pid=0, uid=-1, gid=-1}, [12]) = 0
getuid()                                = 1000
getuid()                                = 1000
write(2</dev/pts/1>, "uid mismatch: peer euid 42949672"..., 48) = 48
close(4</unknown>)                      = 0
close(3</unknown>)                      = 0

I'm on RS4 build 17134. /tmp is on an lxfs mountpoint.

@WSLUser

WSLUser commented May 11, 2018

How did you upgrade? do-release-upgrade or installing the new 18.04 app?

@0xabu
Author

0xabu commented May 11, 2018

@DarthSpock, do-release-upgrade -d

@0xabu
Author

0xabu commented May 11, 2018

Some more diagnosis: this bug is triggered by initiating an SSH connection that uses agent forwarding (ssh -o ForwardAgent=yes). When that connects, it does a test connection to the auth sock:

connect(9</unknown>, {sa_family=AF_UNIX, sun_path="/tmp/.sshagent.sock"}, 110) = 0
close(9</unknown>)                      = 0

On the SSH agent side, this one connect/close pair causes bogus data to be returned from getsockopt(SO_PEERCRED), probably because the peer is already gone by the time of the getsockopt call:

accept(3</unknown>, {sa_family=AF_UNIX}, [110->2]) = 5</unknown>
getsockopt(5</unknown>, SOL_SOCKET, SO_PEERCRED, {pid=0, uid=-1, gid=-1}, [12]) = 0
getuid()                                = 1000
getuid()                                = 1000
write(2</dev/pts/1>, "uid mismatch: peer euid 42949672"..., 48) = 48
close(5</unknown>)                      = 0
close(3</unknown>)                      = 0

… and after that the ssh-agent remains alive but is unusable, because it has closed its listening socket (fd 3 in the trace above). This is the only thing that differs between Ubuntu 16.04 and 18.04: in earlier versions, the agent would log the error message and drop that specific connection but keep its listening socket open; now (probably out of security paranoia) it becomes unusable after one failed connection.

So there's a simple workaround for now: don't use agent forwarding.
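
To apply that workaround for a single connection without editing ~/.ssh/config, a minimal sketch follows. The wrapper name `ssh_noforward` is hypothetical (it is not from this thread); `-o ForwardAgent=no` is the standard OpenSSH client option, and options given on the command line take precedence over the config file.

```shell
# Hypothetical helper (not from the thread): run ssh with agent forwarding
# explicitly disabled. Per the diagnosis above, forwarding is what triggers
# the test connection that trips up the agent.
ssh_noforward() {
  ssh -o ForwardAgent=no "$@"
}
```

Usage would look like `ssh_noforward user@host`, with all other arguments passed through unchanged.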

@0xabu changed the title from "Unix domain sockets sometimes return spurious data from getsockopt(SO_PEERCRED), breaks ssh-agent for Ubuntu 18.04" to "Unix domain sockets return spurious data from getsockopt(SO_PEERCRED) if the peer has already departed, which breaks ssh-agent forwarding for Ubuntu 18.04" on May 11, 2018

@mdkeehan

I have noticed my ssh-agent exhibiting similar behaviour after upgrading to 18.04 using
do-release-upgrade -d

@rarenerd

I can confirm having the same issue, using Ubuntu 18.04 from the Microsoft Store on a new Windows 10 installation.

@rarenerd

My config is identical to the example in your link. This does not resolve the issue for me.

@0xabu
Author

0xabu commented May 15, 2018

@DarthSpock, I already have the metadata flag on my drvfs mounts, but I don't think it's relevant here anyway because the ssh-agent socket is on a different lxfs mountpoint.

@0xabu
Author

0xabu commented May 15, 2018

@DarthSpock the issue is the new socket close call in the error-handling path of the ssh-agent binary. Why would that be any different from a clean install? I'm pretty sure this is just another minor compatibility bug in the implementation of getsockopt.

@rarenerd

@DarthSpock I'm doing this on a clean Windows 10 install with a clean Ubuntu 18.04 install. I've since removed the 18.04 install and installed 16.04 where I don't have this issue.

@0xabu
Author

0xabu commented May 15, 2018

It's a bug in WSL.

@rarenerd

rarenerd commented Jun 5, 2018

Is there anything I can do to help progress this ticket? Anything needed to reproduce the issue?

@WSLUser

WSLUser commented Jun 5, 2018

@rarenerd By following CONTRIBUTING.md

@d-rez

d-rez commented Jun 26, 2018

I can confirm that a clean installation of Ubuntu 18.04 on the 1803 build of Windows 10 results in this problem (WSL was installed after updating to 1803).

@Brian-Perkins

There were changes in Insider Build 17704 that may address this.

@alexwh

alexwh commented Jul 7, 2018

I can confirm that SSH agent forwarding works on 17711.

@thegass

thegass commented Jul 21, 2018

Will this get fixed for non-Insider builds, or do we have to wait for the fall update of Windows 10?
WSL is kind of useless without a working ssh-agent.

@d-rez

d-rez commented Jul 21, 2018

@thegass it's not useless; just install Ubuntu 16.04 instead of 18.04 from the Windows Store, it works fine.

@thegass

thegass commented Jul 21, 2018

@d-rez I will try the old version. (I already tried openSUSE, but that has the same problem.)

@CoolCold

* ssh(1): Add a ProxyJump option and corresponding -J command-line
   flag to allow simplified indirection through a one or more SSH
   bastions or "jump hosts".

was introduced in version 7.3 (http://www.openssh.com/txt/release-7.3), which is newer than what Ubuntu 16.04 ships:

coolcold@x230-coolcold:~$ dpkg -s openssh-client|egrep ^Version:
Version: 1:7.2p2-4ubuntu2.2

and that is a limiting factor for me, for example.

@rcarmo

rcarmo commented Aug 7, 2018

Any idea if this is getting rolled up into a prod update soon? I can't revert to 16.04 solely for the sake of SSH (because I need some of the updated userland), and yet I really need SSH forwarding to work, otherwise I can't checkout repositories on remote machines with my own key.

@benhillis
Member

@rcarmo - We typically only service security and reliability issues, so this won't be available in a non-Insiders build until the next update due this fall.

@shoffmeister

Could you run multiple distributions in parallel (16.04 for git clone et al.; 18.04 for a current userland) and use the Windows host as the shared storage provider (/mnt/c/git-repos-here)?

@0xabu
Author

0xabu commented Aug 7, 2018

@rcarmo have you tried grabbing the ssh-agent binary from 16.04 and using that as a workaround?

@rcarmo

rcarmo commented Aug 7, 2018

It's not practical to maintain two copies of Ansible, Terraform, az, my custom pyenvs and $DIVINITY knows what else and switch over to another distro "just" for SSH. I just reinstalled everything today after realizing 18.04 had new GCC and "good enough" Go, and having a critical function like agent forwarding conk out on me is a major issue (I pretty much live inside the terminal other than a browser and corp e-mail).

@rcarmo

rcarmo commented Aug 7, 2018

@0xabu from the thread above I was under the impression this was also broken in other distros, but yes, that worked, thanks!

Not that I'm happy about it, but it is a semi-acceptable workaround for WSL. Which should probably have off-cycle updates (hint hint).

Here's a quick HOWTO for a fix (and for the non-squeamish):

If you're reading this after August 2018, go here to check you're getting the latest version from xenial-updates :

cd /tmp/
wget http://mirrors.kernel.org/ubuntu/pool/main/o/openssh/openssh-client_7.2p2-4ubuntu2.4_amd64.deb
dpkg -x openssh-client_7.2p2-4ubuntu2.4_amd64.deb /tmp/deb
sudo mv /usr/bin/ssh-agent /usr/bin/ssh-agent.18.04
# for safekeeping in case of bionic updates
sudo mv /tmp/deb/usr/bin/ssh-agent /usr/bin/ssh-agent.16.04
sudo cp /usr/bin/ssh-agent.16.04 /usr/bin/ssh-agent
sudo chown root:ssh /usr/bin/ssh-agent

(Edits for readability.)
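
Should a fixed Windows build make the swap unnecessary, the steps above can be undone. The following is a minimal sketch, assuming the backup name `ssh-agent.18.04` from the HOWTO; the function name `restore_bionic_agent` and its directory argument are hypothetical, added so the logic can be tried on a scratch directory before touching /usr/bin.

```shell
# Hypothetical helper: restore the ssh-agent binary saved by the HOWTO above.
# $1 is the directory holding the binaries (defaults to /usr/bin).
restore_bionic_agent() {
  bin_dir=${1:-/usr/bin}
  backup="$bin_dir/ssh-agent.18.04"
  if [ ! -f "$backup" ]; then
    echo "no backup found at $backup" >&2
    return 1
  fi
  # Copy rather than move, so the backup stays available.
  cp -p "$backup" "$bin_dir/ssh-agent"
}
```

Against the real /usr/bin this needs to run in a root shell. Reinstalling the package with `sudo apt install --reinstall openssh-client` should have a similar effect, since it puts the packaged binary back.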

@apsdsm

apsdsm commented Aug 19, 2018

@rcarmo this worked for me, thanks!

@masoo

masoo commented Oct 4, 2018

Even with the method from #3183 (comment), "ssh-agent" has stopped working properly in my environment (Windows 10 1809).
How is it for everyone else?

#3183 (comment)
Thank you.
It worked fine.

@jstangroome

I can confirm that Windows 10 October 2018 Update Version 1809 (OS Build 17763.1) works fine with Ubuntu Bionic's ssh-agent from the openssh-client apt package version 1:7.6p1-4.

https://blogs.windows.com/windowsexperience/2018/10/02/how-to-get-the-windows-10-october-2018-update/

@trueromio

trueromio commented Dec 12, 2018

Just posting @rcarmo's workaround with a fixed package URL:

cd /tmp/
wget http://mirrors.kernel.org/ubuntu/pool/main/o/openssh/openssh-client_7.2p2-4ubuntu2.6_amd64.deb
dpkg -x openssh-client_7.2p2-4ubuntu2.6_amd64.deb /tmp/deb
sudo mv /usr/bin/ssh-agent /usr/bin/ssh-agent.18.04
# for safekeeping in case of bionic updates
sudo mv /tmp/deb/usr/bin/ssh-agent /usr/bin/ssh-agent.16.04
sudo cp /usr/bin/ssh-agent.16.04 /usr/bin/ssh-agent
sudo chown root:ssh /usr/bin/ssh-agent

@bentterp

Package is now openssh-client_7.2p2-4ubuntu2.7_amd64.deb

cd /tmp/
wget http://mirrors.kernel.org/ubuntu/pool/main/o/openssh/openssh-client_7.2p2-4ubuntu2.7_amd64.deb
dpkg -x openssh-client_7.2p2-4ubuntu2.7_amd64.deb /tmp/deb
sudo mv /usr/bin/ssh-agent /usr/bin/ssh-agent.18.04
# for safekeeping in case of bionic updates
sudo mv /tmp/deb/usr/bin/ssh-agent /usr/bin/ssh-agent.16.04
sudo cp /usr/bin/ssh-agent.16.04 /usr/bin/ssh-agent
sudo chown root:ssh /usr/bin/ssh-agent

I know this is an old issue, but some companies are still rolling out Build 1803 so the fix is still relevant

@tony1223

Got the same issue, and the workaround really saved my day. Thanks a lot.

@peter279k

I also have the same issue here.

After downgrading the openssh-client package version, it worked successfully.

@ehrlich-b

Just updating the script again for those who need it.

cd /tmp/
wget http://mirrors.kernel.org/ubuntu/pool/main/o/openssh/openssh-client_7.2p2-4ubuntu2.8_amd64.deb 
dpkg -x openssh-client_7.2p2-4ubuntu2.8_amd64.deb /tmp/deb
sudo mv /usr/bin/ssh-agent /usr/bin/ssh-agent.18.04
# for safekeeping in case of bionic updates
sudo mv /tmp/deb/usr/bin/ssh-agent /usr/bin/ssh-agent.16.04
sudo cp /usr/bin/ssh-agent.16.04 /usr/bin/ssh-agent
sudo chown root:ssh /usr/bin/ssh-agent

@ntcong

ntcong commented Mar 14, 2019

If anyone is having problems with a BSOD (lxcore.sys), try running sudo apt install --reinstall openssh-client

@petersalomonsen

Thanks @ADotOut. This was really helpful.

@HarikrishnanBalagopal

ssh-agent is still broken in the latest Ubuntu 18.04 from the Windows Store.

$ echo $SSH_AUTH_SOCK

The environment variable is not set when you close an Ubuntu terminal and reopen it.
I tried the script @ADotOut posted, but the environment variable is still not set.
This is annoying for a lot of reasons, especially because you can no longer use SSH keys to log into servers.

@tmillican

@HarikrishnanBalagopal :

If you're still having this issue, double-check that ssh-agent is actually running and listening to the socket specified by $SSH_AUTH_SOCK. In my .bashrc I have:

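# Start an agent only if the well-known symlink does not point at a socket
if [ ! -S ~/.ssh/ssh_auth_sock ]; then
  eval "$(ssh-agent)"
  ln -sf "$SSH_AUTH_SOCK" ~/.ssh/ssh_auth_sock
fi
export SSH_AUTH_SOCK=~/.ssh/ssh_auth_sock
# Add the key only if the agent does not already list it
ssh-add -l | grep -q 'somekey' || ssh-add ~/.ssh/somekey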

I started getting "Error connecting to agent: Connection refused" after an automatic system reboot, so I checked, and sure enough, the socket in /tmp had somehow persisted across a reboot. Obviously the listening ssh-agent instance didn't, hence the error. I don't yet know if this is a reproducible issue, but if you have similar logic in your .bashrc, it might be the cause of your problem.
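
One way to tell a stale socket from a live agent is the exit status of `ssh-add -l`: according to the ssh-add man page it is 0 on success, 1 if the command fails (for example, the agent holds no identities), and 2 if the agent cannot be contacted. Below is a minimal sketch of that check. The function name `agent_state` and the `SSH_ADD` override variable are hypothetical; the override exists only so the command can be swapped out when trying the function.

```shell
# Hypothetical helper: classify the agent behind $SSH_AUTH_SOCK by the exit
# status of `ssh-add -l` (0 = keys listed, 1 = no keys, 2 = cannot connect).
agent_state() {
  status=0
  "${SSH_ADD:-ssh-add}" -l >/dev/null 2>&1 || status=$?
  case $status in
    0) echo "has-keys" ;;
    1) echo "no-keys" ;;
    2) echo "unreachable" ;;
    *) echo "unknown" ;;
  esac
}
```

In a bash setup like the one above, something along the lines of `[ "$(SSH_AUTH_SOCK=~/.ssh/ssh_auth_sock agent_state)" = unreachable ] && rm -f ~/.ssh/ssh_auth_sock`, placed before the `-S` test, would let a fresh agent start after a reboot.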

@sinkcup

sinkcup commented Oct 8, 2019

Add this code to ~/.bashrc:

[ -x /usr/bin/ssh-agent ] && eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa

@N0xFF

N0xFF commented Dec 24, 2019

Add this code to ~/.bashrc:

[ -x /usr/bin/ssh-agent ] && eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa

You create a new process each time you open bash. We can reuse one ssh-agent instance:

if ! pgrep ssh-agent > /dev/null; then
  rm -f /tmp/ssh-auth-sock
  eval "$(ssh-agent -s -a /tmp/ssh-auth-sock)"
else
  export SSH_AUTH_SOCK=/tmp/ssh-auth-sock
fi
ssh-add

@Morgy93

Morgy93 commented Jan 15, 2020

You create a new process each time you open bash. We can reuse one ssh-agent instance:

if [ ! -S /tmp/ssh-auth-sock ]; then
  eval "$(ssh-agent -s -a /tmp/ssh-auth-sock)"
else
  export SSH_AUTH_SOCK=/tmp/ssh-auth-sock
fi
ssh-add

This worked fine until I restarted the PC.

Now I'm only left with:
Error connecting to agent: Connection refused

$ cat /tmp/ssh-auth-sock 
cat: /tmp/ssh-auth-sock: No such device or address
$ eval "$(ssh-agent -s -a /tmp/ssh-auth-sock)"
bind: Address already in use
unix_listener: cannot bind to path: /tmp/ssh-auth-sock

Any idea?

@N0xFF

N0xFF commented Jan 15, 2020

@Morgy93 I updated the script above, replacing the socket file lookup with a process check.
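
As an alternative to the pgrep check (this is a different technique, not the script referred to here), the socket itself can be probed: if `ssh-add -l` exits with status 2, nothing is answering on it, so the leftover file can be removed before starting a new agent. This is a minimal sketch. The function name `ensure_agent` and the `SSH_ADD` / `SSH_AGENT` override variables are hypothetical; the overrides exist only so the commands can be stubbed when trying it out.

```shell
# Hypothetical helper: make sure an agent is listening on the socket path $1,
# replacing a stale socket file left over from a previous boot.
ensure_agent() {
  sock=$1
  SSH_AUTH_SOCK=$sock
  export SSH_AUTH_SOCK
  status=0
  "${SSH_ADD:-ssh-add}" -l >/dev/null 2>&1 || status=$?
  if [ "$status" -eq 2 ]; then
    # Nothing answers on the socket. Remove the leftover file first,
    # otherwise ssh-agent fails to bind with "Address already in use".
    rm -f "$sock"
    eval "$("${SSH_AGENT:-ssh-agent}" -s -a "$sock")" >/dev/null
  fi
}
```

Called as `ensure_agent /tmp/ssh-auth-sock` from ~/.bashrc, this would reuse a reachable agent and only start a new one when the probe fails.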

@lmachens

lmachens commented Feb 7, 2020

I can confirm that the script works fine for reusing one ssh-agent.
But you are calling ssh-add even if the ssh-agent already exists.
I would call it only once:

if ! pgrep ssh-agent > /dev/null; then
  rm -f /tmp/ssh-auth-sock
  eval "$(ssh-agent -s -a /tmp/ssh-auth-sock)"
  ssh-add
else
  export SSH_AUTH_SOCK=/tmp/ssh-auth-sock
fi

There is still the issue that you have to enter the password after a restart. Any idea how to solve that?

@Morgy93

Morgy93 commented Feb 7, 2020

By the way, there's an "official" script in the docs:

Linux:

To start the SSH Agent in the background, run:

eval "$(ssh-agent -s)"

To start the SSH Agent automatically on login, add these lines to your ~/.bash_profile:

if [ -z "$SSH_AUTH_SOCK" ]
then
   # Check for a currently running instance of the agent
   RUNNING_AGENT="`ps -ax | grep 'ssh-agent -s' | grep -v grep | wc -l | tr -d '[:space:]'`"
   if [ "$RUNNING_AGENT" = "0" ]
   then
        # Launch a new instance of the agent
        ssh-agent -s &> .ssh/ssh-agent
   fi
   eval `cat .ssh/ssh-agent`
fi

Source: https://code.visualstudio.com/docs/remote/troubleshooting#_setting-up-the-ssh-agent

@pearcec

pearcec commented Mar 26, 2020

Try this:

if ! pgrep ssh-agent > /dev/null; then
  rm -f /tmp/ssh-auth-sock
  eval "$(ssh-agent -s -a /tmp/ssh-auth-sock)"
fi
export SSH_AUTH_SOCK=/tmp/ssh-auth-sock
if ! ssh-add -l; then
  ssh-add
fi

@istvan-ujjmeszaros

Actually, none of the above scripts work in WSL2 Ubuntu 18.04.
The only good solution for me so far was simply adding eval "$(ssh-agent -s)" to my .bash_profile.

@CoolCold

CoolCold commented Oct 7, 2020

This is the snippet I currently use in my .bashrc.local. (I often copy .bashrc and other dotfiles to servers, so splitting this into a separate .bashrc.local ensures desktop-specific settings don't reach remote servers.)

#setting ssh agent
ssh-add -l &>/dev/null
if [ "$?" == 2 ]; then
  test -r ~/.ssh-agent && \
    eval "$(<~/.ssh-agent)" >/dev/null
fi
ssh-add -l &>/dev/null
if [ "$?" == 2 ]; then
    (umask 066; ssh-agent > ~/.ssh-agent)
    eval "$(<~/.ssh-agent)" >/dev/null
fi
ssh-add -l &>/dev/null
if [ "$?" == 1 ]; then
    ssh-add ~/.ssh/mykey.id_rsa.key
fi

.bashrc part to include local file:

if [ -f ~/.bashrc.local ]; then
    . ~/.bashrc.local
fi

.ssh/config to learn new keys once used:

Host *
  ControlMaster auto
  ControlPath /tmp/%r@%h:%p
  ServerAliveInterval 30
  ServerAliveCountMax 10
  AddKeysToAgent yes

All of that works fine for me on WSL2 with Ubuntu 20.04.

@lucas-langa

Actually, none of the above scripts work in WSL2 Ubuntu 18.04. The only good solution for me so far was simply adding eval "$(ssh-agent -s)" to my .bash_profile.

For anyone in 2022 and beyond running WSL2: I can confirm this worked for me!

@Adriano-Dos-Santos

Actually, none of the above scripts work in WSL2 Ubuntu 18.04. The only good solution for me so far was simply adding eval "$(ssh-agent -s)" to my .bash_profile.

This worked for me in 2022 using WSL2.
