ruby-mri only binding to IPV6 interfaces #51

Closed
dannysheehan opened this issue Dec 30, 2014 · 27 comments

Comments

@dannysheehan dannysheehan commented Dec 30, 2014

Running 'pcs cluster auth' against an IPv4 address fails, but IPv6 addresses work fine. See below.


$ pcs cluster auth  127.0.0.1
Error: Unable to communicate with 127.0.0.1

# IPV6 localhost is fine
#
$ pcs cluster auth  ::1
Username: 

# Observation: ruby-mri is only listening on tcp6 not tcp.
#
$ netstat -anp | grep 2224
tcp6       0      0 :::2224                 :::*                    LISTEN      744/ruby-mri

Possible Fix

After making the following change, pcsd is reachable via both IPv4 and IPv6 addresses.

$ diff /usr/lib/pcsd/ssl.rb.bak /usr/lib/pcsd/ssl.rb
31c31
<   :BindAddress        => "::",
---
>   :BindAddress        => "*",

$ systemctl restart pcsd.service
Collaborator

@tomjelinek tomjelinek commented Feb 9, 2015

We need to take a closer look at this issue. For me bind address :: works just fine whereas * makes pcsd inaccessible on IPv6.

  1. RHEL 7, bind address ::
# netstat -anp | grep 2224
tcp6       0      0 :::2224                 :::*                    LISTEN      26505/ruby

pcs cluster auth 127.0.0.1 works
pcs cluster auth ::1 works

  2. RHEL 7, bind address *
# netstat -anp | grep 2224
tcp        0      0 0.0.0.0:2224            0.0.0.0:*               LISTEN      26569/ruby

pcs cluster auth 127.0.0.1 works
pcs cluster auth ::1 doesn't work

  3. RHEL 6, bind address ::
# netstat -anp | grep 2224
tcp        0      0 :::2224                     :::*                        LISTEN      11231/ruby

pcs cluster auth 127.0.0.1 works
pcs cluster auth ::1 works

  4. RHEL 6, bind address *
# netstat -anp | grep 2224
tcp        0      0 0.0.0.0:2224                0.0.0.0:*                   LISTEN      11344/ruby

pcs cluster auth 127.0.0.1 works
pcs cluster auth ::1 doesn't work

Collaborator

@feist feist commented Feb 15, 2015

I've been doing some research on this, and I'm not 100% sure what's going on, but it looks like different versions of Ruby behave differently.

When I'm using ruby-2.1.5, using a BindAddress of '::' only listens on IPv6, but a BindAddress of nil listens on IPv6 & IPv4. But when I use ruby 1.8.7, I don't get the same behavior.

Technically, specifying '::' should allow binding on both IPv4 & IPv6, but it appears there is a bug in WEBrick that we need to work around.

I've committed code that will set the BindAddress to nil, but that will only work with a newer ruby. @dannysheehan what version of ruby are you running?
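
For anyone who wants to see what a nil bind address expands to on their own machine, here is a minimal sketch that uses only Ruby's standard socket library rather than WEBrick itself. The helper name passive_addresses is made up for illustration, and the assumption is that a nil host ends up as a passive (wildcard) resolver lookup like this one. The output depends on the local system.

```ruby
require 'socket'

# Hypothetical helper (not part of pcsd or WEBrick): list the wildcard
# addresses the resolver returns for a nil host on the given port.
def passive_addresses(port)
  Addrinfo.getaddrinfo(nil, port, nil, :STREAM, nil, Socket::AI_PASSIVE).map do |ai|
    family = ai.ipv6? ? 'IPv6' : 'IPv4'
    "#{family} #{ai.ip_address}"
  end
end

puts passive_addresses(2224)
```

If both an IPv4 and an IPv6 wildcard entry are listed, a server that opens one listener per entry should be reachable over both protocols.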

Author

@dannysheehan dannysheehan commented Feb 15, 2015

I was using ruby 2.1.5 also.

@andyprice andyprice commented Mar 30, 2015

I managed to get this working correctly on my rawhide test cluster (ruby 2.2.1) by working around a bug in Rack's WEBrick handler, which seems to have been broken by rack/rack@5a9169d.

For what it's worth, I've opened an issue report about it (rack/rack#833), and there's another issue linked from it regarding the BindAddress option getting clobbered.

@guidtz guidtz commented Jun 9, 2015

Hello,
On Fedora 22, if I set

webrick_options = {
  :Port               => 2224,
  :BindAddress        => "*",

ruby-mri binds only to the loopback addresses, not to the host's other IP addresses:

netstat -tpnl |grep 2224
tcp        0      0 127.0.0.1:2224          0.0.0.0:*               LISTEN      1245/ruby-mri       
tcp6       0      0 ::1:2224                :::*                    LISTEN      1245/ruby-mri      

Thanks for your help
guidtz

Collaborator

@tomjelinek tomjelinek commented Jun 9, 2015

Hello guidtz,
an updated pcs package, which contains a fix for this bug, is currently in the queue to be pushed into the Fedora 22 repository.

@guidtz guidtz commented Jun 9, 2015

Thanks Tom, how many days do you think it will take to become available?

Collaborator

@tomjelinek tomjelinek commented Jun 9, 2015

I think it should be available in testing repository in two days and it should take about another week for it to be available in stable. https://admin.fedoraproject.org/updates/pcs-0.9.139-5.fc22

@guidtz guidtz commented Jun 9, 2015

Ok I'll continue my tests with the source version.

@guidtz guidtz commented Jun 9, 2015

So, I installed the git version and now it listens on 0.0.0.0:

netstat -tpnl |grep 2224
tcp        0      0 0.0.0.0:2224            0.0.0.0:*               LISTEN      3366/ruby-mri       
tcp6       0      0 :::2224                 :::*                    LISTEN      3366/ruby-mri       

But with this command I get Error: Unable to communicate with pcsd:

pcs cluster auth fed-node01 fed-node02


My cluster is running:

pcs status
Cluster name: cluster_guidtz
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Tue Jun 9 10:50:19 2015
Last change: Tue Jun 9 10:31:28 2015
Stack: corosync
Current DC: fed-node02 (2) - partition with quorum
Version: 1.1.12-a9c8177
2 Nodes configured
0 Resources configured

Online: [ fed-node01 fed-node02 ]

Full list of resources:

PCSD Status:
fed-node01: Unable to authenticate
fed-node02: Unable to authenticate

Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled

Collaborator

@feist feist commented Jun 9, 2015

Can you try running 'pcs cluster auth'? My guess is that you're running it from source, so it's not using the system "auth" directory.

@guidtz guidtz commented Jun 10, 2015

Same problem

# pcs cluster auth
Username: hacluster
Password: 
Error: Unable to communicate with pcsd

@johnjelinek johnjelinek commented Jul 31, 2015

Any updates on this? I am using CentOS 7.1 and after installing pacemaker/pcs and starting the service, it's only serving up on tcp6.

@johnjelinek johnjelinek commented Jul 31, 2015

👍

Collaborator

@tomjelinek tomjelinek commented Aug 4, 2015

guidtz: Can you try 'pcs cluster auth fed-node01 fed-node02 --debug'? Do you have any proxy set?

johnjelinek: Did you try to connect to pcsd using IPv4, or did you just look at netstat? For me it works OK on 7.1, as I described in part 1 of my first comment in this thread. Which pcs version do you have? Is it a pcs CentOS package or upstream pcs?

@spravcesite spravcesite commented Aug 8, 2015

I have the same problem with the pcs CentOS package, version 0.9.137-13.el7_1.3. It listens only on localhost:
netstat -tpnl |grep 2224
tcp 0 0 127.0.0.1:2224 0.0.0.0:* LISTEN 22908/ruby
tcp6 0 0 ::1:2224 :::* LISTEN 22908/ruby

@spravcesite spravcesite commented Aug 9, 2015

When I change the bind address in ssl.rb to nil and also add this line to ssl.rb:
host => "nil",

everything is OK.

@goldyfruit goldyfruit commented Nov 9, 2015

@spravcesite 👍
I'm running Debian Jessie with ruby 2.1.5p273

@howdoicomputer howdoicomputer commented Jan 10, 2016

I'm running into the same issue but the above fixes do not work. I can get the web UI to render but I'm unable to authenticate cluster members.

root@dev-nfs-archive-1001:/var/log/pcsd# pcs cluster auth nfs1 nfs2
Username: hacluster
Password:
Error: Unable to communicate with nfs1
Error: Unable to communicate with nfs2

@ntt1985 ntt1985 commented Jan 11, 2016

Hi, I have a similar issue:

pcs cluster auth mynode1 --debug

Running: /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb read_tokens
--Debug Input Start--
{}
--Debug Input End--

Return Value: 1
--Debug Output Start--
/usr/share/rubygems/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- json (LoadError)
from /usr/share/rubygems/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/pcsd/pcsd-cli.rb:5:in `<main>'

--Debug Output End--

Sending HTTP Request to: https://mynode1:2224/remote/check_auth
Data: None
Response Code: 401
Username: hacluster
Password:
Running: /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb auth
--Debug Input Start--
{"username": "hacluster", "local": false, "nodes": ["mynode1"], "password": "XXXXXXXX", "force": false}
--Debug Input End--

Return Value: 1
--Debug Output Start--
/usr/share/rubygems/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- json (LoadError)
from /usr/share/rubygems/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/pcsd/pcsd-cli.rb:5:in `<main>'

--Debug Output End--

Error: Unable to communicate with pcsd

However, I have another cluster on RHEL 7 where the process uses only IPv6 and everything works well. I installed pcs from the repository, but there also seems to be some problem with a gem. Why?
Can someone give me some advice?
Thank you

Collaborator

@tomjelinek tomjelinek commented Jan 12, 2016

@howdoicomputer Can you try running pcs cluster auth nfs1 nfs2 --debug and post the output? Be aware your password is contained in the output so you most probably want to replace it.

@ntt1985 So pcsd is running just fine, yet pcsd-cli.rb is unable to load rubygem-json. I'm not able to reproduce this on a freshly installed RHEL7 host. Do you have any custom ruby setup or settings, e.g. multiple ruby versions or something like that? Rubygem-json should be installed in /usr/share/gems. Can you try running GEM_PATH=/usr/share/gems pcs cluster auth mynode1 --debug?
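
When several Ruby installations might be involved, a quick diagnostic along these lines can show which interpreter is actually running and where it looks for gems. This is only a sketch; ruby_environment_report is a made-up name and is not part of pcs. Running it with the same interpreter pcs invokes (/usr/bin/ruby in the debug output above) gives the most relevant result.

```ruby
require 'rbconfig'

# Hypothetical diagnostic: report which Ruby is running, where it looks for
# gems, and whether the json library can be loaded.
def ruby_environment_report
  json_ok = begin
    require 'json'
    true
  rescue LoadError
    false
  end
  { ruby: RbConfig.ruby, version: RUBY_VERSION, gem_path: Gem.path, json: json_ok }
end

ruby_environment_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```

If json is reported as false, or gem_path points into an rvm directory rather than the system gem directory, the interpreter setup is a likely suspect.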

@ntt1985 ntt1985 commented Jan 15, 2016

@tomjelinek you are right. The problem was related to multiple versions of Ruby installed with rvm on the server. After deleting all Ruby versions, uninstalling rvm, and reinstalling pcs, everything works well. Thank you.
PS: port 2224 is actually used only on IPv6 (or at least that is what netstat -tulnp shows), but everything also works with IPv4 on a CentOS 7 cluster.

Contributor

@vvidic vvidic commented Jun 17, 2016

I have the same problem (binding to IPv6 only) on Debian with Ruby versions 2.1 and 2.3. The problem seems to be caused by the change to WEBrick::Utils::create_listeners introduced in Ruby 2.1:

ruby/ruby@b1f493d

A small example program calling this function can prove this:

require 'webrick'
WEBrick::Utils::create_listeners("::", 2000)

This is the strace output for CentOS7 with ruby 2.0:

socket(PF_INET6, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 7
fcntl(7, F_GETFD)                       = 0x1 (flags FD_CLOEXEC)
fstat(7, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
fstat(7, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
setsockopt(7, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(7, {sa_family=AF_INET6, sin6_port=htons(2000), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
listen(7, 128)                          = 0

On ruby 2.1 and newer the same program executes this:

socket(PF_INET6, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 7
fcntl(7, F_GETFD)                       = 0x1 (flags FD_CLOEXEC)
fstat(7, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
fstat(7, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
fstat(7, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
fstat(7, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
getsockname(7, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
setsockopt(7, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0
getsockname(7, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
setsockopt(7, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(7, {sa_family=AF_INET6, sin6_port=htons(2000), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
listen(7, 128)                          = 0

The problem is caused by the IPV6_V6ONLY socket option being set:

setsockopt(7, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0
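
The effect of that option can be reproduced outside WEBrick. The following is a minimal sketch using Ruby's standard socket library; listen_on_any is a hypothetical helper name, and it returns nil when IPv6 is not available or the option cannot be set on the local system.

```ruby
require 'socket'

# Hypothetical helper: open a TCP listener on the IPv6 wildcard address "::",
# explicitly setting IPV6_V6ONLY. Returns nil when IPv6 is unavailable.
def listen_on_any(port, v6only:)
  sock = Socket.new(:INET6, :STREAM)
  sock.setsockopt(:IPV6, :V6ONLY, v6only ? 1 : 0)
  sock.setsockopt(:SOCKET, :REUSEADDR, 1)
  sock.bind(Addrinfo.tcp('::', port))
  sock.listen(Socket::SOMAXCONN)
  sock
rescue SystemCallError, SocketError
  sock&.close
  nil
end

# Port 0 lets the kernel pick a free port for the demonstration.
if (sock = listen_on_any(0, v6only: false))
  puts "IPV6_V6ONLY = #{sock.getsockopt(:IPV6, :V6ONLY).int}"
  sock.close
end
```

With v6only: false, a Linux listener on :: normally also accepts IPv4 connections, which matches the Ruby 2.0 trace; with v6only: true it accepts IPv6 only, which matches the Ruby 2.1 trace.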

On Ruby 2.1 and newer, using nil in place of :: gives the earlier behavior (listening on both IPv4 and IPv6), so perhaps a Ruby version check like this can be added to the bind code in pcsd?

if primary_addr == '::' and RUBY_VERSION >= "2.1"
  primary_addr = nil
end
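
One caveat on the check above: comparing RUBY_VERSION as a string is lexicographic. That is fine for the versions mentioned in this thread, but it would misorder something like "2.10". A variant using Gem::Version, which ships with RubyGems, avoids that. This is only a sketch, and effective_bind_address is a made-up helper name, not pcsd code.

```ruby
require 'rubygems' # provides Gem::Version; loaded by default on modern Ruby

# Hypothetical helper: pick the bind address to use for a given Ruby version.
# Mirrors the suggestion above: replace "::" with nil on Ruby 2.1 and newer.
def effective_bind_address(primary_addr, ruby_version = RUBY_VERSION)
  affected = Gem::Version.new(ruby_version) >= Gem::Version.new('2.1')
  (primary_addr == '::' && affected) ? nil : primary_addr
end

p effective_bind_address('::', '2.0.0') # => "::"
p effective_bind_address('::', '2.1.5') # => nil
```

Explicitly configured addresses other than :: pass through unchanged.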
Collaborator

@tomjelinek tomjelinek commented Jun 23, 2016

We'll take a look at this. Thanks @vvidic !

Contributor

@vvidic vvidic commented Jun 27, 2016

It seems the Rack handler for WEBrick rewrites this parameter a little:

environment  = ENV['RACK_ENV'] || 'development'
default_host = environment == 'development' ? 'localhost' : '0.0.0.0'
options[:BindAddress] = options.delete(:Host) || default_host

Because of this nil gets replaced with 0.0.0.0 and we get IPv4 only. For some reason this only happens when RACK_ENV=production is set for the ruby process.
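
To make the rewriting concrete, here is a stand-alone re-creation of the three quoted lines wrapped in a function. It is an illustration only, not the actual Rack source; rewritten_bind_address is a made-up name, and the environment is passed in as an argument instead of being read from ENV.

```ruby
# Stand-alone re-creation of the three quoted lines, for illustration only.
def rewritten_bind_address(options, rack_env)
  environment  = rack_env || 'development'
  default_host = environment == 'development' ? 'localhost' : '0.0.0.0'
  options = options.dup # leave the caller's hash untouched
  options[:BindAddress] = options.delete(:Host) || default_host
  options[:BindAddress]
end

p rewritten_bind_address({ BindAddress: nil }, 'production')            # => "0.0.0.0"
p rewritten_bind_address({ BindAddress: nil }, nil)                     # => "localhost"
p rewritten_bind_address({ BindAddress: nil, Host: '*' }, 'production') # => "*"
```

Any :BindAddress value is overwritten; only :Host survives. Without RACK_ENV=production the fallback is localhost, which may be why some of the netstat outputs earlier in this thread show listeners only on 127.0.0.1 and ::1.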

But luckily getaddrinfo in libc will take * instead of nil/NULL:

int
getaddrinfo (const char *name, const char *service,
             const struct addrinfo *hints, struct addrinfo **pai)
{
...
  if (name != NULL && name[0] == '*' && name[1] == 0)
    name = NULL;

and the following fix works with rack+webrick combination too:

if primary_addr == '::' and RUBY_VERSION >= '2.1'
  primary_addr = '*'
end
Collaborator

@tomjelinek tomjelinek commented Aug 9, 2016

@vvidic Thank you very much for the time and effort you spent on debugging this issue!

Here is the resulting patch 13cfbab. Users are still able to define bind addresses without pcsd changing them based on the Ruby version.

I'm closing this issue. I believe the root cause has been fixed by the patch above. Furthermore, it is possible to set bind addresses manually (see #77) to work around this issue.

@tomjelinek tomjelinek closed this Aug 9, 2016

@mpalmer mpalmer commented Oct 13, 2017

Having hit this issue in my own projects, and since this issue helped me out in fixing that, I'll add my own caveat to what @vvidic said up-thread. Specifically:

But luckily getaddrinfo in libc will take *

is only true for glibc. As far as I can tell, there's no requirement that getaddrinfo(3) (which is the underlying function which does the mapping of * to 0.0.0.0 and ::) handle * in the manner that glibc does. I know for a fact that musl libc doesn't handle * the same way.

The only way I know that is portable across libcs is to pass BindAddress: nil to WEBrick::HTTPServer.new, which (as rack/rack#821 still isn't fixed), means that Rack::Handler::WEBrick is broken and unusable if you want to support IPv6. Thankfully, the work it does is only about five lines of code to replace, so I just do that in my projects.
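
A rough way to check how a given system behaves is sketched below, using only Ruby's standard socket library. Both helper names are hypothetical. The probe simply asks the local resolver whether it accepts * as a wildcard host, so its result depends on the libc in use, and on some systems the lookup may go to DNS before failing.

```ruby
require 'socket'

# Hypothetical probe: does the local resolver accept "*" as a wildcard host?
def star_wildcard_supported?(port = 2224)
  Addrinfo.getaddrinfo('*', port, nil, :STREAM, nil, Socket::AI_PASSIVE).any?
rescue SocketError
  false
end

# Hypothetical helper: map "*" to nil, which does not rely on "*" handling.
def portable_bind_address(addr)
  addr == '*' ? nil : addr
end

puts "resolver accepts '*': #{star_wildcard_supported?}"
```

Mapping * to nil only helps when the value reaches WEBrick::HTTPServer.new directly, since the Rack handler discussed above replaces a nil host with its own default.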
