Securing connections with SSL not working on Solaris 10 x86 #246

Open
tmcneill30 opened this issue Dec 7, 2015 · 62 comments
Labels: bug, solaris, SSL/NSS

Comments

@tmcneill30

tmcneill30 commented Dec 7, 2015

I have compiled versions 2.6.5, 2.7.1 & 2.7.3 and have been unable to implement ssl encryption between upsd and the other processes such as upsmon etc on Solaris 10.

My target machine and build server both run Solaris 10/08 in 32-bit mode. Both servers are recently patched. All other NUT functionality is working fine on the target server. I am compiling with gcc, gmake, etc.

My configure command is:

./configure  --with-ssl --with-openssl --with-wrap --with-logfacility=LOG_DAEMON \
    --with-openssl-includes=-I/usr/include:/usr/local/ssl/include/openssl \
    --with-openssl-libs='-L/usr/lib:/usr/sfw/lib:/usr/local/ssl/lib -lssl -lcrypto' \
    --with-group=ups --with-user=root \
    --with-snmp-libs=/usr/sfw/lib --with-snmp=no --with-usb=no

make && make install

Configure reports that all ssl libraries are found. /usr/local/ssl contains a build of openssl 1.0.1p. My openssl was used to create the certificate and works well with other packages I have on the system such as ntp and ssh.

I have followed the instructions in the documentation in section 9.5 "Configuring SSL". I created upsd.pem and a certificate using instructions in documentation.

I changed the following configuration entries

upsd.conf: CERTFILE /usr/local/ups/etc/upsd.pem
upsmon.conf: CERTPATH /usr/local/ups/etc/certs
upsmon.conf: FORCESSL 1

I also compiled and packaged NUT 2.7.3 and 2.7.1 for RHEL 6. When I change the above config settings and use the exact same cert and key, the upsmon to upsd connection is encrypted!! I verified this through packet sniffing, and upsmon itself now states it is using SSL. So I know my Solaris problem is not with the certificate files nor with my understanding of how to configure NUT.

ldd on upsd on rhel6 and solaris both list ssl shared objects as dependencies.

upsmon and upsd debugging show that upsmon says hello and upsd does not understand the hello. It should say hello back and then send its certificate. This occurs on RHEL6 but the ssl handshake fails immediately on Solaris 10/08.

UPS: power0@localhost (master) (power value 1)
   0.000207     UPS: ups0@localhost (master) (power value 1)
   0.000572     Using power down flag file /var/state/ups/killpower
   0.001068     debug level is '2'
   0.008958     Trying to connect to UPS [power0@localhost]
   0.011489     SSL_connect do not accept handshake.: Error 0
   0.011585     ssl_error() EOF from client
   0.011617     Can not connect to localhost in SSL, disconnect
   0.011825     UPS [power0@localhost]: connect failed: SSL error: error:1409E0E5:SSL routines:SSL3_WRITE_BYTES:ssl handshake failure
   0.011876     do_notify: ntype 0x0005 (COMMBAD)
   0.011899     Communications with UPS power0@localhost lost
   0.013125     Trying to connect to UPS [ups0@localhost]
   0.015265     SSL_connect do not accept handshake.: Error 0
   0.015355     ssl_error() EOF from client
   0.015379     Can not connect to localhost in SSL, disconnect

Open config.txt in wordpad or vi. This is config.log.

@tmcneill30
Author

config.txt

clepple added the bug label Dec 9, 2015
@clepple
Member

clepple commented Dec 9, 2015

So is the problem limited to a Solaris upsmon talking to a Solaris upsd, or are there issues interoperating between Solaris and RHEL as well? The version of OpenSSL on RHEL might be useful as well.

Also, if you are debugging the SSL handshake, upsc might be easier than upsmon to debug, although I don't think it enforces SSL since it is not sending passwords.

I'm wondering if TLSv1_server_method() is not compatible with the defaults in newer OpenSSL. Ideally, we would use TLS_server_method(), but that isn't available until OpenSSL 1.1.0. What if you change it to TLSv1_2_server_method() in https://github.com/networkupstools/nut/blob/v2.7.3/server/netssl.c#L392 ?

@tmcneill30
Author

Hi Charles,

Thanks for your response. I haven’t had time yet to test/research some of your questions. I haven’t tried connecting a Solaris upsmon to a RHEL upsd. We are not interested in doing that but I will try it for your troubleshooting benefit.

I can tell you Solaris upsc also fails to connect to Solaris upsd. It fails in the same way, with a failed SSL handshake. Once again upsd fails to recognize it’s an SSL connection and throws an error; it fails to send back a server hello and its certificate.

The openssl version on my RHEL system is 1.0.1e. It is the latest ssl package provided by Red Hat (or was a couple of months ago). On the Solaris side we have 1.0.1f. I hope to get back to you on Tuesday with more info. Would love to figure this out. We are a defense contractor and this is a security item that our client wants us to fix.

(trimmed)

@clepple
Member

clepple commented Dec 12, 2015

Thanks for your response. I haven’t had time yet to test/research some of your questions. I haven’t tried connecting a Solaris upsmon to a RHEL upsd. We are not interested in doing that but I will try it for your troubleshooting benefit.

Understood that it is not the deliverable configuration, but it is a good consistency check. A quicker test is to just run the RHEL upsc against the Solaris upsd - no extra configuration needed. Solaris upsc to RHEL upsd might also shed some light on the SSL handshake issue.

For future reference, this is how cURL handles things: https://github.com/bagder/curl/blob/master/lib/vtls/openssl.c#L1670

I can tell you Solaris upsc also fails to connect to Solaris upsd. It fails in the same way, with a failed SSL handshake. Once again upsd fails to recognize it’s an SSL connection and throws an error; it fails to send back a server hello and its certificate.

Thanks for confirming that.

The openssl version on my RHEL system is 1.0.1e. It is the latest ssl package provided by Red Hat (or was a couple of months ago). On the Solaris side we have 1.0.1f. I hope to get back to you on Tuesday with more info. Would love to figure this out. We are a defense contractor and this is a security item that our client wants us to fix.

Is NSS/NSPR a viable alternative to OpenSSL in your application?

A bit of background: NUT has had OpenSSL-based SSL support in the source code for years, but due to some licensing complications https://people.gnome.org/~markmc/openssl-and-the-gpl.html, it didn't make it into many packaging systems. Eventually, NSS-based SSL support was added as an option, and that's what goes into the Debian/Ubuntu packages (and probably others). The OpenSSL code in NUT still compiles, but it was written for a much older version of OpenSSL, and it is not exercised often.

(I am not a lawyer; views are my own and not necessarily my employer's; etc.)

@tmcneill30
Author

tmcneill30 commented Dec 22, 2015

Sorry about the delay. I’m the usual almost overwhelmed systems engineer.

With certificates enabled, I ran a Solaris upsc against a RHEL upsd. I did not receive a handshake failure. Rather it said connected with ssl.

I then ran a RHEL upsc against a Solaris upsd and received the usual ssl handshake failure. Once again, the handshake blows up right away. The server receives the hello and throws an error. It never sends a hello or the certificate back.

I recompiled 2.7.3 with the change you suggested in netssl.c. Change TLSv1 to TLSv2 in:

if ((ssl_method = TLSv1_server_method()) == NULL)

This failed to compile. It wasn’t able to identify the function TLSv2_server_method.

/bin/sh ../libtool  --tag=CC    --mode=link gcc -I../include  -I/usr/include:/usr/local/ssl/include/openssl -g -O2 -Wall -Wsign-compare -D_REENTRANT    -o upsd upsd.o user.o conf.o  netssl.o sstate.o desc.o  netget.o netmisc.o netlist.o  netuser.o netset.o netinstcmd.o ../common/libcommon.la ../common/libparseconf.la   -lnsl  -lwrap -L/usr/lib:/usr/sfw/lib:/usr/local/ssl/lib -lssl -lcrypto -lrt -lsocket -lnsl
libtool: link: gcc -I../include -I/usr/include:/usr/local/ssl/include/openssl -g -O2 -Wall -Wsign-compare -D_REENTRANT -o upsd upsd.o user.o conf.o netssl.o sstate.o desc.o netget.o netmisc.o netlist.o netuser.o netset.o netinstcmd.o  ../common/.libs/libcommon.a ../common/.libs/libparseconf.a -lwrap -L/usr/lib:/usr/sfw/lib:/usr/local/ssl/lib -lssl -lcrypto -lrt -lsocket -lnsl
Undefined                       first referenced
symbol                             in file
TLSv2_server_method                 netssl.o
ld: fatal: symbol referencing errors. No output written to upsd

I am wondering if NUT is linking to the older ssl system libraries that are present on my system. Openssl 1.0.1p is installed in /usr/local/ssl. As you can see that path is what is referenced in the failing libtool command.

Yet when I run ldd on a successfully compiled upsd, I get

-bash-3.2$ ldd upsd
        libwrap.so.1 =>  /usr/sfw/lib/libwrap.so.1
        libssl.so.0.9.7 =>       /usr/sfw/lib/libssl.so.0.9.7
        libcrypto.so.0.9.7 =>    /usr/sfw/lib/libcrypto.so.0.9.7
        librt.so.1 =>    /usr/lib/librt.so.1
        libsocket.so.1 =>        /usr/lib/libsocket.so.1
        libnsl.so.1 =>   /usr/lib/libnsl.so.1
        libc.so.1 =>     /usr/lib/libc.so.1
        libaio.so.1 =>   /usr/lib/libaio.so.1
        libmd.so.1 =>    /usr/lib/libmd.so.1
        libmp.so.2 =>    /usr/lib/libmp.so.2
        libscf.so.1 =>   /usr/lib/libscf.so.1
        libdoor.so.1 =>  /usr/lib/libdoor.so.1
        libuutil.so.1 =>         /usr/lib/libuutil.so.1
        libgen.so.1 =>   /usr/lib/libgen.so.1
        libssl_extra.so.0.9.7 =>         /usr/sfw/lib/libssl_extra.so.0.9.7
        libcrypto_extra.so.0.9.7 =>      /usr/sfw/lib/libcrypto_extra.so.0.9.7
        libm.so.2 =>     /usr/lib/libm.so.2

I’m expecting references to /usr/local/ssl, not /usr/sfw/lib. At every turn, I’ve taken care to specify the ssl path as /usr/local/ssl.
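The check described above can be done mechanically. The following is a minimal sketch in Python (not part of NUT; the helper names `parse_ldd` and `libs_outside` are made up for illustration) that reads `ldd` output and reports any libssl/libcrypto entry that did not resolve under the expected prefix. The sample lines are copied from the output above.

```python
def parse_ldd(output):
    """Map each library name to the path ldd resolved it to."""
    resolved = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue
        name, _, path = line.partition("=>")
        resolved[name.strip()] = path.strip()
    return resolved


def libs_outside(resolved, wanted_prefix, name_prefixes=("libssl", "libcrypto")):
    """Return the SSL-related libraries that did NOT resolve under wanted_prefix."""
    return {
        name: path
        for name, path in resolved.items()
        if name.startswith(name_prefixes) and not path.startswith(wanted_prefix)
    }


# A few lines taken from the ldd output in this comment.
sample = """\
        libwrap.so.1 =>  /usr/sfw/lib/libwrap.so.1
        libssl.so.0.9.7 =>       /usr/sfw/lib/libssl.so.0.9.7
        libcrypto.so.0.9.7 =>    /usr/sfw/lib/libcrypto.so.0.9.7
        librt.so.1 =>    /usr/lib/librt.so.1
"""

for name, path in libs_outside(parse_ldd(sample), "/usr/local/ssl/").items():
    print(f"{name} resolved to {path}")
```

Anything this prints is a library the binary picks up from somewhere other than the intended OpenSSL install.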

I have the following ssl packages on my system. I’m hesitant to remove the system ones.

pkginfo |grep -i ssl
application OpenSSL_1-0-1p-fips209           OpenSSL
system      SUNWopenssl-commands             OpenSSL Commands (Usr)
system      SUNWopenssl-include              OpenSSL Header Files
system      SUNWopenssl-libraries            OpenSSL Libraries (Usr)
system      SUNWopenssl-man                  OpenSSL Manual Pages
system      SUNWopensslr                     OpenSSL (Root)

If this is the issue, I don’t know how to fix it. My configure command uses the ssl flags to tell it where to find it (as evidenced above). My PKG_CONFIG_PATH has /usr/local/ssl first.

I do note that the nss package is installed on our working RHEL server, whereas nss is not installed on the Solaris side. Our IA guy says that installing nss is not an acceptable solution.

(trimmed)

@clepple
Member

clepple commented Dec 22, 2015

With certificates enabled, I ran a Solaris upsc against a RHEL upsd. I did not receive a handshake failure. Rather it said connected with ssl.

Thanks, this is a good sanity check on the client side of the Solaris SSL library.

I recompiled 2.7.3 with the change you suggested in netssl.c. Change TLSv1 to TLSv2 in:
if ((ssl_method = TLSv1_server_method()) == NULL)
This failed to compile. It wasn’t able to identify the function TLSv2_server_method.

Not quite "TLSv2":

TLSv1_2_server_method()

The name "TLSv1_2" comes from "TLS v1.2", but it is munged into a C-compatible namespace.

libtool: link: gcc -I../include -I/usr/include:/usr/local/ssl/include/openssl -g -O2 -Wall -Wsign-compare -D_REENTRANT -o upsd upsd.o user.o conf.o netssl.o sstate.o desc.o netget.o netmisc.o netlist.o netuser.o netset.o netinstcmd.o ../common/.libs/libcommon.a ../common/.libs/libparseconf.a -lwrap -L/usr/lib:/usr/sfw/lib:/usr/local/ssl/lib -lssl -lcrypto -lrt -lsocket -lnsl
[...]

I am wondering if NUT is linking to the older ssl system libraries that are present on my system. Openssl 1.0.1p is installed in /usr/local/ssl. As you can see that path is what is referenced in the failing libtool command.

It's been a while since I used the Solaris compiler/linker, but I think that the "-L" flags are parsed left-to-right (assuming they don't need to be split up into multiple -L arguments: "-L/usr/local/ssl/lib -L/usr/sfw/lib") so it might be searching /usr/sfw/lib first.
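To illustrate the "split up into multiple -L arguments" point: as far as I know, gcc treats the value after `-I` or `-L` as a single directory name, so a colon-joined list is not split the way `PATH` or `LD_LIBRARY_PATH` would be. Below is a small Python sketch (the helper `split_search_flag` is hypothetical, just for illustration) that turns a colon-joined list into one flag per directory, preserving the order so the preferred directory stays first.

```python
def split_search_flag(flag, value):
    """Turn 'dir1:dir2' into ['<flag>dir1', '<flag>dir2'], keeping order."""
    dirs = [d for d in value.split(":") if d]
    return [flag + d for d in dirs]


# Preferred OpenSSL install first, system directories after it.
libs = split_search_flag("-L", "/usr/local/ssl/lib:/usr/sfw/lib:/usr/lib")
print(" ".join(libs + ["-lssl", "-lcrypto"]))
```

The resulting string is the kind of value that could be passed to `--with-openssl-libs`; whether the linker then prefers the static archives in /usr/local/ssl/lib over the shared 0.9.7 libraries still needs to be confirmed with ldd.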

@clepple
Member

clepple commented Dec 22, 2015

https://en.wikipedia.org/wiki/OpenSSL says that TLSv1.2 was introduced in 1.0.1, so if the /usr/sfw/lib version is older, that makes sense. You might not even need to change the TLSv1_2_server_method() call, just rearrange the library search order.

@tmcneill30
Author

tmcneill30 commented Dec 23, 2015

Thanks.

I did figure out how to re-order the Linking directories as you can see below. Unfortunately the compile still fails in the same way.

libtool: link: gcc -I../include -I/usr/local/ssl/include/openssl:/usr/sfw/include/ssl:/usr/include -g -O2 -Wall -Wsign-compare -D_REENTRANT -o upsd upsd.o user.o conf.o netssl.o sstate.o desc.o netget.o netmisc.o netlist.o netuser.o netset.o netinstcmd.o  ../common/.libs/libcommon.a ../common/.libs/libparseconf.a -lwrap -L/usr/local/ssl/lib:/usr/lib:/usr/sfw/lib -lssl -lcrypto -lrt -lsocket -lnsl
Undefined                       first referenced
symbol                             in file
TLSv1_2_server_method               netssl.o
ld: fatal: symbol referencing errors. No output written to upsd
collect2: ld returned 1 exit status
*** Error code 1
make: Fatal error: Command failed for target `upsd'
Current working directory /export/home/pkgbuild/src/nut/nut-2.7.3/server

This surprised me since it is supposed to be finding the newer version of openssl. I haven’t been able to locate any mechanism for it secretly looking elsewhere for header files and libraries.

I have also noticed that the upsd binary looks specifically for libcrypto.so.0.9.7 and libssl.so.0.9.7, whereas the newer variants are in libcrypto.a and libssl.a located in /usr/local/ssl/lib. But it doesn’t want these. I have noticed it is not trying the /usr/local/ssl/lib path. This path is not getting hard coded into upsd. I have tried everything from environment variables to C flags to change this but with no luck. But I don’t think that is relevant at this point because it’s specifically wanting libcrypto.so.0.9.7.

ldd -s upsd|more

   find object=libwrap.so.1; required by upsd
    search path=/usr/ccs/lib:/lib:/usr/lib:/usr/sfw/lib:/usr/local/ssl/lib  (RUNPATH/RPATH from file upsd)
    trying path=/usr/ccs/lib/libwrap.so.1
    trying path=/lib/libwrap.so.1
    trying path=/usr/lib/libwrap.so.1
    trying path=/usr/sfw/lib/libwrap.so.1
        libwrap.so.1 =>  /usr/sfw/lib/libwrap.so.1

   find object=libssl.so.0.9.7; required by upsd
    search path=/usr/ccs/lib:/lib:/usr/lib:/usr/sfw/lib:/usr/local/ssl/lib  (RUNPATH/RPATH from file upsd)
    trying path=/usr/ccs/lib/libssl.so.0.9.7
    trying path=/lib/libssl.so.0.9.7
    trying path=/usr/lib/libssl.so.0.9.7
    trying path=/usr/sfw/lib/libssl.so.0.9.7
        libssl.so.0.9.7 =>       /usr/sfw/lib/libssl.so.0.9.7

   find object=libcrypto.so.0.9.7; required by upsd
    search path=/usr/ccs/lib:/lib:/usr/lib:/usr/sfw/lib:/usr/local/ssl/lib  (RUNPATH/RPATH from file upsd)
    trying path=/usr/ccs/lib/libcrypto.so.0.9.7
    trying path=/lib/libcrypto.so.0.9.7
    trying path=/usr/lib/libcrypto.so.0.9.7
    trying path=/usr/sfw/lib/libcrypto.so.0.9.7
        libcrypto.so.0.9.7 =>    /usr/sfw/lib/libcrypto.so.0.9.7

So it’s clear that my upsd is using the 0.9.7 version of ssl. If this is the case, why is it failing if it’s meant to work with older versions of openssl?

I was wondering if the problem was that it was including header files from 1.0.1p and then trying to work with 0.9.7. So I stripped out every reference to /usr/local/ssl/lib from every flag and environment variable. I then installed my package with my certs. Sadly the upsc and upsmon connections still failed. So this appears not to work with ssl 0.9.7, which would rule out the problem being that ssl is too new. Remember these are the very cert files and config file settings that succeed with my linux package.

(trimmed)

@clepple
Copy link
Member

clepple commented Dec 23, 2015

search path=/usr/ccs/lib:/lib:/usr/lib:/usr/sfw/lib:/usr/local/ssl/lib (RUNPATH/RPATH from file upsd)

Something is still putting /usr/sfw/lib first in RUNPATH/RPATH, though.
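A rough model of what the `ldd -s` trace shows, as a Python sketch (the function `first_match` and the set of "existing" files are assumptions for illustration, not a reimplementation of the Solaris runtime linker): the directories are tried left to right for the exact name recorded in the binary, and the first hit wins.

```python
def first_match(search_path, soname, existing_files):
    """Return the first <dir>/<soname> that exists, walking search_path in order."""
    for directory in search_path.split(":"):
        candidate = directory.rstrip("/") + "/" + soname
        if candidate in existing_files:
            return candidate
    return None


# Search path copied from the ldd -s output quoted above.
runpath = "/usr/ccs/lib:/lib:/usr/lib:/usr/sfw/lib:/usr/local/ssl/lib"

# Assumed file layout: only the system 0.9.7 shared objects exist, and
# /usr/local/ssl/lib holds static archives (libssl.a, libcrypto.a) only.
files = {
    "/usr/sfw/lib/libssl.so.0.9.7",
    "/usr/sfw/lib/libcrypto.so.0.9.7",
}

print(first_match(runpath, "libssl.so.0.9.7", files))
```

Note that the binary asks for `libssl.so.0.9.7` by name. That name was recorded at link time, so reordering the run-time path alone would not make it load a 1.0.1 library; the link step has to pick up the newer library first.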

So it’s clear that my upsd is using the 0.9.7 version of ssl. If this is the case, why is it failing if it’s meant to work with older versions of openssl?

The problem is that while there is a window of OpenSSL versions where we can expect that things would work (i.e. NUT uses TLS v1.0 support, which apparently hasn't been deprecated in OpenSSL yet - but NUT's SSL code predates the introduction of TLS v1.2), that doesn't mean that it will automatically work once those conditions are satisfied.

I'm willing to believe that there is a bug in NUT's OpenSSL implementation (see also #202), but it works in Linux, so I suspect a subtle difference in Solaris system call semantics (or possibly in the versions you are linking against - but I think we are close to pinning that down).

At this point, some more runtime log information would be helpful. I think you sent some client-side information: what about on the server side? Can you sanitize some of the packet captures?

@tmcneill30
Author

Here is the debug output from Solaris upsd when a connection is attempted by (.150) upsmon

./upsd -DD

25.954192 write: [destfd=6] [len=12] [OK STARTTLS]
25.956692 Unknown return value from SSL_accept: Resource temporarily unavailable
25.956883 ssl_error() ret=-1 SSL_ERROR_WANT_READ
25.957055 mainloop: polling 2 filedescriptors
25.957510 Disconnect 192.168.72.150 (read failure): No such file or directory
25.959629 Disconnect from 192.168.72.150

Debug output from upsmon trying to connect to upsd (.121)

./upsmon -DD

Network UPS Tools upsmon 2.7.3
kill: No such process
0.000000 UPS: power0@192.168.72.121 (master) (power value 1)
0.000117 Using power down flag file /usr/local/ups/etc/killpower
0.000126 Using power down flag file /var/state/ups/killpower
0.000258 debug level is '2'
0.012876 Trying to connect to UPS [power0@192.168.72.121]
0.014393 SSL_connect do not accept handshake.: Success
0.014408 ssl_error() EOF from client
0.014414 Can not connect to 192.168.72.121 in SSL, disconnect
0.014574 UPS [power0@192.168.72.121]: connect failed: SSL error: error:1409E0E5:SSL routines:SSL3_WRITE_BYTES:ssl handshake failure
0.014587 do_notify: ntype 0x0005 (COMMBAD)
0.014592 Communications with UPS power0@192.168.72.121 lost

Attached are the 6 packets generated by upsmon’s attempt to connect with upsd. The packet sniffer on Solaris 10 is snoop, which is a variant of tcpdump. The proper way to read the output is to save it to a file and then read that file with the snoop program. I have output the payload of each packet to the attached txt and docx files. All the data (header and payload) are contained in the binary file, which can be read in wireshark, or with snoop or tcpdump if you have access to them. To read the payloads, the snoop command is “snoop -i binary-version -x0”; for headers, it is “snoop -i binary-version -V”.

Question: When my Linux upsmon is able to connect locally to upsd, why does it say Certificate Verification is Disabled?

0.003453 Trying to connect to UPS [power0@localhost]
0.019824 Connected to localhost in SSL
0.019845 Certificate verification is disabled
0.020491 Logged into UPS power0@localhost

As an additional help I’ve included the tcpdump file for the successful linux connection. It’s upsmon connecting to upsd over 127.0.0.1:3493. It’s called Linux-success. I can see the certificate being passed over, and my password is no longer visible.

Tcpdump file can be viewed in wireshark.

One final note: My Solaris upsmon is now failing to connect to the Linux upsd. However it gets a bit farther in the handshake.

./upsmon -D

0.014075 Trying to connect to UPS [power0@159.62.74.88]
0.022129 Unknown return value from SSL_connect -1: Error 0
0.023805 ssl_error() ret=-1 SSL_ERROR 1
0.025522 Can not connect to 159.62.74.88 in SSL, disconnect
0.027593 UPS [power0@159.62.74.88]: connect failed: SSL error: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
0.029192 Communications with UPS power0@159.62.74.88 lost
Broadcast Message from root (pts/5) on JRE-SVR10168 Wed Dec 23 23:29:16...
Communications with UPS power0@159.62.74.88 lost
5.045691 Trying to connect to UPS [power0@159.62.74.88]
5.048924 Unknown return value from SSL_connect -1: Error 0
5.048998 ssl_error() ret=-1 SSL_ERROR 1
5.049106 Can not connect to 159.62.74.88 in SSL, disconnect
5.049581 UPS [power0@159.62.74.88]: connect failed: SSL error: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
5.049649 UPS power0@159.62.74.88 is unavailable

./upsd -D

9.214212 SSL_accept do not accept handshake.: Success
9.214264 ssl_error() ret=0 SSL_ERROR 1
9.214288 ssl_debug: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca

(trimmed)

1 0.00000 192.168.72.150 -> JRE-SVR10168 TCP D=3493 S=42979 Syn Seq=1048990410 Len=0 Win=14600 Options=<mss 1460,sackOK,tstamp 2517451091 0,nop,wscale 7>

   0: 0050 56b4 6f47 0050 56b4 542d 0800 4500    .PV.oG.PV.T-..E.
  16: 003c 1399 4000 4006 14c3 c0a8 4896 c0a8    .<..@.@.....H...
  32: 4879 a7e3 0da5 3e86 52ca 0000 0000 a002    Hy....>.R.......
  48: 3908 de5d 0000 0204 05b4 0402 080a 960d    9..]............
  64: 4153 0000 0000 0103 0307                   AS........

2 0.00030 192.168.72.150 -> JRE-SVR10168 TCP D=3493 S=42979 Ack=1553768145 Seq=1048990411 Len=0 Win=115 Options=<nop,nop,tstamp 2517451091 500688583>

   0: 0050 56b4 6f47 0050 56b4 542d 0800 4500    .PV.oG.PV.T-..E.
  16: 0034 139a 4000 4006 14ca c0a8 4896 c0a8    .4..@.@.....H...
  32: 4879 a7e3 0da5 3e86 52cb 5c9c 9ed1 8010    Hy....>.R.\.....
  48: 0073 45a2 0000 0101 080a 960d 4153 1dd7    .sE.........AS..
  64: e6c7                                       ..

3 0.00000 192.168.72.150 -> JRE-SVR10168 TCP D=3493 S=42979 Push Ack=1553768145 Seq=1048990411 Len=9 Win=115 Options=<nop,nop,tstamp 2517451091 500688583>

   0: 0050 56b4 6f47 0050 56b4 542d 0800 4500    .PV.oG.PV.T-..E.
  16: 003d 139b 4000 4006 14c0 c0a8 4896 c0a8    .=..@.@.....H...
  32: 4879 a7e3 0da5 3e86 52cb 5c9c 9ed1 8018    Hy....>.R.\.....
  48: 0073 0643 0000 0101 080a 960d 4153 1dd7    .s.C........AS..
  64: e6c7 5354 4152 5454 4c53 0a                ..STARTTLS.

4 0.00050 192.168.72.150 -> JRE-SVR10168 TCP D=3493 S=42979 Ack=1553768157 Seq=1048990420 Len=0 Win=115 Options=<nop,nop,tstamp 2517451092 500688584>

   0: 0050 56b4 6f47 0050 56b4 542d 0800 4500    .PV.oG.PV.T-..E.
  16: 0034 139c 4000 4006 14c8 c0a8 4896 c0a8    .4..@.@.....H...
  32: 4879 a7e3 0da5 3e86 52d4 5c9c 9edd 8010    Hy....>.R.\.....
  48: 0073 458b 0000 0101 080a 960d 4154 1dd7    .sE.........AT..
  64: e6c8                                       ..

5 0.00016 192.168.72.150 -> JRE-SVR10168 TCP D=3493 S=42979 Push Ack=1553768157 Seq=1048990420 Len=155 Win=115 Options=<nop,nop,tstamp 2517451092 500688584>

   0: 0050 56b4 6f47 0050 56b4 542d 0800 4500    .PV.oG.PV.T-..E.
  16: 00cf 139d 4000 4006 142c c0a8 4896 c0a8    ....@.@..,..H...
  32: 4879 a7e3 0da5 3e86 52d4 5c9c 9edd 8018    Hy....>.R.\.....
  48: 0073 7284 0000 0101 080a 960d 4154 1dd7    .sr.........AT..
  64: e6c8 1603 0100 9601 0000 9203 0156 7afa    .............Vz.
  80: 4f33 3a7d 40e3 eca9 7474 3731 b182 e52c    O3:}@...tt71...,
  96: c278 b9f2 62de b196 3974 fcb7 4800 004c    .x..b...9t..H..L
 112: c014 c00a 0039 0038 0088 0087 c00f c005    .....9.8........
 128: 0035 0084 c013 c009 0033 0032 c012 c008    .5.......3.2....
 144: 009a 0099 0045 0044 0016 0013 c00e c004    .....E.D........
 160: c00d c003 002f 0096 0041 000a 0007 c011    ...../...A......
 176: c007 c00c c002 0005 0004 00ff 0100 001d    ................
 192: 000b 0004 0300 0102 000a 0008 0006 0019    ................
 208: 0018 0017 0023 0000 000f 0001 01           .....#.......

6 0.00044 192.168.72.150 -> JRE-SVR10168 TCP D=3493 S=42979 Fin Ack=1553768158 Seq=1048990575 Len=0 Win=115 Options=<nop,nop,tstamp 2517451093 500688584>

   0: 0050 56b4 6f47 0050 56b4 542d 0800 4500    .PV.oG.PV.T-..E.
  16: 0034 139e 4000 4006 14c6 c0a8 4896 c0a8    .4..@.@.....H...
  32: 4879 a7e3 0da5 3e86 536f 5c9c 9ede 8011    Hy....>.So\.....
  48: 0073 44ed 0000 0101 080a 960d 4155 1dd7    .sD.........AU..
  64: e6c8                                       ..
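For anyone reading the hex dump by hand: the TLS payload of packet 5 starts at offset 66, right after the Ethernet, IP and TCP headers. The Python sketch below (helper name `parse_client_hello_prefix` is made up for illustration) decodes the 5-byte record header and the first bytes of the handshake message from that packet.

```python
import struct


def parse_client_hello_prefix(payload):
    """Decode the 5-byte TLS record header and the start of the handshake message."""
    content_type, rec_major, rec_minor, rec_len = struct.unpack("!BBBH", payload[:5])
    return {
        "content_type": content_type,            # 22 (0x16) means handshake
        "record_version": (rec_major, rec_minor),
        "record_length": rec_len,
        "handshake_type": payload[5],            # 1 means ClientHello
        "handshake_length": int.from_bytes(payload[6:9], "big"),
        "client_version": (payload[9], payload[10]),
    }


# First 11 payload bytes of packet 5, copied from offset 66 of the dump.
prefix = bytes.fromhex("16 03 01 00 96 01 00 00 92 03 01")
print(parse_client_hello_prefix(prefix))
```

The record length of 150 plus the 5-byte header matches the Len=155 that snoop reports for packet 5, and version (3, 1) is TLS 1.0. So the client side appears to send a well-formed ClientHello immediately after STARTTLS, and the next packet in the capture is the client's FIN.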

@tmcneill30
Author

Attached is truss output for upsd. I ran a upsc power0@localhost and captured how upsd reacted.

Other than seeing STARTTLS in there, it doesn’t mean much to me.

@clepple
Member

clepple commented Dec 24, 2015

Here is the debug output from Solaris upsd when a connection is attempted by (.150) upsmon

./upsd -DD

25.954192 write: [destfd=6] [len=12] [OK STARTTLS]
25.956692 Unknown return value from SSL_accept: Resource temporarily unavailable
25.956883 ssl_error() ret=-1 SSL_ERROR_WANT_READ

Ah, this does look like issue #202

What confuses me is that the code should not get that error there - the file descriptor comes from select(), so it should be ready to read. (I think a short read is a different error code.)
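For context on why "want read" can show up even when the descriptor was readable: a TLS handshake takes several round trips, so on a non-blocking socket the library can consume what has arrived and then report that it needs more before it can continue. The sketch below uses Python's `ssl` module with in-memory BIOs purely to illustrate that behaviour; it is not NUT's code, and the function name is made up. In the C API the equivalent signal is `SSL_get_error()` returning `SSL_ERROR_WANT_READ`.

```python
import ssl


def start_client_handshake():
    """Begin a TLS handshake with no peer data available yet.

    Returns (wants_read, first_flight): whether the library asked for more
    input, and the bytes it queued for sending (the ClientHello).
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False          # illustration only: no verification
    ctx.verify_mode = ssl.CERT_NONE
    incoming = ssl.MemoryBIO()          # data "from the network" (empty here)
    outgoing = ssl.MemoryBIO()          # data the library wants to send
    tls = ctx.wrap_bio(incoming, outgoing, server_side=False)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Not a failure: the handshake is simply waiting for the peer's reply.
        wants_read = True
    else:
        wants_read = False
    return wants_read, outgoing.read()


wants_read, first_flight = start_client_handshake()
print(wants_read, hex(first_flight[0]))
```

The usual handling is to go back to the poll loop and call the handshake function again when more data arrives, rather than treating the condition as an error and disconnecting.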

I will try to look at this over the holidays.

Attached are the 6 packets generated by upsmon’s attempt to connect with upsd.

This might not be necessary, but note that none of your attachments are getting through to Github, only the text of your reply: #246 (comment)

Question: When my Linux upsmon is able to connect locally to upsd, why does it say Certificate Verification is Disabled?

https://github.com/networkupstools/nut/blob/master/clients/upsclient.c#L1041 and search for CERTVERIFY here: http://www.networkupstools.org/docs/man/upsmon.conf.html

clepple added a commit that referenced this issue Dec 30, 2015
@clepple
Member

clepple commented Dec 30, 2015

@tmcneill30 please try this patch: f098486

@aquette does the Ubuntu QA test cover SSL? Can you please run it with this patch to make sure I didn't break anything? The SSL_accept fix mostly targets the OpenSSL side, but there is one case for a zero-length read.

@tmcneill30
Author

Thanks for the patch. However it doesn’t compile.

The problem is a change you made in upsd.c, namely,

upsdebugx(2, "write: [destfd=%d] [len=%d] [%s]", client->sock_fd, len, str_rtrim(ans, '\n'));

This line generates a symbol not found error.
Undefined first referenced
symbol in file
str_rtrim upsd.o
ld: fatal: symbol referencing errors. No output written to upsd

Did you mean str_trim? I say this because I don’t get any google hits for str_rtrim, but I do for str_trim. The str_rtrim function call doesn’t exist in the original version of upsd.c.

Tom

(trimmed)

@tmcneill30
Author

Compilation attempts with str_trim and strtrim also fail. I’m not a C developer hence I’m just guessing with these function names.

(trimmed)

@clepple
Member

clepple commented Jan 7, 2016

The problem is a change you made in upsd.c, namely,

upsdebugx(2, "write: [destfd=%d] [len=%d] [%s]", client->sock_fd, len, str_rtrim(ans, '\n'));

I don't see that particular line in the patch. Are you grabbing the whole source tree from that revision? (I was thinking that you might want to minimize the number of lines changed from the base 2.7.3 version - the patch should be standalone, although I haven't tried that yet.)

If you are using the whole Git source tree at f098486, you might need to re-run autogen.sh and configure, since a few files have been added since then (in particular, common/str.c has str_rtrim() and company).

@clepple
Member

clepple commented Jan 7, 2016

the patch should be standalone

Confirmed that it builds on OS X.

If you download this, you can apply it to a 2.7.3 tree with "patch -p1 < f098486.diff":

https://github.com/networkupstools/nut/commit/f098486803921b8899d4172fff97846428d82c12.diff

@aquette
Member

aquette commented Jan 7, 2016

@clepple : sadly no, the Ubuntu QA test (also available in Debian) doesn't cover SSL yet. That was planned for once NSS was compiled in (which has been the case since 2.7.1-1).

@tmcneill30
Author

I had copied the entire netssl.c and upsd.c files from GitHub. I am not using git but rather the source tar for version 2.7.3. That was brain lock on my part, plugging in files from HEAD. Thanks for solving that for me.

Now that I have patched the 2.7.3 files instead, the compile succeeds. Unfortunately the ssl connection is still broken.

Here is what the debug output was:

-bash-3.2# ./upsmon -D
Network UPS Tools upsmon 2.7.3
kill: No such process
0.000000 UPS: power0@localhost (master) (power value 1)
0.001277 UPS: ups0@localhost (master) (power value 1)
0.002841 Using power down flag file /var/state/ups/killpower
0.099623 debug level is '1'
0.111095 Trying to connect to UPS [power0@localhost]
0.120900 Unknown return value from SSL_connect -1: Error 0
0.121987 ssl_error() ret=-1 SSL_ERROR 1
0.123122 Can not connect to localhost in SSL, disconnect
0.124566 UPS [power0@localhost]: connect failed: SSL error: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
0.125569 Communications with UPS power0@localhost lost
0.127137 Trying to connect to UPS [ups0@localhost]
0.135478 Unknown return value from SSL_connect -1: Error 0
0.136565 ssl_error() ret=-1 SSL_ERROR 1
0.137583 Can not connect to localhost in SSL, disconnect
0.139024 UPS [ups0@localhost]: connect failed: SSL error: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
0.140222 Communications with UPS ups0@localhost lost

bash-3.2# ./upsd -D
Network UPS Tools upsd 2.7.3
0.000000 fopen /var/state/ups/upsd.pid: No such file or directory
0.003991 listening on 0.0.0.0 port 3493
0.006187 /var/state/ups is world readable
0.008549 Can't connect to UPS power0: No such file or directory
24.935220 ssl_error() ret=-1 SSL_ERROR_WANT_READ
24.938237 ssl_error() ret=-1 SSL_ERROR_WANT_READ
24.940071 ssl_error() ret=-1 SSL_ERROR 1
24.940325 ssl_debug: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
24.940454 ssl_debug: error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure
24.949531 ssl_error() ret=-1 SSL_ERROR_WANT_READ
24.951322 ssl_error() ret=-1 SSL_ERROR_WANT_READ
24.954868 ssl_error() ret=-1 SSL_ERROR 1
24.955083 ssl_debug: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
24.955271 ssl_debug: error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure
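The hex codes in these messages can be unpacked by hand. The sketch below assumes the error-code layout used by OpenSSL 0.9.x/1.0.x (8 bits of library, 12 bits of function, 12 bits of reason, with reasons from 1000 up encoding a TLS alert received from the peer); newer OpenSSL releases pack codes differently. The function name is made up for illustration.

```python
def unpack_openssl_error(code_hex):
    """Split an OpenSSL 1.0.x-style error code into (lib, func, reason, alert).

    Assumed layout: lib in bits 24-31, func in bits 12-23, reason in bits 0-11.
    Reasons 1000..1255 are taken to mean "alert number (reason - 1000) was
    received from the peer"; alert is None otherwise.
    """
    code = int(code_hex, 16)
    lib = (code >> 24) & 0xFF
    func = (code >> 12) & 0xFFF
    reason = code & 0xFFF
    alert = reason - 1000 if 1000 <= reason <= 1255 else None
    return lib, func, reason, alert


for code in ("14094418", "140940E5", "14090086"):
    print(code, unpack_openssl_error(code))
```

For 14094418 the alert number comes out as 48, which is unknown_ca in the TLS alert list. Read that way, upsd is reporting an alert it received from the client, which would mean upsmon could not build a trust chain for the server certificate. That points at the client-side CERTPATH rather than at upsd, though this is an interpretation of the logs and not something confirmed in this thread.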

(trimmed)

@clepple
Member

clepple commented Jan 8, 2016

24.940325 ssl_debug: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
24.940454 ssl_debug: error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure

Is the CA certificate (and any intermediates, if applicable) included in CERTFILE?

@tmcneill30
Author

Yes. I created the certificates following the instructions listed in section 9.5 of the NUT documentation.
http://www.networkupstools.org/docs/user-manual.chunked/ar01s09.html#_configuring_ssl

Keep in mind that after moving the exact same certificate files to my RHEL server, I see no errors in debug mode, and I confirmed through packet sniffing that the connection between a local upsmon and upsd was indeed encrypted.

I rechecked my config files for my Solaris test and upsmon.conf and upsd.conf have the proper settings.

(trimmed)

@clepple
Member

clepple commented Jan 9, 2016

I wonder if the "unknown ca" is really an expired certificate. At one point you mentioned that Solaris upsmon couldn't connect to the RHEL upsd. I don't know when the validity checks occur in OpenSSL. (The NUT instructions do mention that the default command lines will create certificates that expire in 30 days.)
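To rule expiry in or out, the certificate's end date can be checked directly. This is a minimal sketch, assuming an `openssl` binary that supports the `x509 -checkend` option; the helper names are illustrative.

```shell
# Hypothetical helper: succeed only if the certificate in "$1" has not yet
# expired. A file that openssl cannot parse also counts as a failure.
cert_is_current() {
    # -checkend 0 exits non-zero if the certificate has already expired.
    openssl x509 -checkend 0 -noout -in "$1" >/dev/null 2>&1
}

# Print the notAfter date so a human can read it.
cert_end_date() {
    openssl x509 -noout -enddate -in "$1"
}

# Example (the path is an assumption):
#   cert_is_current /usr/local/ups/etc/upsd.pem || echo "certificate expired"
```

Note that this only looks at the expiry date; it does not validate the chain or the start date.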

(Also, when you reply by email, please trim off the previous text. GitHub apparently does not remove that automatically.)

@tmcneill30
Author

Good catch on the expired certificates, but sadly, after thorough re-testing, I'm still getting the same message:

-bash-3.2# ./upsmon -D
Network UPS Tools upsmon 2.7.3
kill: No such process
0.000000 UPS: power0@localhost (master) (power value 1)
0.000134 UPS: ups0@localhost (master) (power value 1)
0.000360 Using power down flag file /var/state/ups/killpower
0.000731 debug level is '1'
0.008470 Trying to connect to UPS [power0@localhost]
0.015578 Unknown return value from SSL_connect -1: Error 0
0.015709 ssl_error() ret=-1 SSL_ERROR 1
0.015904 Can not connect to localhost in SSL, disconnect
0.016365 UPS [power0@localhost]: connect failed: SSL error: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
0.016428 Communications with UPS power0@localhost lost
0.052170 Trying to connect to UPS [ups0@localhost]
0.057136 Unknown return value from SSL_connect -1: Error 0
0.058491 ssl_error() ret=-1 SSL_ERROR 1
0.059749 Can not connect to localhost in SSL, disconnect
0.061418 UPS [ups0@localhost]: connect failed: SSL error: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
0.062690 Communications with UPS ups0@localhost lost

My retest consisted of:
Creating new unexpired certificates.
Testing them on my rhel system. They worked. (RHEL was throwing bad certificate messages on the old expired certificates).
Re-downloaded the 2.7.3 source tar file.
Re-applied your patch. Manually checked to make sure patch was truly applied to upsd.c and netssl.c.
Re-made the package and installed it on my Solaris system.
Thoroughly checked file and directory permissions on the certificates. Verified config files were set properly.

In summary, Solaris upsmon fails with above tracing, RHEL works fine with exact same certs.
If you need any further info, let me know.

@clepple
Member

clepple commented Jan 12, 2016

0.008470 Trying to connect to UPS [power0@localhost]
0.015578 Unknown return value from SSL_connect -1: Error 0
0.015709 ssl_error() ret=-1 SSL_ERROR 1

My mistake, it's the same blocking/non-blocking error on the client side.

This commit should be added on top of the previous one: 213ee3d

However, I can't really test that here: OS X does not have the non-blocking issue on the client side. It is no worse with this patch, though.

@tmcneill30
Author

I added your upsclient.c patch to my build. Verified patch succeeded. Re-made package and re-installed.

I still get the below tracing when upsmon wants to connect locally with upsd.

-bash-3.2# ./upsmon -D
Network UPS Tools upsmon 2.7.3
kill: No such process
0.000000 UPS: power0@localhost (master) (power value 1)
0.000147 UPS: ups0@localhost (master) (power value 1)
0.000408 Using power down flag file /var/state/ups/killpower
0.000747 debug level is '1'
0.010818 Trying to connect to UPS [power0@localhost]
0.018496 Unknown return value from SSL_connect -1: Error 0
0.018595 ssl_error() ret=-1 SSL_ERROR 1
0.018698 Can not connect to localhost in SSL, disconnect
0.019150 UPS [power0@localhost]: connect failed: SSL error: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
0.019227 Communications with UPS power0@localhost lost
0.038963 Trying to connect to UPS [ups0@localhost]
0.043828 Unknown return value from SSL_connect -1: Error 0
0.045208 ssl_error() ret=-1 SSL_ERROR 1
0.046500 Can not connect to localhost in SSL, disconnect
0.048163 UPS [ups0@localhost]: connect failed: SSL error: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
0.049529 Communications with UPS ups0@localhost lost

@clepple
Member

clepple commented Jan 13, 2016

You might need to check with the OpenSSL developers, or a Solaris-specific forum.

@rgc2000

rgc2000 commented Jan 18, 2016

Hi,
Please can you try this very simple patch against nut-2.7.3. It will disable non-blocking sockets on Solaris for server (upsd). Then tell me if it works for you.

nut-2.7.3.solaris_ssl.patch.gz

@clepple
Member

clepple commented Jan 18, 2016

It will disable non-blocking sockets on Solaris for server (upsd).

René, thanks for posting this, but isn't this patch for the server listening socket? With the commits mentioned earlier in this issue, I think we have the server's client FD retrying. (Note that since the server is single-threaded, it shouldn't block, but the client API was written with the expectation that calls will block.)

Were you able to reproduce the error in upsmon or upsc?

@tmcneill30
Author

f098486.patch.txt
nut-2.7.3.solaris_ssl.patch.txt
patched-nut-2.7.3.tar.txt

Pardon the .txt appended to each filename. It's my first time using GitHub and it says it won't accept .tar or anything Linux. For some reason it refused the zip file I made even though that is supposed to be acceptable (file size limitation?). The zip was only 8.5 MB.

Anyway, I've included the 2 patch files I made: one with your patch and the other from clepple. I included my patched source (just to have you double-check that my source was properly patched). I also included my package that was derived from that. (I have a build script that creates the package.) My NUT compile relies on my openssl package, which installs in /usr/local/ssl. That package is included.

Again here is the configure command I used. Note lack of --with-wrap.
./configure --with-ssl --with-openssl --with-logfacility=LOG_DAEMON --with-openssl-includes=-I/usr/local/ssl/include:/usr/include --with-openssl-libs='-L/usr/local/ssl/lib:/usr/lib -lssl -lcrypto' --with-group=ups --with-user=root --with-snmp-libs=/usr/sfw/lib --with-snmp=no --with-usb=no

then the usual make && make install

The package includes the certificates and config files I am using. (If I get this working, I'll re-make my certificate) It also installs /etc/init.d/ups_nut.init. I haven't taken the time yet to move it from legacy to SMF.

Thank you for looking at these. Let me know if you need anything further.

@rgc2000

rgc2000 commented Jan 23, 2016

There is a problem with your compilation options: your binaries are not linked with your OpenSSL but with the old system OpenSSL located in /usr/sfw/lib.

The SSL connections work but the server certificate check fails. If you set CERTVERIFY 0 in upsmon.conf, everything works fine.

I'm still investigating why the certificate check fails. You are using a self-signed certificate for your server. I tried with a root CA and a server cert, and it fails too when using your binaries. It works with my binaries, but I'm not using the /usr/sfw OpenSSL libs.

Next step: I will recompile your binaries with your sources and your options, but I will ensure that your OpenSSL is used.

@clepple
Member

clepple commented Jan 23, 2016

Note lack of --with-wrap.

I think this is tangential to the SSL issue, but note that in ./configure option parsing, the lack of an option allows the script to auto-detect the library or feature. You might want --without-wrap or --with-wrap=no.
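The three states (required, disabled, auto-detected) can be made explicit in a build script. The helper below is only a sketch; `wrap_flag` is a made-up name, and only the `--with-wrap` and `--without-wrap` spellings come from this thread.

```shell
# Map a desired libwrap setting to the matching ./configure argument
# (illustrative helper; the function name is not part of NUT).
#   yes  -> require libwrap
#   no   -> disable it explicitly
#   auto -> pass nothing and let configure auto-detect
wrap_flag() {
    case "$1" in
        yes)  echo "--with-wrap" ;;
        no)   echo "--without-wrap" ;;
        auto) : ;;
        *)    echo "usage: wrap_flag yes|no|auto" >&2; return 1 ;;
    esac
}

# Example: ./configure --with-ssl --with-openssl $(wrap_flag no)
```

Spelling the choice out this way avoids mistaking "option omitted" for "feature disabled".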

@rgc2000

rgc2000 commented Jan 24, 2016

SSL test results for nut-2.7.3 and patch f098486 on solaris 10 (32 or 64 bits)

  • without libwrap : SSL connection OK with CERTVERIFY 1, upsc can't check server certificate
  • with libwrap : SSL connection OK with CERTVERIFY 0, otherwise server certificate fails if set to 1 and SSL connection is closed (with error: tlsv1 alert unknown ca). Can't make it work with CERTVERIFY 1. upsc still can't check server certificate.

I guess that upsc can never check server certificates because it has no configuration file to make it search for trusted certificate authorities.

libwrap disturbs server certificate check but it still allows SSL connections.

Tests performed with OpenSSL 1.0.1q, 1.0.2e and the one in the Solaris directory /usr/sfw (0.9.7d).
tcp-wrappers is the one in the Solaris directory /usr/sfw.

@clepple
Member

clepple commented Jan 24, 2016

Is libwrap altering the linker flags?

@rgc2000

rgc2000 commented Jan 25, 2016

You are right. --with-wrap changes the library path order at compile time and thus it always links with the libssl located in /usr/sfw/lib

I managed to use configure options to make it use the libraries I want and now everything works (except server certificate verify with upsc as this is only available with nss, not openssl)

So, tmcneill30, to compile OpenSSL to match your needs:

export LD_RUN_PATH=/usr/local/ups/lib:/usr/local/ssl/lib:/usr/sfw/lib
./Configure -DPIC -fPIC --prefix=/usr/local/ssl --openssldir=/usr/local/ssl/etc zlib-dynamic shared solaris-x86-gcc
make
(then as root)
make MANDIR=/usr/local/ssl/share/man MANSUFFIX=ssl install

Then, to compile NUT with the f098486 patch included:
export LD_RUN_PATH=/usr/local/ups/lib:/usr/local/ssl/lib:/usr/sfw/lib
./configure --with-ssl --with-openssl --with-logfacility=LOG_DAEMON --with-group=ups --with-user=root --with-snmp-libs=/usr/sfw/lib --with-wrap --with-snmp=no --with-usb=no --prefix=/usr/local/ups CFLAGS="-I/usr/local/ssl/include -I/usr/sfw/include" LDFLAGS="-L/usr/local/ssl/lib -L/usr/sfw/lib"
make
(then as root)
make install

You can remove LD_LIBRARY_PATH stuff in your init script, as it will no longer be needed thanks to LD_RUN_PATH set at compile time.

ldd /usr/local/ups/sbin/upsmon
libupsclient.so.4 => /usr/local/ups/lib/libupsclient.so.4
libssl.so.1.0.0 => /usr/local/ssl/lib/libssl.so.1.0.0
libcrypto.so.1.0.0 => /usr/local/ssl/lib/libcrypto.so.1.0.0
librt.so.1 => /lib/librt.so.1
libsocket.so.1 => /lib/libsocket.so.1
libnsl.so.1 => /lib/libnsl.so.1
libc.so.1 => /lib/libc.so.1
libgcc_s.so.1 => /usr/sfw/lib/libgcc_s.so.1
libdl.so.1 => /lib/libdl.so.1
libaio.so.1 => /lib/libaio.so.1
libmd.so.1 => /lib/libmd.so.1
libmp.so.2 => /lib/libmp.so.2
libscf.so.1 => /lib/libscf.so.1
libdoor.so.1 => /lib/libdoor.so.1
libuutil.so.1 => /lib/libuutil.so.1
libgen.so.1 => /lib/libgen.so.1
libm.so.2 => /lib/libm.so.2

Using the Solaris libssl in /usr/sfw/lib makes upsmon fail in certificate verify when compiling with libwrap.
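To check a binary quickly, the `ldd` output can be filtered down to the SSL libraries. This is a sketch only: `check_ssl_libs` is a hypothetical helper, and the `/usr/sfw/lib` warning reflects the situation in this thread rather than a general rule. On Solaris 10 the stock `/usr/bin/awk` may be too old for this script, so `nawk` or `/usr/xpg4/bin/awk` may be needed.

```shell
# Read `ldd` output on stdin and print the resolved path of every libssl or
# libcrypto entry, prefixed with WARN when it resolves under /usr/sfw/lib.
# (Hypothetical helper, not part of NUT.)
check_ssl_libs() {
    awk '$1 ~ /^lib(ssl|crypto)\.so/ && $2 == "=>" {
        tag = ($3 ~ /^\/usr\/sfw\/lib\//) ? "WARN" : "ok"
        print tag, $3
    }'
}

# Example: ldd /usr/local/ups/sbin/upsmon | check_ssl_libs
```

Any WARN line means the binary would load the old system OpenSSL at run time.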

@tmcneill30
Author

Rene - thank you for figuring this out. Especially for compiling --with-wrap. That is a requirement for me. I haven't had time to test your instructions, but will soon and will report back my results.

question - are the patches still necessary?

@rgc2000

rgc2000 commented Jan 29, 2016 via email

@tmcneill30
Author

I've followed your instructions and .... SUCCESS!! However, I needed to apply not only the f098486 patch but the patch you sent me as well. If you haven't done so, I would commit that patch and post the number here. I was still getting ssl errors in upsd until I recompiled with your patch.

The LD_RUN_PATH was critical to getting the ups binaries to depend upon my newer ssl. I did need to re-compile ssl as you stated above.

Last question - do I hit the close button? How do I know the 2 patches will make it into the next version of NUT?

@clepple
Member

clepple commented Jan 30, 2016

Last question - do I hit the close button? How do I know the 2 patches will make it into the next version of NUT?

My usual procedure is to close the issue when all of the patches are merged into master. At this point, that branch is slated to be the basis for the next release.

However, unless I missed an update, the O_NDELAY removal patch has the potential for a denial-of-service if someone only partially completes the SSL handshake.

@rgc2000

rgc2000 commented Jan 30, 2016

I've followed your instructions and .... SUCCESS!! However, I needed to apply not only the f098486 patch but the patch you sent me as well. If you haven't done so, I would commit that patch and post the number here. I was still getting ssl errors in upsd until I recompiled with your patch.

I did not apply the O_NDELAY removal patch. The f098486 patch against nut-2.7.3 was enough to make SSL work fine under Solaris. Please check again, because it should work with only one of those patches; both are not needed.

@tmcneill30
Author

I re-checked and still get the same results.
To recap: I re-compiled 2.7.3 with only the f098486 patch, following the above instructions. My ldd output for upsmon looks exactly like yours.

I start upsd -D and initially get this:
-bash-3.2$ sudo /usr/local/ups/sbin/upsd -D
Network UPS Tools upsd 2.7.3
0.000000 fopen /var/state/ups/upsd.pid: No such file or directory
0.002299 listening on 0.0.0.0 port 3493
0.002837 /var/state/ups is world readable

I then start upsmon:
-bash-3.2$ sudo /usr/local/ups/sbin/upsmon -D
Network UPS Tools upsmon 2.7.3
kill: No such process
0.000000 UPS: power0@localhost (master) (power value 1)
0.000157 UPS: ups0@localhost (master) (power value 1)
0.000457 Using power down flag file /var/state/ups/killpower
0.000902 debug level is '1'
0.020375 Trying to connect to UPS [power0@localhost]
0.047428 Connected to localhost in SSL
0.048919 Logged into UPS power0@localhost
0.049834 Poll UPS [power0@localhost] failed - Driver not connected
0.049885 Communications with UPS power0@localhost lost
0.051993 Trying to connect to UPS [ups0@localhost]
0.072135 Connected to localhost in SSL
0.073696 Login on UPS [ups0@localhost] failed - got [ERR UNKNOWN-UPS]
5.075487 Poll UPS [power0@localhost] failed - Driver not connected
5.075698 UPS power0@localhost is unavailable
5.080378 Poll UPS [ups0@localhost] failed - [ups0] does not exist on server localhost
5.080460 Communications with UPS ups0@localhost lost

Once upsmon connects, I get the following from upsd
10.415016 ssl_error() ret=-1 SSL_ERROR_WANT_READ
10.416700 ssl_error() ret=-1 SSL_ERROR_WANT_READ
10.435500 ssl_error() ret=-1 SSL_ERROR_WANT_READ
10.439147 User jreupsmon@127.0.0.1 logged into UPS power0
10.445059 ssl_error() ret=-1 SSL_ERROR_WANT_READ
10.446088 ssl_error() ret=-1 SSL_ERROR_WANT_READ
10.459232 ssl_error() ret=-1 SSL_ERROR_WANT_READ
24.428169 User jreupsmon@127.0.0.1 logged out from UPS power0

When I add in the O_NDELAY patch, the SSL errors do not appear.

@rgc2000

rgc2000 commented Feb 1, 2016

I have tested your attached packages and I am running into the same error messages as you (SSL_ERROR_WANT_READ).
When I am using my own compiled nut suite I don't have these error messages even with O_NDELAY enabled.

What gcc compiler are you using ? I am using my self compiled gcc 4.8.3

gcc -v

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/gnu/libexec/gcc/i386-pc-solaris2.10/4.8.3/lto-wrapper
Target: i386-pc-solaris2.10
Configured with: ../gcc-4.8.3/configure --prefix=/opt/gnu --enable-languages=c,c++ --with-gmp=/opt/gnu --with-mpfr=/opt/gnu --without-gnu-ld --with-ld=/usr/ccs/bin/ld --with-gnu-as CFLAGS='-g -O2 -mtune=opteron -march=opteron' CXXFLAGS='-g -O2 -mtune=opteron -march=opteron'
Thread model: posix
gcc version 4.8.3 (GCC)

@tmcneill30
Author

-bash-3.2$ gcc -v
Reading specs from /usr/sfw/lib/gcc/i386-pc-solaris2.10/3.4.3/specs
Configured with: /builds/sfw10-gate/usr/src/cmd/gcc/gcc-3.4.3/configure --prefix=/usr/sfw --with-as=/usr/sfw/bin/gas --with-gnu-as --with-ld=/usr/ccs/bin/ld --without-gnu-ld --enable-languages=c,c++ --enable-shared
Thread model: posix
gcc version 3.4.3 (csl-sol210-3_4-branch+sol_rpath)

@tmcneill30
Author

I switched my c compiler to a newer one I had installed. Same result. Still get ssl errors from upsd upon a upsmon connection.

-bash-3.2$ gcc -v
Reading specs from /opt/csw/lib/gcc/i386-pc-solaris2.10/4.9.2/specs
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/csw/libexec/gcc/i386-pc-solaris2.10/4.9.2/lto-wrapper
Target: i386-pc-solaris2.10
Configured with: /home/dam/mgar/pkg/gcc4/trunk/work/solaris10-i386/build-isa-pentium_pro/gcc-4.9.2/configure --prefix=/opt/csw --exec_prefix=/opt/csw --bindir=/opt/csw/bin --sbindir=/opt/csw/sbin --libexecdir=/opt/csw/libexec --datadir=/opt/csw/share --sysconfdir=/etc/opt/csw --sharedstatedir=/opt/csw/share --localstatedir=/var/opt/csw --libdir=/opt/csw/lib --infodir=/opt/csw/share/info --includedir=/opt/csw/include --mandir=/opt/csw/share/man --enable-cloog-backend=isl --enable-java-awt=xlib --enable-languages=ada,c,c++,fortran,go,java,objc --enable-libada --enable-libssp --enable-nls --enable-objc-gc --enable-threads=posix --program-suffix=-4.9 --with-cloog=/opt/csw --with-gmp=/opt/csw --with-included-gettext --with-ld=/usr/ccs/bin/ld --without-gnu-ld --with-libiconv-prefix=/opt/csw --with-mpfr=/opt/csw --with-ppl=/opt/csw --with-system-zlib=/opt/csw --with-gnu-as --with-as=/opt/csw/bin/gas
Thread model: posix
gcc version 4.9.2 (GCC)

Attached is my config.log. I don't use --prefix because I don't want 'make install' to install in /usr/local/ups. I instruct make install directly to install in a build directory, and then I create a package from there. Other than that, I believe our configure commands are the same, but maybe you can spot something in there.
config.log.txt

@rgc2000

rgc2000 commented Feb 2, 2016

I've just tried the same compiler as you (gcc 3.4.3 from /usr/sfw/bin) and I'm not getting the SSL error.
Now I know that I'm not using the same solaris release as you. I'm running Solaris 10 update 11 kernel 150401-26.
I would need to install Solaris 10 update 6, right? But why don't your binaries work on my Sol10u11 machine?

@tmcneill30
Author

My solaris:
cat /etc/release
Oracle Solaris 10 8/11 s10x_u10wos_17b X86
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
Assembled 23 August 2011

Also, I am running in 32-bit mode. It's a requirement I have for this.

@rgc2000

rgc2000 commented Feb 2, 2016

OpenSSL is ok. My nut with your OpenSSL works fine. The problem seems to be in nut only. Please can you send me your source code in a tar file. I want to compare files.

@tmcneill30
Author

certainly.
patched-src-nut-273.tar.txt

@rgc2000

rgc2000 commented Feb 2, 2016

Oh! When comparing sources I have just realized that I am using a source tree where O_NDELAY is already removed! That’s why everything was working fine from the beginning. If I set O_NDELAY (or O_NONBLOCK) I have the SSL_ERROR_WANT_READ message.

BUT! According to patched source code this is only a warning message. Return code is 0 from the ssl_error() function. upsmon is connected to upsd using SSL with CERTVERIFY. Communication works OK

I have tested the configuration using dummy-ups driver to simulate power0 UPS. Everything looks ok

I can see 3 consecutive SSL_ERROR_WANT_READ messages per STARTTLS connection startup: the first 3 for your power0 UPS, the next 3 for your ups0 UPS, which is missing from ups.conf.
upsc also triggers these 3 warnings on the upsd side each time the command is run.

So I suggest keeping O_NDELAY set and ignoring the SSL_ERROR_WANT_READ message, as it is only a harmless debug message.
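When reading upsd debug output, the retry notices can be filtered out so that only real SSL failures remain. A minimal sketch, based purely on the log line formats quoted in this thread; `ssl_real_errors` is a hypothetical name.

```shell
# Read upsd debug output on stdin and print only the ssl_error()/ssl_debug
# lines that are NOT the SSL_ERROR_WANT_READ retry notice.
# (Hypothetical helper based on the log lines quoted in this thread.)
ssl_real_errors() {
    grep -E 'ssl_(error\(\)|debug:)' | grep -v 'SSL_ERROR_WANT_READ'
}

# Example: /usr/local/ups/sbin/upsd -D 2>&1 | ssl_real_errors
```

Empty output then means the only SSL messages seen were the WANT_READ retries.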

If you prefer to remove O_NDELAY to prevent those ugly debug messages you no longer need patch f098486.

Note: in your /etc/init.d/ups_nut.init script, the path to upsdrvctl is wrong; it should be /sbin/ instead of /bin/ (2 lines).
Note 2 : I am running solaris 10 in 32 bit mode to match your needs.

@tmcneill30
Author

If I remove the f098486 patch, I am back to having ssl handshake failures. Keep in mind, I am using the released 2.7.3 source. I re-tested just now to be sure. I was thinking that perhaps the change in compilation options for NUT and SSL was enough to fix this problem. But no, without the f098486 patch, the ssl handshake fails. I'd prefer not to have ugly debug warnings, but that is preferable to a DoS vulnerability. So I'm not sure what should be done with that patch going forward, but it is necessary to secure NUT running on Solaris.

When I get time, I'll packet sniff the upsd/upsmon connection and verify the connection is truly functional. Will report back then.

@tmcneill30
Author

Packet sniffing confirms that communication from upsmon to upsd is encrypted despite the debug ssl warnings from upsd. So I finally have something that works, even though it's not entirely pretty.
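For anyone repeating this check, one rough approach is to save a capture of port 3493 to a file (for example with `snoop -o` or `tcpdump -w`) and search it for NUT login commands in cleartext. The sketch below assumes GNU grep (for `-a`) and a raw capture file; `capture_has_cleartext_auth` is a hypothetical name. STARTTLS itself is expected to appear in cleartext, so it is not searched for.

```shell
# Return success if a raw capture file still contains cleartext NUT
# commands that should only travel inside the TLS session.
# (Hypothetical helper; assumes the capture was saved as raw packet data.)
capture_has_cleartext_auth() {
    # LC_ALL=C keeps grep byte-oriented; -a treats binary data as text.
    LC_ALL=C grep -a -q -E '(USERNAME|PASSWORD|LOGIN) ' "$1"
}

# Example (the file name is an assumption):
#   capture_has_cleartext_auth /tmp/nut.cap && echo "credentials sent in the clear"
```

A negative result is only a hint, not proof of encryption; inspecting the capture in a protocol analyzer remains the more reliable check.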

@clepple
Member

clepple commented Feb 3, 2016

We can raise the debug level of the SSL_ERROR_WANT_READ. However, they are not logged to syslog, just to the console in debug mode.

Other than that, it sounds like we're good to merge f098486 and its corresponding client-side patch, 213ee3d. If either of you have suggestions for how we can change the documentation, let me know. The user manual is in Asciidoc, which is mostly plain text, but I don't mind reformatting if you tell me what to include, and where.

@tmcneill30
Author

FYI - I tested with 213ee3d and still got the upsd debug ssl warnings. For my edification, why do I need 213ee3d?

The documentation seems fine to me. I don't know if there is a place for a caveat that if you are running on Solaris you can't use the standard ssl libraries if you want to encrypt connections.

@clepple
Member

clepple commented Feb 6, 2016

For my edification, why do I need 213ee3d?

My thought was that if the server-side sockets default to non-blocking, the client-side sockets need to be forced back to blocking because the non-SSL API doesn't provide a way to just loop back around the way that the select() loop does on the server side. It's one of those "if you don't need it now, you probably will later" correctness patches. Might depend on the size of the various packets over the TCP connection.

The documentation seems fine to me. I don't know if there is a place for a caveat that if you are running on Solaris you can't use the standard ssl libraries if you want to encrypt connections.

Sometimes we interpret "FAQ" as "Frequently Anticipated Questions"... When you say "standard SSL libraries" do you mean SFW?

@tmcneill30
Author

Sometimes we interpret "FAQ" as "Frequently Anticipated Questions"... When you say "standard SSL libraries" do you mean SFW?

Yes I did mean /usr/sfw


@jimklimov jimklimov added solaris Solaris and illumos systems SSL/NSS Issues and PRs about SSL, TLS and other crypto-related matters labels Nov 10, 2022
@jimklimov
Member

Cheers all, noticed this discussion and that a fix was proposed here (branch https://github.com/networkupstools/nut/compare/ssl_accept_nbio still exists) but not merged into main NUT codebase.

Any memories - what did this stall on? Did the change solve the practical issue?
