3DES broken? #451

Closed
Fabian-Gruenbichler opened this issue Apr 10, 2019 · 8 comments

@Fabian-Gruenbichler
Contributor

While testing #450, I noticed the following broken behaviour with regard to 3DES:

Corosync 2.x on Debian Stretch: works as expected

Corosync 3.x + knet on Debian Stretch (backported)/Buster (stock) + crypto_model nss:
the cluster falls apart / quorum can't be established, as verification/decryption fails

Apr 10 12:34:19 corosynctest1 systemd[1]: Starting Corosync Cluster Engine...
Apr 10 12:34:19 corosynctest1 systemd[423]: Failed to attach 423 to compat systemd cgroup /system.slice/corosync.service: No such file or directory
Apr 10 12:34:19 corosynctest1 corosync[423]:   [MAIN  ] Corosync Cluster Engine 3.0.1 starting up
Apr 10 12:34:19 corosynctest1 corosync[423]:   [MAIN  ] Corosync built-in features: dbus monitoring watchdog augeas systemd xmlconf snmp pie relro bindnow
Apr 10 12:34:19 corosynctest1 corosync[423]:   [TOTEM ] Initializing transport (Kronosnet).
Apr 10 12:34:19 corosynctest1 corosync[423]:   [TOTEM ] kronosnet crypto initialized: 3des/sha1
Apr 10 12:34:19 corosynctest1 corosync[423]:   [TOTEM ] totemknet initialized
Apr 10 12:34:19 corosynctest1 corosync[423]:   [KNET  ] common: crypto_nss.so has been loaded from /usr/lib/x86_64-linux-gnu/kronosnet/crypto_nss.so
Apr 10 12:34:19 corosynctest1 corosync[423]:   [SERV  ] Service engine loaded: corosync configuration map access [0]
Apr 10 12:34:19 corosynctest1 systemd[1]: Started Corosync Cluster Engine.
Apr 10 12:34:19 corosynctest1 corosync[423]:   [QB    ] server name: cmap
Apr 10 12:34:19 corosynctest1 corosync[423]:   [SERV  ] Service engine loaded: corosync configuration service [1]
Apr 10 12:34:19 corosynctest1 corosync[423]:   [QB    ] server name: cfg
Apr 10 12:34:19 corosynctest1 corosync[423]:   [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Apr 10 12:34:19 corosynctest1 corosync[423]:   [QB    ] server name: cpg
Apr 10 12:34:19 corosynctest1 corosync[423]:   [SERV  ] Service engine loaded: corosync profile loading service [4]
Apr 10 12:34:19 corosynctest1 corosync[423]:   [SERV  ] Service engine loaded: corosync resource monitoring service [6]
Apr 10 12:34:19 corosynctest1 corosync[423]:   [WD    ] Watchdog not enabled by configuration
Apr 10 12:34:19 corosynctest1 corosync[423]:   [WD    ] resource load_15min missing a recovery key.
Apr 10 12:34:19 corosynctest1 corosync[423]:   [WD    ] resource memory_used missing a recovery key.
Apr 10 12:34:19 corosynctest1 corosync[423]:   [WD    ] no resources configured.
Apr 10 12:34:19 corosynctest1 corosync[423]:   [SERV  ] Service engine loaded: corosync watchdog service [7]
Apr 10 12:34:19 corosynctest1 corosync[423]:   [QUORUM] Using quorum provider corosync_votequorum
Apr 10 12:34:19 corosynctest1 corosync[423]:   [SERV  ] Service engine loaded: corosync vote quorum service v1.0 [5]
Apr 10 12:34:19 corosynctest1 corosync[423]:   [QB    ] server name: votequorum
Apr 10 12:34:19 corosynctest1 corosync[423]:   [SERV  ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Apr 10 12:34:19 corosynctest1 corosync[423]:   [QB    ] server name: quorum
Apr 10 12:34:19 corosynctest1 corosync[423]:   [KNET  ] host: host: 2 (passive) best link: 0 (pri: 1)
Apr 10 12:34:19 corosynctest1 corosync[423]:   [KNET  ] host: host: 2 has no active links
Apr 10 12:34:19 corosynctest1 corosync[423]:   [KNET  ] host: host: 2 (passive) best link: 0 (pri: 1)
Apr 10 12:34:19 corosynctest1 corosync[423]:   [KNET  ] host: host: 2 has no active links
Apr 10 12:34:19 corosynctest1 corosync[423]:   [KNET  ] host: host: 2 (passive) best link: 0 (pri: 1)
Apr 10 12:34:19 corosynctest1 corosync[423]:   [KNET  ] host: host: 2 has no active links
Apr 10 12:34:19 corosynctest1 corosync[423]:   [TOTEM ] A new membership (1:6636) was formed. Members joined: 1
Apr 10 12:34:19 corosynctest1 corosync[423]:   [CPG   ] downlist left_list: 0 received
Apr 10 12:34:19 corosynctest1 corosync[423]:   [QUORUM] Members[1]: 1
Apr 10 12:34:19 corosynctest1 corosync[423]:   [MAIN  ] Completed service synchronization, ready to provide service.
Apr 10 12:34:20 corosynctest1 corosync[423]:   [KNET  ] rx: host: 2 link: 0 is up
Apr 10 12:34:20 corosynctest1 corosync[423]:   [KNET  ] host: host: 2 (passive) best link: 0 (pri: 1)
Apr 10 12:34:20 corosynctest1 corosync[423]:   [TOTEM ] A new membership (1:6640) was formed. Members joined: 2
Apr 10 12:34:20 corosynctest1 corosync[423]:   [KNET  ] nsscrypto: PK11_DigestFinal (decrypt) failed (err -8190): security library: received bad data.
Apr 10 12:34:20 corosynctest1 corosync[423]:   [KNET  ] nsscrypto: PK11_DigestFinal (decrypt) failed (err -8190): security library: received bad data.
Apr 10 12:34:20 corosynctest1 corosync[423]:   [TOTEM ] Retransmit List: 2
Apr 10 12:34:20 corosynctest1 corosync[423]:   [KNET  ] nsscrypto: PK11_DigestFinal (decrypt) failed (err -8190): security library: received bad data.
Apr 10 12:34:20 corosynctest1 corosync[423]:   [KNET  ] nsscrypto: PK11_DigestFinal (decrypt) failed (err -8190): security library: received bad data.
Apr 10 12:34:20 corosynctest1 corosync[423]:   [TOTEM ] Retransmit List: 2
Apr 10 12:34:20 corosynctest1 corosync[423]:   [KNET  ] nsscrypto: PK11_DigestFinal (decrypt) failed (err -8190): security library: received bad data.
Apr 10 12:34:20 corosynctest1 corosync[423]:   [KNET  ] nsscrypto: PK11_DigestFinal (decrypt) failed (err -8190): security library: received bad data.
Apr 10 12:34:20 corosynctest1 corosync[423]:   [TOTEM ] Retransmit List: 2
Apr 10 12:34:20 corosynctest1 corosync[423]:   [KNET  ] nsscrypto: PK11_DigestFinal (decrypt) failed (err -8190): security library: received bad data.
Apr 10 12:34:20 corosynctest1 corosync[423]:   [KNET  ] nsscrypto: PK11_DigestFinal (decrypt) failed (err -8190): security library: received bad data.

Corosync 3.x + ... + crypto_model openssl:

Apr 10 12:32:31 corosynctest1 systemd[1]: Starting Corosync Cluster Engine...
Apr 10 12:32:31 corosynctest1 corosync[407]:   [MAIN  ] Corosync Cluster Engine 3.0.1 starting up
Apr 10 12:32:31 corosynctest1 corosync[407]:   [MAIN  ] Corosync built-in features: dbus monitoring watchdog augeas systemd xmlconf snmp pie relro bindnow
Apr 10 12:32:31 corosynctest1 corosync[407]:   [TOTEM ] Initializing transport (Kronosnet).
Apr 10 12:32:32 corosynctest1 corosync[407]:   [TOTEM ] knet_handle_crypto failed: -2
Apr 10 12:32:32 corosynctest1 corosync[407]:   [KNET  ] common: crypto_openssl.so has been loaded from /usr/lib/x86_64-linux-gnu/kronosnet/crypto_openssl.so
Apr 10 12:32:32 corosynctest1 corosync[407]:   [KNET  ] opensslcrypto: unknown crypto cipher type requested
Apr 10 12:32:32 corosynctest1 corosync[407]:   [MAIN  ] Can't initialize TOTEM layer
Apr 10 12:32:32 corosynctest1 corosync[407]:   [MAIN  ] Corosync Cluster Engine exiting with status 15 at main.c:1529.
Apr 10 12:32:32 corosynctest1 systemd[1]: corosync.service: Main process exited, code=exited, status=15/n/a
Apr 10 12:32:32 corosynctest1 systemd[1]: corosync.service: Failed with result 'exit-code'.
Apr 10 12:32:32 corosynctest1 systemd[1]: Failed to start Corosync Cluster Engine.

(Maybe this one just needs a mapping from "3des" to however openssl calls it for EVP_get_cipherbyname?)

config is the following (modulo crypto_model, also tried different combinations of 3des + crypto_hash with no luck):

totem {
        version: 2

        # Set name of the cluster
        cluster_name: foobar

        # crypto_cipher and crypto_hash: Used for mutual node authentication.
        # If you choose to enable this, then do remember to create a shared
        # secret with "corosync-keygen".
        # enabling crypto_cipher, requires also enabling of crypto_hash.
        # crypto works only with knet transport
        crypto_cipher: 3des
        crypto_hash: sha1
        crypto_model: nss

        ip_version: ipv4
        config_version: 2
}

logging {
        # Log the source file and line where messages are being
        # generated. When in doubt, leave off. Potentially useful for
        # debugging.
        fileline: off
        # Log to standard error. When in doubt, set to yes. Useful when
        # running in the foreground (when invoking "corosync -f")
        to_stderr: yes
        # Log to a log file. When set to "no", the "logfile" option
        # must not be set.
        to_logfile: yes
        logfile: /var/log/cluster/corosync.log
        # Log to the system log daemon. When in doubt, set to yes.
        to_syslog: yes
        # Log debug messages (very verbose). When in doubt, leave off.
        debug: off
        # Log messages with time stamps. When in doubt, set to hires (or on)
        #timestamp: hires
        logger_subsys {
                subsys: QUORUM
                debug: off
        }
}

quorum {
        # Enable and configure quorum subsystem (default: off)
        # see also corosync.conf.5 and votequorum.5
        provider: corosync_votequorum
}

nodelist {
        # Change/uncomment/add node sections to match cluster configuration

        node {
                # Hostname of the node
                name: corosynctest1
                # Cluster membership node identifier
                nodeid: 1
                quorum_votes: 1
                # Address of first link
                ring0_addr: 10.0.0.100
                # When knet transport is used it's possible to define up to 8 links
                #ring1_addr: 192.168.1.1
        }
        node {
                # Hostname of the node
                name: corosynctest2
                # Cluster membership node identifier
                nodeid: 2
                quorum_votes: 1
                # Address of first link
                ring0_addr: 10.0.0.101
                # When knet transport is used it's possible to define up to 8 links
                #ring1_addr: 192.168.1.2
        }
        # ...
}

Switching to aesXXX and restarting corosync immediately fixes the issue (both for NSS and OpenSSL), so I am fairly certain this is just related to 3DES.

I have no interest in using 3DES whatsoever, I just stumbled upon this behaviour discrepancy and didn't want to leave it unreported ;) Not sure whether the actual breakage is in Corosync or knet either.
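
For anyone who wants to check whether a node is still configured with the affected cipher before switching away from it, here is a minimal sketch in Python. It assumes the `key: value` and `section { ... }` layout of the config shown above. The function names `totem_crypto_settings` and `uses_3des` and the `SAMPLE` string are hypothetical helpers made up for this illustration and are not part of corosync or knet.

```python
def totem_crypto_settings(conf_text: str) -> dict:
    """Collect the crypto_* keys from the top-level totem { ... } section."""
    settings = {}
    depth = 0          # current nesting level of { ... } sections
    in_totem = False   # True while inside the top-level totem section
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.endswith("{"):
            depth += 1
            if depth == 1:
                in_totem = line[:-1].strip() == "totem"
            continue
        if line == "}":
            depth -= 1
            if depth == 0:
                in_totem = False
            continue
        if in_totem and depth == 1 and ":" in line:
            key, value = line.split(":", 1)
            key = key.strip()
            if key.startswith("crypto_"):
                settings[key] = value.strip()
    return settings


def uses_3des(conf_text: str) -> bool:
    """Report whether the totem section selects 3des as crypto_cipher."""
    cipher = totem_crypto_settings(conf_text).get("crypto_cipher", "")
    return cipher.lower() == "3des"


# Trimmed-down copy of the totem section from the report (illustration only).
SAMPLE = """
totem {
        version: 2
        # crypto works only with knet transport
        crypto_cipher: 3des
        crypto_hash: sha1
        crypto_model: nss
}
logging {
        debug: off
}
"""

print(totem_crypto_settings(SAMPLE))
print(uses_3des(SAMPLE))
```

The parser only understands the simple layout used in this report; it is not a full corosync.conf parser.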

@jfriesse
Member

@Fabian-Gruenbichler Thanks for the report. The OpenSSL "bug" is quite easy to fix: OpenSSL expects "des3", not "3des". The NSS one looks more serious, especially because it was working in 2.x (and the knet crypto code should be the same) and also because it works for some packets (probably < 256 bytes) but not for larger ones.

@fabbione You may want to take a look at this (NSS) problem. 3des is kind of "do not care", but we should be sure that it is not something more serious.

For 3des/des3: we can solve this in corosync quite easily, but it may make sense to handle it in knet. The easiest solution may be to change 3des to des3 in crypto_nss (so in corosync we could pass des3 to knet if "3des" is selected, no matter which library is used), but this breaks "compatibility".

@fabbione
Member

@jfriesse I am looking at the NSS problem. For 3des vs des3, I can mask it in crypto_openssl so that corosync only has to pass 3des, but in general knet does NOT mingle with those parameters in an attempt to standardize the crypto APIs; it only acts as a pass-through.
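
The masking idea can be sketched as a small name-translation step in front of the OpenSSL cipher lookup. This is an illustration in Python, not the C code knet actually uses. The table `OPENSSL_CIPHER_ALIASES` and the function `openssl_cipher_name` are hypothetical names, and the only mapping taken from this thread is that OpenSSL expects "des3" where corosync passes "3des".

```python
# Hypothetical alias table: config-facing cipher name -> name the OpenSSL
# lookup expects. Only the "3des" -> "des3" pair comes from this thread.
OPENSSL_CIPHER_ALIASES = {"3des": "des3"}


def openssl_cipher_name(config_name: str) -> str:
    """Translate a config cipher name; unknown names pass through unchanged."""
    name = config_name.strip().lower()
    return OPENSSL_CIPHER_ALIASES.get(name, name)


print(openssl_cipher_name("3des"))
print(openssl_cipher_name("aes256"))
```

Keeping the translation inside the OpenSSL backend means the caller can keep passing "3des" regardless of which crypto_model is selected, and every other name is still handed through untouched.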

@jfriesse
Member

@fabbione crypto_nss DOES mingle those parameters (and that's why I've suggested to change crypto_nss).

fabbione added a commit to kronosnet/kronosnet that referenced this issue Apr 10, 2019
reported in corosync/corosync#451

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
@fabbione
Member

@jfriesse as discussed on IRC, the algorithm name is 3des and not des3, therefore we will just fix openssl.c to understand 3des (already done).

Still looking into why NSS 3des doesn't work, but I suspect some memory corruption because, for example, the PMTUd process works fine but the TX thread gets stuck.

@fabbione
Member

I found the issue with NSS and 3des, and the bug is in libnss. I do have an easy workaround in knet, but at this point I wonder if it's worth supporting 3des at all, given that it is an old and broken encryption method.

@jfriesse @chrissie-c : any strong opinion on the subject?

@fabbione
Member

FYI this is the workaround kronosnet/kronosnet@4f3828f

@jfriesse
Member

@fabbione I'm all for removing 3des. "Thanks" to this bug I'm not even considering it as a backwards compatibility breakage.

PR is #452

@jfriesse
Member

Closing this issue in favor of PR #452. Also, the biggest potential problem turned out to be an NSS bug.
