Description
Hi,
I am using glusterfs 11.1 and Python 3.9.5.
I am trying to start a geo-replication session in two scenarios:
Case 1: The secondary is missing the required lines in authorized_keys, and hence, the previously created session cannot be started - this is as expected.
Case 2: On the secondary, I didn't restart the glusterd after configuring the mountbroker. The geo-replication cannot start either - this is also as expected.
However, I cannot see the corresponding errors in the logs on the primary or secondary because errlog raises an exception. (Similar to #1132)
The exact command to reproduce the issue:
Case 1: To get the exception on the primary:
Remove these lines from authorized_keys on the secondary
command="/usr/libexec/glusterfs/gsyncd" ssh-rsa ...
command="tar ${SSH_ORIGINAL_COMMAND#* }" ssh-rsa ....
and try to start a geo-replication session.
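To confirm that the secondary really is in the Case 1 state before starting the session, a small check like the following can help. It is only a sketch: `missing_forced_commands` is a hypothetical helper, not part of glusterfs, and it assumes the two forced-command strings are exactly the ones quoted above.

```python
# Hypothetical helper (not part of glusterfs): report which of the two
# forced-command entries quoted in this report are absent from the text
# of an authorized_keys file.
REQUIRED_COMMANDS = (
    'command="/usr/libexec/glusterfs/gsyncd"',
    'command="tar ${SSH_ORIGINAL_COMMAND#* }"',
)


def missing_forced_commands(authorized_keys_text):
    """Return the required command= options found on no active line."""
    active_lines = [
        line.strip()
        for line in authorized_keys_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    return [
        option
        for option in REQUIRED_COMMANDS
        if not any(option in line for line in active_lines)
    ]


if __name__ == "__main__":
    sample = 'command="/usr/libexec/glusterfs/gsyncd" ssh-rsa AAAA... user@host\n'
    # Only the gsyncd entry is present in this sample, so the tar entry is reported.
    print(missing_forced_commands(sample))
```

An empty list means both entries are present; a non-empty list names the entries whose absence should trigger Case 1.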
Case 2: To get the exception on the secondary:
Do not restart glusterd after removing a volume
gluster-mountbroker node-remove vol <user>
and adding a new volume
gluster-mountbroker add sec_vol <user>
Afterwards, try to create and start a geo-replication session.
The full output of the command that failed:
Case 1: On primary:
From "/var/log/glusterfs/geo-replication/some_path/gsyncd.log"
[2024-12-05 11:39:25.504956] E [syncdutils(monitor):845:errlog] Popen: command returned error [{cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 <user>@<host> /usr/sbin/gluster --xml --remote-host=localhost volume info ter_vol}, {error=255}]
[2024-12-05 11:39:25.505311] E [syncdutils(monitor):363:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 317, in main
    func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 60, in subcmd_monitor
    return monitor.monitor(local, remote)
  File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 360, in monitor
    return Monitor().multiplex(*distribute(local, remote))
  File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 319, in distribute
    svol = Volinfo(secondary.volume, "localhost", prelude, primary=False)
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 924, in __init__
    po.terminate_geterr()
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 894, in terminate_geterr
    self.errfail()
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 863, in errfail
    self.errlog()
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 854, in errlog
    ls[0] = lp + ls[0]
TypeError: can only concatenate str (not "bytes") to str
Case 2: On secondary, I see this error:
From "/var/log/glusterfs/geo-replication-secondaries/some_path/gsyncd.log"
[2024-12-05 14:41:58.422231] E [syncdutils(secondary <path>/brick):363:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 317, in main
    func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 96, in subcmd_secondary
    local.connect()
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1138, in connect
    self.mounter.inhibit(label)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 880, in inhibit
    po.terminate_geterr()
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 894, in terminate_geterr
    self.errfail()
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 863, in errfail
    self.errlog()
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 853, in errlog
    ls = l.split(b'\n')
TypeError: must be str or None, not bytes
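Both tracebacks point at the same two statements in errlog and seem to differ only in the type of the captured stderr chunk. The sketch below isolates those two statements outside of gsyncd. `try_errlog_steps` and its `prefix` argument are hypothetical stand-ins for the real method and its `lp` variable, and the rest of errlog is not reproduced here.

```python
def try_errlog_steps(stderr_chunk, prefix="ssh> "):
    """Run the two statements named in the tracebacks.

    Returns the TypeError message if one is raised, otherwise None.
    """
    try:
        ls = stderr_chunk.split(b'\n')  # statement reported at line 853
        ls[0] = prefix + ls[0]          # statement reported at line 854
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # bytes input: split() succeeds, then str + bytes fails (as in Case 1).
    print(try_errlog_steps(b"Permission denied\n"))
    # str input: split(b'\n') itself fails (as in Case 2).
    print(try_errlog_steps("Operation not permitted\n"))
```

If this reading is right, a bytes chunk gets past the split and fails on the concatenation, while a str chunk already fails on the split, so neither type is handled by the current code.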
Expected results:
Case 1: On primary I should see this error:
[2024-12-06 10:04:50.667697] E [syncdutils(monitor):846:errlog] Popen: command returned error [{cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 <user>@<host> /usr/sbin/gluster --xml --remote-host=localhost volume info sec_vol}, {error=255}]
[2024-12-06 10:04:50.667984] E [syncdutils(monitor):852:logerr] Popen: ssh> The account is locked due to 3 failed logins.
[2024-12-06 10:04:50.668061] E [syncdutils(monitor):852:logerr] Popen: ssh> (5 minutes left to unlock)
[2024-12-06 10:04:50.668185] E [syncdutils(monitor):852:logerr] Popen: ssh>
[2024-12-06 10:04:50.668245] E [syncdutils(monitor):852:logerr] Popen: ssh> The account is locked due to 3 failed logins.
[2024-12-06 10:04:50.668295] E [syncdutils(monitor):852:logerr] Popen: ssh> (5 minutes left to unlock)
[2024-12-06 10:04:50.668365] E [syncdutils(monitor):852:logerr] Popen: ssh>
[2024-12-06 10:04:50.668419] E [syncdutils(monitor):852:logerr] Popen: ssh> The account is locked due to 3 failed logins.
[2024-12-06 10:04:50.668466] E [syncdutils(monitor):852:logerr] Popen: ssh> (5 minutes left to unlock)
[2024-12-06 10:04:50.668511] E [syncdutils(monitor):852:logerr] Popen: ssh>
[2024-12-06 10:04:50.668560] E [syncdutils(monitor):852:logerr] Popen: ssh> <user>@<host>: Permission denied (publickey,keyboard-interactive).
Case 2: On secondary, I should see this error:
[2024-12-05 14:56:46.606551] E [syncdutils(secondary <path>/brick):851:logerr] Popen: /usr/sbin/gluster> 1 : failed with this errno (Operation not permitted)
Additional info:
This fix worked for me:
Adding these lines in /usr/libexec/glusterfs/python/syncdaemon/syncdutils.py
+ if isinstance(l, str):
+     l = l.encode()
  ls = l.split(b'\n')
+ ls = list(map(lambda x: x.decode("utf-8"), ls))
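The same idea can be written as a standalone function for testing outside of gsyncd. This is a sketch rather than the upstream patch: `split_stderr_lines` is a hypothetical name, and the `errors="replace"` argument is an addition not present in the lines above, included on the assumption that stderr might contain bytes that are not valid UTF-8.

```python
def split_stderr_lines(chunk, encoding="utf-8"):
    """Split a stderr chunk into a list of str lines.

    Accepts either str or bytes so the caller can safely prepend a str
    prefix to each returned line.
    """
    if isinstance(chunk, str):
        chunk = chunk.encode(encoding)
    # errors="replace" is an assumption of this sketch, not part of the
    # fix quoted in the report.
    return [part.decode(encoding, errors="replace") for part in chunk.split(b"\n")]


if __name__ == "__main__":
    for chunk in (b"Permission denied\n", "Operation not permitted\n"):
        lines = split_stderr_lines(chunk)
        lines[0] = "ssh> " + lines[0]  # the concatenation now works for both types
        print(lines)
```

Normalizing to bytes first and decoding after the split keeps the existing `split(b'\n')` call unchanged, which matches the shape of the fix described above.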
Activity
sanjurakonde commented on Mar 7, 2025
@it33-dev please open a pull request by adding suggested logs.