Brick going offline on another host as well as the host which rebooted #2480
Comments
@dit101 Could you attach the client and brick logs to find what happened? |
logs.tar.gz |
@pranithk could you have a look at the logs to see the issue please? |
@dit101 As per the following logs on server2:
There are no cleanup_and_exit() logs, which would indicate a graceful exit of the process, so something must have killed the process with SIGKILL. One possibility could be an OOM kill. Did you see any OOM kill messages in dmesg or /var/log/messages on server2 when this happened? Is this consistently reproducible? |
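(For reference, one quick way to check for OOM-killer activity on server2 could look like the following; the log path assumes a stock CentOS 7 install, so adjust as needed.)

```sh
# Search the kernel ring buffer for OOM-killer activity
dmesg -T | grep -iE 'out of memory|oom-killer|killed process'

# Search syslog, including rotated copies, for the same markers
grep -iE 'out of memory|oom-killer|killed process' /var/log/messages*
```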
@pranithk it doesn't happen all the time. I may be able to repeat it by rebooting the same server again. I checked the logs on server2 and OOM killer didn't run. |
@dit101 if the OOM killer ran you should see it in the /var/log/messages file; check the rotated logs as well. |
@pranithk I checked /var/log/messages from yesterday for anything related to OOM Killer and there's nothing there |
@pranithk the log message at 13:45 would have been me force-starting the volume to bring the brick online. Server1 was rebooted at 13:39, which then brought the bricks on Server2 offline as well as those on Server1. These servers aren't in use at the moment as we're testing before they go live. I can try to recreate the issue again if you want to collect more information? |
@dit101 What I was pointing at is, before that, there is no cleanup_and_exit() log. Something like this.
If this log is missing, then the brick was killed with SIGKILL, which can happen with the OOM killer, which is why I was asking. |
@pranithk thanks for explaining. Nothing was done on Server2 and there was no issue. When I rebooted Server1 the bricks went offline on Server2. I've seen this issue on a couple of clusters now. Is there anything else you need from me? |
@dit101 I went through the logs multiple times. I don't see anything that indicates an operation on the brick which could lead to the brick process going down on server2. SIGKILL is the only possibility with the info you have given so far. If there is a way to find out who is killing this process, that would be helpful. |
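(One possible way to catch whoever sends the signal, assuming auditd is available as it usually is on CentOS 7, is a temporary audit rule on the kill syscall; this is only a sketch, not something requested above.)

```sh
# Record every kill() syscall so the sender's pid/comm ends up in the audit log
auditctl -a always,exit -F arch=b64 -S kill -k brick_kill_trace

# After the brick drops offline, look up who sent the signal
ausearch -k brick_kill_trace -i | less
```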
@pranithk I don't have a way to check at the moment, but I think the brick process was still running and just goes offline in Gluster, so a process listing still shows the brick process. I can try to validate this by reproducing the issue if you want. If I can reproduce the issue, is there any logging or output you'd like? |
Oh, Could you capture |
cc @nik-redhat One possibility for this could be portmap handling issues. Do you have any other information that needs to be captured? |
I too couldn't find anything problematic in the shared logs.
|
Makes sense. I am wondering if there is a situation where the portmap information went wrong, i.e. the brick is running but the port in glusterd is not mapped correctly, or something along those lines.
|
That might be the case, hence we need to check what port value is being stored in the brick volfiles, and whether it is getting updated correctly or not. |
In that case maybe netstat output would be helpful too. |
|
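(A rough sketch of what could be captured on the affected node to compare the port glusterd has recorded with what the brick is actually listening on; volume1 is taken from this report, and the listen-port field in the brickinfo files under /var/lib/glusterd is an assumption about where glusterd keeps the mapping.)

```sh
# Port glusterd reports for each brick
gluster volume status volume1

# Port glusterd has recorded for the bricks (assumed location of the brickinfo files)
grep -r listen-port /var/lib/glusterd/vols/volume1/bricks/

# Brick processes actually running, and the ports they are listening on
pgrep -af glusterfsd
ss -tlnp | grep glusterfsd
```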
Thanks @pranithk @nik-redhat I'll try to reproduce later and gather the required information |
@pranithk @nik-redhat this time when I rebooted server4, the server1 brick for volume1 went offline. I've collected and attached the logs, including process and netstat output, so I hope they help. The brick process continued to run on server1 as per the process listing. |
@nik-redhat Do you want to give this a shot? I am suspecting this to be because of recent changes to port-mapper. But I could be wrong. |
@pranithk the recent changes to the portmapper have not yet gone into a release. It is only in the devel branch till now, so that shouldn't be the case here. |
Ah! Cool.
Sure, will take a look. |
Based on the info provided today, I went back to yesterday's logs and found the following suspicious logs:
These logs suggest that when glusterd went down on server1, the brick processes were sending portmap signin and signout to server2 as if they had come up and gone down there, which led to the volume status misbehaving on server2 because the brick paths are identical on both servers. This happened because, when glusterd was brought down on server1, the bricks connected to the backup volfile server, i.e. server2.
So when the bricks were killed on server1, the signout went to server2. @dit101 As a workaround, please kill the brick processes before killing glusterd for now. @amarts @xhernandez @rafikc30 @srijan-sivakumar As per my understanding the following patch introduced the bug. I think we shouldn't send the extra volfile servers for the bricks.
Code in glusterfsd-mgmt.c that is relevant:
|
Thanks @pranithk. I shut down the bricks first and they didn't go offline on another server. I'll let you know if I still see the issue when shutting down the bricks first, as it doesn't happen all the time. But from what you've posted it looks like you've found the cause. |
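(For anyone else hitting this, a minimal sketch of the workaround ordering before a reboot, assuming the usual process names: glusterfsd for the brick processes and glusterd for the management daemon.)

```sh
# Stop the brick processes first, while the local glusterd is still up,
# so their portmap signout goes to the local glusterd and not to a backup volfile server
pkill glusterfsd

# Then stop glusterd and reboot
systemctl stop glusterd
init 6
```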
@pranithk Nice explanation. I can send a patch if you are busy. |
A brick process needs to register its port (portmap) with its local glusterd. Since the portmapper is not centralized, the information is stored locally in each glusterd. When a glusterd goes down, connecting to a backup volfile server results in undefined behaviour, especially when the portmap signin and signout requests are sent to a different glusterd than intended. If that happens, there can be undefined behaviour when bricks with the same path are present on different nodes. In this patch, we prevent bricks from connecting to backup volfile servers, which means that the bricks won't connect to any other glusterd to receive a management update if the glusterd on the local node goes down. THANKS TO PRANITH FOR THE RCA HERE gluster#2480
Fixes: gluster#2480
Change-Id: Iddd6f1d0f0da1cf0c90729043f23a293d478bf7c
Signed-off-by: Mohammed Rafi KC <rafi.kavungal@iternity.com>
Thanks guys. I was a bit busy and will be all week. I was going to try this over the coming weekend if you needed me to. Since @pranithk confirmed it won't occur with different brick paths, I won't test that scenario :-) |
Description of problem:
Hi,
I have an issue where, sometimes, if I reboot a Gluster node, the bricks on that host go offline as expected, but a brick on another host also goes offline, which can cause volume failures. I have to force a volume start to bring the brick back online.
Thanks
The exact command to reproduce the issue:
systemctl stop glusterd
killall glusterfs glusterfsd glusterd
init 6
gluster volume status
The full output of the command that failed:
Server1 was rebooted and the bricks on server2 also went offline and stayed offline
[root@server2 ~]# gluster volume status
Status of volume: volume1
Gluster process                                TCP Port  RDMA Port  Online  Pid
Brick server1:/data/gluster/brick2/brick       49152     0          Y       1459
Brick server2:/data/gluster/brick2/brick       N/A       N/A        N       N/A
Brick server4:/data/gluster/brick2/brick       49152     0          Y       8873
Self-heal Daemon on localhost                  N/A       N/A        Y       8847
Bitrot Daemon on localhost                     N/A       N/A        Y       8994
Scrubber Daemon on localhost                   N/A       N/A        Y       9019
Self-heal Daemon on server3                    N/A       N/A        Y       8980
Bitrot Daemon on server3                       N/A       N/A        Y       9108
Scrubber Daemon on server3                     N/A       N/A        Y       9127
Self-heal Daemon on server4                    N/A       N/A        Y       8839
Bitrot Daemon on server4                       N/A       N/A        Y       8997
Scrubber Daemon on server4                     N/A       N/A        Y       9008
Self-heal Daemon on server1.prod.blueface.com  N/A       N/A        Y       1521
Bitrot Daemon on server1.prod.blueface.com     N/A       N/A        Y       1481
Scrubber Daemon on server1.prod.blueface.com   N/A       N/A        Y       1493

Task Status of Volume volume1
There are no active volume tasks

Status of volume: volume2
Gluster process                                TCP Port  RDMA Port  Online  Pid
Brick server1:/data/gluster/brick1/brick       49153     0          Y       1470
Brick server2:/data/gluster/brick1/brick       N/A       N/A        N       N/A
Brick server3:/data/gluster/brick1/brick       49152     0          Y       8963
Self-heal Daemon on localhost                  N/A       N/A        Y       8847
Bitrot Daemon on localhost                     N/A       N/A        Y       8994
Scrubber Daemon on localhost                   N/A       N/A        Y       9019
Self-heal Daemon on server4                    N/A       N/A        Y       8839
Bitrot Daemon on server4                       N/A       N/A        Y       8997
Scrubber Daemon on server4                     N/A       N/A        Y       9008
Self-heal Daemon on server1.prod.blueface.com  N/A       N/A        Y       1521
Bitrot Daemon on server1.prod.blueface.com     N/A       N/A        Y       1481
Scrubber Daemon on server1.prod.blueface.com   N/A       N/A        Y       1493
Self-heal Daemon on server3                    N/A       N/A        Y       8980
Bitrot Daemon on server3                       N/A       N/A        Y       9108
Scrubber Daemon on server3                     N/A       N/A        Y       9127

Task Status of Volume volume2
There are no active volume tasks
Expected results:
The expectation is that the bricks on server2 would not go offline when server1 is rebooted.
Mandatory info:
- The output of the gluster volume info command:
[root@server2 ~]# gluster volume info
Volume Name: volume1
Type: Replicate
Volume ID: f330ead1-2f98-49bc-a8ec-db3f5c18a3f4
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: server1:/data/gluster/brick2/brick
Brick2: server2:/data/gluster/brick2/brick
Brick3: server4:/data/gluster/brick2/brick (arbiter)
Options Reconfigured:
diagnostics.brick-log-level: INFO
features.scrub: Active
features.bitrot: on
network.ping-timeout: 5
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
Volume Name: volume2
Type: Replicate
Volume ID: f645aa78-cd37-4670-a27b-c4e3bb14965e
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: server1:/data/gluster/brick1/brick
Brick2: server2:/data/gluster/brick1/brick
Brick3: server3:/data/gluster/brick1/brick (arbiter)
Options Reconfigured:
diagnostics.brick-log-level: INFO
features.scrub: Active
features.bitrot: on
network.ping-timeout: 5
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
- The output of the gluster volume status command:
[root@server2 ~]# gluster volume status
Status of volume: volume1
Gluster process                                TCP Port  RDMA Port  Online  Pid
Brick server1:/data/gluster/brick2/brick       49152     0          Y       1459
Brick server2:/data/gluster/brick2/brick       N/A       N/A        N       N/A
Brick server4:/data/gluster/brick2/brick       49152     0          Y       8873
Self-heal Daemon on localhost                  N/A       N/A        Y       8847
Bitrot Daemon on localhost                     N/A       N/A        Y       8994
Scrubber Daemon on localhost                   N/A       N/A        Y       9019
Self-heal Daemon on server3                    N/A       N/A        Y       8980
Bitrot Daemon on server3                       N/A       N/A        Y       9108
Scrubber Daemon on server3                     N/A       N/A        Y       9127
Self-heal Daemon on server4                    N/A       N/A        Y       8839
Bitrot Daemon on server4                       N/A       N/A        Y       8997
Scrubber Daemon on server4                     N/A       N/A        Y       9008
Self-heal Daemon on server1.prod.blueface.com  N/A       N/A        Y       1521
Bitrot Daemon on server1.prod.blueface.com     N/A       N/A        Y       1481
Scrubber Daemon on server1.prod.blueface.com   N/A       N/A        Y       1493

Task Status of Volume volume1
There are no active volume tasks

Status of volume: volume2
Gluster process                                TCP Port  RDMA Port  Online  Pid
Brick server1:/data/gluster/brick1/brick       49153     0          Y       1470
Brick server2:/data/gluster/brick1/brick       N/A       N/A        N       N/A
Brick server3:/data/gluster/brick1/brick       49152     0          Y       8963
Self-heal Daemon on localhost                  N/A       N/A        Y       8847
Bitrot Daemon on localhost                     N/A       N/A        Y       8994
Scrubber Daemon on localhost                   N/A       N/A        Y       9019
Self-heal Daemon on server4                    N/A       N/A        Y       8839
Bitrot Daemon on server4                       N/A       N/A        Y       8997
Scrubber Daemon on server4                     N/A       N/A        Y       9008
Self-heal Daemon on server1.prod.blueface.com  N/A       N/A        Y       1521
Bitrot Daemon on server1.prod.blueface.com     N/A       N/A        Y       1481
Scrubber Daemon on server1.prod.blueface.com   N/A       N/A        Y       1493
Self-heal Daemon on server3                    N/A       N/A        Y       8980
Bitrot Daemon on server3                       N/A       N/A        Y       9108
Scrubber Daemon on server3                     N/A       N/A        Y       9127

Task Status of Volume volume2
There are no active volume tasks
- The output of the gluster volume heal command:
[root@server2 ~]# gluster volume heal volume1 info
Brick server1:/data/gluster/brick2/brick
Status: Connected
Number of entries: 0
Brick server2:/data/gluster/brick2/brick
Status: Connected
Number of entries: 0
Brick server4:/data/gluster/brick2/brick
Status: Connected
Number of entries: 0
[root@server2 ~]# gluster volume heal volume2 info
Brick server1:/data/gluster/brick1/brick
Status: Connected
Number of entries: 0
Brick server2:/data/gluster/brick1/brick
Status: Connected
Number of entries: 0
Brick server3:/data/gluster/brick1/brick
Status: Connected
Number of entries: 0
- Provide logs present on the following locations of client and server nodes:
/var/log/glusterfs/
- Is there any crash? Provide the backtrace and coredump:
No crash
Additional info:
To get the bricks online again I used the force command
gluster volume start volname force
- The operating system / glusterfs version:
CentOS 7 / GlusterFS 9.2
Note: Please hide any confidential data which you don't want to share in public like IP address, file name, hostname or any other configuration