Drop node when disk full #281
Comments
...ping!
Just experienced that exact scenario last week. I have a monitoring system that alerted when the disk hit 95%, but it was too late by the time I was able to get to a computer and fix the problem. Had the node been dropped, it wouldn't have been an issue. Please consider adding this feature to Galera.
It may be useful for the code maintainers to know which version(s) of Galera wsrep you are using.
mysql-wsrep-client-5.6-5.6.35-25.18.20170106.1f9ae89.el6.x86_64
We have the same issue. We are running a recent MariaDB image (10.1.22) with the wsrep components built in. I think this scenario will happen frequently as people put database nodes into Docker containers on systems with shared disk. Related to this, we have also experienced several recent SSD failures where the SSD no longer accepts writes and instead returns a disk-full error; after a cold reboot the SSD comes back to life. A fix to drop the node out of the cluster on write errors would help both scenarios (running out of disk space and having a bad node).
Occurred with the most recent Debian-packaged version: 10.2.9-MariaDB-10.2.9+maria~jessie-log.
Does anyone have a solution for this yet? |
Just experienced this also. It seems like a really simple issue to resolve. |
So is this a Debian-specific bug rather than a missing feature? Then the only solution is a crontab script that checks the remaining disk space every minute and shuts the node down when the disk is full.
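The crontab workaround described above could be sketched like this. This is a minimal, untested sketch, not an official Galera mechanism: the datadir path, the 95% threshold, and the `mariadb` service name are assumptions you would adapt to your own deployment.

```python
import shutil
import subprocess

# Illustrative values -- adjust the datadir path, the threshold, and the
# service name ("mariadb" vs "mysql") for your own setup.
DATADIR = "/var/lib/mysql"
LIMIT_PCT = 95


def disk_usage_pct(path):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def should_stop(used_pct, limit=LIMIT_PCT):
    """Decide whether the node should be taken down at this usage level."""
    return used_pct >= limit


def check_and_stop(path=DATADIR, limit=LIMIT_PCT):
    """Stop the local MariaDB service if the datadir disk is nearly full.

    A stopped node leaves the Galera cluster cleanly, so the remaining
    nodes keep serving queries instead of the whole cluster stalling.
    Returns True if a shutdown was triggered.
    """
    if should_stop(disk_usage_pct(path), limit):
        subprocess.run(["systemctl", "stop", "mariadb"], check=True)
        return True
    return False


if __name__ == "__main__":
    check_and_stop()
```

Scheduled from cron every minute, e.g. `* * * * * /usr/local/bin/galera_disk_guard.py` (the script path is hypothetical). It is a stopgap, not a fix: the node is stopped rather than gracefully evicted, and anything written between cron runs can still fill the disk first.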
Experienced it too with MariaDB 10.4.28 and Galera 4-26.4.14-1.
When the disk is full on one single node, the whole cluster can grind to a halt. I recently experienced this issue in a 3-node setup with MariaDB-Galera-server-5.5.51-1.el6.x86_64 and galera-25.3.17-1.rhel6.el6.x86_64. Upon investigation I found out somebody else had the exact same issue, see https://groups.google.com/forum/#!topic/codership-team/qWkhRfa7xcM. It was suggested to file an issue, but searching this repository I couldn't find one, so I decided to file the report myself.
Situation:
Logs on node with disk full:
Nothing was logged on the other 2 nodes. The wsrep cluster stayed in the same state UUID. Those nodes still accepted queries but were unable to answer them; they just remained stalled in a state (statistics, query end, ...) depending on the query type. After freeing up disk space on the node with the full disk, the cluster continued normal operations as if nothing had happened (and still nothing was logged on the other nodes).
Suggested solution: