[resolved] Read failed. Insufficient number of disks online and Disk does not support O_DIRECT #10206
Comments
What type of drives do you have? Do you have any failed drives? Can you share your mount options? Can you share the entire messages from the server console?
MinIO binaries are running in an LXC container (using Proxmox), exposing one folder. Backend storage is either Ceph/RBD or an LVM mount (the behavior is the same with both). They are mounted like this:
There is not much in the console. Here is the full output from a MinIO server showing the O_DIRECT messages:
Yes, this is a correct error: you can see we tried to write with O_DIRECT and it failed. More precisely, we got an invalid argument error for the I/O.
Yes, but because O_DIRECT seems to be working, the message seems wrong. Is there a way to get the original error? I don't see anything in the system logs.
@the-glu can you provide the output from all drives for the object?
I think I might have found the problem: it's a typo in the internal rename() call, @the-glu.
Fixed by #10208
Hello, thanks for the fix. However, I updated to the latest version and I still get the same errors during healing. Some examples:
With files like this:
Is this normal? Or is MinIO in a somehow corrupted state? Thanks
Extra details during healing:
This is perfectly fine. The issue shouldn't occur, though, unless you had a disk down and some inconsistent objects to begin with. There are of course ways to manually move things back and forth. Something else happened on your system that the healing code is not able to handle, i.e. missing entries. I will check.
Yes, I think there was one disk down at some point. If it's going to auto-heal somehow at some point it's fine :)
@the-glu this shouldn't happen, because this is exactly what the fix already addressed, unless you haven't upgraded all servers properly. Can you check that?
I can confirm they are all at the latest version. ^^' I tested the version in your PR and I can confirm that the O_DIRECT errors are indeed gone. I still had a few corrupted files, which I restored from backups, and it seems to work fine now.
Those files were not really corrupted but were instead in an unexpected state (a sort of quasi-unreadable state). I helped another user by copying the same content on the server side. I am glad it worked for you.
I tested on some less important files that were not restored, and I can confirm that copying them out with mc cp and copying them back is also a way to fix files in the unexpected state :) So for me it's fine with #10218. Thanks a lot for the help and the quick fixes!
Fixed by #10218
Hello,
I upgraded one of my clusters from minio.RELEASE.2020-04-23 to minio.RELEASE.2020-08-04 and I am seeing very strange behavior:
I'm trying to heal my cluster to ensure everything is ok.
Some files are failing with Read failed. Insufficient number of disks online.
When I try to look at files, they are strangely stored on disk:
I also see this error, randomly and not on all nodes:
I'm almost sure O_DIRECT is working: using dd with the flag worked, and a Python script calling os.open("mc", os.O_DIRECT | os.O_RDWR) returned a file descriptor.
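For anyone repeating this check: on Linux, getting a file descriptor back from os.open() with O_DIRECT does not by itself show that direct I/O works, because the write can still fail with EINVAL (for example when the buffer or length is not block-aligned). Below is a minimal sketch of a fuller probe that also attempts one aligned write. The function name probe_o_direct, the 4096-byte block size, and the returned tuple are assumptions made for illustration; this is not how MinIO itself performs the check.

```python
import errno
import mmap
import os
import tempfile


def probe_o_direct(directory, block_size=4096):
    """Hypothetical helper: return (open_ok, write_ok, detail) for `directory`."""
    flag = getattr(os, "O_DIRECT", None)
    if flag is None:
        # O_DIRECT is not exposed on every platform (e.g. macOS).
        return False, False, "os.O_DIRECT is not defined on this platform"

    # Create a scratch file on the filesystem under test.
    tmp_fd, path = tempfile.mkstemp(dir=directory)
    os.close(tmp_fd)
    try:
        try:
            fd = os.open(path, os.O_RDWR | flag)
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            return False, False, "open failed: " + name

        # An anonymous mmap is page-aligned and zero-filled, which gives
        # us a suitably aligned buffer for a direct write.
        buf = mmap.mmap(-1, block_size)
        try:
            os.write(fd, buf)
            return True, True, "aligned write succeeded"
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            return True, False, "write failed: " + name
        finally:
            buf.close()
            os.close(fd)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    # Point this at the directory MinIO exports; the temp dir is a placeholder.
    print(probe_o_direct(tempfile.gettempdir()))
```

If the open succeeds but the write fails, the detail string carries the errno name, which is more specific than a generic "does not support O_DIRECT" message.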
I'm not sure what is happening. Does anyone have any pointers?
Thanks!