File lost, no handle? #362
Comments
The oplog you sent here shows that everything is OK on the MooseFS side. As for the file: you write "it happens sporadically", which means you don't have a foolproof way to repeat it. But do you have any scenario in which this happens more often than not?
Hello @chogata I am unfortunately not able to replicate it on demand, but it has happened repeatedly over the last several months of testing. The error 'Resource temporarily unavailable' appears to be EAGAIN (35) from 'man errno'. The issue affects multiple mfsmounts across multiple servers. It does not, however, affect every mfsmount. The difference seems to be whether there is software or processes running that interact with the files with write actions.

For example, there are fifteen different mount points over three servers through which the files can be accessed. Three of the mfsmounts, across two servers, show the error; all other mfsmounts work properly. The three mount points showing the issue are also the only ones that have write interactions with the files. From what I can tell, a write action failed when I tried to delete the searx-env folder and its subfolders. The issue then manifested on each of the mfsmounts where processes were running that also tried to write to the files. Mfsmounts where no writing is attempted do not seem affected. I've left the system running, with the issue still happening, so I can run further tests. Any ideas are most welcome.
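For reference, the errno symbol behind that message can be confirmed with a one-liner (a sketch; the numeric value is OS-specific, 35 on FreeBSD and 11 on Linux, but the message text is the same):

```shell
# Print the EAGAIN errno value and its message text.
# The number differs per OS (35 on FreeBSD, 11 on Linux), but the
# string matches the error seen on the mfsmount.
python3 -c 'import errno, os; print(errno.EAGAIN, os.strerror(errno.EAGAIN))'
```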
Hello @chogata, I've done additional testing over the last few weeks. From what I can tell, the file system action happens correctly on MooseFS, but the FUSE cache is being corrupted somehow. I've been able to replicate the issue several times this week by doing large amounts of small file transfers using rsync. I've also seen it happen when making several layers of nested folders using a script, though that only happened once. It seems like cache consistency with what's actually on MooseFS is lost. During my tests, I mounted MooseFS in two different ways.
The only fuse-related error I've found is:

I've now set all systems to use 'mfscachemode=DIRECT'. I haven't been able to replicate the issue over the last week when using direct mode, which is another indicator that this is a cache-related issue. I hope this helps to track down the root cause and find a resolution. For now, MooseFS doesn't appear safe to use with the default mfscachemode=FBSDAUTO.
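For anyone trying to reproduce this, the rsync workload mentioned above can be sketched roughly as follows (paths and the file count are made up for illustration; point DST at a real mfsmount to actually exercise the cache):

```shell
#!/bin/sh
# Sketch of a small-file rsync burst. SRC/DST are placeholder temp
# directories here; on a real cluster DST would be an mfsmount path.
SRC=$(mktemp -d)
DST=$(mktemp -d)
i=0
while [ "$i" -lt 500 ]; do
    printf 'payload %d\n' "$i" > "$SRC/file_$i"
    i=$((i + 1))
done
# Copy everything in one burst; fall back to cp if rsync is absent.
if command -v rsync >/dev/null 2>&1; then
    rsync -a "$SRC/" "$DST/"
else
    cp -R "$SRC/." "$DST/"
fi
echo "copied $(ls "$DST" | wc -l | tr -d ' ') files"
```

Repeating this copy/delete cycle against an mfsmount is the kind of rapid small-file manipulation that seemed to trigger the error.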
Thank you for all this new information - we will try to track the issue again.
This just happened again while using mfscachemode=DIRECT. The mount command is: mfsmount IP:/web -o mfspassword=password -o mfsmkdircopysgid=1 -o mfscachemode=DIRECT /storage/chunk

It happened while unzipping the Drupal 9 zip from https://www.drupal.org/project/drupal/releases/9.0.0. Here's the last bit of output from the unzip process:
I tried to delete the directory and received this.
Here's the mfsfileinfo for the sourcedialog.js file from three different physical machines all accessing the same export. The original mount:
Server two:
Server three:
I deleted the drupal-9.0.0 folder from a different mfs mount than the one the issue started on. It deleted just fine, even though it would not delete on the initial mount. I then attempted to unzip the archive again on the initial mfs mount. It failed the same way, but with a different file. Again, deleting it from a secondary mfs mount works. A few more tests show the same repeating pattern.

Lastly, I attempted the same unzip from the other two mount points. It failed each time, and now I can't delete the files without creating a fourth mfs mount or rebooting servers to clear it out. After rebooting the server, I was able to delete the files and run the unzip again; it worked correctly this time.

The systems were using nullfs mounts on top of the mfs one in this case. I've switched back to using multiple mfs mounts without nullfs. I've also left mfscachemode at its default since the issue has now happened in direct mode.

Other details:
It happened again this morning. I used rsync to copy a bunch of files over the network from a non-mfs disk to an mfs mount. After the copy, I attempted to delete some unneeded data that had been copied and received the 'Resource temporarily unavailable' error again. In this case, I was using mfscachemode=DIRECT and multiple mfs mounts on a single server. Only one of the mfs mounts exhibited the issue. Rebooting the server resolved it.
Okay, that's good to know. Our test instance is currently running a long batch of intense testing; it needs to finish before I can start testing this issue, so it might be a couple more days. But I do still have this on my to-do list.
I've run into additional issues that seem related.

1: Using mfscachemode=AUTO or FBSDAUTO on FreeBSD 12.1p6 leads to data corruption. Changes in files do not propagate to other mfs mount points. This is easily replicated. I can open a separate issue for this if needed. The logs show this message repeatedly:
2: Using mfscachemode=DIRECT exhibits the 'Resource temporarily unavailable' issue sporadically. I've found I can trigger this most often by doing large amounts of rapid file manipulation, for example using rsync on a large number of small files. Secondly, I've found that compiling programs such as Node.js directly on the mfs mount will eventually fail, leaving a zombie process. I don't see the 'Resource temporarily unavailable' error when that happens, but program compiling works when using AUTO/FBSDAUTO mode.

3: Using mfscachemode=NEVER seems to avoid issues 1 and 2. File changes propagate correctly and the 'Resource temporarily unavailable' issue does not happen.

At this point, I'm setting all the servers to mfscachemode=NEVER and I will continue to see whether I can replicate any of the above errors again.

Other information: The value of

Ex: I ran this three times in a row:
I'll report again if I can replicate any of the issues while mfscachemode=NEVER is set. Thanks for putting this on your list. Please let me know if there is anything else I can do to help.
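The propagation failure in issue 1 above boils down to writing through one mount and immediately reading through another. A sketch of that check (MNT_A/MNT_B are placeholders; here they point at one local directory just so the script runs standalone — on a real cluster they would be two mfsmounts of the same export on different machines):

```shell
#!/bin/sh
# Cross-mount propagation check (sketch). On a real cluster MNT_A and
# MNT_B are two mfsmounts of the same export; locally they are one
# directory, so the check trivially passes.
MNT_A=$(mktemp -d)
MNT_B=$MNT_A
echo "version-1" > "$MNT_A/probe.txt"
echo "version-2" > "$MNT_A/probe.txt"
# With mfscachemode=AUTO/FBSDAUTO the read below can return stale data.
got=$(cat "$MNT_B/probe.txt")
if [ "$got" = "version-2" ]; then
    echo "propagated OK"
else
    echo "STALE: read '$got'"
fi
```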
I've been trying to replicate your initial issue (Resource temporarily unavailable) for a couple of days now, with no success. No amount of copying/deleting, rsyncing/deleting or unpacking/deleting is yielding any bad effects, including trying to delete (rm -rf) a directory tree on one client while another is still actively writing to it. So, just to make sure: you reported this first on .112, but you then mentioned .113, so I assume the problem still exists on .113 for you?
It happened again last night. This time it was a simple WordPress plugin update that triggered it, and there was an additional error:
I've seen that error, and its opposite, on different days, but I hadn't associated them with this issue. Here's the opposite:
Here are the sysctl.conf values that differ from the defaults:
I've also experimented with using different loader.conf options, but the issue shows up either way.
This might be a problem, and might possibly explain the 'Resource temporarily unavailable' error. On a local file system on a single operating system, the kernel knows the current working directory (CWD) of every process, so it will not reuse the inode of a deleted directory while that directory is still some process's CWD. On a network filesystem like MooseFS, there is no single system that "knows" all CWDs. The OS on the machine with your first MooseFS mount does not know what happens on your other machine with your other MooseFS mount, so as long as a directory deleted from MooseFS is no longer the CWD of any process on this first machine, this OS thinks it's okay to reuse the inode number. But if it is still the CWD of a process on the second machine and we re-use it, we have a problem.

So it falls to MooseFS to "ask" (through clients) about all CWDs on all machines that have this instance of MooseFS mounted, keep this list, and make sure inodes from this list are never reused as long as they remain on the list. We had a lot of issues with that on some versions of the Linux kernel, but so far not on FreeBSD. However, we always run FreeBSD with the default security options, which do not include hiding other UIDs' processes... This calls for more tests; I will try to squeeze them in somewhere this week. I'm not saying this is for sure the case, but IF it is, it might turn out we just cannot support this setting. Tests will tell.

In the meantime, since this might turn out to be a different issue altogether, I would like to ask if you could mount your mounts with some cache options disabled, specifically these:
The mfsmount program is run as root, which can see all users' processes regardless of the security.bsd.see_* settings. I temporarily set the requested cache features to 0 on a single node; the network bandwidth jumped from an average of 20Mb/s to 100Mb/s. There is new hardware being deployed this month. I'm setting aside a server to run stress tests on, and I'll include testing with all the cache features set to 0 once the server is available.
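The deleted-CWD situation described above is easy to see even on a single machine with a local filesystem (a sketch, not MooseFS-specific): the kernel keeps the unlinked inode alive for the process sitting in it, but nothing new can be created there. On a network filesystem, a different machine has to be told not to reuse that inode.

```shell
#!/bin/sh
# A process whose working directory has been unlinked: the local kernel
# keeps the inode alive, but creating new files in it fails with ENOENT.
base=$(mktemp -d)
mkdir "$base/sub"
cd "$base/sub"
rm -rf "$base/sub"    # gone from the namespace, but still our CWD
if touch newfile 2>/dev/null; then
    echo "unexpected: file created in deleted directory"
else
    echo "create in deleted CWD failed, as expected"
fi
```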
Hello. Even with the new replacement hardware, this continues to happen. I can't keep a single server up for more than 48 hours without this issue occurring. What can I do to help track this down for a resolution?
I have tried many times to repeat your problem. I succeeded ONCE. I tried to repeat exactly what I did then, but no luck. We can try to raise the issue with FreeBSD, but I'm afraid without a repeatable scenario they won't be able to do anything about it... |
Hello @chogata I'm glad to hear you were able to replicate it at least once. May I ask what steps you took that exhibited the issue? I've noticed this problem has become more common as the traffic and file changes have increased. I'm still trying to find a way to manually replicate this, but I suspect a very specific combination of things has to happen. Thank you for sticking with it. I'll follow up once I have a way to replicate this.
This is still an issue in 3.0.114. Unfortunately, while it continues to happen regularly, I'm still unable to consistently replicate it by hand. |
Unfortunately, this is still an issue on 3.0.115 with FreeBSD 12.1-RELEASE-p10. Can you provide any additional troubleshooting advice? |
Hello @chogata I've tried testing mfsmount with the cache options disabled, as requested. I've continued testing the normal mount options and found an interesting issue. I've not been able to create the issue, or catch it happening, within twenty-four hours of a client server reboot. I can replicate it by hand after the server has been up for at least 24 hours, and the longer the server is up, the easier it is. My test is simply to extract a tarball on one mfsmount while deleting the extracted files from another mfsmount. Rinse, wash, repeat, and the issue will eventually show.

When it shows up, it happens all at once. It does not get better or worse after that; un-mounting mfs is the only resolution. Again, I can't make it happen if the servers have been up for less than 24 hours, and the closer I get to 48 hours, the easier it is to trigger. This is happening on servers with a fairly consistent workload. I wonder if perhaps some other type of limit is being hit, causing a disconnect between the mfs client and the underlying FUSE layer? CPU, memory, and bandwidth are all well below their limits. Here's the average bandwidth usage over the last 30 days for the various MooseFS-related systems:

Clients: 10 Mb/s

Of note, this continues to happen on FreeBSD 12.2 clients too.
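The by-hand test above can be scripted roughly like this (a sketch; MNT_A/MNT_B are placeholders for two mfsmounts of the same export — pointed at one local directory here so it runs standalone, where the loop completes cleanly):

```shell
#!/bin/sh
# Extract a tarball through one mount while deleting the extracted tree
# through another. MNT_A/MNT_B stand in for two mfsmounts of the same
# export; locally they are one directory, so no failure is expected.
MNT_A=$(mktemp -d)
MNT_B=$MNT_A
work=$(mktemp -d)
mkdir -p "$work/tree/a/b/c"
echo data > "$work/tree/a/b/c/f.txt"
tar -C "$work" -cf "$work/tree.tar" tree
n=0
while [ "$n" -lt 10 ]; do
    tar -C "$MNT_A" -xf "$work/tree.tar"
    if ! rm -rf "$MNT_B/tree"; then
        echo "delete failed (EAGAIN-style) at iteration $n"
        break
    fi
    n=$((n + 1))
done
echo "completed $n iterations"
```

On an affected client, the rm step is where the 'Resource temporarily unavailable' error eventually appears.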
Running the above test on a freshly rebooted client server took 28 hours and 16 minutes to trigger. Note, I was not running the test in an automated fashion; I was simply running it by hand whenever I had a few minutes.
This is on FreeBSD 12.2, which seems to be easier to trigger on than 12.1 was. Since I can replicate this somewhat reliably now, is there anything else that can be done to resolve it?
A bit of an explanation for anybody interested in this thread :) On a network filesystem it can happen that an inode gets deleted on one client (mount) while it is still used on another client (mount) on a different machine. If this inode is, for some reason, remembered in the kernel cache on this second machine, it cannot be re-used. There are many instances of such behaviour. One of them is when a directory deleted on one machine is the CWD - current working directory - of a process on another machine. Those inodes are reserved and cannot be re-used in MooseFS until they stop being a CWD. This is enough for all systems except FreeBSD, which, for some reason, also remembers the parents of those CWD directories and disallows their re-use. It's a very specific, FreeBSD-only behaviour and it took us quite some time to find it as the cause of the problems described in this thread. We introduced a special, FreeBSD-only workaround for this problem, so hopefully our FreeBSD users won't experience the 'resource temporarily unavailable' problem anymore with 3.0.116. We encourage everybody to test and share the results with us.
nice detective work. |
Have you read through available documentation and open Github issues?
Yes
Is this a BUG report, FEATURE request, or a QUESTION? Who is the intended audience?
BUG report
System information
Your moosefs version and its origin (moosefs.com, packaged by distro, built from source, ...).
All components built from the FreeBSD ports tree.
moosefs3-master-3.0.112_1
moosefs3-cgiserv-3.0.112_1
moosefs3-chunkserver-3.0.112_1
moosefs3-client-3.0.112_1
fusefs-libs3-3.9.1
Operating system (distribution) and kernel version.
FreeBSD 12.1p3
Hardware / network configuration, and underlying filesystems on master, chunkservers, and clients.
MooseFS master server: 1x
CPU: Intel(R) Xeon(R) CPU E5-2637 v2 @ 3.50GHz
RAM: 256GB
HW DISK: 4x 1.2TB in raidz1
Moosefs chunk servers: 5x identical
CPU: Intel(R) Xeon(R) CPU E5-2407 0 @ 2.20GHz
RAM: 32GB
HW DISK: 4x 8TB in raidz1.
Moose disk is a single folder using ZFS reservation and quotas mounted at /storage/chunk
DATA: storage/chunk 889G 18.1T 889G /storage/chunk
MooseFS client servers: 2x identical
CPU: Intel(R) Xeon(R) CPU E5-2667 0 @ 2.90GHz
RAM: 192GB
HW DISK: 4x 1.2TB configured as raidz1
Network
All servers are plugged in twice with 10Gb uplinks. The uplinks are configured as LACP across two Cisco switches in a stack. The network cards are Dell-branded Intel cards using the 'ix' driver. MTU is set to 9198.
How much data is tracked by moosefs master (order of magnitude)?
Describe the problem you observed.
It appears that the file handle can be lost, causing files and folders to become unmanageable. I can't replicate it manually; it happens sporadically. Here's the output from one such event.
The initial command and error happened during the deletion of a folder with multiple sub folders and files. Here's the one that triggered the issue this time.
This is the lower directory point from the above error.
This is what exists in the packaging folder.
I'm not sure what 'Resource temporarily unavailable' means, but the file cannot be managed by normal tools anymore.
Here's the oplog output during the ls -l command.
At this point, the entire /apps/searx-env/lib/python3.8/site-packages/setuptools/_vendor/packaging structure is undeletable because of the non-existent, yet existing, __pycache__ entry.
I've seen this with different files/folder combinations as well.
Other MooseFS mount points are not affected, even on the same file directory structure. For example, using a separate mount and running ls works.
The only fix I've found is to unmount the mfsmount with the issue and remount it. This may be related to #350 or #354, as it seems the file handle is being lost.