[bug:1702043] Newly created files are inaccessible via FUSE #906
Comments
Time: 20200109T17:42:23
To set the trace log level, run these commands:
gluster volume set brick-log-level TRACE
gluster volume set client-log-level TRACE
Once the error happens, I would need all brick logs and the mount log.
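For reference, the full option names for these log levels appear later in this thread (`diagnostics.brick-log-level`, `diagnostics.client-log-level`); a sketch of the suggested trace setup, assuming the volume name `gv0` from the original report:

```shell
# Assumed volume name gv0; substitute your own.
# These are cluster-wide configuration commands and require a running
# gluster management daemon.
gluster volume set gv0 diagnostics.brick-log-level TRACE
gluster volume set gv0 diagnostics.client-log-level TRACE
```

Remember to set the levels back (e.g. to `INFO` or `ERROR`) after collecting the logs, since TRACE logging is very verbose.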
Time: 20200224T04:42:53
Can you share some updates if you are able to reproduce it? Thanks.
Are you still able to reproduce this?
Thank you for your contributions.
Closing this issue as there has been no update since my last comment on it. If this issue is still valid, feel free to reopen it.
We actually have a similar problem: writing 0.5 million 1 KB files into 1000 directories under GlusterFS 6.0 produces the same issues. It is probably a good idea to reopen this issue.
@kindofblue Do you have steps to recreate this issue? If yes, could you please share?
This issue can be created in two ways:
Note that after some time the problematic files can return to normal, so it is better to put the test in a loop to see the problem more easily.
I also include the gluster volume configuration below. I disabled most caching (write-behind etc.) to run in a "safe" setting and rule out other factors. The glusterfs version is glusterfs-6.0-30.1.el7rhgs and the volume is mounted with default options.
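The suggested loop can be sketched roughly as below. `MOUNT` and the file count are assumptions for illustration; pointed at a local directory the script should report zero inaccessible files, while on an affected FUSE mount the stat can fail:

```shell
#!/bin/sh
# Reproduction sketch: create many small files in a loop and
# immediately stat each one back. MOUNT is an assumption -- point it
# at your glusterfs FUSE mount to test a real volume.
MOUNT="${MOUNT:-/tmp/gv0-test}"
mkdir -p "$MOUNT"
errors=0
i=0
while [ "$i" -lt 100 ]; do
    f="$MOUNT/file_$i"
    # Write a 1 KB file, as in the workload described above.
    dd if=/dev/zero of="$f" bs=1k count=1 2>/dev/null
    # On an affected mount, this stat can fail even though the write
    # just succeeded.
    if ! stat "$f" >/dev/null 2>&1; then
        errors=$((errors + 1))
    fi
    i=$((i + 1))
done
echo "inaccessible files: $errors"
```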
@pranithk @xhernandez please check my above comment.
Hi, this is what I see now from box03:
The same folder is perfectly fine from box01 and box02:
Here my gluster volume config:
Here my config on client side:
As said before, unmounting / re-mounting the gluster volume temporarily fixes the problem:
Any ideas?
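The temporary workaround mentioned above amounts to remounting the volume on the affected client. The mount point and server name below are assumptions for illustration (`gv0` is the volume name from the original report, `cloud10-gl` one of the servers mentioned in this thread):

```shell
# Unmount and remount the glusterfs FUSE mount on the affected client.
# Paths and hostnames are illustrative assumptions.
umount /mnt/gv0
mount -t glusterfs cloud10-gl:/gv0 /mnt/gv0
```

This only clears the client-side state; as noted later in the thread, the problem comes back until the offending caching option is disabled.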
It just happened again. This time, from box02 I can see some broken directories:
From box01 and box03 the directory is perfectly fine:
On box02 I can see from log files:
And on cloud10-gl (one of the gluster servers) I see:
I can't find anything else concerning uuid bf82eeb6-4ee9-4a4e-8f75-c917f83a4ce5 in the other servers' log files.
Hi resposit,
while 4 other FUSE clients cannot access the corresponding directory.
As you mentioned, umount / mount of the glusterfs solves the problem for a while.
Directly after the activation, the above-shown directories and files were accessible even from the problematic clients, without umount / mount of the glusterfs.
Hi @diete-p BTW, this is the full list of my current settings:
Hi resposit,
Hi @diete-p Today I got the same problem again. This time I enabled "trace" logging on my clients. This is what I'm seeing from box03:
Seeing this in fuse client log file:
It looks like there is something wrong with nl-cache. I am not sure what it does exactly; I'll try to disable it.
nl-cache is a negative lookup cache. It was developed for use in Samba workloads, as far as I understand. If a file created through some other mount is accessed quickly through this mount, you may get this error, I think.
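The race described above can be illustrated roughly as below, assuming two clients of the same volume mounted at `/mnt/a` and `/mnt/b` (both paths are hypothetical):

```shell
# Client B looks up a name that does not exist yet; with
# performance.nl-cache on, the negative result is cached.
ls /mnt/b/newfile

# The file is then created through mount A.
touch /mnt/a/newfile

# Client B may still serve the cached negative lookup and fail with
# ENOENT until the cache entry expires, even though the file exists.
cat /mnt/b/newfile
```

Note that a later comment in this thread reports the problem is reproducible with a single mount as well, so the multi-mount race may not be the whole story.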
I enabled nl-cache after reading recommendations for best performance:
I'll see if it gets better. Unfortunately those errors come out randomly, so I have to wait and see if they appear again.
@pranithk you do not need to use multiple mounts to observe this issue; a single mount can reproduce it easily.
Hi resposit, the day you published your last post we faced this error one more time. I then just turned off performance.readdir-ahead and the error went away. Regarding my first message, it should now be clear that this is not the solution; it just triggers 'something' so that the clients can access files and directories again without umount / mount. Then I turned off performance.nl-cache, as you mentioned in your last post. Since then the error has not appeared anymore. Best regards.
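The two option changes described in this workaround can be sketched as follows (the volume name `gv0` is an assumption taken from the original report):

```shell
# Turn off the two caching options reported as triggering the issue.
# Requires a running gluster management daemon.
gluster volume set gv0 performance.readdir-ahead off
gluster volume set gv0 performance.nl-cache off

# Confirm the new values.
gluster volume get gv0 performance.readdir-ahead
gluster volume get gv0 performance.nl-cache
```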
Hi @diete-p
Thank you for your contributions.
Closing this issue as there has been no update since my last comment on it. If this issue is still valid, feel free to reopen it.
We are facing the same issue: certain files show up as ????? ???? under the gluster FUSE mount. However, the files are there in the underlying filesystem (individual brick).
Gluster version: 7.9 / Ubuntu 18.04
We also encountered the same problem. After the repair, the FUSE cache on the client cannot be cleared.
Same issue with us; both performance.nl-cache and performance.readdir-ahead are off.
URL: https://bugzilla.redhat.com/1702043
Creator: bio.erikson at gmail
Time: 20190422T19:45:11
Description of problem:
Newly created files/dirs will be inaccessible to the local FUSE mount after file IO is completed.
I have recently started to experience this problem after upgrading to gluster 6.0, and did not previously experience this problem.
I have two nodes running glusterfs, each with a FUSE mount pointed to localhost.
I have run into this problem with rsync, random file creation with dd, and mkdir/touch. I have noticed that files are accessible while being written to, and become inaccessible once the file IO is complete. It usually happens in 'chunks' of sequential files. After some period of time (>15 min) the problem resolves itself. The files on the local bricks list just fine with ls. The problematic files/dirs are accessible via FUSE mounts on other machines. Heal doesn't report any problems. Small-file workloads seem to make the problem worse. Overwriting existing files seems not to create problematic files.
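The mkdir/touch case described above can be sketched as a small script. `MOUNT` is an assumption; point it at the FUSE mount to exercise a real volume (on a local directory the check should always print "accessible"):

```shell
#!/bin/sh
# Sketch of the mkdir/touch reproduction: create a directory and a
# file, then immediately check whether the new file is visible.
MOUNT="${MOUNT:-/tmp/gv0-mnt}"
mkdir -p "$MOUNT"
dir="$MOUNT/testdir.$$"
mkdir "$dir"
touch "$dir/newfile"
# On an affected client this stat fails with ENOENT even though the
# create succeeded, until the problem resolves itself (>15 min).
if stat "$dir/newfile" >/dev/null 2>&1; then
    echo "accessible"
else
    echo "INACCESSIBLE"
fi
```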
Gluster Info
Volume Name: gv0
Type: Distributed-Replicate
Volume ID: ...
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
...
Options Reconfigured:
cluster.self-heal-daemon: enable
server.ssl: on
client.ssl: on
auth.ssl-allow: *
transport.address-family: inet
nfs.disable: on
user.smb: disable
performance.write-behind: on
diagnostics.latency-measurement: off
diagnostics.count-fop-hits: off
cluster.lookup-optimize: on
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.nl-cache: on
cluster.readdir-optimize: on
storage.build-pgfid: off
diagnostics.brick-log-level: ERROR
diagnostics.brick-sys-log-level: ERROR
diagnostics.client-log-level: ERROR
Client Log
The FUSE log is flooded with:
Version-Release number of selected component (if applicable):
apt list | grep gluster
bareos-filedaemon-glusterfs-plugin/stable 16.2.4-3+deb9u2 amd64
bareos-storage-glusterfs/stable 16.2.4-3+deb9u2 amd64
glusterfs-client/unknown 6.1-1 amd64 [upgradable from: 6.0-1]
glusterfs-common/unknown 6.1-1 amd64 [upgradable from: 6.0-1]
glusterfs-dbg/unknown 6.1-1 amd64 [upgradable from: 6.0-1]
glusterfs-server/unknown 6.1-1 amd64 [upgradable from: 6.0-1]
tgt-glusterfs/stable 1:1.0.69-1 amd64
uwsgi-plugin-glusterfs/stable,stable 2.0.14+20161117-3+deb9u2 amd64
How reproducible:
Always
Steps to Reproduce:
Actual results: