linkfile: use fops->create in dht_linkfile_create #4585

Open
wants to merge 1 commit into
base: devel

Conversation

chen1195585098
Contributor

Because of a race, a newly created linkfile can be deleted, after which the gfids of the linkfile and the datafile may no longer match. The mismatch also breaks shard deletion: most shards under BRICK_PATH/.shard cannot be removed correctly, so their space is never released.

To fix this race, replace mknod with create during linkfile creation, so that the linkfile stays open for the whole linkfile and datafile creation sequence.

The linkfile is then no longer regarded as stale and deleted, because its fd is open and posix_unlink can handle that case properly.

Fixes: #4548

Fixes: gluster#4548
Signed-off-by: chenjinhao <chen.jinhao@zte.com.cn>
@gluster-ant
Collaborator

Can one of the admins verify this patch?

@chen1195585098
Contributor Author

During linkfile and datafile creation, the linkfile is kept open.

[root@chenjinhao-pc ~]# ps aux|grep glusterfsd
root       44498  0.0  0.0 1221832 23704 ?       SLsl 15:30   0:00 /usr/local/sbin/glusterfsd -s 192.168.122.211 --volfile-id issue.192.168.122.211.brick1 -p /var/run/gluster/vols/issue/192.168.122.211-brick1.pid -S /var/run/gluster/605c42f9050e3ce0.socket --brick-name /brick1 -l /var/log/glusterfs/bricks/brick1.log --xlator-option *-posix.glusterd-uuid=0f58ea27-f808-47b2-9e92-5aed82a3091f --process-name brick --brick-port 59028 --xlator-option issue-server.listen-port=59028
root       44514  0.0  0.0 1221540 23652 ?       SLsl 15:30   0:00 /usr/local/sbin/glusterfsd -s 192.168.122.211 --volfile-id issue.192.168.122.211.brick2 -p /var/run/gluster/vols/issue/192.168.122.211-brick2.pid -S /var/run/gluster/a152b698ffdd88a0.socket --brick-name /brick2 -l /var/log/glusterfs/bricks/brick2.log --xlator-option *-posix.glusterd-uuid=0f58ea27-f808-47b2-9e92-5aed82a3091f --process-name brick --brick-port 51591 --xlator-option issue-server.listen-port=51591

[root@chenjinhao-pc ~]# ll /proc/44498/fd|grep -w 9
l-wx------. 1 root root 64 Jun 18 15:32 272 -> /brick1/9
lrwx------. 1 root root 64 Jun 18 15:32 9 -> socket:[94888]
[root@chenjinhao-pc ~]# ll /proc/44514/fd|grep -w 9
l-wx------. 1 root root 64 Jun 18 15:32 272 -> /brick2/9
lrwx------. 1 root root 64 Jun 18 15:32 9 -> socket:[108826]
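
The same check as the `ll /proc/<pid>/fd` listing above can be done programmatically. The following is a rough sketch in C, assuming a Linux /proc filesystem; `count_fds_pointing_to` is a hypothetical helper written for this comment and is not part of GlusterFS.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Count the entries in fd_dir (e.g. "/proc/44498/fd" or "/proc/self/fd")
 * whose symlink target is `path`. Returns -1 if fd_dir cannot be opened.
 */
static int count_fds_pointing_to(const char *fd_dir, const char *path)
{
    assert(fd_dir != NULL && path != NULL);

    /* Canonicalize the path; fall back to the literal string on failure. */
    char wanted[PATH_MAX];
    if (realpath(path, wanted) == NULL)
        snprintf(wanted, sizeof(wanted), "%s", path);

    DIR *dir = opendir(fd_dir);
    if (dir == NULL)
        return -1;

    int matches = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] == '.')
            continue; /* skip "." and ".." */

        char link[PATH_MAX];
        char target[PATH_MAX];
        int n = snprintf(link, sizeof(link), "%s/%s", fd_dir, ent->d_name);
        if (n < 0 || (size_t)n >= sizeof(link))
            continue;

        /* Each fd entry is a symlink to the file, socket or pipe it refers to. */
        ssize_t len = readlink(link, target, sizeof(target) - 1);
        if (len < 0)
            continue;
        target[len] = '\0';

        if (strcmp(target, wanted) == 0)
            matches++;
    }
    closedir(dir);
    return matches;
}
```

For the brick process in the listing, a call such as `count_fds_pointing_to("/proc/44498/fd", "/brick1/9")` would be expected to return a non-zero count while the linkfile is held open, provided the caller has permission to read that directory.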

A newly created linkfile is not deleted, because its fd is open and the open-fd-key-status flag is set accordingly, so posix_unlink can handle this properly.

[root@chenjinhao-pc 1]# grep -rn posix_unlink /var/log/glusterfs/bricks/brick{1,2}.log|grep "2025-06-18"
/var/log/glusterfs/bricks/brick1.log:35587:[2025-06-18 07:33:36.604831 +0000] I [MSGID: 113030] [posix-entry-ops.c:1401:posix_unlink] 0-issue-posix: open-fd-key-status: 1 for /brick1/9 
/var/log/glusterfs/bricks/brick1.log:35591:The message "I [MSGID: 113030] [posix-entry-ops.c:1401:posix_unlink] 0-issue-posix: open-fd-key-status: 1 for /brick1/9" repeated 2 times between [2025-06-18 07:33:36.604831 +0000] and [2025-06-18 07:33:36.608125 +0000]
/var/log/glusterfs/bricks/brick2.log:34740:[2025-06-18 07:33:36.604900 +0000] I [MSGID: 113030] [posix-entry-ops.c:1401:posix_unlink] 0-issue-posix: open-fd-key-status: 1 for /brick2/9 
/var/log/glusterfs/bricks/brick2.log:34744:The message "I [MSGID: 113030] [posix-entry-ops.c:1401:posix_unlink] 0-issue-posix: open-fd-key-status: 1 for /brick2/9" repeated 2 times between [2025-06-18 07:33:36.604900 +0000] and [2025-06-18 07:33:36.608189 +0000]
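
For anyone who wants to check these brick log lines in a script rather than by eye, here is a small sketch in C that pulls the value after `open-fd-key-status: ` out of a log line. `parse_open_fd_key_status` is a hypothetical helper, and the line layout is assumed to match the output above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the integer that follows "open-fd-key-status: " in a log line,
 * or -1 if the key is missing or is not followed by a number.
 */
static long parse_open_fd_key_status(const char *line)
{
    static const char key[] = "open-fd-key-status: ";
    assert(line != NULL);

    const char *p = strstr(line, key);
    if (p == NULL)
        return -1;
    p += sizeof(key) - 1; /* skip past the key itself */

    char *end;
    long value = strtol(p, &end, 10);
    return (end == p) ? -1 : value;
}
```

Applied to the brick1 and brick2 lines above, the helper returns 1, the value logged for both linkfiles.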

@chen1195585098
Contributor Author

chen1195585098 commented Jun 18, 2025

@pranithk @mohit84 Hi, can you have a look at this? Thanks.

Successfully merging this pull request may close these issues.

concurrent creates when min-free-disk is exceeded can lead to gfid mismatches between linkto and data files