[bug:1618932] dht-selfheal.c: Directory selfheal failed #15

Open
amarts opened this issue Feb 23, 2020 · 1 comment
amarts commented Feb 23, 2020

bugzilla-URL: https://bugzilla.redhat.com/1618932
Created attachment 1476762
gfapi log

Description of problem:

Version-Release number of selected component (if applicable):

How reproducible:

There are multiple applications using gfapi, concurrently creating files in the same directory.
(e51fd83622674cc9) and (e21ea6832d2b13d0) are log tags from two different application processes.

Application log (timezone is GMT+8):

2018-08-18 19:35:03,703 DEBUG -31021968- writing to file cluster=4 FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0004_bfab2d1ea2da11e8a3196c92bf5c1b88 (app:1461)(e51fd83622674cc9)
2018-08-18 19:35:03,734 DEBUG -32369552- writing to file cluster=4 FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0001_bfafdf58a2da11e8a3196c92bf5c1b88 (app:1461)(e21ea6832d2b13d0)
2018-08-18 19:35:03,786 DEBUG -31022448- Create new directory [FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m] on cluster [4] ((unknown file): 0)(e51fd83622674cc9)
2018-08-18 19:35:03,795 CRITICAL -31021968- Failed to open cluster [4] object [FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/ 0004_bfab2d1ea2da11e8a3196c92bf5c1b88] with mode [w]: [[Errno 5] Input/output error] (app:1461)(e51fd83622674cc9)
2018-08-18 19:35:03,903 DEBUG -32366672- Directory [FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m] exists on cluster [4] ((unknown file): 0)(e21ea6832d2b13d0)
2018-08-18 19:35:03,945 DEBUG -32369552- Open cluster [4] file [FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0001_bfafdf58a2da11e8a3196c92bf5c1b88] with mode [w] (app:1461)(e21ea6832d2b13d0)
2018-08-18 19:35:04,127 DEBUG -31021968- Open cluster [4] file [FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0004_bfab2d1ea2da11e8a3196c92bf5c1b88] with mode [w] (app:1461)(e51fd83622674cc9)
2018-08-18 19:35:04,391 INFO -32369552- Rename file: cluster=4 src=FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0001_bfafdf58a2da11e8a3196c92bf5c1b88 dst=FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0001 (app:1461)(e21ea6832d2b13d0)
2018-08-18 19:35:04,485 INFO -31021968- Rename file: cluster=4 src=FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0004_bfab2d1ea2da11e8a3196c92bf5c1b88 dst=FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0004 (app:1461)(e51fd83622674cc9)
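Purely as an illustration (plain Python against a local filesystem, not the gfapi calls the applications actually use; directory and file names are made up): the two log streams above follow a create-directory-if-missing, write-to-temp-name, rename-to-final-name pattern, racing on the shared directory:

```python
import os
import tempfile
import threading

def create_in_shared_dir(base, dirname, filename):
    """Ensure the shared directory exists, then create a file in it,
    mirroring the two concurrent processes in the log above."""
    d = os.path.join(base, dirname)
    try:
        os.mkdir(d)          # both workers may race here ("Create new directory")
    except FileExistsError:  # the loser sees "Directory ... exists"
        pass
    tmp = os.path.join(d, filename + "_tmp")
    with open(tmp, "w") as f:  # open with mode [w], as in the log
        f.write("payload")
    os.rename(tmp, os.path.join(d, filename))  # "Rename file: src=... dst=..."

base = tempfile.mkdtemp()
threads = [
    threading.Thread(target=create_in_shared_dir, args=(base, "shared.m", name))
    for name in ("0001", "0004")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

On a local filesystem this race is harmless; in the report, the process that lost the mkdir race got EIO on its first open, which points at the DHT directory-selfheal path rather than the application pattern itself.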

Actual results:

An I/O error (EIO) happened when creating the file; the operation succeeded after a retry.

A dht-selfheal failure is observed in the gfapi log, and an unmatched inode-unlock
request is reported from the brick.
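The "success after retry" workaround can be expressed as a small generic wrapper; a minimal sketch, assuming the caller passes in whatever open function it uses (the `open_fn` parameter, retry count, and delay are illustrative assumptions, not from the report):

```python
import errno
import time

def open_with_retry(open_fn, path, mode="w", retries=3, delay=0.1):
    """Call open_fn(path, mode), retrying on transient EIO.

    Any other OSError, or EIO on the final attempt, is re-raised.
    """
    for attempt in range(retries):
        try:
            return open_fn(path, mode)
        except OSError as e:
            if e.errno != errno.EIO or attempt == retries - 1:
                raise
            time.sleep(delay)
```

This only papers over the symptom; the underlying selfheal failure still needs to be diagnosed on the server side.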

Expected results:

Additional info:

"gluster volume status" output is all OK,
but running "gluster volume heal vol0 info" blocks and produces no output.

gluster volume info

Volume Name: vol0
Type: Distributed-Replicate
Volume ID: 18e1c05d-570a-4c97-aa91-ef984881c4f2
Status: Started
Snapshot Count: 0
Number of Bricks: 36 x 3 = 108
Transport-type: tcp

Options Reconfigured:
locks.trace: false
client.event-threads: 6
cluster.self-heal-daemon: enable
performance.write-behind: True
transport.keepalive: True
cluster.rebal-throttle: lazy
server.event-threads: 4
performance.io-cache: False
nfs.disable: True
cluster.quorum-type: auto
network.ping-timeout: 120
features.cache-invalidation: False
performance.read-ahead: False
performance.client-io-threads: True
cluster.server-quorum-type: none
performance.md-cache-timeout: 0
performance.readdir-ahead: True

@Kazz3r24
Is there any fix for this? I'm currently suffering from this issue on 10.2, Ubuntu server 20.04.
