[bug:1618932] dht-selfheal.c: Directory selfheal failed #15

Open
amarts opened this issue Feb 23, 2020 · 1 comment
amarts commented Feb 23, 2020

bugzilla-URL: https://bugzilla.redhat.com/1618932
Created attachment 1476762
gfapi log

Description of problem:

Version-Release number of selected component (if applicable):

How reproducible:

There are multiple applications using gfapi, concurrently creating files in the same directory.
(e51fd83622674cc9) and (e21ea6832d2b13d0) are log tags from two different application processes.

Application log (timezone is GMT+8):

2018-08-18 19:35:03,703 DEBUG -31021968- writing to file cluster=4 FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0004_bfab2d1ea2da11e8a3196c92bf5c1b88 (app:1461)(e51fd83622674cc9)
2018-08-18 19:35:03,734 DEBUG -32369552- writing to file cluster=4 FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0001_bfafdf58a2da11e8a3196c92bf5c1b88 (app:1461)(e21ea6832d2b13d0)
2018-08-18 19:35:03,786 DEBUG -31022448- Create new directory [FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m] on cluster [4] ((unknown file): 0)(e51fd83622674cc9)
2018-08-18 19:35:03,795 CRITICAL -31021968- Failed to open cluster [4] object [FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/ 0004_bfab2d1ea2da11e8a3196c92bf5c1b88] with mode [w]: [[Errno 5] Input/output error] (app:1461)(e51fd83622674cc9)
2018-08-18 19:35:03,903 DEBUG -32366672- Directory [FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m] exists on cluster [4] ((unknown file): 0)(e21ea6832d2b13d0)
2018-08-18 19:35:03,945 DEBUG -32369552- Open cluster [4] file [FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0001_bfafdf58a2da11e8a3196c92bf5c1b88] with mode [w] (app:1461)(e21ea6832d2b13d0)
2018-08-18 19:35:04,127 DEBUG -31021968- Open cluster [4] file [FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0004_bfab2d1ea2da11e8a3196c92bf5c1b88] with mode [w] (app:1461)(e51fd83622674cc9)
2018-08-18 19:35:04,391 INFO -32369552- Rename file: cluster=4 src=FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0001_bfafdf58a2da11e8a3196c92bf5c1b88 dst=FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0001 (app:1461)(e21ea6832d2b13d0)
2018-08-18 19:35:04,485 INFO -31021968- Rename file: cluster=4 src=FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0004_bfab2d1ea2da11e8a3196c92bf5c1b88 dst=FS/rt/mbXx/service-log_0/0760dee6406533f5aefa43f83bdd8918_171654947375444628.m/0004 (app:1461)(e51fd83622674cc9)
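Purely as an illustration (plain Python against a local filesystem, not the gfapi calls the applications actually use; directory and file names are made up): the two log streams above follow a create-directory-if-missing, write-to-temp-name, rename-to-final-name pattern, racing on the shared directory:

```python
import os
import tempfile
import threading

def create_in_shared_dir(base, dirname, filename):
    """Ensure the shared directory exists, then create a file in it,
    mirroring the two concurrent processes in the log above."""
    d = os.path.join(base, dirname)
    try:
        os.mkdir(d)          # both workers may race here ("Create new directory")
    except FileExistsError:  # the loser sees "Directory ... exists"
        pass
    tmp = os.path.join(d, filename + "_tmp")
    with open(tmp, "w") as f:  # open with mode [w], as in the log
        f.write("payload")
    os.rename(tmp, os.path.join(d, filename))  # "Rename file: src=... dst=..."

base = tempfile.mkdtemp()
threads = [
    threading.Thread(target=create_in_shared_dir, args=(base, "shared.m", name))
    for name in ("0001", "0004")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

On a local filesystem this race is harmless; in the report, the process that lost the mkdir race got EIO on its first open, which points at the DHT directory-selfheal path rather than the application pattern itself.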

Actual results:

An I/O error (EIO) happened when creating the file; the operation succeeded after a retry.

A dht-selfheal failure is observed in the gfapi log, and an unmatched inode-unlock
request is reported from the brick.
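The "success after retry" workaround can be expressed as a small generic wrapper; a minimal sketch, assuming the caller passes in whatever open function it uses (the `open_fn` parameter, retry count, and delay are illustrative assumptions, not from the report):

```python
import errno
import time

def open_with_retry(open_fn, path, mode="w", retries=3, delay=0.1):
    """Call open_fn(path, mode), retrying on transient EIO.

    Any other OSError, or EIO on the final attempt, is re-raised.
    """
    for attempt in range(retries):
        try:
            return open_fn(path, mode)
        except OSError as e:
            if e.errno != errno.EIO or attempt == retries - 1:
                raise
            time.sleep(delay)
```

This only papers over the symptom; the underlying selfheal failure still needs to be diagnosed on the server side.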

Expected results:

Additional info:

"gluster volume status" output is all OK,
but running "gluster volume heal vol0 info" blocks and produces no output.

gluster volume info

Volume Name: vol0
Type: Distributed-Replicate
Volume ID: 18e1c05d-570a-4c97-aa91-ef984881c4f2
Status: Started
Snapshot Count: 0
Number of Bricks: 36 x 3 = 108
Transport-type: tcp

Options Reconfigured:
locks.trace: false
client.event-threads: 6
cluster.self-heal-daemon: enable
performance.write-behind: True
transport.keepalive: True
cluster.rebal-throttle: lazy
server.event-threads: 4
performance.io-cache: False
nfs.disable: True
cluster.quorum-type: auto
network.ping-timeout: 120
features.cache-invalidation: False
performance.read-ahead: False
performance.client-io-threads: True
cluster.server-quorum-type: none
performance.md-cache-timeout: 0
performance.readdir-ahead: True

@Kazz3r24
Is there any fix for this? I'm currently suffering from this issue on 10.2, Ubuntu server 20.04.
