Ganesha server gets ABORT signal every 1-5 min #106

Closed
slavonnet opened this issue Aug 20, 2016 · 8 comments

slavonnet commented Aug 20, 2016

[root@xintel2 ~]# rpm -qa | grep gane
nfs-ganesha-2.3.0-1.el7.x86_64
nfs-ganesha-gluster-2.3.0-1.el7.x86_64
glusterfs-ganesha-3.7.13-1.el7.x86_64

Used as an HA cluster for Gluster, with 3 hosts.

As a quick fix I added Restart=on-abort to the systemd unit, but I still can't run long copy jobs.

dmesg:

[140718.922482] nfs: server 192.168.50.12 not responding, timed out
[140718.922498] nfs: server 192.168.50.12 not responding, timed out
[140721.906696] nfs: server 192.168.50.12 not responding, timed out
[141046.628418] nfs: RPC call returned error 22
[141046.628432] nfs: RPC call returned error 22
[141046.628434] nfs: RPC call returned error 22
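
The `error 22` in the dmesg lines above can be mapped to a symbolic name. This is a minimal sketch, assuming the kernel is reporting a standard errno value (the thread does not confirm this); `describe_errno` is a hypothetical helper name.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a numeric errno value."""
    # errno.errorcode maps numeric codes to their symbolic names.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# Assumption: the 22 in "RPC call returned error 22" is a plain errno.
print(describe_errno(22))
```

If that assumption holds, the client is seeing EINVAL (invalid argument) on its RPC calls once the server has stopped responding.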

Detailed logs:

20/08/2016 21:07:54 : epoch 57b89ba5 : xintel2 : ganesha.nfsd-20470[decoder] nfs_rpc_enqueue_req :RW LOCK :F_DBG :Released mutex 0x7f567f5dd218 (&wqe->lwe.mtx) at /builddir/build/BUILD/nfs-ganesha-2.3.0/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1321
20/08/2016 21:07:54 : epoch 57b89ba5 : xintel2 : ganesha.nfsd-20470[work-5] nfs_rpc_dequeue_req :RW LOCK :F_DBG :Released mutex 0x7f567f5dd218 (&wqe->lwe.mtx) at /builddir/build/BUILD/nfs-ganesha-2.3.0/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1487
20/08/2016 21:07:54 : epoch 57b89ba5 : xintel2 : ganesha.nfsd-20470[decoder] thr_decode_rpc_requests :DISP :DEBUG :exiting, stat=XPRT_IDLE
20/08/2016 21:07:54 : epoch 57b89ba5 : xintel2 : ganesha.nfsd-20470[work-5] nfs_rpc_dequeue_req :DISP :F_DBG :wqe wakeup 0x7f567f5dd210
20/08/2016 21:07:54 : epoch 57b89ba5 : xintel2 : ganesha.nfsd-20470[decoder] gsh_xprt_unref :DISP :F_DBG :DISP: FULLDEBUG: xprt 0x7f563c181e90 prerelease xp_requests=1 xp_refs=3 tag=thr_decode_rpc_requests line=1928
20/08/2016 21:07:54 : epoch 57b89ba5 : xintel2 : ganesha.nfsd-20470[work-5] nfs_rpc_dequeue_req :DISP :F_DBG :dequeue_req try qpair REQ_Q_CALL 0x7f567f3f8c48:0x7f567f3f8cb0
20/08/2016 21:07:54 : epoch 57b89ba5 : xintel2 : ganesha.nfsd-20470[decoder] gsh_xprt_unref :DISP :F_DBG :DISP: FULLDEBUG: xprt 0x7f563c181e90 postrelease xp_requests=1 xp_refs=2 tag=thr_decode_rpc_requests line=1928
20/08/2016 21:07:54 : epoch 57b89ba5 : xintel2 : ganesha.nfsd-20470[work-5] nfs_rpc_consume_req :DISP :F_DBG :try splice, qpair REQ_Q_CALL consumer qsize=0 producer qsize=0
20/08/2016 21:07:54 : epoch 57b89ba5 : xintel2 : ganesha.nfsd-20470[decoder] fridgethr_freeze :RW LOCK :F_DBG :Acquired mutex 0x7f567f5b2910 (&fr->mtx) at /builddir/build/BUILD/nfs-ganesha-2.3.0/src/support/fridgethr.c:369
20/08/2016 21:07:54 : epoch 57b89c7a : xintel2 : ganesha.nfsd-24170[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version /builddir/build/BUILD/nfs-ganesha-2.3.0/src, built at Oct 30 2015 12:03:29 on c1bk.rdu2.centos.org
20/08/2016 21:07:54 : epoch 57b89c7a : xintel2 : ganesha.nfsd-24171[main] SetLevelDebug :LOG :NULL :LOG: Setting log level for all components to NIV_FULL_DEBUG
20/08/2016 21:07:54 : epoch 57b89c7a : xintel2 : ganesha.nfsd-24171[main] proc_block :CONFIG :F_DBG :------ At (/etc/ganesha/ganesha.conf:17): commit LOG

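
The log above switches from PID 20470 with `epoch 57b89ba5` to PID 24170 with `epoch 57b89c7a`, which marks a restart. A rough sketch of how long the old process lived, assuming the `epoch` field is the server start time as a hexadecimal Unix timestamp (an assumption, though it matches the 21:07:54 log time at UTC+3); `epoch_to_unix` is a hypothetical helper.

```python
from datetime import datetime, timezone


def epoch_to_unix(epoch_hex: str) -> int:
    """Decode the hex 'epoch' field of a log line into Unix seconds."""
    return int(epoch_hex, 16)


# Epoch values copied from the log lines above.
old_start = epoch_to_unix("57b89ba5")  # PID 20470
new_start = epoch_to_unix("57b89c7a")  # PID 24170, after the restart

# Seconds between the two starts, i.e. how long the old process ran.
uptime_seconds = new_start - old_start
new_start_utc = datetime.fromtimestamp(new_start, tz=timezone.utc)

print(f"previous instance ran for {uptime_seconds} s")
print(f"new instance started at {new_start_utc.isoformat()}")
```

Under that assumption the previous instance ran for 213 seconds, about three and a half minutes, before it was replaced.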
slavonnet (Author) commented

type=ANOM_ABEND msg=audit(1471717188.581:112594): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=31889 comm="ganesha.nfsd" reason="memory violation" sig=6
type=ANOM_ABEND msg=audit(1471717214.001:112620): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=36578 comm="ganesha.nfsd" reason="memory violation" sig=6
type=ANOM_ABEND msg=audit(1471717642.271:112946): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=37118 comm="ganesha.nfsd" reason="memory violation" sig=6
type=ANOM_ABEND msg=audit(1471717707.481:112997): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=44448 comm="ganesha.nfsd" reason="memory violation" sig=6
type=ANOM_ABEND msg=audit(1471717814.734:113078): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=45590 comm="ganesha.nfsd" reason="memory violation" sig=6
type=ANOM_ABEND msg=audit(1471718055.017:113269): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=47528 comm="ganesha.nfsd" reason="memory violation" sig=6
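
The gaps between these abort records can be computed directly from the `msg=audit(...)` timestamps. This is a small sketch that uses the six records above as input; `abort_intervals` is a hypothetical helper name, and the regular expression assumes the `seconds.millis:serial` layout seen in these lines.

```python
import re

# The six ANOM_ABEND records quoted above.
AUDIT_LOG = """\
type=ANOM_ABEND msg=audit(1471717188.581:112594): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=31889 comm="ganesha.nfsd" reason="memory violation" sig=6
type=ANOM_ABEND msg=audit(1471717214.001:112620): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=36578 comm="ganesha.nfsd" reason="memory violation" sig=6
type=ANOM_ABEND msg=audit(1471717642.271:112946): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=37118 comm="ganesha.nfsd" reason="memory violation" sig=6
type=ANOM_ABEND msg=audit(1471717707.481:112997): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=44448 comm="ganesha.nfsd" reason="memory violation" sig=6
type=ANOM_ABEND msg=audit(1471717814.734:113078): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=45590 comm="ganesha.nfsd" reason="memory violation" sig=6
type=ANOM_ABEND msg=audit(1471718055.017:113269): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=47528 comm="ganesha.nfsd" reason="memory violation" sig=6
"""

# Captures the Unix timestamp inside msg=audit(<seconds.millis>:<serial>).
TIMESTAMP = re.compile(r"msg=audit\((\d+\.\d+):\d+\)")


def abort_intervals(text: str) -> list[float]:
    """Return the seconds elapsed between consecutive audit records."""
    stamps = [float(m.group(1)) for m in TIMESTAMP.finditer(text)]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


for gap in abort_intervals(AUDIT_LOG):
    print(f"{gap:.0f} s ({gap / 60:.1f} min)")
```

For these records the gaps range from roughly 25 seconds to about 7 minutes, broadly in line with the interval in the issue title.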

slavonnet (Author) commented

I also found a funny memfree bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1368739

slavonnet (Author) commented

Maybe this fixes it?

gluster/glusterfs#48

slavonnet (Author) commented

Also: https://bugzilla.redhat.com/show_bug.cgi?id=1368741

I have now switched to 3.8.2. After 1 hour under load, everything looks good.

soumyakoduri (Contributor) commented

Could you try disabling upcall and re-testing?

gluster v set <VOLNAME> features.cache-invalidation off (for all the volumes)
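
Applying the option to every volume could be scripted along these lines. This is only a sketch: it assumes the `gluster` CLI is on the PATH and that `gluster volume list` prints one volume name per line; the helper names are hypothetical, and the volume name `SSD` is just an example guessed from the log prefixes later in this thread.

```python
import subprocess


def cache_invalidation_commands(volumes, value="off"):
    """Build one 'gluster volume set' argv list per volume name."""
    return [
        ["gluster", "volume", "set", vol, "features.cache-invalidation", value]
        for vol in volumes
    ]


def list_volumes():
    """Ask the gluster CLI for volume names (needs a running glusterd)."""
    result = subprocess.run(
        ["gluster", "volume", "list"],
        check=True, capture_output=True, text=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def apply_to_all_volumes(value="off"):
    """Run the setting against every volume; not called automatically here."""
    for cmd in cache_invalidation_commands(list_volumes(), value):
        subprocess.run(cmd, check=True)


# Dry run with an example volume name: print the commands, do not execute.
for cmd in cache_invalidation_commands(["SSD"]):
    print(" ".join(cmd))
```

Calling `apply_to_all_volumes()` on one of the cluster nodes would run the commands for real; the dry run only prints them for review.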

slavonnet (Author) commented Aug 21, 2016

Is it normal to run with high availability this way?
With "features.cache-invalidation on":
On 3.8.2 it did not seem to be turned off.

-- erased ---

slavonnet (Author) commented

oVirt can't read metadata objects on a newly created domain:

[2016-08-21 16:50:30.982670] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-SSD-replicate-0: Unreadable subvolume -1 found with event generation 3 for gfid 772273e9-d950-4bcf-88aa-b92e8134ecd3. (Possible split-brain)
[2016-08-21 16:50:30.983469] E [MSGID: 109040] [dht-helper.c:1186:dht_migration_complete_check_task] 0-SSD-dht: <gfid:772273e9-d950-4bcf-88aa-b92e8134ecd3>: failed to lookup the file on SSD-dht [Stale file handle]
[

ffilz (Member) commented Jan 17, 2017

Can this issue be closed? I will close it next week if there is no response otherwise.

ffilz closed this as completed Feb 14, 2017
ffilz pushed a commit that referenced this issue Mar 3, 2018
(was16backport)
 * Remove xdr_array.c
 * inline xdr_array and xdr_vector
 * Replace xdr_[u_]int
 * Replace xdr_[u_]long
 * Remove unused xdr_[u_]quad
 * Remove unused xdr_[u_]hyper
 * Remove unused xdr_[u_]short
 * Remove unused xdr_[u_]char
 * Remove unused inline xdr_free and xdr_void
 * Merge inline xdr_union
 * Merge inline xdr_bool
 * Merge inline xdr_enum
 * Merge inline xdr_[u_]int.._t functions

Change-Id: Iaf1c6952f244468d92318def768f88b9370784ef
Signed-off-by: William Allen Simpson <william.allen.simpson@redhat.com>
madhuthorat pushed a commit to madhuthorat/nfs-ganesha that referenced this issue Mar 7, 2019