Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XrdHttp crash when dealing with incomplete headers #1176

Closed
esindril opened this issue Apr 12, 2020 · 17 comments · Fixed by #1228
Closed

XrdHttp crash when dealing with incomplete headers #1176

esindril opened this issue Apr 12, 2020 · 17 comments · Fixed by #1228
Assignees

Comments

@esindril
Copy link
Contributor

This crash happened in one of the DOMA tests. Apparently, there an issue with the "state machine" represented by the CurrentReq.reqstate since its value is "-1" and no external handler is called for the HTTP request but also no Bridge object is created. Note there is an external handler that should deal with such GET requests but since reqstate is non-0 this never gets invoked. Below you can find a backtrace of the crash and other relevant information. This happens with XRootD 4.11.3. A full coredump is available here:

lxfsre09a02.cern.ch:/var/spool/abrt/ccpp-2020-04-09-20:06:50-3949

The backtrace:

(gdb) f 0
#0  0x00007f2cfdae8e8f in XrdHttpReq::ProcessHTTPReq (this=this@entry=0x7f2cdd8a9520) at /usr/src/debug/xrootd-4.11.3/src/XrdHttp/XrdHttpReq.cc:1240
1240                if (!prot->Bridge->Run((char *) &xrdreq, 0, 0)) {
(gdb) bt
#0  0x00007f2cfdae8e8f in XrdHttpReq::ProcessHTTPReq (this=this@entry=0x7f2cdd8a9520) at /usr/src/debug/xrootd-4.11.3/src/XrdHttp/XrdHttpReq.cc:1240
#1  0x00007f2cfdae1105 in XrdHttpProtocol::Process (this=0x7f2cdd8a9400, lp=<optimized out>) at /usr/src/debug/xrootd-4.11.3/src/XrdHttp/XrdHttpProtocol.cc:988
#2  0x00007f2d00845a59 in XrdLink::DoIt (this=0x7f2ca6385818) at /usr/src/debug/xrootd-4.11.3/src/Xrd/XrdLink.cc:441
#3  0x00007f2d00848def in XrdScheduler::Run (this=0x610e78 <XrdMain::Config+440>) at /usr/src/debug/xrootd-4.11.3/src/Xrd/XrdScheduler.cc:357
#4  0x00007f2d00848f39 in XrdStartWorking (carg=<optimized out>) at /usr/src/debug/xrootd-4.11.3/src/Xrd/XrdScheduler.cc:87
#5  0x00007f2d0080ea67 in XrdSysThread_Xeq (myargs=0x7f2caac5cde0) at /usr/src/debug/xrootd-4.11.3/src/XrdSys/XrdSysPthread.cc:86
#6  0x00007f2d003c2e65 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f2cff6c488d in clone () from /lib64/libc.so.6
(gdb) p prot->Bridge
$10 = (XrdXrootd::Bridge *) 0x0
(gdb) f 1
#1  0x00007f2cfdae1105 in XrdHttpProtocol::Process (this=0x7f2cdd8a9400, lp=<optimized out>) at /usr/src/debug/xrootd-4.11.3/src/XrdHttp/XrdHttpProtocol.cc:988
988      rc = CurrentReq.ProcessHTTPReq();
(gdb) p CurrentReq.reqstate
$11 = -1
(gdb) p CurrentReq
$12 = {<XrdXrootd::Bridge::Result> = {_vptr.Result = 0x7f2cfdcf5ad0 <vtable for XrdHttpReq+16>}, httpStatusCode = 0, httpStatusText = "", m_transfer_encoding_chunked = false, m_current_chunk_offset = 0,
  m_current_chunk_size = -1, prot = 0x7f2cdd8a9400, ralist = 0x0, request = XrdHttpReq::rtGET, requestverb = "GET", allheaders = std::map with 6 elements = {["Accept-Encoding"] = "gzip,deflate",
    ["Authorization"] = "Bearer MDAxNGxvY2F0aW9uIGVvc3BwcwowMDM0aWRlbnRpZmllciA3ODUxMWY3YS1jOGQ4LTQ2MDAtODFhMS1kZTgxMGRjMzg5NGEKMDAxNmNpZCBuYW1lOnRiZWVybWFuCjAwNTJjaWQgYWN0aXZpdHk6UkVBRF9NRVRBREFUQSxVUExPQUQsRE9XTkxPQUQsREVMR"..., ["Connection"] = "Keep-Alive", ["Host"] = "lxfsre09a02.cern.ch:9001", ["User-Agent"] = "dCache/5.2.16", ["Want-Digest"] = "adler32;q=1, md5;q=0.8"}, resource = {
    _vptr.XrdOucString = 0x7f2d00a6dc10 <vtable for XrdOucString+16>, str = 0x7f2c9dc36150 "/eos/opstest/tpc/https/domatest/file4_cfad65f5-4487-4901-9f0e-af2d20f95c7a", len = 74, siz = 75,
    static blksize = -1}, opaque = 0x7f2c8fa3f400, resourceplusopaque = {_vptr.XrdOucString = 0x7f2d00a6dc10 <vtable for XrdOucString+16>,
    str = 0x7f2cdd96b800 "/eos/opstest/tpc/https/domatest/file4_cfad65f5-4487-4901-9f0e-af2d20f95c7a?encURI=%26cap.sym%3Dv%2FTsLkR%2BAYK5TGUAdartlwDahHY%3D%26cap.msg%3DEHPrLOVsWhX%2FV5EHdHgLLXWEibD9RoqwyG9fTh%2BVylyiMajoSE0x7O"..., len = 947, siz = 964, static blksize = -1}, headerok = true, rwOps = std::vector of length 0, capacity 0, rwOps_split = std::vector of length 0, capacity 0, keepalive = true,
  length = 0, depth = 0, sendcontinue = false, host = "lxfsre09a02.cern.ch:9001", destination = "", m_req_digest = "adler32;q=1, md5;q=0.8", m_resource_with_digest = {
    _vptr.XrdOucString = 0x7f2d00a6dc10 <vtable for XrdOucString+16>, str = 0x7f2cdd80e098 "", len = 0, siz = 1, static blksize = -1}, m_digest_header = "", hdr2cgistr = "",
  m_appended_hdr2cgistr = false, rwOpDone = 0, rwOpPartialDone = 0, xrdreq = {header = {streamid = "\000", requestid = 47883, body = '\000' <repeats 15 times>, dlen = 0}, admin = {streamid = "\000",
      requestid = 47883, reserved = '\000' <repeats 15 times>, dlen = 0}, auth = {streamid = "\000", requestid = 47883, reserved = '\000' <repeats 11 times>, credtype = "\000\000\000", dlen = 0},
    bind = {streamid = "\000", requestid = 47883, sessid = '\000' <repeats 15 times>, dlen = 0}, chmod = {streamid = "\000", requestid = 47883, reserved = '\000' <repeats 13 times>, mode = 0, dlen = 0},
    close = {streamid = "\000", requestid = 47883, fhandle = "\000\000\000", fsize = 0, reserved = "\000\000\000", dlen = 0}, decrypt = {streamid = "\000", requestid = 47883, expectrid = 0,
      version = 0 '\000', flags = 0 '\000', reserved = '\000' <repeats 11 times>, dlen = 0}, dirlist = {streamid = "\000", requestid = 47883, reserved = '\000' <repeats 14 times>, options = "",
      dlen = 0}, endsess = {streamid = "\000", requestid = 47883, sessid = '\000' <repeats 15 times>, dlen = 0}, getfile = {streamid = "\000", requestid = 47883, options = 0,
      reserved = "\000\000\000\000\000\000\000", buffsz = 0, dlen = 0}, locate = {streamid = "\000", requestid = 47883, options = 0, reserved = '\000' <repeats 13 times>, dlen = 0}, login = {
      streamid = "\000", requestid = 47883, pid = 0, username = "\000\000\000\000\000\000\000", reserved = 0 '\000', ability = 0 '\000', capver = "", role = "", dlen = 0}, mkdir = {streamid = "\000",
      requestid = 47883, options = "", reserved = '\000' <repeats 12 times>, mode = 0, dlen = 0}, mv = {streamid = "\000", requestid = 47883, reserved = '\000' <repeats 13 times>, arg1len = 0,
      dlen = 0}, open = {streamid = "\000", requestid = 47883, mode = 0, options = 0, reserved = '\000' <repeats 11 times>, dlen = 0}, ping = {streamid = "\000", requestid = 47883,
      reserved = '\000' <repeats 15 times>, dlen = 0}, prepare = {streamid = "\000", requestid = 47883, options = 0 '\000', prty = 0 '\000', port = 0, optionX = 0,
      reserved = "\000\000\000\000\000\000\000\000\000", dlen = 0}, protocol = {streamid = "\000", requestid = 47883, clientpv = 0, flags = 0 '\000',
      reserved = "\000\000\000\000\000\000\000\000\000\000", dlen = 0}, putfile = {streamid = "\000", requestid = 47883, options = 0, reserved = "\000\000\000\000\000\000\000", buffsz = 0, dlen = 0},
    query = {streamid = "\000", requestid = 47883, infotype = 0, reserved1 = "\000", fhandle = "\000\000\000", reserved2 = "\000\000\000\000\000\000\000", dlen = 0}, read = {streamid = "\000",
      requestid = 47883, fhandle = "\000\000\000", offset = 0, rlen = 0, dlen = 0}, readv = {streamid = "\000", requestid = 47883, reserved = '\000' <repeats 14 times>, pathid = 0 '\000', dlen = 0},
    rm = {streamid = "\000", requestid = 47883, reserved = '\000' <repeats 15 times>, dlen = 0}, rmdir = {streamid = "\000", requestid = 47883, reserved = '\000' <repeats 15 times>, dlen = 0}, set = {
      streamid = "\000", requestid = 47883, reserved = '\000' <repeats 14 times>, modifier = 0 '\000', dlen = 0}, sigver = {streamid = "\000", requestid = 47883, expectrid = 0, version = 0 '\000',
      flags = 0 '\000', seqno = 0, crypto = 0 '\000', rsvd2 = "\000\000", dlen = 0}, stat = {streamid = "\000", requestid = 47883, options = 0 '\000',
      reserved = "\000\000\000\000\000\000\000\000\000\000", fhandle = "\000\000\000", dlen = 0}, sync = {streamid = "\000", requestid = 47883, fhandle = "\000\000\000",
      reserved = '\000' <repeats 11 times>, dlen = 0}, truncate = {streamid = "\000", requestid = 47883, fhandle = "\000\000\000", offset = 0, reserved = "\000\000\000", dlen = 0}, write = {
      streamid = "\000", requestid = 47883, fhandle = "\000\000\000", offset = 0, pathid = 0 '\000', reserved = "\000\000", dlen = 0}, writev = {streamid = "\000", requestid = 47883, options = 0 '\000',
      reserved = '\000' <repeats 14 times>, dlen = 0, static doSync = 1}}, xrdresp = kXR_ok, xrderrcode = kXR_noErrorYet, etext = "", redirdest = {
    _vptr.XrdOucString = 0x7f2d00a6dc10 <vtable for XrdOucString+16>, str = 0x7f2cdd80e088 "", len = 0, siz = 1, static blksize = -1}, iovP = 0x0, iovN = 0, iovL = 0, final = false, filesize = 0,
  fileflags = 0, filemodtime = 0, fhandle = "\000\000\000", fopened = false, stringresp = "", reqstate = -1, writtenbytes = 0}
(gdb)

The actual log lines corresponding to this request:

200409 20:06:49 14876 ?:59@umfs06.aglt2.org sysXrdHttp: received dlen: 16
200409 20:06:49 14876 ?:59@umfs06.aglt2.org sysXrdHttp: received dump: 71 69 84 32 47 101 111 115 47 111 112 115 116 101 115 00
200409 20:06:49 14876 ?:59@umfs06.aglt2.org sysXrdHttp: Protocol matched. https: 0
200409 20:06:49 14876 ?:59@umfs06.aglt2.org sysXrdHttp:  Process. lp:0x7f2ca6385818 reqstate: 0
200409 20:06:49 14876 ?:59@umfs06.aglt2.org sysXrdHttp:  Setting host: [::ffff:192.41.230.26]
200409 20:06:49 14876 sysXrdHttp: getDataOneShot BuffAvailable: 1048576 maxread: 1048576
200409 20:06:49 14876 sysXrdHttp: read 1448 of 1048576 bytes
200409 20:06:49 14876 sysXrdHttp:  rc:962 got hdr line: GET /eos/opstest/tpc/https/domatest/file4_cfad65f5-4487-4901-9f0e-af2d20f95c7a?encURI=%26cap.sym%3Dv%2FTsLkR%2BAYK5TGUAdartlwDahHY%3D%26cap.msg%3DEHPrLOVsWhX%2FV5EHdHgLLXWEibD9RoqwyG9fTh%2BVylyiMajoSE0x7OiUlf630hWmuQQ13VLqh8%2FXxc5VS0VgaxQmDYDaZFYxUMbRvQR135ukoGAsHrrKsxXLo6gmQr2H%2BPHYLEU8xZp9wQXIQnOS8IsE545QWMLXpLnQjHvcJHd24aixCEEj8qPsWRaCGAl%2BxmaLX3IgZGvdNhSeac1tWW17AtPqPv37aE6KlFkvEOmQv42DGEtjckhTgQ4%2BVx9%2BHyDOwYgRpalieGEUqy%2F1Rj5zORWA3E7qbWsHBojBaIAB4pL4BNdNNkkHFvnF5k9ECME%2BkL%2Bhf6K5iXcPN%2F06BWDpRk9YV1jIRzYC0zffHbyQHH04p%2FNLhzUsV6r1%2FaxH00X3lWwEkWrImcl8obM7DzZeDBXHct8KvddyDXvYv95PTsfFxK5OA0icR6y1qSHbtQ%2FPl44Vf9x4lgRBbfaBruo0WMTiY3zFc93Y67Aw0fKygeQ6cM5EzEm2ZjG2PO3tIaNCRbfgya73gS5GWICUkf36OhTzdMBl587OH12l9Us%3D%26mgm.logid%3De2392620-7a8c-11ea-8e3e-fa163ec4aef9%26mgm.blockchecksum%3Dignore%26mgm.replicaindex%3D0%26mgm.replicahead%3D0%26mgm.etag%3D%221275783243513200640:88708a07%22%26mgm.id%3D11b47e19c%26mgm.mtime%3D1586451760 HTTP/1.1
200409 20:06:49 14876 sysXrdHttp:  Parsing first line: GET /eos/opstest/tpc/https/domatest/file4_cfad65f5-4487-4901-9f0e-af2d20f95c7a?encURI=%26cap.sym%3Dv%2FTsLkR%2BAYK5TGUAdartlwDahHY%3D%26cap.msg%3DEHPrLOVsWhX%2FV5EHdHgLLXWEibD9RoqwyG9fTh%2BVylyiMajoSE0x7OiUlf630hWmuQQ13VLqh8%2FXxc5VS0VgaxQmDYDaZFYxUMbRvQR135ukoGAsHrrKsxXLo6gmQr2H%2BPHYLEU8xZp9wQXIQnOS8IsE545QWMLXpLnQjHvcJHd24aixCEEj8qPsWRaCGAl%2BxmaLX3IgZGvdNhSeac1tWW17AtPqPv37aE6KlFkvEOmQv42DGEtjckhTgQ4%2BVx9%2BHyDOwYgRpalieGEUqy%2F1Rj5zORWA3E7qbWsHBojBaIAB4pL4BNdNNkkHFvnF5k9ECME%2BkL%2Bhf6K5iXcPN%2F06BWDpRk9YV1jIRzYC0zffHbyQHH04p%2FNLhzUsV6r1%2FaxH00X3lWwEkWrImcl8obM7DzZeDBXHct8KvddyDXvYv95PTsfFxK5OA0icR6y1qSHbtQ%2FPl44Vf9x4lgRBbfaBruo0WMTiY3zFc93Y67Aw0fKygeQ6cM5EzEm2ZjG2PO3tIaNCRbfgya73gS5GWICUkf36OhTzdMBl587OH12l9Us%3D%26mgm.logid%3De2392620-7a8c-11ea-8e3e-fa163ec4aef9%26mgm.blockchecksum%3Dignore%26mgm.replicaindex%3D0%26mgm.replicahead%3D0%26mgm.etag%3D%221275783243513200640:88708a07%22%26mgm.id%3D11b47e19c%26mgm.mtime%3D1586451760 HTTP/1.1
200409 20:06:49 14876 sysXrdHttp:  rc:37 got hdr line: Want-Digest: adler32;q=1, md5;q=0.8
200409 20:06:49 14876 ?:59@umfs06.aglt2.org sysXrdHttp:  rc:0Header not yet complete.
200409 20:06:49 3980 sysXrdHttp: Sending 1048576 bytes
200409 20:06:49 14927 ?:59@umfs06.aglt2.org sysXrdHttp:  Process. lp:0x7f2ca6385818 reqstate: -1
200409 20:06:49 14927 sysXrdHttp: getDataOneShot BuffAvailable: 1047128 maxread: 1047128
200409 20:06:49 14927 sysXrdHttp: read 195 of 1047128 bytes
200409 20:06:49 14927 sysXrdHttp:  rc:528 got hdr line: Authorization: Bearer MDAxNGxvY2F0aW9uIGVvc3BwcwowMDM0aWRlbnRpZmllciA3ODUxMWY3YS1jOGQ4LTQ2MDAtODFhMS1kZTgxMGRjMzg5NGEKMDAxNmNpZCBuYW1lOnRiZWVybWFuCjAwNTJjaWQgYWN0aXZpdHk6UkVBRF9NRVRBREFUQSxVUExPQUQsRE9XTkxPQUQsREVMRVRFLE1BTkFHRSxVUERBVEVfTUVUQURBVEEsTElTVAowMDFmY2lkIGFjdGl2aXR5OkRPV05MT0FELExJU1QKMDA1OGNpZCBwYXRoOi9lb3Mvb3BzdGVzdC90cGMvaHR0cHMvZG9tYXRlc3QvZmlsZTRfY2ZhZDY1ZjUtNDQ4Ny00OTAxLTlmMGUtYWYyZDIwZjk1YzdhCjAwMjRjaWQgYmVmb3JlOjIwMjAtMDQtMDlUMjA6NDM6NDZaCjAwMmZzaWduYXR1cmUgA0mewhQwo5ODcWyejOZ05Uo67s7XJdi380Tf53KAFx4K
200409 20:06:49 14927 sysXrdHttp:  rc:32 got hdr line: Host: lxfsre09a02.cern.ch:9001
200409 20:06:49 14927 sysXrdHttp:  rc:24 got hdr line: Connection: Keep-Alive
200409 20:06:49 14927 sysXrdHttp:  rc:27 got hdr line: User-Agent: dCache/5.2.16
200409 20:06:49 14927 sysXrdHttp:  rc:31 got hdr line: Accept-Encoding: gzip,deflate
200409 20:06:49 14927 sysXrdHttp:  rc:2 got hdr line:
200409 20:06:49 14927 sysXrdHttp:  rc:2 detected header end.
200409 20:07:58 2811 Starting on Linux 3.10.0-1062.9.1.el7.x86_64
Copr.  2004-2012 Stanford University, xrd version v4.11.3
@abh3 abh3 assigned abh3 and ffurano and unassigned abh3 Apr 14, 2020
@ffurano
Copy link
Contributor

ffurano commented Apr 14, 2020 via email

@esindril
Copy link
Contributor Author

Not really, this would just fix the effect not the cause. And there are any many more places where all this should be fixed depending on the type of request. The CurretReq.reqstate should be fixed after this code gets triggered and the rest of the header info is read:
https://github.com/xrootd/xrootd/blob/master/src/XrdHttp/XrdHttpProtocol.cc#L753

@bbockelm
Copy link
Contributor

@ffurano @esindril - any update on this one?

@ffurano
Copy link
Contributor

ffurano commented May 22, 2020

Not really by now, I may find some time on Monday to look at it

@ffurano
Copy link
Contributor

ffurano commented May 25, 2020

Hi,

may I know which xrootd you are using? I am on the master, and the line numbers do not match with respect to this issue...

@esindril
Copy link
Contributor Author

This was using Copr. 2004-2012 Stanford University, xrd version v4.11.3

@ffurano
Copy link
Contributor

ffurano commented May 25, 2020

Another detail that I find scary is that the log line of the master should print the reqstate:

TRACEI(DEBUG, " Process. lp:" << lp << " reqstate: " << CurrentReq.reqstate);

Instead I see a slightly different printout in your log:

#1 0x00007f2cfdae1105 in XrdHttpProtocol::Process (this=0x7f2cdd8a9400, lp=) at /usr/src/debug/xrootd-4.11.3/src/XrdHttp/XrdHttpProtocol.cc:988

That line has been changed a very long time ago, and so many other things. Would you mind to test with the master and your eos plugin?

In the meantime, the thing that is ugly to me is that reqstate goes below zero in this very particular case. I will fix that in my next little commit. This will likely avoid the situation at the roots, but will need testing.

Please let me know how it goes

@esindril
Copy link
Contributor Author

The log line from the logs in my first comment does seem to match the TRACEI:
200409 20:06:49 14876 ?:59@umfs06.aglt2.org sysXrdHttp: Process. lp:0x7f2ca6385818 reqstate: 0

Are you referring to something else?

It's quite complicated to test as this happens in very particular conditions that I was not yet able to reproduce ...

@ffurano
Copy link
Contributor

ffurano commented May 25, 2020

right! I pointed the wrong line, sorry. The line numbers still were not matching... as they are two slightly different versions.

I have merged into the master two commits, and I am starting the heavy test with DPM, which exercises an ext plugin, like eos does (but not the same plugin)

@esindril
Copy link
Contributor Author

Running with 4.12.3 which contains this fix in our pre-production instance leads to a segv in a matter of seconds. The output from the XrdHttp logs looks scrambled:

200622 09:50:22 29275 ?:473@se03.esc.qmul.ac.uk sysXrdHttp: Sending resp: 403 header len:134
200622 09:50:22 29275 sysXrdHttp: Sending 134 bytes
200622 09:50:22 29275 sysXrdHttp: Sending 3846 bytes
200622 09:50:22 29275 sysXrdHttp:  XrdHttpReq request ended.
200622 09:50:22 29275 ?:473@se03.esc.qmul.ac.uk sysXrdHttp: Process is exiting rc:1
200622 09:50:22 29266 ?:473@se03.esc.qmul.ac.uk sysXrdHttp:  Process. lp:0x7f8721d5a2b8 reqstate: 0
200622 09:50:22 29266 sysXrdHttp: getDataOneShot BuffAvailable: 1043630 maxread: 1043630
200622 09:50:22 29266 sysXrdHttp: getDataOneShot sslavail: 1043630
200622 09:50:22 29266 sysXrdHttp: read 8192 of 1043630 bytes
200622 09:50:22 29266 sysXrdHttp:  rc:115 got hdr line: ʰ<9A><E1>4<C6><FD><BF>^?<F3>/<87><U+0379><9E>))<A2>^?h<U+0618>M^Y<9B><A7><8F><CC><F3>3<B9>^XMI<AB>Νy<87>U<E3>7<DD><s<FB><C6>*<E4>^G<C8>Hwj<C6>ESC       |^D<96><B4><B5>\<CF>$<F7>^E`^X_<D9><E5><EF>j<E9>,<A2><FB>TQ<B3>^K<DF>ۥ^O^LHL<EB>5<FB>^\o^R<E2><E1>ױ0N=<82><A4><97>z<F3>\<B2><A9>Z<A1>^^C

200622 09:50:22 29266 sysXrdHttp:  Parsing first line: ʰ<9A><E1>4<C6><FD><BF>^?<F3>/<87><U+0379><9E>))<A2>^?h<U+0618>M^Y<9B><A7><8F><CC><F3>3<B9>^XMI<AB>Νy<87>U<E3>7<DD><s<FB><C6>*<E4>^G<C8>Hwj<C6>ESC        |^D<96><B4><B5>\<CF>$<F7>^E`^X_<D9><E5><EF>j<E9>,<A2><FB>TQ<B3>^K<DF>ۥ^O^LHL<EB>5<FB>^\o^R<E2><E1>ױ0N=<82><A4><97>z<F3>\<B2><A9>Z<A1>^^C

200622 09:50:22 29266 sysXrdHttp:  Parsing of first line failed with -1
200622 09:50:22 29266 sysXrdHttp:  rc:244 got hdr line: <C5><C7><C7>J<90><C0> <F6><A3><E6><E3>BUN{Ѩ<AD><AC><91><AF>%<EC><D9>^W^\r<AA><97><EE>w<97>^Gva<BD>5;l(<C7>m<CF>!^]^F<C6><CC>n/ESC<A1>z<D8>^PW;/<F7>v<AE><95>?<A6>*jwu^?X{Ą<U+99422><9B>^?<9C>^G^M<DE>P<84>5<B4><C1>^U^Q<DB>k^Ceۘ<<85>}<A3>MC̝}2$^GѭG%<89>bi7w<E8><EC>A<BF>^W<A3><D6>f~<F4>-<99><D3>Y<B4>I8'<A1>^X^]d<92><A0>^O<A2><BE>^]r(߆<96><DD>5<85><CD>^]<EC><E9>^H<D8><C1><8A>O<FD>(i<C2>&<F4>$^Yr=<A0><D6><ED><FA>+<EE><A8>^DPXq^F^D<A0>^_^K<8E><B1><C8>-<AB>-<ESC7^EJ<B7>bc;3B<ED><C5>$<B1><C6><F3>=:^Ye6<B9>^H*1J<BD>϶<B7>^?<AE>݆ J
<82>^LV<C3>nW<EC><BE>R7<B2>ESC

The point where this crashes:

#########################################################################
# -----------------------------------------------------------------------
# Responsible thread =>
# -----------------------------------------------------------------------
# Thread 20 (Thread 0x7f8998bef700 (LWP 29276)):
#########################################################################
#5  <signal handler called>
#6  0x00007f899cc65691 in __strlen_sse2_pminub () from /lib64/libc.so.6
#7  0x00007f8997eda245 in std::char_traits<char>::length (__s=0x0)
    at /usr/include/c++/4.8.2/bits/char_traits.h:259
#8  std::string::assign (__s=0x0, this=0x7f8720a8e2a8)
    at /usr/include/c++/4.8.2/bits/basic_string.h:1131
#9  std::string::operator= (__s=0x0, this=<optimized out>)
    at /usr/include/c++/4.8.2/bits/basic_string.h:555
#10 XrdHttpExtReq::XrdHttpExtReq (this=0x7f8998bee7a0, req=0x7f88f3d4b3a0,
    pr=<optimized out>)
    at /usr/src/debug/xrootd/xrootd/src/XrdHttp/XrdHttpExtHandler.cc:86
#11 0x00007f8997ed3c1f in XrdHttpReq::ProcessHTTPReq (
    this=this@entry=0x7f88f3d4b3a0)
    at /usr/src/debug/xrootd/xrootd/src/XrdHttp/XrdHttpReq.cc:986
#12 0x00007f8997eccc45 in XrdHttpProtocol::Process (this=0x7f88f3d4b280,
    lp=<optimized out>)
    at /usr/src/debug/xrootd/xrootd/src/XrdHttp/XrdHttpProtocol.cc:1004
#13 0x00007f899dd75d49 in XrdLink::DoIt (this=0x7f8721d5a2b8)
    at /usr/src/debug/xrootd/xrootd/src/Xrd/XrdLink.cc:441
#14 0x00007f899dd790df in XrdScheduler::Run (
    this=0x610e58 <XrdMain::Config+440>)
    at /usr/src/debug/xrootd/xrootd/src/Xrd/XrdScheduler.cc:357
#15 0x00007f899dd79229 in XrdStartWorking (carg=<optimized out>)
    at /usr/src/debug/xrootd/xrootd/src/Xrd/XrdScheduler.cc:87
#16 0x00007f899dd3eba7 in XrdSysThread_Xeq (myargs=0x7f899a421700)
    at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysPthread.cc:86
#17 0x00007f899d8f2ea5 in start_thread () from /lib64/libpthread.so.0
#18 0x00007f899cbf48dd in clone () from /lib64/libc.so.6

@ffurano
Copy link
Contributor

ffurano commented Jun 22, 2020

OK, I'll have a look at this. So far I have not seen this behaviour in the dpm tests. May I ask you where you pick the xrootd builds?

@esindril
Copy link
Contributor Author

esindril commented Jun 22, 2020

Hi Fabrizio,

This is now running in eospps.cern.ch. The xrootd comes from the eos-depend repo which takes the tarball from the xrootd release (http://storage-ci.web.cern.ch/storage-ci/xrootd/release/cc-7-x86_64/v4.12.3/tarball/) and rebuilds it:

[root@eospps-fe1 (mgm:master mq:master) ~]$ yum info xrootd-server
Loaded plugins: changelog, kernel-module, ovl, priorities, tsflags, versionlock
427 packages excluded due to repository priority protections
Excluding 10 updates due to versionlock (use "yum versionlock status" to show them)
Installed Packages
Name        : xrootd-server
Arch        : x86_64
Epoch       : 1
Version     : 4.12.3
Release     : 1.el7
Size        : 1.1 M
Repo        : installed
From repo   : eos-deps
Summary     : Extended ROOT file server
URL         : http://xrootd.org/
License     : LGPLv3+
Description : XRootD server binaries

@bbockelm
Copy link
Contributor

@esindril - in terms of getting together some debug info, anything obviously wrong when run under valgrind?

@esindril
Copy link
Contributor Author

I've tried valgrind but this was terribly slow and in the end it crashed itself with a funny message:

valgrind: the 'impossible' happened:
   Max number of threads is too low

I tried increasing it but again things are extremely slow and I abandoned after half an hour. On the bright side this seem to be quite reliably reproducible although it does not happen immediately. If I were to guess, I would say this comes more from the SSL part rather then the fixes that Fabrizio did ... but this is just a guess. Below I will post more info from the backtrace which is always the same and some info from the log file once this starts happening.

In the log file all http requests seem fine up to this point:

stest/tpc/https/domatest/file26_c38ac3a5-2341-4a58-afc8-82d9bcc7d4ad: Unable to stat /eos/opstest/tpc/https/domatest/file26_c38ac3a5-2341-4a58-afc8-82d9bcc7d4ad; No such file or directory"
200624 10:11:27 24523 tbeerman.0:450@fts-devel-10 sysXrdHttp: Sending resp: 404 header len:106
200624 10:11:27 24523 sysXrdHttp: Sending 106 bytes
200624 10:11:27 24523 sysXrdHttp:  XrdHttpReq request ended.
200624 10:11:27 24523 tbeerman.0:450@fts-devel-10 sysXrdHttp: Process is exiting rc:1
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk sysXrdHttp:  SSL_accept returned :1
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk sysXrdHttp:  SSL_get_verify_result returned :0
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk sysXrdHttp:  Extracting auth info.
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk sysXrdHttp:  SSL_get_peer_certificate returned :0x7f8e4a855d40
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk sysXrdHttp:  Issuer name is : '/C=UK/O=eScienceCA/OU=Authority/CN=UK e-Science CA 2B'
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk sysXrdHttp:  Mapping name: '/C=UK/O=eScienceCA/OU=Authority/CN=UK e-Science CA 2B' Failed. err: -14
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk sysXrdHttp:  Subject name is : '/C=UK/O=eScience/OU=QueenMaryLondon/L=Physics/CN=se03.esc.qmul.ac.uk'
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk sysXrdHttp:  Mapping name: /C=UK/O=eScience/OU=QueenMaryLondon/L=Physics/CN=se03.esc.qmul.ac.uk Failed. err: -14
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk  SSL_get_peer_certificate returned :0x7f8e4a855d40
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk  SSL_get_verify_result returned :0
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk  SSL_get_peer_cert_chain :0x7f8dc2cc72c0
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk  voms info retrieval failed: VOMS extension not found!
200624 10:11:27 24521 sysXrdHttp: getDataOneShot BuffAvailable: 1048576 maxread: 1048576
200624 10:11:27 24521 sysXrdHttp: getDataOneShot sslavail: 1048576
200624 10:11:27 24521 sysXrdHttp: read 4946 of 1048576 bytes
200624 10:11:27 24521 sysXrdHttp:  rc:90 got hdr line: PUT /eos/opstest/tpc/https/domatest/file21_6f1b285c-6ee3-4f1f-946b-ecb10146e8b2 HTTP/1.1

200624 10:11:27 24521 sysXrdHttp:  Parsing first line: PUT /eos/opstest/tpc/https/domatest/file21_6f1b285c-6ee3-4f1f-946b-ecb10146e8b2 HTTP/1.1

200624 10:11:27 24521 sysXrdHttp:  rc:555 got hdr line: Authorization: Bearer MDAxNGxvY2F0aW9uIGVvc3BwcwowMDM0aWRlbnRpZmllciBmNWJmYzQ3YS0yMzlhLTQyMDEtYTQ1Yy00NGJiNDM0NzY0MzYKMDAxNmNpZCBuYW1lOnRiZWVybWFuCjAwNTJjaWQgYWN0aXZpdHk6UkVBRF9NRVRBREFUQSxVUExPQUQsRE9XTkxPQUQsREVMRVRFLE1BTkFHRSxVUERBVEVfTUVUQURBVEEsTElTVAowMDMyY2lkIGFjdGl2aXR5Ok1BTkFHRSxVUExPQUQsREVMRVRFLExJU1QsTUFOQUdFCjAwNTljaWQgcGF0aDovZW9zL29wc3Rlc3QvdHBjL2h0dHBzL2RvbWF0ZXN0L2ZpbGUyMV82ZjFiMjg1Yy02ZWUzLTRmMWYtOTQ2Yi1lY2IxMDE0NmU4YjIKMDAyNGNpZCBiZWZvcmU6MjAyMC0wNi0yNFQxMDo0ODoyNloKMDAyZnNpZ25hdHVyZSDg7KTdfpKhppYnBI6nfDf2DswIUyYFB6fy-m_xU_bkbAo

200624 10:11:27 24521 sysXrdHttp:  rc:28 got hdr line: Content-Length: 1000000000

200624 10:11:27 24521 sysXrdHttp:  rc:40 got hdr line: Content-Type: application/octet-stream

200624 10:11:27 24521 sysXrdHttp:  rc:26 got hdr line: Host: eospps.cern.ch:443

200624 10:11:27 24521 sysXrdHttp:  rc:24 got hdr line: Connection: Keep-Alive

200624 10:11:27 24521 sysXrdHttp:  rc:54 got hdr line: User-Agent: Apache-HttpClient/4.5.8 (Java/1.8.0_252)

200624 10:11:27 24521 sysXrdHttp:  rc:31 got hdr line: Accept-Encoding: gzip,deflate

200624 10:11:27 24521 sysXrdHttp:  rc:2 got hdr line:

200624 10:11:27 24521 sysXrdHttp:  rc:2 detected header end.

Then it seems it can not properly decode the headers:

200624 10:11:27 time=1592986287.197808 func=HttpError                level=INFO  logid=static.............................. unit=mgm@eospps-fe1.cern.ch:1094 tid=00007f8ef00f0700 source=HttpServer:403                 tident= sec=(null) uid=99 gid=99 name=- geo="" errc=1, retcode=403 errmsg="Unable to open file /eos/opstest/tpc/https/domatest/file21_6f1b285c-6ee3-4f1f-946b-ecb10146e8b2; Operation not permitted"
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk sysXrdHttp: Sending resp: 403 header len:134
200624 10:11:27 24521 sysXrdHttp: Sending 134 bytes
200624 10:11:27 24521 sysXrdHttp: Sending 3846 bytes
200624 10:11:27 24521 sysXrdHttp:  XrdHttpReq request ended.
200624 10:11:27 24521 ?:449@se03.esc.qmul.ac.uk sysXrdHttp: Process is exiting rc:1
200624 10:11:27 24510 ?:449@se03.esc.qmul.ac.uk sysXrdHttp:  Process. lp:0x7f8dd44e12b8 reqstate: 0
200624 10:11:27 24510 sysXrdHttp: getDataOneShot BuffAvailable: 1043630 maxread: 1043630
200624 10:11:27 24510 sysXrdHttp: getDataOneShot sslavail: 1043630
200624 10:11:27 24510 sysXrdHttp: read 8192 of 1043630 bytes
200624 10:11:27 24510 sysXrdHttp:  rc:247 got hdr line: <BD>\<AC>^]n<A4><A6>3<BB>ʄb"<86><A3><EC><BC>^^<9E><81><83><F0>}s^^ϋi<E3><C6>2^Q<D0><FF><F0><8A><A3><EB><CF><E8>i2<9D>?<8F><CD><E9>@<B2><F2>^M<AA><B0>@bd{+?<A7>肠|/ESC2<E0>><91><E5><FB>U<E8>Aޮ<U+0372><89>{<EB>9^Gڔ߫8<AE><AA>6<A7>p<FF>
<E5><ED>f<B7>aA<9B>m^O<99><9F>^AESCŭ<82><86>h     "<A4>U<9D>^M&O}<E9><F1>KDe<83>^A+\9^]<92>^D<A7>^^<FE><F0><B9>o<9F><96><BA><B4>:ѿ^ZR<C6>*Rh^K<90>~<FD>*<DC>t<F3><DC><DF>ך<FD>^C1Δu<D8><D5>v<D8>aESCY<C6>9<D0><C1><9A><A4><EE><F9>^X>!<F7>q^GD^X<9E>!<B4>CW<A5>^N<9A><9C>^Y<D5>ESC<BD>^G<FA>ޡS
<FA><F3><DD>͝<A8>^K^A<9C>"%%2X]<C8>^CD<C5>6^^p<9E><B6>F<D3>><BA>^H@-<D8>

200624 10:11:27 24510 sysXrdHttp:  Parsing first line: <BD>\<AC>^]n<A4><A6>3<BB>ʄb"<86><A3><EC><BC>^^<9E><81><83><F0>}s^^ϋi<E3><C6>2^Q<D0><FF><F0><8A><A3><EB><CF><E8>i2<9D>?<8F><CD><E9>@<B2><F2>^M<AA><B0>@bd{+?<A7>肠|/ESC2<E0>><91><E5><FB>U<E8>Aޮ<U+0372><89>{<EB>9^Gڔ߫8<AE><AA>6<A7>p<FF><E5><ED>f<B7>aA<9B>m^O<99><9F>^AESCŭ<82><86>h      "<A4>U<9D>^M&O}<E9><F1>KDe<83>^A+\9^]<92>^D<A7>^^<FE><F0><B9>o<9F><96><BA><B4>:ѿ^ZR<C6>*Rh^K<90>~<FD>*<DC>t<F3><DC><DF>ך<FD>^C1Δu<D8><D5>v<D8>aESCY<C6>9<D0><C1><9A><A4><EE><F9>^X>!<F7>q^GD^X<9E>!<B4>CW<A5>^N<9A><9C>^Y<D5>ESC<BD>^G<FA>ޡS<FA>
<F3><DD>͝<A8>^K^A<9C>"%%2X]<C8>^CD<C5>6^^p<9E><B6>F<D3>><BA>^H@-<D8>

The backtrace looks like this:

gdb) bt
#0  0x00007f8ef40e4691 in __strlen_sse2_pminub () from /lib64/libc.so.6
#1  0x00007f8eef3db245 in std::char_traits<char>::length (__s=0x0) at /usr/include/c++/4.8.2/bits/char_traits.h:259
#2  std::string::assign (__s=0x0, this=0x7f8e49939b78) at /usr/include/c++/4.8.2/bits/basic_string.h:1131
#3  std::string::operator= (__s=0x0, this=<optimized out>) at /usr/include/c++/4.8.2/bits/basic_string.h:555
#4  XrdHttpExtReq::XrdHttpExtReq (this=0x7f8eeffee7a0, req=0x7f8eaf88ee20, pr=<optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdHttp/XrdHttpExtHandler.cc:86
#5  0x00007f8eef3d4c1f in XrdHttpReq::ProcessHTTPReq (this=this@entry=0x7f8eaf88ee20) at /usr/src/debug/xrootd/xrootd/src/XrdHttp/XrdHttpReq.cc:986
#6  0x00007f8eef3cdc45 in XrdHttpProtocol::Process (this=0x7f8eaf88ed00, lp=<optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdHttp/XrdHttpProtocol.cc:1004
#7  0x00007f8ef51f4d49 in XrdLink::DoIt (this=0x7f8dd44e12b8) at /usr/src/debug/xrootd/xrootd/src/Xrd/XrdLink.cc:441
#8  0x00007f8ef51f80df in XrdScheduler::Run (this=0x610e58 <XrdMain::Config+440>) at /usr/src/debug/xrootd/xrootd/src/Xrd/XrdScheduler.cc:357
#9  0x00007f8ef51f8229 in XrdStartWorking (carg=<optimized out>) at /usr/src/debug/xrootd/xrootd/src/Xrd/XrdScheduler.cc:87
#10 0x00007f8ef51bdba7 in XrdSysThread_Xeq (myargs=0x7f8ef1821720) at /usr/src/debug/xrootd/xrootd/src/XrdSys/XrdSysPthread.cc:86
#11 0x00007f8ef4d71ea5 in start_thread () from /lib64/libpthread.so.0
#12 0x00007f8ef40738dd in clone () from /lib64/libc.so.6

(gdb) f 6
#6  0x00007f8eef3cdc45 in XrdHttpProtocol::Process (this=0x7f8eaf88ed00, lp=<optimized out>) at /usr/src/debug/xrootd/xrootd/src/XrdHttp/XrdHttpProtocol.cc:1004
1004      rc = CurrentReq.ProcessHTTPReq();
(gdb) p CurrentReq
$1 = {<XrdXrootd::Bridge::Result> = {_vptr.Result = 0x7f8eef5e2ab0 <vtable for XrdHttpReq+16>}, httpStatusCode = 1114779509, httpStatusText = "", m_transfer_encoding_chunked = false, m_current_chunk_offset = 0, m_current_chunk_size = -1, prot = 0x7f8eaf88ed00, ralist = 0x0,
  request = XrdHttpReq::rtMalformed,
  requestverb = "\352\361\310E\b\350\313~~\215\214Q\354\351\253\375l\272\070\255Z\201\302l\366_\b\250\030~\211\016c~6\035\252\021\346\067\342\064\vR\035\267+\370\375oW\237\265X\346\336\027\351\337\325\324C\375[\244\302?\255t\206ֿ\243\305\347Y&\376r1n\001|\347\021y-\263\217\316ݹ\"|e\261\250و/M\305W^\302y|R", allheaders = std::map with 9 elements = {["`v"] = "%\r\022\275\002j4\236\232a|\305<\203#I\227d\210\324/y;\226\303\035\t\336\374\245(\fX\213-3\222\004bi(\366\360(",
    ["bK\237\t\324P\310\302\rO\221N\330\376\327\265-X\256\227\266ϒ(^J\017\004V\274\324a\363\305\373\226\343\070\341\266l\375J\202\231l\177\273EA\221o\276O\246\002\265\327ќ\327\\\301\322\306_\\.\216w\241\360\345\371\026\324\031\223\006\256ζl\205\325㳏\216u\200\271(", <incomplete sequence \360\272>] = "o\257Č/", ["xrd-http-query"] = "",
    ["\234\321Y\314e\306`\346\026\372\004u5\217\016ŵd\243c\277\224JM\264X\277{P\371\256`\025X\035n\341\352\005\"p9\vX\304\377\021,\244 u\020\"\321\372Ͳ\325/\330\031\270e\020<\201\256\351&\001\353\320\035\032_\340\310\332R\224\254,\b"] = "Vn\341i_\205\364\201\037\242\205@\367\232'%\267\242\336\032\071\260\a\b\021\035`\224\357\347/~\236\302\347ۗ\265\266\004?\343펄h\036\206\020\300\210\351z\374OH\367\anaK\222\263[\331ʐ\t!? \f\321\003\220\066\271|\262\340xMz\200s2\267\243\347}C\337\027\262\267\241\332X",
    ["\235jA\252\261\003\202\231\365\003\275!\257\377TR\a\005\345\356k\275\b\201\t\020\202QW\201\232\343\210RXƪ\251}\265\254\341\067PKY\003\360O^\340\243\204\276\225_\234\365\215\275D\\\202\\\"k\025\272\003\070\266\340I\255]\355\344l\030\035\214=(\251\342\305SV\325\350\332\006AơH'Ʌ\227\234失8\301Sǘ\353\001\211Z \017贸v\352i\357\252\361Hd\005\233j\257\273\226(S{\250\330\307+\317\005\324A\373\364\027\205h\021\035d"] = "A\327D\033\226\376\071\225\244\017\227)\023\222q\266\375Uvu\316V{=\026n\375\066!\361\372R\034\256^\332P\343\306\314\310\004\233\233\006\317ڲ#k\340\303\346(\316\301\021\005\232\337\035\303g1\351\305T\201\310v\teŮ\364\271Ճ\235\205\263V/\247\256\232\246\352\064H\001\202\341\323~`\215\203\276e,)}&\277\221\246M\227٧5",
    ["\240*\252b7\366\372\035\307T\216xka{\316b\341\307\067X6q\211\335\002\204G\r `9,^\241\205\a3\217\275l'\343t\273Tv\254\210\370}f\023\253Vko\232Gl\356]\342\221-\216\224\221\363P\001/\340\250\020\214\367\344\261\350\006A'\236O\024}\177\216\211\b>p@\201\004G\300\261!\300\\ޱd\375\210Re\ts1\002v0\257\217\275\247[id\277\224(I\373\021K\264\024b\226\355\257\363mc\275\004\350A\223mQa~%$\371\006\204J\033!\355hCJ\335|龧\215\355P\223"] = "`\365\337\302z\357\260\342J\022\022H\032\211uPi\206\266\355bMWa\323K\336\313\066\361\241\210;\034\344\210\060\002\021hT\222\377\017!Z\232\230f\276\202떯م9`\222\202\002\b\200\305v\216\325.\365\275\020\023\025\066\233\232K+ڭ\200\064Ğ\321pd",

If anyone wants access to the machine please let me know.

@xrootd-dev
Copy link

xrootd-dev commented Jun 24, 2020 via email

@ffurano
Copy link
Contributor

ffurano commented Jun 24, 2020 via email

@esindril
Copy link
Contributor Author

Hi Fabrizio,

I will deploy it in the after noon. Thanks!

Elvin

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants