Skip to content
This repository has been archived by the owner on Apr 21, 2023. It is now read-only.

Crash on very long urls #1248

Closed
praveen262k opened this issue Jan 19, 2016 · 39 comments
Closed

Crash on very long urls #1248

praveen262k opened this issue Jan 19, 2016 · 39 comments

Comments

@praveen262k
Copy link

We enabled pagespeed module 2 weeks back. We have been monitoring site since then.

Last 2 days, we see below logs and server goes not responding. Any help would be appreciated! As of now we have pulled out the module from production.

GDB Backtrace log:

(gdb) bt
#0  0x00007f54cfd95625 in raise () from /lib64/libc.so.6
#1  0x00007f54cfd96e05 in abort () from /lib64/libc.so.6
#2  0x00007f54cad1fa5d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib64/libstdc++.so.6
#3  0x00007f54cad1dbe6 in ?? () from /usr/lib64/libstdc++.so.6
#4  0x00007f54cad1dc13 in std::terminate() () from /usr/lib64/libstdc++.so.6
#5  0x00007f54cad1dd0e in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00007f54cacc2837 in std::__throw_logic_error(char const*) () from /usr/lib64/libstdc++.so.6
#7  0x00007f54cacfde59 in ?? () from /usr/lib64/libstdc++.so.6
#8  0x00007f54cacfdf33 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) () from /usr/lib64/libstdc++.so.6
#9  0x00007f54c9fce073 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#10 0x00007f54c9effb9a in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#11 0x00007f54c9f01e34 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#12 0x00007f54c9f0b3a4 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#13 0x00007f54d1860a58 in ap_send_error_response ()
#14 0x00007f54d1848506 in ap_read_request ()
#15 0x00007f54d18600e0 in ?? ()
#16 0x00007f54d185c148 in ap_run_process_connection ()
#17 0x00007f54d1869052 in ?? ()
#18 0x00007f54d00fe9d1 in start_thread () from /lib64/libpthread.so.0
#19 0x00007f54cfe4b8fd in clone () from /lib64/libc.so.6

Apache log:

$ grep "child pid" error.20160118000000.log
[Sun Jan 17 20:52:31 2016] [notice] child pid 12587 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 08:51:39 2016] [notice] child pid 14373 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 09:36:22 2016] [notice] child pid 473 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 09:40:47 2016] [notice] child pid 24587 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:32 2016] [notice] child pid 29667 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:32 2016] [notice] child pid 8104 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:32 2016] [notice] child pid 8695 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:44 2016] [notice] child pid 18148 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:46 2016] [notice] child pid 22018 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:49 2016] [notice] child pid 22962 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:49 2016] [notice] child pid 23747 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:49 2016] [notice] child pid 23748 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:52 2016] [notice] child pid 25136 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 25719 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 25720 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 27435 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 27436 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 27443 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:04 2016] [notice] child pid 32360 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:14 2016] [notice] child pid 9726 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:14 2016] [notice] child pid 9727 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:22 2016] [notice] child pid 14263 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:50 2016] [notice] child pid 20134 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:50 2016] [notice] child pid 21619 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:50 2016] [notice] child pid 26322 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:51 2016] [notice] child pid 3596 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:51 2016] [notice] child pid 28482 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:53 2016] [notice] child pid 6108 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:02 2016] [notice] child pid 6925 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:07 2016] [notice] child pid 8408 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:13 2016] [notice] child pid 7685 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:14 2016] [notice] child pid 9603 exit signal Aborted (6), possible coredump in /etc/httpd
@crowell
Copy link
Contributor

crowell commented Jan 19, 2016

What version/os/64 or 32 bits?
On Jan 19, 2016 5:51 PM, "praveen262k" notifications@github.com wrote:

We enabled pagespeed module 2 weeks back. We have been monitoring site
since then.

Last 2 days, we see below logs and server goes not responding. Any help
would be appreciated! As of now we have pulled out the module from
production.
GDB Backtrace log:

(gdb) bt
#0 0x00007f54cfd95625 in raise () from /lib64/libc.so.6
#1 #1
0x00007f54cfd96e05 in abort () from /lib64/libc.so.6
#2 #2
0x00007f54cad1fa5d in __gnu_cxx::__verbose_terminate_handler() () from
/usr/lib64/libstdc++.so.6
#3 #3
0x00007f54cad1dbe6 in ?? () from /usr/lib64/libstdc++.so.6
#4 #4
0x00007f54cad1dc13 in std::terminate() () from /usr/lib64/libstdc++.so.6
#5 #5
0x00007f54cad1dd0e in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6 #6
0x00007f54cacc2837 in std::__throw_logic_error(char const

) () from /usr/lib64/libstdc++.so.6 #7
#7 0x00007f54cacfde59 in
?? () from /usr/lib64/libstdc++.so.6 #8
#8 0x00007f54cacfdf33 in
std::basic_string, std::allocator >::basic_string(char const
,
std::allocator const&) () from /usr/lib64/libstdc++.so.6
#9 #9
0x00007f54c9fce073 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#10 #10
0x00007f54c9effb9a in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#11 #11
0x00007f54c9f01e34 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#12 #12
0x00007f54c9f0b3a4 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#13 #13
0x00007f54d1860a58 in ap_send_error_response ()
#14 #14
0x00007f54d1848506 in ap_read_request ()
#15 #15
0x00007f54d18600e0 in ?? ()
#16 #16
0x00007f54d185c148 in ap_run_process_connection ()
#17 #17
0x00007f54d1869052 in ?? ()
#18 #18
0x00007f54d00fe9d1 in start_thread () from /lib64/libpthread.so.0
#19 #19
0x00007f54cfe4b8fd in clone () from /lib64/libc.so.6

Apache log:
$ grep "child pid" error.20160118000000.log
[Sun Jan 17 20:52:31 2016] [notice] child pid 12587 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 08:51:39 2016] [notice] child pid 14373 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 09:36:22 2016] [notice] child pid 473 exit signal Aborted (6),
possible coredump in /etc/httpd
[Mon Jan 18 09:40:47 2016] [notice] child pid 24587 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:32 2016] [notice] child pid 29667 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:32 2016] [notice] child pid 8104 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:32 2016] [notice] child pid 8695 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:44 2016] [notice] child pid 18148 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:46 2016] [notice] child pid 22018 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:49 2016] [notice] child pid 22962 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:49 2016] [notice] child pid 23747 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:49 2016] [notice] child pid 23748 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:52 2016] [notice] child pid 25136 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 25719 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 25720 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 27435 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 27436 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 27443 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:04 2016] [notice] child pid 32360 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:14 2016] [notice] child pid 9726 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:14 2016] [notice] child pid 9727 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:22 2016] [notice] child pid 14263 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:50 2016] [notice] child pid 20134 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:50 2016] [notice] child pid 21619 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:50 2016] [notice] child pid 26322 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:51 2016] [notice] child pid 3596 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:51 2016] [notice] child pid 28482 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:53 2016] [notice] child pid 6108 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:02 2016] [notice] child pid 6925 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:07 2016] [notice] child pid 8408 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:13 2016] [notice] child pid 7685 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:14 2016] [notice] child pid 9603 exit signal Aborted
(6), possible coredump in /etc/httpd


Reply to this email directly or view it on GitHub
#1248.

@praveen262k
Copy link
Author

Apache version: 2.2
OS: Centos Red Hat Linux 64 bit

Thanks!

@crowell
Copy link
Contributor

crowell commented Jan 19, 2016

Also what version of mod_pagespeed?
On Jan 19, 2016 5:59 PM, "praveen262k" notifications@github.com wrote:

Apache version: 2.2
OS: Centos Red Hat Linux 64 bit

Thanks!


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@praveen262k
Copy link
Author

ModSpeed version: Version: 1.9.32.11
Thanks!

@praveen262k
Copy link
Author

I saw similar issue reported #334 (looks pretty old ticket) - and was mentioned that it was resolved in Release 0.10.22.4 (https://developers.google.com/speed/pagespeed/module/release_notes#release_1.9.32.11-stable)

@praveen262k
Copy link
Author

To add a few more information:

  1. We have enabled rewrite css, rewrite js and collapse whitespace are only enabled
  2. ModePagespeedRewriteLevel PassThrough

@praveen262k
Copy link
Author

A few more logs details we pulled from apache log:

Tue Jan 19 04:30:28 2016] [error] [mod_pagespeed 1.9.32.11-7550 @9909] .....
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_S_construct NULL not valid
[Tue Jan 19 04:30:29 2016] [notice] child pid 9909 exit signal Aborted (6), possible coredump in /etc/httpd

@crowell
Copy link
Contributor

crowell commented Jan 20, 2016

great, thanks. I'll symbolize the backtrace tomorrow morning and hopefully
that'll make pinpointing the problem easy.
On Jan 19, 2016 8:13 PM, "praveen262k" notifications@github.com wrote:

A few more logs details we pulled from apache log:

Tue Jan 19 04:30:28 2016] [error] [mod_pagespeed 1.9.32.11-7550 @9909]
.....
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_S_construct NULL not valid
[Tue Jan 19 04:30:29 2016] [notice] child pid 9909 exit signal Aborted
(6), possible coredump in /etc/httpd


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@praveen262k
Copy link
Author

A few more info:

We noticed that somehow this error is related to the crash (most of the time):
[Tue Jan 19 12:06:16 2016] [error] [client XXX.X.XXX.XX] request failed: URI too long (longer than 8190)

@crowell
Copy link
Contributor

crowell commented Jan 20, 2016

if you still have the process in gdb, could you do info proc mappings?

@praveen262k
Copy link
Author

Unfortunately it says proc folder not found

However out team gave some more insight from the coredump and below is details

(gdb) bt
#0  0x00007f54cfd95625 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f54cfd96e05 in abort () at abort.c:92
#2  0x00007f54cad1fa5d in __gnu_cxx::__verbose_terminate_handler () at ../../../../libstdc++-v3/libsupc++/vterminate.cc:93
#3  0x00007f54cad1dbe6 in __cxxabiv1::__terminate (handler=<value optimized out>) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:38
#4  0x00007f54cad1dc13 in std::terminate () at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
#5  0x00007f54cad1dd0e in __cxxabiv1::__cxa_throw (obj=0x7f52a4010de0, tinfo=<value optimized out>, dest=<value optimized out>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:83
#6  0x00007f54cacc2837 in std::__throw_logic_error (__s=<value optimized out>) at ../../../../libstdc++-v3/src/functexcept.cc:60
#7  0x00007f54cacfde59 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_construct<char const*> (__beg=0x0, __end=<value optimized out>,
    __a=<value optimized out>) at /usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:134
#8  0x00007f54cacfdf33 in _S_construct_aux<char const*> (this=0x7f54a0bcc580, __s=0x0, __a=...)
    at /usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:1543
#9  _S_construct<char const*> (this=0x7f54a0bcc580, __s=0x0, __a=...)
    at /usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:1559
#10 std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (this=0x7f54a0bcc580, __s=0x0, __a=...)
    at /usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:216
#11 0x00007f54c9fce073 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#12 0x00007f54c9effb9a in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#13 0x00007f54c9f01e34 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#14 0x00007f54c9f0b3a4 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#15 0x00007f54d1860a58 in ap_send_error_response (r=0x7f52a8097338, recursive_error=<value optimized out>) at /usr/src/debug/httpd-2.2.15/modules/http/http_protocol.c:1288
#16 0x00007f54d1848506 in ap_read_request (conn=0x7f538c05d3c0) at /usr/src/debug/httpd-2.2.15/server/protocol.c:944
#17 0x00007f54d18600e0 in ap_process_http_connection (c=0x7f538c05d3c0) at /usr/src/debug/httpd-2.2.15/modules/http/http_core.c:183
#18 0x00007f54d185c148 in ap_run_process_connection (c=0x7f538c05d3c0) at /usr/src/debug/httpd-2.2.15/server/connection.c:43
#19 0x00007f54d1869052 in process_socket (thd=0x7f54d2d8a340, dummy=<value optimized out>) at /usr/src/debug/httpd-2.2.15/server/mpm/worker/worker.c:544
#20 worker_thread (thd=0x7f54d2d8a340, dummy=<value optimized out>) at /usr/src/debug/httpd-2.2.15/server/mpm/worker/worker.c:894
#21 0x00007f54d00fe9d1 in start_thread (arg=0x7f54a0bcd700) at pthread_create.c:301
#22 0x00007f54cfe4b8fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

@crowell
Copy link
Contributor

crowell commented Jan 20, 2016

Can you do info sharedlibrary ?
On Jan 20, 2016 11:51 AM, "praveen262k" notifications@github.com wrote:

Unfortunately it says proc folder not found

However out team gave some more insight from the coredump and below is
details

(gdb) bt
#0 0x00007f54cfd95625 in raise (sig=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 #1
0x00007f54cfd96e05 in abort () at abort.c:92
#2 #2
0x00007f54cad1fa5d in __gnu_cxx::__verbose_terminate_handler () at
../../../../libstdc++-v3/libsupc++/vterminate.cc:93
#3 #3
0x00007f54cad1dbe6 in __cxxabiv1::__terminate (handler=) at
../../../../libstdc++-v3/libsupc++/eh_terminate.cc:38
#4 #4
0x00007f54cad1dc13 in std::terminate () at
../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
#5 #5
0x00007f54cad1dd0e in __cxxabiv1::__cxa_throw (obj=0x7f52a4010de0, tinfo=,
dest=) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:83
#6 #6
0x00007f54cacc2837 in std::__throw_logic_error (__s=) at
../../../../libstdc++-v3/src/functexcept.cc:60
#7 #7
0x00007f54cacfde59 in std::basic_string, std::allocator >::_S_construct
(__beg=0x0, __end=,
__a=) at
/usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:134
#8 #8
0x00007f54cacfdf33 in _S_construct_aux (this=0x7f54a0bcc580, __s=0x0,
__a=...)
at
/usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:1543
#9 #9 _S_construct
(this=0x7f54a0bcc580, __s=0x0, __a=...)
at
/usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:1559
#10 #10
std::basic_string, std::allocator >::basic_string (this=0x7f54a0bcc580,
__s=0x0, __a=...)
at
/usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:216
#11 #11
0x00007f54c9fce073 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#12 #12
0x00007f54c9effb9a in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#13 #13
0x00007f54c9f01e34 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#14 #14
0x00007f54c9f0b3a4 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#15 #15
0x00007f54d1860a58 in ap_send_error_response (r=0x7f52a8097338,
recursive_error=) at
/usr/src/debug/httpd-2.2.15/modules/http/http_protocol.c:1288
#16 #16
0x00007f54d1848506 in ap_read_request (conn=0x7f538c05d3c0) at
/usr/src/debug/httpd-2.2.15/server/protocol.c:944
#17 #17
0x00007f54d18600e0 in ap_process_http_connection (c=0x7f538c05d3c0) at
/usr/src/debug/httpd-2.2.15/modules/http/http_core.c:183
#18 #18
0x00007f54d185c148 in ap_run_process_connection (c=0x7f538c05d3c0) at
/usr/src/debug/httpd-2.2.15/server/connection.c:43
#19 #19
0x00007f54d1869052 in process_socket (thd=0x7f54d2d8a340, dummy=) at
/usr/src/debug/httpd-2.2.15/server/mpm/worker/worker.c:544
#20 #20 worker_thread
(thd=0x7f54d2d8a340, dummy=) at
/usr/src/debug/httpd-2.2.15/server/mpm/worker/worker.c:894
#21 #21
0x00007f54d00fe9d1 in start_thread (arg=0x7f54a0bcd700) at
pthread_create.c:301
#22 #22
0x00007f54cfe4b8fd in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:115


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@praveen262k
Copy link
Author

Please find the attached image - I could not format is well here
noname
noname-2

@praveen262k
Copy link
Author

Some more data!

screen shot 2016-01-20 at 11 24 52 am

URI too long -> No of time we see this error log
std::logic_error -> Count of this error messages
child Process -> No of times Child process terminate

We could not find yet why Apache is reporting long URL as we don't see that in our apache access log yet!

@crowell
Copy link
Contributor

crowell commented Jan 20, 2016

is the backtrace from the same process as the info sharedlibrary, the bt doesn't seem to align with anything reasonable here.

@jmarantz
Copy link
Contributor

Can you glance at the access.log? This message does come from
mod_pagespeed:
[Tue Jan 19 12:06:16 2016] [error] [client 204.2.223.88] request failed:
URI too long (longer than 8190)
but it looks important, as it is correlated strongly with the crashes.
Maybe the long URLs are in the access.log and will help us figure out
what's going on.

-Josh

On Wed, Jan 20, 2016 at 10:39 AM, Jeffrey Crowell notifications@github.com
wrote:

is the backtrace from the same process as the info sharedlibrary, the bt
doesn't seem to align with anything reasonable here.


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@praveen262k
Copy link
Author

@jmarantz I ran splunk query from those IP address and I could not find any URI that is long during that time window or from any other request at all.

Is it possible that URI was not too long but it was something else reported as URI too long?

@crowell - Yes both were run against the same coredump! Strange!

@crowell
Copy link
Contributor

crowell commented Jan 20, 2016

not doubting the relation between the crashes and the message, but that
logging is done by apache itself, not by pagespeed

http://apache2.sourcearchive.com/documentation/2.2.16-4/protocol_8c-source.html

On Wed, Jan 20, 2016 at 1:57 PM, Joshua Marantz notifications@github.com
wrote:

Can you glance at the access.log? This message does come from
mod_pagespeed:
[Tue Jan 19 12:06:16 2016] [error] [client 204.2.223.88] request failed:
URI too long (longer than 8190)
but it looks important, as it is correlated strongly with the crashes.
Maybe the long URLs are in the access.log and will help us figure out
what's going on.

-Josh

On Wed, Jan 20, 2016 at 10:39 AM, Jeffrey Crowell <
notifications@github.com>
wrote:

is the backtrace from the same process as the info sharedlibrary, the bt
doesn't seem to align with anything reasonable here.


Reply to this email directly or view it on GitHub
<
#1248 (comment)

.


Reply to this email directly or view it on GitHub
#1248 (comment)
.

Jeff Crowell
https://github.com/crowell

@praveen262k
Copy link
Author

I am wondering since process crashed, may explain why I am not finding this 'long URI' on access log. Does that make sense?

@jmarantz
Copy link
Contributor

I'm also suspicious that the crash is caused by an out-of-memory condition. I'm wondering if there are any other messages in the error log that might shed light on the cause?

Also, you mentioned above you have three filters enabled: rewrite_javascript, rewrite_css, and collapse_whitespace. Can you try turning them all off and seeing if the problem shows up? If it doesn't show up, can you add them one at a time, and see which one causes the problem?

Thanks!
-Josh

@praveen262k
Copy link
Author

I am not sure if it is related to memory. So far we have not seen memory issue reported but I will dig in to those area. Every time crashing with URI too long make me think it may not be related to memory crash but I could be complete wrong.

We were going to recreate this issue on non-prod environment to simulate these scenarios as well. Without getting in to details initial thought apache was returning empty response with 414 status code and pagespeed module crashed with null exception

@basic_string.tcc -> line 134: __throw_logic_error(__N("basic_string::_S_construct NULL not valid"));

@jmarantz
Copy link
Contributor

You are probably right. If we can get an accurate symbolized stack trace
we should be able to narrow down the cause. Would you be able to take a
debug version of mod_pagespeed.so and run with that?

On Wed, Jan 20, 2016 at 7:58 PM, praveen262k notifications@github.com
wrote:

I am not sure if it is related to memory. So far we have not seen memory
issue reported but I will dig in to those area. Every time crashing with
URI too long make me think it may not be related to memory crash but I
could be complete wrong.

We were going to recreate this issue on non-prod environment to simulate
these scenarios as well. Without getting in to details initial thought
apache was returning empty response with 414 status code and pagespeed
module crashed with null exception

@basic_string.tcc -> line 134:
__throw_logic_error(__N("basic_string::_S_construct NULL not valid"));


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@jeffkaufman
Copy link
Contributor

The ap_send_error_response in the backtrace looks like a strong clue. That's not a path I'd expect us to go down often, and if we're processing an error response structures we expect to be set up properly might be incomplete. For example, url could be NULL.

I'm going to try to reproduce this here, triggering HTTP_REQUEST_URI_TOO_LARGE.

@jeffkaufman
Copy link
Contributor

To hit the request too large error message, the path is:

ap_read_request
  -> read_request_line(r, tmp_bb) returns false
     -> r->status == HTTP_REQUEST_URI_TOO_LARGE

read_request_line
  -> ap_rgetline() returns APR_ENOSPC
     ->     /* ap_rgetline returns APR_ENOSPC if it fills up the                 
             * buffer before finding the end-of-line.  This is only going to     
             * happen if it exceeds the configured limit for a request-line.     
             */

ap_rgetline()
  -> ap_rgetline_core()
    -> return APR_ENOSPC if this read would overrun the
       buffer or if there's more to read but we're full

This looks pretty normal, and like I should be able to repro by sending an 8k+ request line.

@jeffkaufman
Copy link
Contributor

Reproduced!

curl -sS -D- localhost:8080/mod_pagespeed_example/a...[16000 times repeated]...a
#7  0x00002b5c19585977 in std::string::_S_construct<char const*> (__beg=0x0, 
    __end=0xffffffffffffffff <error: Cannot access memory at address 0xffffffffffffffff>, __a=...) at /usr/include/c++/4.8/bits/basic_string.tcc:133
#8  0x00002b5c19aa0128 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) ()
   from /usr/local/apache2/modules/mod_pagespeed.so
#9  0x00002b5c19419846 in net_instaweb::GoogleUrl::GoogleUrl (
    this=0x2b5c29006850, str=0x0) at pagespeed/kernel/http/google_url.cc:57
#10 0x00002b5c1927a6e6 in net_instaweb::InstawebContext::MakeRequestUrl (
    global_options=..., request=0x2b5c2c002970)
    at pagespeed/apache/instaweb_context.cc:368
#11 0x00002b5c1927bc77 in net_instaweb::InstawebHandler::InstawebHandler (
    this=0x2b5c29006b00, request=0x2b5c2c002970)
    at pagespeed/apache/instaweb_handler.cc:118
#12 0x00002b5c19282b37 in net_instaweb::(anonymous namespace)::build_context_for_request (request=0x2b5c2c002970) at pagespeed/apache/mod_instaweb.cc:389
#13 0x00002b5c1928384b in net_instaweb::(anonymous namespace)::instaweb_out_filter (filter=0x2b5c2c00a888, bb=0x2b5c2c004408)
    at pagespeed/apache/mod_instaweb.cc:610

@jeffkaufman
Copy link
Contributor

The problem is that request->unparsed_uri can be NULL and we think it can't, in InstawebContext::MakeRequestUrl.

@crowell
Copy link
Contributor

crowell commented Jan 21, 2016

symbolized from the other bt, so yes it is the same issue 👍

(note to self info sharedlibrary gives .text address, not base)

[0x00078840]> fd 0x14e073
sym.net_instaweb::GoogleUrl::Relativize + 851
[0x00078840]> fd 0x7fb9a
sym.net_instaweb::InstawebContext::MakeRequestUrl + 282
[0x00078840]> fd 0x81e34
sym.net_instaweb::InstawebHandler::InstawebHandler + 436
[0x00078840]> fd 0x8b3a4
sym.net_instaweb::_anonymousnamespace_::instaweb_out_filter + 1012

@jeffkaufman
Copy link
Contributor

Turns out we're checking for this exact problem, just too late:

  InstawebHandler instaweb_handler(request);
     [ note: does a InstawebContext::MakeRequestUrl, which assumes unparsed_uri != NULL ]
  const RewriteOptions* options = instaweb_handler.options();

  if (request->unparsed_uri == NULL) {
    // TODO(jmarantz): consider adding Debug message if unparsed_uri is NULL,
    // possibly of request->the_request which was non-null in the case where
    // I found this in the debugger.
    ap_log_rerror(APLOG_MARK, APLOG_ERR, APR_SUCCESS, request,
                  "Request not rewritten because: "
                  "request->unparsed_uri == NULL");
    return NULL;
  }

jeffkaufman added a commit that referenced this issue Jan 21, 2016
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248
@praveen262k
Copy link
Author

@crowell @jeffkaufman @jmarantz - This is great and was quick turnaround and really appreciate the attention and prompt response! Thanks

@jeffkaufman
Copy link
Contributor

@praveen262k

Thanks! We're going to prepare an updated release that has this fix, and we're hoping to get that out early next week.

jeffkaufman added a commit that referenced this issue Jan 21, 2016
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248
crowell added a commit that referenced this issue Jan 21, 2016
apply befa494
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248
@jeffkaufman jeffkaufman changed the title Child process killed and left Apache not-responding Crash on very long urls Jan 25, 2016
jeffkaufman added a commit that referenced this issue Jan 25, 2016
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248
jeffkaufman added a commit that referenced this issue Jan 26, 2016
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248
jeffkaufman added a commit that referenced this issue Jan 26, 2016
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248
@praveen262k
Copy link
Author

Yet not successful finding the long URI - not in apache, F5, DynaTrace, CDN!

CDN came back with list of URL who total bytes exceed 8190 - I tried to dig apache code and I could not understand if the header or other attributes were included along with URI to come up with that number (8190 as configured in Apache) and throw error.

http://code.metager.de/source/xref/apache/httpd/server/protocol.c

@crowell
Copy link
Contributor

crowell commented Jan 26, 2016

@praveen262k yes, the 8190 is a limitation of apache. with the patch in mod_pagespeed, your server shouldn't crash any more, but we can't work around the limitation in apache.

@praveen262k
Copy link
Author

@crowell Thanks!

I was trying to find the actual URL because we don't expect anything of that large URI. However as part of RCA I was trying to find the URL to close the loop. CDN came back with logs that exceeded 8190 bytes but if I see those URI they were around 4000+ char length only.

@jmarantz
Copy link
Contributor

Could this be an external denial-of-service attack?

On Tue, Jan 26, 2016 at 1:51 PM, praveen262k notifications@github.com
wrote:

@crowell https://github.com/crowell Thanks!

I was trying to find the actual URL because we don't expect anything of
that large URI. However as part of RCA I was trying to find the URL to
close the loop. CDN came back with logs that exceeded 8190 bytes but if I
see those URI they were around 4000+ char length only.


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@praveen262k
Copy link
Author

Is it possible log the source URI in the error by pagespeed module? :)

@jmarantz
Copy link
Contributor

I think that's what got us into trouble at first -- we were logging it and
it was NULL, because Apache refuses to populate it.

On Tue, Jan 26, 2016 at 5:04 PM, praveen262k notifications@github.com
wrote:

Is it possible log the source URI in the error by pagespeed module? :)


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@praveen262k
Copy link
Author

@jeffkaufman - Do you think you might be able to tell us a tentative release date at this point? Thanks!

@jeffkaufman
Copy link
Contributor

@praveen262k I'm sorry, we've been running into issues with some other changes we're planning to include in this same bugfix release. We're hoping to have a release out that fixes this issue within a week, but we can't promise anything.

@praveen262k
Copy link
Author

@jeffkaufman Understood! Thanks for the update!

jeffkaufman added a commit that referenced this issue Jan 29, 2016
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248
jeffkaufman added a commit that referenced this issue Jan 29, 2016
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants