New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash on very long urls #1248

Closed
praveen262k opened this Issue Jan 19, 2016 · 39 comments

Comments

Projects
None yet
4 participants
@praveen262k
Copy link

praveen262k commented Jan 19, 2016

We enabled pagespeed module 2 weeks back. We have been monitoring site since then.

Last 2 days, we see below logs and server goes not responding. Any help would be appreciated! As of now we have pulled out the module from production.

GDB Backtrace log:

(gdb) bt
#0  0x00007f54cfd95625 in raise () from /lib64/libc.so.6
#1  0x00007f54cfd96e05 in abort () from /lib64/libc.so.6
#2  0x00007f54cad1fa5d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib64/libstdc++.so.6
#3  0x00007f54cad1dbe6 in ?? () from /usr/lib64/libstdc++.so.6
#4  0x00007f54cad1dc13 in std::terminate() () from /usr/lib64/libstdc++.so.6
#5  0x00007f54cad1dd0e in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00007f54cacc2837 in std::__throw_logic_error(char const*) () from /usr/lib64/libstdc++.so.6
#7  0x00007f54cacfde59 in ?? () from /usr/lib64/libstdc++.so.6
#8  0x00007f54cacfdf33 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) () from /usr/lib64/libstdc++.so.6
#9  0x00007f54c9fce073 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#10 0x00007f54c9effb9a in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#11 0x00007f54c9f01e34 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#12 0x00007f54c9f0b3a4 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#13 0x00007f54d1860a58 in ap_send_error_response ()
#14 0x00007f54d1848506 in ap_read_request ()
#15 0x00007f54d18600e0 in ?? ()
#16 0x00007f54d185c148 in ap_run_process_connection ()
#17 0x00007f54d1869052 in ?? ()
#18 0x00007f54d00fe9d1 in start_thread () from /lib64/libpthread.so.0
#19 0x00007f54cfe4b8fd in clone () from /lib64/libc.so.6

Apache log:

$ grep "child pid" error.20160118000000.log
[Sun Jan 17 20:52:31 2016] [notice] child pid 12587 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 08:51:39 2016] [notice] child pid 14373 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 09:36:22 2016] [notice] child pid 473 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 09:40:47 2016] [notice] child pid 24587 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:32 2016] [notice] child pid 29667 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:32 2016] [notice] child pid 8104 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:32 2016] [notice] child pid 8695 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:44 2016] [notice] child pid 18148 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:46 2016] [notice] child pid 22018 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:49 2016] [notice] child pid 22962 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:49 2016] [notice] child pid 23747 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:49 2016] [notice] child pid 23748 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:52 2016] [notice] child pid 25136 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 25719 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 25720 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 27435 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 27436 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 27443 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:04 2016] [notice] child pid 32360 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:14 2016] [notice] child pid 9726 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:14 2016] [notice] child pid 9727 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:22 2016] [notice] child pid 14263 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:50 2016] [notice] child pid 20134 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:50 2016] [notice] child pid 21619 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:50 2016] [notice] child pid 26322 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:51 2016] [notice] child pid 3596 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:51 2016] [notice] child pid 28482 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:53 2016] [notice] child pid 6108 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:02 2016] [notice] child pid 6925 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:07 2016] [notice] child pid 8408 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:13 2016] [notice] child pid 7685 exit signal Aborted (6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:14 2016] [notice] child pid 9603 exit signal Aborted (6), possible coredump in /etc/httpd
@crowell

This comment has been minimized.

Copy link
Contributor

crowell commented Jan 19, 2016

What version/os/64 or 32 bits?
On Jan 19, 2016 5:51 PM, "praveen262k" notifications@github.com wrote:

We enabled pagespeed module 2 weeks back. We have been monitoring site
since then.

Last 2 days, we see below logs and server goes not responding. Any help
would be appreciated! As of now we have pulled out the module from
production.
GDB Backtrace log:

(gdb) bt
#0 0x00007f54cfd95625 in raise () from /lib64/libc.so.6
#1 #1
0x00007f54cfd96e05 in abort () from /lib64/libc.so.6
#2 #2
0x00007f54cad1fa5d in __gnu_cxx::__verbose_terminate_handler() () from
/usr/lib64/libstdc++.so.6
#3 #3
0x00007f54cad1dbe6 in ?? () from /usr/lib64/libstdc++.so.6
#4 #4
0x00007f54cad1dc13 in std::terminate() () from /usr/lib64/libstdc++.so.6
#5 #5
0x00007f54cad1dd0e in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6 #6
0x00007f54cacc2837 in std::__throw_logic_error(char const

) () from /usr/lib64/libstdc++.so.6 #7
#7 0x00007f54cacfde59 in
?? () from /usr/lib64/libstdc++.so.6 #8
#8 0x00007f54cacfdf33 in
std::basic_string, std::allocator >::basic_string(char const
,
std::allocator const&) () from /usr/lib64/libstdc++.so.6
#9 #9
0x00007f54c9fce073 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#10 #10
0x00007f54c9effb9a in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#11 #11
0x00007f54c9f01e34 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#12 #12
0x00007f54c9f0b3a4 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#13 #13
0x00007f54d1860a58 in ap_send_error_response ()
#14 #14
0x00007f54d1848506 in ap_read_request ()
#15 #15
0x00007f54d18600e0 in ?? ()
#16 #16
0x00007f54d185c148 in ap_run_process_connection ()
#17 #17
0x00007f54d1869052 in ?? ()
#18 #18
0x00007f54d00fe9d1 in start_thread () from /lib64/libpthread.so.0
#19 #19
0x00007f54cfe4b8fd in clone () from /lib64/libc.so.6

Apache log:
$ grep "child pid" error.20160118000000.log
[Sun Jan 17 20:52:31 2016] [notice] child pid 12587 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 08:51:39 2016] [notice] child pid 14373 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 09:36:22 2016] [notice] child pid 473 exit signal Aborted (6),
possible coredump in /etc/httpd
[Mon Jan 18 09:40:47 2016] [notice] child pid 24587 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:32 2016] [notice] child pid 29667 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:32 2016] [notice] child pid 8104 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:32 2016] [notice] child pid 8695 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:44 2016] [notice] child pid 18148 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:46 2016] [notice] child pid 22018 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:49 2016] [notice] child pid 22962 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:49 2016] [notice] child pid 23747 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:49 2016] [notice] child pid 23748 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:52 2016] [notice] child pid 25136 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 25719 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 25720 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 27435 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 27436 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:52:54 2016] [notice] child pid 27443 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:04 2016] [notice] child pid 32360 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:14 2016] [notice] child pid 9726 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:14 2016] [notice] child pid 9727 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 10:53:22 2016] [notice] child pid 14263 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:50 2016] [notice] child pid 20134 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:50 2016] [notice] child pid 21619 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:50 2016] [notice] child pid 26322 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:51 2016] [notice] child pid 3596 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:51 2016] [notice] child pid 28482 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:57:53 2016] [notice] child pid 6108 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:02 2016] [notice] child pid 6925 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:07 2016] [notice] child pid 8408 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:13 2016] [notice] child pid 7685 exit signal Aborted
(6), possible coredump in /etc/httpd
[Mon Jan 18 12:58:14 2016] [notice] child pid 9603 exit signal Aborted
(6), possible coredump in /etc/httpd


Reply to this email directly or view it on GitHub
#1248.

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 19, 2016

Apache version: 2.2
OS: Centos Red Hat Linux 64 bit

Thanks!

@crowell

This comment has been minimized.

Copy link
Contributor

crowell commented Jan 19, 2016

Also what version of mod_pagespeed?
On Jan 19, 2016 5:59 PM, "praveen262k" notifications@github.com wrote:

Apache version: 2.2
OS: Centos Red Hat Linux 64 bit

Thanks!


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 19, 2016

ModSpeed version: Version: 1.9.32.11
Thanks!

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 19, 2016

I saw similar issue reported #334 (looks pretty old ticket) - and was mentioned that it was resolved in Release 0.10.22.4 (https://developers.google.com/speed/pagespeed/module/release_notes#release_1.9.32.11-stable)

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 20, 2016

To add a few more information:

  1. We have enabled rewrite css, rewrite js and collapse whitespace are only enabled
  2. ModePagespeedRewriteLevel PassThrough
@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 20, 2016

A few more logs details we pulled from apache log:

Tue Jan 19 04:30:28 2016] [error] [mod_pagespeed 1.9.32.11-7550 @9909] .....
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_S_construct NULL not valid
[Tue Jan 19 04:30:29 2016] [notice] child pid 9909 exit signal Aborted (6), possible coredump in /etc/httpd

@crowell

This comment has been minimized.

Copy link
Contributor

crowell commented Jan 20, 2016

great, thanks. I'll symbolize the backtrace tomorrow morning and hopefully
that'll make pinpointing the problem easy.
On Jan 19, 2016 8:13 PM, "praveen262k" notifications@github.com wrote:

A few more logs details we pulled from apache log:

Tue Jan 19 04:30:28 2016] [error] [mod_pagespeed 1.9.32.11-7550 @9909]
.....
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_S_construct NULL not valid
[Tue Jan 19 04:30:29 2016] [notice] child pid 9909 exit signal Aborted
(6), possible coredump in /etc/httpd


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 20, 2016

A few more info:

We noticed that somehow this error is related to the crash (most of the time):
[Tue Jan 19 12:06:16 2016] [error] [client XXX.X.XXX.XX] request failed: URI too long (longer than 8190)

@crowell

This comment has been minimized.

Copy link
Contributor

crowell commented Jan 20, 2016

if you still have the process in gdb, could you do info proc mappings?

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 20, 2016

Unfortunately it says proc folder not found

However out team gave some more insight from the coredump and below is details

(gdb) bt
#0  0x00007f54cfd95625 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f54cfd96e05 in abort () at abort.c:92
#2  0x00007f54cad1fa5d in __gnu_cxx::__verbose_terminate_handler () at ../../../../libstdc++-v3/libsupc++/vterminate.cc:93
#3  0x00007f54cad1dbe6 in __cxxabiv1::__terminate (handler=<value optimized out>) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:38
#4  0x00007f54cad1dc13 in std::terminate () at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
#5  0x00007f54cad1dd0e in __cxxabiv1::__cxa_throw (obj=0x7f52a4010de0, tinfo=<value optimized out>, dest=<value optimized out>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:83
#6  0x00007f54cacc2837 in std::__throw_logic_error (__s=<value optimized out>) at ../../../../libstdc++-v3/src/functexcept.cc:60
#7  0x00007f54cacfde59 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_construct<char const*> (__beg=0x0, __end=<value optimized out>,
    __a=<value optimized out>) at /usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:134
#8  0x00007f54cacfdf33 in _S_construct_aux<char const*> (this=0x7f54a0bcc580, __s=0x0, __a=...)
    at /usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:1543
#9  _S_construct<char const*> (this=0x7f54a0bcc580, __s=0x0, __a=...)
    at /usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:1559
#10 std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (this=0x7f54a0bcc580, __s=0x0, __a=...)
    at /usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:216
#11 0x00007f54c9fce073 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#12 0x00007f54c9effb9a in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#13 0x00007f54c9f01e34 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#14 0x00007f54c9f0b3a4 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#15 0x00007f54d1860a58 in ap_send_error_response (r=0x7f52a8097338, recursive_error=<value optimized out>) at /usr/src/debug/httpd-2.2.15/modules/http/http_protocol.c:1288
#16 0x00007f54d1848506 in ap_read_request (conn=0x7f538c05d3c0) at /usr/src/debug/httpd-2.2.15/server/protocol.c:944
#17 0x00007f54d18600e0 in ap_process_http_connection (c=0x7f538c05d3c0) at /usr/src/debug/httpd-2.2.15/modules/http/http_core.c:183
#18 0x00007f54d185c148 in ap_run_process_connection (c=0x7f538c05d3c0) at /usr/src/debug/httpd-2.2.15/server/connection.c:43
#19 0x00007f54d1869052 in process_socket (thd=0x7f54d2d8a340, dummy=<value optimized out>) at /usr/src/debug/httpd-2.2.15/server/mpm/worker/worker.c:544
#20 worker_thread (thd=0x7f54d2d8a340, dummy=<value optimized out>) at /usr/src/debug/httpd-2.2.15/server/mpm/worker/worker.c:894
#21 0x00007f54d00fe9d1 in start_thread (arg=0x7f54a0bcd700) at pthread_create.c:301
#22 0x00007f54cfe4b8fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
@crowell

This comment has been minimized.

Copy link
Contributor

crowell commented Jan 20, 2016

Can you do info sharedlibrary ?
On Jan 20, 2016 11:51 AM, "praveen262k" notifications@github.com wrote:

Unfortunately it says proc folder not found

However out team gave some more insight from the coredump and below is
details

(gdb) bt
#0 0x00007f54cfd95625 in raise (sig=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 #1
0x00007f54cfd96e05 in abort () at abort.c:92
#2 #2
0x00007f54cad1fa5d in __gnu_cxx::__verbose_terminate_handler () at
../../../../libstdc++-v3/libsupc++/vterminate.cc:93
#3 #3
0x00007f54cad1dbe6 in __cxxabiv1::__terminate (handler=) at
../../../../libstdc++-v3/libsupc++/eh_terminate.cc:38
#4 #4
0x00007f54cad1dc13 in std::terminate () at
../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
#5 #5
0x00007f54cad1dd0e in __cxxabiv1::__cxa_throw (obj=0x7f52a4010de0, tinfo=,
dest=) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:83
#6 #6
0x00007f54cacc2837 in std::__throw_logic_error (__s=) at
../../../../libstdc++-v3/src/functexcept.cc:60
#7 #7
0x00007f54cacfde59 in std::basic_string, std::allocator >::_S_construct
(__beg=0x0, __end=,
__a=) at
/usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:134
#8 #8
0x00007f54cacfdf33 in _S_construct_aux (this=0x7f54a0bcc580, __s=0x0,
__a=...)
at
/usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:1543
#9 #9 _S_construct
(this=0x7f54a0bcc580, __s=0x0, __a=...)
at
/usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:1559
#10 #10
std::basic_string, std::allocator >::basic_string (this=0x7f54a0bcc580,
__s=0x0, __a=...)
at
/usr/src/debug/gcc-4.4.7-20120601/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:216
#11 #11
0x00007f54c9fce073 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#12 #12
0x00007f54c9effb9a in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#13 #13
0x00007f54c9f01e34 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#14 #14
0x00007f54c9f0b3a4 in ?? () from /usr/lib64/httpd/modules/mod_pagespeed.so
#15 #15
0x00007f54d1860a58 in ap_send_error_response (r=0x7f52a8097338,
recursive_error=) at
/usr/src/debug/httpd-2.2.15/modules/http/http_protocol.c:1288
#16 #16
0x00007f54d1848506 in ap_read_request (conn=0x7f538c05d3c0) at
/usr/src/debug/httpd-2.2.15/server/protocol.c:944
#17 #17
0x00007f54d18600e0 in ap_process_http_connection (c=0x7f538c05d3c0) at
/usr/src/debug/httpd-2.2.15/modules/http/http_core.c:183
#18 #18
0x00007f54d185c148 in ap_run_process_connection (c=0x7f538c05d3c0) at
/usr/src/debug/httpd-2.2.15/server/connection.c:43
#19 #19
0x00007f54d1869052 in process_socket (thd=0x7f54d2d8a340, dummy=) at
/usr/src/debug/httpd-2.2.15/server/mpm/worker/worker.c:544
#20 #20 worker_thread
(thd=0x7f54d2d8a340, dummy=) at
/usr/src/debug/httpd-2.2.15/server/mpm/worker/worker.c:894
#21 #21
0x00007f54d00fe9d1 in start_thread (arg=0x7f54a0bcd700) at
pthread_create.c:301
#22 #22
0x00007f54cfe4b8fd in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:115


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 20, 2016

Please find the attached image - I could not format is well here
noname
noname-2

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 20, 2016

Some more data!

screen shot 2016-01-20 at 11 24 52 am

URI too long -> No of time we see this error log
std::logic_error -> Count of this error messages
child Process -> No of times Child process terminate

We could not find yet why Apache is reporting long URL as we don't see that in our apache access log yet!

@crowell

This comment has been minimized.

Copy link
Contributor

crowell commented Jan 20, 2016

is the backtrace from the same process as the info sharedlibrary, the bt doesn't seem to align with anything reasonable here.

@jmarantz

This comment has been minimized.

Copy link
Contributor

jmarantz commented Jan 20, 2016

Can you glance at the access.log? This message does come from
mod_pagespeed:
[Tue Jan 19 12:06:16 2016] [error] [client 204.2.223.88] request failed:
URI too long (longer than 8190)
but it looks important, as it is correlated strongly with the crashes.
Maybe the long URLs are in the access.log and will help us figure out
what's going on.

-Josh

On Wed, Jan 20, 2016 at 10:39 AM, Jeffrey Crowell notifications@github.com
wrote:

is the backtrace from the same process as the info sharedlibrary, the bt
doesn't seem to align with anything reasonable here.


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 20, 2016

@jmarantz I ran splunk query from those IP address and I could not find any URI that is long during that time window or from any other request at all.

Is it possible that URI was not too long but it was something else reported as URI too long?

@crowell - Yes both were run against the same coredump! Strange!

@crowell

This comment has been minimized.

Copy link
Contributor

crowell commented Jan 20, 2016

not doubting the relation between the crashes and the message, but that
logging is done by apache itself, not by pagespeed

http://apache2.sourcearchive.com/documentation/2.2.16-4/protocol_8c-source.html

On Wed, Jan 20, 2016 at 1:57 PM, Joshua Marantz notifications@github.com
wrote:

Can you glance at the access.log? This message does come from
mod_pagespeed:
[Tue Jan 19 12:06:16 2016] [error] [client 204.2.223.88] request failed:
URI too long (longer than 8190)
but it looks important, as it is correlated strongly with the crashes.
Maybe the long URLs are in the access.log and will help us figure out
what's going on.

-Josh

On Wed, Jan 20, 2016 at 10:39 AM, Jeffrey Crowell <
notifications@github.com>
wrote:

is the backtrace from the same process as the info sharedlibrary, the bt
doesn't seem to align with anything reasonable here.


Reply to this email directly or view it on GitHub
<
#1248 (comment)

.


Reply to this email directly or view it on GitHub
#1248 (comment)
.

Jeff Crowell
https://github.com/crowell

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 20, 2016

I am wondering since process crashed, may explain why I am not finding this 'long URI' on access log. Does that make sense?

@jmarantz

This comment has been minimized.

Copy link
Contributor

jmarantz commented Jan 20, 2016

I'm also suspicious that the crash is caused by an out-of-memory condition. I'm wondering if there are any other messages in the error log that might shed light on the cause?

Also, you mentioned above you have three filters enabled: rewrite_javascript, rewrite_css, and collapse_whitespace. Can you try turning them all off and seeing if the problem shows up? If it doesn't show up, can you add them one at a time, and see which one causes the problem?

Thanks!
-Josh

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 21, 2016

I am not sure if it is related to memory. So far we have not seen memory issue reported but I will dig in to those area. Every time crashing with URI too long make me think it may not be related to memory crash but I could be complete wrong.

We were going to recreate this issue on non-prod environment to simulate these scenarios as well. Without getting in to details initial thought apache was returning empty response with 414 status code and pagespeed module crashed with null exception

@basic_string.tcc -> line 134: __throw_logic_error(__N("basic_string::_S_construct NULL not valid"));

@jmarantz

This comment has been minimized.

Copy link
Contributor

jmarantz commented Jan 21, 2016

You are probably right. If we can get an accurate symbolized stack trace
we should be able to narrow down the cause. Would you be able to take a
debug version of mod_pagespeed.so and run with that?

On Wed, Jan 20, 2016 at 7:58 PM, praveen262k notifications@github.com
wrote:

I am not sure if it is related to memory. So far we have not seen memory
issue reported but I will dig in to those area. Every time crashing with
URI too long make me think it may not be related to memory crash but I
could be complete wrong.

We were going to recreate this issue on non-prod environment to simulate
these scenarios as well. Without getting in to details initial thought
apache was returning empty response with 414 status code and pagespeed
module crashed with null exception

@basic_string.tcc -> line 134:
__throw_logic_error(__N("basic_string::_S_construct NULL not valid"));


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented Jan 21, 2016

The ap_send_error_response in the backtrace looks like a strong clue. That's not a path I'd expect us to go down often, and if we're processing an error response structures we expect to be set up properly might be incomplete. For example, url could be NULL.

I'm going to try to reproduce this here, triggering HTTP_REQUEST_URI_TOO_LARGE.

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented Jan 21, 2016

To hit the request too large error message, the path is:

ap_read_request
  -> read_request_line(r, tmp_bb) returns false
     -> r->status == HTTP_REQUEST_URI_TOO_LARGE

read_request_line
  -> ap_rgetline() returns APR_ENOSPC
     ->     /* ap_rgetline returns APR_ENOSPC if it fills up the                 
             * buffer before finding the end-of-line.  This is only going to     
             * happen if it exceeds the configured limit for a request-line.     
             */

ap_rgetline()
  -> ap_rgetline_core()
    -> return APR_ENOSPC if this read would overrun the
       buffer or if there's more to read but we're full

This looks pretty normal, and like I should be able to repro by sending an 8k+ request line.

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented Jan 21, 2016

Reproduced!

curl -sS -D- localhost:8080/mod_pagespeed_example/a...[16000 times repeated]...a
#7  0x00002b5c19585977 in std::string::_S_construct<char const*> (__beg=0x0, 
    __end=0xffffffffffffffff <error: Cannot access memory at address 0xffffffffffffffff>, __a=...) at /usr/include/c++/4.8/bits/basic_string.tcc:133
#8  0x00002b5c19aa0128 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) ()
   from /usr/local/apache2/modules/mod_pagespeed.so
#9  0x00002b5c19419846 in net_instaweb::GoogleUrl::GoogleUrl (
    this=0x2b5c29006850, str=0x0) at pagespeed/kernel/http/google_url.cc:57
#10 0x00002b5c1927a6e6 in net_instaweb::InstawebContext::MakeRequestUrl (
    global_options=..., request=0x2b5c2c002970)
    at pagespeed/apache/instaweb_context.cc:368
#11 0x00002b5c1927bc77 in net_instaweb::InstawebHandler::InstawebHandler (
    this=0x2b5c29006b00, request=0x2b5c2c002970)
    at pagespeed/apache/instaweb_handler.cc:118
#12 0x00002b5c19282b37 in net_instaweb::(anonymous namespace)::build_context_for_request (request=0x2b5c2c002970) at pagespeed/apache/mod_instaweb.cc:389
#13 0x00002b5c1928384b in net_instaweb::(anonymous namespace)::instaweb_out_filter (filter=0x2b5c2c00a888, bb=0x2b5c2c004408)
    at pagespeed/apache/mod_instaweb.cc:610
@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented Jan 21, 2016

The problem is that request->unparsed_uri can be NULL and we think it can't, in InstawebContext::MakeRequestUrl.

@crowell

This comment has been minimized.

Copy link
Contributor

crowell commented Jan 21, 2016

symbolized from the other bt, so yes it is the same issue 👍

(note to self info sharedlibrary gives .text address, not base)

[0x00078840]> fd 0x14e073
sym.net_instaweb::GoogleUrl::Relativize + 851
[0x00078840]> fd 0x7fb9a
sym.net_instaweb::InstawebContext::MakeRequestUrl + 282
[0x00078840]> fd 0x81e34
sym.net_instaweb::InstawebHandler::InstawebHandler + 436
[0x00078840]> fd 0x8b3a4
sym.net_instaweb::_anonymousnamespace_::instaweb_out_filter + 1012
@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented Jan 21, 2016

Turns out we're checking for this exact problem, just too late:

  InstawebHandler instaweb_handler(request);
     [ note: does a InstawebContext::MakeRequestUrl, which assumes unparsed_uri != NULL ]
  const RewriteOptions* options = instaweb_handler.options();

  if (request->unparsed_uri == NULL) {
    // TODO(jmarantz): consider adding Debug message if unparsed_uri is NULL,
    // possibly of request->the_request which was non-null in the case where
    // I found this in the debugger.
    ap_log_rerror(APLOG_MARK, APLOG_ERR, APR_SUCCESS, request,
                  "Request not rewritten because: "
                  "request->unparsed_uri == NULL");
    return NULL;
  }

jeffkaufman added a commit that referenced this issue Jan 21, 2016

Move InstawebHandler initialization after validity checks.
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248
@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 21, 2016

@crowell @jeffkaufman @jmarantz - This is great and was quick turnaround and really appreciate the attention and prompt response! Thanks

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented Jan 21, 2016

@praveen262k

Thanks! We're going to prepare an updated release that has this fix, and we're hoping to get that out early next week.

jeffkaufman added a commit that referenced this issue Jan 21, 2016

Move InstawebHandler initialization after validity checks.
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248

crowell added a commit that referenced this issue Jan 21, 2016

Move InstawebHandler initialization after validity checks.
apply befa494
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248

@jeffkaufman jeffkaufman changed the title Child process killed and left Apache not-responding Crash on very long urls Jan 25, 2016

jeffkaufman added a commit that referenced this issue Jan 25, 2016

Move InstawebHandler initialization after validity checks.
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248

jeffkaufman added a commit that referenced this issue Jan 26, 2016

Move InstawebHandler initialization after validity checks.
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248

jeffkaufman added a commit that referenced this issue Jan 26, 2016

Move InstawebHandler initialization after validity checks.
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248
@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 26, 2016

Yet not successful finding the long URI - not in apache, F5, DynaTrace, CDN!

CDN came back with list of URL who total bytes exceed 8190 - I tried to dig apache code and I could not understand if the header or other attributes were included along with URI to come up with that number (8190 as configured in Apache) and throw error.

http://code.metager.de/source/xref/apache/httpd/server/protocol.c

@crowell

This comment has been minimized.

Copy link
Contributor

crowell commented Jan 26, 2016

@praveen262k yes, the 8190 is a limitation of apache. with the patch in mod_pagespeed, your server shouldn't crash any more, but we can't work around the limitation in apache.

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 26, 2016

@crowell Thanks!

I was trying to find the actual URL because we don't expect anything of that large URI. However as part of RCA I was trying to find the URL to close the loop. CDN came back with logs that exceeded 8190 bytes but if I see those URI they were around 4000+ char length only.

@jmarantz

This comment has been minimized.

Copy link
Contributor

jmarantz commented Jan 26, 2016

Could this be an external denial-of-service attack?

On Tue, Jan 26, 2016 at 1:51 PM, praveen262k notifications@github.com
wrote:

@crowell https://github.com/crowell Thanks!

I was trying to find the actual URL because we don't expect anything of
that large URI. However as part of RCA I was trying to find the URL to
close the loop. CDN came back with logs that exceeded 8190 bytes but if I
see those URI they were around 4000+ char length only.


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 26, 2016

Is it possible log the source URI in the error by pagespeed module? :)

@jmarantz

This comment has been minimized.

Copy link
Contributor

jmarantz commented Jan 26, 2016

I think that's what got us into trouble at first -- we were logging it and
it was NULL, because Apache refuses to populate it.

On Tue, Jan 26, 2016 at 5:04 PM, praveen262k notifications@github.com
wrote:

Is it possible log the source URI in the error by pagespeed module? :)


Reply to this email directly or view it on GitHub
#1248 (comment)
.

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 27, 2016

@jeffkaufman - Do you think you might be able to tell us a tentative release date at this point? Thanks!

@jeffkaufman

This comment has been minimized.

Copy link
Contributor

jeffkaufman commented Jan 27, 2016

@praveen262k I'm sorry, we've been running into issues with some other changes we're planning to include in this same bugfix release. We're hoping to have a release out that fixes this issue within a week, but we can't promise anything.

@praveen262k

This comment has been minimized.

Copy link

praveen262k commented Jan 27, 2016

@jeffkaufman Understood! Thanks for the update!

jeffkaufman added a commit that referenced this issue Jan 29, 2016

Move InstawebHandler initialization after validity checks.
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248

jeffkaufman added a commit that referenced this issue Jan 29, 2016

Move InstawebHandler initialization after validity checks.
Creating an instaweb handler will run MakeRequestUrl, which assumes that
request->unparsed_uri is non-null.  So move the creation to after where
we check that it's non-null.  To be safe, move it all the way down to
where it's first needed, in case some other validity checks end up being
relevant.

Fixes #1248
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment