Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No default temp path in nginx module causes segfault. #2189

Closed
aried3r opened this issue Mar 22, 2019 · 30 comments

Comments

@aried3r
Copy link

commented Mar 22, 2019

Issue report

Question 1: What is the problem?
Your answer:

With nginx 1:1.15.8-8.6.0.1~xenial1 on a Rails 5.2.2.1 app I can use Rails send_file without a problem. When upgrading to nginx/xenial 1:1.15.8-8.6.0.2~xenial1 this doesn't work anymore (for most file, not all of them. For example thumbnails of images work, but the originals don't).

nginx' error.log shows:

2019/03/22 13:10:47 [alert] 21227#21227: worker process 2032 exited on signal 11
2019/03/22 13:10:47 [alert] 21227#21227: worker process 2031 exited on signal 11
2019/03/22 13:10:47 [alert] 21227#21227: worker process 23789 exited on signal 11
2019/03/22 13:12:02 [alert] 21227#21227: worker process 23788 exited on signal 11
2019/03/22 13:12:03 [alert] 21227#21227: worker process 23790 exited on signal 11
2019/03/22 13:12:03 [alert] 21227#21227: worker process 23898 exited on signal 11
2019/03/22 13:12:53 [alert] 21227#21227: worker process 23901 exited on signal 11
2019/03/22 13:16:02 [alert] 21227#21227: worker process 23902 exited on signal 11
2019/03/22 13:16:02 [alert] 21227#21227: worker process 23976 exited on signal 11
2019/03/22 13:16:02 [alert] 21227#21227: worker process 24219 exited on signal 11

I can reproduce this by upgrading nginx, nginx-common, nginx-extras to 1:1.15.8-8.6.0.2~xenial1 and then downgrading again to 1:1.15.8-8.6.0.1~xenial1.

Since I am using the passenger provided nginx packages, I believe this to be a passenger problem, but that's just an assumption. I couldn't find the source repo or changelog of these releases.

I'm not sure if I can provide a reproduction case since it just happens on our staging server after the upgrade of the packages and not for all files. I'm happy to provide additional information!

Question 2: Passenger version and integration mode:

  • For example: open source 5.0.26 standalone; enterprise 5.0.21/nginx
$ passenger --version
Phusion Passenger 6.0.2
$ nginx -v
nginx version: nginx/1.15.8

Question 3: OS or Linux distro, platform (including version):

Ubuntu 16.04 LTS

Question 4: Passenger installation method:

Phusion APT repo

# /etc/apt/sources.list.d/phusionpassenger.list
deb      "https://oss-binaries.phusionpassenger.com/apt/passenger" xenial main
deb-src  "https://oss-binaries.phusionpassenger.com/apt/passenger" xenial main

Question 5: Your app's programming language (including any version managers) and framework (including versions):

Ruby 2.6.2, rbenv, Rails 5.2.2.1

Question 6: Are you using a PaaS and/or containerization? If so which one?

No.

Question 7: Anything else about your setup that we should know?

Using letsencrypt and the following nginx config (don't mind nginx 1.14.0 being shown there, it's just what the page generates):
https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=nginx-1.14.0&openssl=1.0.2g&hsts=yes&profile=modern

@sauliusgrigaitis

This comment has been minimized.

Copy link

commented Mar 23, 2019

I can confirm this. Same problem on a few servers running Ubuntu 16.04 and Ubuntu 18.04. Nginx just exits whenever the file is a bit larger (starting from around a few hundred of KBs).

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Mar 25, 2019

Thanks for this report I'll look into it asap.

@CamJN

This comment has been minimized.

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Mar 26, 2019

@aried3r since you indicated you were interested, here's our ubuntu/debian nginx repo source: https://github.com/phusion/passenger_apt_automation/tree/master/debian_specs/nginx

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Mar 26, 2019

Ok I'm going to need more info, as unfortunately I'm not able to reproduce this issue in my test environment. I've setup a Xenial vm, running our nginx-extras package (1:1.15.8-8.6.0.2~xenial1) and a basic Rails (5.2.2.1) app, with a route that responds with send_file. The file it sends is just under a MB (897KB).

Here's the controller method:

def sendfile
  send_file '/var/www/blog/log/production.log'
end

So is there something else that you can think of that I should do to make my test case more similar to your setup?

@CamJN CamJN added the NeedsFeedback label Mar 26, 2019
@sauliusgrigaitis

This comment has been minimized.

Copy link

commented Mar 28, 2019

Maybe you can try with a much larger file? Say at least 100MB.

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Mar 28, 2019

Sure can, i just used this one because you mentioned it started from around a few hundred KBs.

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Mar 28, 2019

Tested with a 143MB file, same result unfortunately.

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

Perhaps one of you could attach gdb to nginx and grab a backtrace of the segfault? Since I'm still having no luck reproducing this on my end.

@jhill

This comment has been minimized.

Copy link

commented Apr 2, 2019

Core was generated by `nginx: worker pr'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000000004424d8 in ngx_create_temp_file ()
(gdb) backtrace
#0 0x00000000004424d8 in ngx_create_temp_file ()
#1 0x00000000004425ae in ngx_write_chain_to_temp_file ()
#2 0x000000000045c31f in ?? ()
#3 0x000000000045d46a in ngx_event_pipe ()
#4 0x000000000048d7aa in ?? ()
#5 0x000000000048c5cf in ?? ()
#6 0x0000000000464679 in ?? ()
#7 0x0000000000459ae7 in ngx_process_events_and_timers ()
#8 0x00000000004626c5 in ?? ()
#9 0x0000000000460d0c in ngx_spawn_process ()
#10 0x0000000000461e84 in ?? ()
#11 0x000000000046330f in ngx_master_process_cycle ()
#12 0x00000000004381c8 in main ()

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Apr 3, 2019

Ah, so we removed some code that set ngx_http_proxy_temp_path in this commit: 7d4251b, I wonder if that variable was used as a default for all temp files for nginx (it didn't seem to cause an issue in my tests). Can you set a proxy_temp_path config and see if the issue persists?

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Apr 3, 2019

@aried3r

This comment has been minimized.

Copy link
Author

commented Apr 3, 2019

@jhill, thanks for providing the stack trace. Is there any guide I could follow to also produce one? Just in case mine would be different.

@CamJN, thanks for investigating!
A side question, is using passenger's nginx version still the recommended way to go once moving from Ubuntu 16.04 to 18.04? I saw in the installation guides that after 17.04 using just libnginx-mod-http-passenger with the provided nginx package from Ubuntu or nginx directly is being shown in the guides. I'm mainly asking out of interest and if this would have prevented this bug from occurring on your systems, since there wouldn't have been a custom built of nginx. Thanks!

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Apr 3, 2019

@aried3r we recommend using the system nginx for ubuntu 18.04. In-fact in light of this issue I just looked and I might be able to get ubuntu 16.04 onto a dynamic module+system nginx too.

Also possibly of interest, if the app is using send_file and nginx is writing the response to a file on disk, then there is no speedup. send_file is only faster if the kernel takes over writing directly into the socket, but by writing to a tempfile first any speedup is going to be wasted.

As for the backtrace, this should work:

gdb nginx
set detach-on-fork off
run
# run curl in another terminal or make a request in your browser to cause the issue, then come back here
bt
@CamJN

This comment has been minimized.

Copy link
Contributor

commented Apr 11, 2019

@sauliusgrigaitis or @aried3r have you had a chance to try this with ngx_http_proxy_temp_path set?

@sauliusgrigaitis

This comment has been minimized.

Copy link

commented Apr 18, 2019

@CamJN do you mean set proxy_temp_path ?

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Apr 18, 2019

@sauliusgrigaitis yes, sorry.

@sauliusgrigaitis

This comment has been minimized.

Copy link

commented Apr 18, 2019

@CamJN I tried with proxy_temp_path set to custom path and it still crashes.

@sauliusgrigaitis

This comment has been minimized.

Copy link

commented Jun 7, 2019

Is there anything else we could try?

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Jun 7, 2019

At this point someone would need to install the nginx-extras-dbg package, then start nginx under gdb, then trigger the segfault, which would then provide the source code line that is actually causing the issue within the ngx_create_temp_file function; and then they could inspect the state of the program, such as the variables (esp pointers) to see what's actually breaking. It's likely that an invalid pointer was dereferenced or an array was indexed out of bounds, but it could be something else.

@kevgrig

This comment has been minimized.

Copy link

commented Jun 21, 2019

This issue is affecting me as well (Passenger; send_file; nginx 1.17.0). Backtrace:

(gdb) bt
#0  ngx_create_temp_file (file=file@entry=0xf8f898, path=0x0, pool=0xf91130, persistent=0, clean=0, access=0) at src/core/ngx_file.c:159
#1  0x00000000004548ab in ngx_write_chain_to_temp_file (tf=0xf8f898, chain=chain@entry=0xf8f9a0) at src/core/ngx_file.c:114
#2  0x000000000046e126 in ngx_event_pipe_write_chain_to_temp_file (p=p@entry=0xf92968) at src/event/ngx_event_pipe.c:839
#3  0x000000000046eec1 in ngx_event_pipe_read_upstream (p=0xf92968) at src/event/ngx_event_pipe.c:273
#4  ngx_event_pipe (p=p@entry=0xf92968, do_write=<optimized out>, do_write@entry=0) at src/event/ngx_event_pipe.c:49
#5  0x000000000049e4eb in ngx_http_upstream_process_upstream (r=0xf91180, u=0xf92520) at src/http/ngx_http_upstream.c:3924
#6  0x000000000049d4f2 in ngx_http_upstream_handler (ev=0xfa1e00) at src/http/ngx_http_upstream.c:1289
#7  0x000000000047662e in ngx_epoll_process_events (cycle=0xf26b10, timer=<optimized out>, flags=<optimized out>) at src/event/modules/ngx_epoll_module.c:902
#8  0x000000000046b9b2 in ngx_process_events_and_timers (cycle=cycle@entry=0xf26b10) at src/event/ngx_event.c:242
#9  0x0000000000474398 in ngx_worker_process_cycle (cycle=0xf26b10, data=<optimized out>) at src/os/unix/ngx_process_cycle.c:750
#10 0x00000000004728f3 in ngx_spawn_process (cycle=cycle@entry=0xf26b10, proc=proc@entry=0x474283 <ngx_worker_process_cycle>, data=data@entry=0x0, name=name@entry=0x5b39de "worker process", respawn=respawn@entry=-3)
    at src/os/unix/ngx_process.c:199
#11 0x0000000000473456 in ngx_start_worker_processes (cycle=cycle@entry=0xf26b10, n=1, type=type@entry=-3) at src/os/unix/ngx_process_cycle.c:359
#12 0x0000000000474c40 in ngx_master_process_cycle (cycle=cycle@entry=0xf26b10) at src/os/unix/ngx_process_cycle.c:131
#13 0x000000000044b98c in main (argc=1, argv=<optimized out>) at src/core/nginx.c:382

The SIGSEGV is because path is NULL:

(gdb) frame 0
#0  ngx_create_temp_file (file=file@entry=0xf8f898, path=0x0, pool=0xf91130, persistent=0, clean=0, access=0) at src/core/ngx_file.c:159
159             name = path->name;

The path comes from tf->path which is NULL:

(gdb) frame 1
#1  0x00000000004548ab in ngx_write_chain_to_temp_file (tf=0xf8f898, chain=chain@entry=0xf8f9a0) at src/core/ngx_file.c:114
114             rc = ngx_create_temp_file(&tf->file, tf->path, tf->pool,
(gdb) print *tf
$1 = {file = {fd = -1, name = {len = 0, data = 0x0}, info = {st_dev = 0, st_ino = 0, st_nlink = 0, st_mode = 0, st_uid = 0, st_gid = 0, __pad0 = 0, st_rdev = 0, st_size = 0, st_blksize = 0, st_blocks = 0, st_atim = {tv_sec = 0, 
        tv_nsec = 0}, st_mtim = {tv_sec = 0, tv_nsec = 0}, st_ctim = {tv_sec = 0, tv_nsec = 0}, __glibc_reserved = {0, 0, 0}}, offset = 0, sys_offset = 0, log = 0xf24900, valid_info = 0, directio = 0}, offset = 0, path = 0x0, 
  pool = 0xf91130, warn = 0x5ba390 "an upstream response is buffered to a temporary file", access = 0, log_level = 5, persistent = 0, clean = 0, thread_write = 0}

My location directive has proxy_temp_path set:

  location / {
    passenger_enabled on;
    proxy_temp_path /tmp/;
  }

Let me know if you need anything else.

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Jun 21, 2019

@kevgrig we haven't shipped an nginx 1.17. Can you describe your setup a bit more. This looks like it may be an upstream nginx bug after all.

@kevgrig

This comment has been minimized.

Copy link

commented Jun 21, 2019

@CamJN I'm on Fedora 30; Ruby 2.6.3; Rails 5.1.7; Passenger 6.0.2. I download nginx source from https://nginx.org/download/, and configure and compile with:

./configure --user=nginx --group=nginx --prefix=/opt/nginx/ \
        --with-http_ssl_module --with-pcre --with-http_gzip_static_module \
        --add-module=$(passenger-config --nginx-addon-dir) \
        --add-module=/usr/local/src/nginx-upload-module \
        --with-debug
make
sudo make install

Interestingly, the description of the issue implied this works with nginx < 1.15.8-8.6.0.2, so I tried nginx 1.15.7 and experience the same problem. I don't remember the last time I updated the passenger gem. I can confirm this used to work some months ago before I ran my general scripts to update everything.

@kevgrig

This comment has been minimized.

Copy link

commented Jun 23, 2019

I confirmed that the problem is resolved when I patched /usr/local/share/gems/gems/passenger-6.0.2/src/nginx_module/Configuration.c with the reverse of commit 7d4251b and recompiled nginx.

@kevgrig

This comment has been minimized.

Copy link

commented Jun 23, 2019

Related issue: #2075

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Jun 24, 2019

Thanks for the additional legwork. I'm really surprised that I can't trigger the issue given that reverting 7d4251b fixes it. Anyway this should be easy enough to fix then in the next Passenger release. I just hope I can avoid reintroducing #2075 but if not then this will definitely win, as that bug was just annoying whereas this one is breaking.

@kevgrig

This comment has been minimized.

Copy link

commented Jun 24, 2019

@CamJN Yeah, that's odd that you can't reproduce it. Maybe it's some sort of combination of network/CPU speed and disk speed where the client is able to read the file at a rate such that the nginx send buffers never hit their max (at which point nginx writes to the temp file)? You could try reducing the nginx send buffer sizes, and/or executing the client from somewhere with higher network latency.

@nitrotm

This comment has been minimized.

Copy link

commented Jun 28, 2019

We got the same issue with send_file and larger files (eg. 1MB). After investigation, even sending the data without using 'send_file' explicitly (for instance using render in rails) also triggers a crash but at a different location by looking at the IP.

What I found is that if I disable passenger buffering, it doesn't crash anymore. But of course that's a very temporary solution as it will impact performances.

passenger_buffer_response off;

Note: I am running Debian/stretch: nginx version is 1.10.3 and passenger is 6.0.2.

@CamJN

This comment has been minimized.

Copy link
Contributor

commented Jun 29, 2019

Yeah, I'm going to update the title to reflect the actual issue. We know what the cause is now, and we have a solution. As I said above I hope I can avoid reintroducing #2075 but if not then this will definitely be fixed and that regressed, as that bug was just annoying whereas this one is breaking.

@CamJN CamJN changed the title Upgrade from nginx 1.15.8-8.6.0.1 to 1.15.8-8.6.0.2 causes problems ("worker process $PID exited on signal 11") No default temp path in nginx module causes segfault. Jun 29, 2019
@CamJN

This comment has been minimized.

Copy link
Contributor

commented Sep 12, 2019

6.0.3 released

@CamJN CamJN closed this Sep 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.