
4.0.1->4.1.0, Rails logs flushing incorrectly #1906

Closed
pauloddr opened this issue Aug 12, 2019 · 15 comments · Fixed by #1946

Comments

@pauloddr

commented Aug 12, 2019

Steps to reproduce

I can't seem to reproduce the issue locally, so I'll try my best to explain what happened.

  1. Deployed an existing Rails app on Heroku after upgrading Puma from 4.0.1 to 4.1.0.

  2. Made some requests. App works fine. Checked the logs.

  3. Request logs were absent. (Heroku Router logs were present.)

  4. Rebooted the application and all previous request logs were dumped at once with the wrong times (app shutdown time).

Expected behavior

Request logs should appear right after they happen.

Actual behavior

Request logs are dumped after the application exits instead, preventing real-time logging and breaking log times.

Reverting to Puma 4.0.1 restores intended behavior.

System configuration

Ruby version: 2.6.1
Rails version: 6.0.0.rc2
Puma version: 4.1.0

Log Dump

Aug 12 01:38:13 app app/web.1:  => Booting Puma 
Aug 12 01:38:13 app app/web.1:  => Rails 6.0.0.rc2 application starting in production  
Aug 12 01:38:13 app app/web.1:  => Run `rails server --help` for more startup options 
Aug 12 01:38:15 app app/web.1:  ** [NewRelic][2019-08-12 08:38:12 +0000 web.1 (4)] INFO : Starting the New Relic agent version 6.5.0.357 in "production" environment. 

... more NewRelic logs ...

Aug 12 01:38:15 app app/web.1:  Puma starting in single mode... 
Aug 12 01:38:15 app app/web.1:  * Version 4.1.0 (ruby 2.6.1-p33), codename: Fourth and One 
Aug 12 01:38:15 app app/web.1:  * Min threads: 2, max threads: 2 
Aug 12 01:38:15 app app/web.1:  * Environment: production 
Aug 12 01:38:15 app app/web.1:  * Listening on tcp://0.0.0.0:56371 
Aug 12 01:38:15 app app/web.1:  Use Ctrl-C to stop 
Aug 12 01:38:15 app heroku/web.1:  State changed from starting to up 

<missing request logs>

Aug 12 01:38:43 app heroku/router:  at=info method=GET path="/songs" host=...

<missing request logs>

Aug 12 01:38:43 app heroku/router:  at=info method=GET path="/stats/current" host=...

<missing request logs>

Aug 12 01:38:43 app heroku/router:  at=info method=GET path="/session" host=...

<missing request logs>

Aug 12 01:38:43 app heroku/router:  at=info method=GET path="/videos" host=...

<missing request logs>

Aug 12 01:38:43 app heroku/router:  at=info method=GET path="/artists" host=...

<missing request logs>

Aug 12 01:38:47 app heroku/router:  at=info method=GET path="/session" host=...

<missing request logs>

Aug 12 01:39:54 app heroku/router: at=info method=GET path="/session" host=...

<missing request logs>

Aug 12 01:40:30 app heroku/router: at=info method=HEAD path="/" host=...
Aug 12 01:42:09 app heroku/web.1: Restarting 
Aug 12 01:42:09 app heroku/web.1: State changed from up to starting 
Aug 12 01:42:09 app heroku/web.1: Stopping all processes with SIGTERM 
Aug 12 01:42:10 app app/web.1: - Gracefully stopping, waiting for requests to finish 
Aug 12 01:42:10 app app/web.1: === puma shutdown: 2019-08-12 08:42:09 +0000 === 
Aug 12 01:42:10 app app/web.1: - Goodbye! 
Aug 12 01:42:10 app app/web.1: ** [NewRelic][2019-08-12 08:38:16 +0000 web.1 (4)] INFO : Reporting to: https://rpm.newrelic.com/accounts/???/applications/???
Aug 12 01:42:10 app app/web.1: I, [2019-08-12T08:38:42.779002 #4]  INFO -- : [5379ed03-308c-4863-be83-7a6ee678ba4a] Started GET "/stats/current" for 172.69.3.170 at 2019-08-12 08:38:42 +0000 
Aug 12 01:42:10 app app/web.1: I, [2019-08-12T08:38:42.780675 #4]  INFO -- : [5379ed03-308c-4863-be83-7a6ee678ba4a] Processing by StatsController#current as HTML 
Aug 12 01:42:10 app app/web.1: I, [2019-08-12T08:38:42.791294 #4]  INFO -- : [d0892f85-4464-4689-b8b5-d9a55e0c3d8f] Started GET "/session" for 172.69.3.170 at 2019-08-12 08:38:42 +0000 
Aug 12 01:42:10 app app/web.1: I, [2019-08-12T08:38:42.791978 #4]  INFO -- : [d0892f85-4464-4689-b8b5-d9a55e0c3d8f] Processing by SessionsController#index as HTML 

... rest of previous request logs dumped with the wrong times ...

@nateberkopec nateberkopec added the bug label Aug 12, 2019

@nateberkopec

Member

commented Aug 12, 2019

Definitely related to #1837.

Please post your Puma config file, if any.

@nateberkopec

Member

commented Aug 12, 2019

FWIW everything does seem to work fine with a "normal" Puma config on Heroku, so we need to figure out what you're doing differently.

@pauloddr

Author

commented Aug 12, 2019

My Puma config file is quite standard. I think it's the one that comes by default with Rails 6.

# frozen_string_literal: true

# Puma can serve each request in a thread from an internal thread pool.
# The `threads` method setting takes two numbers: a minimum and maximum.
# Any libraries that use thread pools should be configured to match
# the maximum value specified for Puma. Default is set to 5 threads for minimum
# and maximum; this matches the default thread size of Active Record.
#
max_threads_count = ENV.fetch('RAILS_MAX_THREADS') { 5 }
min_threads_count = ENV.fetch('RAILS_MIN_THREADS') { max_threads_count }
threads min_threads_count, max_threads_count

# Specifies the `port` that Puma will listen on to receive requests; default is 3000.
#
port        ENV.fetch('PORT') { 3000 }

# Specifies the `environment` that Puma will run in.
#
environment ENV.fetch('RAILS_ENV') { 'development' }

# Specifies the number of `workers` to boot in clustered mode.
# Workers are forked web server processes. If using threads and workers together
# the concurrency of the application would be max `threads` * `workers`.
# Workers do not work on JRuby or Windows (both of which do not support
# processes).
#
# workers ENV.fetch("WEB_CONCURRENCY") { 2 }

# Use the `preload_app!` method when specifying a `workers` number.
# This directive tells Puma to first boot the application and load code
# before forking the application. This takes advantage of Copy On Write
# process behavior so workers use less memory.
#
# preload_app!

# Allow puma to be restarted by `rails restart` command.
plugin :tmp_restart

@pauloddr

Author

commented Aug 12, 2019

Some other info that hopefully is useful:

Env variable RAILS_LOG_TO_STDOUT has the value "enabled" on Heroku. I believe this was set up automatically when I created the project there.

My config/environments/production.rb has the following block, which I also believe is the default in Rails 6:

  if ENV['RAILS_LOG_TO_STDOUT'].present?
    logger           = ActiveSupport::Logger.new(STDOUT)
    logger.formatter = config.log_formatter
    config.logger    = ActiveSupport::TaggedLogging.new(logger)
  end

The log in the first post shows something interesting. NewRelic starts sending out logs before Puma shows its startup message. But the last line of the NewRelic log spam is omitted there, and only appears after the app is closed.

@ThibG


commented Aug 13, 2019

I can see the same behavior with Mastodon, which has pretty much the same code in its Rails configuration:

  ActiveSupport::Logger.new(STDOUT).tap do |logger|
    logger.formatter = config.log_formatter
    config.logger = ActiveSupport::TaggedLogging.new(logger)
  end

Reverting #1837 fixes it.

@nateberkopec nateberkopec changed the title 4.0.1->4.1.0, Rails logs being deferred on Heroku 4.0.1->4.1.0, Rails logs flushing incorrectly Aug 14, 2019

@nateberkopec nateberkopec added this to the 4.1.1 milestone Aug 14, 2019

@mensfeld


commented Aug 18, 2019

This error also seems to cause capistrano3-puma to hang waiting on IO when deploying.

@mattbrictson


commented Aug 19, 2019

This error also seems to cause capistrano3-puma to hang waiting on IO when deploying.

I'm seeing this as well. I've traced it to this commit: 70b28bb

My hunch is that, due to that commit, output is no longer being flushed before Process.daemon calls fork, and so the IO is not cleanly disconnected from the console (Capistrano's SSH connection in this case).

I'm planning on opening an issue and PR if I am able to verify this.
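
The buffered-IO-across-fork behavior behind this hunch can be sketched in plain Ruby (a standalone illustration, not Puma's code; uses fork, so POSIX only):

```ruby
# Bytes sitting in an IO's userspace buffer when the process forks are
# inherited by the child, so unflushed output can be emitted twice (or late).
r, w = IO.pipe
w.sync = false
w.write "pending"        # stays in the parent's userspace buffer
pid = fork do
  w.flush                # child inherits and flushes the same buffered bytes
  exit!                  # exit! skips at_exit handlers; the flush already ran
end
Process.wait(pid)
w.flush                  # parent flushes its own copy of the buffer
w.close
result = r.read
puts result              # prints "pendingpending": the same bytes came out twice
```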

@mensfeld


commented Aug 19, 2019

@mattbrictson yes. That is exactly what is happening ;)

@mattbrictson


commented Aug 19, 2019

I dug into this a little more.

Puma reopens STDOUT and STDERR to /dev/null in order to detach the IO of the forked process from the console (part of daemonizing). However, it is now actually reopening only a copy of those streams, because they were previously dup'ed inside Events#initialize.

Here is where the streams are being reopened:

STDOUT.reopen null_out
STDERR.reopen null_out

But that doesn't affect the dup'ed ones created here:

puma/lib/puma/events.rb

Lines 32 to 33 in 3b34c3f

@stdout = stdout.dup
@stderr = stderr.dup

So the upshot is that Events is still writing to the original streams and Process.daemon is unable to fully detach the IO.
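
The dup/reopen interaction can be reproduced in a few lines of plain Ruby (a standalone sketch, not Puma's actual code):

```ruby
# Reopening an IO redirects that object's descriptor, but a copy made
# earlier with #dup keeps its own descriptor pointing at the original target.
r, w = IO.pipe
copy = w.dup                            # analogous to Events#initialize dup'ing stdout
w.reopen(File.open(File::NULL, "w"))    # analogous to daemonization redirecting STDOUT
copy.write("still visible\n")           # the earlier dup still points at the pipe
copy.flush
line = r.gets
puts line                               # prints "still visible" -- reopen did not detach the dup
```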

I can't find an easy fix besides just reverting #1837.

@montanalow any ideas?

@dentarg

Contributor

commented Aug 19, 2019

@mattbrictson oh, I think that monkey patch of Process.daemon can be removed? Nice find.

It doesn't look like it has been meaningfully touched since 2013: https://github.com/puma/puma/commits/v4.1.0/lib/puma/daemon_ext.rb

And according to a comment on the commit, 775534c, the bug in Ruby is solved.

I tested the repro from the bug with the lowest Ruby version Puma tests with:

$ docker run -it --rm -v $(pwd):/app ruby:2.2.10 bash
root@4ab5c614136e:/# ls /app
repro.rb
root@4ab5c614136e:/# cat /app/repro.rb
r1, w1 = IO.pipe
r2, w2 = IO.pipe

t = Thread.new {
puts "start"
w1.write "x"
IO.select([r2])
puts "alive"
}

IO.select([r1])
Process.daemon true, true # comment this line out and everything works
puts Process.pid
w2.write "x"
t.join
puts "done"
root@4ab5c614136e:/# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.2  0.1  20244  3216 pts/0    Ss   20:45   0:00 bash
root         8  0.0  0.1  17500  2104 pts/0    R+   20:45   0:00 ps aux
root@4ab5c614136e:/# ruby /app/repro.rb
start
root@4ab5c614136e:/# 15
done

root@4ab5c614136e:/# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  20244  3216 pts/0    Ss   20:45   0:00 bash
root        18  0.0  0.1  17500  2064 pts/0    R+   20:46   0:00 ps aux
@mattbrictson


commented Aug 19, 2019

@dentarg I tried removing the Process.daemon monkey patch and that didn't seem to change any behavior. The stdout/stderr problem is still an issue.

@montanalow

Contributor

commented Aug 20, 2019

This makes sense as an issue. Puma <4.1.0 previously forced STDOUT to be sync. When RAILS_LOG_TO_STDOUT is present on Heroku, this preserves timestamps in logs that are written to STDOUT.

Rails dropped sync = true as the default in 5.0, which caused similar issues for non-Puma configurations on Heroku.

heroku/rails_stdout_logging#31
heroku/rails_stdout_logging#32

The fix may be to require this gem for Heroku configurations:
https://github.com/heroku/rails_stdout_logging/blob/master/lib/rails_stdout_logging/rails.rb#L18
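
The buffering effect being described can be seen with any non-sync IO in plain Ruby (a standalone sketch, unrelated to Rails or Puma internals):

```ruby
# With sync disabled, writes sit in the IO's userspace buffer and only
# reach the reader on flush -- which is what delays (and re-times) the logs.
r, w = IO.pipe
w.sync = false                          # like STDOUT once Puma stopped forcing sync
w.write "log line\n"                    # buffered; the reader sees nothing yet
buffered = IO.select([r], nil, nil, 0.2).nil?
w.flush                                 # with sync = true this would happen per write
line = r.gets
puts buffered                           # true -- the line was invisible until the flush
puts line                               # "log line"
```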

@Exupery


commented Sep 1, 2019

@pauloddr I had a similar issue with Puma no longer flushing output correctly after updating recently, but was able to work around it by adding STDOUT.sync = true prior to the logger creation (i.e. logger = ActiveSupport::Logger.new(STDOUT))
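
Applied to the stock Rails config quoted earlier in this thread, the workaround would look something like this (a sketch of where the line goes in config/environments/production.rb, not an official Rails recommendation):

```ruby
if ENV['RAILS_LOG_TO_STDOUT'].present?
  STDOUT.sync = true   # workaround: flush every write immediately
  logger           = ActiveSupport::Logger.new(STDOUT)
  logger.formatter = config.log_formatter
  config.logger    = ActiveSupport::TaggedLogging.new(logger)
end
```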

@nateberkopec

Member

commented Sep 2, 2019

I'm going to revert #1837. We need to stop messing with STDOUT.sync eventually but we'll revert until someone takes another crack at it.

@nateberkopec

Member

commented Sep 5, 2019

4.1.1 will be out within the hour and fix this.
