Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #26561 - logging stack replaced #61

Merged
merged 2 commits into from
Nov 5, 2020

Conversation

lzap
Copy link
Member

@lzap lzap commented Apr 8, 2019

This replaces the Ruby logging stack with logging gem which provides enhanced features including logging to journald via structured fields.

@lzap
Copy link
Member Author

lzap commented Apr 8, 2019

Inspiration at theforeman/smart-proxy#611

This is vaguely tested at the moment, also the code currently does not take advantage of the exception helper method at all and also no correlation id is taken into account. Help me to integrate this - the idea is that a request in Foreman gets UUID which is then passed to Foreman proxy and logged as well. We need to pass this UUID to Dynflow and log it along with all records.

@adamruzicka
Copy link
Contributor

Smart proxy proxies requests to smart proxy dynflow core. The context needs to be extracted from the incoming request and passed into the outgoing request somewhere around here https://github.com/theforeman/smart_proxy_dynflow/blob/master/lib/smart_proxy_dynflow/callback.rb#L22 .

This context the needs to be read from the incoming request in smart proxy dynflow core's API https://github.com/theforeman/smart_proxy_dynflow/blob/master/lib/smart_proxy_dynflow_core/api.rb , be used for sinatra's logging and then passed down into the action which is being triggered.

Then we will need to introduce some kind of middleware to set/unset the correct logging context when an action is being run. Something like theforeman/foreman-tasks#383 should do the trick.

@iNecas did I miss anything?

@iNecas
Copy link
Member

iNecas commented Apr 9, 2019

If we would be ok with the API calls on smart-proxy-dynflow and smart-proxy-dynflow-core not using the correlation id, but just using it in the core dynflow tasks, the mechanism might be a bit simpler, as it would mean just passing it as input for the proxy tasks and restoring the mdc via middleware.

def call(_severity, _datetime, _prog_name, message)
if message.is_a?(::Exception)
def format(message)
if message.is_a(Exception)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/is_a/is_a?

@action_logger = apply_formatters(ProgNameWrapper.new(@logger, ' action'), [ProxyStructuredFormater])
@dynflow_logger = apply_formatters(ProgNameWrapper.new(@logger, 'dynflow'), [ProxyStructuredFormater])
end
end
end
end

class PryAction < ::Dynflow::Action
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should go away before merge

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we could turn it into something like SmartProxyDynflowCore::DummyAction with docs comments on how to use it: it might be quite nice debugging setup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this then live directly in Dynflow so it can be used anywhere?

@lzap
Copy link
Member Author

lzap commented Apr 9, 2019

So I have incorporated Adam's core (world) changes and it does seem to work to some degree. I rebased a WIP version, then I do curl localhost:8008/tasks/ -X POST -d ' { "action_name": "PryAction", "action_input": {} } ' and getting somehow expected result:

2019-04-09T17:10:06 ae37a6bb [D] require_ssl_client_verification: skipping, non-HTTPS request
2019-04-09T17:10:06 ae37a6bb [D] ExecutionPlan 7dc3908b-cd6d-4c51-876b-5e026cbd5692      pending >>  planning
2019-04-09T17:10:06 ae37a6bb [D]          Step 7dc3908b-cd6d-4c51-876b-5e026cbd5692: 1   pending >>   running in phase     Plan PryAction
2019-04-09T17:10:06 ae37a6bb [D]          Step 7dc3908b-cd6d-4c51-876b-5e026cbd5692: 1   running >>   success in phase     Plan PryAction
2019-04-09T17:10:06 ae37a6bb [D] ExecutionPlan 7dc3908b-cd6d-4c51-876b-5e026cbd5692     planning >>   planned
2019-04-09T17:10:06  [D] ExecutionPlan 7dc3908b-cd6d-4c51-876b-5e026cbd5692      planned >>   running
2019-04-09T17:10:06  [D]          Step 7dc3908b-cd6d-4c51-876b-5e026cbd5692: 2   pending >>   running in phase      Run PryAction
2019-04-09T17:10:06  [D] Restoring request id
2019-04-09T17:10:06 ae37a6bb [D]          Step 7dc3908b-cd6d-4c51-876b-5e026cbd5692: 2   running >>   success in phase      Run PryAction
2019-04-09T17:10:06  [D]          Step 7dc3908b-cd6d-4c51-876b-5e026cbd5692: 3   pending >>   running in phase Finalize PryAction
127.0.0.1 - - [09/Apr/2019:17:10:05 CEST] "POST /tasks/ HTTP/1.1" 200 50
2019-04-09T17:10:06  [D] Restoring request id
2019-04-09T17:10:06 ae37a6bb [D]          Step 7dc3908b-cd6d-4c51-876b-5e026cbd5692: 3   running >>   success in phase Finalize PryAction
2019-04-09T17:10:06  [D] close: 127.0.0.1:43802
2019-04-09T17:10:06  [D] ExecutionPlan 7dc3908b-cd6d-4c51-876b-5e026cbd5692      running >>   stopped

Notice that there are some log entries before Restoring request id log is logged, after that the request ID is correctly set. This is likely a problem of Dynflow Middleware ordering, we need to make sure our middleware comes first in order to set the MDC as early as possible.

@lzap
Copy link
Member Author

lzap commented Apr 9, 2019

If I want to test this though proxy, what is the URL? The curl above is giving me 404 (spd is enabled and initialized).

@adamruzicka
Copy link
Contributor

For testing through the proxy you need to have smart_proxy_dynflow plugin enabled there and core_url correctly set to point to where smart proxy dynflow core is running. The just post to $proxy:$proxy_port/dynflow/tasks

@lzap
Copy link
Member Author

lzap commented Apr 12, 2019

Ok I think this is now ready for review and testing. This is how it works, smart proxy log:

2019-04-12T13:11:08  [D] accept: 192.168.1.13:42116
2019-04-12T13:11:08  [D] Rack::Handler::WEBrick is invoked.
2019-04-12T13:11:08 9e428de6 [I] Started GET /console 
2019-04-12T13:11:08 9e428de6 [D] Headers: {"HTTP_HOST"=>"box.home.lan:8448", "HTTP_USER_AGENT"=>"curl/7.61.1", "HTTP_ACCEPT"=>"*/*", "HTTP_VERSION"=>"HTTP/1.1"}
2019-04-12T13:11:08 9e428de6 [D] require_ssl_client_verification: skipping, non-HTTPS request
2019-04-12T13:11:08 9e428de6 [D] verifying remote client box (based on REMOTE_ADDR) against trusted_hosts ["box", "box.home.lan", "localhost", "127.0.0.1", "192.168.122.1", "foreman.local.lan", "foreman.provision.lan"]
2019-04-12T13:11:08 9e428de6 [D] Proxy request from box.home.lan:8448/dynflow/console to http://127.0.0.1:8008/console
2019-04-12T13:11:08 9e428de6 [D] Proxy request status 200 - #<Net::HTTPOK:0x0000555557e3a1a0>
2019-04-12T13:11:08 9e428de6 [I] Finished GET /console with 200 (9.07 ms)
2019-04-12T13:11:08  [D] close: 192.168.1.13:42116
^C2019-04-12T13:11:16  [D] close TCPSocket(0.0.0.0, 8443)

Those debug messages without any request id (9e428de6) are coming from webrick, we have it enabled (hardcoded). For dynflow core I followed the very same approach as in proxy: the rack common and access loggers are both disabled and there is a custom middleware which logs requests along with extra debugging information into the debug mode when enabled:

2019-04-12T13:11:08  [D] accept: 127.0.0.1:39922
2019-04-12T13:11:08  [D] Rack::Handler::WEBrick is invoked.
2019-04-12T13:11:08 9e428de6 [I] Started GET  
2019-04-12T13:11:08 9e428de6 [D] Headers: {"HTTP_ACCEPT_ENCODING"=>"gzip;q=1.0,deflate;q=0.6,identity;q=0.3", "HTTP_ACCEPT"=>"*/*, application/json,version=2", "HTTP_USER_AGENT"=>"Ruby", "HTTP_X_FORWARDED_FOR"=>"box.home.lan:8448", "HTTP_X_REQUEST_ID"=>"9e428de6-b6d1-47b0-8546-f4f364d6d4f6", "HTTP_CONNECTION"=>"close", "HTTP_HOST"=>"127.0.0.1:8008", "HTTP_VERSION"=>"HTTP/1.1"}
2019-04-12T13:11:08 9e428de6 [D] 
2019-04-12T13:11:08 9e428de6 [I] Finished GET  with 200 (4.43 ms)
2019-04-12T13:11:08  [D] close: 127.0.0.1:39922

As you can see the the request ID is passed into the core and logged along with other structured info.

However, there is still one issue described above - dynflow log lines do not contain the request id until the method restore_current_request_id of the provided middleware is called. I see no way of inserting dynflow middleware to the very first position so it's called early before anything else happens. Any idea @iNecas ? Do we need to implement it? I guess so.

@lzap
Copy link
Member Author

lzap commented Apr 18, 2019

So I tested this with STDOUT, JOURNAL and SYSLOG. All works fine, fixed all comments. Here is an example how an entry looks like, not many custom fields except REQUEST id and REMOTE_IP and SYSLOG_IDENTIFIER and PRIORITY (= logging level) for now, but it's a good start.

# journalctl -f -o verbose
    _SOURCE_REALTIME_TIMESTAMP=1555573546705597
Thu 2019-04-18 09:45:46.707400 CEST [s=927adf6bc1964d178dc5ac5cf1f5a58b;i=25c15c;b=44de32f5987c4f1b86e415cad1cfbd5d;m=1168f946b;t=586c930ac69d3;x=14446913b24dc7d2]
    PRIORITY=6
    _TRANSPORT=journal
    _UID=1000
    _GID=1000
    _CAP_EFFECTIVE=0
    _AUDIT_LOGINUID=1000
    _SYSTEMD_OWNER_UID=1000
    _SYSTEMD_SLICE=user-1000.slice
    _SYSTEMD_USER_SLICE=-.slice
    _MACHINE_ID=2a2fb6fa375d459da95c6ada4a67d03a
    _HOSTNAME=box.home.lan
    _AUDIT_SESSION=1
    _SYSTEMD_CGROUP=/user.slice/user-1000.slice/session-1.scope
    _SYSTEMD_SESSION=1
    _SYSTEMD_UNIT=session-1.scope
    _COMM=bundle
    _EXE=/home/lzap/.rbenv/versions/2.5.1/bin/ruby
    _BOOT_ID=44de32f5987c4f1b86e415cad1cfbd5d
    _SYSTEMD_INVOCATION_ID=5d6c9179b23f48489112e16431fb9b6f
    SYSLOG_IDENTIFIER=dynflow-core
    SYSLOG_FACILITY=168
    _PID=31062
    _CMDLINE=bin/smart_proxy_dynflow_core
    REMOTE_IP=127.0.0.1
    REQUEST=f0992764-2d4d-43dd-8e4c-061e82b263d0
    MESSAGE=f0992764 [I] Finished POST /tasks/ with 200 (16.04 ms)

I don't have rsyslog on my Fedora, however I tested this with journald which listens for syslog events as well. When testing this, note that there are no custom fields as syslog protocol is only PRIORITY + string message. That's the limitation of it.

@lzap lzap force-pushed the logging-26561 branch 2 times, most recently from 1dc4097 to cb4fa7f Compare April 24, 2019 06:26
@lzap
Copy link
Member Author

lzap commented Apr 24, 2019

Fixed rubocop issues

@iNecas
Copy link
Member

iNecas commented Apr 24, 2019

This is needed to fix the tests lzap#2

@lzap
Copy link
Member Author

lzap commented Apr 24, 2019

This is a terrible test honestly:


  1) Failure:
SmartProxyDynflowCore::HelpersTest#test_0003_skips client cert authentication if token succeeds [/home/jenkins/workspace/smart-proxy-dynflow-pull-request/ruby/2.3/lib/smart_proxy_dynflow_core/logger_middleware.rb:13]:
unexpected invocation: #<Logging::Logger:0x424d3e0>.debug()
unsatisfied expectations:
- expected exactly once, not yet invoked: #<Logging::Logger:0x424d3e0>.debug("authorized with token")

;-)

@iNecas
Copy link
Member

iNecas commented Apr 24, 2019

@lzap are you going to merge lzap#2, fix it otherwise, or just keep ranting about terrible tests? :)

@lzap
Copy link
Member Author

lzap commented Apr 25, 2019

@lzap are you going to merge lzap#2, fix it otherwise, or just keep ranting about terrible tests? :)

Your patch does not improve the code, why I think it's important is that users can disable debug level for tests (I do it quite often and only enable it when debugging something weird). The change can break behavior of tests. And I was not just complaining, I was going to fix it. But thanks, let's at least move forward with this. I am going to incorporate your PR into my one. :-)

@ares
Copy link
Member

ares commented Apr 29, 2019

What is the status here?

@lzap
Copy link
Member Author

lzap commented Jul 1, 2020

Depends on what we want. Currently, auto log rotating does not work in multithreaded environment correctly. By default, we use logrotate for production, so it's only issue for development environments. Several options: we can wait if TwP fixes the problem. Or we can remove file-logging feature completely from dynflow (and smart proxy and foreman?).

I poked the maintainer once again to see if there are any plans short-term, let's see.

@lzap
Copy link
Member Author

lzap commented Sep 9, 2020

Upstream released, packaging PR: theforeman/foreman-packaging#5783

@lzap
Copy link
Member Author

lzap commented Sep 9, 2020

While testing I am running into this when I do Ctrl-C:

bundler: failed to load command: bin/smart_proxy_dynflow_core (bin/smart_proxy_dynflow_core)
NoMethodError: undefined method `shutdown' for nil:NilClass
  /home/lzap/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/rack-2.0.5/lib/rack/handler/webrick.rb:48:in `shutdown'
  /home/lzap/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/rack-2.0.5/lib/rack/server.rb:291:in `block in start'
  /home/lzap/.rbenv/versions/2.7.1/lib/ruby/2.7.0/webrick/server.rb:371:in `close'
  /home/lzap/.rbenv/versions/2.7.1/lib/ruby/2.7.0/webrick/server.rb:371:in `block in cleanup_listener'
  /home/lzap/.rbenv/versions/2.7.1/lib/ruby/2.7.0/webrick/server.rb:357:in `each'
  /home/lzap/.rbenv/versions/2.7.1/lib/ruby/2.7.0/webrick/server.rb:357:in `cleanup_listener'
  /home/lzap/.rbenv/versions/2.7.1/lib/ruby/2.7.0/webrick/server.rb:205:in `block in start'
  /home/lzap/.rbenv/versions/2.7.1/lib/ruby/2.7.0/webrick/server.rb:32:in `start'
  /home/lzap/.rbenv/versions/2.7.1/lib/ruby/2.7.0/webrick/server.rb:157:in `start'
  /home/lzap/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/rack-2.0.5/lib/rack/handler/webrick.rb:34:in `run'
  /home/lzap/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/rack-2.0.5/lib/rack/server.rb:297:in `start'
  /home/lzap/work/smart_proxy_dynflow/lib/smart_proxy_dynflow_core/launcher.rb:22:in `start'
  /home/lzap/work/smart_proxy_dynflow/lib/smart_proxy_dynflow_core/launcher.rb:15:in `launch!'
  bin/smart_proxy_dynflow_core:32:in `<top (required)>'

Is this a known issue?

@lzap
Copy link
Member Author

lzap commented Sep 9, 2020

Anyway, I just repoduced this with 2.2.2 gem, then I updated to the new 2.3.0 release and pkill no longer triggers the problem. This is now ready for re-review. (No changes there except merging ton of conflicts)

Copy link
Contributor

@adamruzicka adamruzicka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When running as external core I'm getting a blank line out of nowhere in the logs

2020-09-10T13:42:06 795a45de [I] Started GET /tasks/count 
2020-09-10T13:42:06 795a45de [D] Headers: {"HTTP_ACCEPT_ENCODING"=>"gzip;q=1.0,deflate;q=0.6,identity;q=0.3", "HTTP_ACCEPT"=>"*/*, application/json,version=2", "HTTP_USER_AGENT"=>"Ruby", "HTTP_X_FORWARDED_FOR"=>"localhost:8000", "HTTP_X_REQUEST_ID"=>"795a45de-82fd-4813-b103-d50821352e33", "HTTP_CONNECTION"=>"close"
, "HTTP_HOST"=>"127.0.0.1:8008", "HTTP_VERSION"=>"HTTP/1.1"}
2020-09-10T13:42:06 795a45de [D] 
2020-09-10T13:42:06 795a45de [D] require_ssl_client_verification: skipping, non-HTTPS request
2020-09-10T13:42:06 795a45de [I] Finished GET /tasks/count with 200 (1.05 ms)
2020-09-10T13:42:06  [D] close: 127.0.0.1:33212

If I run with dynflow core inside smart proxy (and external_core set to no, the debian way), I'm getting doubled outputs. Also note two request ids even though there was only one request

2020-09-10T13:43:24 5e3f7c2b [I] Started GET /tasks/count 
2020-09-10T13:43:24 5e3f7c2b [I] Started GET /tasks/count 
2020-09-10T13:43:24 5e3f7c2b [D] Headers: {"HTTP_HOST"=>"localhost:8000", "HTTP_USER_AGENT"=>"curl/7.66.0", "HTTP_ACCEPT"=>"*/*", "HTTP_VERSION"=>"HTTP/1.1"}
2020-09-10T13:43:24 5e3f7c2b [D] Headers: {"HTTP_HOST"=>"localhost:8000", "HTTP_USER_AGENT"=>"curl/7.66.0", "HTTP_ACCEPT"=>"*/*", "HTTP_VERSION"=>"HTTP/1.1"}
2020-09-10T13:43:24 5e3f7c2b [D] 
2020-09-10T13:43:24 5e3f7c2b [D] 
2020-09-10T13:43:24 4d345aae [I] Started GET /dynflow/tasks/count 
2020-09-10T13:43:24 4d345aae [D] require_ssl_client_verification: skipping, non-HTTPS request
2020-09-10T13:43:24 4d345aae [D] require_ssl_client_verification: skipping, non-HTTPS request
2020-09-10T13:43:24 4d345aae [I] Finished GET /dynflow/tasks/count with 200 (3.64 ms)
2020-09-10T13:43:24 4d345aae [I] Finished GET /tasks/count with 200 (4.25 ms)
2020-09-10T13:43:24 4d345aae [I] Finished GET /tasks/count with 200 (4.25 ms)

config/settings.yml.example Show resolved Hide resolved
@lzap
Copy link
Member Author

lzap commented Sep 22, 2020

Double lines problem solved, the middleware must not be included when running in standalone mode. Packaging PR still not merged but this should be ready: theforeman/foreman-packaging#5666

@lzap
Copy link
Member Author

lzap commented Sep 22, 2020

Dependency merged into nightly for the record.

Copy link
Contributor

@adamruzicka adamruzicka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general it works well, left some nitpicks in the code

Comment on lines 10 to 21
module SmartProxyDynflowCore
class TestAction < ::Dynflow::Action
def run
state = 0
while state < input[:debug_logs]
action_logger.debug("Test action counter at #{state}")
state += 1
end
raise "test exception" if input[:exception]
end
end
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you insist on getting this in?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh I am not aware writing this or adding this hunk. Perhaps bad conflict resolution? I can delete this for sure if you think it's not valid.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh now I remember :-)

Gemfile Outdated
@@ -32,6 +32,7 @@ else
gem 'rack', '>= 1.1'
gem 'sinatra'
end
gem 'logging-journald', '~> 2.0', :platforms => [:ruby]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be added to gemspec?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could not find a way how to add a gemspec dependency only for a particular platform. This one won't work on JRuby or Ruby on Windows. Anyway, I am going to follow what smart-proxy does: https://github.com/theforeman/smart-proxy/blob/develop/smart_proxy.gemspec

Adding runtime dependency "logging" and keeping this line here as well.

@logger.debug { 'Headers: ' + env.select { |k, v| k.start_with? 'HTTP_' }.inspect }
if @logger.debug? && env['rack.input']
body = env['rack.input'].read
@logger.debug body.empty? ? '' : 'Body: ' + body
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there's no body it logs an empty string. Shouldn't we skip it altogether if there's no body?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, good point.

@lzap
Copy link
Member Author

lzap commented Sep 24, 2020

Rebased.

@lzap
Copy link
Member Author

lzap commented Oct 14, 2020

We should be all good @adamruzicka ? Or did I miss some comment(s)?

@adamruzicka
Copy link
Contributor

This should be fine, I was running with this for the past week or so. I just need to read it through once more and it will go in

Copy link
Contributor

@adamruzicka adamruzicka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last thing

@@ -3,6 +3,8 @@
require 'smart_proxy_dynflow_core/settings'
require 'sd_notify'

require 'smart_proxy_dynflow_core/test_action'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file no longer exists

Copy link
Contributor

@adamruzicka adamruzicka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just drop the last offending require and then we'll get it in. And apparently it also needs a rebase

@lzap
Copy link
Member Author

lzap commented Oct 23, 2020

Rebased, I had a bunch of conflicts in gemspec/gemfiles so please re-review carefully.

@adamruzicka
Copy link
Contributor

👮 is not happy

@lzap
Copy link
Member Author

lzap commented Nov 5, 2020

Well.

lib/smart_proxy_dynflow_core/log.rb:78:16: C: Style/NumericPredicate: Use size.positive? instead of size > 0.
            if size > 0

Ruby/Rubocop community is just swimming against current. It's size > 0 in every major language out there... nonsense. Fixed, my hate against Rubocop just increased a bit once again.

Copy link
Contributor

@adamruzicka adamruzicka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK

@adamruzicka adamruzicka merged commit d6908a1 into theforeman:master Nov 5, 2020
@adamruzicka
Copy link
Contributor

Thank you @lzap !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants