Feature/log appender per pipeline #11108

andsel · 2019-09-02T12:30:14Z

This PR is related to issue #10427 and introduces the separation of pipeline logs into separate appenders.
It leverages the log4j2 RoutingAppender (https://logging.apache.org/log4j/2.x/manual/appenders.html#RoutingAppender). By default it is switched off, and it can be enabled with the config flag --pipeline.separate_logs=true/false. To disable the feature, a custom PropertiesConfigFactory simply removes the routing appender during startup if the flag is switched off. It expects to find a routing appender named pipeline_routing_appender.
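
The on/off behaviour described above can be sketched in plain Java as a toy model. This is not the PR's factory code: the class `RoutingAppenderToggleSketch`, the method `effectiveAppenders`, and the representation of appenders as a name-to-type map are all assumptions made for illustration; only the appender name `pipeline_routing_appender` and the flag semantics come from the description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model: appenders are a name -> type map, not real log4j2 objects (assumption).
final class RoutingAppenderToggleSketch {
    static final String ROUTING_APPENDER = "pipeline_routing_appender";

    // Returns a copy of the configured appenders, dropping the routing one
    // when the separate-logs feature is switched off.
    static Map<String, String> effectiveAppenders(Map<String, String> configured, boolean separateLogs) {
        Map<String, String> result = new LinkedHashMap<>(configured);
        if (!separateLogs) {
            result.remove(ROUTING_APPENDER);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> configured = new LinkedHashMap<>();
        configured.put("console", "Console");
        configured.put(ROUTING_APPENDER, "Routing");
        System.out.println(effectiveAppenders(configured, false).keySet());
        System.out.println(effectiveAppenders(configured, true).keySet());
    }
}
```

The input map is copied rather than mutated, so the same configured set can be evaluated for both flag values.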

andsel · 2019-09-02T13:44:01Z

Jenkins test this please

robbavey

I've done some initial testing on this, and confirmed that log entries are added to separate per-pipeline log files if enabled.

Most of my remarks are around tidying up the config, along with a question about duplication of log entries across the per-pipeline logs and the main logstash log, and whether that should be required, optional or not done.

robbavey · 2019-09-03T17:58:17Z

config/log4j2.properties

+appender.routing.routes.route_pipelines.rolling.layout.type = PatternLayout
+appender.routing.routes.route_pipelines.rolling.layout.pattern = %d %p %C{1.} [%t] %m%n
+appender.routing.routes.route_pipelines.rolling.policy.type = SizeBasedTriggeringPolicy
+appender.routing.routes.route_pipelines.rolling.policy.size = 500


Needs to be increased before committing

Right, I'll bring the same configs used for the main log

robbavey · 2019-09-03T18:00:07Z

config/log4j2.properties

+appender.routing.routes.script.type = Script
+appender.routing.routes.script.name = routing_script
+appender.routing.routes.script.language = JavaScript
+appender.routing.routes.script.value = logEvent.getContextData().containsKey("pipeline.id") ? logEvent.getContextMap().get("pipeline.id") : "sink";


Any reason why getContextData and (deprecated) getContextMap are both in use here?

getContextMap was a cut/paste leftover that I did not change to getContextData
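
The routing decision made by the script, once it reads from a single source, can be sketched in plain Java without log4j2 on the classpath. The class `RoutingKeySketch`, the method `routeFor`, and the plain `Map` standing in for the log event's context data are assumptions for illustration; only the `pipeline.id` key and the `sink` fallback come from the script above.

```java
import java.util.Map;

// Illustrative stand-in for the routing script: a plain Map plays the role of
// the log event's context data (an assumption; the real script runs inside log4j2).
final class RoutingKeySketch {
    static final String PIPELINE_ID = "pipeline.id";
    static final String SINK = "sink";

    // Returns the pipeline id when the context carries one, otherwise the "sink" route key.
    static String routeFor(Map<String, String> contextData) {
        return contextData.containsKey(PIPELINE_ID) ? contextData.get(PIPELINE_ID) : SINK;
    }

    public static void main(String[] args) {
        System.out.println(routeFor(Map.of(PIPELINE_ID, "main")));
        System.out.println(routeFor(Map.of()));
    }
}
```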

robbavey · 2019-09-03T18:00:40Z

logstash-core/src/main/java/org/logstash/log/LogstashConfigurationFactory.java

+import java.util.Properties;
+
+@Plugin(name = "LogstashConfigurationFactory", category = ConfigurationFactory.CATEGORY)
+@Order(9)


What is the significance of 9?

This is the order that Log4j2 uses to load the plugins. The standard Log4j2 PropertiesConfigurationFactory has Order = 8; a higher value means higher priority
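
As a rough illustration of "higher value wins", here is a toy model in plain Java. It is not log4j2's loading code: the class `FactoryOrderSketch`, its nested `Factory` type, and the method `byPriority` are hypothetical; only the two factory names and the order values 8 and 9 come from the thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of order-based selection; not log4j2's actual plugin loading code.
final class FactoryOrderSketch {
    static final class Factory {
        final String name;
        final int order;

        Factory(String name, int order) {
            this.name = name;
            this.order = order;
        }
    }

    // Sorts a copy of the list so that the highest order value comes first.
    static List<Factory> byPriority(List<Factory> factories) {
        List<Factory> sorted = new ArrayList<>(factories);
        sorted.sort(Comparator.comparingInt((Factory f) -> f.order).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Factory> sorted = byPriority(List.of(
            new Factory("PropertiesConfigurationFactory", 8),
            new Factory("LogstashConfigurationFactory", 9)));
        // The factory with order 9 ends up ahead of the one with order 8.
        System.out.println(sorted.get(0).name);
    }
}
```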

robbavey · 2019-09-03T18:15:04Z

config/log4j2.properties

+appender.routing.routes.route_pipelines.rolling.fileName = ${sys:ls.logs}/pipeline_${ctx:pipeline.id}.log
+appender.routing.routes.route_pipelines.rolling.filePattern = ${sys:ls.logs}/pipeline_${ctx:pipeline.id}.%i.log.gz
+appender.routing.routes.route_pipelines.rolling.layout.type = PatternLayout
+appender.routing.routes.route_pipelines.rolling.layout.pattern = %d %p %C{1.} [%t] %m%n


We should probably try and keep the formatting consistent with that used in other log entries, something like:

[%d{ISO8601}][%-5p][%-25c] %m%n

(although I'm not necessarily averse to introducing the thread to the log)

robbavey · 2019-09-03T18:20:01Z

config/log4j2.properties

+appender.routing.routes.route_pipelines.rolling.filePattern = ${sys:ls.logs}/pipeline_${ctx:pipeline.id}.%i.log.gz
+appender.routing.routes.route_pipelines.rolling.layout.type = PatternLayout
+appender.routing.routes.route_pipelines.rolling.layout.pattern = %d %p %C{1.} [%t] %m%n
+appender.routing.routes.route_pipelines.rolling.policy.type = SizeBasedTriggeringPolicy


Do we also need to set up a rollover strategy here?

robbavey · 2019-09-03T18:35:54Z

logstash-core/lib/logstash/environment.rb

@@ -44,6 +44,7 @@ module Environment
           Setting::Boolean.new("pipeline.java_execution", true),
           Setting::Boolean.new("pipeline.reloadable", true),
           Setting::Boolean.new("pipeline.plugin_classloaders", false),
+           Setting::Boolean.new("pipeline.separate_logs", false),


I noticed during testing that log entries produced in a pipeline are duplicated in both the pipeline logs and the main log. This leads me to wonder:

How do we want to handle this:
a) Duplicate all per-pipeline log entries in the per-pipeline logs and the main log
b) Separate all per-pipeline log entries into the per-pipeline logs only.
c) Allow the user to choose main, per-pipeline or both for pipeline log entries.

Thoughts @jsvd, @andsel?

I think that if a log entry goes into the pipeline logs, it shouldn't also be present in the main log

++ to b), if you set separate_logs, then pipeline-specific log entries should not land in the main log file

fixed by commit 1ad14ee

robbavey · 2019-09-03T18:37:41Z

logstash-core/src/test/java/org/logstash/log/LogstashConfigurationFactoryTest.java

+        System.setProperty("ls.logs", "build/logs");
+        System.setProperty(LogstashConfigurationFactory.PIPELINE_SEPARATE_LOGS, "true");
+
+        ThreadContext.clearAll();


Do we want to put this into a @after/tearDown to avoid the ThreadContext potentially polluting other tests?

Do you mean saving the System properties used, plus the ThreadContext, in the @Before and restoring them in the @After? Sounds good to me; in the end the unit test is then side-effect free with respect to the System properties

Yes, something like that to avoid the test having any unforeseen side effects - we might even be able to use @BeforeClass and @AfterClass
to set and unset the properties, and an @Rule to handle the ThreadContext -

@Rule public ThreadContextRule threadContextRule = new ThreadContextRule();

Yes, I'll do it. I didn't use the @Rule because, as far as I've read, rules do not exist in JUnit 5, so this keeps us open to moving to JUnit 5 without relying on deprecated stuff.

Ah! I missed that @Rule wasn't a JUnit 5 thing. No worries!

fixed by 3ab931e
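
The save-and-restore idea discussed above can be sketched as a small plain-Java helper. This is a hypothetical illustration, not the code from commit 3ab931e: the class `SystemPropertiesSnapshot` and its methods `override` and `restore` are invented names, and wiring them into the test's setup and teardown methods is left as an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: remembers System properties before a test overrides
// them and puts the previous values back afterwards.
final class SystemPropertiesSnapshot {
    private final Map<String, String> saved = new HashMap<>();

    // Records the current value (null when unset) once, then sets the new one.
    void override(String key, String value) {
        if (!saved.containsKey(key)) {
            saved.put(key, System.getProperty(key));
        }
        System.setProperty(key, value);
    }

    // Restores every overridden property, clearing those that were unset before.
    void restore() {
        for (Map.Entry<String, String> entry : saved.entrySet()) {
            if (entry.getValue() == null) {
                System.clearProperty(entry.getKey());
            } else {
                System.setProperty(entry.getKey(), entry.getValue());
            }
        }
        saved.clear();
    }

    public static void main(String[] args) {
        SystemPropertiesSnapshot snapshot = new SystemPropertiesSnapshot();
        snapshot.override("ls.logs", "build/logs");
        System.out.println(System.getProperty("ls.logs"));
        snapshot.restore();
        System.out.println(System.getProperty("ls.logs"));
    }
}
```

In a test class, `override` would be called from the setup method and `restore` from the teardown method, next to the `ThreadContext.clearAll()` call shown in the diff.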

andsel · 2019-09-05T16:07:03Z

Jenkins test this please

andsel · 2019-09-06T08:14:50Z

@robbavey this is ready to be reviewed

robbavey

@andsel Couple of concerns - firstly running this on JDK 11 the following error message is emitted:

Warning: Nashorn engine is planned to be removed from a future JDK release

It is emitted every time a pipeline log entry is written if pipeline.separate_logs is set to true, less frequently if it is set to false

Presumably this is due to the use of javascript in the log4j2.properties file.

Secondly, when pipeline.separate_logs is set to true, log entries that are redirected to separate log files no longer appear in the console logs. I think by default, we would still want to have those entries emitted via the console, and only separated in the log files. What do you think @jsvd?

robbavey · 2019-09-05T14:25:02Z

logstash-core/src/test/resources/log4j2-log-pipeline-test.properties

+appender.routing.routes.route1.type = Route
+appender.routing.routes.route1.list.type = List
+appender.routing.routes.route1.list.name = appender-${mdc:pipeline.id}
+#appender.routing.routes.route1.rolling.type = RollingFile


Please remove commented out code before commit

robbavey · 2019-09-05T14:25:20Z

qa/integration/specs/pipeline_id_log_spec.rb

@@ -38,10 +38,63 @@
    }
    IO.write(@ls.application_settings_file, settings.to_yaml)
    @ls.spawn_logstash("-w", "1" , "-e", config)
-    @ls.wait_for_logstash
-    sleep 2 until @ls.exited?
+    #@ls.wait_for_logstash


Please remove commented out code before commit

robbavey · 2019-09-05T14:25:35Z

qa/integration/specs/pipeline_id_log_spec.rb

+    }
+    IO.write(@ls.application_settings_file, settings.to_yaml)
+    @ls.spawn_logstash("-w", "1" , "-e", config)
+    #@ls.wait_for_logstash


Please remove commented out code before commit

andsel · 2019-09-09T13:17:33Z

@robbavey @jsvd making the console appender keep printing the log lines even with pipeline.separate_logs=true is a quick fix: it is just a matter of removing the filtering part from the console appender.

Regarding the Nashorn warning, we could disable it with the JVM option -Dnashorn.args="--no-deprecation-warning", but we have to decide which scripting engine to use after JDK 12: do we include Nashorn manually, switch the scripting to Groovy, or maybe integrate Painless?

robbavey · 2019-09-09T20:54:19Z

@andsel Personally, I think it makes sense to remove the filtering part from the console. And I think removing the Nashorn deprecation warning is ok for now.

It may even be possible to use JRuby as a scripting engine, as it looks like log4j supports any JSR-223 scripting language (https://github.com/jruby/jruby/wiki/Embedding-with-JSR-223), but I have neither a) tested it nor b) seen anyone else doing it...

By the way, do you know if there is any sort of performance impact on introducing scripting into the logging routing?

andsel · 2019-09-10T16:38:28Z

@robbavey I've checked with a JMH benchmark and, if the benchmark is correct, the output says that we don't lose much:

Result "org.logstash.benchmark.LogPerPipelineBenchmark.logWithoutScriptingCodeToExecute":
  737681.956 ±(99.9%) 129242.396 ops/ms [Average]
  (min, avg, max) = (554301.198, 737681.956, 824737.847), stdev = 85485.842
  CI (99.9%): [608439.561, 866924.352] (assumes normal distribution)


# Run complete. Total time: 00:00:03

Benchmark                                                  Mode  Cnt       Score        Error   Units
LogPerPipelineBenchmark.logWithScriptingCodeToExecute     thrpt   10  704964.359 ±  50375.222  ops/ms
LogPerPipelineBenchmark.logWithoutScriptingCodeToExecute  thrpt   10  737681.956 ± 129242.396  ops/ms
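
To put the two scores in perspective, the arithmetic can be done in a few lines of Java. The class `BenchmarkDeltaSketch` and its methods are hypothetical helpers; the scores and error margins are copied from the JMH output above. Under this reading the scripted run is roughly 4.4% slower, and the two error intervals overlap, so the difference may well be within measurement noise.

```java
// Arithmetic on the JMH scores quoted above; the helper names are hypothetical.
final class BenchmarkDeltaSketch {
    // Relative throughput drop of `candidate` against `baseline`, as a fraction.
    static double relativeDrop(double baseline, double candidate) {
        return (baseline - candidate) / baseline;
    }

    // True when [aScore - aErr, aScore + aErr] and [bScore - bErr, bScore + bErr] intersect.
    static boolean intervalsOverlap(double aScore, double aErr, double bScore, double bErr) {
        return aScore - aErr <= bScore + bErr && bScore - bErr <= aScore + aErr;
    }

    public static void main(String[] args) {
        double unscripted = 737681.956, unscriptedErr = 129242.396; // ops/ms, without scripting
        double scripted = 704964.359, scriptedErr = 50375.222;      // ops/ms, with scripting
        System.out.println("relative drop: " + relativeDrop(unscripted, scripted));
        System.out.println("intervals overlap: "
            + intervalsOverlap(scripted, scriptedErr, unscripted, unscriptedErr));
    }
}
```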

andsel · 2019-09-11T14:57:28Z

Jenkins test this please

robbavey

Couple of questions on the benchmarks. Everything else is good, I think

robbavey · 2019-09-11T20:58:38Z

logstash-core/benchmarks/build.gradle

@@ -45,6 +45,10 @@ dependencies {
  compile 'commons-io:commons-io:2.5'
  runtime 'joda-time:joda-time:2.8.2'
  compile "org.jruby:jruby-core:$jrubyVersion"
+//  compile 'org.apache.logging.log4j:log4j-api:2.11.1'


Why did these need to be commented out?

It was an attempt to fix console log errors in the benchmark.
I launch the benchmark with ./gradlew jmh -Pinclude="org.logstash.benchmark.LogPerPipelineBenchmark.*"

On the console I get many lines like this:

ERROR StatusLogger Unrecognized conversion specifier [d] starting at position 16 in conversion pattern.
ERROR StatusLogger Unrecognized format specifier [thread]
ERROR StatusLogger Unrecognized conversion specifier [thread] starting at position 25 in conversion pattern.
ERROR StatusLogger Unrecognized format specifier [level]
ERROR StatusLogger Unrecognized conversion specifier [level] starting at position 35 in conversion pattern.
ERROR StatusLogger Unrecognized format specifier [logger]
ERROR StatusLogger Unrecognized conversion specifier [logger] starting at position 47 in conversion pattern.
ERROR StatusLogger Unrecognized format specifier [msg]
ERROR StatusLogger Unrecognized conversion specifier [msg] starting at position 54 in conversion pattern.
ERROR StatusLogger Unrecognized format specifier [n]
ERROR StatusLogger Unrecognized conversion specifier [n] starting at position 56 in conversion pattern.
2019-09-13 10:49:14,358 org.logstash.benchmark.LogPerPipelineBenchmark.logWithScriptingCodeToExecute-jmh-worker-1 ERROR Unable to locate plugin type for Loggers
2019-09-13 10:49:14,363 org.logstash.benchmark.LogPerPipelineBenchmark.logWithScriptingCodeToExecute-jmh-worker-1 ERROR Unable to locate plugin type for Appenders

And I don't know why

robbavey · 2019-09-11T21:02:06Z

logstash-core/benchmarks/src/main/java/org/logstash/benchmark/LogPerPipelineBenchmark.java

+
+    @Benchmark
+    @OperationsPerInvocation(EVENTS_PER_INVOCATION)
+    public final void logWithoutScriptingCodeToExecute() throws Exception {


Nit: These two tests are basically the same test with only the System property changing, so redundant code can be removed.

robbavey · 2019-09-11T21:05:51Z

logstash-core/benchmarks/src/main/resources/log4j2-with-script.properties

+#appender.console.avoid_pipelined_filter.script.type = Script
+#appender.console.avoid_pipelined_filter.script.name = filter_no_pipelined
+#appender.console.avoid_pipelined_filter.script.language = JavaScript
+#appender.console.avoid_pipelined_filter.script.value = ${sys:ls.pipeline.separate_logs} == false || !(logEvent.getContextData().containsKey("pipeline.id"))


Does uncommenting this block to enable the filter have any effect on the performance?

I think there might be 3 tests - no scripting in log4j files, scripting with separate_logs = false and scripting with separate_logs = true.

robbavey

One issue with the nashorn args in jvm.options

robbavey · 2019-10-02T20:28:17Z

config/jvm.options

@@ -79,3 +79,6 @@

 # Copy the logging context from parent threads to children
 -Dlog4j2.isThreadContextMapInheritable=true
+
+# Avoid Nashorn deprecation logs in JDK > 11
+-Dnashorn.args="--no-deprecation-warning"


Testing this locally did not work with the quotes in place, but did without

Fixed. I had been testing with the flag passed directly in my IDE

robbavey

@andsel One more thing! Can we add a commented out section to logstash.yml explaining the new setting?

robbavey

LGTM. Can you please merge this into master and 7.x

elasticsearch-bot · 2019-10-08T14:07:20Z

Andrea Selva merged this into the following branches!

Branch	Commits
master	`e58a6e0`
7.x	`fc9c0e0`

…logs per pipelines - use log4j RoutingAppender - avoid output to main log files when log per pipeline is enabled - closes 10427 Fixes #11108

andsel force-pushed the feature/log_appender_per_pipeline branch from e4dd715 to 2a9ed3c Compare September 3, 2019 15:54

andsel changed the title ~~[WIP] Feature/log appender per pipeline~~ Feature/log appender per pipeline Sep 3, 2019

robbavey reviewed Sep 3, 2019

robbavey mentioned this pull request Sep 5, 2019

[Meta Issue] Logging Improvements #11074

Closed

8 tasks

robbavey requested changes Sep 6, 2019

robbavey reviewed Sep 11, 2019

jsvd mentioned this pull request Sep 12, 2019

Ability to use path.logs under each pipeline #10427

Closed

andsel force-pushed the feature/log_appender_per_pipeline branch 2 times, most recently from 574dba9 to 67ba711 Compare September 13, 2019 10:08

andsel force-pushed the feature/log_appender_per_pipeline branch from 67ba711 to 356cef2 Compare September 20, 2019 07:22

andsel requested a review from robbavey September 20, 2019 14:21

andsel force-pushed the feature/log_appender_per_pipeline branch from 0d41698 to cce5431 Compare September 30, 2019 08:55

robbavey requested changes Oct 2, 2019

andsel force-pushed the feature/log_appender_per_pipeline branch from cce5431 to c02fcb1 Compare October 3, 2019 09:35

andsel requested a review from robbavey October 7, 2019 13:08

andsel force-pushed the feature/log_appender_per_pipeline branch from c02fcb1 to cce5431 Compare October 7, 2019 15:46

robbavey reviewed Oct 7, 2019

robbavey approved these changes Oct 8, 2019

Added LS configuration variable 'pipeline.separate_logs' to separate …

3028c37

…logs per pipelines - use log4j RoutingAppender - avoid output to main log files when log per pipeline is enabled - closes 10427

andsel force-pushed the feature/log_appender_per_pipeline branch from ae8fdf9 to 3028c37 Compare October 8, 2019 14:03

elasticsearch-bot closed this in e58a6e0 Oct 8, 2019

jsvd added v7.5.0 enhancement logging improvements labels Nov 14, 2019

yaauie mentioned this pull request Jan 31, 2020

deprecated config setting document_type warning #10485

Open

andsel mentioned this pull request Nov 25, 2020

Log levels could be specified per pipeline #12473

Open

andsel mentioned this pull request May 21, 2021

Log4j don't rollover nor delete files after the introduction of log per pipeline #12921

Closed

andsel mentioned this pull request Jun 7, 2021

Explicitate the type of log format in appender's names else it breaks… #12964

Merged

7 tasks
