
[BUG][CSV Output] Logstash CSV output not flushing to disk #14869

Closed
lduvnjak opened this issue Feb 3, 2023 · 5 comments
lduvnjak commented Feb 3, 2023

Logstash information:

Please include the following information:

  • Logstash version (e.g. bin/logstash --version)
    logstash 8.5.0

  • Logstash installation source (e.g. built from source, with a package manager: DEB/RPM, expanded from tar or zip archive, docker)
    yum install logstash

  • How is Logstash being run (e.g. as a service/service manager: systemd, upstart, etc. Via command line, docker/kubernetes)
    CLI : nohup /usr/share/logstash/bin/logstash -f /root/logstash/exporter/exporter.conf > f.out 2> f.err < /dev/null &

Plugins installed:

logstash-codec-avro (3.4.0)
logstash-codec-cef (6.2.5)
logstash-codec-collectd (3.1.0)
logstash-codec-dots (3.0.6)
logstash-codec-edn (3.1.0)
logstash-codec-edn_lines (3.1.0)
logstash-codec-es_bulk (3.1.0)
logstash-codec-fluent (3.4.1)
logstash-codec-graphite (3.0.6)
logstash-codec-json (3.1.0)
logstash-codec-json_lines (3.1.0)
logstash-codec-line (3.1.1)
logstash-codec-msgpack (3.1.0)
logstash-codec-multiline (3.1.1)
logstash-codec-netflow (4.2.2)
logstash-codec-plain (3.1.0)
logstash-codec-rubydebug (3.1.0)
logstash-filter-aggregate (2.10.0)
logstash-filter-anonymize (3.0.6)
logstash-filter-cidr (3.1.3)
logstash-filter-clone (4.2.0)
logstash-filter-csv (3.1.1)
logstash-filter-date (3.1.15)
logstash-filter-de_dot (1.0.4)
logstash-filter-dissect (1.2.5)
logstash-filter-dns (3.1.5)
logstash-filter-drop (3.0.5)
logstash-filter-elasticsearch (3.12.0)
logstash-filter-fingerprint (3.4.1)
logstash-filter-geoip (7.2.12)
logstash-filter-grok (4.4.2)
logstash-filter-http (1.4.1)
logstash-filter-json (3.2.0)
logstash-filter-kv (4.7.0)
logstash-filter-memcached (1.1.0)
logstash-filter-metrics (4.0.7)
logstash-filter-mutate (3.5.6)
logstash-filter-prune (3.0.4)
logstash-filter-ruby (3.1.8)
logstash-filter-sleep (3.0.7)
logstash-filter-split (3.1.8)
logstash-filter-syslog_pri (3.1.1)
logstash-filter-throttle (4.0.4)
logstash-filter-translate (3.4.0)
logstash-filter-truncate (1.0.5)
logstash-filter-urldecode (3.0.6)
logstash-filter-useragent (3.3.3)
logstash-filter-uuid (3.0.5)
logstash-filter-xml (4.2.0)
logstash-input-azure_event_hubs (1.4.4)
logstash-input-beats (6.4.1)
└── logstash-input-elastic_agent (alias)
logstash-input-couchdb_changes (3.1.6)
logstash-input-dead_letter_queue (2.0.0)
logstash-input-elasticsearch (4.16.0)
logstash-input-exec (3.6.0)
logstash-input-file (4.4.4)
logstash-input-ganglia (3.1.4)
logstash-input-gelf (3.3.2)
logstash-input-generator (3.1.0)
logstash-input-graphite (3.0.6)
logstash-input-heartbeat (3.1.1)
logstash-input-http (3.6.0)
logstash-input-http_poller (5.4.0)
logstash-input-imap (3.2.0)
logstash-input-jms (3.2.2)
logstash-input-pipe (3.1.0)
logstash-input-redis (3.7.0)
logstash-input-snmp (1.3.1)
logstash-input-snmptrap (3.1.0)
logstash-input-stdin (3.4.0)
logstash-input-syslog (3.6.0)
logstash-input-tcp (6.3.0)
logstash-input-twitter (4.1.0)
logstash-input-udp (3.5.0)
logstash-input-unix (3.1.1)
logstash-integration-aws (7.0.0)
 ├── logstash-codec-cloudfront
 ├── logstash-codec-cloudtrail
 ├── logstash-input-cloudwatch
 ├── logstash-input-s3
 ├── logstash-input-sqs
 ├── logstash-output-cloudwatch
 ├── logstash-output-s3
 ├── logstash-output-sns
 └── logstash-output-sqs
logstash-integration-elastic_enterprise_search (2.2.1)
 ├── logstash-output-elastic_app_search
 └──  logstash-output-elastic_workplace_search
logstash-integration-jdbc (5.3.0)
 ├── logstash-input-jdbc
 ├── logstash-filter-jdbc_streaming
 └── logstash-filter-jdbc_static
logstash-integration-kafka (10.12.0)
 ├── logstash-input-kafka
 └── logstash-output-kafka
logstash-integration-rabbitmq (7.3.0)
 ├── logstash-input-rabbitmq
 └── logstash-output-rabbitmq
logstash-output-csv (3.0.8)
logstash-output-elasticsearch (11.9.0)
logstash-output-email (4.1.1)
logstash-output-file (4.3.0)
logstash-output-graphite (3.1.6)
logstash-output-http (5.5.0)
logstash-output-lumberjack (3.1.9)
logstash-output-nagios (3.0.6)
logstash-output-null (3.0.5)
logstash-output-pipe (3.0.6)
logstash-output-redis (5.0.0)
logstash-output-stdout (3.1.4)
logstash-output-tcp (6.1.1)
logstash-output-udp (3.2.0)
logstash-output-webhdfs (3.0.6)
logstash-patterns-core (4.3.4)

OS Version:
Linux removed 4.18.0-425.3.1.el8.x86_64 #1 SMP Tue Nov 8 14:08:25 EST 2022 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:

  • Expected
    Logstash should flush to disk according to the value of flush_interval
  • Actual
    Logstash never flushes to disk, and eventually runs out of memory

Steps to reproduce:

Please include a minimal but complete recreation of the problem,
including (e.g.) pipeline definition(s), settings, locale, etc. The easier
you make it for us to reproduce the issue, the more likely somebody will take
the time to look at it.

  1. Logstash Elasticsearch input for any index
  2. Logstash CSV output with any fields
  3. Logstash crash due to java.lang.OutOfMemoryError: Java heap space
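For reference, a minimal hypothetical pipeline matching this shape would look like the following (hosts, index, fields, and output path are placeholders, not the actual exporter.conf):

```conf
input {
  elasticsearch {
    hosts  => ["http://localhost:9200"]            # placeholder
    index  => "some-index-*"                       # placeholder; any index triggers it
    query  => '{ "query": { "match_all": {} } }'
    slices => 4                                    # matches the four slice threads in the logs
  }
}

output {
  csv {
    path           => "/root/logstash/exporter/export.csv"  # placeholder filename
    fields         => ["field_a", "field_b"]                # placeholder; any fields
    flush_interval => 2                                     # seconds; the flush that never happens
  }
}
```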

Current Result:
Logstash crashes, and all the messages processed are gone.

Expected result:
Messages are processed and flushed to disk periodically

Provide logs (if relevant):

  • f.out (before 8.5)
Using bundled JDK: /usr/share/logstash/jdk
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[WARN ] 2023-02-03 09:49:40.180 [main] runner - NOTICE: Running Logstash as superuser is not recommended and won't be allowed in the future. Set 'allow_superuser' to 'false' to avoid startup errors in future releases.
[INFO ] 2023-02-03 09:49:40.189 [main] runner - Starting Logstash {"logstash.version"=>"8.4.0", "jruby.version"=>"jruby 9.3.6.0 (2.6.8) 2022-06-27 7a2cbcd376 OpenJDK 64-Bit Server VM 17.0.4+8 on 17.0.4+8 +indy +jit [x86_64-linux]"}
[INFO ] 2023-02-03 09:49:40.191 [main] runner - JVM bootstrap flags: [-Xms4g, -Xmx4g, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djdk.io.File.enableADS=true, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -Djruby.regexp.interruptible=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED, --add-opens=java.base/java.security=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/java.nio.channels=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.management/sun.management=ALL-UNNAMED]
[WARN ] 2023-02-03 09:49:40.369 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2023-02-03 09:49:41.000 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[INFO ] 2023-02-03 09:49:41.496 [Converge PipelineAction::Create<main>] Reflections - Reflections took 63 ms to scan 1 urls, producing 125 keys and 434 values
[INFO ] 2023-02-03 09:49:41.765 [Converge PipelineAction::Create<main>] javapipeline - Pipeline `main` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
[INFO ] 2023-02-03 09:49:41.839 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["/root/logstash/exporter/exporter.conf"], :thread=>"#<Thread:0x77495840 run>"}
[INFO ] 2023-02-03 09:49:42.192 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>0.35}
[INFO ] 2023-02-03 09:49:42.688 [[main]-pipeline-manager] elasticsearch - ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
[INFO ] 2023-02-03 09:49:42.691 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
[INFO ] 2023-02-03 09:49:42.717 [[main]|input|elasticsearch|slice_0] elasticsearch - Slice starting {:slice_id=>0, :slices=>4}
[INFO ] 2023-02-03 09:49:42.722 [[main]|input|elasticsearch|slice_2] elasticsearch - Slice starting {:slice_id=>2, :slices=>4}
[INFO ] 2023-02-03 09:49:42.729 [[main]|input|elasticsearch|slice_3] elasticsearch - Slice starting {:slice_id=>3, :slices=>4}
[INFO ] 2023-02-03 09:49:42.730 [[main]|input|elasticsearch|slice_1] elasticsearch - Slice starting {:slice_id=>1, :slices=>4}
[INFO ] 2023-02-03 09:49:42.772 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2023-02-03 09:49:45.489 [[main]>worker0] csv - Opening file {:path=>"/root/logstash/exporter/*removed*.csv"}
  • f.out (after 8.5)
Using bundled JDK: /usr/share/logstash/jdk
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[WARN ] 2023-02-03 09:52:34.364 [main] runner - NOTICE: Running Logstash as superuser is not recommended and won't be allowed in the future. Set 'allow_superuser' to 'false' to avoid startup errors in future releases.
[INFO ] 2023-02-03 09:52:34.373 [main] runner - Starting Logstash {"logstash.version"=>"8.5.0", "jruby.version"=>"jruby 9.3.8.0 (2.6.8) 2022-09-13 98d69c9461 OpenJDK 64-Bit Server VM 17.0.4+8 on 17.0.4+8 +indy +jit [x86_64-linux]"}
[INFO ] 2023-02-03 09:52:34.375 [main] runner - JVM bootstrap flags: [-Xms4g, -Xmx4g, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djdk.io.File.enableADS=true, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -Djruby.regexp.interruptible=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED, --add-opens=java.base/java.security=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/java.nio.channels=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.management/sun.management=ALL-UNNAMED]
[WARN ] 2023-02-03 09:52:34.548 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2023-02-03 09:52:35.179 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[INFO ] 2023-02-03 09:52:35.651 [Converge PipelineAction::Create<main>] Reflections - Reflections took 59 ms to scan 1 urls, producing 125 keys and 438 values
[INFO ] 2023-02-03 09:52:36.200 [Converge PipelineAction::Create<main>] javapipeline - Pipeline `main` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
[INFO ] 2023-02-03 09:52:36.277 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["/root/logstash/exporter/exporter.conf"], :thread=>"#<Thread:0x46ac922e run>"}
[INFO ] 2023-02-03 09:52:36.624 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>0.35}
[INFO ] 2023-02-03 09:52:37.163 [[main]-pipeline-manager] elasticsearch - ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
[INFO ] 2023-02-03 09:52:37.185 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
[INFO ] 2023-02-03 09:52:37.232 [[main]|input|elasticsearch|slice_1] elasticsearch - Slice starting {:slice_id=>1, :slices=>4}
[INFO ] 2023-02-03 09:52:37.235 [[main]|input|elasticsearch|slice_0] elasticsearch - Slice starting {:slice_id=>0, :slices=>4}
[INFO ] 2023-02-03 09:52:37.261 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2023-02-03 09:52:37.270 [[main]|input|elasticsearch|slice_2] elasticsearch - Slice starting {:slice_id=>2, :slices=>4}
[INFO ] 2023-02-03 09:52:37.279 [[main]|input|elasticsearch|slice_3] elasticsearch - Slice starting {:slice_id=>3, :slices=>4}
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid12882.hprof ...
Heap dump file created [6021510285 bytes in 52.002 secs]
[FATAL] 2023-02-03 09:55:35.506 [Agent thread] Logstash - uncaught error (in thread Agent thread)
java.lang.OutOfMemoryError: Java heap space

Additional info:
This issue occurs on 8.6 as well as 8.5; every version from 7.17 up to (but not including) 8.5 works without issues.
You can find more info on exporter.conf and everything else here.


lduvnjak commented Feb 7, 2023

The issue seems to be in the elasticsearch input plugin.
Logstash 8.4 ships with logstash-input-elasticsearch 4.14.0 by default.
Upgrading just the input plugin to 4.16.0 reproduces the failure.
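As a stopgap (a workaround sketch, assuming the standard RPM install layout under /usr/share/logstash), the input plugin can be pinned back to the last known-good version with the bundled plugin manager:

```sh
# Replace the broken 4.16.0 release with the last known-good one
/usr/share/logstash/bin/logstash-plugin remove logstash-input-elasticsearch
/usr/share/logstash/bin/logstash-plugin install --version 4.14.0 logstash-input-elasticsearch

# Verify the pinned version before restarting the pipeline
/usr/share/logstash/bin/logstash-plugin list --verbose logstash-input-elasticsearch
```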

@pbabik-cen88278

@elastic, any ETA on the fix?

@lduvnjak (Author)

@elastic, the issue has been stale for almost 2 months. Can anyone take a look?

robbavey (Member) commented May 2, 2023

This should be fixed by logstash-plugins/logstash-input-elasticsearch#189, available in version 4.17.1 of the elasticsearch input.

Please reopen if the issue reoccurs.

robbavey closed this as completed May 2, 2023

lduvnjak commented May 4, 2023

Thank you, I've verified it's fixed. When will the plugin version 4.17.1 be included by default in the Logstash rpm?
