
logstash repeatedly processes a corrupted compressed file in an endless loop #261

@xo4n

Description

We had a case where several corrupted compressed files ended up in the input directory of the logstash pipeline. What happens then is that logstash reads and processes some of the lines of the corrupted files, sends some corrupted data to the output, and throws an error before it finishes reading.

Because it didn't finish properly, the file is not marked as processed and it is picked up again over and over. The result is a continuous stream of corrupted data being sent to the output.

logstash_1  | [2020-02-28T14:00:21,027][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.5.2"}
logstash_1  | [2020-02-28T14:00:24,963][INFO ][org.reflections.Reflections] Reflections took 100 ms to scan 1 urls, producing 20 keys and 40 values
logstash_1  | [2020-02-28T14:00:27,790][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge] A gauge metric of an unknown type (org.jruby.RubyArray) has been create for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
logstash_1  | [2020-02-28T14:00:27,798][INFO ][logstash.javapipeline    ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["/etc/logstash/conf.d/logstash_akamai.conf"], :thread=>"#<Thread:0x5cd406bb run>"}
logstash_1  | [2020-02-28T14:00:28,139][INFO ][logstash.javapipeline    ] Pipeline started {"pipeline.id"=>"main"}
logstash_1  | [2020-02-28T14:00:28,228][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
logstash_1  | [2020-02-28T14:00:28,233][INFO ][filewatch.observingread  ] START, creating Discoverer, Watch with file and sincedb collections
logstash_1  | [2020-02-28T14:00:28,723][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
logstash_1  | [2020-02-28T14:00:29,708][INFO ][logstash.outputs.file    ] Opening file {:path=>"/var/log/akamai_messages.log"}
logstash_1  | [2020-02-28T14:02:03,053][ERROR][filewatch.readmode.handlers.readzipfile] Cannot decompress the gzip file at path: /var/log/akamai/trv_imgcy_698321.esw3ccs_ghostip_S.202002271000-1100-4.gz
logstash_1  | [2020-02-28T14:03:32,609][ERROR][filewatch.readmode.handlers.readzipfile] Cannot decompress the gzip file at path: /var/log/akamai/trv_imgcy_698321.esw3ccs_ghostip_S.202002271000-1100-4.gz
logstash_1  | [2020-02-28T14:04:57,801][ERROR][filewatch.readmode.handlers.readzipfile] Cannot decompress the gzip file at path: /var/log/akamai/trv_imgcy_698321.esw3ccs_ghostip_S.202002271000-1100-4.gz
logstash_1  | [2020-02-28T14:06:21,140][ERROR][filewatch.readmode.handlers.readzipfile] Cannot decompress the gzip file at path: /var/log/akamai/trv_imgcy_698321.esw3ccs_ghostip_S.202002271000-1100-4.gz
^CGracefully stopping... (press Ctrl+C again to force)
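
The original Akamai logs are not needed to reproduce this: any gzip file truncated mid-stream should trigger the same decompression error after some lines have already been read. A minimal Ruby sketch (the file path is just an example):

require 'zlib'

# Build a valid gzip file, then cut it in half to simulate the corruption.
Zlib::GzipWriter.open('/tmp/corrupt_sample.gz') do |gz|
  10_000.times { |i| gz.puts "record #{i}" }
end
data = File.binread('/tmp/corrupt_sample.gz')
File.binwrite('/tmp/corrupt_sample.gz', data[0, data.bytesize / 2])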

The input configuration

input {
  file {
    path => "/var/log/akamai/trv_imgcy_698321.esw3ccs_ghostip_S.202002271000-1100-4.gz"
    mode => "read"
    start_position => "beginning"
    file_completed_action => "delete"
    sincedb_path => "/dev/null"
  }
}

Tried with 2 different versions (7.5.2 and 7.1.1) and the issue was reproducible in both cases.
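
A possible stopgap until the plugin skips or quarantines files it cannot decompress: pre-validate the archives outside logstash and move unreadable ones aside before the file input sees them. A minimal sketch in Ruby, assuming a hypothetical quarantine directory:

require 'zlib'
require 'fileutils'

watch_dir      = '/var/log/akamai'            # from the config above
quarantine_dir = '/var/log/akamai_quarantine' # hypothetical

FileUtils.mkdir_p(quarantine_dir)

Dir.glob(File.join(watch_dir, '*.gz')).each do |path|
  begin
    # Full decompression pass; raises if the stream is truncated or corrupted.
    Zlib::GzipReader.open(path) { |gz| gz.each_line { } }
  rescue Zlib::Error => e
    warn "quarantining #{path}: #{e.class}: #{e.message}"
    FileUtils.mv(path, quarantine_dir)
  end
end

Ultimately, though, the plugin itself should probably stop retrying a file after a fatal decompression error instead of picking it up again indefinitely.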
