Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logstash spikes cpu @100% on grok parse failure #37

Closed
jordansissel opened this issue May 18, 2015 · 3 comments
Closed

Logstash spikes cpu @100% on grok parse failure #37

jordansissel opened this issue May 18, 2015 · 3 comments
Labels

Comments

@jordansissel
Copy link
Contributor

(This issue was originally filed by @sujanks at elastic/logstash#2619)


Hi,

We have logstash 1.4.2 agents running consuming logs from SQS on ElasticSearch 1.4.2.

Every time we run the box sometimes it lasts for 2 mins to 5 mins then the cpu spikes to 100%, no matter how big boxes (EC2: m3Xlarge).

After spending lot of time, found out the it is due to the grok, but not clear why?

Following is our grok in the config file.

grok {
      match => [ "Message", "%{DATE:date} %{GREEDYDATA:time} \[%{GREEDYDATA:cloudhubapp}\] %{DATA:loglevel} %{DATA:application}\.%{DATA:component}\.%{DATA:subcomponent}\.%{DATA:position} - %{GREEDYDATA:keyvalue} Message=%{GREEDYDATA:message}" ]
}

These are the sample log message

2015-02-11 05:03:01,209 [[digital-methode-subscriber].connector.http.mule.default.receiver.2104] INFO  apache.component.content-status-notification-publisher-http-handler.other - transactionID=160246_8aba3078-b1a2-11e4-90ab-ef3fd79aaa94 Message=Content Status Notification message successfully.

2015-02-13 03:09:07,813 [[digital-methode-subscriber].connector.http.mule.default.receiver.31] ERROR org.apache.retry.notifiers.ConnectNotifier - Failed to connectreconnect: Work Descriptor. Root Exception was: One or more parameters are invalid. Reason: Message must be shorter than 262144 bytes.. Type: class com.amazonaws.AmazonServiceException

In the grok if I remove after hyphen (-), the logstash is ok and run on cpu about 20%, on EC2 m1.large.

Removed part

- %{GREEDYDATA:keyvalue} Message=%{GREEDYDATA:message}"

Any idea?

Sujan

@jordansissel
Copy link
Contributor Author

The problem is likely that the use of GREEDYDATA doesn't inform the execution about how to match your data efficiently. You can read more about this kind of phenomenon on what Wikipedia calls ReDoS. Basically, my recommendation is to use the most specific patterns you can, and GREEDYDATA is not very specific (It will match anything or nothing), and such ambiguity can cause the regular expression engine to get bogged down trying to match things.

@jordansissel
Copy link
Contributor Author

I do believe this to be an example of a pathological regexp that causes your parsing to be so slow or simply appear to be doing nothing but consuming 100% cpu.

@jsvd
Copy link
Contributor

jsvd commented Sep 15, 2016

closing this.
two notes about this issue in particular:

  1. having two consecutive %{GREEDYDATA} patterns makes no sense. if your matching string is something like Once upon a time, there was a mouse., then the first GREEDYDATA will consume "Once upon a time, there was a", and the second will be "mouse." There's no way for the regex engine to figure out where to start and stop each GREEDYDATA, so it gives as much as possible to the first one.
  2. adding guards to the regex makes parse failure much faster (2x to 5x in this case).

@jsvd jsvd closed this as completed Sep 15, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants