Alert check skipped, NPE #4045

Closed
hc4 opened this Issue Aug 2, 2017 · 10 comments

@hc4
Contributor

hc4 commented Aug 2, 2017

I just upgraded to 2.3.0 and now get lots of errors like these in the log:

2017-08-02T10:54:55.727+03:00 ERROR [AlertScanner] Skipping alert check <***/988cee81-212a-44c3-9477-289e64e530cf>: null (NullPointerException)
2017-08-02T10:55:54.576+03:00 ERROR [AlertScanner] Skipping alert check <***/988cee81-212a-44c3-9477-289e64e530cf>: null (NullPointerException)
2017-08-02T10:56:56.852+03:00 ERROR [AlertScanner] Skipping alert check <***/03cd63aa-94c8-4d1c-8e47-c02bed2f2e7d>: null (NullPointerException)
2017-08-02T10:57:02.803+03:00 ERROR [AlertScanner] Skipping alert check <***/988cee81-212a-44c3-9477-289e64e530cf>: null (NullPointerException)
2017-08-02T10:57:55.756+03:00 ERROR [AlertScanner] Skipping alert check <***/03cd63aa-94c8-4d1c-8e47-c02bed2f2e7d>: null (NullPointerException)
2017-08-02T10:57:55.881+03:00 ERROR [AlertScanner] Skipping alert check <***/988cee81-212a-44c3-9477-289e64e530cf>: null (NullPointerException)

Your Environment

  • Graylog Version: 2.3.0
  • Elasticsearch Version: 2.4.5

@hc4 hc4 changed the title from Allert skipped, NPE to Allert check skipped, NPE Aug 2, 2017

@joschi

Contributor

joschi commented Aug 2, 2017

@hc4 Are there any other exceptions in the logs of your Graylog node(s)?
Are you using 3rd party plugins and are all plugins compatible with Graylog 2.3.0?

@hc4

Contributor

hc4 commented Aug 2, 2017

No other errors in the logs (I'm also surprised that there is no stack trace).
[image attachment]

@hc4

Contributor

hc4 commented Aug 2, 2017

It looks like the problem is caused by broken stream routing (a separate problem that I'm looking into right now), so there isn't any data in the alert stream.

@dennisoelkers dennisoelkers changed the title from Allert check skipped, NPE to Alert check skipped, NPE Aug 2, 2017

@dennisoelkers

Member

dennisoelkers commented Aug 2, 2017

Hey @hc4, can you dump your alert condition configurations?

@dennisoelkers dennisoelkers self-assigned this Aug 2, 2017

@hc4

Contributor

hc4 commented Aug 2, 2017

{
  "total": 2,
  "conditions": [
    {
      "id": "f0192668-c8c8-4eb6-a765-f66a1919431c",
      "type": "message_count",
      "creator_user_id": "***",
      "created_at": "2016-09-07T10:32:25.889+0000",
      "parameters": {
        "grace": 1440,
        "threshold_type": "LESS",
        "threshold": 1,
        "time": 60,
        "backlog": 0
      },
      "in_grace": false,
      "title": "***"
    },
    {
      "id": "988cee81-212a-44c3-9477-289e64e530cf",
      "type": "field_value",
      "creator_user_id": "***",
      "created_at": "2016-10-17T08:31:20.739+0000",
      "parameters": {
        "backlog": 0,
        "repeat_notifications": false,
        "field": "load_percents",
        "grace": 60,
        "threshold": 90,
        "threshold_type": "HIGHER",
        "time": 5,
        "type": "MEAN"
      },
      "in_grace": false,
      "title": "***"
    }
  ]
}

@dennisoelkers

Member

dennisoelkers commented Aug 2, 2017

Can you please:

  • also dump the config for alert condition 03cd63aa-94c8-4d1c-8e47-c02bed2f2e7d
  • switch the AlertScanner class to debug for one run, using curl -XPUT -u $user:$pass http://$graylog/api/system/loggers/org.graylog2.alerts.AlertScanner/DEBUG (replace http with https if necessary)

When switched to debug, the whole exception is logged. Otherwise it is only the exception message and class name, but no stack trace, to avoid log flooding. You can switch back the log level of the class by doing the same call, replacing DEBUG with INFO.
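For anyone who prefers to script this rather than use curl, here is a minimal Java sketch of the same PUT call. It assumes Java 11+ (for `java.net.http`) and HTTP Basic authentication as in the curl command above; the class name `LoggerLevelSketch`, its helper methods, and the command-line argument order are hypothetical and not part of Graylog.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of Graylog: mirrors the curl call from the comment above.
public class LoggerLevelSketch {

    // Builds <baseUrl>/api/system/loggers/<loggerName>/<level>, the path used in the curl command.
    static URI loggerLevelUri(String baseUrl, String loggerName, String level) {
        String base = baseUrl.endsWith("/") ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
        return URI.create(base + "/api/system/loggers/" + loggerName + "/" + level);
    }

    // Value of the HTTP Basic Authorization header (what curl -u user:pass sends).
    static String basicAuth(String user, String pass) {
        byte[] raw = (user + ":" + pass).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    // PUT request without a body, like curl -XPUT with no data.
    static HttpRequest buildRequest(String baseUrl, String user, String pass,
                                    String loggerName, String level) {
        return HttpRequest.newBuilder(loggerLevelUri(baseUrl, loggerName, level))
                .header("Authorization", basicAuth(user, pass))
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 4) {
            System.err.println("usage: LoggerLevelSketch <baseUrl> <user> <pass> <DEBUG|INFO>");
            return;
        }
        HttpRequest request = buildRequest(
                args[0], args[1], args[2], "org.graylog2.alerts.AlertScanner", args[3]);
        HttpResponse<Void> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding());
        System.out.println(request.method() + " " + request.uri()
                + " -> HTTP " + response.statusCode());
    }
}
```

Run it once with `DEBUG` to capture the stack trace, then again with `INFO` to restore the previous log level.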

@hc4

Contributor

hc4 commented Aug 2, 2017

Both alerts are copies of each other.

Stack trace:

java.lang.NullPointerException: null
        at org.graylog2.indexer.results.FieldStatsResult.<init>(FieldStatsResult.java:50) ~[graylog.jar:?]
        at org.graylog2.indexer.searches.Searches.fieldStats(Searches.java:508) ~[graylog.jar:?]
        at org.graylog2.alerts.types.FieldValueAlertCondition.runCheck(FieldValueAlertCondition.java:175) ~[graylog.jar:?]
        at org.graylog2.alerts.types.FieldValueAlertCondition.runCheck(FieldValueAlertCondition.java:53) ~[graylog.jar:?]
        at org.graylog2.alerts.AlertScanner.checkAlertCondition(AlertScanner.java:64) ~[graylog.jar:?]
        at org.graylog2.periodical.AlertScannerThread.lambda$doRun$0(AlertScannerThread.java:63) ~[graylog.jar:?]
        at java.util.ArrayList.forEach(ArrayList.java:1249) [?:1.8.0_131]
        at org.graylog2.periodical.AlertScannerThread.doRun(AlertScannerThread.java:63) [graylog.jar:?]
        at org.graylog2.plugin.periodical.Periodical.run(Periodical.java:77) [graylog.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_131]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_131]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_131]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_131]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]

@dennisoelkers

Member

dennisoelkers commented Aug 2, 2017

Thanks. This is a known issue (#4038) and will be fixed in the next version.

@dennisoelkers dennisoelkers added this to the 2.3.1 milestone Aug 2, 2017

@dennisoelkers dennisoelkers added the bug label Aug 2, 2017

@dennisoelkers

Member

dennisoelkers commented Aug 2, 2017

@hc4: I have backported the fix to 2.3, so the fix should go into 2.3.1 if everything goes well. Thanks for reporting this!

joschi added a commit that referenced this issue Aug 2, 2017

Return NaN for non-present fields of FieldStatsResult (#4046)
Before this change, an `ExtendedStatsAggregation` could include an
arbitrary number of fields that are null. Assigning them to a non-boxed
(primitive) field type leads to an NPE, and a 500 is returned to the caller
when the result of an extended field stats widget is requested.

This change properly assigns a valid value for those fields, so a result
(albeit possibly containing NaN for one or more fields) is
returned to the caller.

Fixes #4026
Fixes #4045

(cherry picked from commit 882727e)
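
The failure mode described in the commit message can be reproduced in isolation. The following is only an illustrative sketch, not the actual `FieldStatsResult` code: the class `FieldStatsSketch`, the nested `Stats` holder, and all method names are hypothetical. It shows that auto-unboxing a null `Double` into a primitive `double` throws a `NullPointerException`, and that substituting `NaN` avoids it.

```java
// Illustrative only: hypothetical names, not the real Graylog classes.
public class FieldStatsSketch {

    // Stand-in for a result object that stores statistics in primitive fields.
    static final class Stats {
        final double mean;

        Stats(double mean) {
            this.mean = mean;
        }
    }

    // Passing a null Double where a primitive double is expected forces
    // auto-unboxing, which throws NullPointerException.
    static Stats unsafe(Double mean) {
        return new Stats(mean);
    }

    // The approach the commit message describes: fall back to NaN for absent values.
    static Stats nanSafe(Double mean) {
        return new Stats(mean == null ? Double.NaN : mean);
    }

    // Assumed threshold check in the style of a "HIGHER" condition.
    // Any ordered comparison with NaN is false in Java.
    static boolean exceeds(Stats stats, double threshold) {
        return stats.mean > threshold;
    }

    public static void main(String[] args) {
        try {
            unsafe(null);
        } catch (NullPointerException e) {
            System.out.println("unsafe(null) threw NullPointerException");
        }
        Stats stats = nanSafe(null);
        System.out.println("nanSafe(null).mean is NaN: " + Double.isNaN(stats.mean));
        System.out.println("exceeds threshold 90: " + exceeds(stats, 90.0));
    }
}
```

One consequence of the NaN fallback, under the assumption that the condition is a plain `>` or `<` comparison: an empty stream yields `NaN`, every ordered comparison against it is false, and the check neither triggers nor fails with an exception.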

@joschi

Contributor

joschi commented Aug 2, 2017

Fixed in #4046

@joschi joschi closed this Aug 2, 2017