Alert check skipped, NPE #4045

Closed
hc4 opened this issue Aug 2, 2017 · 10 comments

@hc4
Contributor

hc4 commented Aug 2, 2017

I just upgraded to 2.3.0 and am getting lots of errors like these in the log:

2017-08-02T10:54:55.727+03:00 ERROR [AlertScanner] Skipping alert check <***/988cee81-212a-44c3-9477-289e64e530cf>: null (NullPointerException)
2017-08-02T10:55:54.576+03:00 ERROR [AlertScanner] Skipping alert check <***/988cee81-212a-44c3-9477-289e64e530cf>: null (NullPointerException)
2017-08-02T10:56:56.852+03:00 ERROR [AlertScanner] Skipping alert check <***/03cd63aa-94c8-4d1c-8e47-c02bed2f2e7d>: null (NullPointerException)
2017-08-02T10:57:02.803+03:00 ERROR [AlertScanner] Skipping alert check <***/988cee81-212a-44c3-9477-289e64e530cf>: null (NullPointerException)
2017-08-02T10:57:55.756+03:00 ERROR [AlertScanner] Skipping alert check <***/03cd63aa-94c8-4d1c-8e47-c02bed2f2e7d>: null (NullPointerException)
2017-08-02T10:57:55.881+03:00 ERROR [AlertScanner] Skipping alert check <***/988cee81-212a-44c3-9477-289e64e530cf>: null (NullPointerException)

Your Environment

  • Graylog Version: 2.3.0
  • Elasticsearch Version: 2.4.5
@hc4 hc4 changed the title from "Allert skipped, NPE" to "Allert check skipped, NPE" Aug 2, 2017
@joschi
Contributor

joschi commented Aug 2, 2017

@hc4 Are there any other exceptions in the logs of your Graylog node(s)?
Are you using 3rd party plugins and are all plugins compatible with Graylog 2.3.0?

@hc4
Contributor Author

hc4 commented Aug 2, 2017

No other errors in the logs (I'm also surprised that there is no stack trace).

@hc4
Contributor Author

hc4 commented Aug 2, 2017

It looks like the problem is caused by broken stream routing (a separate problem that I'm looking into right now),
so there isn't any data in the alert stream.

@dennisoelkers dennisoelkers changed the title from "Allert check skipped, NPE" to "Alert check skipped, NPE" Aug 2, 2017
@dennisoelkers
Member

Hey @hc4, can you dump your alert condition configurations?

@dennisoelkers dennisoelkers self-assigned this Aug 2, 2017
@hc4
Contributor Author

hc4 commented Aug 2, 2017

{
  "total": 2,
  "conditions": [
    {
      "id": "f0192668-c8c8-4eb6-a765-f66a1919431c",
      "type": "message_count",
      "creator_user_id": "***",
      "created_at": "2016-09-07T10:32:25.889+0000",
      "parameters": {
        "grace": 1440,
        "threshold_type": "LESS",
        "threshold": 1,
        "time": 60,
        "backlog": 0
      },
      "in_grace": false,
      "title": "***"
    },
    {
      "id": "988cee81-212a-44c3-9477-289e64e530cf",
      "type": "field_value",
      "creator_user_id": "***",
      "created_at": "2016-10-17T08:31:20.739+0000",
      "parameters": {
        "backlog": 0,
        "repeat_notifications": false,
        "field": "load_percents",
        "grace": 60,
        "threshold": 90,
        "threshold_type": "HIGHER",
        "time": 5,
        "type": "MEAN"
      },
      "in_grace": false,
      "title": "***"
    }
  ]
}

@dennisoelkers
Member

Can you please:

  • also dump the config for alert condition 03cd63aa-94c8-4d1c-8e47-c02bed2f2e7d
  • switch the AlertScanner class to debug for one run, using curl -XPUT -u $user:$pass http://$graylog/api/system/loggers/org.graylog2.alerts.AlertScanner/DEBUG (replace http with https if necessary)

When switched to debug, the whole exception is logged. Otherwise it is only the exception message and class name, but no stack trace, to avoid log flooding. You can switch back the log level of the class by doing the same call, replacing DEBUG with INFO.
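
If you would rather script this than use curl, the same call can be sketched in Java. This is only a rough illustration: the class `LoggerLevelToggle` and its helper names are made up here, and the only parts taken from the curl command above are the `/api/system/loggers/<logger>/<level>` path and the basic-auth credentials. It assumes Java 8 or newer and sends a PUT without a body, as the curl call does.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class LoggerLevelToggle {
    // Builds the logger endpoint from the curl example; tolerates a trailing slash.
    static String loggerUrl(String baseUrl, String loggerName, String level) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return base + "/api/system/loggers/" + loggerName + "/" + level;
    }

    // Equivalent of curl's "-u user:pass": an HTTP Basic Authorization header value.
    static String basicAuth(String user, String pass) {
        byte[] raw = (user + ":" + pass).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    // Sends the PUT request and returns the HTTP status code.
    static int setLevel(String baseUrl, String user, String pass,
                        String loggerName, String level) throws IOException {
        String url = loggerUrl(baseUrl, loggerName, level);
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("PUT");
        conn.setRequestProperty("Authorization", basicAuth(user, pass));
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.err.println("usage: LoggerLevelToggle <baseUrl> <user> <pass> <DEBUG|INFO>");
            return;
        }
        int status = setLevel(args[0], args[1], args[2],
                "org.graylog2.alerts.AlertScanner", args[3]);
        System.out.println("HTTP " + status);
    }
}
```

Run it once with `DEBUG` to capture the stack trace, then again with `INFO` to restore the previous level.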

@hc4
Contributor Author

hc4 commented Aug 2, 2017

Both alerts are copies of each other.

Stack trace:

java.lang.NullPointerException: null
        at org.graylog2.indexer.results.FieldStatsResult.<init>(FieldStatsResult.java:50) ~[graylog.jar:?]
        at org.graylog2.indexer.searches.Searches.fieldStats(Searches.java:508) ~[graylog.jar:?]
        at org.graylog2.alerts.types.FieldValueAlertCondition.runCheck(FieldValueAlertCondition.java:175) ~[graylog.jar:?]
        at org.graylog2.alerts.types.FieldValueAlertCondition.runCheck(FieldValueAlertCondition.java:53) ~[graylog.jar:?]
        at org.graylog2.alerts.AlertScanner.checkAlertCondition(AlertScanner.java:64) ~[graylog.jar:?]
        at org.graylog2.periodical.AlertScannerThread.lambda$doRun$0(AlertScannerThread.java:63) ~[graylog.jar:?]
        at java.util.ArrayList.forEach(ArrayList.java:1249) [?:1.8.0_131]
        at org.graylog2.periodical.AlertScannerThread.doRun(AlertScannerThread.java:63) [graylog.jar:?]
        at org.graylog2.plugin.periodical.Periodical.run(Periodical.java:77) [graylog.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_131]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_131]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_131]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_131]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]

@dennisoelkers
Member

Thanks. This is a known issue (#4038) and will be fixed in the next version.

@dennisoelkers
Member

@hc4: I have backported the fix to 2.3, so the fix should go into 2.3.1 if everything goes well. Thanks for reporting this!

joschi pushed a commit that referenced this issue Aug 2, 2017
Before this change, an `ExtendedStatsAggregation` could include an
arbitrary number of fields that are null. Assigning them a non-boxed
field type leads to an NPE and a 500 is being returned to the caller
when the result of an extended field stats widget is requested.

This change properly assigns a valid value for those fields, so a result
(albeit possibly containing NaN for one or more fields) is being
returned to the caller.

Fixes #4026
Fixes #4045

(cherry picked from commit 882727e)
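
For readers unfamiliar with this failure mode, the following is a minimal, self-contained Java sketch of what the commit message describes. It is not the Graylog code: `NullStatsDemo`, `RawStats`, `unsafeMean`, `safeMean`, and `exceeds` are hypothetical names invented for illustration, and the actual change in 882727e is not shown in this thread. The sketch only demonstrates that assigning a null boxed `Double` to a primitive `double` throws a `NullPointerException`, and that falling back to `NaN` yields a result instead, which a "HIGHER" threshold comparison then treats as not exceeded.

```java
public class NullStatsDemo {
    // Hypothetical stand-in for an aggregation result whose values may be
    // null, e.g. when no messages matched in the queried time range.
    static final class RawStats {
        final Double mean; // boxed, so it can be null
        final Long count;  // boxed, so it can be null

        RawStats(Double mean, Long count) {
            this.mean = mean;
            this.count = count;
        }
    }

    // Failure mode: returning a boxed value as a primitive auto-unboxes it,
    // which throws NullPointerException when the value is null.
    static double unsafeMean(RawStats raw) {
        return raw.mean;
    }

    // Null-safe variant: substitute NaN so the caller still gets a result.
    static double safeMean(RawStats raw) {
        return raw.mean == null ? Double.NaN : raw.mean.doubleValue();
    }

    // A "HIGHER" threshold check; any comparison with NaN is false.
    static boolean exceeds(double value, double threshold) {
        return value > threshold;
    }

    public static void main(String[] args) {
        RawStats empty = new RawStats(null, 0L);

        try {
            unsafeMean(empty);
        } catch (NullPointerException e) {
            System.out.println("unsafeMean threw NullPointerException");
        }

        double mean = safeMean(empty);
        System.out.println("safeMean: " + mean);
        System.out.println("exceeds 90: " + exceeds(mean, 90));
    }
}
```

With an empty result, the unsafe variant throws, while the safe variant returns `NaN` and the threshold check evaluates to false, so no alert would be triggered in this sketch.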
@joschi
Contributor

joschi commented Aug 2, 2017

Fixed in #4046

@joschi joschi closed this as completed Aug 2, 2017