This repository has been archived by the owner on May 12, 2021. It is now read-only.

DO NOT MERGE METRON-745: Create Error Dashboards #469

Closed
wants to merge 21 commits into from

Conversation

justinleet
Contributor

DO NOT MERGE

Summary

Based on Ryan's work in #453, I went ahead and created a Kibana dashboard for tracking errors. That PR is not finalized in master, so this should not be merged! However, the data flowing to the index is pretty final, so unless the actual fields or field names change, it doesn't really affect this.

All we care about here is the dashboard itself, but unfortunately the 453 changes get pulled along for the ride until that's in.

It's nothing too complicated, essentially just some high level overviews of the various fields output by Ryan (some counts, etc.), along with a pane for viewing the actual errors along with all their fields. Note that they include both raw and unique message counts (via the hash fields) in most things.
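To make the raw vs. unique distinction concrete, here is a minimal Python sketch of the idea, not the dashboard's actual implementation (Kibana does this with aggregations). The field names `error_type` and `error_hash`, the sample documents, and the helper `raw_and_unique_counts` are assumptions for illustration; check error_index.template for the real fields.

```python
from collections import Counter

# Hypothetical error documents. The field names "error_type" and "error_hash"
# are assumptions for illustration, not copied from error_index.template.
errors = [
    {"error_type": "parser_error", "error_hash": "a1"},
    {"error_type": "parser_error", "error_hash": "a1"},  # same message again
    {"error_type": "parser_error", "error_hash": "b2"},
    {"error_type": "indexing_error", "error_hash": "c3"},
]


def raw_and_unique_counts(docs, group_field, hash_field):
    """Return {group: (raw_count, unique_count)} for the given documents.

    The raw count is the number of documents in the group; the unique count
    is the number of distinct hash values seen in that group.
    """
    raw = Counter()
    hashes = {}
    for doc in docs:
        key = doc[group_field]
        raw[key] += 1
        hashes.setdefault(key, set()).add(doc[hash_field])
    return {key: (raw[key], len(hashes[key])) for key in raw}


counts = raw_and_unique_counts(errors, "error_type", "error_hash")
for error_type, (raw_count, unique_count) in counts.items():
    print(f"{error_type}: raw={raw_count} unique={unique_count}")
```

A large gap between the two numbers suggests the same message is failing repeatedly, while similar numbers suggest many distinct failures.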

I've attached some screenshots, but this can also be spun up on an Ambari cluster (and eventually will have to be in order to be validated, given that the file isn't in a readable format).

I'm basically looking for feedback on what else would be useful and if we want to adjust anything. Keep in mind, we don't actually have a lot of fields to work with (because if everything was good, we wouldn't be here in the first place!). See error_index.template for the fields we have.

Notes

  • I'm really not convinced the 'hostname' visualizations are needed. The field is there and useful, but given that it's populated with the Storm host that failed, it seems like it's probably useless most of the time.
  • Kibana occasionally rearranges the order of the visualizations (usually swapping a couple of the charts). If I recall correctly, that's a known Kibana bug that we're stuck with.
  • The graph teaches a lesson of "Don't load all your data at once if you want a pretty graph". Still, it's just a basic graph of the error counts over time.
  • Keep in mind the graph shifts with the viewing window, so switching between the last 15 minutes and the last 7 days updates it accordingly.
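As a rough illustration of why the errors-over-time graph changes shape with the viewing window, the sketch below buckets timestamps into a fixed number of intervals derived from the window size. This is not Kibana's interval-selection logic; the helper names, the `target_buckets` value of 30, and the sample timestamps are all assumptions.

```python
from collections import Counter


def pick_interval_seconds(window_seconds, target_buckets=30):
    """Pick a bucket width so the window holds roughly target_buckets buckets."""
    return max(1, window_seconds // target_buckets)


def histogram(timestamps, window_start, window_seconds, target_buckets=30):
    """Count timestamps (in seconds) per bucket inside the viewing window.

    Returns (interval, {bucket_start: count}); timestamps outside the
    window are ignored.
    """
    interval = pick_interval_seconds(window_seconds, target_buckets)
    counts = Counter()
    for ts in timestamps:
        if window_start <= ts < window_start + window_seconds:
            offset = (ts - window_start) // interval
            counts[window_start + offset * interval] += 1
    return interval, dict(counts)


# Hypothetical error timestamps, in seconds from the start of the window.
sample = [0, 10, 45, 100, 899]

# The same data viewed through a 15-minute and a 7-day window.
short_interval, short_view = histogram(sample, 0, 15 * 60)
long_interval, long_view = histogram(sample, 0, 7 * 24 * 60 * 60)
```

With the 15-minute window the errors spread across several narrow buckets, while the 7-day window collapses them into a single wide bucket, which is the same effect as loading all the data at once and getting one spike.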

errordashboard_top

errordashboard_middle

errordashboard_bottom

The bottom pane extends further down, but we've all seen a table of data before.

For all changes:

  • Is there a JIRA ticket associated with this PR? If not, one needs to be created at Metron Jira.
  • Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
  • Has your PR been rebased against the latest commit within the target branch (typically master)?

For code changes:

  • Have you included steps to reproduce the behavior or problem that is being changed or addressed?
  • Have you included steps or a guide to how the change may be verified and tested manually?
  • Have you ensured that the full suite of tests and checks have been executed in the root incubating-metron folder via:
mvn -q clean integration-test install && build_utils/verify_licenses.sh 
  • Have you written or updated unit tests and/or integration tests to verify your changes?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

For documentation related changes:

  • Have you ensured that the format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not, then run the following commands and verify the changes via site-book/target/site/index.html.
cd site-book
bin/generate-md.sh
mvn site:site

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
It is also recommended that travis-ci is set up for your personal repository such that your branches are built there before submitting a pull request.

rmerriman added 10 commits February 6, 2017 15:19
# Conflicts:
#	metron-platform/metron-enrichment/src/test/java/org/apache/metron/enrichment/integration/EnrichmentIntegrationTest.java
#	metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java
#	metron-platform/metron-parsers/src/test/java/org/apache/metron/parsers/bolt/ParserBoltTest.java
@justinleet
Contributor Author

An alternative, more sensible and readable approach to showing the errors over time.
error dashboard histogram

@justinleet
Contributor Author

I'm going to just close this and open a new, much, much cleaner one.

@justinleet justinleet closed this Mar 7, 2017
@justinleet justinleet deleted the dashboards-695 branch April 4, 2017 12:14
2 participants