Xena: Add basic monitoring & alerting stack #179
Merged
10 commits
61b5070 Enable Kolla features (cityofships)
6940e0e Elasticsearch memory tuning (cityofships)
a8deca1 Add standard alerting rules and dashboards (cityofships)
5b7330f Use sane defaults for basic monitoring stack (cityofships)
9a9efd2 Consider agents that are auto-downed (jovial)
670aca0 Indicate source of the alert rules (cityofships)
916a936 Add monitoring group (cityofships)
56911ce Use shorter regexp for catching physical network cards (cityofships)
6b5d1ca Build Grafana with additional panel plugin (cityofships)
dd50e66 Merge branch 'stackhpc/xena' into xena_monitoring (markgoddard)
@@ -1,6 +1,9 @@
-# A single all-in-one controller/compute host.
+# A single all-in-one controller/compute/monitoring host.
 [controllers]
 controller0

 [compute:children]
 controllers
+
+[monitoring:children]
+controllers
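The new `[monitoring:children]` group adds no hosts of its own; it inherits them from `controllers`, so `controller0` ends up in all three groups. The sketch below illustrates that resolution with a deliberately simplified reader built on Python's `configparser`. It is not Ansible's inventory parser: it ignores `:vars` sections, host ranges and inline host variables, and the helper names `parse_groups` and `hosts_in` are made up for this illustration.

```python
import configparser

# The inventory as it reads after this change.
INVENTORY = """\
# A single all-in-one controller/compute/monitoring host.
[controllers]
controller0

[compute:children]
controllers

[monitoring:children]
controllers
"""


def parse_groups(text):
    """Split an INI-style inventory into direct hosts and child groups."""
    parser = configparser.ConfigParser(allow_no_value=True, delimiters=("=",))
    parser.optionxform = str  # keep host names case-sensitive
    parser.read_string(text)
    hosts, children = {}, {}
    for section in parser.sections():
        name, _, kind = section.partition(":")
        members = list(parser[section])
        if kind == "children":
            children.setdefault(name, []).extend(members)
        else:
            hosts.setdefault(name, []).extend(members)
    return hosts, children


def hosts_in(group, hosts, children, _seen=None):
    """Return the hosts of a group, following :children links recursively."""
    seen = _seen if _seen is not None else set()
    if group in seen:  # guard against cyclic group definitions
        return []
    seen.add(group)
    result = list(hosts.get(group, []))
    for child in children.get(group, []):
        for host in hosts_in(child, hosts, children, seen):
            if host not in result:
                result.append(host)
    return result


if __name__ == "__main__":
    direct, nested = parse_groups(INVENTORY)
    for group in ("controllers", "compute", "monitoring"):
        print(group, hosts_in(group, direct, nested))
```

For an authoritative view of group membership on a real deployment, Ansible's own `ansible-inventory` command is the better tool.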
316 changes: 316 additions & 0 deletions
etc/kayobe/kolla/config/grafana/dashboards/ceph/ceph_mds.json
@@ -0,0 +1,316 @@
{% raw %}
{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "type": "dashboard"
      }
    ]
  },
  "editable": true,
  "gnetId": null,
  "graphTooltip": 0,
  "id": 179,
  "iteration": 1616694078267,
  "links": [],
  "panels": [
    {
      "collapsed": false,
      "datasource": null,
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 0
      },
      "id": 10,
      "panels": [],
      "title": "MDS Performance",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "$datasource",
      "fill": 0,
      "fillGradient": 0,
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 0,
        "y": 1
      },
      "hiddenSeries": false,
      "id": 2,
      "legend": {
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [
        {
          "alias": "/.*Reads/",
          "transform": "negative-Y"
        }
      ],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(irate(ceph_objecter_op_r{ceph_daemon=~\"($mds_servers).*\"}[10m]))",
          "format": "time_series",
          "intervalFactor": 1,
          "legendFormat": "Read Ops",
          "refId": "A"
        },
        {
          "expr": "sum(irate(ceph_objecter_op_w{ceph_daemon=~\"($mds_servers).*\"}[10m]))",
          "format": "time_series",
          "intervalFactor": 1,
          "legendFormat": "Write Ops",
          "refId": "B"
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "MDS Workload - $mds_servers",
      "tooltip": {
        "shared": true,
        "sort": 2,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "none",
          "label": "Reads(-) / Writes (+)",
          "logBase": 1,
          "max": null,
          "min": "0",
          "show": true
        },
        {
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "$datasource",
      "fill": 0,
      "fillGradient": 0,
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 12,
        "y": 1
      },
      "hiddenSeries": false,
      "id": 4,
      "legend": {
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "irate(ceph_mds_server_handle_client_request{ceph_daemon=~\"($mds_servers).*\"}[30m])",
          "format": "time_series",
          "intervalFactor": 1,
          "legendFormat": "{{ceph_daemon}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Client Request Load - $mds_servers",
      "tooltip": {
        "shared": true,
        "sort": 2,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "none",
          "label": "Client Requests",
          "logBase": 1,
          "max": null,
          "min": "0",
          "show": true
        },
        {
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": false
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    }
  ],
  "refresh": false,
  "schemaVersion": 22,
  "style": "dark",
  "tags": [],
  "templating": {
    "list": [
      {
        "current": {
          "text": "Prometheus",
          "value": "Prometheus"
        },
        "hide": 0,
        "includeAll": false,
        "label": "Data Source",
        "multi": false,
        "name": "datasource",
        "options": [],
        "query": "prometheus",
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "type": "datasource"
      },
      {
        "allValue": null,
        "current": {
          "selected": false,
          "text": "All",
          "value": "$__all"
        },
        "datasource": "$datasource",
        "definition": "",
        "hide": 0,
        "includeAll": true,
        "label": "MDS Server",
        "multi": false,
        "name": "mds_servers",
        "options": [],
        "query": "label_values(ceph_mds_inodes, ceph_daemon)",
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "sort": 0,
        "tagValuesQuery": "",
        "tags": [],
        "tagsQuery": "",
        "type": "query",
        "useTags": false
      }
    ]
  },
  "time": {
    "from": "now-3h",
    "to": "now"
  },
  "timepicker": {
    "refresh_intervals": [
      "5s",
      "10s",
      "15s",
      "30s",
      "1m",
      "5m",
      "15m",
      "30m",
      "1h",
      "2h",
      "1d"
    ],
    "time_options": [
      "5m",
      "15m",
      "1h",
      "6h",
      "12h",
      "24h",
      "2d",
      "7d",
      "30d"
    ]
  },
  "timezone": "",
  "title": "Ceph MDS",
  "uid": "tbO9LAiZz",
  "version": 5
}
{% endraw %}
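The dashboard is wrapped in `{% raw %}` / `{% endraw %}`, presumably so that Grafana placeholders such as `{{ceph_daemon}}` survive when the file is rendered as a Jinja2 template (an assumption; the PR does not state the reason). For a quick local check, the sketch below strips those markers, parses the JSON, and lists each graph panel's PromQL expressions. It runs against a trimmed stand-in containing only fields copied from the diff, and the helper names are hypothetical. It also reports series overrides whose alias regex matches none of the panel's `legendFormat` strings, assuming Grafana matches the override alias against the rendered legend text.

```python
import json
import re

# Trimmed stand-in for ceph_mds.json: only a few fields copied from the diff.
TEMPLATE = r'''{% raw %}
{
  "panels": [
    {"title": "MDS Performance", "type": "row", "panels": []},
    {
      "title": "MDS Workload - $mds_servers",
      "type": "graph",
      "seriesOverrides": [{"alias": "/.*Reads/", "transform": "negative-Y"}],
      "targets": [
        {"expr": "sum(irate(ceph_objecter_op_r{ceph_daemon=~\"($mds_servers).*\"}[10m]))",
         "legendFormat": "Read Ops"},
        {"expr": "sum(irate(ceph_objecter_op_w{ceph_daemon=~\"($mds_servers).*\"}[10m]))",
         "legendFormat": "Write Ops"}
      ]
    },
    {
      "title": "Client Request Load - $mds_servers",
      "type": "graph",
      "seriesOverrides": [],
      "targets": [
        {"expr": "irate(ceph_mds_server_handle_client_request{ceph_daemon=~\"($mds_servers).*\"}[30m])",
         "legendFormat": "{{ceph_daemon}}"}
      ]
    }
  ],
  "title": "Ceph MDS"
}
{% endraw %}'''


def strip_raw_markers(text):
    """Remove a leading {% raw %} and a trailing {% endraw %} tag, if present."""
    body = text.strip()
    if body.startswith("{% raw %}"):
        body = body[len("{% raw %}"):]
    if body.endswith("{% endraw %}"):
        body = body[:-len("{% endraw %}")]
    return body


def load_dashboard(text):
    """Parse the dashboard JSON after removing the Jinja2 raw markers."""
    return json.loads(strip_raw_markers(text))


def graph_queries(dashboard):
    """Map each non-row panel title to its list of PromQL expressions."""
    return {
        panel["title"]: [t["expr"] for t in panel.get("targets", [])]
        for panel in dashboard["panels"]
        if panel.get("type") != "row"
    }


def unmatched_overrides(panel):
    """Return override aliases that match none of the panel's legendFormat strings."""
    legends = [t.get("legendFormat", "") for t in panel.get("targets", [])]
    missing = []
    for override in panel.get("seriesOverrides", []):
        alias = override["alias"]
        # Aliases written as /pattern/ are treated as regexes, others as literal names.
        if len(alias) >= 2 and alias.startswith("/") and alias.endswith("/"):
            hit = any(re.search(alias[1:-1], name) for name in legends)
        else:
            hit = alias in legends
        if not hit:
            missing.append(alias)
    return missing


if __name__ == "__main__":
    dashboard = load_dashboard(TEMPLATE)
    for title, exprs in graph_queries(dashboard).items():
        print(title, exprs)
    for panel in dashboard["panels"]:
        print(panel["title"], "unmatched overrides:", unmatched_overrides(panel))
```

On the stand-in data the check flags `/.*Reads/` in the "MDS Workload" panel, because the legends are "Read Ops" and "Write Ops". If the matching assumption holds, the `negative-Y` transform would not apply to the read series as written; the PR does not say whether that is intended.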