sdm_health_exporter.py - v0.0.1 #10
This script serves as an example exporter that monitors the health of resources ("Infrastructure") and nodes ("Gateways/Relays"). The script uses the following workflow:
- Make an API call to strongDM's API to retrieve information about resources and nodes. The frequency of the API call is configurable by updating the "update_interval" variable in "main()".
- Collect data about any resource or node that is tagged with <alert_tag> in strongDM. This tag is configurable by updating the "alert_tag" variable in "main()".
- Export metrics to a Prometheus endpoint as a "Gauge" (0 for healthy, 1 for unhealthy).
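The workflow described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from this pull request: the `Item` class, `collect_gauges`, `run`, and the `cycles` parameter are hypothetical names, the API call is replaced by a caller-supplied `fetch_items` function, and "tagged with <alert_tag>" is assumed to mean that the tag key is present on the item.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical stand-in for a resource or node returned by the API."""
    name: str
    healthy: bool
    tags: dict = field(default_factory=dict)


def collect_gauges(items, alert_tag):
    """Return {name: 0 or 1} for every item carrying alert_tag (0 = healthy)."""
    return {
        item.name: 0 if item.healthy else 1
        for item in items
        if alert_tag in item.tags  # assumption: presence of the tag key is enough
    }


def run(fetch_items, publish, alert_tag, update_interval, cycles=None):
    """Poll fetch_items() every update_interval seconds and publish the gauges.

    cycles=None loops forever; a number stops after that many polls.
    """
    done = 0
    while cycles is None or done < cycles:
        publish(collect_gauges(fetch_items(), alert_tag))
        done += 1
        if cycles is None or done < cycles:
            time.sleep(update_interval)


if __name__ == "__main__":
    sample = [
        Item("db-1", True, {"alert": "true"}),
        Item("db-2", False, {"alert": "true"}),
        Item("db-3", False),  # untagged, so it is ignored
    ]
    run(lambda: sample, print, alert_tag="alert", update_interval=0, cycles=1)
```

In a real exporter, `publish` would update Prometheus gauges rather than print, and `fetch_items` would wrap the strongDM API call.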
bcebf42 to 4512734
Thank you very much @cigoldstein
This is really good stuff. Many other customers have asked for the same functionality, so it is definitely very useful!
We left some comments for your consideration. Once you make the changes, we'll be notified and merge the code.
Thanks again for your contribution.
api_id = os.environ['SDM_API_ID']
api_secret = os.environ['SDM_API_SECRET']
Rename:
- SDM_API_ID to SDM_API_ACCESS_KEY
- SDM_API_SECRET to SDM_API_SECRET_KEY
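A possible shape for reading the renamed variables is sketched below. The `load_credentials` helper and its error message are hypothetical and not part of the reviewed script; the sketch only assumes the two variable names requested in this comment.

```python
import os


def load_credentials(env=os.environ):
    """Read the renamed API credentials, failing early with a clear message."""
    names = ("SDM_API_ACCESS_KEY", "SDM_API_SECRET_KEY")
    # Treat unset and empty values the same way (an assumption of this sketch).
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env["SDM_API_ACCESS_KEY"], env["SDM_API_SECRET_KEY"]
```

Compared with indexing `os.environ` directly, this reports every missing variable at once instead of raising a bare `KeyError` on the first one.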
# node health is returned as "started" or "stopped"
# make sure that the "state" value we received is expected
if node.state not in ("started", "stopped"):
We don't really care about any value other than "started", so not sure if we want to keep this validation. In case you really want to keep it, we need to include "new":
if node.state not in ("started", "stopped", "new"):
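One way to act on the point that only "started" matters is to drop the membership check and treat every other state as unhealthy. The sketch below assumes that reading; `node_gauge_value` is a hypothetical helper name, and the 0/1 convention follows the PR description.

```python
def node_gauge_value(state):
    """Map a node state to the gauge convention: 0 healthy, 1 unhealthy."""
    # Only "started" counts as healthy; "stopped", "new", and any state
    # added later all fall through to unhealthy without a code change.
    return 0 if state == "started" else 1
```

With this mapping, an unexpected state shows up as an unhealthy node instead of an error in the exporter.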
- Renamed API key/secret environment variables
- Removed validation check for states being "started" or "stopped"
- Added the "health" and "state" Prometheus labels for resources and nodes respectively to make it easy to see the current health in Prometheus
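To illustrate what a labelled sample looks like on the metrics endpoint, here is a small standard-library sketch that formats one line in the Prometheus text exposition format. The `render_sample` helper, the metric name `sdm_node_health`, and the `name` label are assumptions for illustration; only the "state" label comes from the change list above.

```python
def render_sample(metric, labels, value):
    """Format one sample in the Prometheus text exposition format."""
    def escape(text):
        # Label values escape backslash, double quote, and newline.
        return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    pairs = ",".join(f'{key}="{escape(val)}"' for key, val in labels.items())
    return f"{metric}{{{pairs}}} {value}"


if __name__ == "__main__":
    # Prints: sdm_node_health{name="gw-1",state="stopped"} 1
    print(render_sample("sdm_node_health", {"name": "gw-1", "state": "stopped"}, 1))
```

A client library would normally produce this output; the sketch only shows how the label makes the current state visible next to the 0/1 value.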
I've incorporated the comments above and also added some additional Prometheus labels. Thank you for accepting this into the contrib repo!
Nice job @cigoldstein! Thanks for the contribution, we're pretty sure it's going to be highly appreciated by the strongDM community.