On-Call/DevOps Assistant - Get a head start on fixing alerts with AI investigation
-
Updated
Jul 28, 2024 - Python
Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.
On-Call/DevOps Assistant - Get a head start on fixing alerts with AI investigation
OpenShift Guide. Learn about the Red Hat OpenShift Container Platform, Data Science, Code Ready Containers, Podman, Buildah, and Kubernetes.
Welcome To The World of DevOps. An ongoing & curated collection of awesome software, libraries, learning tutorials, tools and resources and cool stuff about DevOps.
The agent of Komlog, a PaaS for helping observability teams to better understand their systems.
Deterministic Subsetting as defined in the SRE book
An agent to monitor the uptime of multiple sites and response time without any external libraries.
Oncall Dashboard