A Chaos Engineering Platform for Kubernetes.
Updated May 6, 2024 - Go
Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.
An easy-to-use and powerful chaos engineering experiment toolkit. (An easy-to-use, powerful chaos experiment injection tool open-sourced by Alibaba.)
Litmus helps SREs and developers practice chaos engineering in a cloud-native way. Chaos experiments are published at the ChaosHub (https://hub.litmuschaos.io). Community notes are at https://hackmd.io/a4Zu_sH4TZGeih-xCimi3Q
Chaos testing, network emulation, and stress testing tool for containers
A chaos engineering platform for supporting the complete fault drill lifecycle.
The Skinny Distributed Lock Service
Endpoint monitoring and DNS failover agent written in Go
[INACTIVE] Terraform provider for Arachnys' Cabot. Create, manage, and manipulate status checks and alerts for services.
Keep Kubernetes Deployments up-to-date with the `latest` container images
Control health checks and toggle upstream node status in load balancers with ease.
Demo repository for "Go for Operations" workshop.
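A space for uploading the materials I have studied to build expertise while working as a DevOps engineer / SRE. These are personal study notes, but I hope they can serve as a useful reference.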
Maia is a CLI that allows you to execute remote commands on multiple machines at once.
A simple way to test a connection to memcached
Capstone project for Udacity's Cloud Native Application Architecture Nanodegree
External Node Classifier written in Go