A curated list of awesome site reliability tools, frameworks, libraries, best practices, and other resources. Inspired by awesome-python and awesome-go.
Please look over the contribution guidelines first.
If you see an entry that is no longer maintained or is not a good fit, please submit a pull request to improve this file. Thank you!
Build automation tools and systems
Compute resource management and job control
Tools for managing host-level and runtime configuration.
Logging and log analysis
Tools and systems for monitoring and instrumentation of services.
Orchestration tools and services
Testing tools and systems