jnoller/kubernaughty

kubernaughty

This is a collection of documentation, how-tos, tools and other information on identifying and debugging Kubernetes/container workload failures, along with performance and reliability considerations.

This investigation started with user-reported failures at the DNS, networking and application levels. Analysis showed, however, that the actual causes of these failures were severe resource saturation and contention, IO throttling, kernel panics, etc. For an overview, see Part 1: Summary.

Through the investigation, I've discovered a lack of operational / systems knowledge, tracking and general awareness of the worker nodes / Linux hosts that make up Kubernetes clusters (including filesystem incompatibility).

There are many gotchas, mud pits and blind spots in running distributed systems, and Kubernetes is no different. My goal here is to step through the past 20 years of my career, showing everyone my mistakes and what I learned from them.

Hopefully, this stuff helps you and your team.

This is an ongoing project / labor of love. It is not complete by any means.

Roadmap

Contents:

Screencasts

Kubernaughty 1: IO saturation and throttling

About

IO, resource contention notes, docs and tools
