# Kubernetes Resource Management: Complete Technical Summary

## The Core Concepts

| Concept | What It Does | Analogy | Technical Impact |
|---------|--------------|---------|------------------|
| **Requests** | Minimum guaranteed resources | Reserved seat on the node | Scheduler uses this for placement; counted against the node's allocatable capacity |
| **Limits** | Maximum allowed resources | Hard ceiling the pod cannot exceed | Memory: OOMKill if exceeded; CPU: throttled if exceeded |
| **LimitRange** | Namespace-level pod constraints | Department policy for all pods | Enforces min/max/defaults; rejects non-compliant pods |
| **ResourceQuota** | Namespace total budget | Department spending cap | Blocks new pods if total would exceed quota |

---

## The 3-Layer Defense System

```
CLUSTER LEVEL
     │
     ▼
NAMESPACE LEVEL ──► ResourceQuota (Total team budget)
     │              LimitRange    (Individual pod rules)
     │
     ▼
POD LEVEL      ──► requests (Guaranteed minimum)
                   limits   (Hard maximum)
```

---

## QoS Classes (Quality of Service)

| QoS Class | Definition | Stability | Use Case |
|-----------|------------|-----------|----------|
| **Guaranteed** | Every container sets CPU and memory, with requests = limits | Most stable, last to be evicted | Databases, stateful apps, critical services |
| **Burstable** | At least one container sets a request or limit, but the pod does not meet the Guaranteed criteria | Medium stability, can be evicted if exceeding requests | Web apps, microservices with traffic spikes |
| **BestEffort** | No requests or limits set on any container | Least stable, first to be killed | Interruptible batch jobs, non-critical tests |

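The classification above can be sketched in Python. This is a simplified model of the documented rules, not the kubelet's code: `parse_quantity` and `qos_class` are hypothetical helper names, each container is a plain dict shaped like the `resources` fragments in this document, and only a subset of quantity suffixes is handled. A request left unset is treated as equal to its limit, and anything between BestEffort and Guaranteed is reported as Burstable.

```python
from decimal import Decimal

# Multipliers for a subset of Kubernetes quantity suffixes (assumed sufficient here).
_SUFFIXES = {
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30, "Ti": Decimal(2) ** 40,
    "m": Decimal("0.001"), "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6, "G": Decimal(10) ** 9,
}


def parse_quantity(text):
    """Convert a quantity such as "250m" or "1Gi" to a Decimal base value."""
    text = str(text).strip()
    for suffix, factor in _SUFFIXES.items():
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * factor
    return Decimal(text)  # plain number, e.g. "2" CPUs


def qos_class(containers):
    """Classify a pod from the list of its containers (init containers included)."""
    any_set = False
    guaranteed = True
    for container in containers:
        resources = container.get("resources", {})
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            request = requests.get(name, limit)  # unset request falls back to the limit
            if limit is not None or request is not None:
                any_set = True
            if limit is None or parse_quantity(request) != parse_quantity(limit):
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


web = [{"resources": {"requests": {"memory": "512Mi", "cpu": "250m"},
                      "limits": {"memory": "1Gi", "cpu": "500m"}}}]
print(qos_class(web))  # Burstable
```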
---

## The Golden Rules

### Golden Rule #1: Always Set Resources
```yaml
# NEVER do this (BestEffort - will be killed first during pressure)
containers:
- name: bad
  image: nginx
  # No resources section!
```

### Golden Rule #2: Memory Must Have Limits
```yaml
# Memory is incompressible: exceeding the limit gets the container OOM-killed
# CPU is compressible: exceeding the limit gets the container throttled (slow, not dead)

# GOOD
resources:
  requests:
    memory: "512Mi"
    cpu: "250m"
  limits:
    memory: "1Gi"      # Memory ALWAYS needs a limit
    cpu: "500m"

---
# DANGEROUS
resources:
  requests:
    memory: "512Mi"
    cpu: "250m"
  limits:
    cpu: "500m"        # Missing memory limit! Pod can eat all node RAM
```
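To make "compressible" concrete: on Linux nodes a CPU limit is typically enforced through the CFS bandwidth controller as a runtime quota per scheduling period. The sketch below assumes the default 100 ms period and a single-threaded task that starts at a period boundary; `cfs_quota_us` and `wall_time_us` are hypothetical helper names, and real throttling behavior depends on the kernel and kubelet configuration.

```python
CFS_PERIOD_US = 100_000  # assumed default period: 100 ms


def cfs_quota_us(limit_millicores, period_us=CFS_PERIOD_US):
    """CPU time (microseconds) the container may use in each period."""
    return limit_millicores * period_us // 1000


def wall_time_us(work_us, limit_millicores, period_us=CFS_PERIOD_US):
    """Wall-clock time for one thread to finish work_us of CPU work."""
    if work_us <= 0:
        return 0
    # One thread can use at most one full period of CPU time per period.
    quota = min(cfs_quota_us(limit_millicores, period_us), period_us)
    full_periods = -(-work_us // quota) - 1  # ceil(work / quota) - 1
    return full_periods * period_us + (work_us - full_periods * quota)


print(cfs_quota_us(500))           # 50000: 50 ms of CPU per 100 ms period
print(wall_time_us(200_000, 500))  # 350000: 200 ms of work takes 350 ms
```

Under these assumptions the work still completes, only later, whereas exceeding a memory limit ends the container.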

### Golden Rule #3: Match App Types to Patterns

| App Type | Request vs Limit | Rationale |
|----------|-----------------|-----------|
| **Database** | requests = limits | Predictable performance, no bursting |
| **Web App (peak loads)** | requests < limits (2-3x ratio) | Burst during traffic, borrow idle capacity |
| **Batch Job** | requests << limits (5-10x ratio) | Opportunistic use of spare capacity |
| **Caching Layer** | requests < limits (with HPA) | Scales horizontally, individual pods can burst |

### Golden Rule #4: Use All Three Layers Together
```yaml
# Layer 1: Pod - defines actual needs
resources:
  requests:
    memory: "1Gi"
  limits:
    memory: "2Gi"

---
# Layer 2: LimitRange - enforces namespace policy
apiVersion: v1
kind: LimitRange
metadata:
  name: policy
spec:
  limits:
  - type: Container         # Required: what the constraints apply to
    max:
      memory: "4Gi"         # No container limit above 4Gi
    min:
      memory: "256Mi"       # No container request below 256Mi
    default:
      memory: "1Gi"         # Default limit if missing

---
# Layer 3: ResourceQuota - total namespace budget
apiVersion: v1
kind: ResourceQuota
metadata:
  name: budget
spec:
  hard:
    requests.memory: "20Gi"  # Total team request budget
    limits.memory: "40Gi"     # Total team limit budget
    pods: "25"                # Max pods
```
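How the layers interact at admission time can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the API server's logic: `admit`, `LIMIT_RANGE`, and `QUOTA` are hypothetical names, pods are assumed to have a single container, only memory is modeled, and values are written in MiB.

```python
# Memory values in MiB, copied from the LimitRange and ResourceQuota above.
LIMIT_RANGE = {"min": 256, "max": 4096, "default": 1024}
QUOTA = {"requests": 20 * 1024, "limits": 40 * 1024, "pods": 25}


def admit(request, limit, used):
    """Return (admitted, reason) for one single-container pod.

    request and limit are MiB, or None when the pod spec omits them;
    used holds the namespace's current "requests", "limits" and "pods".
    """
    # Layer 2: fill in defaults, then enforce min and max.
    if limit is None:
        limit = LIMIT_RANGE["default"]
    if request is None:
        request = limit
    if request < LIMIT_RANGE["min"]:
        return False, "LimitRange: request below min"
    if limit > LIMIT_RANGE["max"]:
        return False, "LimitRange: limit above max"
    if request > limit:
        return False, "request exceeds limit"
    # Layer 3: the namespace totals must stay within the quota.
    if used["pods"] + 1 > QUOTA["pods"]:
        return False, "ResourceQuota: pod count exceeded"
    if used["requests"] + request > QUOTA["requests"]:
        return False, "ResourceQuota: requests.memory exceeded"
    if used["limits"] + limit > QUOTA["limits"]:
        return False, "ResourceQuota: limits.memory exceeded"
    return True, "admitted"


empty = {"requests": 0, "limits": 0, "pods": 0}
print(admit(1024, 2048, empty))  # (True, 'admitted')
print(admit(1024, 8192, empty))  # (False, 'LimitRange: limit above max')
```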

---

## Common Problems & Solutions

| Problem | Symptom | Solution |
|---------|---------|----------|
| **Greedy Pod** | One pod consuming all node resources | Set **limits** + **ResourceQuota** at namespace level |
| **Starved Pod** | App needs 8GB at peak but limit is 6GB | Raise the memory limit above peak usage; use **VPA** recommendations to size requests and limits |
| **Noisy Neighbor** | Other team's pods slowing you down | **ResourceQuotas** per team + **PriorityClasses** |
| **Forgotten Resources** | Pods with no requests/limits get killed first | **LimitRange** with defaults |
| **Memory Leak** | Pod slowly eats memory until OOM | **Limits** contain blast radius + **livenessProbe** to restart |
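| **CPU Throttling** | App slow, seeing throttling metrics | Raise (or reconsider) CPU **limits**, since throttling is triggered by limits rather than requests; or scale horizontally with **HPA** |
| **Unschedulable Pods** | Pods stuck in Pending because no node has enough unreserved capacity | Compare pod requests with node allocatable capacity; lower requests or add nodes. Pods that would exceed a **ResourceQuota** are rejected at creation rather than left Pending |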

---

## The Resource Management Cheat Sheet

### When to Use What

```
Situation                          → Recommended Action
─────────────────────────────────────────────────────────────────────
New namespace for team             → Create LimitRange + ResourceQuota immediately
Critical database                  → Guaranteed QoS (requests = limits)
Web app with variable traffic      → Burstable QoS (requests < limits) + HPA
Batch job that can be interrupted  → BestEffort or low requests with high limits
Team constantly going over budget  → Tighter ResourceQuota + monitoring alerts
One app crashes during peak        → Check VPA recommendations, increase limits
Memory leak in production          → Limits to contain + fix code + faster livenessProbe
Node constantly under pressure     → Check for pods without limits, add LimitRange
```

### YAML Template Library

**1. Standard Web App Pattern**
```yaml
resources:
  requests:
    memory: "512Mi"
    cpu: "250m"
  limits:
    memory: "1Gi"
    cpu: "1"
```

**2. Critical Database Pattern**
```yaml
resources:
  requests:
    memory: "8Gi"
    cpu: "2"
  limits:
    memory: "8Gi"        # Equal to requests
    cpu: "2"             # Equal to requests
```

**3. Batch Job Pattern**
```yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "100m"
  limits:
    memory: "2Gi"         # Can burst 8x
    cpu: "2"              # Can burst 20x
```

**4. Comprehensive LimitRange**
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: production-policy
spec:
  limits:
  - max:
      memory: "8Gi"
      cpu: "4"
    min:
      memory: "256Mi"
      cpu: "100m"
    default:
      memory: "1Gi"
      cpu: "500m"
    defaultRequest:
      memory: "512Mi"
      cpu: "250m"
    maxLimitRequestRatio:
      memory: "2"        # Limit can't be more than 2x request
      cpu: "4"           # CPU can burst more aggressively
    type: Container
```
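One consequence worth checking: `maxLimitRequestRatio` applies to every container in the namespace, so some patterns in this library may not be admissible under this policy. A rough check, with patterns 1 to 3 transcribed by hand into MiB and millicores (`ratio_violations` is a hypothetical helper, and the real admission check covers more cases):

```python
# Maximum limit/request ratios from the LimitRange above.
MAX_RATIO = {"memory": 2, "cpu": 4}

# (request, limit) pairs: memory in MiB, CPU in millicores.
PATTERNS = {
    "web": {"memory": (512, 1024), "cpu": (250, 1000)},
    "database": {"memory": (8192, 8192), "cpu": (2000, 2000)},
    "batch": {"memory": (256, 2048), "cpu": (100, 2000)},
}


def ratio_violations(pattern):
    """List the resources whose limit exceeds the allowed multiple of the request."""
    violations = []
    for resource, (request, limit) in pattern.items():
        if limit > MAX_RATIO[resource] * request:
            violations.append(resource)
    return violations


for name, pattern in PATTERNS.items():
    print(name, ratio_violations(pattern))
# web []
# database []
# batch ['memory', 'cpu']
```

Under these assumptions the batch pattern (8x memory, 20x CPU) would be rejected in a namespace that uses this LimitRange, so batch workloads would need a looser ratio or a namespace of their own.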

**5. Team ResourceQuota**
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
spec:
  hard:
    requests.memory: "40Gi"
    requests.cpu: "20"
    limits.memory: "80Gi"
    limits.cpu: "40"
    persistentvolumeclaims: "10"
    pods: "30"
    configmaps: "50"
    secrets: "50"
    services: "20"
```
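A quota like this translates into a replica ceiling for each pod shape, which can be estimated with simple division. The sketch below assumes an otherwise empty namespace and single-container pods; `max_replicas` and `binding_constraint` are hypothetical helpers, and the pod shapes are the web and database patterns above converted to MiB and millicores.

```python
# Quota from the template above: memory in MiB, CPU in millicores.
QUOTA = {
    "requests_memory": 40 * 1024, "requests_cpu": 20_000,
    "limits_memory": 80 * 1024, "limits_cpu": 40_000,
    "pods": 30,
}

# What one pod of each pattern consumes from the quota.
WEB_POD = {"requests_memory": 512, "requests_cpu": 250,
           "limits_memory": 1024, "limits_cpu": 1000, "pods": 1}
DB_POD = {"requests_memory": 8192, "requests_cpu": 2000,
          "limits_memory": 8192, "limits_cpu": 2000, "pods": 1}


def max_replicas(quota, pod):
    """Largest number of identical pods that fits every quota entry."""
    return min(quota[key] // pod[key] for key in pod)


def binding_constraint(quota, pod):
    """Quota entry that runs out first for this pod shape."""
    return min(pod, key=lambda key: quota[key] // pod[key])


print(max_replicas(QUOTA, WEB_POD), binding_constraint(QUOTA, WEB_POD))  # 30 pods
print(max_replicas(QUOTA, DB_POD), binding_constraint(QUOTA, DB_POD))    # 5 requests_memory
```

Knowing which entry binds first helps decide whether to raise the pod count, the memory budget, or the CPU budget when a team asks for more room.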

---

## Quick Decision Flowchart

```
Is the pod critical? (DB, message queue, etc.)
├─ YES → Use Guaranteed QoS (requests = limits)
│        Monitor actual usage, adjust manually
│
└─ NO  → Is traffic variable?
         ├─ YES → Use Burstable QoS (requests < limits)
         │        Add HPA based on memory/CPU
         │        Consider VPA (not on the same CPU/memory metric HPA scales on)
         │
         └─ NO  → Is it a batch/background job?
                  ├─ YES → Use low requests, high limits
                  │        Can be BestEffort if interruptible
                  │
                  └─ NO  → Standard Burstable with 2x ratio
```

## The One-Page Summary

```
┌─────────────────────────────────────────────────────────────────┐
│                    KUBERNETES RESOURCE MANAGEMENT                │
├─────────────────────────────────────────────────────────────────┤
│                                                                   │
│  REQUESTS  = "I need this much guaranteed"                       │
│  LIMITS    = "I can't exceed this much"                          │
│                                                                   │
│  ┌─────────────────────────────────────────────────────────┐     │
│  │  GOLDEN RULES                                            │     │
│  ├─────────────────────────────────────────────────────────┤     │
│  │  ✓ EVERY container needs requests and limits            │     │
│  │  ✓ Memory MUST have limits (or pod can kill node)       │     │
│  │  ✓ Critical apps: requests = limits (Guaranteed QoS)    │     │
│  │  ✓ Web apps: requests < limits (Burstable QoS)          │     │
│  │  ✓ ALWAYS use LimitRange + ResourceQuota per namespace  │     │
│  └─────────────────────────────────────────────────────────┘     │
│                                                                   │
│  DEFENSE LAYERS:                                                  │
│  Level 1: Pod Spec          → "I need 1GB, max 2GB"             │
│  Level 2: LimitRange        → "No pod > 4GB in this namespace"  │
│  Level 3: ResourceQuota     → "Total team budget: 40GB"         │
│                                                                   │
│  REMEMBER:                                                        │
│  • Without limits: One greedy pod can starve others              │
│  • Without requests: Scheduler overcommits, pods get evicted     │
│  • Without LimitRange: Forgetful devs create BestEffort pods     │
│  • Without ResourceQuota: One team can consume the whole cluster │
│                                                                   │
└─────────────────────────────────────────────────────────────────┘
```