TykTechnologies · sedkis · Aug 18, 2025 · Aug 19, 2025 · Aug 19, 2025 · Aug 19, 2025
diff --git a/tyk-docs/content/planning-for-production/ensure-high-availability/health-check.md b/tyk-docs/content/planning-for-production/ensure-high-availability/health-check.md
@@ -1,33 +1,53 @@
 ---
-title: "Liveness Health Checks"
+title: "Health Checks"
 date: 2025-02-10
-keywords: ["health check", "liveness health check", "Tyk Gateway", "Tyk Dashboard", "MDCB", "load balancer", "Kubernetes liveness probe"]
-description: "How to set up liveness health checks for the Tyk Gateway to ensure high availability and monitor the status of components like Redis, Dashboard, and RPC."
+keywords: ["health check", "liveness health check", "readiness health check", "Tyk Gateway", "Tyk Dashboard", "MDCB", "load balancer", "Kubernetes liveness probe", "Kubernetes readiness probe"]
+description: "How to set up liveness and readiness health checks for the Tyk Gateway to ensure high availability and monitor the status of components like Redis, Dashboard, and RPC."
 aliases:
   - /tyk-rest-api/health-checking
 ---
 
-## Set Up Liveness Health Checks
+## Overview
 
-Health checks are extremely important in determining the status of an
-application - in this instance, the Tyk Gateway. Without them, it can be hard to
-know the actual state of the Gateway.
+Tyk Gateway provides two health check endpoints to help you monitor and manage your API gateway:
 
-Depending on your configuration, the Gateway could be using a few components:
+### Quick Reference
 
-- The Tyk Dashboard.
-- RPC
-- Redis (compulsory).
+| Endpoint | Purpose | When to Use | HTTP Response |
+|----------|---------|-------------|---------------|
+| `/hello` | **Liveness check** | Load balancers, basic monitoring | Always 200 OK |
+| `/ready` | **Readiness check** | Kubernetes, traffic routing decisions | 200 OK when ready, 503 when not |
 
-Any of these components could go down at any given point and it is useful to
-know if the Gateway is currently usable or not. A good usage of the health
-check endpoint is for the configuration of a load balancer to multiple instances of the Gateway or
-as a Kubernetes liveness probe.
+### Which endpoint should I use?
 
-The following component status will not be returned:
+- **Use `/hello` for**: Load balancers, basic uptime monitoring, general health checks
+- **Use `/ready` for**: Kubernetes readiness probes, deciding when to route traffic to a new Gateway instance
+
+## What Gets Monitored
+
+The health check endpoints monitor these critical Gateway dependencies:
+
+✅ **Monitored Components:**
+- **Redis** (required) - Data storage and caching
+- **Tyk Dashboard** (if configured) - API management interface  
+- **RPC connection** (for MDCB setups) - Multi-data center communication
+
+
+### Kubernetes Deployments
+```yaml
+# Liveness probe - restarts pod if Gateway process is dead
+livenessProbe:
+  httpGet:
+    path: /hello
+    port: 8080
+
+# Readiness probe - removes from service when not ready
+readinessProbe:
+  httpGet:
+    path: /ready
+    port: 8080
+```
 
-* MongDB or SQL
-* Tyk Pump
 
 {{< note success >}}
 **Note**  
@@ -73,173 +93,167 @@ The following status levels can be returned in the JSON response.
 
 - **fail**: Indicates that Redis AND the Tyk Dashboard are unavailable, and can indicate other failures. The impact is high (i.e. no configuration changes are available for API/policies/keys, no quotas are applied, and no analytics).
 
-## Configure health check
+## The `/ready` Endpoint (Readiness Check)
+
+Use this endpoint when you need to know if the Gateway is **actually ready** to handle API traffic.
+
+### What it checks
+- ✅ Redis is connected and working
+- ✅ APIs have been loaded successfully at least once
 
-By default, the liveness health check runs on the `/hello` path. But
-it can be configured to run on any path you want to set. For example:
+### How it responds
+- **Gateway is ready**: Returns `HTTP 200 OK` 
+- **Gateway is NOT ready**: Returns `HTTP 503 Service Unavailable`
 
+### When to use `/ready`
+- **Kubernetes readiness probes** - Removes pod from service when not ready
+- **Graceful termination** - Removes pod from service when Gateway is shutting down
+- **New deployments** - Wait for 200 response before routing traffic
+- **Automated scaling** - Verify new instances are ready before adding to pool
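+
+As a rough sketch of the "new deployments" case, a Kubernetes Deployment can pair the readiness probe with a rolling update that keeps existing pods in service until their replacements report ready. The container name, port 8080, and all timing values below are illustrative assumptions, not Tyk recommendations:
+
+```yaml
+# Fragment of a Kubernetes Deployment spec (names and values are assumptions to tune)
+spec:
+  strategy:
+    type: RollingUpdate
+    rollingUpdate:
+      maxUnavailable: 0   # keep existing pods until replacements are ready
+      maxSurge: 1         # bring up one extra pod at a time
+  template:
+    spec:
+      containers:
+        - name: tyk-gateway          # hypothetical container name
+          readinessProbe:
+            httpGet:
+              path: /ready
+              port: 8080
+            periodSeconds: 5         # how often Kubernetes polls /ready
+            failureThreshold: 3      # consecutive failed probes before the pod is marked not ready
+```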
+
+### Configuration
+The endpoint runs on `/ready` by default. To change it:
 
 ```yaml
-health_check_endpoint_name: "status"
+readiness_check_endpoint_name: "status-ready"
 ```
 
-This configures the health check to run on `/status` instead of `/hello`.
+[Config Ref](https://tyk.io/docs/tyk-oss-gateway/configuration/#readiness_check_endpoint_name)
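+
+If the endpoint is renamed, any probe that targets it needs the same path. A minimal sketch, assuming the custom name is served at `/status-ready` on port 8080 in the same way the default is served at `/ready`:
+
+```yaml
+# Readiness probe pointing at the renamed endpoint (path and port are assumptions)
+readinessProbe:
+  httpGet:
+    path: /status-ready
+    port: 8080
+```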
 
-**Refresh Interval**
+## The `/hello` Endpoint (Liveness Check)
 
-The Health check endpoint will refresh every 10 seconds.
+Use this endpoint for basic health monitoring and load balancer health checks. It returns 200 once the Gateway has started, whether the Gateway is still working toward a stable state or has already reached one.
 
-**HTTP error code**
-The Health check endpoint will always return a `HTTP 200 OK` response if the polled health check endpoint is available on your Tyk Gateway. If `HTTP 200 OK` is not returned, your Tyk Gateway is in an error state.
+### How it responds
+- **Always returns `HTTP 200 OK`**, even when components are failing
+- **Check the response body** to see which components are healthy or failing
 
+### When to use `/hello`
+- **Load balancers** - Route traffic to instances that respond
+- **Basic monitoring** - Simple uptime checks
+- **MDCB setups** - Monitor both Management and Worker Gateways
 
-For MDCB installations the `/hello` endpoint can be polled in either your Management or Worker Gateways. It is recommended to use the `/hello` endpoint behind a load balancer for HA purposes.
+### Configuration
+The endpoint runs on `/hello` by default. To change it:
 
-## Health check examples
+```yaml
+health_check_endpoint_name: "status"
+```
 
-The following examples show how the Health check endpoint returns
+[Config Ref](https://tyk.io/docs/tyk-oss-gateway/configuration/#health_check_endpoint_name)
 
+### Important Notes
+- **Updates every 10 seconds** - Health status is cached and refreshed automatically
+- **Always responds with 200** - Even when Redis or Dashboard are down (check response body for details)
+- **Use for load balancers** - Perfect for HAProxy, NGINX, AWS ALB health checks
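+
+Since `/hello` answers with 200 whenever the Gateway responds at all, a liveness probe on it mainly detects a hung or dead process rather than a failing dependency. One possible probe definition, with timing values that are assumptions to tune for your environment:
+
+```yaml
+# Liveness probe with explicit timing (all values are illustrative assumptions)
+livenessProbe:
+  httpGet:
+    path: /hello
+    port: 8080
+  periodSeconds: 10     # chosen here to line up with the 10-second status refresh
+  timeoutSeconds: 2     # treat a slow response as a failed probe
+  failureThreshold: 3   # restart only after three consecutive failed probes
+```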
 
-### Pass Status
+## Testing the Health Check Endpoints
 
-The following is returned for a `pass` status level for the Open Source Gateway:
+### Quick Health Check
+```bash
+# Check if Gateway is alive (always returns 200)
+curl http://localhost:8080/hello
 
+# Check if Gateway is ready to serve traffic
+curl http://localhost:8080/ready
 ```
-$ http :8080/hello
+
+### `/ready` Endpoint Examples
+
+**✅ Gateway is ready** (returns `HTTP 200 OK`):
+```bash
+$ curl -i http://localhost:8080/ready
 HTTP/1.1 200 OK
-Content-Length: 156
-Content-Type: application/json
-Date: Wed, 14 Apr 2021 17:36:09 GMT
 
 {
+  "status": "pass",
   "description": "Tyk GW",
   "details": {
-    "redis": {
-      "componentType": "datastore",
-      "status": "pass",
-      "time": "2021-04-14T17:36:03Z"
-    }
-  },
-  "status": "pass",
-  "version": "v3.1.1"
+    "redis": { "status": "pass" }
+  }
 }
 ```
 
-### Redis outage
-
-```
-$ http :8080/hello
-HTTP/1.1 200 OK
-Content-Length: 303
-Content-Type: application/json
-Date: Wed, 14 Apr 2021 14:58:06 GMT
+**❌ Gateway is NOT ready** (returns `HTTP 503 Service Unavailable`):
+```bash
+$ curl -i http://localhost:8080/ready  
+HTTP/1.1 503 Service Unavailable
 
 {
-  "description": "Tyk GW",
+  "status": "fail",
+  "description": "Tyk GW", 
   "details": {
-    "dashboard": {
-      "componentType": "system",
-      "status": "pass",
-      "time": "2021-04-14T14:58:03Z"
-    },
-    "redis": {
-      "componentType": "datastore",
-      "output": "storage: Redis is either down or was not configured",
+    "redis": { 
       "status": "fail",
-      "time": "2021-04-14T14:58:03Z"
+      "output": "Redis is down or not configured"
     }
-  },
-  "status": "warn",
-  "version": "v3.1.2"
+  }
 }
 ```
 
-### Dashboard outage
-
-```
-$ http :8080/hello
-HTTP/1.1 200 OK
-Content-Length: 292
-Content-Type: application/json
-Date: Wed, 14 Apr 2021 15:52:47 GMT
+### `/hello` Endpoint Examples
 
+**✅ All systems healthy** (always returns `HTTP 200 OK`):
+```bash
+$ curl http://localhost:8080/hello
 {
+  "status": "pass",
   "description": "Tyk GW",
   "details": {
-    "dashboard": {
-      "componentType": "system",
-      "output": "dashboard is down? Heartbeat is failing",
-      "status": "fail",
-      "time": "2021-04-14T15:52:43Z"
-    },
-    "redis": {
-      "componentType": "datastore",
-      "status": "pass",
-      "time": "2021-04-14T15:52:43Z"
-    }
-  },
-  "status": "warn",
-  "version": "v3.1.2"
+    "redis": { "status": "pass" },
+    "dashboard": { "status": "pass" }
+  }
 }
 ```
-### Dashboard and Redis outage
-
-```
-$ http :8080/hello
-HTTP/1.1 200 OK
-Content-Length: 354
-Content-Type: application/json
-Date: Wed, 14 Apr 2021 17:53:33 GMT
 
+**⚠️ Redis is down** (still returns `HTTP 200 OK`):
+```bash
+$ curl http://localhost:8080/hello  
 {
+  "status": "warn",
   "description": "Tyk GW",
   "details": {
-    "dashboard": {
-      "componentType": "system",
-      "output": "dashboard is down? Heartbeat is failing",
+    "redis": { 
       "status": "fail",
-      "time": "2021-04-14T17:53:33Z"
+      "output": "Redis is down or not configured" 
     },
-    "redis": {
-      "componentType": "datastore",
-      "output": "storage: Redis is either down or was not configured",
-      "status": "fail",
-      "time": "2021-04-14T17:53:33Z"
-    }
-  },
-  "status": "fail",
-  "version": "v3.1.2"
+    "dashboard": { "status": "pass" }
+  }
 }
 ```
 
-
-### MDCB Worker Gateway RPC outage
-
-```
-$  http :8080/hello
-HTTP/1.1 200 OK
-Content-Length: 333
-Content-Type: application/json
-Date: Wed, 14 Apr 2021 17:21:24 GMT
-
+**❌ Multiple components down** (still returns `HTTP 200 OK`):
+```bash
+$ curl http://localhost:8080/hello
 {
+  "status": "fail", 
   "description": "Tyk GW",
   "details": {
-    "redis": {
-      "componentType": "datastore",
-      "output": "storage: Redis is either down or was not configured",
-      "status": "fail",
-      "time": "2021-04-14T17:21:16Z"
-    },
-    "rpc": {
-      "componentType": "system",
-      "output": "Could not connect to RPC",
-      "status": "fail",
-      "time": "2021-04-14T17:21:16Z"
-    }
-  },
-  "status": "fail",
-  "version": "v3.1.2"
+    "redis": { "status": "fail" },
+    "dashboard": { "status": "fail" }
+  }
 }
 ```
 
+## Troubleshooting with Health Checks
+
+### Understanding Status Levels
+
+| Status | Meaning | What to do |
+|--------|---------|------------|
+| `pass` | All components healthy | ✅ Gateway is working normally |
+| `warn` | Some components down | ⚠️ Gateway works but with reduced functionality |
+| `fail` | Critical components down | ❌ Gateway may not work properly |
+
+### Common Issues
+
+**Redis connection failed**:
+- Check Redis is running: `redis-cli ping`
+- Verify connection settings in Gateway config
+- Check network connectivity to Redis
+
+**Dashboard connection failed**:
+- Verify Dashboard is running and accessible
+- Check Dashboard URL in Gateway config
+- Test connectivity: `curl http://dashboard:3000/hello`
+