
bug: Health check state lost and checker not working after upstream node changes #13282

@solosky


Current Behavior

When upstream nodes change (e.g., Kubernetes pod scaling, service discovery update, or DNS resolution change), the health checker has two critical issues:

  1. Health check status lost: previously detected unhealthy nodes are reset to healthy after the node change
  2. Health check not running: the health checker may stop probing entirely after the node change

Root Cause

APISIX uses a full destroy-and-rebuild strategy for health checkers when upstream nodes change. The core flow is:

  1. Node change → _nodes_ver increments → resource_version changes
  2. fetch_checker() detects version mismatch → adds to waiting_pool, returns nil (no checker during this period)
  3. Timer (1s interval) destroys old checker → calls delayed_clear() to clear all health status from shared dict
  4. Creates brand new checker → all nodes start as healthy
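
For illustration, here is a minimal, hypothetical sketch of the version-check path described above. It is not the actual healthcheck_manager.lua code; the pool shapes and local names are assumptions, only fetch_checker, waiting_pool, and resource_version come from the flow above.

```lua
-- Simplified sketch (not the real healthcheck_manager.lua) of how a version
-- mismatch leaves requests with no checker until the timer rebuilds one.
local working_pool = {}   -- resource_key -> { checker = ..., version = ... }
local waiting_pool = {}   -- resource_key -> resource_version pending rebuild

local function fetch_checker(resource_key, resource_version)
    local item = working_pool[resource_key]
    if item and item.version == resource_version then
        return item.checker
    end

    -- version mismatch: queue a rebuild and return nil, so requests in this
    -- window are balanced without any health information
    waiting_pool[resource_key] = resource_version
    return nil
end
```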

Impact

  • Traffic routed to unhealthy nodes during the window between checker rebuild and next active check cycle
  • Removed nodes remain in the health checker's target list, consuming resources and potentially affecting health check results
  • In high-frequency node change scenarios, the checker may never be successfully created due to version race conditions

Suggested Fix

Implement incremental target update instead of full rebuild:

  1. When nodes change but checks config remains the same, only add/remove targets on the existing checker
  2. Use target.hostname from get_target_list() when calling remove_target() to ensure the correct target is matched
  3. Only do full rebuild when checks configuration changes
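
As a rough sketch of what the incremental update could look like, assuming a lua-resty-healthcheck-style checker API (get_target_list(), add_target(), remove_target()) and APISIX node tables with host/port fields. The function name update_checker_targets matches the proposal above, but the target field names and call signatures here are assumptions, not the final patch:

```lua
-- Incrementally reconcile the checker's targets with the new node list,
-- keeping the health status of nodes that did not change.
local function update_checker_targets(checker, new_nodes)
    -- index desired nodes by "ip:port"
    local desired = {}
    for _, node in ipairs(new_nodes) do
        desired[node.host .. ":" .. node.port] = node
    end

    -- remove targets that are no longer present, using the hostname stored
    -- on the target so the correct entry is matched
    for _, target in ipairs(checker:get_target_list()) do
        local key = target.ip .. ":" .. target.port
        if not desired[key] then
            checker:remove_target(target.ip, target.port, target.hostname)
        else
            desired[key] = nil   -- already tracked; its health status is kept
        end
    end

    -- add only the genuinely new targets (starting as healthy)
    for _, node in pairs(desired) do
        checker:add_target(node.host, node.port, node.hostname or node.host, true)
    end
end
```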

Key changes in healthcheck_manager.lua:

  • Add update_checker_targets(): incrementally adds new targets and removes stale ones
  • Add checks_config_equal(): compares checks config to decide incremental vs full rebuild
  • Fix remove_target() hostname: use stored target.hostname instead of checks.active.host
  • Save checks config in working pool for later comparison
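
checks_config_equal() could be as simple as a deep compare of the old and new checks tables. The sketch below is illustrative only; the actual patch would likely reuse APISIX's existing table utilities instead of a hand-rolled helper:

```lua
-- Deep-compare two plain Lua tables (sufficient for the checks config).
local function deep_equal(a, b)
    if a == b then
        return true
    end
    if type(a) ~= "table" or type(b) ~= "table" then
        return false
    end
    for k, v in pairs(a) do
        if not deep_equal(v, b[k]) then
            return false
        end
    end
    for k in pairs(b) do
        if a[k] == nil then
            return false
        end
    end
    return true
end

-- Decide between incremental target update and full checker rebuild.
local function checks_config_equal(old_checks, new_checks)
    return deep_equal(old_checks, new_checks)
end
```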

Expected Behavior

  1. Existing nodes should retain their health status (healthy/unhealthy) when nodes are added/removed
  2. Removed nodes should be properly cleaned up from the health checker
  3. Health checking should not have gaps during node changes

Error Logs

No response

Steps to Reproduce

  1. Create a route with health check enabled and multiple upstream nodes
  2. Wait for one node to be detected as unhealthy
  3. Add a new node to the upstream (or trigger service discovery update)
  4. Observe that the previously unhealthy node resets to healthy
  5. Check health checker target list — removed nodes may still be present

Environment

  • APISIX version: 3.16.0
  • lua-resty-healthcheck-api7: 3.2.1-0
