indexer-agent reconciliation loop overrides manual graphman pause #1171

@madumas

Summary

When using graphman pause <deployment> to manually pause a subgraph, the indexer-agent's reconcileDeployments loop automatically resumes it within minutes by calling subgraph_deploy on the graph-node JSON-RPC admin API. This makes it impossible to keep a subgraph paused for maintenance operations such as graphman rewind.

Steps to reproduce

  1. graphman pause <IPFS_HASH>
  2. Wait 2-5 minutes (one reconciliation cycle)
  3. graphman info <IPFS_HASH> --status → shows Paused: false

The affected subgraph has decisionBasis: always in its indexer rules, so it is always part of the agent's target deployments.

Expected behavior

The reconciliation loop should check the paused_at field before calling subgraph_deploy. If a deployment was explicitly paused via graphman, the agent should not resume it.

Impact

  • Cannot perform graphman rewind safely — the subgraph resumes indexing during the rewind, causing data inconsistencies
  • Any maintenance operation requiring a temporary stop is compromised
  • Operators must resort to workarounds like reassigning to a non-existent node (e.g. graphman reassign <hash> maintenance_node_0) to prevent the agent from resuming the subgraph

Root cause

In packages/indexer-agent/src/agent.ts, the reconcileDeployments function calculates target deployments based on indexer rules and calls this.graphNode.ensure() for each one. There is no check for whether a deployment is currently paused before calling ensure. If the deployment is in the target list (active allocation or decisionBasis: always/offchain), it will be re-deployed, which implicitly resumes it.

Suggested fix

Before calling ensure for a deployment, check if it is currently paused (paused_at IS NOT NULL). If paused, skip the ensure call. Optionally, add a --force flag to graphman pause or a new indexer rule field (e.g. maintenancePause: true) that the agent explicitly respects.
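
A minimal sketch of the suggested check, not the agent's actual code. The DeploymentStatus and GraphNodeLike types, the pausedAt field name, and both function names are assumptions made for illustration. The real ensure call in agent.ts may take different arguments, and how the agent would read paused_at from graph-node is left open here.

```typescript
// Hypothetical shape: the real agent has its own deployment and status types.
interface DeploymentStatus {
  id: string; // deployment IPFS hash
  pausedAt: Date | null; // mirrors paused_at; null when the deployment is not paused
}

interface EnsurePlan {
  toEnsure: string[]; // targets that are safe to (re)deploy
  skipped: string[]; // targets left alone because an operator paused them
}

// Split the target deployments into those to ensure and those to skip.
// Targets with no status entry are treated as not paused.
function planEnsure(
  targets: string[],
  statuses: DeploymentStatus[],
): EnsurePlan {
  const paused = new Set(
    statuses.filter((s) => s.pausedAt !== null).map((s) => s.id),
  );
  const plan: EnsurePlan = { toEnsure: [], skipped: [] };
  for (const id of targets) {
    (paused.has(id) ? plan.skipped : plan.toEnsure).push(id);
  }
  return plan;
}

// Hypothetical stand-in for the graph-node client the agent calls.
interface GraphNodeLike {
  ensure(id: string): Promise<void>;
}

// Reconcile only the deployments that are not paused.
async function reconcileRespectingPause(
  graphNode: GraphNodeLike,
  targets: string[],
  statuses: DeploymentStatus[],
): Promise<EnsurePlan> {
  const plan = planEnsure(targets, statuses);
  for (const id of plan.toEnsure) {
    await graphNode.ensure(id);
  }
  return plan;
}
```

Keeping the filtering in a pure function separate from the ensure calls would make the skip decision easy to unit test and to log, so operators can see which deployments were left paused in each reconciliation cycle.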

Current workaround

Reassign the subgraph to a non-existent node before maintenance:

graphman pause <IPFS_HASH>
graphman reassign <IPFS_HASH> maintenance_node_0
# perform maintenance (rewind, reindex, etc.)
graphman reassign <IPFS_HASH> index_node_0
graphman resume <IPFS_HASH>

The indexer-agent cannot resume the subgraph because maintenance_node_0 does not exist.
