ichichuang/hermes-old

Hermes

Hermes is a local orchestration runtime for the ~/.hermes workspace. The current documented operating surface is the implemented upgrade control plane: a single-host, persistence-backed coordination system that plans, gates, executes, observes, and replays repository upgrade cycles.

Hermes should not be described as a generic AI framework. In the current repository state, the control-plane path is infrastructure code with explicit state files, deterministic decision hashing, lease semantics, remote federation, rollback planning, and event-sourced observability.

Current Maturity

single-host deterministic control-plane prototype
  with runtime intelligence,
  lease-backed coordination,
  federation-aware remote selection,
  integrity-gated execution,
  event-sourced replay,
  and real cron-path simulations

This is not a distributed consensus system. Hermes currently does not implement Raft, cross-host quorum, multi-host lease replication, autonomous remote provisioning, or a self-healing distributed mesh.

Source Of Truth

The current control-plane behavior is implemented in:

  • hermes-agent/cron/scheduler.py
  • scripts/auto_safe_upgrade.py
  • hermes-agent/hermes_cli/observability_events.py
  • hermes-agent/hermes_cli/observability_graph.py
  • hermes-agent/hermes_cli/observability_api.py
  • scripts/runtime_intelligence_simulation.py
  • config/runtime_upgrade_policies.json

Historical docs/evolution/*, older README/* sections, and broad core/* narratives are useful for archaeology, but they are not the source of truth for the implemented upgrade control plane.

Architecture Layers

cron.scheduler.tick()
  -> due-job selection and .tick.lock
  -> script path preflight and structured argv execution
  -> scripts/auto_safe_upgrade.py run_two_phase_cycle()
  -> repo probe and remote failure classification
  -> GlobalOrchestrationPlanner (plan_upgrade_queue)
  -> AgentCoordinator reservations and leases
  -> runtime federation topology
  -> RuntimeTopologyGraph snapshot
  -> control-plane integrity gate
  -> execute / defer / block / rollback planning
  -> HermesEvent ledger
  -> EventLedger -> GraphProjectionEngine -> GraphSnapshot
  -> state/*.json current-control-plane views

Scheduler

cron.scheduler.tick() is the runtime entrypoint for scheduled upgrade jobs. It acquires a local tick lock, resolves due jobs, validates script paths before execution, preserves script_args as structured argv, records truthful success or failure status, and appends runtime events when configured.
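
The lock, preflight, and structured-argv steps described above can be sketched roughly as follows. This is a minimal illustration, not the code in hermes-agent/cron/scheduler.py: the function names acquire_tick_lock, release_tick_lock, and run_job, the exclusive-create locking strategy, and the returned status dictionary are all assumptions made for the example.

```python
from __future__ import annotations

import os
import subprocess
import sys
from pathlib import Path


def acquire_tick_lock(lock_path: Path) -> bool:
    """Create the lock file atomically; return False if it already exists."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def release_tick_lock(lock_path: Path) -> None:
    """Remove the lock file so the next tick can run."""
    lock_path.unlink(missing_ok=True)


def run_job(script: Path, script_args: list[str]) -> dict:
    """Preflight the script path, then run it with argv kept as a list."""
    if not script.is_file():
        # Report the failure honestly instead of attempting execution.
        return {"status": "failed", "reason": "script_not_found"}
    # argv stays a list: no shell, and no re-splitting of script_args.
    proc = subprocess.run(
        [sys.executable, str(script), *script_args],
        capture_output=True,
        text=True,
    )
    status = "ok" if proc.returncode == 0 else "failed"
    return {"status": status, "returncode": proc.returncode, "stdout": proc.stdout}
```

Passing script_args as a list means an argument such as "a b" reaches the script as one argv entry. A lock file created this way can go stale if the process dies before releasing it; how the real scheduler handles that case is not covered by this sketch.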

Upgrade Runtime

scripts/auto_safe_upgrade.py owns the two-phase cycle:

probe
  -> plan
  -> build topology
  -> validate integrity
  -> execute, defer, or block
  -> finalize state

It handles repository inspection, remote degradation, risk and strategy selection, lease-backed coordination, canary checks, guarded apply, rollback planning, summary persistence, and control-plane artifact writes.
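
As a rough illustration of how the cycle could reach one of its three outcomes, the sketch below walks the phases in order and picks execute, defer, or block. The function name run_cycle_sketch, the two boolean inputs, and the rule that a degraded remote leads to a deferral are assumptions for the example; the real decision logic in scripts/auto_safe_upgrade.py weighs more signals.

```python
from dataclasses import dataclass, field


@dataclass
class CycleResult:
    decision: str = "pending"
    phases: list = field(default_factory=list)


def run_cycle_sketch(remote_healthy: bool, integrity_ok: bool) -> CycleResult:
    """Walk the phases in order and record the outcome (hypothetical rules)."""
    result = CycleResult()
    # Steps that run before any outcome is chosen.
    for phase in ("probe", "plan", "build_topology", "validate_integrity"):
        result.phases.append(phase)
    if not integrity_ok:
        # An integrity failure must never lead to repository mutation.
        result.decision = "block"
    elif not remote_healthy:
        # Assumed rule: a degraded remote postpones the upgrade.
        result.decision = "defer"
    else:
        result.decision = "execute"
    result.phases.append(result.decision)
    # State is finalized whatever the outcome was.
    result.phases.append("finalize_state")
    return result
```

In this sketch the integrity check takes precedence over every other input, which mirrors the documented rule that integrity failures block mutation.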

Runtime Intelligence

Runtime intelligence is persisted local memory, not an LLM planner. It records repo stability, rollback rate, failure probability, risk hours, confidence, and remote health signals. The planner consumes these signals together with current resource telemetry and policy configuration to stabilize execution decisions.

Coordination

AgentCoordinator semantics are implemented by reservation and lease functions in scripts/auto_safe_upgrade.py. The model borrows distributed-coordination concepts (reservations, leases, conflict rules), but the implementation runs on a single host:

  • one active lease per repo key
  • active unexpired foreign leases win conflicts
  • dependencies block downstream repos until prerequisites are granted
  • shared remote budgets limit concurrent use of the same remote
  • resource pressure can reduce parallelism or defer all lanes

Leases are persisted in state/coordination_leases.json.
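
The first two lease rules above can be sketched as a single acquisition check. This is an illustrative sketch only: the function name try_acquire_lease, the owner and expires_at fields, and the in-memory dictionary standing in for state/coordination_leases.json are assumptions, not the actual schema.

```python
from __future__ import annotations

import time


def try_acquire_lease(
    leases: dict,
    repo_key: str,
    owner: str,
    ttl_seconds: float,
    now: float | None = None,
) -> bool:
    """Grant a lease unless another owner holds an unexpired one."""
    now = time.time() if now is None else now
    current = leases.get(repo_key)
    if current and current["owner"] != owner and current["expires_at"] > now:
        # An active, unexpired foreign lease wins the conflict.
        return False
    # One entry per repo key, so at most one active lease exists per repo.
    leases[repo_key] = {"owner": owner, "expires_at": now + ttl_seconds}
    return True
```

Keying the mapping by repo key is what enforces uniqueness here: a new grant overwrites the expired entry rather than adding a second one. Dependency blocking, remote budgets, and resource pressure are left out of the sketch.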

Federation

Runtime federation selects among primary, origin, mirror, cache, and fallback remotes. Promotion is logical: Hermes records the healthier selected endpoint in federation state, but it does not rewrite git remote configuration or provision new remotes.
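
Logical promotion can be pictured as choosing a role and recording the choice, without touching git configuration. In the sketch below, the numeric health scores, the min_score threshold, the preference order, and the function name select_remote are all assumptions for illustration; the source only states which remote roles exist and that the healthier endpoint is recorded in federation state.

```python
from __future__ import annotations

# Assumed preference order, taken from the order the roles are listed in.
ROLE_ORDER = ["primary", "origin", "mirror", "cache", "fallback"]


def select_remote(health: dict[str, float], min_score: float = 0.5) -> dict:
    """Pick the first healthy role in preference order, else the healthiest."""
    candidates = [role for role in ROLE_ORDER if role in health]
    if not candidates:
        return {"selected": None, "promoted": False}
    for role in candidates:
        if health[role] >= min_score:
            selected = role
            break
    else:
        # Nothing meets the threshold: fall back to the best available score.
        selected = max(candidates, key=lambda role: health[role])
    # The result is only a state record; no git remote is rewritten.
    return {"selected": selected, "promoted": selected != "primary"}
```

The returned dictionary is the kind of record that could be written to a federation state file, leaving the repository's configured remotes unchanged.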

Observability And Replay

Runtime events flow through:

HermesEvent
  -> EventLedger(state/runtime_events.jsonl)
  -> observability_graph.project()
  -> GraphSnapshot / GraphPatch
  -> observability_api readers

The graph projection is derived from the event ledger. It is not a second mutable source of historical truth.
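
The relationship between the ledger and the projection can be sketched as an append-only JSONL file plus a pure replay function. The event fields (repo, status) and the function names append_event and project_snapshot are hypothetical; the real HermesEvent schema and observability_graph.project() signature are not shown in this document.

```python
import json
from pathlib import Path


def append_event(ledger: Path, event: dict) -> None:
    """Append one event as a JSON line; existing lines are never rewritten."""
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")


def project_snapshot(ledger: Path) -> dict:
    """Rebuild the snapshot by replaying every event from the start."""
    nodes = {}
    for line in ledger.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        # Later events for the same repo overwrite earlier status values.
        nodes[event["repo"]] = event["status"]
    return {"nodes": nodes}
```

Because project_snapshot reads only the ledger and keeps no state of its own, replaying the same file always produces the same snapshot, which is the property that makes the projection a derived view.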

Deterministic Guarantees

Hermes provides bounded deterministic guarantees for the local host and the current persisted state set:

  • planner ordering is stable by dependency depth, recommendation priority, risk score, and repo name
  • decision_hash is derived from normalized payloads with volatile fields removed
  • granted reservations must have matching active leases
  • active leases are unique by repo key
  • integrity failures block repository mutation
  • replay graph state is projected from the append-only event stream

These guarantees do not prove correctness across multiple machines.
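
The ordering and hashing guarantees can be illustrated with a short sketch. The set of volatile field names, the use of SHA-256 over canonical JSON, the ascending sort direction, and the record keys (depth, priority, risk, name) are assumptions for the example and may differ from the actual implementation.

```python
from __future__ import annotations

import hashlib
import json

# Hypothetical volatile fields; the real list lives in the Hermes source.
VOLATILE_FIELDS = {"timestamp", "pid", "duration_ms"}


def normalize(value):
    """Drop volatile keys recursively so equal decisions serialize equally."""
    if isinstance(value, dict):
        return {
            key: normalize(item)
            for key, item in value.items()
            if key not in VOLATILE_FIELDS
        }
    if isinstance(value, list):
        return [normalize(item) for item in value]
    return value


def decision_hash(payload: dict) -> str:
    """Hash the canonical JSON form of the normalized payload."""
    canonical = json.dumps(normalize(payload), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_order(repos: list[dict]) -> list[str]:
    """Sort by depth, priority, risk, then name so ties resolve the same way."""
    ordered = sorted(
        repos, key=lambda r: (r["depth"], r["priority"], r["risk"], r["name"])
    )
    return [r["name"] for r in ordered]
```

Ending the sort key with the repo name gives every repo a distinct key, so the resulting order does not depend on input order. Serializing with sorted keys does the same for the hash: two payloads that differ only in key order or volatile fields produce the same digest.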

Persisted State

Primary control-plane files live under state/:

  • auto_safe_upgrade_summary.json
  • auto_safe_upgrade_state.json
  • auto_safe_upgrade_state_history.jsonl
  • remote_health.json
  • runtime_intelligence.json
  • runtime_resource_state.json
  • runtime_resource_telemetry.json
  • agent_coordination_state.json
  • coordination_leases.json
  • global_orchestration_plan.json
  • runtime_consensus_state.json
  • runtime_topology_graph.json
  • remote_federation_state.json
  • federation_topology_state.json
  • control_plane_integrity.json
  • rollback_intelligence.json
  • live_runtime_control_plane.json
  • runtime_events.jsonl

See docs/runtime_state_files.md for the complete map and ownership notes.

Documentation Map

Start here:

  1. docs/README.md
  2. docs/control_plane_architecture.md
  3. docs/pipeline.md
  4. docs/distributed_coordination_semantics.md
  5. docs/runtime_topology_graph.md
  6. docs/runtime_federation.md
  7. docs/replay_and_determinism.md
  8. docs/control_plane_integrity.md
  9. docs/rollback_intelligence.md
  10. docs/runtime_simulation_lab.md

Operator entries:

Explicit Limits

Hermes currently does not provide:

  • full Raft or quorum consensus
  • cross-host lock replication
  • multi-host split-brain recovery
  • autonomous mirror or cache provisioning
  • automatic distributed worker placement
  • a guaranteed always-on dashboard service
  • complete telemetry coverage for every runtime subsystem

The implemented architecture is precise but bounded: deterministic local coordination over persisted state.
