gen-graph — accessor-based graph query combinators for Nix

Pure graph query combinators for Nix. Queries take accessor functions as arguments — not node maps. The graph structure is supplied by the caller; gen-graph only answers questions about it.

Overview

gen-graph works with an accessor record: an attrset of functions that the caller provides to describe graph structure. Queries destructure only the accessors they need.

# Define accessors over your data
g = {
  edges    = id: myData.${id}.deps or [];      # id → [id]
  parent   = id: myData.${id}.parent or null;  # id → id | null
  nodes    = builtins.attrNames myData;         # [id]
  nodeData = id: myData.${id};                  # id → attrset
};

# Query
graph.reachableFrom g "web"      # → [ "api" "cache" "database" ]
graph.dependents    g "database" # → [ "api" "web" ]
graph.roots         g            # → [ "web" ]
graph.cycles        g            # → []

The four accessor fields:

Field	Type	Used by
`edges`	`id → [id]`	traversal, global analysis, fixpoint
`parent`	`id → id \| null`	`ancestorsOf`, `materializeParents`
`nodes`	`[id]`	global analysis, enumeration, materialization
`nodeData`	`id → attrset`	`select`

Functions that only need traversal destructure { edges, ... }. Functions that need global analysis also take nodes. Functions that need parent walks take parent. No function requires all four.

Gen Ecosystem

Library	Role
gen-algebra	Pure primitives (search, record, identity)
gen-schema	Typed registries (kinds, instances, collections, refs)
gen-aspects	Aspect types (traits, classification, dispatch)
gen-graph	Graph queries (combinators, traversals, fixpoint)
gen-scope	Scope graphs (construction, evaluation, resolution)
gen-select	Selector algebra (pattern matching over graph positions)
gen-bind	Module binding (inject args into NixOS modules)
gen-derive	Rule dispatch (stratified phases, fixpoint, conflict resolution)

Quick Start

As a flake input

{
  inputs.gen-graph.url = "github:sini/gen-graph";
  outputs = { gen-graph, nixpkgs, ... }:
    let
      lib   = nixpkgs.lib;
      graph = gen-graph { inherit lib; };
    in { /* use graph.reachableFrom, graph.roots, etc. */ };
}

Without flakes

let
  lib   = (import <nixpkgs> {}).lib;
  graph = import ./path/to/gen-graph { inherit lib; };
in
graph.reachableFrom { edges = id: deps.${id} or []; } "start"

Design Principles

Queries take accessor functions, not node maps. The caller owns the data; gen-graph never stores it.
Traversal is lazy. reachableFrom, ancestorsOf, and pathsBetween only visit nodes reachable from the start — they never enumerate nodes.
Global operations materialize internally. cycles, dependents, transpose, transitiveClosure, and transitiveReduction call materialize once, then work on the resulting edge map.
Edge maps are always deduplicated. materialize calls lib.unique on each target list. unionEdges calls lib.unique on merged lists.
Set operations use attrset membership. Intersection and difference build a target attrset for O(1) per-edge lookups.

API Reference

Traversal (lazy)

These functions visit only the nodes they reach. They do not require nodes.

reachableFrom  : { edges, ... } → id → [id]
reachableWhere : { edges, ... } → id → (id → bool) → [id]
canReach       : { edges, ... } → id → id → bool
selfReachable  : { edges, ... } → id → bool
ancestorsOf    : { parent, ... } → id → [id]
pathsBetween   : { edges, ... } → id → id → [[id]]

reachableFrom g startId — all nodes transitively reachable from startId via edges, excluding startId itself. C-level BFS via builtins.genericClosure.

graph.reachableFrom g "web"
# → [ "api" "cache" "database" ]

reachableWhere g startId pred — reachableFrom filtered by pred id.

graph.reachableWhere g "web" (id: lib.hasPrefix "cache" id)
# → [ "cache" ]

canReach g fromId toId — point query: can fromId transitively reach toId? O(reachable from fromId). Does not require materializing the full graph.

graph.canReach g "web" "database"   # → true
graph.canReach g "database" "web"   # → false

selfReachable g id — is id reachable from itself (i.e., in a cycle)? C-level BFS. Used internally by cycles.

graph.selfReachable cyclicGraph "a"   # → true
graph.selfReachable dagGraph "a"      # → false

ancestorsOf g startId — walks parent links upward. Returns the chain from immediate parent to root. Cycle-safe: stops if a visited id is seen again.

graph.ancestorsOf g "grandchild"
# → [ "child1" "root" ]

pathsBetween g startId endId — all acyclic paths from startId to endId. Each path is a list of ids including both endpoints.

graph.pathsBetween g "a" "d"
# → [ [ "a" "b" "d" ] [ "a" "c" "d" ] ]   # diamond

Global Analysis (materializes internally)

These functions enumerate all nodes. They require both edges and nodes.

cycles       : { edges, nodes, ... } → [id]
dependents   : { edges, nodes, ... } → id → [id]
dependentsOf : { edges, nodes, ... } → id → [id]
impactOf     : { edges, nodes, ... } → id → [id]   # alias for dependentsOf
transpose    : { edges, nodes, ... } → { edges, nodes }

cycles g — nodes that appear in any cycle (self-reachable). Uses C-level BFS per node via selfReachable — no full transitive closure materialization needed. Returns a sorted list.

graph.cycles g   # → [] for a DAG, → [ "a" "b" "c" ] for a → b → c → a

dependents g targetId — all nodes that transitively reach targetId (reverse reachability). Uses full transitive closure + transpose. O(n²) setup, O(1) lookup. Best for multi-target queries (amortized).

graph.dependents g "database"   # → [ "api" "web" "worker" ]

dependentsOf g targetId — same result as dependents, but uses reverse traversal: builds reverse edge index O(n), then C-level BFS from target. O(n + reachable). Preferred for single-target queries on large graphs.

graph.dependentsOf g "database"   # → [ "api" "cache" "web" "worker" ]

impactOf — alias for dependentsOf. "What breaks if this node changes?"

transpose g — returns a new accessor record { edges, nodes } with all edges reversed.

rev = graph.transpose g;
graph.reachableFrom rev "database"   # → nodes that depend on database

Enumeration

These functions scan all nodes. They require nodes.

roots  : { edges, nodes, ... } → [id]
leaves : { edges, nodes, ... } → [id]
select : { nodes, nodeData, ... } → (attrset → bool) → [id]

roots g — nodes with no incoming edges (not a target of any edge). Sorted.

leaves g — nodes with no outgoing edges (edges id == []). Sorted.

select g pred — ids where pred (nodeData id) is true.

graph.select g (d: d.type == "backend")   # → [ "api" "worker" ]

Materialization

materialize        : { edges, nodes, ... } → { id → [id] }
materializeParents : { parent, nodes, ... } → { id → id }

materialize g — builds an edge map { nodeId = [targetId ...]; } for all nodes. Deduplicates each target list via lib.unique.

materializeParents g — builds { nodeId = parentId; } for nodes where parent id != null.

Fixpoint

fixpoint            : { seed, step, maxIter? } → edgeMap
compose             : edgeMap → edgeMap → edgeMap
transitiveClosure   : { edges, nodes, ... } → edgeMap
transitiveReduction : { edges, nodes, ... } → edgeMap

fixpoint { seed, step, maxIter? } — iterates step on seed until the result stabilizes (next == current). Throws if the step is non-monotonic (result shrinks) or exceeds maxIter (default 1000).

closure = graph.fixpoint {
  seed = graph.materialize g;
  step = current: graph.unionEdges current (graph.compose current (graph.materialize g));
};

compose e1 e2 — relational composition of two edge maps. For each a → b in e1 and b → c in e2, emits a → c.

transitiveClosure g — full transitive closure as an edge map. Materializes g, then iterates compose to fixpoint.

transitiveReduction g — minimal edge map preserving reachability. Removes edge a → c when a → b → c exists for some b.

Edge Map Operations

These operate on materialized edge maps { id → [id] }, not on accessor records.

unionEdges      : edgeMap → edgeMap → edgeMap
intersectEdges  : edgeMap → edgeMap → edgeMap
differenceEdges : edgeMap → edgeMap → edgeMap
selectEdges     : (id → id → bool) → edgeMap → edgeMap

unionEdges a b — merged edge map; target lists are deduplicated.

intersectEdges a b — only edges present in both maps. Empty target lists are dropped.

differenceEdges a b — edges in a not in b. Empty target lists are dropped.

Mock Utility

graph.mock provides test helpers for constructing accessor records from declarative edge lists.

mkGraph     : { edges?, parents?, nodeData? } → accessorRecord
fromNodeMap : { id → { imports?, parent?, ... } } → accessorRecord
fixtures    : { diamond, chain, cyclic, tree, serviceGraph, disconnected }

mkGraph — takes edge lists and returns a valid accessor record with all four fields populated.

g = graph.mock.mkGraph {
  edges = [
    { from = "a"; to = "b"; }
    { from = "b"; to = "c"; }
  ];
  nodeData = {
    a = { label = "start"; };
    c = { label = "end"; };
  };
};

graph.reachableFrom g "a"             # → [ "b" "c" ]
graph.select g (d: d ? label)         # → [ "a" "c" ]

fixtures — pre-built accessor records for common graph shapes:

Name	Shape
`diamond`	`a → b,c → d`
`chain`	`a → b → c → d`
`cyclic`	`a → b → c → a`
`tree`	parent chain: grandchild → child1 → root
`serviceGraph`	web/api/worker/db/cache/queue with nodeData
`disconnected`	a → b plus isolated `island` node

Usage Example

{ nixpkgs, gen-graph }:
let
  lib   = nixpkgs.lib;
  graph = gen-graph { inherit lib; };

  # Your data
  services = {
    web    = { deps = [ "api" ];         type = "frontend";  };
    api    = { deps = [ "db" "cache" ];  type = "backend";   };
    worker = { deps = [ "db" "queue" ];  type = "backend";   };
    db     = { deps = [];                type = "datastore";  };
    cache  = { deps = [];                type = "datastore";  };
    queue  = { deps = [];                type = "datastore";  };
  };

  # Accessor record
  g = {
    edges    = id: services.${id}.deps or [];
    parent   = _: null;
    nodes    = builtins.attrNames services;
    nodeData = id: services.${id};
  };
in {
  entryPoints  = graph.roots g;                               # [ "web" "worker" ]
  datastores   = graph.leaves g;                              # [ "cache" "db" "queue" ]
  webDeps      = graph.reachableFrom g "web";                 # [ "api" "cache" "db" ]
  dbImpact     = graph.dependents g "db";                     # [ "api" "web" "worker" ]
  backendNodes = graph.select g (d: d.type == "backend");     # [ "api" "worker" ]
  hasCycles    = graph.cycles g != [];                        # false
}

Performance

Operation	Complexity	Notes
`reachableFrom`	O(reachable)	C-level BFS via `builtins.genericClosure`
`reachableWhere`	O(reachable)	same C-level BFS, filter applied after
`canReach`	O(reachable from source)	C-level BFS, stops exploring from target
`selfReachable`	O(reachable from node)	C-level BFS checking self-reappearance
`ancestorsOf`	O(depth)	single-path walk
`pathsBetween`	O(paths × depth)	exponential in path count; use on small subgraphs
`materialize`	O(nodes × avg degree)	one-time scan
`transitiveClosure`	O(nodes² × iterations)	fixpoint over materialized map
`transitiveReduction`	O(nodes² × degree)	needs full closure; O(1) membership via attrsets
`cycles`	O(nodes × reachable)	per-node C-level BFS (no full closure needed)
`dependents`	O(nodes²)	full transitive closure + transpose
`dependentsOf`	O(nodes + reachable)	reverse index + C-level BFS
`roots` / `leaves`	O(nodes × avg degree)	single scan of all edges
`select`	O(nodes)	one pass over node list
`unionEdges` / `intersectEdges` / `differenceEdges`	O(edges)	attrset membership O(1) per edge

Lazy traversal (reachableFrom, canReach, ancestorsOf, pathsBetween) visits only what is reachable. Global operations (cycles, dependents, transpose, transitiveClosure, transitiveReduction) scan all nodes.

Performance Optimizations

gen-graph is designed to support large infrastructure graphs (1000+ nodes) without forcing performance regressions onto the underlying evaluator.

C-Level BFS via `builtins.genericClosure`

All reachability queries use Nix's native builtins.genericClosure — a C-level builtin with built-in dedup. This is ~4-5x faster than equivalent Nix-level BFS on 5000-node graphs:

No Nix-level queue management (list concatenation is O(n²) for BFS queues)
Native hash-based dedup (not attrset // per visited node)
Constant-factor advantage of compiled C vs interpreted Nix

Accessor Pattern + gen-scope Memoization

When gen-graph's accessor functions are wired to gen-scope's result.get id "imports":

Each edges id call hits gen-scope's memoized _eval → O(1) after first evaluation
Traversal operations only trigger attribute evaluation for VISITED nodes
Global operations trigger evaluation for ALL nodes, but each evaluates exactly once

This means gen-graph never causes redundant evaluation in gen-scope. The accessor pattern is the zero-cost bridge:

# gen-scope evaluates each node's imports ONCE; gen-graph reads the cached result
graphLib.reachableFrom { edges = id: result.get id "imports"; } "host:igloo"

Choosing the Right Operation

Need	Use	Don't use
"Can A reach B?"	`canReach` (O(reachable))	`dependents` (O(n²))
"What depends on X?" (one target)	`dependentsOf` (O(n + reachable))	`dependents` (O(n²))
"What depends on X, Y, Z?" (multi-target)	`dependents` (O(n²) amortized)	`dependentsOf` × 3 (rebuilds index 3×)
"Is there a cycle?"	`cycles` (O(n × reachable), C-level)	`transitiveClosure` (O(n²))
"All paths between A and B"	`pathsBetween` (DFS)	Only for small subgraphs
"Full closure for analysis"	`transitiveClosure`	— (use when you genuinely need it)
"Minimal graph for diagrams"	`transitiveReduction`	— (O(n²), needs closure)

Partitioning for Fleet Scale

For 10,000+ node fleets, partition the graph by environment/datacenter before running global operations:

# Instead of:
graph.cycles { edges; nodes = ALL_10K_NODES; }  # O(10K × reachable)

# Partition first:
lib.concatMap (partition:
  graph.cycles { inherit edges; nodes = partition; }
) (partitionByEnvironment allNodes)  # 20 × O(500 × reachable)

Cross-partition edges are rare in practice. Per-partition analysis is typically 100-400x faster than whole-fleet.

Testing

nix flake check --override-input gen-graph . ./ci

Theoretical Foundations

The algorithms and design principles draw from:

Mokhov (2017) — Algebraic Graphs with Class. Informed by. Algebraic graph construction primitives (overlay, connect, vertex, empty) and the compositional approach to graph representation inform gen-graph's edge map operations and structural combinators. Edge map set operations (unionEdges, intersectEdges, differenceEdges) and transitive reduction are gen-graph's own contribution built on this algebraic foundation. Transpose follows Mokhov 2017 §4.3 directly.
Arntzenius & Krishnaswami (2016) — Datafun: A Functional Datalog. Implements. Monotone fixpoint iteration with convergence guarantees. The fixpoint operator enforces monotonicity (edge count must not shrink between iterations), matching Datafun's requirement that fixpoint computations operate over monotone functions on semilattices. Reverse reachability in dependents/dependentsOf follows the Datafun reverse-query pattern.
Neron et al. (2015) — A Theory of Name Resolution. Implements. Parent-chain traversal (ancestorsOf) follows scope graph P-edge resolution: walking the parent partial function upward through scopes corresponds to following P-edges in the resolution calculus (Neron 2015 §2.3). Silent cycle termination chosen over throwing for composability, matching the well-foundedness requirement on the parent relation.
Kahn (1974) — The Semantics of a Simple Language for Parallel Programming. Informed by. Continuous functions over streams with deterministic dataflow semantics. gen-graph's lazy accessor pattern — traversal only forces nodes it visits — aligns conceptually with Kahn's model where computing stations produce output incrementally as input arrives, and monotonicity ensures that receiving more input can only provoke more output (Kahn 1974 §2.2.4).

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
.github		.github
ci		ci
examples/demo		examples/demo
lib		lib
.gitignore		.gitignore
GUIDE.md		GUIDE.md
LICENSE		LICENSE
README.md		README.md
default.nix		default.nix
flake.nix		flake.nix

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

gen-graph — accessor-based graph query combinators for Nix

Table of Contents

Overview

Gen Ecosystem

Quick Start

As a flake input

Without flakes

Design Principles

API Reference

Traversal (lazy)

Global Analysis (materializes internally)

Enumeration

Materialization

Fixpoint

Edge Map Operations

Mock Utility

Usage Example

Performance

Performance Optimizations

C-Level BFS via `builtins.genericClosure`

Accessor Pattern + gen-scope Memoization

Choosing the Right Operation

Partitioning for Fleet Scale

Testing

Theoretical Foundations

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

gen-graph — accessor-based graph query combinators for Nix

Table of Contents

Overview

Gen Ecosystem

Quick Start

As a flake input

Without flakes

Design Principles

API Reference

Traversal (lazy)

Global Analysis (materializes internally)

Enumeration

Materialization

Fixpoint

Edge Map Operations

Mock Utility

Usage Example

Performance

Performance Optimizations

C-Level BFS via builtins.genericClosure

Accessor Pattern + gen-scope Memoization

Choosing the Right Operation

Partitioning for Fleet Scale

Testing

Theoretical Foundations

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

C-Level BFS via `builtins.genericClosure`

Packages