chore: Split generated R/aaa-auto.R into per-category R/aaa-<cat>.R files #2621

Merged: krlmlr merged 7 commits into main from categories on May 24, 2026

@schochastics
Contributor

Summary

Stimulus generates a single ~14,800-line R/aaa-auto.R containing every C igraph wrapper. This PR introduces a categorization layer that splits that monolithic output into 26 per-category files (R/aaa-basicigraph.R, R/aaa-cliques.R, …, R/aaa-visitors.R) so navigating the generated wrappers aligns with how igraph groups functions in its reference manual. Subcategories appear inside each file as banner comments.

Why do this? Today a developer grepping for bfs_impl lands in the middle of a 14.8k-line file with no navigational cues; afterwards they land at the top of R/aaa-visitors.R under the # ==== breadth-first-search ==== banner.

The split happens as a post-processing step on the stimulus output; stimulus itself is unchanged (it doesn't support multi-file output natively). A new tools/aaa-categories.yaml is the single source of truth for which function goes where, and two new tools keep everything reconciled.

What's in the diff

| Change | Purpose |
| --- | --- |
| tools/aaa-categories.yaml (new) | Authoritative map: category → subcategory → list of igraph_* C functions. 491 entries across 26 categories, covering every R_igraph_* symbol .Call()'d in the generated wrappers. |
| tools/rebuild-cats.R (new) | Reconciles the YAML against whatever R/aaa-*.R files are present. Idempotent; fails loudly if an ungrouped function appears in the generated wrappers. |
| tools/split-aaa-auto.R (new) | Parses the stimulus output, looks up each _impl wrapper's category, and writes one file per category with subcategory banners. Preserves each wrapper's source byte-for-byte. |
| Makefile-cigraph | Stimulus now writes to .build/aaa-auto.R (ignored), and the split script produces the in-repo R/aaa-<cat>.R files. New phony target r_wrappers covers the full pipeline. |
| R/aaa-auto.R → R/aaa-<cat>.R × 26 | The actual split output. All existing .Call() semantics unchanged; it is a purely organizational change. |
| .gitignore / .Rbuildignore | Ignore .build/. |
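
The post-processing step described above can be illustrated with a minimal sketch. It is not the code in tools/split-aaa-auto.R: the function name `split_wrappers`, the data-frame layout of `cat_map`, and the `finding-cliques` subcategory label are assumptions made for illustration. Only the banner format (`# ==== breadth-first-search ====`) and the `R/aaa-<cat>.R` naming come from the PR description.

```r
# Hypothetical sketch of the split step; not the code in tools/split-aaa-auto.R.
# `wrappers`: named character vector, one element per `_impl` wrapper source.
# `cat_map`: data frame with columns fn, category, subcategory (assumed layout).
split_wrappers <- function(wrappers, cat_map) {
  idx <- match(names(wrappers), cat_map$fn)
  stopifnot(!anyNA(idx))  # every wrapper needs a category in this sketch
  cats <- cat_map$category[idx]
  subs <- cat_map$subcategory[idx]

  files <- list()
  for (category in unique(cats)) {
    lines <- character()
    for (sub in unique(subs[cats == category])) {
      # One banner per subcategory, followed by the untouched wrapper sources.
      banner <- sprintf("# ==== %s ====", sub)
      keep <- cats == category & subs == sub
      lines <- c(lines, banner, unname(wrappers[keep]))
    }
    files[[sprintf("R/aaa-%s.R", category)]] <- lines
  }
  files
}

# Toy input: wrapper bodies are placeholders, not real generated code.
wrappers <- c(
  bfs_impl = "bfs_impl <- function(graph) NULL",
  cliques_impl = "cliques_impl <- function(graph) NULL"
)
cat_map <- data.frame(
  fn = c("bfs_impl", "cliques_impl"),
  category = c("visitors", "cliques"),
  subcategory = c("breadth-first-search", "finding-cliques"),
  stringsAsFactors = FALSE
)
out <- split_wrappers(wrappers, cat_map)
```

The sketch returns the file contents as a list instead of writing to disk, which keeps it easy to inspect; a real splitter would presumably pass each element to `writeLines()`.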

The closure-normalization rule

Nine .Call() targets in the generated wrappers end in _closure (e.g. R_igraph_bfs_closure). These are R-binding helpers defined in src/rcallback.c that wrap an underlying C function with SEXP-callback support; they are not standalone C library functions. rebuild-cats.R encodes the 9-entry whitelist and maps them back to their semantic names (e.g. igraph_bfs_closure → igraph_bfs) so each wrapper lands where a reader would expect. R_igraph_transitive_closure is not affected: there "closure" is a graph-theory term, not a wrapper suffix.
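
A small sketch of the normalization rule, under stated assumptions: the names `closure_whitelist` and `normalize_closure` are hypothetical, and the whitelist here holds only the one entry named in the text, since the remaining entries are not listed in this thread.

```r
# Sketch only: the real whitelist lives in tools/rebuild-cats.R and has more
# entries; igraph_bfs_closure is the only one quoted in the PR description.
closure_whitelist <- c("igraph_bfs_closure")

normalize_closure <- function(fn) {
  # Strip the suffix only for whitelisted names, never by pattern alone.
  is_wrapper <- fn %in% closure_whitelist
  fn[is_wrapper] <- sub("_closure$", "", fn[is_wrapper])
  fn
}

normalized <- normalize_closure(
  c("igraph_bfs_closure", "igraph_transitive_closure")
)
```

An explicit whitelist is used instead of a blanket `_closure$` regex because igraph_transitive_closure also ends in `_closure` and must be left alone.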

Categorization highlights

The initial YAML layout mirrored igraph's legacy docbook sections. Several cleanups were applied:

  • Retired the undocumented category — all 8 entries moved to real homes:
    • igraph_residual_graph, igraph_reverse_residual_graph → flows/maximum-flows
    • igraph_hrg_sample_many → hrg/hrg-sampling
    • igraph_has_attribute_table, igraph_finalizer → nongraph/internal
    • igraph_eigen_adjacency → structural/spectral-properties
    • igraph_eigen_matrix, igraph_eigen_matrix_symmetric, igraph_solve_lsap → nongraph/linear-algebra (new subcategory)
  • Typo/case fixes: regular-structre-generators → regular-structure-generators; Sparsifiers → sparsifiers; motifs/uncategorized → motifs/graph-census.
  • Semantic relocations: igraph_transitive_closure and igraph_transitive_closure_dag moved from structural/graph-components → operators/miscellaneous-operators (they produce a derived graph, not component analysis).
  • Split oversized buckets:
    • structural/shortest-path-related-functions (34 entries) → distances-and-metrics (22) + shortest-paths (12).
    • structural/other-operations (11) → matrix-representations (5) + mutual-edges (3) + summary-statistics (3).

Developer workflow

After a stimulus upgrade or new igraph C function landing upstream:

make -f Makefile-cigraph r_wrappers   # regenerates the split files
Rscript tools/rebuild-cats.R          # validates/updates the categories YAML

The second step fails loudly with the exact names that need adding if aaa-categories.yaml drifts from the generated wrappers.
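
The drift check can be pictured roughly as follows. This is a sketch, not the logic in tools/rebuild-cats.R: the helper names are hypothetical, the regex assumes symbols appear literally as `R_igraph_<name>` in the wrapper source, and `igraph_new_thing` is an invented function standing in for a new upstream addition.

```r
# Hypothetical helpers; the real reconciliation code may work differently.
extract_symbols <- function(src) {
  # Pull every R_igraph_* token out of the wrapper source lines.
  hits <- regmatches(src, gregexpr("R_igraph_[A-Za-z0-9_]+", src))
  sort(unique(sub("^R_", "", unlist(hits))))
}

find_uncategorized <- function(src, categorized) {
  setdiff(extract_symbols(src), categorized)
}

# Toy wrapper lines (assumed shape of the generated code).
src <- c(
  "res <- .Call(R_igraph_bfs_closure, graph, root)",
  "on.exit(.Call(R_igraph_finalizer))",
  "res <- .Call(R_igraph_new_thing, graph)"
)
missing <- find_uncategorized(src, c("igraph_bfs_closure", "igraph_finalizer"))
if (length(missing) > 0) {
  message("Needs a category: ", paste(missing, collapse = ", "))
}
```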

Validation performed

  • All 26 R/aaa-*.R files parse cleanly.
  • 490 _impl wrappers distributed across the files, zero duplicates.
  • 491 unique R_igraph_* symbols preserved (the 491st being R_igraph_finalizer, which appears in every impl's on.exit but has no wrapper of its own).
  • tools/rebuild-cats.R produces byte-identical output on re-run (idempotent).
  • tools/split-aaa-auto.R produces byte-identical output on re-run from the same source (idempotent).
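
Two of the checks above (clean parsing and no duplicate wrappers) could be reproduced along these lines. This is an illustrative sketch on throwaway files, not the procedure actually used for the PR; the file contents are placeholders.

```r
# Sketch of two validation checks on temporary files; paths and wrapper
# bodies are made up for illustration.
check_dir <- tempfile("aaa-check-")
dir.create(check_dir)
writeLines("bfs_impl <- function(graph) NULL",
           file.path(check_dir, "aaa-visitors.R"))
writeLines("cliques_impl <- function(graph) NULL",
           file.path(check_dir, "aaa-cliques.R"))

files <- list.files(check_dir, pattern = "^aaa-.*\\.R$", full.names = TRUE)

# parse() raises an error on a syntax error, so reaching the next statement
# means every file parsed cleanly.
exprs <- do.call(c, lapply(files, function(f) as.list(parse(file = f))))

# Collect the names assigned at top level and count duplicates.
impl_names <- vapply(exprs, function(e) as.character(e[[2]]), character(1))
n_dupes <- sum(duplicated(impl_names))
```

An idempotency check would additionally compare the outputs of two consecutive runs, for example by comparing `tools::md5sum()` values of the generated files.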

Test plan

  • devtools::load_all(".") succeeds
  • R CMD check / CI passes
  • make -f Makefile-cigraph r_wrappers round-trips cleanly on a machine with the stimulus venv
  • Spot-check that at least one wrapper from each of the 26 category files still behaves correctly (e.g. the existing testthat suite covers igraph_* wrappers broadly, so a green test run is the main check)

cc @maelle for review — this is purely an organizational/tooling change; no behavior should change, but the restructuring is substantial so a second pair of eyes on the categorization choices would be welcome.

🤖 Generated with Claude Code

schochastics and others added 2 commits April 23, 2026 14:34
Stimulus generates one monolithic R/aaa-auto.R (~14.8k lines) covering
every C igraph wrapper. This commit introduces a categorization layer that
splits the generated output into 26 per-category files matching how the
functions are grouped in the igraph reference manual, with subcategory
banner comments inside each file.

- tools/aaa-categories.yaml: authoritative category -> subcategory -> fn
  mapping, reconciled against every R_igraph_* symbol .Call()'d from the
  generated wrappers (491 entries; 8 closure wrappers mapped back to their
  underlying C functions via the src/rcallback.c whitelist)
- tools/rebuild-cats.R: idempotent reconciliation tool; fails loudly if
  new functions appear in the generated wrappers without a categorization
- tools/split-aaa-auto.R: post-processes stimulus output into R/aaa-<cat>.R
- Makefile-cigraph: stimulus now writes to .build/aaa-auto.R (ignored), the
  split script produces the in-repo R/ files. Phony target r_wrappers
  covers the full pipeline
@maelle
Contributor

maelle commented Apr 30, 2026

A developer grepping for a function name?! Like, not using IDE navigation?!

Anyway awesome, I'll review this now, thanks a ton!

Contributor

@maelle maelle left a comment

Thank you! A few comments 😄

Comment thread tools/aaa-categories.yaml Outdated
@@ -0,0 +1,669 @@
# Functions ordered by category
basicigraph:
Contributor

Suggested change
basicigraph:
basic-igraph:

Contributor

Applied in 8c7fc6b: renamed basicigraph → basic-igraph in the YAML and in tools/rebuild-cats.R, and renamed R/aaa-basicigraph.R → R/aaa-basic-igraph.R.


Generated by Claude Code

Comment thread tools/aaa-categories.yaml
Comment thread Makefile-cigraph

# R files that are generated/copied

RGEN = R/aaa-auto.R src/rinterface.c \
Contributor

Why can't we add the post-processing step to this makefile so that generating the functions remains a single call?

Contributor

It already is — the .build/r-wrappers.stamp target (lines 128–147) runs both stimulus and tools/split-aaa-auto.R in sequence, so make -f Makefile-cigraph r_wrappers is a single call that produces the split files. The PR description references R/aaa-auto.R as a make target by mistake (the actual target is the stamp file).


Generated by Claude Code

Comment thread Makefile-cigraph
-t $(vendored_srcdir)/interfaces/types.yaml \
-t tools/stimulus/types-RR.yaml \
-l RR
Rscript tools/split-aaa-auto.R .build/aaa-auto.R
Contributor

aaah ok, this was wrong in the PR description

Comment thread R/aaa-cycles.R
callback
) {
# Argument checks
ensure_igraph(graph)
Contributor

I have no specific opinion on categories but before we validate this (and split the test file), could you please add some stats to the PR thread: min/median/max number of lines per aaa file?

Contributor

Stats for the 28 R/aaa-*.R files (after the basicigraph rename and status-into-progress merge):

| statistic | lines |
| --- | --- |
| min | 20 (aaa-error.R, aaa-spatial.R) |
| median | 352 |
| mean | 540 |
| max | 2417 (aaa-structural.R) |
| total | 15,118 |

Full distribution, sorted:

| file | lines |
| --- | --- |
| aaa-error.R | 20 |
| aaa-spatial.R | 20 |
| aaa-progress.R | 40 |
| aaa-separators.R | 92 |
| aaa-embedding.R | 104 |
| aaa-graphlets.R | 106 |
| aaa-coloring.R | 119 |
| aaa-processes.R | 199 |
| aaa-trees.R | 211 |
| aaa-motifs.R | 221 |
| aaa-hrg.R | 245 |
| aaa-nongraph.R | 256 |
| aaa-bipartite.R | 286 |
| aaa-visitors.R | 316 |
| aaa-cycles.R | 389 |
| aaa-foreign.R | 436 |
| aaa-operators.R | 529 |
| aaa-basic-igraph.R | 534 |
| aaa-cliques.R | 536 |
| aaa-flows.R | 714 |
| aaa-community.R | 731 |
| aaa-generators.R | 881 |
| aaa-isomorphism.R | 957 |
| aaa-layout.R | 1000 |
| aaa-centrality.R | 1071 |
| aaa-games.R | 1145 |
| aaa-paths.R | 1543 |
| aaa-structural.R | 2417 |

aaa-structural.R and aaa-paths.R are large enough that splitting them by subcategory is being considered as a follow-up.
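
For reference, the summary figures can be recomputed from the per-file counts listed in this comment. The vector below copies those 28 values; in a checkout the counts could instead be gathered with something like `length(readLines(f))` per file, which is an assumption about workflow and not something shown in this thread.

```r
# Line counts copied from the distribution in this comment (28 files).
line_counts <- c(
  20, 20, 40, 92, 104, 106, 119, 199, 211, 221, 245, 256, 286, 316,
  389, 436, 529, 534, 536, 714, 731, 881, 957, 1000, 1071, 1145, 1543, 2417
)

stats <- c(
  min    = min(line_counts),
  median = median(line_counts),  # midpoint of the 14th and 15th values
  mean   = mean(line_counts),
  max    = max(line_counts),
  total  = sum(line_counts)
)
```

With an even number of files the median is the midpoint of 316 and 389, i.e. 352.5, which the summary reports as 352; the mean is 15,118 / 28 ≈ 539.9, reported as 540.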


Generated by Claude Code

aaa-structural.R was 160 functions / 5,237 lines — too unwieldy for IDE
navigation. Promote three natural sub-clusters to top-level categories,
shrinking aaa-structural.R to 84 functions / 2,417 lines:

  - aaa-paths.R       (38 fns) — distances, shortest paths, widest paths
  - aaa-centrality.R  (30 fns) — centrality measures + centralization
  - aaa-trees.R        (8 fns) — spanning trees and tree unfolding

Implementation: tools/rebuild-cats.R gains a `category_moves` mechanism
that relocates whole (cat, sub) groups to a new top-level on the
flattened table. The structural/trees subcategory is renamed to
spanning-trees-and-forests for clarity inside the new trees category.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
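
One way to picture the `category_moves` mechanism is as a lookup applied to the flattened (category, subcategory, function) table. The sketch below is an assumption about its shape, not the code in tools/rebuild-cats.R: the column names, the `apply_moves` helper, the `centrality-measures` subcategory label, and the placeholder function names are all hypothetical. Only the structural/trees → trees/spanning-trees-and-forests move is taken from the commit message.

```r
# Hypothetical shape for the flattened table and the move table.
flat <- data.frame(
  category = c("structural", "structural", "structural"),
  subcategory = c("trees", "centrality-measures", "graph-components"),
  fn = c("igraph_fn_a", "igraph_fn_b", "igraph_fn_c"),  # placeholder names
  stringsAsFactors = FALSE
)
category_moves <- data.frame(
  from_category = c("structural", "structural"),
  from_subcategory = c("trees", "centrality-measures"),
  to_category = c("trees", "centrality"),
  to_subcategory = c("spanning-trees-and-forests", "centrality-measures"),
  stringsAsFactors = FALSE
)

apply_moves <- function(flat, moves) {
  for (i in seq_len(nrow(moves))) {
    # Relocate every row of the matching (category, subcategory) group.
    hit <- flat$category == moves$from_category[i] &
      flat$subcategory == moves$from_subcategory[i]
    flat$category[hit] <- moves$to_category[i]
    flat$subcategory[hit] <- moves$to_subcategory[i]
  }
  flat
}

moved <- apply_moves(flat, category_moves)
```

Rows that match no move, such as the graph-components group here, stay in their original category.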
@schochastics
Contributor Author

> A developer grepping for a function name?! Like, not using IDE navigation?!
>
> Anyway awesome, I'll review this now, thanks a ton!

Claude knows that I don't use IDE navigation I guess 😆

@schochastics
Contributor Author

| file | functions | lines |
| --- | --- | --- |
| aaa-structural.R | 84 | 2,417 |
| aaa-paths.R | 38 | 1,544 |
| aaa-games.R | 36 | 1,145 |
| aaa-generators.R | 35 | 881 |
| aaa-centrality.R | 30 | 1,071 |
| aaa-layout.R | 28 | 1,000 |
| aaa-operators.R | 24 | 529 |
| aaa-basicigraph.R | 23 | 534 |
| aaa-cliques.R | 21 | 536 |
| aaa-community.R | 21 | 731 |
| aaa-flows.R | 21 | 714 |
| aaa-isomorphism.R | 21 | 957 |
| aaa-foreign.R | 18 | 436 |
| aaa-cycles.R | 13 | 389 |
| aaa-bipartite.R | 10 | 286 |
| aaa-hrg.R | 10 | 245 |
| aaa-nongraph.R | 10 | 258 |
| aaa-motifs.R | 9 | 221 |
| aaa-trees.R | 8 | 211 |
| aaa-coloring.R | 5 | 115 |
| aaa-processes.R | 5 | 199 |
| aaa-separators.R | 5 | 92 |
| aaa-visitors.R | 5 | 316 |
| aaa-embedding.R | 3 | 104 |
| aaa-graphlets.R | 3 | 106 |
| aaa-error.R | 1 | 20 |
| aaa-progress.R | 1 | 22 |
| aaa-spatial.R | 1 | 20 |
| aaa-status.R | 1 | 20 |
| total | 490 | 15,119 |

cc @maelle. Not very even but I guess that is hard to achieve with a good categorization anyway 🤷

Contributor

@krlmlr krlmlr left a comment

Nice, thanks! Looking forward to it.

Comment thread tools/aaa-categories.yaml Outdated
nongraph-spatial:
- igraph_convex_hull_2d

status:
Contributor

Merge with progress?

Contributor

Done in 8c7fc6b. status: removed as a top-level category, igraph_status moved under progress: as invoking-the-status-handler. R/aaa-status.R deleted; both impls now live in R/aaa-progress.R.


Generated by Claude Code

Comment thread tools/aaa-categories.yaml Outdated
Comment on lines +74 to +78
- igraph_is_bipartite_coloring
- igraph_is_edge_coloring
- igraph_is_perfect
- igraph_is_vertex_coloring
- igraph_vertex_coloring_greedy
Contributor

Suggested change
- igraph_is_bipartite_coloring
- igraph_is_edge_coloring
- igraph_is_perfect
- igraph_is_vertex_coloring
- igraph_vertex_coloring_greedy
detect:
- igraph_is_bipartite_coloring
- igraph_is_edge_coloring
- igraph_is_perfect
- igraph_is_vertex_coloring
compute:
- igraph_vertex_coloring_greedy

Contributor

Applied in 8c7fc6b: coloring now has detect (4 fns) and compute (1 fn) subcategories, and R/aaa-coloring.R has the corresponding banner comments.


Generated by Claude Code

@krlmlr
Contributor

krlmlr commented May 24, 2026

This now has conflicts, can you resolve please?

@krlmlr
Contributor

krlmlr commented May 24, 2026

Can we do subcategories for structural and paths?

claude and others added 4 commits May 24, 2026 17:13
Previously split-aaa-auto.R and rebuild-cats.R both stop() if the
generated wrappers contain a function missing from
tools/aaa-categories.yaml, which would break the build whenever a new
igraph C function landed upstream. Both now warn instead and put
unassigned functions under a top-level uncategorized: bucket
(emitted as R/aaa-uncategorized.R by the splitter) so the package
keeps building while the warning lists exactly what needs a home.
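
The warn-and-bucket behaviour described in this commit message might look roughly like the sketch below. It is an illustration under assumptions, not the code in either tool: `assign_categories` and the named-vector form of `cat_map` are hypothetical, and `igraph_new_thing` is an invented name for a newly added upstream function.

```r
# Hypothetical sketch of falling back to an `uncategorized` bucket.
# `cat_map`: named character vector, function name -> category (assumed form).
assign_categories <- function(fns, cat_map) {
  category <- unname(cat_map[fns])  # NA for names missing from the map
  unassigned <- fns[is.na(category)]
  if (length(unassigned) > 0) {
    # Warn with the exact names instead of stopping the build.
    warning(
      "No category for: ", paste(unassigned, collapse = ", "),
      call. = FALSE
    )
    category[is.na(category)] <- "uncategorized"
  }
  data.frame(fn = fns, category = category, stringsAsFactors = FALSE)
}

cat_map <- c(igraph_bfs = "visitors")
res <- assign_categories(c("igraph_bfs", "igraph_new_thing"), cat_map)
```

In this sketch the uncategorized rows would end up in R/aaa-uncategorized.R once passed to the splitter, matching the file name given in the commit message.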
@krlmlr krlmlr changed the title from "Split generated R/aaa-auto.R into per-category R/aaa-<cat>.R files" to "chore: Split generated R/aaa-auto.R into per-category R/aaa-<cat>.R files" May 24, 2026
@krlmlr krlmlr enabled auto-merge (squash) May 24, 2026 17:56
@krlmlr
Contributor

krlmlr commented May 24, 2026

Done for now, no further action needed.

@krlmlr
Contributor

krlmlr commented May 24, 2026

Thanks for working on this!

@krlmlr krlmlr merged commit 455cb71 into main May 24, 2026
5 of 6 checks passed
@krlmlr krlmlr deleted the categories branch May 24, 2026 18:13

4 participants