chore: Split generated R/aaa-auto.R into per-category R/aaa-<cat>.R files#2621
Conversation
Stimulus generates one monolithic R/aaa-auto.R (~14.8k lines) covering every C igraph wrapper. This commit introduces a categorization layer that splits the generated output into 26 per-category files matching how the functions are grouped in the igraph reference manual, with subcategory banner comments inside each file. - tools/aaa-categories.yaml: authoritative category -> subcategory -> fn mapping, reconciled against every R_igraph_* symbol .Call()'d from the generated wrappers (491 entries; 8 closure wrappers mapped back to their underlying C functions via the src/rcallback.c whitelist) - tools/rebuild-cats.R: idempotent reconciliation tool; fails loudly if new functions appear in the generated wrappers without a categorization - tools/split-aaa-auto.R: post-processes stimulus output into R/aaa-<cat>.R - Makefile-cigraph: stimulus now writes to .build/aaa-auto.R (ignored), the split script produces the in-repo R/ files. Phony target r_wrappers covers the full pipeline
|
A developper grepping for a function name?! Like, not using IDE navigation?! Anyway awesome, I'll review this now, thanks a ton! |
maelle
left a comment
There was a problem hiding this comment.
Thank you! A few comments 😄
| @@ -0,0 +1,669 @@ | |||
| # Functions ordered by category | |||
| basicigraph: | |||
There was a problem hiding this comment.
| basicigraph: | |
| basic-igraph: |
There was a problem hiding this comment.
Applied in 8c7fc6b — renamed basicigraph → basic-igraph in the YAML, in tools/rebuild-cats.R, and renamed R/aaa-basicigraph.R → R/aaa-basic-igraph.R.
Generated by Claude Code
|
|
||
| # R files that are generated/copied | ||
|
|
||
| RGEN = R/aaa-auto.R src/rinterface.c \ |
There was a problem hiding this comment.
why can't we add the post-processing step to this makefile so that generating the functions remain a single call?
There was a problem hiding this comment.
It already is — the .build/r-wrappers.stamp target (lines 128–147) runs both stimulus and tools/split-aaa-auto.R in sequence, so make -f Makefile-cigraph r_wrappers is a single call that produces the split files. The PR description references R/aaa-auto.R as a make target by mistake (the actual target is the stamp file).
Generated by Claude Code
| -t $(vendored_srcdir)/interfaces/types.yaml \ | ||
| -t tools/stimulus/types-RR.yaml \ | ||
| -l RR | ||
| Rscript tools/split-aaa-auto.R .build/aaa-auto.R |
There was a problem hiding this comment.
aaah ok, this was wrong in the PR description
| callback | ||
| ) { | ||
| # Argument checks | ||
| ensure_igraph(graph) |
There was a problem hiding this comment.
I have no specific opinion on categories but before we validate this (and split the test file), could you please add some stats to the PR thread: min/median/max number of lines per aaa file?
There was a problem hiding this comment.
Stats for the 28 R/aaa-*.R files (after the basicigraph rename and status-into-progress merge):
| lines | |
|---|---|
| min | 20 (aaa-error.R, aaa-spatial.R) |
| median | 352 |
| mean | 540 |
| max | 2417 (aaa-structural.R) |
| total | 15,118 |
Full distribution, sorted:
| file | lines |
|---|---|
| aaa-error.R | 20 |
| aaa-spatial.R | 20 |
| aaa-progress.R | 40 |
| aaa-separators.R | 92 |
| aaa-embedding.R | 104 |
| aaa-graphlets.R | 106 |
| aaa-coloring.R | 119 |
| aaa-processes.R | 199 |
| aaa-trees.R | 211 |
| aaa-motifs.R | 221 |
| aaa-hrg.R | 245 |
| aaa-nongraph.R | 256 |
| aaa-bipartite.R | 286 |
| aaa-visitors.R | 316 |
| aaa-cycles.R | 389 |
| aaa-foreign.R | 436 |
| aaa-operators.R | 529 |
| aaa-basic-igraph.R | 534 |
| aaa-cliques.R | 536 |
| aaa-flows.R | 714 |
| aaa-community.R | 731 |
| aaa-generators.R | 881 |
| aaa-isomorphism.R | 957 |
| aaa-layout.R | 1000 |
| aaa-centrality.R | 1071 |
| aaa-games.R | 1145 |
| aaa-paths.R | 1543 |
| aaa-structural.R | 2417 |
aaa-structural.R and aaa-paths.R are large enough that splitting them by subcategory is being considered as a follow-up.
Generated by Claude Code
aaa-structural.R was 160 functions / 5,237 lines — too unwieldy for IDE navigation. Promote three natural sub-clusters to top-level categories, shrinking aaa-structural.R to 84 functions / 2,417 lines: - aaa-paths.R (38 fns) — distances, shortest paths, widest paths - aaa-centrality.R (30 fns) — centrality measures + centralization - aaa-trees.R (8 fns) — spanning trees and tree unfolding Implementation: tools/rebuild-cats.R gains a `category_moves` mechanism that relocates whole (cat, sub) groups to a new top-level on the flattened table. The structural/trees subcategory is renamed to spanning-trees-and-forests for clarity inside the new trees category. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Claude knows that I dont use IDE navigation I guess 😆 |
cc @maelle. Not very even but I guess that is hard to achieve with a good categorization anyway 🤷 |
krlmlr
left a comment
There was a problem hiding this comment.
Nice, thanks! Looking forward to it.
| nongraph-spatial: | ||
| - igraph_convex_hull_2d | ||
|
|
||
| status: |
There was a problem hiding this comment.
Done in 8c7fc6b — status: removed as a top-level category, igraph_status moved under progress: as invoking-the-status-handler. R/aaa-status.R deleted; both impls now live in R/aaa-progress.R.
Generated by Claude Code
| - igraph_is_bipartite_coloring | ||
| - igraph_is_edge_coloring | ||
| - igraph_is_perfect | ||
| - igraph_is_vertex_coloring | ||
| - igraph_vertex_coloring_greedy |
There was a problem hiding this comment.
| - igraph_is_bipartite_coloring | |
| - igraph_is_edge_coloring | |
| - igraph_is_perfect | |
| - igraph_is_vertex_coloring | |
| - igraph_vertex_coloring_greedy | |
| detect: | |
| - igraph_is_bipartite_coloring | |
| - igraph_is_edge_coloring | |
| - igraph_is_perfect | |
| - igraph_is_vertex_coloring | |
| compute: | |
| - igraph_vertex_coloring_greedy |
There was a problem hiding this comment.
Applied in 8c7fc6b — coloring now has detect (4 fns) and compute (1 fn) subcategories, and R/aaa-coloring.R has the corresponding banner comments.
Generated by Claude Code
|
This now has conflicts, can you resolve please? |
|
Can we do subcategories for structural and paths? |
… merge status into progress
Previously split-aaa-auto.R and rebuild-cats.R both stop() if the generated wrappers contain a function missing from tools/aaa-categories.yaml, which would break the build whenever a new igraph C function landed upstream. Both now warn instead and put unassigned functions under a top-level uncategorized: bucket (emitted as R/aaa-uncategorized.R by the splitter) so the package keeps building while the warning lists exactly what needs a home.
R/aaa-auto.R into per-category R/aaa-<cat>.R files
|
Done for now, no further action needed. |
|
Thanks for working on this! |
Summary
Stimulus generates a single ~14,800-line
R/aaa-auto.Rcontaining every C igraph wrapper. This PR introduces a categorization layer that splits that monolithic output into 26 per-category files (R/aaa-basicigraph.R,R/aaa-cliques.R, …,R/aaa-visitors.R) so navigating the generated wrappers aligns with how igraph groups functions in its reference manual. Subcategories appear inside each file as banner comments.Why do this? Today a developer grepping for
bfs_impllands in the middle of a 14.8k-line file with no navigational cues; afterwards they land at the top ofR/aaa-visitors.Runder the# ==== breadth-first-search ====banner.The split happens as a post-processing step on the stimulus output; stimulus itself is unchanged (it doesn't support multi-file output natively). A new
tools/aaa-categories.yamlis the single source of truth for which function goes where, and two new tools keep everything reconciled.What's in the diff
tools/aaa-categories.yaml(new)igraph_*C functions. 491 entries across 26 categories, covering everyR_igraph_*symbol.Call()'d in the generated wrappers.tools/rebuild-cats.R(new)tools/split-aaa-auto.R(new)_implwrapper's category, and writes one file per category with subcategory banners. Preserves each wrapper's source byte-for-byte.Makefile-cigraph.build/aaa-auto.R(ignored), and the split script produces the in-repoR/aaa-<cat>.Rfiles. New phony targetr_wrapperscovers the full pipeline.R/aaa-auto.R→R/aaa-<cat>.R× 26.Call()semantics unchanged — it is a purely organizational change..gitignore/.Rbuildignore.build/.The closure-normalization rule
Nine
.Call()targets in the generated wrappers end in_closure(e.g.R_igraph_bfs_closure). These are R-binding helpers defined in src/rcallback.c that wrap an underlying C function with SEXP-callback support — they are not standalone C library functions.rebuild-cats.Rencodes the 9-entry whitelist and maps them back to their semantic names (e.g.igraph_bfs_closure→igraph_bfs) so each wrapper lands where a reader would expect.R_igraph_transitive_closureis not affected — there "closure" is a graph-theory term, not a wrapper suffix.Categorization highlights
The initial YAML layout mirrored igraph's legacy docbook sections. Several cleanups were applied:
undocumentedcategory — all 8 entries moved to real homes:igraph_residual_graph,igraph_reverse_residual_graph→flows/maximum-flowsigraph_hrg_sample_many→hrg/hrg-samplingigraph_has_attribute_table,igraph_finalizer→nongraph/internaligraph_eigen_adjacency→structural/spectral-propertiesigraph_eigen_matrix,igraph_eigen_matrix_symmetric,igraph_solve_lsap→nongraph/linear-algebra(new subcategory)regular-structre-generators→regular-structure-generators;Sparsifiers→sparsifiers;motifs/uncategorized→motifs/graph-census.igraph_transitive_closureandigraph_transitive_closure_dagmoved fromstructural/graph-components→operators/miscellaneous-operators(they produce a derived graph, not component analysis).structural/shortest-path-related-functions(34 entries) →distances-and-metrics(22) +shortest-paths(12).structural/other-operations(11) →matrix-representations(5) +mutual-edges(3) +summary-statistics(3).Developer workflow
After a stimulus upgrade or new igraph C function landing upstream:
The second step fails loudly with the exact names that need adding if
aaa-categories.yamldrifts from the generated wrappers.Validation performed
R/aaa-*.Rfiles parse cleanly._implwrappers distributed across the files, zero duplicates.R_igraph_*symbols preserved (the 491st beingR_igraph_finalizer, which appears in every impl'son.exitbut has no wrapper of its own).tools/rebuild-cats.Rproduces byte-identical output on re-run (idempotent).tools/split-aaa-auto.Rproduces byte-identical output on re-run from the same source (idempotent).Test plan
devtools::load_all(".")succeedsR CMD check/ CI passesmake -f Makefile-cigraph r_wrappersround-trips cleanly on a machine with the stimulus venvigraph_*wrappers broadly, so a green test run is the main check)cc @maelle for review — this is purely an organizational/tooling change; no behavior should change, but the restructuring is substantial so a second pair of eyes on the categorization choices would be welcome.
🤖 Generated with Claude Code