Split large Rust source files into one-item-per-file submodules — preserving comments and the public API, with the compiler as the safety net.
```sh
cargo install cargo-split-modules
cargo split-modules src/big.rs          # split one file
cargo split-modules --recursive src     # split every oversized file in a crate
cargo split-modules -n src/big.rs       # dry run: show what would happen
```

It's published on crates.io:

```sh
cargo install cargo-split-modules
```

This installs a cargo subcommand, so you invoke it as `cargo split-modules …` (or call the `cargo-split-modules` binary directly). It needs a Rust toolchain with `cargo` on `PATH`; `rustfmt` is used if present but optional.
Install the agent skill with `npx skills` (works with Claude Code, Codex, Cursor, OpenCode, and others):

```sh
npx skills add zenide/rust-split-modules
```

This drops a `SKILL.md` into your agent's skills directory telling the agent when and how to use the tool (and that it's safe to run because every change is compiler-verified and rolled back on failure). The agent still needs the binary on `PATH` (`cargo install cargo-split-modules`), which the skill instructs it to install. Once installed, an agent can run:

```sh
cargo split-modules --recursive src   # safe: verified + auto-rollback
```

Turn this:

```
src/parser.rs          # 1500 lines: 12 structs, 30 fns, 20 impls
```

into this:

```
src/parser.rs          # module index: `mod` decls + `pub use` re-exports
src/parser/
    token.rs           # struct Token + its impls
    lexer.rs           # struct Lexer + its impls
    parse_expr.rs      # fn parse_expr
    ...
```
…and your crate still compiles and passes its tests, unchanged.
Large, monolithic files are a tax on humans and an outright hazard for AI coding agents. Atomicity and modular structure stop being style preferences and start being a correctness and throughput concern. One item per file gives you:
- Parallel edits without merge conflicts. When several agents (or several teammates) work a codebase at once, two changes to two functions that live in the same 1,500-line file collide; the same two changes to two separate files don't. Small files turn "serialize everything through one hot file" into "edit independently, merge cleanly." For fleets of agents working concurrently, this is the difference between scaling out and constantly stepping on each other.
- Atomic, low-blast-radius replacements. An agent rewriting a whole file has to reproduce everything it isn't changing, and any slip corrupts unrelated code. When a function owns its own file, "replace this function" is "replace this file": the unit of change matches the unit of meaning, so a full-file rewrite touches exactly one item and nothing else. Smaller files also mean smaller diffs and smaller, cheaper context windows per edit.
- The filesystem is the search index. `src/parser/parse_expr.rs` tells you where `parse_expr` lives without parsing a single token. Listing files is a free, always-current symbol index: no AST tooling, no language server, no `ctags`, no semantic database to build or keep in sync. `find`, `ls`, and a fuzzy file-opener get you to any definition directly, and an agent can locate code with a cheap directory read instead of an expensive whole-file scan.
- Searchability and locality. Grepping a name surfaces its definition file by path, not buried at line 1,142 of a grab-bag module. Reading one item means opening one short file instead of loading a giant one and scrolling to the relevant region; that means less noise for a reviewer and far less irrelevant context for a model.
The catch has always been that splitting files by hand is tedious and error-prone — exactly the kind of mechanical refactor that breaks imports and visibility. This tool does it mechanically and proves it didn't break anything (see Why it's safe), so you get the structure without the risk.
Most "move code around" tools risk breaking your build. This one is built so it cannot leave your project in a broken state:
- The public API is preserved by construction. Each item moves into a child file, and the parent module re-exports it at its original visibility (`pub use child::Foo;`, `pub(crate) use …`, private `use …`). Every path anywhere in your project that referenced `crate::parser::Token` still resolves; no call sites are rewritten.
- Children see everything via `use super::*;`. All the original `use` imports stay in the parent, and child modules glob-import them along with their siblings. No import analysis, no guessing.
- Member visibility is widened safely. Moving a struct deeper would hide its private fields from sibling modules that relied on the old nesting, so private members are widened to `pub(crate)`, a superset of any in-crate audience, which can never break compiling code and never changes the external API.
- Module-relative paths are rewritten with scope awareness. `super::X` → `super::super::X` and `self::X` → `super::X`, but only at the item's own module depth (paths inside nested `mod {}` blocks are left alone).
- The compiler verifies every split. After writing files, `cargo split-modules` runs `cargo check`. If anything fails to compile, it rolls back the entire split, restoring the original file byte-for-byte and removing generated files. You either get a working split or no change at all.
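To make the first four points concrete, here is a minimal single-file sketch of what a split `parser` module could look like. It is an illustration under assumptions, not the tool's literal output: inline `mod` blocks stand in for the generated files, `demo_crate` stands in for the crate root, and `MAX_DEPTH`, the field names, and the method bodies are invented for the example.

```rust
// `demo_crate` stands in for the crate root so the example fits in one file.
mod demo_crate {
    pub const MAX_DEPTH: usize = 8; // hypothetical item living in the crate root

    // Stand-in for src/parser.rs after the split: imports, child modules, re-exports.
    pub mod parser {
        // Original imports stay in the parent; children reach them via `use super::*`.
        use std::fmt;

        // Stand-in for src/parser/token.rs.
        mod token {
            use super::*; // parent's imports plus sibling re-exports

            pub struct Token {
                pub(crate) text: String, // private before the split, widened to pub(crate)
            }

            impl fmt::Display for Token {
                fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                    write!(f, "<{}>", self.text)
                }
            }
        }
        pub use token::Token; // re-exported at its original visibility

        // Stand-in for src/parser/lexer.rs.
        mod lexer {
            use super::*;

            pub struct Lexer {
                pub(crate) depth_limit: usize,
            }

            impl Lexer {
                pub fn new() -> Self {
                    // Would have been `super::MAX_DEPTH` in the unsplit file;
                    // the item now sits one module deeper.
                    Lexer { depth_limit: super::super::MAX_DEPTH }
                }

                pub fn lex(&self, src: &str) -> Vec<Token> {
                    // Builds `Token` from a sibling module through the widened field.
                    src.split_whitespace()
                        .take(self.depth_limit)
                        .map(|w| Token { text: w.to_string() })
                        .collect()
                }
            }
        }
        pub use lexer::Lexer;
    }
}

fn main() {
    // Callers keep using the original paths; nothing outside `parser` changes.
    use demo_crate::parser::{Lexer, Token};
    let tokens: Vec<Token> = Lexer::new().lex("let x = 1");
    for t in &tokens {
        println!("{t}");
    }
}
```

Note that `lexer` can only construct a `Token` because the `text` field was widened; with a private field the sibling module would no longer compile once the two types live in separate modules.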
This has been validated by splitting real crates end to end and confirming their full test suites still pass — see Validation below.
Each crate below was cloned, split recursively (--recursive src), and had its own test
suite run before and after. In every case the test counts are identical — behaviour is
preserved. The few files that couldn't be split safely were rolled back automatically and
left untouched.
| crate | files before → after | avg LOC/file before → after | files split | rolled back | tests before → after |
|---|---|---|---|---|---|
| semver | 8 → 65 | 264 → 36 | 8 | 0 | 38 → 38 ✅ |
| bytes | 19 → 129 | 518 → 79 | 9 | 0 | 1303 → 1303 ✅ |
| anyhow | 12 → 64 | 326 → 64 | 9 | 1 | 96 → 96 ✅ |
| httparse | 9 → 65 | 457 → 67 | 5 | 2 | 368 → 368 ✅ |
| base64 | 21 → 198 | 340 → 39 | 16 | 0 | 222 → 222 ✅ |
| memchr | 45 → 223 | 350 → 74 | 28 | 0 | 136 → 136 ✅ |
| bitflags | 44 → 128 | 133 → 49 | 31 | 0 | 74 → 74 ✅ |
| heck | 9 → 43 | 96 → 23 | 9 | 0 | 128 → 128 ✅ |
| total | 167 → 915 | 297 → 54 | 115 | 3 | 2365 → 2365 ✅ |
Across ~50k lines of third-party code, 115 files were split and not one test changed its result — the 3 unsplittable files were safely rolled back.
Reproduce this table yourself with scripts/bench-real-crates.sh
(needs cargo, git, and network access).
- Doc-comments (`///`, `//!`) and `#[derive]`/attribute lines: they're part of each item's span and move with it.
- Plain `//` comments directly above an item, and trailing same-line comments.
- `#[cfg(...)]` attributes, replicated onto the generated re-export.
- Generics, `unsafe`, `async`, lifetimes, `where` clauses: the item's source text is sliced verbatim, never reformatted away.
One file per item, named after it (snake_case):
| Item | Goes to |
|---|---|
| `struct` / `enum` / `union` / `type` / `trait` | `name.rs` |
| free `fn` | `name.rs` |
| `const` / `static` | `name.rs` |
| `impl Foo` / `impl Trait for Foo` | co-located in `foo.rs` (with `Foo`) |
A same-named `const`, type alias, and `struct` merge into one file. `impl` blocks for an external/complex self type land in `impls.rs`.

Things that stay in the parent: `use`, `mod`, `extern crate`, `macro_rules!`, anonymous (`const _`) and `_`-prefixed side-effect items.
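The placement rules above can be sketched as a small lookup. This is only a model of the rules as described here, not the tool's code: the `Item` enum, `destination`, and the simple `to_snake_case` helper are hypothetical, and the real name conversion may treat acronyms and other edge cases differently.

```rust
/// Top-level items, reduced to what the placement rules need (hypothetical model).
enum Item<'a> {
    /// struct / enum / union / type / trait / fn / const / static.
    Named(&'a str),
    /// `impl Foo` or `impl Trait for Foo`, where `Foo` is the self type.
    ImplFor(&'a str),
    /// `impl` whose self type is external or too complex to name.
    ImplOther,
    /// use, mod, extern crate, macro_rules!, `const _`, ...
    StaysInParent,
}

/// Simple CamelCase / SCREAMING_SNAKE to snake_case conversion.
/// An underscore is inserted only where an uppercase letter follows
/// a lowercase letter or a digit.
fn to_snake_case(name: &str) -> String {
    let mut out = String::with_capacity(name.len() + 4);
    let mut prev: Option<char> = None;
    for ch in name.chars() {
        let boundary = ch.is_uppercase()
            && prev.map_or(false, |p| p.is_lowercase() || p.is_ascii_digit());
        if boundary {
            out.push('_');
        }
        out.extend(ch.to_lowercase());
        prev = Some(ch);
    }
    out
}

/// Returns the child file an item would move to, or `None` if it stays in the parent.
fn destination(item: &Item) -> Option<String> {
    match item {
        // An impl is co-located with its self type, so both map to the same file.
        Item::Named(name) | Item::ImplFor(name) => {
            Some(format!("{}.rs", to_snake_case(name)))
        }
        Item::ImplOther => Some("impls.rs".to_string()),
        Item::StaysInParent => None,
    }
}

fn main() {
    let items = [
        Item::Named("Token"),
        Item::ImplFor("Token"),
        Item::Named("parse_expr"),
        Item::Named("MAX_DEPTH"),
        Item::ImplOther,
        Item::StaysInParent,
    ];
    for item in &items {
        match destination(item) {
            Some(file) => println!("-> {file}"),
            None => println!("-> stays in parent"),
        }
    }
}
```

Because the file name depends only on the item's name, a same-named `const`, type alias, and `struct` naturally end up in the same file, which matches the merge rule described above.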
- `foo.rs` → a sibling `foo/` directory is created and `foo.rs` becomes the module index.
- `lib.rs` / `main.rs` / `mod.rs` → generated files go in the same directory (these already own a directory module).
```
cargo split-modules <PATH> [OPTIONS]

PATH                A .rs file to split, or a directory/crate to process recursively.
-r, --recursive     Process a directory recursively (implied when PATH is a directory).
                    Splits every file that would yield 2+ module files.
-n, --dry-run       Show what would happen without writing anything.
    --no-verify     Skip the cargo check + rollback safety step (faster, not advised).
    --no-fmt        Don't run rustfmt on generated files.
    --min-groups N  Minimum number of resulting module files for a split (default 2).
```
A file is safely skipped (rolled back, never broken) when its split would not compile. In practice this means paths hidden inside macro token streams, such as `some_macro!(super::X)`, or other constructs the AST can't see. You lose nothing: the file is left exactly as it was, and the tool tells you which files it skipped.
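As an illustration of that macro case, the sketch below shows the same module-relative path written twice: once as ordinary code and once as an argument to a macro. The `call!` macro, `helper`, and the module names are made up for the example. A syntax-tree-based rewrite can find and adjust the first occurrence, while the second is just a run of tokens until the macro expands.

```rust
mod demo {
    fn helper() -> u32 {
        7
    }

    // Forwards its argument tokens untouched and calls the result.
    macro_rules! call {
        ($($path:tt)*) => { $($path)*() };
    }

    pub mod inner {
        pub fn direct() -> u32 {
            // An ordinary path expression: visible in the syntax tree,
            // so it could be rewritten if this item moved one level deeper.
            super::helper()
        }

        pub fn via_macro() -> u32 {
            // The same path, but the parser only sees opaque tokens here.
            // Left unrewritten after a move, it would point at the wrong
            // module and the build would fail.
            call!(super::helper)
        }
    }
}

fn main() {
    assert_eq!(demo::inner::direct(), 7);
    assert_eq!(demo::inner::via_macro(), 7);
}
```

In the unsplit layout both functions compile and return the same value. The failure only appears after a move, which is the kind of breakage the `cargo check` step is described as catching before it rolls the file back.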
Licensed under either of Apache-2.0 or MIT at your option.
