Background
pyrer.solve is already very fast (~36× rez on the 188-case benchmark on the same machine, see README). But the end-to-end rez env <pkgs> path on a real repository still spends a substantial chunk of wall-clock time on filesystem operations before the solver ever runs — well outside what we can speed up by tuning the solver further.
Profile sketch from rez/src/rezplugins/package_repository/filesystem.py:
| Operation | Where | Per resolve (typical) |
| --- | --- | --- |
| Family enumeration | _get_family_dirs: one os.listdir(root) + os.path.isdir per entry | 1 |
| Version enumeration | _get_version_dirs(family): multiple os.listdir per family, checking for .ignore* / .building* | 1 per family touched |
| Package file probe | _is_valid_package_directory → _get_file: up to 4 sequential os.path.isfile calls per version dir, checking package.py / package.yaml | 1 per version dir |
For a 50-package resolve on a repo with hundreds of families, that adds up to hundreds of syscalls issued from Python on every resolve. Caching only partly helps, because the cache is only as warm as the memcached configuration allows. For cold resolves, this filesystem work dominates the pre-solve wall-clock time.
What this issue covers
A Rust-side directory walker that produces the enumeration output (which families exist, which versions, where the package file lives) much faster than rez's Python loop — without taking on package.py parsing.
Out of scope: see the dedicated section below.
Proposed user-facing API
Low-level: pyrer.scan_repo(paths) -> list[ScannedPackage]
import pyrer

for entry in pyrer.scan_repo(["/sw/pkg", "/sw/site"]):
    entry.family   # str: package family name
    entry.version  # str: version directory name (as on disk)
    entry.format   # "py" | "yaml" | "txt"
    entry.path     # str: absolute path to the package file
Pure data, no package.py evaluation. Callers feed each entry.path through rez's existing loader (rez.serialise.load_py / etc.) to get a real Package object.
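As a rough illustration of consuming this output, the sketch below groups entries by family before any loading happens. Because pyrer.scan_repo is only proposed in this issue, the ScannedPackage dataclass here is a stand-in with the four fields listed above, and group_by_family and the sample entries are hypothetical, not part of pyrer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ScannedPackage:
    """Stand-in for the proposed pyrer.ScannedPackage (assumed shape)."""
    family: str
    version: str
    format: str  # "py" | "yaml" | "txt"
    path: str


def group_by_family(entries):
    """Map family name -> list of version directory names, in scan order."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry.family].append(entry.version)
    return dict(grouped)


# Hypothetical entries, shaped like what scan_repo(["/sw/pkg"]) might return.
entries = [
    ScannedPackage("maya", "2024.0", "py", "/sw/pkg/maya/2024.0/package.py"),
    ScannedPackage("maya", "2023.1", "py", "/sw/pkg/maya/2023.1/package.py"),
    ScannedPackage("nuke", "14.0", "yaml", "/sw/pkg/nuke/14.0/package.yaml"),
]
families = group_by_family(entries)
```

Nothing in this step touches package.py contents, so it would stay cheap even on large repositories.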
High-level: pyrer.solve(requests, *, package_paths=[...])
import pyrer
result = pyrer.solve(["maya-2024", "nuke-14"], package_paths=["/sw/pkg"])
Internally:
- scan_repo(package_paths) (Rust walk)
- For each entry, ask rez to load it: rez.serialise.load_py(entry.path) or equivalent
- Convert each loaded Package via PackageData.from_rez(pkg)
- Hand the list to the existing solver core
- Return a SolveResult exactly as today
Rez stays the loader. Rust is only the walker.
The existing pyrer.solve(requests, packages=[...]) (which takes list[PackageData] directly) stays unchanged — package_paths= is purely additive.
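The scan, load, convert, solve flow above can be sketched as a small pipeline. This is only a shape sketch under stated assumptions: solve_from_paths is a hypothetical name, every stage is injected as a callable, and the stubs stand in for scan_repo, rez's loader, PackageData.from_rez, and the solver core so the example runs without rez or pyrer installed.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Sequence


@dataclass(frozen=True)
class ScannedPackage:
    """Stand-in for the proposed pyrer.ScannedPackage (assumed shape)."""
    family: str
    version: str
    format: str
    path: str


def solve_from_paths(
    requests: Sequence[str],
    package_paths: Sequence[str],
    *,
    scan: Callable[[Sequence[str]], Iterable[ScannedPackage]],
    load: Callable[[ScannedPackage], object],
    convert: Callable[[object], object],
    solve_core: Callable[[Sequence[str], list], object],
):
    """Scan -> load -> convert -> solve, with every stage injected."""
    packages = [convert(load(entry)) for entry in scan(package_paths)]
    return solve_core(requests, packages)


# Stub stages (all hypothetical) so the flow can be exercised in isolation.
def fake_scan(paths):
    return [ScannedPackage("maya", "2024.0", "py", f"{paths[0]}/maya/2024.0/package.py")]


def fake_load(entry):
    # Stands in for rez loading a Package from entry.path.
    return {"name": entry.family, "version": entry.version}


def fake_convert(pkg):
    # Stands in for PackageData.from_rez(pkg).
    return (pkg["name"], pkg["version"])


def fake_solve(requests, packages):
    # Stands in for the existing solver core.
    return {"requests": list(requests), "packages": packages}


result = solve_from_paths(
    ["maya-2024"],
    ["/sw/pkg"],
    scan=fake_scan,
    load=fake_load,
    convert=fake_convert,
    solve_core=fake_solve,
)
```

Injecting the stages keeps the rez dependency at the edge, which is the same separation the issue asks for between the walker and the loader.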
Implementation sketch
Rust crate (new module in rer-resolver or a new sibling crate)
pub struct ScannedPackage {
    pub family: String,
    pub version: String,
    pub format: PackageFormat, // Py | Yaml | Txt
    pub path: PathBuf,
}

pub fn scan_repo(paths: &[PathBuf]) -> Result<Vec<ScannedPackage>, ScanError>;
For each input path:
- std::fs::read_dir(path) to list family directories (one directory read).
- For each family dir:
  - One read_dir to list versions.
  - One read_dir again only if .ignore* / .building* filtering is enabled (or fold this into the version scan for a single pass).
  - For each version dir: a single stat-equivalent check that picks the first existing of package.py / package.yaml / package.txt via DirEntry::file_type, rather than four sequential isfile calls.
- Skip families/versions matching rez's standard ignore patterns (.ignore*, .building*, leading underscore, etc.), matching rez's _is_valid_package_directory semantics exactly.
The Rust walk should match rez's _get_family_dirs / _get_version_dirs / _is_valid_package_directory byte-for-byte in terms of which entries it surfaces, so a pyrer.solve(..., package_paths=...) resolves against exactly the same set of packages as rez env would.
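To make the walk described above concrete, here is a Python model of it using os.scandir, useful mainly as a reference to diff the Rust walker against. The filtering rules are assumptions for illustration only: families starting with "." or "_" are skipped, and a .ignore<version> or .building<version> entry in the family directory hides that version. The real rules would have to be taken from rez's filesystem.py. The name scan_repo_model and the file priority order are likewise hypothetical.

```python
import os
import tempfile

# Candidate package files, in an assumed priority order.
PACKAGE_FILES = (("package.py", "py"), ("package.yaml", "yaml"), ("package.txt", "txt"))


def scan_repo_model(paths):
    """Python model of the proposed walk; returns (family, version, format, path) tuples."""
    found = []
    for root in paths:
        with os.scandir(root) as families:
            family_dirs = sorted(
                (e for e in families if e.is_dir() and not e.name.startswith((".", "_"))),
                key=lambda e: e.name,
            )
        for family in family_dirs:
            # Single pass over the family dir: collect versions and marker entries together.
            versions, hidden = [], set()
            with os.scandir(family.path) as entries:
                for e in entries:
                    for prefix in (".ignore", ".building"):
                        if e.name.startswith(prefix):
                            hidden.add(e.name[len(prefix):])
                    if e.is_dir() and not e.name.startswith("."):
                        versions.append(e)
            for version in sorted(versions, key=lambda e: e.name):
                if version.name in hidden:
                    continue
                # One directory read per version dir instead of several isfile probes.
                with os.scandir(version.path) as files:
                    names = {f.name for f in files if f.is_file()}
                for filename, fmt in PACKAGE_FILES:
                    if filename in names:
                        path = os.path.join(version.path, filename)
                        found.append((family.name, version.name, fmt, path))
                        break
    return found


# Build a throwaway repo layout and scan it.
with tempfile.TemporaryDirectory() as repo:
    for rel in ("maya/2024.0", "maya/2023.1", "nuke/14.0", "_scratch/1.0"):
        os.makedirs(os.path.join(repo, rel))
    open(os.path.join(repo, "maya/2024.0/package.py"), "w").close()
    open(os.path.join(repo, "maya/2023.1/package.py"), "w").close()
    open(os.path.join(repo, "maya/.ignore2023.1"), "w").close()
    open(os.path.join(repo, "nuke/14.0/package.yaml"), "w").close()
    scanned = [(fam, ver, fmt) for fam, ver, fmt, _ in scan_repo_model([repo])]
```

In this layout, maya/2023.1 is hidden by its marker file, _scratch is skipped by the family filter, and nuke/14.0 resolves to the YAML format.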
PyO3 binding (in rer-python)
- #[pyclass] ScannedPackage mirroring the Rust struct.
- #[pyfunction] scan_repo(paths: Vec<PathBuf>) -> PyResult<Vec<ScannedPackage>>.
- solve(...) gains a package_paths: Option<Vec<PathBuf>> keyword that triggers the walk + rez-loader path (Python-side glue, since the loader call lives in Python).
Python shim layer (in rer-python's wheel)
A small Python file shipped alongside the cdylib that:
- Imports rez lazily (guarded try: import rez; ... except ImportError: raise RuntimeError("package_paths= requires rez to be installed")).
- Wraps solve(*, package_paths=...): calls scan_repo, iterates entries, calls rez.serialise.load_<format>(entry.path), converts via PackageData.from_rez, delegates to the Rust solver.
- Is the only place where pyrer touches rez at all, which keeps the Rust core rez-free.
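The lazy-import guard from the first bullet might look roughly like the sketch below. The helper name require_module is hypothetical. It is parameterised on the module name so the failure path can be exercised without uninstalling rez, and the demo uses the stdlib json module plus a deliberately missing module name.

```python
import importlib


def require_module(name, feature):
    """Import `name` on first use, raising a clear error when it is missing."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise RuntimeError(f"{feature} requires {name} to be installed") from exc


# Success path: a stdlib module that is always importable.
json_mod = require_module("json", "demo")

# Failure path: a module name assumed not to exist anywhere.
try:
    require_module("no_such_module_for_demo", "package_paths=")
except RuntimeError as err:
    message = str(err)
```

In the shim, the call would be require_module("rez", "package_paths="), made inside the package_paths branch so that scan_repo and the packages= path never import rez.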
Acceptance criteria
- pyrer.scan_repo(paths) produces the same set of (family, version, format, path) tuples that rez's FileSystemPackageRepository would, on a representative repo. Validated by a diff test against rez's own enumeration.
- pyrer.solve(requests, package_paths=...) produces the same resolution as pyrer.solve(requests, packages=[...]) built from rez's existing iter_package_families for the same paths.
- pyrer.solve(reqs, package_paths=...) is measurably faster than the same path through rez's enumeration. Target: 50%+ reduction in pre-solve wall-clock on cold runs.
- pyrer.scan_repo alone works without rez installed; solve(..., package_paths=...) raises a clear error if rez is missing.
- Update rez-integration.md to show the new path-based solve(...) as the recommended integration shape.
Out of scope: B-deep
This issue is explicitly not about porting package.py evaluation to Rust.
rez's package.py files in production routinely contain arbitrary Python:
- non-literal expressions (requires = ["python-" + sys.platform]),
- @early() decorators (build-time-evaluated),
- @late() decorators (resolve-context-aware),
- platform-conditional variants,
- runtime imports, helper functions defined in the file body.
A Rust literal-AST parser cannot evaluate any of that, and shipping a real CPython embedding in the cdylib is a massive scope expansion. rez's exec()-based loader stays on the Python side. B-medium captures the FS-walk win without taking that on.
Related
- The rez-integration.md documentation guide already describes B-shallow (caller writes the walk loop manually).
- PackageData.from_rez(pkg) (shipped in feat(python): add PackageData.from_rez(pkg) convenience #79) is the per-package conversion this issue would use.
- rez env cold paths: discussed in the docs/engineering notes context.