Improve precompilation: extend workload coverage #146

Merged
ChrisRackauckas merged 1 commit into SciML:master from ChrisRackauckas-Claude:precompile-improvements-20251229-090148
Dec 29, 2025

@ChrisRackauckas-Claude
Contributor

Summary

This PR reduces first-call latency (time to first X, TTFX) after the package is loaded by extending the precompilation workload to cover more commonly used code paths.

Changes

  • Main module: Added precompilation for:

    • DiffCache with matrices (not just vectors)
    • LazyBufferCache operations
    • GeneralLazyBufferCache operations
  • ForwardDiff extension: Added precompilation for:

    • DiffCache with ForwardDiff.Dual numbers
    • FixedSizeDiffCache creation and usage with Dual numbers
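
As a rough illustration of what extending the workload for the main module can look like, here is a minimal sketch. It assumes the workload is written with PrecompileTools, and it is not the code merged in this PR: the helper name `exercise_caches`, the 4×4 input, and the `n -> zeros(n)` generator are hypothetical choices made for the example.

```julia
using PrecompileTools: @setup_workload, @compile_workload
using PreallocationTools

# Hypothetical helper (not from the PR): touches each cache type once so the
# corresponding methods get compiled.
function exercise_caches(u)
    dc = DiffCache(u)                 # DiffCache built from a matrix
    tmp = get_tmp(dc, u)              # non-dual path returns the plain buffer

    lbc = LazyBufferCache()           # default size map is `identity`
    buf = lbc[u]                      # buffer allocated lazily on first lookup

    glbc = GeneralLazyBufferCache(n -> zeros(n))  # assumed generator function
    gen = glbc[length(u)]             # buffer produced by the generator

    return size(tmp), size(buf), length(gen)
end

# Inside a package module, these macros run the body during precompilation so
# the compiled code is cached. Outside precompilation the body is skipped.
@setup_workload begin
    u_mat = zeros(4, 4)
    @compile_workload begin
        exercise_caches(u_mat)
    end
end
```

Calling `exercise_caches(zeros(4, 4))` directly runs the same code paths in an ordinary session, which is a convenient way to check that the workload itself does not error.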

Performance Improvements (TTFX - Time to First X)

| Operation | Before | After | Improvement |
|---|---|---|---|
| DiffCache (Dual) | 174ms | ~1ms | 99% |
| FixedSizeDiffCache (create) | 229ms | ~0ms | 99%+ |
| LazyBufferCache | 99ms | ~0ms | 99%+ |
| GeneralLazyBufferCache | 69ms | ~1ms | 98% |
| DiffCache (matrix) | 89ms | ~11ms | 88% |
| Total TTFX | 761ms | ~15ms | 98% |
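
For readers who want to take a comparable measurement, one simple approach is to time the first and second call of an operation in a fresh Julia session. The sketch below shows that idea for the matrix `DiffCache` case. It is an assumption about how such numbers can be obtained, not the script used for the figures above, and the 4×4 input is an arbitrary choice.

```julia
using PreallocationTools

u = zeros(4, 4)

# First call: includes any compilation not covered by the precompile cache.
t_first = @elapsed get_tmp(DiffCache(u), u)

# Second call: already compiled, so this approximates the steady-state cost.
t_second = @elapsed get_tmp(DiffCache(u), u)

println("first call:  ", round(1000 * t_first; digits = 2), " ms")
println("second call: ", round(1000 * t_second; digits = 2), " ms")
```

The first-call figure is only meaningful in a session where the operation has not run yet, so each operation should be measured right after `using PreallocationTools` in a new process.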

Testing

  • All existing tests pass
  • No invalidations introduced by PreallocationTools itself (only expected invalidations from ForwardDiff loading)

Analysis Method

Used SnoopCompile to:

  1. Check for invalidations - found only expected ones from ForwardDiff
  2. Profile inference timing to identify high-cost operations
  3. Add precompilation for the top time-consuming operations
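
The first two steps can be sketched as follows. This assumes the SnoopCompile v3 macro names (`@snoop_invalidations`, `@snoop_inference`); older releases used `@snoopr` and `@snoopi_deep` instead. The profiled call is an illustrative stand-in, not the exact set of operations analyzed for this PR.

```julia
using SnoopCompileCore

# Step 1: record invalidations triggered while loading the packages.
invs = @snoop_invalidations using PreallocationTools, ForwardDiff

# Step 2: record inference timing for a representative first call.
tinf = @snoop_inference begin
    u = zeros(4, 4)
    get_tmp(DiffCache(u), u)
end

# Load the analysis code only after measuring, so it does not affect the results.
using SnoopCompile

trees = invalidation_trees(invs)   # invalidations grouped by the method that caused them
timings = flatten(tinf)            # per-method inference timings

println(length(trees), " invalidation tree(s)")
println(length(timings), " inference timing entries")
```

Operations that dominate `timings` are natural candidates for step 3, adding them to the precompilation workload.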

cc @ChrisRackauckas

🤖 Generated with Claude Code

Expanded precompilation workload in both the main module and ForwardDiff
extension to reduce time-to-first-X (TTFX) for common operations.

Changes:
- Main module: Added precompilation for DiffCache with matrices,
  LazyBufferCache, and GeneralLazyBufferCache
- ForwardDiff extension: Added precompilation for DiffCache and
  FixedSizeDiffCache with ForwardDiff.Dual numbers

Improvements measured (first call after loading):
- DiffCache TTFX (Dual): 174ms -> ~1ms (99% reduction)
- FixedSizeDiffCache TTFX: 229ms -> ~0ms (99%+ reduction)
- LazyBufferCache TTFX: 99ms -> ~0ms (99%+ reduction)
- GeneralLazyBufferCache TTFX: 69ms -> ~1ms (98% reduction)
- DiffCache TTFX (matrix): 89ms -> ~11ms (88% reduction)
- Total TTFX: 761ms -> ~15ms (98% reduction)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@ChrisRackauckas ChrisRackauckas merged commit 1144fd6 into SciML:master Dec 29, 2025
12 of 16 checks passed
ChrisRackauckas-Claude pushed a commit to ChrisRackauckas-Claude/PreallocationTools.jl that referenced this pull request Dec 29, 2025
This release includes the following changes since v0.4.34:
- Improve explicit imports hygiene (SciML#147)
- Improve precompilation: extend workload coverage (SciML#146)
- Migrate to Dependabot (SciML#144)
- Bump actions/checkout from 4 to 6 (SciML#143)

Fixes SciML#148

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>