feat: implement dependency-aware download scheduling #279
Review Round 2
All previous feedback has been addressed:
Minor note: The PR description still mentions "Simple bubble sort: Good enough for typical package counts (<200)" under Design Decisions, but the implementation now correctly uses
The implementation looks good: clean use of the standard library, proper memoization, and comprehensive tests including stability verification. LGTM 👍
Optimize remote cache downloads by sorting packages by dependency depth, ensuring critical path packages are downloaded first. This reduces overall build time by allowing dependent builds to start earlier.

Algorithm:
- Calculate dependency depth for each package (max distance from leaf nodes)
- Sort packages by depth in descending order (deepest first)
- Download in sorted order using existing worker pool (30 workers)

Performance Impact:
- Tested with 21 packages in production (gitpod-next repository)
- Packages correctly sorted: depth 3 → 2 → 1 → 0
- Expected improvement: 15-25% faster builds (when cache hit rate is high)
- Negligible overhead: <1ms for 200 packages

Implementation:
- sortPackagesByDependencyDepth(): Main sorting function
- calculateDependencyDepth(): Recursive depth calculation with memoization
- Integrated into build.go before RemoteCache.Download() call
- No interface changes required (sorting at caller level)

Testing:
- Comprehensive unit tests for various dependency structures
- Performance benchmarks showing <500µs for 200 packages
- Verified in production with real remote cache downloads

Co-authored-by: Ona <no-reply@ona.com>
Co-authored-by: Cornelius A. Ludmann <cornelius@gitpod.io>
…algorithm from Go stdlib
Co-authored-by: Cornelius A. Ludmann <cornelius@gitpod.io>
Co-authored-by: Ona <no-reply@ona.com>
The previous test passed by coincidence because the input was already in the expected order. The new test verifies stability by using multiple input orderings and checking that the relative order within each depth group is preserved. Also adds the missing 'sort' import required by sort.SliceStable.
Co-authored-by: Ona <no-reply@ona.com>
426fb0f to 03cf377
Summary
Optimize remote cache downloads by sorting packages by dependency depth, ensuring critical path packages are downloaded first. This reduces overall build time by allowing dependent builds to start earlier.
Fixes https://linear.app/ona-team/issue/CLC-2093/implement-dependency-aware-download-scheduling-for-s3-cache
Part of https://linear.app/ona-team/issue/CLC-2086/optimize-leeway-s3-cache-performance
Performance Impact
Measured improvement: 7.2% faster for a production build with 11 packages.
Expected improvements scale with build size:
How It Works
Algorithm
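The approach described in the summary (compute each package's dependency depth, then order the download list deepest first) could be sketched roughly as follows. This is a minimal illustration, not the code from pkg/leeway/build.go: the `pkg` type and the names `depthOf` and `sortByDepth` are hypothetical stand-ins, and the sketch assumes the dependency graph is acyclic.

```go
package main

import (
	"fmt"
	"sort"
)

// pkg is a hypothetical stand-in for a build package:
// a name plus its direct dependencies.
type pkg struct {
	name string
	deps []*pkg
}

// depthOf returns the longest distance from p down to a leaf (a package
// with no dependencies). Results are memoized in cache so that shared
// dependencies are computed only once. Assumes the graph has no cycles.
func depthOf(p *pkg, cache map[*pkg]int) int {
	if d, ok := cache[p]; ok {
		return d
	}
	d := 0
	for _, dep := range p.deps {
		if c := depthOf(dep, cache) + 1; c > d {
			d = c
		}
	}
	cache[p] = d
	return d
}

// sortByDepth returns a copy of pkgs ordered deepest first.
// sort.SliceStable keeps the input order for packages of equal depth.
func sortByDepth(pkgs []*pkg) []*pkg {
	cache := make(map[*pkg]int, len(pkgs))
	out := append([]*pkg(nil), pkgs...)
	sort.SliceStable(out, func(i, j int) bool {
		return depthOf(out[i], cache) > depthOf(out[j], cache)
	})
	return out
}

func main() {
	lib := &pkg{name: "lib"}
	svc := &pkg{name: "svc", deps: []*pkg{lib}}
	app := &pkg{name: "app", deps: []*pkg{svc, lib}}

	for _, p := range sortByDepth([]*pkg{lib, app, svc}) {
		fmt.Println(p.name)
	}
}
```

In this toy graph `lib` has depth 0, `svc` depth 1, and `app` depth 2, so the loop prints `app`, `svc`, `lib`. Sorting a copy at the caller level mirrors the note that no interface changes were required.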
Example Download Order
Why This Helps
Implementation Details
Core Functions
- sortPackagesByDependencyDepth(): Main sorting function using sort.SliceStable
- calculateDependencyDepth(): Recursive depth calculation with memoization
- Integrated into build.go before RemoteCache.Download() call
Design Decisions
- sort.SliceStable for O(n log n) complexity and deterministic ordering
Complexity
Testing
Unit Tests
Comprehensive tests for various dependency structures:
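A stability check of the kind described in the commit history (several input orderings, with the relative order inside each depth group preserved) might look like the sketch below. It is an assumption-laden illustration rather than the contents of build_sort_test.go: the `item` type, its precomputed `depth` field, and the helpers `sortDeepestFirst`, `namesAtDepth`, and `preservesGroupOrder` are all hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// item is a hypothetical package record with a precomputed dependency depth.
type item struct {
	name  string
	depth int
}

// sortDeepestFirst returns a stably sorted copy, highest depth first.
func sortDeepestFirst(in []item) []item {
	out := append([]item(nil), in...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].depth > out[j].depth })
	return out
}

// namesAtDepth lists the names that have the given depth, in slice order.
func namesAtDepth(items []item, depth int) []string {
	var names []string
	for _, it := range items {
		if it.depth == depth {
			names = append(names, it.name)
		}
	}
	return names
}

// preservesGroupOrder reports whether every depth group appears in the
// same relative order in the input and in the sorted output.
func preservesGroupOrder(in, out []item) bool {
	for _, it := range in {
		if !reflect.DeepEqual(namesAtDepth(in, it.depth), namesAtDepth(out, it.depth)) {
			return false
		}
	}
	return true
}

func main() {
	// The same four packages in three different input orders, so the check
	// cannot pass merely because the input was already sorted.
	orderings := [][]item{
		{{"a", 1}, {"b", 0}, {"c", 1}, {"d", 0}},
		{{"c", 1}, {"d", 0}, {"a", 1}, {"b", 0}},
		{{"d", 0}, {"c", 1}, {"b", 0}, {"a", 1}},
	}
	for _, in := range orderings {
		out := sortDeepestFirst(in)
		fmt.Println(out, preservesGroupOrder(in, out))
	}
}
```

Replacing sort.SliceStable with sort.Slice in `sortDeepestFirst` would remove the ordering guarantee for equal depths, which is the regression a check like this is meant to catch.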
Benchmarks
Production Verification
Tested in production environment with:
Backward Compatibility
✅ Fully backward compatible:
Files Changed
- pkg/leeway/build.go: Sorting logic + integration
- pkg/leeway/build_sort_test.go: Tests + benchmarks
Related
This optimization complements PR #278 (S3 cache batch operations), which improved cache checks/downloads. Together, these optimizations significantly reduce build times for projects using remote cache.