Skip to content

perf(pruning): cache S3 directory listings by parent to reduce API calls#150

Merged
xe-nvdk merged 1 commit intoBasekick-Labs:mainfrom
khalid244:perf/pruning-cache-optimization
Jan 23, 2026
Merged

perf(pruning): cache S3 directory listings by parent to reduce API calls#150
xe-nvdk merged 1 commit intoBasekick-Labs:mainfrom
khalid244:perf/pruning-cache-optimization

Conversation

@khalid244
Copy link
Copy Markdown

Summary

  • Optimized S3 directory existence checks by caching ListDirectories results at parent (day) level
  • Previously each hour directory was checked individually, now all hours within a day share one API call

Results (30-day query)

  • S3 API calls: 752 → 33 (23x reduction)
  • Pruning time: ~38s → ~3s (12x faster)

Test plan

  • Tested with 30-day query on production data
  • Verified same pruning results (73 paths) before and after change

Previously, each hour directory was checked individually with a separate
ListDirectories call, resulting in ~750 S3 API calls for a 30-day query.

Now we cache by parent directory (day level), so all hours within the
same day share a single ListDirectories result.

Results when querying last 30 days:
- S3 API calls: 752 → 33 (23x reduction)
- Pruning time: ~38s → ~3s (12x faster)
@xe-nvdk xe-nvdk added the enhancement New feature or request label Jan 23, 2026
@xe-nvdk
Copy link
Copy Markdown
Member

xe-nvdk commented Jan 23, 2026

Hey @khalid244, thank you for the PR. I will review this PR.

Copy link
Copy Markdown
Member

@xe-nvdk xe-nvdk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: ✅ Approved

Excellent performance optimization! Thoroughly analyzed the implementation.

What this PR does

Caches S3 ListDirectories results at the parent (day) level instead of checking each hour directory individually. For a 30-day query:

  • S3 API calls: 752 → 33 (23x reduction)
  • Pruning time: ~38s → ~3s (12x faster)

Code Quality Assessment

Aspect Status
Correctness ✅ Logic is sound
Error handling ✅ Safe fallback (assumes dir exists on error)
Cache key design ✅ Clean namespace (remote:parent:)
Thread safety ✅ Uses existing mutex-protected globCache
Tests ✅ All 28 pruning tests pass

Minor observations (not blockers)

  1. On cache hit, rebuilds child set from slice - acceptable O(24) overhead
  2. Old cache keys (remote:) expire naturally via TTL

Cost Impact

For S3-heavy workloads, this reduces LIST API costs significantly (~$0.005/1000 calls).

LGTM - merging.

@xe-nvdk xe-nvdk merged commit 58c193a into Basekick-Labs:main Jan 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants