v0.1.29 - CSV Export & Cache Performance Optimization

@quanhua92 quanhua92 released this 18 Nov 09:53
· 751 commits to main since this release

🚀 Performance Improvements

This release optimizes CSV export and cache operations. The all-ticker CSV export (limit=100) now completes in 79-82ms instead of 448ms, about 5.5x faster.

Key Optimizations

1. Cache Lookup with Binary Search (v0.1.28+)

  • Implemented O(log n) binary search for date range filtering
  • Replaced linear O(n) scan with partition_point()
  • Result: Faster cache lookups for date-range queries

2. Representative Ticker Cache Check (v0.1.29) ⭐

  • Issue: The cache check tested all 286 tickers and fell back to disk if any single ticker had fewer records than the requested limit
  • Fix: Check only 3 representative tickers (VNINDEX, VCB, VIC) that always have full history
  • Impact:
    • Before: 448ms (disk read for all 286 tickers)
    • After: 79-82ms (memory cache)
    • About 82% lower latency (5.5x speedup) 🎉

3. CSV Generation Optimization (v0.1.28)

  • Pre-allocate string buffer with exact capacity
  • Avoid unnecessary string reallocations
  • Result: 83% faster CSV string generation
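
A minimal sketch of the pre-allocation idea, not the project's actual code. The Row type, its fields, the header, and the per-row size estimate are all assumptions made for illustration. The release describes an exact capacity, while this sketch reserves an estimated one:

```rust
use std::fmt::Write;

/// Hypothetical row type; the field names are assumptions, not the project's schema.
pub struct Row {
    pub ticker: String,
    pub time: String,
    pub close: f64,
    pub volume: u64,
}

pub const HEADER: &str = "ticker,time,close,volume\n";
/// Rough per-row size in bytes; an estimate chosen for this sketch.
pub const EST_ROW_BYTES: usize = 48;

pub fn rows_to_csv(rows: &[Row]) -> String {
    // Reserve the buffer once so appends rarely trigger a reallocation.
    let mut out = String::with_capacity(HEADER.len() + rows.len() * EST_ROW_BYTES);
    out.push_str(HEADER);
    for r in rows {
        // writeln! appends in place; writing to a String cannot fail.
        let _ = writeln!(out, "{},{},{},{}", r.ticker, r.time, r.close, r.volume);
    }
    out
}
```

Appending with writeln! into one buffer also avoids building a temporary String per row, which format! plus push_str would do.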

4. Background Auto-Reload Cache (v0.1.28)

  • Automatically reload expired cache in background
  • Prevents cache misses during API requests
  • Ensures consistently fast response times
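
One way such a background reload could look, using only the standard library. The Cache layout, the function names, and the TTL and interval parameters are hypothetical, and the real worker in src/commands/serve.rs may be structured differently (for example, on an async runtime):

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical cache: ticker -> closing prices, plus the time it was loaded.
pub struct Cache {
    pub data: HashMap<String, Vec<f64>>,
    pub loaded_at: Instant,
}

pub type SharedCache = Arc<RwLock<Cache>>;

/// Reload the cache if it is older than `ttl`. Returns true if a reload happened.
pub fn reload_if_expired<F>(cache: &SharedCache, ttl: Duration, load: &F) -> bool
where
    F: Fn() -> HashMap<String, Vec<f64>>,
{
    // The read guard is dropped at the end of this statement.
    let expired = cache.read().unwrap().loaded_at.elapsed() >= ttl;
    if expired {
        // Load before taking the write lock so readers wait only for the swap.
        let fresh = load();
        let mut guard = cache.write().unwrap();
        guard.data = fresh;
        guard.loaded_at = Instant::now();
    }
    expired
}

/// Spawn a worker thread that checks the cache every `interval`.
pub fn spawn_reloader<F>(
    cache: SharedCache,
    ttl: Duration,
    interval: Duration,
    load: F,
) -> thread::JoinHandle<()>
where
    F: Fn() -> HashMap<String, Vec<f64>> + Send + 'static,
{
    thread::spawn(move || loop {
        reload_if_expired(&cache, ttl, &load);
        thread::sleep(interval);
    })
}
```

Because the refresh runs on its own thread, request handlers only ever take the read lock and do not pay for the disk load themselves.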

Performance Benchmarks

CSV Export (All Tickers, 1D, limit=100):

  • Local: 79-82ms average
  • Docker: 150-600ms average (with container overhead)

Representative Tickers (VNINDEX, VCB, VIC):

  • Local: 3ms average
  • Docker: 6ms average

Test Coverage:

  • ✅ 16/17 integration tests passed
  • ✅ 13/13 analysis API tests passed
  • ✅ 33/33 aggregated intervals tests passed
  • ✅ 7/7 performance validation tests passed

Testing Tools

Added comprehensive performance test script:

./scripts/debug_perf_csv.sh [url]

Tests 7 scenarios:

  1. All tickers with limit=100 (baseline - cache validation)
  2. Single ticker query (fast path)
  3. All tickers with limit=500 (large dataset)
  4. Representative tickers (VNINDEX, VCB, VIC) - cache check validation
  5. All tickers with limit=50 (smaller limit)
  6. Date range filtering (binary search validation)
  7. No limit (full 2-year cache - 71,500 lines)

📊 Changes Since v0.1.27

Modified Files:

  • src/services/data_store.rs: Cache check logic, binary search, auto-reload
  • src/server/api.rs: Performance logging (DEBUG:PERF)
  • src/commands/serve.rs: Background worker integration
  • scripts/debug_perf_csv.sh: New comprehensive performance test suite
  • scripts/test-integration.sh: Updated cache behavior tests

Performance Commits:

  • 3996675 - Optimize cache lookup with binary search for date filtering
  • 7ad06d0 - Optimize cache check to use representative tickers (92% faster)
  • da0cb02 - Optimize CSV export performance: 83% faster generation
  • 4387e64 - Add background auto-reload for in-memory cache

🔧 Technical Details

Representative Ticker Strategy

The cache insufficiency check now tests only 3 well-established tickers:

  • VNINDEX: Market index (always has full history)
  • VCB: Vietcombank (blue-chip bank, always has data)
  • VIC: Vingroup (large-cap conglomerate, always has data)

This prevents false positives from new listings or delisted stocks while maintaining cache correctness.
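
The check described above could be sketched as follows. The map-of-vectors cache shape and the function name are assumptions for illustration; only the three ticker symbols come from this release:

```rust
use std::collections::HashMap;

/// Tickers named in the release notes as always having full history.
pub const REPRESENTATIVE_TICKERS: [&str; 3] = ["VNINDEX", "VCB", "VIC"];

/// Returns true when the in-memory cache can serve `limit` records per ticker.
/// Only the representative tickers are inspected, so a newly listed ticker
/// with a short history does not force a disk read.
pub fn cache_is_sufficient<T>(cache: &HashMap<String, Vec<T>>, limit: usize) -> bool {
    REPRESENTATIVE_TICKERS.iter().all(|t| {
        cache
            .get(*t)
            .map_or(false, |records| records.len() >= limit)
    })
}
```

A missing representative ticker counts as insufficient, so an empty or partially loaded cache still falls back to disk.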

Binary Search Implementation

Uses Rust's partition_point() for O(log n) date filtering:

let start_idx = data.partition_point(|d| d.time < start_date);
let end_idx = data.partition_point(|d| d.time <= end_date);
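
For context, here is a self-contained sketch of the same two calls wrapped in a function. The Candle type and the integer timestamp are assumptions, not the project's types. The approach relies on the data being sorted ascending by time, since partition_point gives a meaningful index only for partitioned input:

```rust
/// Hypothetical record; `time` is a sortable key such as a Unix timestamp.
#[derive(Debug, Clone, PartialEq)]
pub struct Candle {
    pub time: i64,
    pub close: f64,
}

/// Returns the sub-slice with start <= time <= end.
/// `data` must be sorted ascending by `time`.
pub fn filter_by_range(data: &[Candle], start: i64, end: i64) -> &[Candle] {
    // Index of the first element with time >= start.
    let start_idx = data.partition_point(|d| d.time < start);
    // Index one past the last element with time <= end.
    let end_idx = data.partition_point(|d| d.time <= end);
    // Guard against an inverted range (start > end), which would panic on slicing.
    &data[start_idx..end_idx.max(start_idx)]
}
```

Returning a borrowed sub-slice keeps the lookup allocation-free; the caller copies only if it needs ownership.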

📈 Migration Notes

No breaking changes. All APIs remain backward compatible.

Recommended: Test your integration with the new performance test script:

./scripts/debug_perf_csv.sh https://api.aipriceaction.com

🐛 Bug Fixes

  • Fixed CSV line count test: trim whitespace from wc -l output
  • Fixed cache expiration edge cases with background reload

Full Changelog: v0.1.27...v0.1.29