Further profiling and optimisation #636 #647

Merged
Chemaclass merged 8 commits into TypedDevs:main from objctp:feat/636-optimise-coverage-report-performance
Apr 30, 2026

@objctp
Contributor

@objctp objctp commented Apr 30, 2026

Background

Related #636

PRs #643 and #644 addressed the is_executable_line consolidation and compute_file_coverage single-pass. This PR adds further optimisations on top: eliminating remaining sed/grep subshells, pre-loading source files into arrays, caching stats across reports, and using pure Bash 3.0 string operations in extract_functions (replacing [[ =~ ]]).
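
As an illustration of what replacing [[ =~ ]] with pure string operations can look like, here is a minimal sketch that pulls a function name out of a declaration line using only parameter expansion and case. The helper name function_name_from_line and the set of accepted declaration styles are assumptions made for this example; this is not the extract_functions code from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not bashunit's code): print the function name declared
# on a line, or return 1 if the line does not look like a declaration.
# Uses only parameter expansion and case, so no BASH_REMATCH is needed.
function_name_from_line() {
  local line="$1" name

  # Trim leading whitespace.
  line="${line#"${line%%[![:space:]]*}"}"

  case "$line" in
    "function "*)
      name="${line#"function "}"
      name="${name#"${name%%[![:space:]]*}"}"   # extra spaces after the keyword
      name="${name%%"("*}"                      # cut at "(" if present
      name="${name%%[[:space:]]*}"              # cut at the first blank
      ;;
    *"()"*)
      name="${line%%"("*}"                      # text before the first "("
      name="${name%"${name##*[![:space:]]}"}"   # trim trailing whitespace
      ;;
    *)
      return 1
      ;;
  esac

  # Reject empty names and anything containing unexpected characters,
  # e.g. a line such as: echo "hello()"
  case "$name" in
    "" | *[!A-Za-z0-9_:.-]*) return 1 ;;
  esac

  printf '%s\n' "$name"
}
```

For example, function_name_from_line 'bashunit::coverage::init() {' prints bashunit::coverage::init, while a comment line or an echo containing "()" returns 1 and prints nothing.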

Changes

  • Eliminate sed -n per-line spawns: read source files once into indexed arrays in get_hit_lines and generate_file_html, replacing per-line sed calls with array lookups
  • Cache file stats across reports: precompute_file_stats computes executable/hit/pct once via compute_file_coverage, shared by report_text, report_html, and get_percentage; O(1) lookup with string-based cache
  • Bash 3.0 compatible extract_functions: pure string operations replace [[ =~ ]] + BASH_REMATCH; printf '%s' replaces echo in is_executable_line; ((++var)) convention from #619 (Fix post-increment causing silent exit under set -e)
  • Pre-load file arrays in hot paths: compute_file_coverage, report_lcov, and get_function_coverage now read files once and iterate with for instead of while IFS= read
  • Pure Bash fast-path in is_executable_line: case statements handle comments, braces, control keywords, and standalone ) without spawning grep; falls through to grep for complex patterns only
  • Added tests for precompute_file_stats and get_cached_stats
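
The array pre-load and the case fast-path from the list above can be sketched together as one hot loop. All names below (load_source_lines, looks_executable, count_executable_lines, SRC_LINES) are hypothetical, and the classification rules are a simplified assumption: the real is_executable_line handles more patterns and falls through to grep for complex ones.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical names, not bashunit's implementation.

# Read a file once into the global array SRC_LINES (1-based, so SRC_LINES[n]
# is line n). A while-read loop is used because mapfile needs Bash 4.
load_source_lines() {
  local file="$1" line n=0
  SRC_LINES=()
  while IFS= read -r line || [ -n "$line" ]; do
    # Pre-increment: ((n++)) would return status 1 when n is 0,
    # which aborts the script under set -e.
    ((++n))
    SRC_LINES[n]="$line"
  done < "$file"
  SRC_LINE_COUNT=$n
}

# Return 0 if a line looks executable, 1 otherwise. Only the cheap cases are
# decided here, without spawning grep or sed.
looks_executable() {
  local line="$1"
  line="${line#"${line%%[![:space:]]*}"}"   # trim leading whitespace
  line="${line%"${line##*[![:space:]]}"}"   # trim trailing whitespace
  case "$line" in
    "" | "#"*) return 1 ;;                                      # blank or comment
    "{" | "}" | ")") return 1 ;;                                # lone brace or paren
    then | else | fi | do | done | esac | ";;") return 1 ;;     # control keywords
  esac
  return 0
}

# Count executable-looking lines with array lookups instead of one
# sed -n "${n}p" process per line.
count_executable_lines() {
  local file="$1" i count=0
  load_source_lines "$file"
  for ((i = 1; i <= SRC_LINE_COUNT; i++)); do
    if looks_executable "${SRC_LINES[i]}"; then
      ((++count))
    fi
  done
  printf '%s\n' "$count"
}
```

After load_source_lines has run, any later lookup of line n is just "${SRC_LINES[n]}", which is where the saving over per-line sed calls would come from.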

Per-Component Breakdown

Note: BEFORE = original main before any optimisation PRs (#643, #644, and this PR). AFTER = this branch, which includes all three PRs' changes. The per-component and hyperfine numbers reflect the cumulative improvement.

Test workload: 15 source files, ~1200 hit entries, full report generation (text + lcov + html)

Component                         BEFORE        AFTER        Improvement
is_executable_line (8000 calls)   231,382 ms    30,556 ms    7.6x
extract_functions (15 calls)      9,396 ms      93 ms        101.0x
report_text                       74,480 ms     6,726 ms     11.1x
report_lcov                       124,797 ms    6,334 ms     19.7x
report_html                       213,923 ms    32,495 ms    6.6x
Full pipeline (text+lcov+html)    406,970 ms    44,797 ms    9.1x

Hyperfine Statistical Benchmark

Measured with a warmup run plus 3 measured runs each, on a macOS machine:

Benchmark 1: BEFORE (main)
  Time (mean ± σ):     118.053 s ±  0.930 s
  Range (min … max):   117.169 s … 119.022 s    3 runs

Benchmark 2: AFTER (optimised)
  Time (mean ± σ):      8.782 s ±  0.209 s
  Range (min … max):    8.618 s … 9.018 s    3 runs

Summary
  AFTER ran 13.44 ± 0.34 times faster than BEFORE

13.4x faster confirmed by hyperfine. The original code took ~2 minutes to generate all three reports; after the optimisations it takes ~9 seconds.

Caveats

  • Numbers from macOS with moderate fork overhead. Cygwin environments have orders-of-magnitude higher fork costs, so subshell elimination gains would be amplified further.
  • Per-component numbers use EPOCHREALTIME timestamps in a benchmark harness — directional rather than statistically rigorous. The hyperfine full-pipeline comparison is the headline number.
  • Benchmark workload (15 files, ~1200 entries) is synthetic. Real-world speedup depends on repository size and hit density.
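
For readers who want to reproduce the per-component style of measurement, a timer built on EPOCHREALTIME might look like the sketch below. It assumes Bash 5 or newer for EPOCHREALTIME (with a coarse fallback otherwise); the helper names and the two toy workloads are invented for this example and are not the harness used for the numbers above.

```shell
#!/usr/bin/env bash
# Sketch of a per-component timer. Helper names are hypothetical.

# Store the current time in microseconds in NOW_US.
now_us() {
  local t="${EPOCHREALTIME:-}"
  if [ -n "$t" ]; then
    # EPOCHREALTIME looks like 1714470000.123456 (six fractional digits);
    # the separator may be "." or "," depending on the locale.
    NOW_US="${t%[.,]*}${t#*[.,]}"
  else
    # Older Bash: fall back to whole seconds (this branch forks date).
    NOW_US="$(date +%s)000000"
  fi
}

# Run a command, discard its stdout, and store the elapsed wall-clock time
# in milliseconds in ELAPSED_MS.
time_ms() {
  local start
  now_us
  start=$NOW_US
  "$@" > /dev/null
  now_us
  ELAPSED_MS=$(( (NOW_US - start) / 1000 ))
}

# Two toy workloads: one forks a subshell per iteration, one stays in-process.
with_subshell()    { local i out; for ((i = 0; i < 500; i++)); do out=$(printf '%s' x); done; }
without_subshell() { local i out; for ((i = 0; i < 500; i++)); do printf -v out '%s' x; done; }

time_ms with_subshell
printf 'with subshell:    %s ms\n' "$ELAPSED_MS"
time_ms without_subshell
printf 'without subshell: %s ms\n' "$ELAPSED_MS"
```

A single run like this is noisy, which is consistent with treating such numbers as directional and leaving the statistical comparison to hyperfine.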

Checklist

  • I updated the CHANGELOG.md to reflect the new feature or fix
  • I updated the documentation to reflect the changes

@Chemaclass Chemaclass assigned Chemaclass and objctp and unassigned Chemaclass Apr 30, 2026
@Chemaclass Chemaclass added the enhancement New feature or request label Apr 30, 2026
…che on init

Why:
- generate_file_html unconditionally re-ran get_coverage_class after both
  branches already assigned class, undoing part of the cache lookup.
- bashunit::coverage::init did not reset the stats cache globals, leaving
  stale data on re-init.
@Chemaclass Chemaclass added refactoring Refactoring or cleaning related and removed enhancement New feature or request labels Apr 30, 2026
Why:
- get_cached_stats already falls back to get_file_stats when the cache
  is empty, so the if/else dispatch in report_text, report_html, and
  generate_file_html was duplicating the same code path.
- Single call site shrinks three duplicated parsers down to one and
  removes a dead branch.
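
The two commit notes above describe a cache that is reset on init and a lookup that falls back to computing stats on a miss. A minimal sketch of that shape follows. Every name is hypothetical (prefixed sketch_ to avoid confusion with the real functions), the cache layout is an assumption, and the "expensive" computation is a stand-in that only counts lines.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical names and cache layout, not bashunit's code.
# The cache is one string of the form |path=a:b:c|path=a:b:c| so it works
# without associative arrays (which need Bash 4). It assumes paths contain
# neither "|" nor "=".
SKETCH_STATS_CACHE="|"
SKETCH_COMPUTE_CALLS=0

# Called from init, so a re-init never sees entries from a previous run.
sketch_reset_cache() {
  SKETCH_STATS_CACHE="|"
}

# Stand-in for the expensive per-file pass: counts total and non-blank lines
# and sets SKETCH_STATS to total:nonblank:pct (integer percentage).
sketch_compute_stats() {
  local file="$1" line total=0 nonblank=0 pct=0
  while IFS= read -r line || [ -n "$line" ]; do
    ((++total))
    if [ -n "$line" ]; then
      ((++nonblank))
    fi
  done < "$file"
  if [ "$total" -gt 0 ]; then
    pct=$(( nonblank * 100 / total ))
  fi
  ((++SKETCH_COMPUTE_CALLS))
  SKETCH_STATS="${total}:${nonblank}:${pct}"
}

# Compute stats once per file and append them to the cache string.
sketch_precompute() {
  local file
  for file in "$@"; do
    sketch_compute_stats "$file"
    SKETCH_STATS_CACHE="${SKETCH_STATS_CACHE}${file}=${SKETCH_STATS}|"
  done
}

# Single call site for callers: use the cached entry if present, otherwise
# fall back to computing. The result is left in SKETCH_STATS.
sketch_get_cached_stats() {
  local file="$1" rest
  case "$SKETCH_STATS_CACHE" in
    *"|${file}="*)
      rest="${SKETCH_STATS_CACHE#*"|${file}="}"   # drop up to and including the key
      SKETCH_STATS="${rest%%"|"*}"                # keep up to the next delimiter
      ;;
    *)
      sketch_compute_stats "$file"
      ;;
  esac
}
```

Because the fallback lives inside the lookup, callers need no if/else of their own, which is the duplication the second commit note removes.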
Member

@Chemaclass Chemaclass left a comment

LGTM

@Chemaclass Chemaclass merged commit 20ab298 into TypedDevs:main Apr 30, 2026
30 checks passed
@objctp objctp deleted the feat/636-optimise-coverage-report-performance branch May 1, 2026 04:40
Labels

refactoring Refactoring or cleaning related