web: fast pipeline mode + audit scoping #75

Merged
PrzemyslawKlys merged 2 commits into main from codex/pipeline-fast-dev-mode
Feb 7, 2026

@PrzemyslawKlys

Adds practical performance controls for local dev / iteration on large sites.

  • powerforge-web pipeline --fast: forces a lighter set of defaults
    • optimize: turns off heavy flags (images/hashing/headers/minify css+js) and defaults maxHtmlFiles to 50 when not set
    • audit: turns off rendered checks and defaults maxHtmlFiles to 200 when not set
  • audit: add maxHtmlFiles support in options + CLI + pipeline step schema
  • audit result: include HtmlFileCount/HtmlSelectedFileCount for visibility
  • docs: update sample README with guidance

Build: dotnet build PSPublishModule.sln -c Release
Tests: dotnet test PowerForge.Tests -c Release

Copilot AI review requested due to automatic review settings February 7, 2026 22:39
@claude

claude bot commented Feb 7, 2026

PR Review: Fast Pipeline Mode + Audit Scoping

Summary

This PR adds a --fast flag to the pipeline command for faster local development iteration, along with maxHtmlFiles support to scope audits and optimization. The implementation is well-designed and addresses a real need for faster feedback during development.

Code Quality & Best Practices

✅ Strengths

  1. Well-tested: The new Audit_RespectsMaxHtmlFiles test provides good coverage for the core functionality

    • Tests both counting (total vs selected) and actual behavior (processes only maxHtmlFiles)
    • Proper cleanup in finally block
  2. Clear separation of concerns: Fast mode logic is isolated in dedicated blocks (WebPipelineRunner.cs:512-547, 681-696)

  3. Good UX: Fast mode logs what it overrides via logger?.Warn(), making behavior transparent to users

  4. Consistent API surface:

    • Multiple CLI aliases supported (--max-html-files, --max-html)
    • Both camelCase and kebab-case in JSON schema
  5. Schema documentation: Pipeline spec schema updated to include new properties

  6. Cache invalidation: Fingerprint computation includes "fast" salt (WebPipelineRunner.cs:1480), preventing incorrect cache hits

🔍 Areas for Improvement

1. Potential ordering issue in file selection (WebSiteAuditor.cs:166-171)

The code takes the first N files after sorting by path. This may not be representative for large sites:

var htmlFiles = allHtmlFiles;
if (options.MaxHtmlFiles > 0 && htmlFiles.Count > options.MaxHtmlFiles)
    htmlFiles = htmlFiles.Take(options.MaxHtmlFiles).ToList();

Consideration: For large sites with nested directories, alphabetical ordering could bias the selection toward certain sections. Would a sampling strategy (e.g., evenly distributed across discovered files) or most-recently-modified-first ordering provide better coverage during development?
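
As a rough illustration of the evenly distributed option, here is a minimal sketch. `HtmlSampling` and `SelectEvenly` are hypothetical names, not part of PowerForge, and the sketch assumes the input list is already sorted by path and that a limit of 0 or less means "no limit", as in the PR.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of PowerForge: picks up to `max` items
// spread evenly across an already-sorted list instead of the first `max`.
public static class HtmlSampling
{
    public static List<string> SelectEvenly(IReadOnlyList<string> sortedFiles, int max)
    {
        // Assumed convention (mirrors the PR): max <= 0 means "no limit".
        if (max <= 0 || sortedFiles.Count <= max)
            return new List<string>(sortedFiles);

        var selected = new List<string>(max);
        for (var i = 0; i < max; i++)
        {
            // Map slot i to an index in [0, Count); long math avoids int overflow.
            var index = (int)((long)i * sortedFiles.Count / max);
            selected.Add(sortedFiles[index]);
        }
        return selected;
    }
}
```

Because the input stays sorted and the indices are computed deterministically, repeated runs select the same files, which keeps fast-mode results comparable between iterations.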

2. Inconsistent default values between optimize/audit

  • Optimize step: defaults to maxHtmlFiles=50 in fast mode
  • Audit step: defaults to maxHtmlFiles=200 in fast mode

Question: Is this intentional due to different performance characteristics? If so, consider documenting why in code comments.

3. No validation for maxHtmlFiles upper bound

The code uses Math.Max(0, maxHtmlFiles) to prevent negatives, but doesn't cap the upper bound. For very large values, this could still cause performance issues.

Suggestion: Consider a reasonable upper limit (e.g., 10,000) or at least document expected ranges.

4. Test coverage gaps

Missing test scenarios:

  • Verify that HtmlFileCount and HtmlSelectedFileCount appear in audit result serialization
  • Edge case: maxHtmlFiles larger than available files
  • Edge case: maxHtmlFiles = 0 (should process all files)
  • Fast mode integration test (verify overrides actually apply)
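
The edge cases above could be exercised along these lines. This is only a sketch: `HtmlScope.Select` is a hypothetical stand-in that mirrors the selection logic quoted in this review (sort, clamp negatives, take the first N), and the checks use plain exceptions rather than the repository's test framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the auditor's selection step, for illustration only.
public static class HtmlScope
{
    public static List<string> Select(IEnumerable<string> files, int maxHtmlFiles)
    {
        var sorted = files.OrderBy(path => path, StringComparer.OrdinalIgnoreCase).ToList();
        var limit = Math.Max(0, maxHtmlFiles); // negatives behave like 0 (no limit)
        if (limit > 0 && sorted.Count > limit)
            return sorted.Take(limit).ToList();
        return sorted;
    }

    // Plain checks so the sketch does not depend on a test framework.
    public static void CheckEdgeCases()
    {
        var files = new[] { "c.html", "a.html", "b.html" };

        Expect(Select(files, 0).Count == 3, "0 should process all files");
        Expect(Select(files, 10).Count == 3, "a limit above the file count should process all files");
        Expect(Select(files, -5).Count == 3, "negative values should be treated as no limit");

        var one = Select(files, 1);
        Expect(one.Count == 1 && one[0] == "a.html", "limit 1 should keep the first path alphabetically");
    }

    private static void Expect(bool condition, string message)
    {
        if (!condition) throw new InvalidOperationException(message);
    }
}
```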

5. Minor: Duplicate Math.Max calls

Both WebPipelineRunner.cs:559 and WebPipelineRunner.cs:705 have:

MaxHtmlFiles = Math.Max(0, maxHtmlFiles),

This is fine, but the validation could be done once earlier after parsing.

Performance Considerations

Good: Fast mode disables expensive operations (image optimization, hashing, minification)
Good: File limiting prevents processing entire large sites
Good: Cache fingerprinting prevents false cache hits in fast mode

Potential concern: The code still discovers all HTML files before limiting:

var allHtmlFiles = EnumerateHtmlFiles(...).ToList();  // Enumerates everything
var htmlFiles = allHtmlFiles;
if (options.MaxHtmlFiles > 0 && htmlFiles.Count > options.MaxHtmlFiles)
    htmlFiles = htmlFiles.Take(options.MaxHtmlFiles).ToList();

For sites with 10,000+ HTML files, file enumeration itself could be slow. If an exact total count is not needed, consider applying .Take() to the enumerable before .ToList():

var allHtmlFilesEnumerable = EnumerateHtmlFiles(...);
var htmlFilesList = options.MaxHtmlFiles > 0
    ? allHtmlFilesEnumerable.Take(options.MaxHtmlFiles + 1).ToList()  // +1 to detect if there are more
    : allHtmlFilesEnumerable.ToList();
var htmlFileCount = htmlFilesList.Count;  // capped at MaxHtmlFiles + 1, not the true total
var htmlFiles = options.MaxHtmlFiles > 0
    ? htmlFilesList.Take(options.MaxHtmlFiles).ToList()
    : htmlFilesList;  // 0 means no limit, so keep everything

This changes semantics, though: HtmlFileCount would no longer report the true total. If accurate counts are required, the current approach is correct.

Security Concerns

✅ No security issues identified

  • File path handling uses existing helpers
  • Input validation via Math.Max(0, ...) prevents negative values
  • No new external input vectors

Suggested Enhancements

  1. Documentation in WebAuditOptions: Add XML doc comment explaining the selection strategy

/// <summary>
/// Maximum number of HTML files to audit (0 disables).
/// When set, files are selected in alphabetical order by path.
/// </summary>
public int MaxHtmlFiles { get; set; }

  2. Consider adding a flag to summary output: Currently shows html-scope 1/3 but only when counts differ. Consider always showing when MaxHtmlFiles is set for clarity.

  3. Add integration test: Verify that --fast actually improves performance for a realistic test site.

Verdict

✅ APPROVED with suggestions

This is a high-quality PR that solves a real problem. The implementation is sound, tested, and follows the repository's conventions. The suggestions above are minor refinements rather than blocking issues.

The fast mode is a pragmatic solution for local development, and the transparent logging of overrides is excellent UX. The addition of maxHtmlFiles is a useful feature both for fast mode and as a standalone option.


Review generated by Claude Code

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0697292faa

Comment on lines +525 to +528
if (cacheHeaders)
{
    cacheHeaders = false;
    forced.Add("cacheHeaders=false");

[P1] Apply fast overrides before building optimize policy

In the optimize pipeline step, --fast flips hashAssets/cacheHeaders to false here, but the AssetPolicySpec has already been populated earlier in the same block (including policy.CacheHeaders.Enabled = true when cacheHeaders was set), and WebAssetOptimizer still honors policy-driven hashing/cache-header settings (ResolveHashSpec and policy.CacheHeaders). In a fast run with cacheHeaders: true (or a config-backed policy), headers/hashing can still execute even though the log says they were disabled, so the new fast mode does not reliably enforce its advertised performance overrides.

Copilot AI left a comment

Pull request overview

Adds a --fast execution mode to the powerforge-web pipeline CLI to speed up local iteration on large sites, and introduces maxHtmlFiles scoping for audits along with additional audit result counters for visibility.

Changes:

  • Add --fast flag to pipeline runner and incorporate it into pipeline cache fingerprinting.
  • Add maxHtmlFiles support across audit options, CLI/pipeline step parsing, schema validation, and audit output counters.
  • Update sample README and add a test for MaxHtmlFiles audit scoping.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| schemas/powerforge.web.pipelinespec.schema.json | Adds maxHtmlFiles/max-html-files to the audit pipeline step schema. |
| Samples/PowerForge.Web.Sample/README.md | Documents pipeline --fast mode behavior and defaults. |
| PowerForge.Web/Services/WebSiteAuditor.cs | Implements MaxHtmlFiles scoping and records total vs selected HTML file counts. |
| PowerForge.Web/Models/WebAuditResult.cs | Adds HtmlFileCount and HtmlSelectedFileCount to audit results. |
| PowerForge.Web.Cli/WebPipelineRunner.cs | Adds fast pipeline mode behavior, fingerprint salting, audit scoping propagation, and enhanced audit summary output. |
| PowerForge.Web.Cli/WebCliHelpers.cs | Updates CLI usage text for audit and pipeline to include new flags/options. |
| PowerForge.Web.Cli/WebCliCommandHandlers.cs | Adds --max-html-files parsing and forwards it into WebAuditOptions. |
| PowerForge.Web.Cli/WebCliCommandHandlers.BuildCommands.cs | Wires --fast into RunPipeline(...). |
| PowerForge.Tests/WebSiteAuditOptimizeBuildTests.cs | Adds coverage to ensure audit respects MaxHtmlFiles and reports counts correctly. |

Comment on lines +497 to +501
if (cacheHeaders)
{
    policy ??= new AssetPolicySpec();
    policy.CacheHeaders ??= new CacheHeadersSpec { Enabled = true };
    policy.CacheHeaders.Enabled = true;

Copilot AI Feb 7, 2026

In --fast mode, setting cacheHeaders=false (and hashAssets=false) later does not actually prevent cache header generation / hashing when policy comes from --config (or when this block has already set policy.CacheHeaders.Enabled=true). WebAssetOptimizer.OptimizeDetailed writes headers based on policy.CacheHeaders.Enabled and hashing can come from policy.Hashing when options.HashAssets is false. Consider applying fast overrides before mutating/using policy, and explicitly disabling policy.CacheHeaders.Enabled and policy.Hashing.Enabled (or clearing those specs) when fast mode is active.
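
One possible shape for that fix is sketched below. The three spec classes are simplified stand-ins written for this example (the real PowerForge types, including the name `HashingSpec`, may differ), and `FastModeOverrides.ApplyToPolicy` is a hypothetical helper, not existing code.

```csharp
#nullable enable
using System.Collections.Generic;

// Stand-in types for illustration only; the real PowerForge classes may differ.
public sealed class CacheHeadersSpec { public bool Enabled { get; set; } }
public sealed class HashingSpec { public bool Enabled { get; set; } }
public sealed class AssetPolicySpec
{
    public CacheHeadersSpec? CacheHeaders { get; set; }
    public HashingSpec? Hashing { get; set; }
}

// Hypothetical helper: forces policy-driven settings off in fast mode.
public static class FastModeOverrides
{
    // Returns the list of forced settings so the caller can log them
    // together with the step-flag overrides.
    public static List<string> ApplyToPolicy(AssetPolicySpec? policy)
    {
        var forced = new List<string>();
        if (policy is null)
            return forced;

        if (policy.CacheHeaders is { Enabled: true } headers)
        {
            headers.Enabled = false;
            forced.Add("policy.cacheHeaders.enabled=false");
        }

        if (policy.Hashing is { Enabled: true } hashing)
        {
            hashing.Enabled = false;
            forced.Add("policy.hashing.enabled=false");
        }

        return forced;
    }
}
```

Calling such a helper after the policy has been fully built (from step flags or from --config) would keep the logged overrides consistent with what the optimizer actually does.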

Comment on lines +207 to +208
- `optimize`: disables heavy flags (`optimizeImages`, `hashAssets`, `cacheHeaders`, `minifyCss`, `minifyJs`) and defaults `maxHtmlFiles` to 50 when not set
- `audit`: disables `rendered` checks and defaults `maxHtmlFiles` to 200 when not set

Copilot AI Feb 7, 2026

The docs say fast mode disables hashAssets/cacheHeaders, but the current pipeline implementation can still apply hashing/headers if config loads an AssetPolicy with those enabled. Either adjust the implementation so these are reliably disabled in fast mode, or clarify in the README that fast mode only overrides step flags and not policy-loaded hashing/headers.

}
if (cacheHeaders)
{
    cacheHeaders = false;

Copilot AI Feb 7, 2026

This assignment to cacheHeaders is useless, since its value is never read.

Suggested change (remove this line):
cacheHeaders = false;

@claude

claude bot commented Feb 7, 2026

Code Review: Fast Pipeline Mode + Audit Scoping

Summary

This PR adds practical performance controls for local development on large sites through a --fast flag for pipelines and maxHtmlFiles scoping for audits. The implementation is well-structured and follows the repository's established patterns.


✅ Strengths

1. Well-Designed Feature

  • The --fast mode provides pragmatic defaults for local iteration without requiring separate config files
  • Clear separation between development speed and production quality
  • Transparent: warns users about overrides applied in fast mode

2. Code Quality

  • Follows established C# and testing conventions from CONTRIBUTING.md
  • Proper test coverage with Audit_RespectsMaxHtmlFiles() test
  • Consistent naming conventions and code structure
  • Good documentation in README sample

3. Cache Fingerprinting

  • Smart cache invalidation: adds "fast" as salt to fingerprint when fast mode is enabled (WebPipelineRunner.cs:116)
  • Prevents incorrect cache hits between fast/normal modes

4. User Experience

  • Multiple CLI aliases: --max-html-files and --max-html for flexibility
  • Clear visibility: includes HtmlFileCount/HtmlSelectedFileCount in audit results
  • Helpful summary output showing scope: html-scope {selected}/{total}

🔍 Observations & Suggestions

1. Test Coverage (Minor)
The new test Audit_RespectsMaxHtmlFiles (WebSiteAuditOptimizeBuildTests.cs:942-991) is well-written and follows proper cleanup patterns. Consider adding:

  • Test case for maxHtmlFiles = 0 (disabled) to verify default behavior
  • Test verifying alphabetical ordering is respected when files are limited
  • Edge case: maxHtmlFiles > actual file count

2. Fast Mode Defaults (Design Question)

// PowerForge.Web.Cli/WebPipelineRunner.cs:535-545
if (maxHtmlFiles <= 0)
{
    maxHtmlFiles = 50;  // optimize step
}

// Line 694-696
if (maxHtmlFiles <= 0)
{
    maxHtmlFiles = 200;  // audit step
}

Question: Should these defaults be configurable constants or documented more prominently? The hardcoded values (50 for optimize, 200 for audit) are reasonable but might need adjustment based on user feedback.

3. File Selection Strategy

// PowerForge.Web/Services/WebSiteAuditor.cs:171-173
var htmlFiles = allHtmlFiles;
if (options.MaxHtmlFiles > 0 && htmlFiles.Count > options.MaxHtmlFiles)
    htmlFiles = htmlFiles.Take(options.MaxHtmlFiles).ToList();

Consideration: Alphabetical ordering is predictable but may not always select the most important files. For large sites, consider future enhancement to support:

  • Random sampling for broader coverage
  • Priority-based selection (e.g., index pages first)
  • Pattern-based selection

Current implementation is fine for v1, just noting for future iterations.

4. Documentation Clarity
The README update (Samples/PowerForge.Web.Sample/README.md:201-208) is excellent. One suggestion:

  • Add note about cache invalidation: "Note: Fast mode creates separate cache entries to prevent mixing fast/normal builds"

🔒 Security Considerations

No security concerns identified

  • No user input is directly used in file system operations without validation
  • GUID-based temp directories prevent path traversal
  • Proper resource cleanup in test code

⚡ Performance Considerations

Excellent performance improvements

  • Fast mode disables expensive operations: image optimization, minification, hashing
  • HTML scoping prevents processing unnecessary files in large sites
  • Cache fingerprint properly includes fast mode flag
  • Alphabetical sorting is O(n log n) but acceptable for typical site sizes

Potential optimization (not required for this PR):
The audit sorts and materializes the full file list before taking the subset:

var allHtmlFiles = EnumerateHtmlFiles(...)
    .OrderBy(path => path, StringComparer.OrdinalIgnoreCase)  // Sort all
    .ToList();
var htmlFiles = allHtmlFiles;
if (options.MaxHtmlFiles > 0 && htmlFiles.Count > options.MaxHtmlFiles)
    htmlFiles = htmlFiles.Take(options.MaxHtmlFiles).ToList();  // Take subset

Take could be applied before ToList(), but the total count would then be lost and the sort must still visit every path. The current implementation is clearer and the performance difference is negligible.


🐛 Potential Issues

None identified - Code is solid.


📋 Schema & API Consistency

✅ Schema properly updated in schemas/powerforge.web.pipelinespec.schema.json
✅ Both kebab-case and camelCase supported: maxHtmlFiles / max-html-files
✅ CLI help text updated in WebCliHelpers.cs


🧪 Test Quality Review

The test Audit_RespectsMaxHtmlFiles follows best practices:

  • ✅ Uses try/finally for cleanup
  • ✅ GUID-based unique directory naming
  • ✅ Proper assertions for both file counts
  • ✅ Tests the actual feature behavior

Minor suggestion: Add assertion comment explaining why 3 files but max 1:

// Create 3 HTML files
File.WriteAllText(Path.Combine(root, "a.html"), ...);
File.WriteAllText(Path.Combine(root, "b.html"), ...);
File.WriteAllText(Path.Combine(root, "c.html"), ...);

// Audit with maxHtmlFiles = 1 should process only first file alphabetically
var result = WebSiteAuditor.Audit(new WebAuditOptions { ... });

Assert.Equal(3, result.HtmlFileCount);           // All files discovered
Assert.Equal(1, result.HtmlSelectedFileCount);   // Only 1 processed
Assert.Equal(1, result.PageCount);               // Only 1 audited

✨ Best Practices Followed

  • ✅ Thin CLI layer, core logic in services
  • ✅ Typed models with XML documentation
  • ✅ Consistent parameter naming across CLI/JSON/C#
  • ✅ Proper use of StringComparison.OrdinalIgnoreCase
  • ✅ Math.Max(0, ...) to ensure non-negative values
  • ✅ Warning messages for user transparency

📝 Recommendations

Before Merge:

  1. ✅ All looks good - no blocking issues

Future Enhancements (not for this PR):

  1. Consider making fast mode HTML limits configurable via environment variables
  2. Add metrics/telemetry to understand typical site sizes and optimize defaults
  3. Explore sampling strategies for very large sites (1000+ pages)

🎯 Final Verdict

Approve with minor suggestions

This is a high-quality PR that:

  • Solves a real problem (slow local iteration on large sites)
  • Maintains code quality and testing standards
  • Provides good user experience with clear warnings and documentation
  • Has no security or performance concerns
  • Follows repository conventions

Great work! The implementation is production-ready.


Build & Test Instructions (from PR description)

dotnet build PSPublishModule.sln -c Release
dotnet test PowerForge.Tests -c Release

Reviewed by Claude Code 🤖

@PrzemyslawKlys PrzemyslawKlys merged commit 1a4aaee into main Feb 7, 2026
3 checks passed
@PrzemyslawKlys PrzemyslawKlys deleted the codex/pipeline-fast-dev-mode branch February 7, 2026 22:45

2 participants