Skip to content

Fast mode: scope optimize/audit to updated pages#79

Merged
PrzemyslawKlys merged 1 commit intomainfrom
codex/fast-incremental-optimize-audit
Feb 8, 2026
Merged

Fast mode: scope optimize/audit to updated pages#79
PrzemyslawKlys merged 1 commit intomainfrom
codex/fast-incremental-optimize-audit

Conversation

@PrzemyslawKlys
Copy link
Member

When running pipeline with --fast, automatically scope optimize/audit to the HTML pages updated by the last build step (unless the step already specifies include patterns). WebSiteBuilder now reports UpdatedFiles in WebBuildResult. This reduces optimize/audit time dramatically on large sites during local iteration.

Copilot AI review requested due to automatic review settings February 8, 2026 08:01
@claude
Copy link

claude bot commented Feb 8, 2026

PR Review: Fast mode incremental optimize/audit

Summary

This PR implements an intelligent optimization for the --fast pipeline mode by tracking which files were updated during the build step and automatically scoping subsequent optimize/audit steps to only those changed files. This reduces processing time dramatically on large sites during local iteration.

Code Quality ✅

Strengths:

  • Clean architecture: The UpdatedFiles tracking is implemented using AsyncLocal<Action<string>?> pattern, which is thread-safe and doesn't require passing callbacks through method signatures
  • Consistent pattern: Follows the existing pattern used in WebAssetOptimizer.cs (PowerForge.Web/Services/WebAssetOptimizer.cs:324)
  • Proper encapsulation: Update tracking is localized within the helper methods (WriteAllTextIfChanged, CopyFileIfChanged, WriteAllLinesIfChanged)
  • Good test coverage: Updated test validates both positive case (files updated on first build) and negative case (no files updated on second build)
  • Defensive coding: The implementation includes null-safe operators and proper empty array defaults

Code style:

  • Follows C# conventions with proper readonly field usage
  • Consistent with repository's typed model approach (per CONTRIBUTING.md)
  • Appropriate use of StringComparison for cross-platform file path handling

Potential Issues & Concerns 🔍

1. AsyncLocal cleanup timing (Minor)

PowerForge.Web/Services/WebSiteBuilder.cs:67-71

var prevSink = UpdatedSink.Value;
UpdatedSink.Value = MarkUpdated;
try { ... }
finally { UpdatedSink.Value = prevSink; }

The AsyncLocal is properly restored in the finally block, which is good. However, consider documenting why AsyncLocal is needed here instead of a simpler instance field approach, since WebSiteBuilder appears to be called synchronously.

2. Case sensitivity edge case (Low)

PowerForge.Web/Services/WebSiteBuilder.cs:18-20

private static readonly StringComparison FileSystemPathComparison =
    OperatingSystem.IsWindows() || OperatingSystem.IsMacOS()
        ? StringComparison.OrdinalIgnoreCase
        : StringComparison.Ordinal;

This is correct, but note that macOS has case-insensitive filesystems by default (APFS is configurable). The current implementation is pragmatic and matches platform defaults.

3. Path normalization consistency (Low)

PowerForge.Web/Services/WebSiteBuilder.cs:309-312
The NormalizeRootPathForSink ensures trailing separator, but IsPathWithinRoot uses StartsWith. Consider edge cases:

  • Paths like C:\site and C:\site2 - the trailing separator prevents false positives ✅
  • UNC paths or network drives - should work correctly ✅

4. Duplicate filtering in HashSet (Minor optimization)

PowerForge.Web/Services/WebSiteBuilder.cs:51

var updated = new HashSet<string>(StringComparer.OrdinalIgnoreCase);

Good use of HashSet for deduplication. The final .OrderBy() at line 118 provides deterministic output.

Performance Considerations ✅

Benefits:

  • Dramatic speedup: On large sites, scoping optimize/audit to only updated files (vs. all files) can reduce iteration time from minutes to seconds
  • Smart defaults: Only activates when --fast flag is used AND no explicit include patterns are specified
  • Minimal overhead: HashSet operations and path comparisons are O(n) with small constant factors

Observations:

  • The Path.GetFullPath() calls (PowerForge.Web.Cli/WebPipelineRunner.cs:522, 710) are acceptable overhead since they only run once per optimize/audit step
  • File path filtering happens in-memory before I/O operations, maximizing benefit

Security Concerns ✅

No security issues identified:

  • Path operations use Path.GetFullPath() which normalizes and prevents path traversal
  • The IsPathWithinRoot check (PowerForge.Web/Services/WebSiteBuilder.cs:315-319) ensures files outside the output directory aren't tracked
  • No user-controlled input flows into the tracking mechanism without sanitization

Test Coverage ✅

PowerForge.Tests/WebSiteBuilderIncrementalBuildTests.cs:80,94

Assert.Contains(build1.UpdatedFiles, p => p.Equals("index.html", ...));
Assert.Empty(build2.UpdatedFiles);

Current coverage is good:

  • ✅ Files updated on initial build are tracked
  • ✅ No files updated on subsequent build when content unchanged
  • ✅ Integration test validates end-to-end behavior

Suggested additional tests:

  1. Test that files outside output directory are not tracked (security/correctness)
  2. Test that file path casing is normalized correctly on Windows vs Linux
  3. Test that the pipeline runner correctly scopes optimize/audit to updated HTML files
  4. Test behavior when lastBuildOutPath differs from optimize/audit siteRoot

Additional Observations

Documentation

The PR body provides clear explanation of the feature, but consider adding inline comments for:

  • Why AsyncLocal is used instead of simpler approaches
  • The relationship between build step and optimize/audit steps

Edge cases handled well:

  • Empty UpdatedFiles array when no changes (line 91)
  • Null-safe build.UpdatedFiles ?? Array.Empty<string>() (line 156)
  • Only activates when paths match (line 522, 710)
  • Filters to only HTML files (lines 524-527, 710-713)

Recommendations

  1. Add XML doc comment explaining the UpdatedSink AsyncLocal pattern
  2. Consider adding tests for the pipeline integration (optimize/audit scoping)
  3. Monitor performance in real-world scenarios to validate the speedup claims
  4. Document the feature in PowerForge.Web.Pipeline.md or user-facing documentation

Verdict: ✅ Approve with minor suggestions

This is a well-implemented performance optimization that follows established patterns in the codebase. The code quality is high, with proper error handling, thread safety, and test coverage. The feature provides clear value for iterative development workflows.

The concerns raised are minor and mostly related to documentation and additional test coverage rather than correctness issues. The implementation is production-ready.


Review generated with attention to security, performance, and maintainability

@PrzemyslawKlys PrzemyslawKlys merged commit af629d8 into main Feb 8, 2026
5 checks passed
@PrzemyslawKlys PrzemyslawKlys deleted the codex/fast-incremental-optimize-audit branch February 8, 2026 08:05
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8711bb0fe9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

.ToArray();
if (updatedHtml.Length > 0)
{
include = string.Join(";", updatedHtml);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Use comma-delimited include list for fast audit scope

When fast mode auto-scopes audit, this line builds include with semicolons (";"), but pipeline include parsing only splits on commas (CliPatternHelper.SplitPatterns in PowerForge.Web.Cli/CliPatternHelper.cs:10). With 2+ updated HTML files and no explicit include, the audit receives a single pattern like "a.html;b.html", which matches no files, so checks are effectively skipped and can falsely pass with only the "No HTML files found" warning.

Useful? React with 👍 / 👎.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds incremental “fast mode” behavior by tracking which output files were updated during the last build, then using that list to scope subsequent optimize/audit steps to only the updated HTML pages—reducing iteration time on large sites.

Changes:

  • Track per-build updated output files in WebSiteBuilder and return them via WebBuildResult.UpdatedFiles.
  • In CLI pipeline --fast mode, auto-scope optimize/audit HTML includes to the updated pages from the most recent build step (when the step doesn’t already specify include patterns).
  • Extend incremental build tests to assert UpdatedFiles behavior across consecutive builds.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
PowerForge.Web/Services/WebSiteBuilder.cs Introduces updated-file tracking via an AsyncLocal sink and returns UpdatedFiles in the build result.
PowerForge.Web/Models/WebSitePlan.cs Extends WebBuildResult model to include UpdatedFiles.
PowerForge.Web.Cli/WebPipelineRunner.cs Uses last build’s UpdatedFiles to scope optimize/audit when running with --fast.
PowerForge.Tests/WebSiteBuilderIncrementalBuildTests.cs Updates test to validate UpdatedFiles is populated on first build and empty on unchanged rebuild.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


WriteAllTextIfChanged(planPath, JsonSerializer.Serialize(plan, jsonOptions));
WriteAllTextIfChanged(specPath, JsonSerializer.Serialize(spec, jsonOptions));
var updated = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated is created with StringComparer.OrdinalIgnoreCase, which can collapse distinct paths on case-sensitive file systems (e.g., Linux) and produce an incorrect UpdatedFiles list. Use an OS-appropriate comparer (e.g., StringComparer.FromComparison(FileSystemPathComparison) or conditional Ordinal/OrdinalIgnoreCase) to match actual path semantics.

Suggested change
var updated = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
var updated = new HashSet<string>(StringComparer.FromComparison(FileSystemPathComparison));

Copilot uses AI. Check for mistakes.
Comment on lines +520 to +522
if ((htmlInclude is null || htmlInclude.Length == 0) &&
lastBuildUpdatedFiles.Length > 0 &&
string.Equals(Path.GetFullPath(siteRoot), lastBuildOutPath, FileSystemPathComparison))
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fast-mode scoping compares Path.GetFullPath(siteRoot) to lastBuildOutPath without normalizing trailing directory separators. If one path ends with a separator and the other doesn’t, the equality check can fail and skip incremental scoping. Normalize both sides (e.g., trim trailing separators or use a shared helper) before comparing.

Suggested change
if ((htmlInclude is null || htmlInclude.Length == 0) &&
lastBuildUpdatedFiles.Length > 0 &&
string.Equals(Path.GetFullPath(siteRoot), lastBuildOutPath, FileSystemPathComparison))
var siteRootFullPathNormalized = Path
.GetFullPath(siteRoot)
.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
var lastBuildOutPathNormalized = lastBuildOutPath?
.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
if ((htmlInclude is null || htmlInclude.Length == 0) &&
lastBuildUpdatedFiles.Length > 0 &&
string.Equals(siteRootFullPathNormalized, lastBuildOutPathNormalized, FileSystemPathComparison))

Copilot uses AI. Check for mistakes.
Comment on lines +706 to +708
if (string.IsNullOrWhiteSpace(include) &&
lastBuildUpdatedFiles.Length > 0 &&
string.Equals(Path.GetFullPath(siteRoot), lastBuildOutPath, FileSystemPathComparison))
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fast-mode scoping compares Path.GetFullPath(siteRoot) to lastBuildOutPath without normalizing trailing directory separators. If one path ends with a separator and the other doesn’t, the equality check can fail and skip incremental scoping. Normalize both sides (e.g., trim trailing separators or use a shared helper) before comparing.

Suggested change
if (string.IsNullOrWhiteSpace(include) &&
lastBuildUpdatedFiles.Length > 0 &&
string.Equals(Path.GetFullPath(siteRoot), lastBuildOutPath, FileSystemPathComparison))
var normalizedSiteRoot = Path.GetFullPath(siteRoot)
.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
var normalizedLastBuildOutPath = lastBuildOutPath?
.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
if (string.IsNullOrWhiteSpace(include) &&
lastBuildUpdatedFiles.Length > 0 &&
string.Equals(normalizedSiteRoot, normalizedLastBuildOutPath, FileSystemPathComparison))

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants