
Improve builder reliability: encoding, install compatibility, tag policy, and fast missing-analysis #100

Merged
PrzemyslawKlys merged 3 commits into main from feature/builder-improvements on Feb 13, 2026



@PrzemyslawKlys

Summary

  • Add robust UTF-8 process encoding setup across process launch paths to avoid mojibake in pipeline/build output.
  • Add legacy flat-install handling/preservation options to config + cmdlets and resolve them through pipeline plan/run.
  • Move module signing later in pipeline (after import/tests) for fail-fast behavior.
  • Improve missing-command analysis with command caches and non-command token filtering.
  • Allow PlanOnly/WhatIf publish flows to skip strict on-disk package existence checks.
  • Add GitHub tag conflict policy + extra template tokens for tag naming.
  • Replace inline PowerShell in cmdlets with embedded script resources.
  • Update schema/docs and add regression/unit coverage.

Validation

  • dotnet build .\PowerForge\PowerForge.csproj -c Release (includes net472/net8.0/net10.0): PASS
  • dotnet test .\PowerForge.Tests\PowerForge.Tests.csproj -c Release: PASS (214 tests)
  • pwsh -NoProfile -File .\Build\Build-Module.ps1: PASS (self-build complete; Unicode/emoji output rendered correctly)
  • Full solution test currently blocked by unrelated restore hash issue in PowerForge.Web.Sample.App (NU1403 on Microsoft.NET.Sdk.WebAssembly.Pack.10.0.2).


…vements

# Conflicts:
#	Docs/PSPublishModule.ProjectBuild.md
#	Docs/PowerForge.Web.ApiDocs.md
#	Docs/PowerForge.Web.Pipeline.md
#	Docs/PowerForge.Web.Roadmap.md
#	PSPublishModule/Cmdlets/InvokeProjectBuildCommand.Helpers.cs
#	PSPublishModule/Cmdlets/InvokeProjectBuildCommand.cs
#	PowerForge.Tests/WebPipelineRunnerApiDocsPreflightTests.cs
#	PowerForge.Web.Cli/WebCliCommandHandlers.DocsCommands.cs
#	PowerForge.Web.Cli/WebCliHelpers.cs
#	PowerForge.Web.Cli/WebPipelineRunner.Tasks.Content.cs
#	PowerForge.Web.Cli/WebPipelineRunner.cs
#	PowerForge.Web/Models/WebLlmsResult.cs
#	PowerForge.Web/Services/WebApiDocsGenerator.cs
#	schemas/powerforge.web.pipelinespec.schema.json
@claude

claude bot commented Feb 13, 2026

Pull Request Review: #100

Summary

This is a substantial PR with 5,035 additions and 157 deletions across 75 files. The changes improve builder reliability by addressing UTF-8 encoding, installation compatibility, GitHub tag conflict handling, and missing function analysis performance.


✅ Strengths

1. UTF-8 Encoding Handling (ProcessStartInfoEncoding.cs)

  • Excellent cross-framework compatibility: Uses reflection to support both .NET Framework 4.7.2 and modern .NET
  • Proper encoding setup: Uses UTF-8 without BOM to prevent mojibake in pipeline output
  • Safe fallback: Silently handles frameworks that don't expose encoding properties
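
The reflection-based setup praised above can be sketched roughly as follows. This is a minimal illustration, not the code in ProcessStartInfoEncoding.cs: the class and method names are hypothetical. It assumes the reason for reflection is that StandardInputEncoding does not exist on .NET Framework 4.7.2, so the property is looked up by name at runtime and skipped when absent.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical helper; names are illustrative and not taken from the PR.
public static class Utf8ProcessSetup
{
    // UTF-8 without a byte-order mark, so no BOM bytes leak into piped output.
    private static readonly Encoding Utf8NoBom =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);

    public static void Apply(ProcessStartInfo psi)
    {
        if (psi == null) throw new ArgumentNullException(nameof(psi));

        // Process.Start rejects these encodings unless the matching stream is redirected.
        if (psi.RedirectStandardOutput) psi.StandardOutputEncoding = Utf8NoBom;
        if (psi.RedirectStandardError) psi.StandardErrorEncoding = Utf8NoBom;

        // StandardInputEncoding is missing on .NET Framework, so resolve it at runtime.
        var stdinEncoding = typeof(ProcessStartInfo).GetProperty("StandardInputEncoding");
        if (psi.RedirectStandardInput && stdinEncoding != null && stdinEncoding.CanWrite)
        {
            stdinEncoding.SetValue(psi, Utf8NoBom);
        }
        // When the property is absent, the framework default is silently kept.
    }
}
```

Calling Utf8ProcessSetup.Apply(psi) after the redirect flags are set, and before Process.Start, keeps the same call site valid on every target framework.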

2. Missing Functions Analyzer Improvements (MissingFunctionsAnalyzer.cs)

  • Command caching: Added _currentSessionCommandCache and _moduleScopeCommandCache to avoid repeated PowerShell invocations (lines 17-21)
  • Better token filtering: LooksLikeCommandName() (lines 358-366) filters out non-command tokens (braces, parentheses, variables starting with $)
  • Reserved word filtering: Excludes PowerShell keywords like if, foreach, try, etc. (lines 193-199)
  • Script block exclusion: Properly handles nested script blocks to avoid false positives
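
To make the caching and filtering points concrete, here is a small sketch under stated assumptions. It is not the analyzer's real code: the class name, the reserved-word list, and the injected resolve delegate (standing in for a PowerShell command lookup) are all hypothetical. Only the method name LooksLikeCommandName and the filtered token kinds come from the review.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a cached command lookup with token pre-filtering.
public sealed class CommandLookupSketch
{
    // Illustrative subset of PowerShell keywords; the real list is an assumption.
    private static readonly HashSet<string> ReservedWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "if", "else", "foreach", "for", "while", "try", "catch", "finally", "switch", "return" };

    private readonly Dictionary<string, bool> _cache =
        new Dictionary<string, bool>(StringComparer.OrdinalIgnoreCase);
    private readonly Func<string, bool> _resolve;

    // Counts how often the expensive lookup actually ran.
    public int ResolveCalls { get; private set; }

    public CommandLookupSketch(Func<string, bool> resolve)
    {
        _resolve = resolve ?? throw new ArgumentNullException(nameof(resolve));
    }

    // Rejects tokens that cannot be command names: variables, braces, parentheses.
    public static bool LooksLikeCommandName(string token)
    {
        if (string.IsNullOrWhiteSpace(token)) return false;
        char first = token[0];
        return first != '$' && first != '{' && first != '}' && first != '(' && first != ')';
    }

    public bool IsKnownCommand(string token)
    {
        if (!LooksLikeCommandName(token) || ReservedWords.Contains(token)) return false;
        if (_cache.TryGetValue(token, out bool known)) return known;

        ResolveCalls++;            // cache miss: pay for the lookup once
        known = _resolve(token);
        _cache[token] = known;
        return known;
    }
}
```

Filtering before the cache lookup means that tokens such as $x or foreach never reach the expensive resolver at all.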

3. GitHub Tag Conflict Policy (ConfigurationEnums.cs)

  • Flexible conflict resolution: Three strategies (Reuse/Fail/AppendUtcTimestamp) provide good control
  • Idempotent by default: Reuse policy allows safe re-runs
  • Clear documentation: Enum comments explain each option well
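
As an illustration of how the three strategies differ, the sketch below resolves a tag name against a set of existing tags. The policy names come from the review; the enum and class names, the method signature, and the timestamp format are assumptions and may not match ConfigurationEnums.cs.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical enum mirroring the three policy names mentioned in the review.
public enum TagConflictPolicySketch { Reuse, Fail, AppendUtcTimestamp }

public static class TagResolverSketch
{
    // Returns the tag name to use, given the tags that already exist.
    public static string Resolve(
        string tag, ISet<string> existingTags, TagConflictPolicySketch policy, DateTime utcNow)
    {
        if (!existingTags.Contains(tag)) return tag; // no conflict, nothing to decide

        switch (policy)
        {
            case TagConflictPolicySketch.Reuse:
                return tag; // idempotent re-run: keep the existing tag
            case TagConflictPolicySketch.Fail:
                throw new InvalidOperationException($"Tag '{tag}' already exists.");
            case TagConflictPolicySketch.AppendUtcTimestamp:
                // The suffix format is an assumption made for this sketch.
                return tag + "-" + utcNow.ToString("yyyyMMddHHmmss", CultureInfo.InvariantCulture);
            default:
                throw new ArgumentOutOfRangeException(nameof(policy));
        }
    }
}
```

Passing the clock in as a parameter keeps the timestamp branch deterministic under test.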

4. Test Coverage

  • 214 tests passing: Comprehensive regression and unit coverage
  • Good test structure: Tests validate configuration, pipeline execution, and API docs generation
  • Multiple test categories: Regression tests, unit tests, and integration tests

⚠️ Issues & Concerns

HIGH PRIORITY

1. Performance: Large File I/O (WebXrefMapMerger.cs:78)

using var doc = JsonDocument.Parse(File.ReadAllText(file));
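  • Issue: Reads the whole file into a string, which JsonDocument.Parse then transcodes to UTF-8, so the payload is held in memory twice
  • Impact: Large xref maps (>100MB) will cause excessive memory allocation
  • Fix: Parse from a stream to skip the intermediate string (JsonDocument still buffers the full payload):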
using var stream = File.OpenRead(file);
using var doc = JsonDocument.Parse(stream);

2. Security: JSON DoS Risk (WebXrefMapMerger.cs:78)

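  • Issue: No size limits on JSON parsing
  • Impact: Malicious xref maps could consume unbounded memory
  • Fix: Check the file size before parsing. MaxDepth only bounds nesting depth, and 64 is already the default:
const long maxBytes = 64L * 1024 * 1024; // example cap, tune per project
if (new FileInfo(file).Length > maxBytes) throw new InvalidDataException($"Xref map too large: {file}");
var options = new JsonDocumentOptions { MaxDepth = 64 };
using var doc = JsonDocument.Parse(stream, options);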
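
Items 1 and 2 can be combined into one guarded helper. The following is a sketch only: the class name, method name, and the idea of passing the byte cap as a parameter are assumptions, not part of WebXrefMapMerger.cs.

```csharp
using System;
using System.IO;
using System.Text.Json;

// Hypothetical helper combining a size cap with stream-based parsing.
public static class BoundedJsonSketch
{
    // The caller owns the returned JsonDocument and must dispose it.
    public static JsonDocument ParseFile(string path, long maxBytes)
    {
        var info = new FileInfo(path);
        if (info.Length > maxBytes)
        {
            throw new InvalidDataException(
                $"'{path}' is {info.Length} bytes; the limit is {maxBytes}.");
        }

        // MaxDepth bounds nesting depth only; the size cap above bounds memory.
        var options = new JsonDocumentOptions { MaxDepth = 64 };

        // Parsing from the stream avoids materializing the file as a string first.
        using var stream = File.OpenRead(path);
        return JsonDocument.Parse(stream, options);
    }
}
```

The size check runs before any bytes are read, so an oversized file is rejected without allocating a buffer for it.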

3. Performance: Regex Timeout Too Short (WebXrefSupport.cs:20)

private static readonly TimeSpan RegexTimeout = TimeSpan.FromSeconds(1);
  • Issue: 1-second timeout may trigger RegexMatchTimeoutException on large markdown files (>10MB)
  • Impact: Pipeline failures on legitimate large documentation files
  • Fix: Increase to 5-10 seconds or make configurable
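
One way to make the timeout configurable is to pass it in and treat a timeout as a recoverable result instead of a pipeline failure. The sketch below assumes a simplified xref link pattern and hypothetical names; the real patterns in WebXrefSupport.cs are not shown in this review.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical scanner with a caller-supplied regex timeout.
public static class XrefScanSketch
{
    // Simplified pattern for [text](xref:Some.Id); an assumption for this sketch.
    private const string XrefLinkPattern = @"\]\(xref:(?<id>[^)\s]+)\)";

    // Returns false on timeout; ids may then hold a partial result.
    public static bool TryCollectIds(string markdown, TimeSpan timeout, List<string> ids)
    {
        var regex = new Regex(XrefLinkPattern, RegexOptions.CultureInvariant, timeout);
        try
        {
            // Matches() is lazy, so the timeout surfaces while enumerating.
            foreach (Match match in regex.Matches(markdown))
            {
                ids.Add(match.Groups["id"].Value);
            }
            return true;
        }
        catch (RegexMatchTimeoutException)
        {
            return false;
        }
    }
}
```

A caller could read the timeout from pipeline configuration and decide whether a false result should warn or fail the build.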

4. Error Handling: Swallowed Exceptions (WebSiteBuilder.Xref.cs:276)

catch
{
    // If comparison fails, fall back to writing
}
  • Issue: Bare catch with no logging makes debugging impossible
  • Fix: At minimum log the exception type, or catch specific exceptions
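
A possible shape for the fix, sketched under assumptions: the class name, method name, and the log callback are hypothetical, and the real comparison logic in WebSiteBuilder.Xref.cs may differ. The fallback-to-write behavior is kept, but only expected I/O failures are caught and each one is logged.

```csharp
using System;
using System.IO;

// Hypothetical write-if-changed helper that logs comparison failures.
public static class WriteIfChangedSketch
{
    // Returns true when the file was written, false when it was already up to date.
    public static bool Write(string path, string content, Action<string> log)
    {
        try
        {
            if (File.Exists(path) && File.ReadAllText(path) == content)
            {
                return false; // unchanged, skip the write
            }
        }
        catch (Exception ex) when (ex is IOException || ex is UnauthorizedAccessException)
        {
            // Keep the fallback-to-write behavior, but leave a trace for debugging.
            log($"Comparing '{path}' failed ({ex.GetType().Name}: {ex.Message}); rewriting.");
        }

        File.WriteAllText(path, content);
        return true;
    }
}
```

The exception filter also addresses item 6: fatal conditions such as OutOfMemoryException are no longer caught.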

MEDIUM PRIORITY

5. Code Quality: Dictionary-Based Serialization (WebApiDocsGenerator.Xref.cs:39-57)

var payload = new Dictionary<string, object?> { ... };
  • Issue: Manual dictionary construction is error-prone and not type-safe
  • Fix: Create proper model classes:
public class WebXrefPayload
{
    public DateTime GeneratedAtUtc { get; set; }
    public AssemblyInfo Assembly { get; set; }
    // ...
}

6. Error Handling: Generic Exception Catching (WebXrefSupport.cs:163-172)

  • Issue: Catches all exceptions including OutOfMemoryException which shouldn't be caught
  • Fix: Catch specific exceptions (IOException, JsonException, etc.)

7. Performance: Multiple Regex Passes (WebXrefSupport.cs:94-98)

CollectMarkdownReferences(scrubbed, MarkdownXrefLinkRegex, ids);
CollectMarkdownReferences(scrubbed, MarkdownXrefAutoLinkRegex, ids);
CollectMarkdownReferences(scrubbed, MarkdownHtmlXrefLinkRegex, ids);
  • Issue: Three separate regex operations on the same content
  • Fix: Combine patterns or use a single multi-pattern approach
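
A single-pass variant could use one alternation that reuses the same named group in each branch, which .NET regular expressions allow. The three sub-patterns below are simplified guesses at what the Markdown link, autolink, and HTML link regexes match; the class and method names are hypothetical too.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical single-pass collector; the patterns are assumptions.
public static class CombinedXrefSketch
{
    private static readonly Regex Combined = new Regex(
        @"\]\(xref:(?<id>[^)\s]+)\)" +            // [text](xref:Id)
        @"|<xref:(?<id>[^>\s]+)>" +               // <xref:Id>
        @"|href=""xref:(?<id>[^""\s]+)""",        // <a href="xref:Id">
        RegexOptions.CultureInvariant | RegexOptions.IgnoreCase,
        TimeSpan.FromSeconds(5));

    public static List<string> CollectIds(string content)
    {
        var ids = new List<string>();
        foreach (Match match in Combined.Matches(content))
        {
            // Whichever branch matched, the capture lands in the shared "id" group.
            ids.Add(match.Groups["id"].Value);
        }
        return ids;
    }
}
```

Whether this beats three separate passes depends on the real patterns, so it is worth measuring before adopting.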

8. Resource Management: String Allocations (WebApiDocsGenerator.Xref.cs:494)

  • Issue: string.Concat(filtered) on parameter lists could be inefficient
  • Fix: Use StringBuilder for building parameter signatures

LOW PRIORITY

9. Test Coverage Gaps

  • Missing tests for malformed JSON/XML input
  • No tests for path traversal attempts
  • Limited edge case testing (special characters, very large files)
  • No performance/benchmark tests

10. Path Traversal Protection (WebXrefSupport.cs:329-343)

  • Current: Uses Path.GetFullPath() with prefix checking
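  • Better: Use Path.GetRelativePath() (not available on .NET Framework, so a net472 target would need a fallback) and reject only real parent segments, so names like "..notes" still pass:
var relativePath = Path.GetRelativePath(normalizedRoot, full);
if (relativePath == ".." || relativePath.StartsWith(".." + Path.DirectorySeparatorChar, StringComparison.Ordinal) || Path.IsPathRooted(relativePath)) return null;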
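
A fuller containment check might look like the sketch below. The class and method names are hypothetical, and it assumes a target where Path.GetRelativePath exists (.NET Core 2.0 or later). It treats a path as escaping only when the relative form is exactly "..", begins with a ".." segment, or is rooted, which is what GetRelativePath returns when the two paths share no common root.

```csharp
using System;
using System.IO;

// Hypothetical containment check built on Path.GetRelativePath.
public static class PathGuardSketch
{
    public static bool TryResolveUnderRoot(string root, string candidate, out string fullPath)
    {
        string fullRoot = Path.GetFullPath(root);
        fullPath = Path.GetFullPath(Path.Combine(fullRoot, candidate));

        string relative = Path.GetRelativePath(fullRoot, fullPath);

        bool escapes =
            relative == ".." ||
            relative.StartsWith(".." + Path.DirectorySeparatorChar, StringComparison.Ordinal) ||
            relative.StartsWith(".." + Path.AltDirectorySeparatorChar, StringComparison.Ordinal) ||
            Path.IsPathRooted(relative); // different drive or root: no relative form exists

        return !escapes;
    }
}
```

Comparing whole segments instead of a raw ".." prefix avoids rejecting legitimate names that merely start with two dots.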

📊 Security Assessment

| Category | Severity | Finding |
| --- | --- | --- |
| DoS | HIGH | No JSON size limits - unbounded memory consumption possible |
| DoS | MEDIUM | Regex timeout (1s) may be too short but provides some protection |
| XSS | ✅ SAFE | Proper HTML encoding used (HttpUtility.HtmlEncode) |
| Path Traversal | LOW | Protection exists but could be more robust |
| Information Disclosure | LOW | File paths in log messages (acceptable for dev builds) |

🧪 Test Quality: 7.5/10

Positives:

  • 214 tests passing with comprehensive coverage
  • Good assertions on file I/O, JSON parsing, and exit codes
  • Tests validate both happy paths and error conditions
  • Proper use of temp directories with cleanup

Areas for Improvement:

  • Add negative tests for malformed input
  • Test path traversal scenarios
  • Add tests for very large files (performance)
  • Improve error message assertions (currently mostly success/failure)

🎯 Recommendations

Must Fix Before Merge:

  1. Switch to streaming JSON parsing in WebXrefMapMerger.cs
  2. Add a JSON size limit (file length check) alongside JsonDocumentOptions.MaxDepth
  3. Increase regex timeout or make configurable
  4. Fix swallowed exceptions - at minimum add logging

Should Fix:

  1. Replace dictionary-based serialization with typed models
  2. Consolidate multiple regex passes in WebXrefSupport.cs
  3. Use StringBuilder for parameter signature building
  4. Catch specific exception types instead of generic Exception

Consider:

  1. Add negative tests (malformed input, path traversal)
  2. Add performance tests for large files
  3. Improve path traversal validation with Path.GetRelativePath()

📝 Additional Notes

Positive Aspects:

  • Well-documented: XML comments are thorough and helpful
  • Consistent style: Follows C# conventions throughout
  • Good separation of concerns: Each service has a clear responsibility
  • Backward compatibility: Reflection-based encoding setup maintains .NET Framework support

Architecture:

  • The new LegacyFlatModuleHandling enum provides good migration support
  • GitHubTagConflictPolicy gives users appropriate control over tag behavior
  • Caching in MissingFunctionsAnalyzer should significantly improve performance

✅ Overall Assessment: Approve with Changes

This PR makes valuable improvements to builder reliability and adds important features (xref support, improved encoding, better missing function analysis). The code quality is generally good with proper null handling, clear structure, and comprehensive tests.

However, the HIGH priority issues (streaming JSON, DoS protection, regex timeout, swallowed exceptions) should be addressed before merging to production. The changes are well-tested and the architecture is sound.

Recommendation: Request changes for the 4 HIGH priority items, then approve.

@PrzemyslawKlys merged commit c6a033f into main on Feb 13, 2026
1 check passed
@PrzemyslawKlys deleted the feature/builder-improvements branch on February 13, 2026 at 11:14