Skip to content

Migrate Tasks.Git.GetUntrackedFiles to multithreaded task model#1735

Draft
jankratochvilcz wants to merge 7 commits into
dotnet:mainfrom
jankratochvilcz:mt/get-untracked-files
Draft

Migrate Tasks.Git.GetUntrackedFiles to multithreaded task model#1735
jankratochvilcz wants to merge 7 commits into
dotnet:mainfrom
jankratochvilcz:mt/get-untracked-files

Conversation

@jankratochvilcz

@jankratochvilcz jankratochvilcz commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Summary

Migrates Microsoft.Build.Tasks.Git.GetUntrackedFiles to the MSBuild multithreaded (MT) task model so it no longer relies on the process current working directory (CWD) for path resolution. See dotnet/msbuild#14093 for the MT task model.

Approach: reject non-fully-qualified paths (attribute-only)

Uses the same design as the sibling LocateRepository PR (#1734): instead of absolutizing paths via TaskEnvironment.GetAbsolutePath, non-fully-qualified paths are rejected. This keeps the migration attribute-only (no IMultiThreadableTask / TaskEnvironment).

Base RepositoryTask (byte-identical to the sibling PR)

GetOrCreateRepositoryInstance() rejects a non-fully-qualified initial path (via PathUtilities.IsPathFullyQualified, an #if NETFRAMEWORK polyfill of the BCL API that net472 lacks) and reports the "missing repository" warning.

GetUntrackedFiles — two ProjectDirectory consumption sites, both guarded

ProjectDirectory is used in two places, and both must be MT-safe:

  1. Repository discovery (base, only when RepositoryId is null): the initial path is ProjectDirectory; the base guard rejects it if not fully qualified.
  2. File ItemSpec base (Execute, on both paths): ProjectDirectory is the base for resolving the relative Files ItemSpecs in GitOperations.GetUntrackedFiles. When RepositoryId is set the base serves the repository from the BuildEngine4 cache and performs no initial-path validation, so GetUntrackedFiles.Execute adds its own guard rejecting a non-fully-qualified ProjectDirectory (with the same warning, honoring NoWarnOnMissingInfo). In the shipped targets RepositoryId is always set (Condition="'$(_GitRepositoryId)' != ''"), so this task-level guard is the load-bearing one.
  • Sin 1: [Output] UntrackedFiles are the original filtered Files items and keep their original ItemSpecs.

Behavior change (intentional)

A relative ProjectDirectory that previously resolved against the process CWD is now rejected → "missing repository" warning + UntrackedFiles left null (matching the established null-on-failure contract). In the shipped targets ProjectDirectory is $(MSBuildProjectDirectory) (always fully qualified), so no production scenario is affected.

Comment scope note (from review)

GitOperations.GetUntrackedFiles computes Path.GetFullPath(Path.Combine(projectDirectory, file.ItemSpec)). With projectDirectory guaranteed fully qualified this is MT-safe for relative or absolute item specs. A drive-/root-relative file.ItemSpec would cause Path.Combine to discard the base and remain CWD/drive-dependent — but @(Compile) never produces such specs, so it is out of contract (documented in the code comment).

Tests

  • AbsoluteProjectDirectory_ReportsIgnoredFileAsUntracked — historical behavior: an absolute ProjectDirectory locates the repo and reports exactly the ignored file.
  • RelativeProjectDirectory_IsRejected_IndependentOfProcessCwd — CWD pointed at a real repo, relative ProjectDirectory (no RepositoryId) → rejected end-to-end (no CWD resolution), warning, output null.
  • RelativeProjectDirectory_OnCachedRepository_IsRejected — primes the shared repository cache via a real LocateRepository run, then reuses it with RepositoryId set + a relative ProjectDirectory, exercising the task-level guard on the cached path (fails if that guard is reverted).
  • A minimal inline IBuildEngine4 mock (the shared MockEngine only implements IBuildEngine); assembly test parallelization disabled for the CWD-mutating test.

Merge coordination

PathUtilities.cs, PathUtilitiesTests.cs, RepositoryTask.cs, and the test-assembly AssemblyInfo.cs are byte-identical to #1734, so git auto-resolves "both branches added the same content" when the second PR merges — no conflict.

Validation

  • ./build.sh --build → 0 warnings, 0 errors (both net472 and modern TFM compile).
  • dotnet test src/Microsoft.Build.Tasks.Git.UnitTests -f net11.0 → 449 passed, 4 skipped (Windows-only), 0 failed.

Related: dotnet/msbuild#14093.

Migrate `Microsoft.Build.Tasks.Git.GetUntrackedFiles` to the MSBuild
multithreaded (MT) task execution model so it no longer depends on the
process current working directory for path resolution.

Base RepositoryTask:
- Implement `IMultiThreadableTask` and add
  `TaskEnvironment TaskEnvironment { get; set; } = TaskEnvironment.Fallback;`
  (default initializer preserves existing single-process/test semantics).
- In `GetOrCreateRepositoryInstance()`, resolve the initial path via
  `TaskEnvironment.GetAbsolutePath(initialPath)` and pass the resulting
  absolute path to `GitRepository.TryFindRepository`. `TryFindRepository`
  canonicalizes with `Path.GetFullPath` internally, which is MT-safe and
  behavior-preserving when given an already-absolute base.
- Sin 2: both missing-repository warning call sites keep the ORIGINAL
  (possibly relative) `initialPath` for diagnostics.
- Edge: for `[Required, NotNull]` initial paths an empty string makes
  `GetAbsolutePath("")` throw `ArgumentException`, matching the prior
  `Path.GetFullPath("")` behavior. This base change is shared with the
  sibling LocateRepository migration and is intentionally minimal.

GetUntrackedFiles + GitOperations:
- Add `[MSBuildMultiThreadableTask]` to `GetUntrackedFiles`.
- In `Execute(GitRepository)`, absolutize `ProjectDirectory` via
  `TaskEnvironment.GetAbsolutePath` and pass the absolute string to
  `GitOperations.GetUntrackedFiles`, so the inner
  `Path.GetFullPath(Path.Combine(projectDirectory, file.ItemSpec))`
  operates on an absolute base (MT-safe). The `GitOperations` signature is
  unchanged (callers now pass an already-absolute path); the internal
  testing overload is preserved.
- Sin 1: the absolutized project directory is used only for the ignore
  check; `[Output] UntrackedFiles` items are the original filtered `Files`
  items and keep their original ItemSpecs.

Call-chain audit: from `Execute` through `GitOperations.GetUntrackedFiles`,
`BuildDirectoryTree`, `GetContainingRepositoryMatcher`,
`IsNormalizedFilePathIgnored` and submodule handling, every downstream
`Path.GetFullPath` operates on an already-absolute base (the absolutized
project directory or the repository working directory derived from absolute
discovery). No `Directory.GetCurrentDirectory()`/`Environment.CurrentDirectory`
or direct env mutation remains in the chain.

Test: add `GetUntrackedFilesTests` (decoy-CWD pattern). It runs the full
task against a real on-disk git repo while the process CWD points at an
unrelated decoy directory, with `ProjectDirectory` and `Files` relative and
`TaskEnvironment` set to the repo directory. The single assertion catches
reverting either half of the migration (base discovery or call-site
absolutization). Includes a minimal `IBuildEngine4` mock and an
auto-restoring CWD helper.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jankratochvilcz jankratochvilcz requested a review from tmat as a code owner June 30, 2026 13:57
jankratochvilcz and others added 2 commits June 30, 2026 16:00
…llelization

Make the shared RepositoryTask base migration byte-identical to the
LocateRepository PR to avoid a needless merge conflict, and disable
assembly-level test parallelization since the new test mutates the
process current working directory.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Wrap GetAbsolutePath + TryFindRepository so an empty/null initial path
  reports a missing-repository warning and returns gracefully instead of
  throwing an unhandled ArgumentException (preserves pre-migration behavior;
  ExecuteImpl's catch does not handle ArgumentException). Shared base kept
  byte-identical with mt/locate-repository (dotnet#1734).
- Rewrite the parity comment to describe the control-flow equivalence.
- Add EmptyProjectDirectory_DegradesGracefully regression test.
- Make AssemblyInfo.cs parallelization comment test-agnostic (was wrongly
  naming LocateRepositoryTests).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jankratochvilcz

Copy link
Copy Markdown
Contributor Author

Review feedback addressed

Thanks for the MT + expert reviews. Pushed e133a13:

MINOR — empty/null ProjectDirectory graceful → crash divergence (shared base, Sin 6). Fixed in the shared RepositoryTask base (kept byte-identical with #1734). Post-migration GetAbsolutePath("") threw ArgumentException before TryFindRepository, escaping ExecuteImpl's IOException/InvalidDataException/NotSupportedException catch (→ MSB4018). Pre-migration the empty path reached TryFindRepository's internal Path.GetFullPath(""), which threw the same exception but had it swallowed → graceful warning + Execute() == true. Now wrapped in try { … } catch (ArgumentException) { ReportMissingRepositoryWarning(initialPath); return null; }.

I empirically verified GetAbsolutePath throws only on null/"", not whitespace-only, so the reachable case is empty string via direct Execute(). The misleading parity comment was rewritten to state the control-flow (not just exception-type) equivalence.

NIT — AssemblyInfo.cs comment named LocateRepositoryTests. Reworded test-agnostically (the directive applies to whichever test mutates CWD), keeping the file byte-identical with #1734.

Regression test. Added EmptyProjectDirectory_DegradesGracefully_DoesNotThrow (asserts Execute() == true, no errors, null output). Dual-fault verified against the base revert.

GitEnvironment.Create reading process env directly (MT call-chain gap). Acknowledged — this is pre-existing foundation/shared code (reached only when ConfigurationScope is unset; these are machine-global vars, not per-project). The test pins ConfigurationScope = "local". Out of scope for this task migration; suggest a follow-up issue to route GitEnvironment discovery through TaskEnvironment.

All 437 Microsoft.Build.Tasks.Git.UnitTests (net11.0) pass; full build clean. Sin 1 (output ItemSpecs unchanged) and Sin 5 (canonicalization preserved) verified by the reviewers.

// process CWD before it is used as the base for resolving the (relative) file ItemSpecs inside
// GitOperations.GetUntrackedFiles. The absolutized path is only used internally for the ignore
// check; the returned [Output] items keep their original ItemSpecs (Sin 1).
AbsolutePath absProjectDir = TaskEnvironment.GetAbsolutePath(ProjectDirectory);

@tmat tmat Jul 1, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ProjectDirectory is already expected to be an absolute path. An error can be reported if it is not.

jankratochvilcz and others added 4 commits July 3, 2026 13:10
…dFiles)

Align with the sibling LocateRepository PR: make the MT migration
attribute-only by rejecting non-fully-qualified paths rather than
absolutizing them via TaskEnvironment.

- Share the byte-identical RepositoryTask base and IsPathFullyQualified
  polyfill (PathUtilities) + tests with the LocateRepository PR, so the
  two PRs auto-merge without conflict.
- GetUntrackedFiles independently rejects a non-fully-qualified
  ProjectDirectory in Execute: when RepositoryId is provided the base
  serves the repository from the cache and skips initial-path validation,
  yet ProjectDirectory is still consumed as the base for resolving file
  item specs, so it must be guarded there too.
- Drop IMultiThreadableTask/TaskEnvironment; keep [MSBuildMultiThreadableTask]
  on the concrete task.
- Tests cover historical behavior (absolute dir reports ignored file),
  CWD-independence (relative dir rejected in base), and the cached-path
  guard (relative dir rejected in Execute after priming the cache).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Soften the GitOperations MT-safety comment: Path.Combine discards the
  base when a file ItemSpec is itself drive-/root-relative, so such specs
  (never produced by @(Compile)) remain out of contract.
- Correct the RelativeProjectDirectory_IsRejected_IndependentOfProcessCwd
  test docstring: on the RepositoryId-null path either the base or the
  task guard suffices, so it documents end-to-end behavior rather than
  isolating the base guard (covered by the sibling PR and the cached-path
  test).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the inline IBuildEngine4 test engine with the shared TestUtilities.MockEngine4
and convert list-based assertions to the shared engine's Log surface.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants