Skip to content

fix(install): write apm.cmd shim as ASCII (cmd.exe cannot parse UTF-16LE)#1522

Merged
danielmeppiel merged 1 commit into
mainfrom
danielmeppiel/fix-windows-shim-ascii-encoding
May 27, 2026
Merged

fix(install): write apm.cmd shim as ASCII (cmd.exe cannot parse UTF-16LE)#1522
danielmeppiel merged 1 commit into
mainfrom
danielmeppiel/fix-windows-shim-ascii-encoding

Conversation

@danielmeppiel
Copy link
Copy Markdown
Collaborator

TL;DR

PR #1512 switched the Windows apm.cmd shim encoding to UTF-16LE with BOM on the assumption that cmd.exe auto-detects UTF-16 via the BOM. The Windows CI Test install.ps1 end-to-end job in run 26541947119 proves that wrong: invoking apm.cmd via PATH surfaces as garbled output (D:\a\apm\apm>��@) and exit code 1. Revert to ASCII; the %LOCALAPPDATA% literal token from #1509 keeps the embedded target ASCII-only, so the original non-ASCII profile-path bug stays fixed.

Problem (WHY)

Three failures on windows-latest in the post-merge CI for PR #1512:

[FAIL] Shim points into per-user releases dir (...Temp\apm-install-test-...\releases), not temp
[FAIL] apm.cmd --version exits 0 (got 1; output: D:\a\apm\apm>��@
[FAIL] apm.cmd --version reports v0.14.0

Two root causes:

  1. UTF-16LE BOM is not auto-detected by cmd.exe on PATH invocation. cmd.exe parses .cmd files through the system OEM/ANSI code page. The BOM bytes (FF FE) and LE-encoded @echo off get interpreted as ANSI and surface as >��@ garbage. cmd reports the prompt and exits 1.
  2. The E2E PowerShell test compared the shim text against the expanded release-dir path. On GitHub Windows runners, the test's temp prefix lives under %LOCALAPPDATA%\Temp\, so the install.ps1 if-branch fires and writes %LOCALAPPDATA%\Temp\... (the token form) instead of the expanded path. The string match fails.

Approach (WHAT)

  • Write apm.cmd with Set-Content -Encoding ASCII -NoNewline (the encoding that worked pre-fix(install): use %LOCALAPPDATA% literal in Windows shim (closes #1509) #1512).
  • The %LOCALAPPDATA% literal token from [BUG] The system cannot find the path specified. #1509 keeps the embedded shim target ASCII-only even when the user's profile directory contains non-ASCII characters -- so ASCII encoding does NOT re-introduce the [BUG] The system cannot find the path specified. #1509 bug for the default install location.
  • Custom APM_INSTALL_DIR roots with non-ASCII characters fall outside this contract (and were never tested before); the percent-escape and absolute-path fallback in the else branch are preserved as-is.
  • Relax the E2E "Shim points into per-user releases dir" assertion to accept either the expanded path or the %LOCALAPPDATA%-tokenized form.
  • Invert the encoding regression contract in test_windows_shim_template.py: trap UTF-16/UTF-8 writers, positively require -Encoding ASCII.

Implementation (HOW)

install.ps1 -- shim write call:

$shimContent = "@echo off`r`nREM Generated by install.ps1 (microsoft/apm#1509) -- do not hand-edit.`r`nREM cmd.exe expands %LOCALAPPDATA% at runtime; hand-edited paths break.`r`n`"$shimTarget`" %*`r`n"
Set-Content -Path $shimPath -Value $shimContent -Encoding ASCII -NoNewline

scripts/windows/test-install-script.ps1 -- accept tokenized form:

$expectedExpanded = $releaseRoot.Path
$expectedTokenized = $null
if ($localAppData -and $expectedExpanded.StartsWith($localAppData, ...)) {
    $expectedTokenized = "%LOCALAPPDATA%" + $expectedExpanded.Substring($localAppData.Length)
}
Assert-True ($matchesExpanded -or $matchesTokenized) ...

Validation evidence

  • All 8 regression tests in tests/unit/install/test_windows_shim_template.py pass locally.
  • Full lint chain silent: ruff check, ruff format --check, pylint R0801, scripts/lint-auth-signals.sh.
  • The actual Windows CI Test install.ps1 end-to-end job will run on this PR -- this is the only ground-truth verification for cmd.exe interpretation behavior and is what regressed previously.

Trade-offs

  • Custom APM_INSTALL_DIR roots with non-ASCII paths are not robustly supported. ASCII encoding may strip such characters. This matches pre-fix(install): use %LOCALAPPDATA% literal in Windows shim (closes #1509) #1512 behavior; the case was never tested. Users hitting it should set APM_INSTALL_DIR to an ASCII path or rely on the default %LOCALAPPDATA% location.
  • A truly bulletproof encoding for non-ASCII custom paths would require writing the shim in the active OEM code page ([System.Text.Encoding]::GetEncoding(0)), which differs between Windows PowerShell and PowerShell Core. Out of scope for this fix.

Refs #1509, #1512.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

…6LE)

PR #1512 switched the Windows shim encoding to UTF-16LE with BOM on
the assumption that cmd.exe auto-detects the encoding via the BOM.
The Windows CI 'Test install.ps1 end-to-end' job in run 26541947119
proves that assumption wrong: invoking apm.cmd via cmd /c surfaces
as garbled output ('D:\a\apm\apm>$@') and exit code 1. cmd.exe
parses .cmd files through the system OEM/ANSI code page when they
are invoked from PATH; the BOM is not enough to flip it into UTF-16
mode.

Fix: write the shim as ASCII. The %LOCALAPPDATA% literal token from
#1509 keeps the embedded shim target ASCII-only even when the user's
profile directory contains non-ASCII characters, so the original
#1509 motivation for a non-ASCII-safe encoding no longer applies for
the default install location. Custom APM_INSTALL_DIR roots with
non-ASCII paths fall outside this contract (and were not exercised
by the prior tests either) -- the percent-escape and absolute-path
fallback in the else branch are preserved.

Also relax the E2E PowerShell assertion 'Shim points into per-user
releases dir': on GitHub Windows runners the temp prefix lives under
%LOCALAPPDATA%, so the if-branch is taken and the shim text contains
the %LOCALAPPDATA% token rather than the expanded path. Accept
either form.

Regression tests updated:
- test_shim_not_written_with_utf16_encoding (new trap)
- test_shim_written_with_ascii_encoding (positive assertion)
- removed test_shim_uses_utf16le_with_bom_encoding /
  test_shim_not_written_with_ascii_encoding (inverted contract)

Lint chain silent (ruff check + format, pylint R0801,
lint-auth-signals.sh).

Refs #1509, #1512.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 27, 2026 22:35
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a Windows regression where apm.cmd was written in UTF-16LE (with BOM) and became unreadable by cmd.exe when invoked via PATH. It reverts the shim output to ASCII and updates tests to reflect the corrected encoding and %LOCALAPPDATA% tokenization behavior.

Changes:

  • Write apm.cmd using Set-Content -Encoding ASCII -NoNewline to keep the shim cmd.exe-compatible.
  • Relax the end-to-end Windows install test to accept either expanded or %LOCALAPPDATA%-tokenized shim targets.
  • Update the unit regression tests to forbid UTF-16/UTF-8 writers and positively assert ASCII encoding.
Show a summary per file
File Description
install.ps1 Switches shim writing back to ASCII and documents why UTF-16 is not reliable for cmd.exe batch parsing.
scripts/windows/test-install-script.ps1 Adjusts E2E assertion to accept tokenized %LOCALAPPDATA% form in shim content.
tests/unit/install/test_windows_shim_template.py Inverts regression contract: forbid UTF-16/UTF-8 writers and require -Encoding ASCII.
CHANGELOG.md Updates the prior Windows shim entry to describe ASCII encoding behavior.

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 2

Comment thread install.ps1
# Write the shim as ASCII. cmd.exe interprets .cmd files via the system
# OEM/ANSI code page and does NOT reliably auto-detect UTF-16LE (even
# with a BOM) when batch files are invoked via PATH or double-click; a
# UTF-16 shim surfaces as garbled output like ">��@" and exit code 1.
Comment thread CHANGELOG.md
### Fixed

- `install.ps1` Windows installer now works on accounts whose profile directory contains non-ASCII characters (e.g. an accented username). Previously `apm --version` reported `The system cannot find the path specified.` because the generated `apm.cmd` shim baked in an ASCII-mangled path. The shim now embeds the literal `%LOCALAPPDATA%` token (cmd.exe expands at runtime) when the install root lives under the local profile, is written as UTF-16LE with BOM instead of ASCII, percent-escapes literal `%` characters in custom `APM_INSTALL_DIR` paths, and carries an advisory `REM` so hand-edits do not re-introduce the bug. (closes #1509) (#1512)
- `install.ps1` Windows installer now works on accounts whose profile directory contains non-ASCII characters (e.g. an accented username). Previously `apm --version` reported `The system cannot find the path specified.` because the generated `apm.cmd` shim baked in an ASCII-mangled path. The shim now embeds the literal `%LOCALAPPDATA%` token (cmd.exe expands at runtime) when the install root lives under the local profile, percent-escapes literal `%` characters in custom `APM_INSTALL_DIR` paths, and carries an advisory `REM` so hand-edits do not re-introduce the bug. The shim is written as ASCII (cmd.exe does not auto-detect UTF-16 .cmd files invoked through PATH, even with a BOM); the `%LOCALAPPDATA%` literal token keeps the embedded target ASCII-safe regardless of the user's profile path encoding. (closes #1509) (#1512)
@danielmeppiel danielmeppiel merged commit 61917d6 into main May 27, 2026
21 checks passed
@danielmeppiel danielmeppiel deleted the danielmeppiel/fix-windows-shim-ascii-encoding branch May 27, 2026 22:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants