diff --git a/build.gradle b/build.gradle index b04a70a0c..04298b5a4 100644 --- a/build.gradle +++ b/build.gradle @@ -288,7 +288,15 @@ tasks.register('testModule', Test) { useJUnitPlatform { includeTags 'module' } - + + // Tests that spawn a child perl (via $^X, e.g. System::Command) + // need PERLONJAVA_EXECUTABLE set to a real launcher, since under + // Gradle "jperl" isn't on $PATH. GlobalContext reads this env var + // when initializing $^X. + def jperlLauncher = org.gradle.internal.os.OperatingSystem.current().isWindows() + ? "jperl.bat" : "jperl" + environment 'PERLONJAVA_EXECUTABLE', file(jperlLauncher).absolutePath + shouldRunAfter testUnit } diff --git a/dev/modules/git_modules_support.md b/dev/modules/git_modules_support.md new file mode 100644 index 000000000..82e0d4486 --- /dev/null +++ b/dev/modules/git_modules_support.md @@ -0,0 +1,258 @@ +# Git CPAN Module Support on PerlOnJava + +## Goal + +Make Git-related CPAN modules (`Git::Wrapper`, `Git::Repository`, and +transitively `System::Command`) pass their test suites on PerlOnJava. + +`Git::Raw` is out of scope — it's an XS wrapper around libgit2 and requires +a from-scratch JGit-backed port. See the summary at the end. + +## Motivation + +Users asked about Git CPAN module support. `./jcpan -t Git::Raw` fails because +Git::Raw is XS-only. `./jcpan -t Git::Wrapper` and `./jcpan -t Git::Repository` +install cleanly (both are pure Perl) but fail at runtime: + +- **Git::Wrapper**: 5/57 subtests fail across 2 test files. +- **Git::Repository**: most tests silently skip with + `fork() not supported on this platform (Java/JVM)` because its core dependency + `System::Command` uses `fork + exec`. + +## Investigation Results + +### Root cause #1 — `local $tied_scalar = value` does not dispatch through STORE + +This is the **primary bug** blocking `Git::Wrapper`. `File::chdir` exports a +tied scalar `$CWD` whose STORE calls `chdir`. `Git::Wrapper::RUN` uses +`local $CWD = $self->dir` to scope the cwd change around each `git` invocation. + +Minimal reproduction: + +```perl +package T; +sub TIESCALAR { bless [], shift } +sub FETCH { print "FETCH\n"; $_[0][0] } +sub STORE { print "STORE: $_[1]\n"; $_[0][0] = $_[1] } +package main; +our $x; tie $x, "T"; +$x = "direct"; # STORE called — OK +{ local $x = "scoped"; } # PerlOnJava: NO STORE, NO FETCH. +``` + +| | Real Perl | PerlOnJava | +|---|---|---| +| `local $x = value` (entering scope) | FETCH → STORE "" → STORE value | (silent; tie is bypassed) | +| Inside scope, reading `$x` | FETCH returns "scoped" | returns "scoped" from plain slot | +| Leaving scope | STORE original | (silent) | + +**Consequence for Git::Wrapper**: every `local $CWD = $self->dir` is a no-op, +so every `git` subcommand runs in the process's original cwd (the build +directory, which contains 32 files). `ls_files` returns those 32 files, `add +.` stages them, and so on. This explains all 5 basic.t failures and the +2 path_class.t failures in one go. + +**Location**: `src/main/java/org/perlonjava/runtime/runtimetypes/GlobalRuntimeScalar.java` +— `dynamicSaveState()` replaces the slot in `GlobalVariable.globalVariables` +with a fresh untied `GlobalRuntimeScalar`, dropping the `TieScalar` magic. +`dynamicRestoreState()` restores the original slot verbatim. + +**Fix direction**: +1. In `dynamicSaveState()`, when the original slot's `value` is a `TieScalar`, + call `tiedFetch()` to save the current FETCH result, then call `tiedStore()` + with the empty string (matching real Perl's "clear first" behaviour). +2. Do **not** replace the slot — keep the same tied scalar throughout the + localized scope so that assignments go through `tiedStore`. +3. In `dynamicRestoreState()`, call `tiedStore()` with the saved FETCH value + to restore. +4. If instead we keep the "replace slot" approach, the new slot must be a + freshly-constructed `TieScalar` bound to the same handler object so STORE + still dispatches. This is trickier because `local` restore then has to + re-install the original tied scalar. + +Strategy (1)+(2)+(3) is closest to real Perl semantics and avoids copying +the tie magic. It's also consistent with how non-global tied scalars would +need to behave. + +### Root cause #2 — `System::Command` can't spawn without `fork` + +`System::Command::_spawn` is a hand-rolled `pipe + fork + exec` using +`Symbol::gensym`, `pipe`, `fork`, `setpgrp`, `fcntl(F_GETFD)`, and +`exec { $cmd[0] } @cmd`. PerlOnJava has no `fork`, so this path dies. +`Git::Repository` depends on it; its test suite detects the failure and +skips almost every test. + +**Fix direction — two options, not mutually exclusive**: + +**Option A (preferred): ship a patched `System/Command.pm`** in +`src/main/perl/lib/` that replaces `$_spawn` with a code path using +`IPC::Open3::open3` (already implemented on PerlOnJava via `ProcessBuilder`; +see `dev/modules/ipc_open3_fix.md`). We need: + +- `pid`, `stdin`, `stdout`, `stderr` handles that behave like the fork + versions. `open3` already gives us all four. +- `cwd`, `env` options — implement via `ProcessBuilder` directly? Current + `IPC::Open3` does not expose cwd/env. We may need a thin Java helper, or + wrap `open3` with a `chdir ... open3 ... chdir back` shim (env via + `local %ENV = (%ENV, %override)` is fine). +- `setpgrp` is a no-op on JVM — OK. +- `interactive` mode uses `system`, which already works. +- `System::Command::Reaper` wraps the handles and `waitpid`s on close. Should + still work since `open3` returns a real PID that `waitpid` understands + (verified by Git::Wrapper's working test cases — it waitpids on an open3 + PID). + +**Option B: upstream-compatible shim** — only monkey-patch `_spawn` without +changing the rest. Less invasive, easier to keep in sync with CPAN. + +We'll prototype Option B first (smallest patch), fall back to Option A if +there are integration issues. + +### Root cause #3 — `Git::Repository` calls uninitialized values + +After System::Command failure, `Git::Repository->new` continues without a +`git_dir` and prints warnings like: + +``` +Use of uninitialized value in join or string at Git/Repository.pm line 99. +Use of uninitialized value in join or string at Git/Repository.pm line 102. +``` + +These disappear once System::Command works. Not a PerlOnJava bug — just +downstream fallout. + +## Plan + +### Phase 1 — Fix `local $tied_scalar = value` (unblocks Git::Wrapper) + +1. Reproduce with a minimal unit test under `src/test/resources/unit/` that + counts STORE/FETCH calls on a tied scalar inside a `local` scope. Ensure + the test matches the sequence observed in system `perl`: + `FETCH, STORE "", STORE value, FETCH (inside), STORE original_value`. +2. Modify `GlobalRuntimeScalar.dynamicSaveState/dynamicRestoreState` to + detect `TieScalar` and route through `tiedFetch`/`tiedStore` instead of + swapping the slot. +3. Verify the tied magic is preserved inside the scope (assignments still + dispatch STORE) and restored on exit. +4. Re-run `./jcpan -t Git::Wrapper`. Expect basic.t and path_class.t to + recover. Confirm with `make` (must stay green). + +### Phase 2 — Patch `System::Command` (unblocks Git::Repository) + +1. Copy `System/Command.pm` into `src/main/perl/lib/System/Command.pm` as a + baseline. +2. Rewrite `$_spawn` (and the `MSWin32` special case) to route through + `IPC::Open3::open3`, handling `cwd`, `env`, and `input` options. +3. Audit `System::Command::Reaper` for fork-specific assumptions. If `waitpid` + on an open3 PID + `close` on the handles is sufficient, leave it alone. +4. Remove the `fork() not supported` skip guard in `System/Command.pm` (or + leave untouched if the new path doesn't trigger it). +5. Re-run `./jcpan -t System::Command` first (simpler surface), then + `./jcpan -t Git::Repository`. +6. Iterate on any remaining issues — e.g., `setpgrp`, `trace`, signal + handling — but treat these as optional if the happy path passes. + +### Phase 3 — Follow-up housekeeping + +- Update `docs/FEATURE_MATRIX.md` (or equivalent) noting Git::Wrapper and + Git::Repository as supported, Git::Raw as unsupported. +- Add a short note to `AGENTS.md` about the `local $tied = ...` fix so + others know it now works. +- Consider opening issues for any residual `System::Command` options we + don't bother supporting (e.g., `setpgrp`) so users know what's missing. + +## Out of scope — `Git::Raw` + +Git::Raw bundles libgit2 + zlib + pcre + http-parser source and requires a +C compiler. PerlOnJava would need a JGit-backed Perl module that +re-implements ~40 classes' worth of Git::Raw's API. Comparable in size to +the Crypt::OpenSSL Bouncy Castle port. Defer until asked. + +## Progress Tracking + +### Current Status: Phase 2 complete (with caveat); Phase 3 pending + +### Completed Phases + +- [x] **Phase 1 — `local $tied_scalar = value` (2026-04-22)** + - Unit test added in `src/test/resources/unit/tie_scalar.t` asserting + STORE dispatch on entry, inside, and exit of a localized scope. + - `GlobalRuntimeScalar.dynamicSaveState/dynamicRestoreState` now detects + `TIED_SCALAR` and keeps the tie in place, dispatching `STORE(undef)` + on entry and `STORE(savedValue)` on exit (matches real Perl). + - **`./jcpan -t Git::Wrapper`: 75/75 pass** (was 52/57). + - `make` stays green. + +- [x] **Phase 2 — `System::Command` IPC::Open3 fallback (2026-04-22)** + - Patched `_spawn` in `System/Command.pm`: on PerlOnJava (detected via + `$Config{perlonjava}`), route through `IPC::Open3::open3` instead of + the manual pipe+fork+exec. `cwd` and `env` are already handled by + the caller via `chdir` and `local %ENV`. + - Bundled patched `System/Command.pm` + unchanged `System/Command/Reaper.pm` + in `src/main/perl/lib/` so fresh installs get the working version. + - **`./jcpan -t Git::Repository`: 304/328 pass (93%)**, previously almost + fully skipped with `fork() not supported`. + - **`./jcpan -t System::Command`: 132/140 pass (94%)**. Remaining 8 are + a `$ENV{SHLVL}` mismatch — IPC::Open3 appears to wrap the child in a + shell, bumping SHLVL. Unrelated to fork, low priority. + +- [x] **Phase 2b — open3 exec-failure + waitpid SIG{CHLD}=IGNORE (2026-04-22)** + - Real-Perl parity for `IPC::Open3` shell-wrapping: bare single-arg + commands now exec directly instead of via `/bin/sh -c`, so exec + failures surface as `open3: exec of X failed: Y`. The bundled + `System/Command.pm` translates that into the fork-path's + `Can't exec( @cmd ): Y` so `eval`-based error tests keep working. + - `WaitpidOperator` now honours `$SIG{CHLD} eq 'IGNORE'`: when set, + waitpid on a tracked Java child returns -1 (ECHILD simulation) + and leaves `$?` untouched, so `System::Command::Reaper` reports + the `(-1, -1, -1)` BOGUS triple the POSIX semantics call for. + - Also fixed a real `ConcurrentModificationException` in + `GlobalDestruction.runGlobalDestruction` uncovered while + repro'ing t/30-exit.t: DESTROY callbacks could mutate the global + HashMaps mid-iteration. Snapshot before walking. + - **`./jcpan -t System::Command`: 230/241 pass (95.4%)**, up from 94%. + - t/11-spawn-fail.t 2/2 (was 0/2) + - t/20-zombie.t 31/32 (was 27/30) + - Remaining 11 are 9 DESTROY-scope tests (tracked on the + `weaken/DESTROY` branch) + 1 `-Ilib` test-harness artifact + + 1 related zombie-DESTROY assertion. + +### Caveat: install-time precedence + +`@INC` lists `~/.perlonjava/lib` before the JAR-bundled lib. If a user +has previously run `./jcpan -i System::Command`, their installed copy +shadows the bundled patched version. Workarounds: + +1. Manually remove `~/.perlonjava/lib/System/Command.pm` — the bundled + version will then load. +2. **Future**: teach `jcpan`/MakeMaker to apply the `_spawn` patch during + install for modules listed in a patch registry. See + `dev/modules/cpan_patch_plan.md` for a broader strategy. + +For this PR we accept that existing users need workaround (1). New +installations Just Work out of the box. + +### Next Steps + +- Phase 3 housekeeping: update `docs/FEATURE_MATRIX.md` if it mentions + Git modules. +- Optional: investigate the `SHLVL` mismatch in `System::Command` tests. +- Optional: investigate the 24 remaining Git::Repository subtest failures + (they look like minor edge cases — `hello redefined` warnings, version + parse variants, etc.). +- Optional: implement an install-time patching mechanism so that + `~/.perlonjava/lib/System/Command.pm` is auto-patched after + `jcpan -i`. + +### Open Questions + +- Should the bundled copy override the user install instead? Would require + reordering `@INC` (JAR first) for specific paths — risky as it would + break legitimate user upgrades of other modules. + +## Related Docs + +- `dev/modules/ipc_open3_fix.md` — prior work on IPC::Open3 / IO::Select. +- `dev/modules/xs_fallback.md` — XS/C handling in MakeMaker. +- `dev/modules/cpan_patch_plan.md` — broader strategy for patching CPAN + modules on PerlOnJava. diff --git a/src/main/java/org/perlonjava/core/Configuration.java b/src/main/java/org/perlonjava/core/Configuration.java index ed92af470..6a78ddb0d 100644 --- a/src/main/java/org/perlonjava/core/Configuration.java +++ b/src/main/java/org/perlonjava/core/Configuration.java @@ -33,7 +33,7 @@ public final class Configuration { * Automatically populated by Gradle/Maven during build. * DO NOT EDIT MANUALLY - this value is replaced at build time. */ - public static final String gitCommitId = "bd326524c"; + public static final String gitCommitId = "687b74120"; /** * Git commit date of the build (ISO format: YYYY-MM-DD). @@ -48,7 +48,7 @@ public final class Configuration { * Parsed by App::perlbrew and other tools via: perl -V | grep "Compiled at" * DO NOT EDIT MANUALLY - this value is replaced at build time. */ - public static final String buildTimestamp = "Apr 22 2026 13:43:46"; + public static final String buildTimestamp = "Apr 22 2026 14:52:02"; // Prevent instantiation private Configuration() { diff --git a/src/main/java/org/perlonjava/runtime/operators/WaitpidOperator.java b/src/main/java/org/perlonjava/runtime/operators/WaitpidOperator.java index 8cb7a6dfc..3186e3600 100644 --- a/src/main/java/org/perlonjava/runtime/operators/WaitpidOperator.java +++ b/src/main/java/org/perlonjava/runtime/operators/WaitpidOperator.java @@ -10,6 +10,7 @@ import java.util.Map; import java.util.concurrent.ConcurrentHashMap; +import static org.perlonjava.runtime.runtimetypes.GlobalVariable.getGlobalHash; import static org.perlonjava.runtime.runtimetypes.GlobalVariable.getGlobalVariable; import static org.perlonjava.runtime.runtimetypes.RuntimeContextType.SCALAR; @@ -40,6 +41,23 @@ public static RuntimeScalar waitpid(int ctx, RuntimeBase... args) { } } + /** + * Returns true if $SIG{CHLD} is currently set to 'IGNORE'. Under + * that disposition POSIX mandates the kernel auto-reap children, so + * subsequent waitpid() returns -1 with errno=ECHILD (see waitpid(2) + * on Linux, and Perl's perlipc documentation). We simulate that so + * test suites like System::Command's t/20-zombie.t can observe the + * expected "BOGUS exit status" pattern instead of a clean reap. + */ + private static boolean isChldIgnored() { + try { + RuntimeScalar h = getGlobalHash("main::SIG").get("CHLD"); + return h != null && "IGNORE".equals(h.toString()); + } catch (Exception e) { + return false; + } + } + private static RuntimeScalar waitpidPosix(int pid, int flags) { if (pid > 0) { Process javaProcess = RuntimeIO.getChildProcess(pid); @@ -71,16 +89,25 @@ private static RuntimeScalar waitpidPosix(int pid, int flags) { private static RuntimeScalar waitpidJavaProcess(int pid, Process process, int flags) { boolean nonBlocking = (flags & WNOHANG) != 0; + boolean chldIgnore = isChldIgnored(); if (nonBlocking) { if (process.isAlive()) return new RuntimeScalar(0); int exitCode = process.exitValue(); RuntimeIO.removeChildProcess(pid); + if (chldIgnore) { + // Kernel auto-reap simulation: discard status, signal + // ECHILD by returning -1. Do NOT update $?. + return new RuntimeScalar(-1); + } setExitStatus(exitCode << 8); return new RuntimeScalar(pid); } try { int exitCode = process.waitFor(); RuntimeIO.removeChildProcess(pid); + if (chldIgnore) { + return new RuntimeScalar(-1); + } setExitStatus(exitCode << 8); return new RuntimeScalar(pid); } catch (InterruptedException e) { diff --git a/src/main/java/org/perlonjava/runtime/perlmodule/IPCOpen3.java b/src/main/java/org/perlonjava/runtime/perlmodule/IPCOpen3.java index b55237d3d..43b5f7953 100644 --- a/src/main/java/org/perlonjava/runtime/perlmodule/IPCOpen3.java +++ b/src/main/java/org/perlonjava/runtime/perlmodule/IPCOpen3.java @@ -61,6 +61,51 @@ private static void copyPerlEnvToProcessBuilder(ProcessBuilder processBuilder) { } } + /** + * Shell-metacharacter sniffer used to decide whether a single-arg + * command should be handed to `/bin/sh -c` or exec'd directly. + * Mirrors the heuristic real Perl uses in IPC::Open3 / system(): + * any of these characters cause shell invocation; otherwise we + * exec directly so that exec-failure can be caught by the caller. + */ + private static boolean hasShellMetacharacters(String s) { + for (int i = 0; i < s.length(); i++) { + char c = s.charAt(i); + if (Character.isWhitespace(c)) return true; + switch (c) { + case '|': case '&': case ';': case '<': case '>': + case '(': case ')': case '$': case '`': case '\\': + case '"': case '\'': case '*': case '?': case '[': + case ']': case '{': case '}': case '!': case '#': + case '~': case '=': + return true; + default: + } + } + return false; + } + + /** + * Translate a java.io.IOException from ProcessBuilder.start() into + * a short, Perl-style $! string. We pattern-match rather than chain + * through strerror because the JDK's wording varies per-platform. + */ + private static String translateIOError(java.io.IOException ioe) { + String msg = ioe.getMessage(); + if (msg == null) return "Exec failed"; + if (msg.contains("error=2") + || msg.contains("No such file or directory")) { + return "No such file or directory"; + } + if (msg.contains("error=13") || msg.contains("Permission denied")) { + return "Permission denied"; + } + // Strip the "(in directory ...)" decoration the JDK tacks on. + int cut = msg.indexOf(" (in directory "); + if (cut > 0) return msg.substring(0, cut); + return msg; + } + /** * Register child process for waitpid() - handles both Windows and POSIX. */ @@ -108,12 +153,21 @@ public static RuntimeList _open3(RuntimeArray args, int ctx) { // Build the command String[] command; if (commandList.size() == 1) { - // Single string - use shell + // Single-element @cmd: follow real Perl's IPC::Open3 + // rule — wrap in a shell only if the string contains + // shell metacharacters. Bare executable names go + // direct, which matches `exec { $cmd[0] } @cmd` in the + // fork path and lets us surface exec-failure errors + // (System::Command's t/11-spawn-fail.t depends on this). String cmd = commandList.get(0); - if (IS_WINDOWS) { - command = new String[]{"cmd.exe", "/c", cmd}; + if (hasShellMetacharacters(cmd)) { + if (IS_WINDOWS) { + command = new String[]{"cmd.exe", "/c", cmd}; + } else { + command = new String[]{"/bin/sh", "-c", cmd}; + } } else { - command = new String[]{"/bin/sh", "-c", cmd}; + command = new String[]{cmd}; } } else { // Multiple arguments - direct execution @@ -140,7 +194,21 @@ public static RuntimeList _open3(RuntimeArray args, int ctx) { } // Start the process - Process process = processBuilder.start(); + Process process; + try { + process = processBuilder.start(); + } catch (java.io.IOException ioe) { + // Match real Perl's IPC::Open3 error phrasing so callers + // (notably System::Command, which croaks $@ and matches + // `qr/^Can't exec\( ... \): /` in its test suite) can + // recognise the failure. We also preserve the Java + // errno-ish detail after the colon. + String cmd0 = commandList.get(0); + String detail = translateIOError(ioe); + getGlobalVariable("main::!").set(detail); + throw new RuntimeException( + "open3: exec of " + cmd0 + " failed: " + detail); + } long pid = process.pid(); // Register the process for waitpid() - works on both Windows and POSIX @@ -176,6 +244,10 @@ public static RuntimeList _open3(RuntimeArray args, int ctx) { return new RuntimeScalar(pid).getList(); + } catch (RuntimeException re) { + // Our inner throw already carries the correct "open3: ..." + // prefix; don't double-wrap. + throw re; } catch (Exception e) { getGlobalVariable("main::!").set(e.getMessage()); throw new RuntimeException("open3: " + e.getMessage()); @@ -336,12 +408,17 @@ public static RuntimeList _open2(RuntimeArray args, int ctx) { // Build the command String[] command; if (commandList.size() == 1) { - // Single string - use shell + // Single-element @cmd: only wrap in a shell if the + // string has shell metacharacters (same rule as open3). String cmd = commandList.get(0); - if (IS_WINDOWS) { - command = new String[]{"cmd.exe", "/c", cmd}; + if (hasShellMetacharacters(cmd)) { + if (IS_WINDOWS) { + command = new String[]{"cmd.exe", "/c", cmd}; + } else { + command = new String[]{"/bin/sh", "-c", cmd}; + } } else { - command = new String[]{"/bin/sh", "-c", cmd}; + command = new String[]{cmd}; } } else { // Multiple arguments - direct execution @@ -359,7 +436,16 @@ public static RuntimeList _open2(RuntimeArray args, int ctx) { processBuilder.redirectError(ProcessBuilder.Redirect.INHERIT); // Start the process - Process process = processBuilder.start(); + Process process; + try { + process = processBuilder.start(); + } catch (java.io.IOException ioe) { + String cmd0 = commandList.get(0); + String detail = translateIOError(ioe); + getGlobalVariable("main::!").set(detail); + throw new RuntimeException( + "open2: exec of " + cmd0 + " failed: " + detail); + } long pid = process.pid(); // Register the process for waitpid() - works on both Windows and POSIX @@ -373,6 +459,8 @@ public static RuntimeList _open2(RuntimeArray args, int ctx) { return new RuntimeScalar(pid).getList(); + } catch (RuntimeException re) { + throw re; } catch (Exception e) { getGlobalVariable("main::!").set(e.getMessage()); throw new RuntimeException("open2: " + e.getMessage()); diff --git a/src/main/java/org/perlonjava/runtime/runtimetypes/GlobalDestruction.java b/src/main/java/org/perlonjava/runtime/runtimetypes/GlobalDestruction.java index 40fd6ca79..69c09474b 100644 --- a/src/main/java/org/perlonjava/runtime/runtimetypes/GlobalDestruction.java +++ b/src/main/java/org/perlonjava/runtime/runtimetypes/GlobalDestruction.java @@ -1,5 +1,7 @@ package org.perlonjava.runtime.runtimetypes; +import java.util.ArrayList; + /** * Handles global destruction at program exit. *

@@ -19,28 +21,36 @@ public static void runGlobalDestruction() { // Set ${^GLOBAL_PHASE} to "DESTRUCT" GlobalVariable.getGlobalVariable(GlobalContext.GLOBAL_PHASE).set("DESTRUCT"); + // Snapshot the collections before iterating: a DESTROY callback may + // mutate GlobalVariable.{globalVariables,globalArrays,globalHashes} + // (e.g. by creating a new tied variable, opening/closing handles, + // or installing END-like cleanup), which would otherwise raise + // ConcurrentModificationException. Real-world trigger: exit(N) + // while holding a System::Command object whose Reaper's DESTROY + // spawns further cleanup. See dev/modules/git_modules_support.md. + // Walk all global scalars - for (RuntimeScalar val : GlobalVariable.globalVariables.values()) { + for (RuntimeScalar val : new ArrayList<>(GlobalVariable.globalVariables.values())) { destroyIfTracked(val); } // Walk global arrays for blessed ref elements - for (RuntimeArray arr : GlobalVariable.globalArrays.values()) { + for (RuntimeArray arr : new ArrayList<>(GlobalVariable.globalArrays.values())) { // Skip tied arrays — iterating them calls FETCHSIZE/FETCH on the // tie object, which may already be destroyed or invalid at global // destruction time (e.g., broken ties from eval+last). if (arr.type == RuntimeArray.TIED_ARRAY) continue; - for (RuntimeScalar elem : arr) { + for (RuntimeScalar elem : new ArrayList<>(arr.elements)) { destroyIfTracked(elem); } } // Walk global hashes for blessed ref values - for (RuntimeHash hash : GlobalVariable.globalHashes.values()) { + for (RuntimeHash hash : new ArrayList<>(GlobalVariable.globalHashes.values())) { // Skip tied hashes — iterating them dispatches through FIRSTKEY/ // NEXTKEY/FETCH which may fail if the tie object is already gone. if (hash.type == RuntimeHash.TIED_HASH) continue; - for (RuntimeScalar elem : hash.values()) { + for (RuntimeScalar elem : new ArrayList<>(hash.elements.values())) { destroyIfTracked(elem); } } diff --git a/src/main/java/org/perlonjava/runtime/runtimetypes/GlobalRuntimeScalar.java b/src/main/java/org/perlonjava/runtime/runtimetypes/GlobalRuntimeScalar.java index e4f505b65..5461db1ac 100644 --- a/src/main/java/org/perlonjava/runtime/runtimetypes/GlobalRuntimeScalar.java +++ b/src/main/java/org/perlonjava/runtime/runtimetypes/GlobalRuntimeScalar.java @@ -46,7 +46,29 @@ public void dynamicSaveState() { // Save the current global reference var originalVariable = GlobalVariable.globalVariables.get(fullName); - localizedStack.push(new SavedGlobalState(fullName, originalVariable)); + // Tied scalars need special handling: the tie magic must stay in + // place for the duration of the localized scope, so that an + // assignment `local $tied = value` dispatches through STORE (and + // restoration on scope exit dispatches STORE with the saved + // value). This matches real Perl semantics and is required by + // modules like File::chdir whose tied $CWD actually chdir's in + // STORE. See dev/modules/git_modules_support.md. + if (originalVariable != null + && originalVariable.type == RuntimeScalarType.TIED_SCALAR) { + RuntimeScalar savedValue = originalVariable.tiedFetch(); + // Real Perl dispatches STORE(undef) on entry to localize so + // the tie handler sees the transition. Modules like + // File::chdir explicitly short-circuit on undef + // (`return unless defined $_[1];`). + originalVariable.tiedStore(RuntimeScalarCache.scalarUndef); + localizedStack.push( + new SavedGlobalState(fullName, originalVariable, savedValue)); + // Do NOT replace the slot — the tied scalar stays in place so + // that the subsequent `= value` assignment dispatches STORE. + return; + } + + localizedStack.push(new SavedGlobalState(fullName, originalVariable, null)); // Create a new variable for the localized scope. // For output separator variables, create the matching special type so that @@ -84,6 +106,16 @@ public void dynamicRestoreState() { if (saved.fullName.equals(this.fullName)) { localizedStack.pop(); + // Tied path: the slot was never replaced. Restore the + // original value by dispatching STORE on the tied scalar. + if (saved.originalVariable != null + && saved.originalVariable.type == RuntimeScalarType.TIED_SCALAR) { + if (saved.savedTiedValue != null) { + saved.originalVariable.tiedStore(saved.savedTiedValue); + } + return; + } + // Decrement refCount of the CURRENT (local) value being displaced. // Do NOT increment the restored value — it already has the correct // refCount from its original counting. @@ -117,7 +149,10 @@ public void dynamicRestoreState() { } } - private record SavedGlobalState(String fullName, RuntimeScalar originalVariable) { + private record SavedGlobalState( + String fullName, + RuntimeScalar originalVariable, + RuntimeScalar savedTiedValue) { } } diff --git a/src/main/perl/lib/System/Command.pm b/src/main/perl/lib/System/Command.pm new file mode 100644 index 000000000..d5bc1c5b2 --- /dev/null +++ b/src/main/perl/lib/System/Command.pm @@ -0,0 +1,840 @@ +package System::Command; +$System::Command::VERSION = '1.122'; +use warnings; +use strict; +use 5.006; + +use Carp; +use Cwd qw( cwd ); +use IO::Handle; +use Symbol (); +use Scalar::Util qw( blessed reftype ); +use List::Util qw( reduce ); +use System::Command::Reaper; + +use Config; +use Fcntl qw( F_GETFD F_SETFD FD_CLOEXEC ); + +# MSWin32 support +use constant MSWin32 => $^O eq 'MSWin32'; +require IPC::Run if MSWin32; + +our $QUIET = 0; + +# trace setup at startup +my $_trace_opts = sub { + my ( $trace, $file, $th ) = split /=/, shift, 2; + open $th, '>>', $file or carp "Can't open $file: $!" if $file; + $th ||= *STDERR; + return ( $trace, $th ); +}; +my @trace; +@trace = $_trace_opts->( $ENV{SYSTEM_COMMAND_TRACE} ) + if $ENV{SYSTEM_COMMAND_TRACE}; + +sub import { + my ( $class, @args ) = @_; + my %arg = ( quiet => sub { $QUIET = 1 } ); + for my $arg (@args) { + $arg =~ s/^-//; # allow dashed options + croak "Unknown option '$arg' in 'use System::Command'" + if !exists $arg{$arg}; + $arg{$arg}->(); + } +} + +# a few simple accessors +{ + no strict 'refs'; + for my $attr (qw( pid stdin stdout stderr options )) { + *$attr = sub { return $_[0]{$attr} }; + } + for my $attr (qw( exit signal core )) { + no strict 'refs'; + *$attr = sub { $_[0]->is_terminated(); return $_[0]{$attr} }; + } + for my $attr (qw( cmdline )) { + *$attr = sub { return @{ $_[0]{$attr} } }; + } +} + +# REALLY PRIVATE FUNCTIONS +# PerlOnJava detection: PerlOnJava has no fork() — use IPC::Open3 instead. +# See dev/modules/git_modules_support.md in the PerlOnJava repository. +use constant PERLONJAVA => defined $Config{perlonjava}; +require IPC::Open3 if PERLONJAVA; + +# a sub-process spawning function +my $_spawn = sub { + my ($o, @cmd) = @_; + my $pid; + + # setup filehandles + my $in = Symbol::gensym; + my $out = Symbol::gensym; + my $err = Symbol::gensym; + + # no buffering on pipes used for writing + select( ( select($in), $| = 1 )[0] ); + + # start the command + if (PERLONJAVA) { + + # PerlOnJava has no fork(). Delegate to IPC::Open3, which is + # implemented on top of java.lang.ProcessBuilder. cwd and env + # are already handled by the caller (System::Command::new) + # via chdir + `local %ENV`. + $pid = eval { IPC::Open3::open3($in, $out, $err, @cmd) }; + if (my $err_msg = $@) { + # Translate IPC::Open3's "open3: exec of X failed: Y" into + # System::Command's own fork-path format + # "Can't exec( @cmd ): Y", so error-handling tests (e.g. + # t/11-spawn-fail.t) see the expected string in $@. + if ($err_msg =~ /^open3: exec of \S+ failed: (.*)$/m) { + croak "Can't exec( @cmd ): $1"; + } + die $err_msg; + } + } + elsif (MSWin32) { + $pid = IPC::Run::start( + [@cmd], + ' $in, + '>pipe' => $out, + '2>pipe' => $err, + ); + } + else { + + # the code below takes inspiration from IPC::Open3 and Sys::Cmd + + # create handles for the child process (using CAPITALS) + my $IN = Symbol::gensym; + my $OUT = Symbol::gensym; + my $ERR = Symbol::gensym; + + # no buffering on pipes used for writing + select( ( select($OUT), $| = 1 )[0] ); + select( ( select($ERR), $| = 1 )[0] ); + + # connect parent and child with pipes + pipe $IN, $in or croak "input pipe(): $!"; + pipe $out, $OUT or croak "output pipe(): $!"; + pipe $err, $ERR or croak "errput pipe(): $!"; + + # an extra pipe to communicate exec() failure + pipe my ( $stat_r, $stat_w ); + + # create the child process + $pid = fork; + croak "Can't fork: $!" if !defined $pid; + + if ($pid) { + + # parent won't use those handles + close $stat_w; + close $IN; + close $OUT; + close $ERR; + + # failed to fork+exec? + my $mesg = do { local $/; <$stat_r> }; + die $mesg if $mesg; + } + else { # kid + + # use $stat_r to communicate errors back to the parent + eval { + + # child won't use those handles + close $stat_r; + close $in; + close $out; + close $err; + + # setup process group if possible + setpgrp 0, 0 if $o->{setpgrp} && $Config{d_setpgrp}; + + # close $stat_w on exec + my $flags = fcntl( $stat_w, F_GETFD, 0 ) + or croak "fcntl GETFD failed: $!"; + fcntl( $stat_w, F_SETFD, $flags | FD_CLOEXEC ) + or croak "fcntl SETFD failed: $!"; + + # associate STDIN, STDOUT and STDERR to the pipes + my ( $fd_IN, $fd_OUT, $fd_ERR ) + = ( fileno $IN, fileno $OUT, fileno $ERR ); + open \*STDIN, "<&=$fd_IN" + or croak "Can't open( \\*STDIN, '<&=$fd_IN' ): $!"; + open \*STDOUT, ">&=$fd_OUT" + or croak "Can't open( \\*STDOUT, '<&=$fd_OUT' ): $!"; + open \*STDERR, ">&=$fd_ERR" + or croak "Can't open( \\*STDERR, '<&=$fd_ERR' ): $!"; + + # and finally, exec into @cmd + exec( { $cmd[0] } @cmd ) + or do { croak "Can't exec( @cmd ): $!"; } + }; + + # something went wrong + print $stat_w $@; + close $stat_w; + + # DIE DIE DIE + eval { require POSIX; POSIX::_exit(255); }; + exit 255; + } + } + + return ( $pid, $in, $out, $err ); +}; + +my $_dump_ref = sub { + require Data::Dumper; # only load if needed + local $Data::Dumper::Indent = 0; + local $Data::Dumper::Purity = 0; + local $Data::Dumper::Maxdepth = 0; + local $Data::Dumper::Quotekeys = 0; + local $Data::Dumper::Sortkeys = 1; + local $Data::Dumper::Useqq = 1; + local $Data::Dumper::Terse = 1; + return Data::Dumper->Dump( [shift] ); +}; + +my $_do_trace = sub { + my ( $trace, $th, $pid, $cmd, $o ) = @_; + print $th "System::Command cmd[$pid]: ", + join( ' ', map /\s/ ? $_dump_ref->($_) : $_, @$cmd ), "\n"; + print $th map "System::Command opt[$pid]: $_->[0] => $_->[1]\n", + map [ $_ => $_dump_ref->( $o->{$_} ) ], + grep { $_ ne 'env' } sort keys %$o + if $trace > 1; + print $th map "System::Command env[$pid]: $_->[0] => $_->[1]\n", + map [ $_ => $_dump_ref->( $o->{env}{$_} ) ], + keys %{ $o->{env} || {} } + if $trace > 2; +}; + +# module methods +sub new { + my ( $class, @cmd ) = @_; + + # split the args + my @o = { setpgrp => 1 }; + @cmd = grep { !( ref eq 'HASH' ? push @o, $_ : 0 ) } @cmd; + + # merge the option hashes + my $o = reduce { + +{ %$a, %$b, + exists $a->{env} && exists $b->{env} + ? ( env => { %{ $a->{env} }, %{ $b->{env} } } ) + : () + }; + } + @o; + + # open the trace file before changing directory + my ( $trace, $th ); + ( $trace, $th ) = $_trace_opts->( $o->{trace} ) if $o->{trace}; + ( $trace, $th ) = @trace if @trace; # environment override + + # chdir to the expected directory + my $orig = cwd; + my $dest = defined $o->{cwd} ? $o->{cwd} : undef; + if ( defined $dest ) { + chdir $dest or croak "Can't chdir to $dest: $!"; + } + + # keep changes to the environment local + local %ENV = %ENV; + + # update the environment + if ( exists $o->{env} ) { + croak "ENV variables cannot be empty strings on Win32" + if MSWin32 and grep { defined and !length } values %{ $o->{env} }; + @ENV{ keys %{ $o->{env} } } = values %{ $o->{env} }; + delete $ENV{$_} + for grep { !defined $o->{env}{$_} } keys %{ $o->{env} }; + } + + # interactive mode requested + if ( $o->{interactive} ) { + croak "Can't run command in interactive mode: not a terminal" + unless -t STDIN; + + system { $cmd[0] } @cmd; + + my $self = bless { + cmdline => [@cmd], + options => $o, + stdin => IO::Handle->new, + stdout => IO::Handle->new, + stderr => IO::Handle->new, + exit => $? >> 8, + signal => $? & 127, + core => $? & 128, + }, $class; + + defined reftype( $o->{$_} ) + and reftype( $o->{$_} ) eq 'SCALAR' + and ${ $o->{$_} } = $self->{$_} + for qw( exit signal core ); + + return $self; + } + + # start the command + my ( $pid, $in, $out, $err ) = eval { $_spawn->( $o, @cmd ); }; + + # FIXME - better check error conditions + if ( !defined $pid ) { + $_do_trace->( $trace, $th, '!', \@cmd, $o ) if $trace; + croak $@; + } + + # trace is mostly a debugging tool + $_do_trace->( $trace, $th, $pid, \@cmd, $o ) if $trace; + + # some input was provided + if ( defined $o->{input} ) { + local $SIG{PIPE} + = sub { croak "Broken pipe when writing to: @cmd" } + if $Config{sig_name} =~ /\bPIPE\b/; + print {$in} $o->{input} if length $o->{input}; + $in->close; + } + + # chdir back to origin + if ( defined $dest ) { + chdir $orig or croak "Can't chdir back to $orig: $!"; + } + + # create the object + my $self = bless { + cmdline => [@cmd], + options => $o, + pid => MSWin32 ? $pid->{KIDS}[0]{PID} : $pid, + stdin => $in, + stdout => $out, + stderr => $err, + ( _ipc_run => $pid )x!! MSWin32, + }, $class; + + # create the subprocess reaper and link the handles and command to it + ${*$in} = ${*$out} = ${*$err} = $self->{reaper} # typeglobs FTW + = System::Command::Reaper->new( $self, { trace => $trace, th => $th } ); + + return $self; +} + +sub spawn { + my ( $class, @cmd ) = @_; + return @{ $class->new(@cmd) }{qw( pid stdin stdout stderr )}; +} + +sub loop_on { + my $self = shift; + + # handle options and defaults + my %args = ( + stderr => sub { print STDERR shift }, + @_ + ); + for my $which ( grep exists $args{$_}, qw( stdout stderr ) ) { + if ( $args{$which} ) { + croak "'$which' option must be a CODE reference" + if reftype $args{$which} ne 'CODE'; + } + else { + delete $args{$which}; + } + } + + # create an object for the class method + if ( !ref $self ) { + die "'command' attribute required by loop_on when used as a class method" + if !exists $args{command}; + $self = $self->new( @{ $args{command} } ); + } + + require IO::Select; + my $select = IO::Select->new( $self->stdout, $self->stderr ); + + local $/ = $args{input_record_separator} + if exists $args{input_record_separator}; + + # loop until end of streams + while ( my @ready = $select->can_read ) { + for my $fh (@ready) { + my $which = $fh == $self->stdout ? 'stdout' : 'stderr'; + if ( defined( my $line = <$fh> ) ) { + my $ret = 1; + $ret = $args{$which}->($line) + if exists $args{$which}; + return if !$ret; + } + else { + $select->remove($fh); + $fh->close; + } + } + } + + # close all pipes and wait for the child to terminate + $self->close; + + # success in the Unix sense + return defined $self->exit && $self->exit == 0; +} + +# delegate those to the reaper (when there's one) +sub is_terminated { + return $_[0]{options}{interactive} + ? 1 + : $_[0]{reaper}->is_terminated(); +} + +sub close { + $_[0]{reaper}->close() unless $_[0]{options}{interactive}; + return $_[0]; +} + +1; + +__END__ + +=pod + +=head1 NAME + +System::Command - Object for running system commands + +=head1 VERSION + +version 1.122 + +=head1 SYNOPSIS + + use System::Command; + + # invoke an external command, and return an object + $cmd = System::Command->new( @cmd ); + + # options can be passed as a hashref + $cmd = System::Command->new( @cmd, \%option ); + + # $cmd is basically a hash, with keys / accessors + $cmd->stdin(); # filehandle to the process stdin (write) + $cmd->stdout(); # filehandle to the process stdout (read) + $cmd->stderr(); # filehandle to the process stdout (read) + $cmd->pid(); # pid of the child process + + # find out if the child process died + if ( $cmd->is_terminated() ) { + # the handles are not closed yet + # but $cmd->exit() et al. are available if it's dead + } + + # done! + $cmd->close(); + + # exit information + $cmd->exit(); # exit status + $cmd->signal(); # signal + $cmd->core(); # core dumped? (boolean) + + # cut to the chase + my ( $pid, $in, $out, $err ) = System::Command->spawn(@cmd); + +=head1 DESCRIPTION + +System::Command is a class that launches external system commands +and return an object representing them, allowing to interact with them +through their C, C and C handles. + +=head1 METHODS + +System::Command supports the following methods: + +=head2 new + + my $cmd = System::Command->new( @cmd ) + +Runs an external command using the list in C<@cmd>. + +If C<@cmd> contains a hash reference, it is taken as an I