Fix silent Copilot MDM hook-install failure#171
Conversation
Copilot was the only MDM tool installing per-user by downloading unbound.py to a root-owned staging dir and copyfile-ing it into each home after privilege-drop. If the dropped user could not read the staged file, the copy failed silently and main() still exited 0, so the orchestrator marked Copilot installed when no hook was written. - Read the hook script once as root and write it as the target user, so the only cross-privilege op is a write into the user's own home (no staged read). - Add modular _fetch_hook_script / _apply_gateway_url / _write_file helpers; _write_file uses O_NOFOLLOW + fchmod. - main() now returns success and exits non-zero when no user gets installed, so the MDM orchestrator stops reporting a silent failure as success. - Add outermost-layer tests for install_hooks_for_user and main() exit semantics. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…y tests - Only call notify_setup_complete when every user was installed, so the backend is not told setup completed on a partial/failed run (exit code already reflects this to the orchestrator). - Add tests: O_NOFOLLOW refuses a symlinked unbound.py path; _apply_gateway_url rewrite vs identity; notify is/ isn't called on success vs partial failure. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
✅ Security consensus: no issues found. (reviewers: Cursor, Claude, Semgrep, Gitleaks) 🤖 consensus review · reviewers: Cursor,Claude,Semgrep,Gitleaks · head |
… read - _fetch_hook_script catches a read failure (corrupt/transient) and returns None so main() prints the clean download-failed message instead of a raw traceback. - _install debug_prints the underlying reason before re-raising, so a per-user install failure is diagnosable (it is swallowed in the _run_as_user fork). - _write_file closes the fd if os.fdopen raises before taking ownership. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Addressed all three Greptile P2s in
All checks green; full copilot suite |
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
✅ Security consensus: no issues found. (reviewers: Cursor, Claude, Semgrep, Gitleaks) 🤖 consensus review · reviewers: Cursor,Claude,Semgrep,Gitleaks · head |
…/copilot-mdm-install-failure
🛡️ Automated Security Review (consensus)0 findings — 0 high-confidence, 0 to triage. Reviewers: Cursor, Claude, Semgrep, Gitleaks. ✅ Security consensus: no issues found. (reviewers: Cursor, Claude, Semgrep, Gitleaks) 🤖 consensus review · reviewers: Cursor,Claude,Semgrep,Gitleaks · head |
The MDM orchestrator (mdm/onboard.py) gates each tool's success on the subprocess returncode. Every tool's setup.py main() returned None on its error paths and __main__ exited 0 unless an exception was raised, so any mid-run failure (missing privileges, MDM key fetch, env var write, hook config) reported success to the orchestrator. This mirrors the exit-code contract landed for copilot in #171: - main() returns True on success / clear, False on every error path - __main__ captures the result and exits 1 on KeyboardInterrupt, on exception, or when main() returns falsy; 0 only on real success Applied to augment, claude-code (gateway + hooks), codex (gateway + hooks), cursor, and gemini-cli. The copilot per-user read-as-root/ write-as-user refactor is not ported: those tools download the hook script as the dropped target user into the user's own home, so they never had copilot's cross-privilege read failure. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
🛡️ Automated Security Review (consensus)0 findings — 0 high-confidence, 0 to triage. Reviewers: Cursor, Claude, Semgrep, Gitleaks. ✅ Security consensus: no issues found. (reviewers: Cursor, Claude, Semgrep, Gitleaks) 🤖 consensus review · reviewers: Cursor,Claude,Semgrep,Gitleaks · head |
🛡️ Automated Security Review (consensus)0 findings — 0 high-confidence, 0 to triage. Reviewers: Cursor, Claude, Semgrep, Gitleaks. ✅ Security consensus: no issues found. (reviewers: Cursor, Claude, Semgrep, Gitleaks) Previously acknowledged (not re-flagged)
🤖 consensus review · reviewers: Cursor,Claude,Semgrep,Gitleaks · head |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using high effort and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit c0ef166. Configure here.
🛡️ Automated Security Review (consensus)2 findings — 1 high-confidence, 1 to triage. Reviewers: Cursor, Claude, Semgrep, Gitleaks. Findings🔴 Backfill runs after failed hook install
🟡 Codex hooks partial install still reports success
Previously acknowledged (not re-flagged)
🤖 consensus review · reviewers: Cursor,Claude,Semgrep,Gitleaks · head |
…fix/copilot-mdm-install-failure
Cursor Bugbot (medium): run_backfill ran whenever --backfill was set, even when hook install failed (success=False). notify_setup_complete is already gated on success; backfill was not, so a failed setup could still upload sessions and advance per-user cutoffs — creating a capture gap since the hooks aren't installed to record going forward. Gate backfill on success too. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
🛡️ Automated Security Review (consensus)1 finding — 1 high-confidence, 0 to triage. Reviewers: Cursor, Claude, Semgrep, Gitleaks. Findings🔴 Backfill runs after failed hook install
Previously acknowledged (not re-flagged)
🤖 consensus review · reviewers: Cursor,Claude,Semgrep,Gitleaks · head |
🛡️ Automated Security Review (consensus)2 findings — 1 high-confidence, 1 to triage. Reviewers: Cursor, Claude, Semgrep, Gitleaks. Findings🔴 Codex hooks: partial per-user install still reported as success
🟡
|
Security consensus review (2 findings): 1. codex hooks: a partial per-user install (e.g. 1 of N users) still returned True, called notify_setup_complete/run_backfill, and the orchestrator would not retry — leaving some users unmonitored. Mirror the Copilot gating: success = bool(user_homes) and installed == len, and gate notify, backfill, and the return value on success. 2. --clear reported success unconditionally (clear_setup(); return True), so a failed teardown still exited 0 and stale hooks/env vars/API-key material could persist undetected during offboarding. clear_setup now returns a bool (False on admin-denied or any teardown failure) and main propagates it. Applied across all 8 MDM setup.py scripts. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Both findings addressed in 🔴 Codex hooks — partial install reported as success ( success = bool(user_homes) and installed == len(user_homes)
...
print("Setup Complete!" if success else "Setup Failed")
if success:
notify_setup_complete(...)
if success and backfill_mode:
run_backfill(...)
return successA partial per-user install now exits non-zero and skips 🟡
Verified: all 8 files |
vigneshsubbiah16
left a comment
There was a problem hiding this comment.
🛡️ Automated Security Review (consensus)
0 findings — 0 high-confidence, 0 to triage. Reviewers: Cursor, Claude, Semgrep, Gitleaks.
✅ Security consensus: no issues found. (reviewers: Cursor, Claude, Semgrep, Gitleaks)
🤖 consensus review · reviewers: Cursor,Claude,Semgrep,Gitleaks · head b6b5a0f7 · 2026-07-03T07:02Z

Summary
unbound.pyper-user by downloading it to a root-owned staging dir andshutil.copyfile-ing it into each home after privilege-drop. The dropped user had to read the staged file; when it couldn't (swallowedchmodwidening, or a non-traversable/var/folderstemp path on macOS undersudo), the copy failed silently andmain()still exited 0 — so the orchestrator (mdm/onboard.py) marked Copilot installed when no hook was written.claude-code/codex/cursoralready behave and removes a root-owned, world-readable file from disk entirely.main()now returns a bool and__main__exits non-zero when no/partial user install succeeds, so a silent failure can no longer be reported as success.Changes
copilot/hooks/mdm/setup.py:_fetch_hook_script(gateway_url)— root-only download into an unpredictablemkdtempdir, read into memory with the gateway URL applied, cleaned up infinally._apply_gateway_url(text, gateway_url)— pure string rewrite (replaces the old path-basedrewrite_gateway_url_in_file)._write_file(path, data, mode)—O_NOFOLLOWopen +fchmod(umask-proof), reused for bothunbound.pyandunbound.json.install_hooks_for_user(username, home_dir, script_text)— writes both files via_write_fileinside_run_as_user; no moreshutil.copyfilefrom a staged path.0o711/0o644widening (dead once nothing reads the file post-drop).notify_setup_completeis only called on full success, so the backend isn't told setup completed on a failed run.Test plan
install_hooks_for_user,main()): files actually land + are executable;O_NOFOLLOWrefuses a symlinked path; gateway rewrite vs identity;main()returns False / exits non-zero on no-users, partial-failure, and download-failure, and notifies only on success.26 passed(9 new + 17 existing)./security-review(no High/Critical; net-strengthens the root↔user boundary).Notes
mdm/onboard.py:89passessupports_backfill=Truefor Copilot while the comment at:82-85says Copilot has no backfill — harmless, left as-is.🤖 Generated with Claude Code
Note
Medium Risk
Changes MDM install/teardown contracts and exit codes that onboarders rely on; Copilot’s write path is security-sensitive but is a net hardening versus staged root-owned copies.
Overview
Fixes GitHub Copilot MDM installs that could exit 0 with no hooks on disk, and aligns all eight
setup.pyentrypoints so orchestrators likemdm/onboard.pycan treat failures as failures.Copilot no longer downloads
unbound.pyto a root-owned staging file andcopyfiles it after privilege-drop (which could fail silently when the dropped user could not read the staged path). The script is fetched once as root into memory (_fetch_hook_script), the gateway URL is applied in-process, and each user getsunbound.py/unbound.jsonwritten via_write_file(O_NOFOLLOW,fchmod) inside_run_as_user.main()now requires every user home to install successfully;notify_setup_completeand--backfillrun only on full success.Codex hooks gets the same all-users-or-fail success semantics and gates completion notification / backfill on success.
Across augment, claude-code (gateway + hooks), codex (gateway + hooks), cursor, gemini-cli, and copilot:
main()/clear_setup()returnbool,--clearpropagates teardown success, early exits returnFalse, and__main__ends withsys.exit(0 if ok else 1)(includingKeyboardInterrupt→ 1).clear_setup()also returns non-zero success when env or managed-artifact teardown partially fails.Reviewed by Cursor Bugbot for commit b6b5a0f. Bugbot is set up for automated code reviews on this repo. Configure here.
Greptile Summary
This PR fixes two independent bugs affecting all seven MDM
setup.pyprovisioning scripts. First, everymain()function previously exited 0 on early failure (barereturnwith no explicit exit), so the MDM orchestrator treated every run as successful; all early-failure returns are nowreturn Falseand__main__callssys.exit(0 if ok else 1). Second, Copilot's hook installation staged a root-downloadedunbound.pyin a temp directory and then tried to read it after privilege-drop — which could silently fail — leavingunbound.pyunwritten whilemain()still reported success.copilot/hooks/mdm/setup.py):_fetch_hook_scriptdownloads as root and reads the script into memory with the gateway URL already applied;install_hooks_for_userreceives thestrand writes bothunbound.pyandunbound.jsonvia a new_write_filehelper (O_NOFOLLOW+fchmod) running as the target user — no staged read after privilege-drop, no world-readable temp file.clear_setup()andmain()now returnbool, all intermediate failures returnFalse, andnotify_setup_complete/ backfill are gated on full per-user success — partial installs no longer mark the tool as provisioned.codex/hooks,copilot): success criterion upgraded from "at least one user installed" to "every enumerated user installed", matching the semantics the orchestrator already expected.Confidence Score: 5/5
Safe to merge — the copilot hook delivery and the cross-script exit-code changes are correct and address real silent-failure bugs.
The copilot in-memory delivery eliminates the staged-file read-after-privilege-drop race cleanly. Exit-code propagation is implemented consistently across all seven scripts and covers every early-return path. The
_write_filefd-transfer guard and the_fetch_hook_scripttry/finally cleanup are both correct. No pre-existing behaviour is made worse; the one minor cosmetic inconsistency (the "Clear Complete!" banner on partial teardown failure) is pre-existing and does not affect the correctness of the exit code the orchestrator consumes.No files require special attention. The copilot
setup.pyhas the most logic added and is also the most thoroughly changed; the remaining six files follow an identical, mechanical pattern.Important Files Changed
_fetch_hook_script+_write_file; addsO_NOFOLLOW/fchmodto both files;notify_setup_completenow gated on all-user success.clear_setup()andmain()now returnbool; all early failures returnFalse;__main__callssys.exit(0 if ok else 1).clear_setup()andmain()returnbool,__main__exits on the return value.clear_setup()gainsteardown_failedtracking including the per-userremove_codex_config_base_url_for_userloop.notify_setup_completeandrun_backfillare now gated on full success.clear_setup()gainsteardown_failedtracking across hooks.json removal, hooks directory removal, and env-var clearing.clear_setup()gainsteardown_failedtracking including Windows-specific settings file/dir removal failures.Sequence Diagram
%%{init: {'theme': 'neutral'}}%% sequenceDiagram participant MDM as MDM Orchestrator participant main as main() [root] participant fetch as _fetch_hook_script() participant tmp as Temp Dir (root-owned) participant run as _run_as_user(username) participant write as _write_file() [as user] participant notify as notify_setup_complete() MDM->>main: run setup.py --api-key ... main->>fetch: _fetch_hook_script(gateway_url) fetch->>tmp: mkdtemp() → staging_dir fetch->>tmp: download_file(SCRIPT_URL, source_script) fetch->>fetch: source_script.read_text() → script_text fetch->>tmp: shutil.rmtree(staging_dir) [finally] fetch-->>main: script_text: str (in memory) loop for each username, home_dir main->>run: _run_as_user(username, _install) run->>write: _write_file(script_path, script_text, 0o755) note over write: O_NOFOLLOW + fchmod — as target user run->>write: _write_file(hooks_json, config_json, 0o644) write-->>run: success / raises run-->>main: True / None end alt all users installed main->>notify: notify_setup_complete(...) main-->>MDM: exit 0 else partial or zero success main-->>MDM: exit 1 end%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%% sequenceDiagram participant MDM as MDM Orchestrator participant main as main() [root] participant fetch as _fetch_hook_script() participant tmp as Temp Dir (root-owned) participant run as _run_as_user(username) participant write as _write_file() [as user] participant notify as notify_setup_complete() MDM->>main: run setup.py --api-key ... main->>fetch: _fetch_hook_script(gateway_url) fetch->>tmp: mkdtemp() → staging_dir fetch->>tmp: download_file(SCRIPT_URL, source_script) fetch->>fetch: source_script.read_text() → script_text fetch->>tmp: shutil.rmtree(staging_dir) [finally] fetch-->>main: script_text: str (in memory) loop for each username, home_dir main->>run: _run_as_user(username, _install) run->>write: _write_file(script_path, script_text, 0o755) note over write: O_NOFOLLOW + fchmod — as target user run->>write: _write_file(hooks_json, config_json, 0o644) write-->>run: success / raises run-->>main: True / None end alt all users installed main->>notify: notify_setup_complete(...) main-->>MDM: exit 0 else partial or zero success main-->>MDM: exit 1 endReviews (10): Last reviewed commit: "Gate codex-hooks partial install + make ..." | Re-trigger Greptile