
Always-on LD_LIBRARY_PATH stripping causes 11x performance regression for CUDA/MKL users #8945

@johnzfitch

Description

What version of Codex is running?

0.00

What subscription do you have?

Pro

Which model were you using?

gpt-5.2

What platform is your computer?

Linux 6.17.9-arch1-1 x86_64

What issue are you seeing?

Pre-main hardening in release builds strips LD_LIBRARY_PATH unconditionally, causing severe performance regressions and broken functionality for:

  • CUDA workloads: 11-300x slower (GPU libraries not found, fallback to CPU)
  • Intel MKL workloads: 11x slower (optimized BLAS not found, slow fallback)
  • Conda environments: Legacy installations relying on LD_LIBRARY_PATH fail
  • Enterprise deployments: Custom DB drivers, Oracle clients fail to load
  • HPC clusters: Module systems and custom toolchains broken

This regression was introduced in PR #4521 (commit b8e1fe6), which made codex_process_hardening::pre_main_hardening() always-on in release builds.

The hardening strips all LD_* environment variables before main() runs, and because Codex spawns child processes with the parent's environment, every child inherits the stripped values.
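
A quick way to observe the inheritance directly (the CUDA path is illustrative, matching the repro steps below):

export LD_LIBRARY_PATH=/opt/cuda/lib64:$LD_LIBRARY_PATH
codex exec 'env | grep ^LD_'

On an affected build the second command prints nothing, because the child only sees the stripped environment.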

What steps can reproduce the bug?

  1. Install a Codex release build (non-debug)
  2. Point LD_LIBRARY_PATH at a custom CUDA installation:
    export LD_LIBRARY_PATH=/opt/cuda/lib64:$LD_LIBRARY_PATH
  3. Run a CUDA workload:
    codex exec "python3 -c 'import torch; print(torch.cuda.is_available())'"
  4. Observe: Returns False (CUDA libraries not found)
  5. Verify LD_LIBRARY_PATH is stripped inside the child (single quotes, so the parent shell does not expand the variable first):
    codex exec 'echo $LD_LIBRARY_PATH'
  6. Output: empty (or different from the parent's value)

The performance regression can be measured with:
time codex exec "python3 -c 'import numpy as np; np.dot(np.random.rand(1000,1000), np.random.rand(1000,1000))'"
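
For a baseline, the same computation can be timed directly in the parent shell, where LD_LIBRARY_PATH is still intact:

time python3 -c 'import numpy as np; np.dot(np.random.rand(1000,1000), np.random.rand(1000,1000))'

The difference between the two timings isolates the cost of the stripped environment.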

What is the expected behavior?

LD_LIBRARY_PATH should be preserved by default to support legitimate use cases (CUDA, MKL, custom libraries).

Users who need maximum security can opt-in via environment variable:
CODEX_SECURE_MODE=1 codex exec "..."

This approach:

  • Restores performance and functionality for CUDA/MKL users
  • Maintains strong security through existing Landlock/Seatbelt boundaries
  • Allows opt-in maximum hardening when truly needed
  • Has zero breaking changes for existing users

Additional information

I have a tested fix ready for PR submission:

  • Makes pre_main_hardening() opt-in via CODEX_SECURE_MODE=1 environment variable
  • Default mode: Preserves LD_LIBRARY_PATH (fast, compatible)
  • Secure mode: Strips LD_* variables (maximum hardening)
  • Only 2 files modified: cli/src/main.rs and responses-api-proxy/src/main.rs
  • Security maintained through Landlock/Seatbelt (primary boundaries)
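
A sketch of the intended command-line behavior (CODEX_SECURE_MODE is the proposed interface from this fix, not part of any current release):

codex exec 'echo $LD_LIBRARY_PATH'                      # default: parent value preserved
CODEX_SECURE_MODE=1 codex exec 'echo $LD_LIBRARY_PATH'  # opt-in: LD_* stripped, as today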

This primarily affects users in ML/AI research, scientific computing, HPC, and enterprise environments.


TL;DR: The pre-main hardening behaves like a "ghost": it runs before main(), strips the environment, and then disappears, leaving no trace except confused users reporting slowness.
