chore: improve agent development setup #1642

Merged
hassiebp merged 1 commit into main from codex/improve-agent-development-setup on Apr 27, 2026
Conversation


@hassiebp hassiebp commented Apr 27, 2026

What does this PR do?

Improves the repo setup for productive agent-assisted development without adding project MCP servers.

Changes include:

  • Add a Codex environment setup pointer plus reusable scripts/codex/* bootstrap, maintenance, and quick-check scripts.
  • Expand AGENTS.md with concise agent workflow, verification, CI, and PR expectations.
  • Refresh CONTRIBUTING.md with current uv --locked / uv --frozen commands, test topology, Codex Cloud setup, PR rules, and generated API guidance.
  • Add a reusable code_review.md checklist for self-review and review tasks.
  • Add a PR template and semantic PR title validation workflow aligned with the main Langfuse repo conventions.
  • Ignore local Codex runtime state under .codex/.
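
The semantic PR title validation mentioned in the list above can be approximated locally before opening a PR. The following is a minimal sketch only: the function name `is_conventional_title` and the list of allowed types are assumptions for illustration, and the actual workflow may accept a different set.

```shell
#!/usr/bin/env bash
# Rough local approximation of a Conventional Commits title check.
# The allowed type list is an assumption, not the workflow's real config.
is_conventional_title() {
  local title="$1"
  # type, optional (scope), optional "!", then ": " and a non-empty subject
  local re='^(feat|fix|chore|docs|refactor|test|ci|build|perf|style|revert)(\([a-z0-9_./-]+\))?!?: .+$'
  [[ "$title" =~ $re ]]
}

if is_conventional_title "chore: improve agent development setup"; then
  echo "title ok"
else
  echo "title rejected"
fi
```

Keeping the pattern in a variable avoids quoting surprises with bash's `=~` operator.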

Verification

git diff --check
bash -n scripts/codex/setup.sh scripts/codex/maintenance.sh scripts/codex/quick-check.sh
uv run --frozen ruff check .
uv run --frozen ruff format --check .
bash scripts/codex/setup.sh
bash scripts/codex/maintenance.sh
bash scripts/codex/quick-check.sh

bash scripts/codex/quick-check.sh passed with Ruff, mypy, and unit tests (403 passed, 2 skipped).

Notes

The project MCP config was intentionally not added. The only .codex/ file committed is the environment setup pointer.

Disclaimer: Experimental PR review

Greptile Summary

This PR improves the agent-assisted development setup by adding reusable Codex bootstrap scripts (scripts/codex/), a PR template, a Conventional Commits validation workflow, a self-review checklist (code_review.md), and refreshed CONTRIBUTING.md / AGENTS.md documentation. The changes are additive and do not touch any SDK source code. All three inline findings are P2 style/consistency issues in the new shell scripts.

Confidence Score: 4/5

Safe to merge — no SDK source changes; all findings are minor P2 style/consistency issues in the new shell scripts.

Only P2 findings are present (pinned uv version rationale undocumented, missing ruff format check in quick-check.sh, silently suppressed uv cache prune flag). No logic bugs in production code and no security concerns.

scripts/codex/setup.sh — pinned uv version should be validated against the project lockfile format version.

Important Files Changed

| Filename | Overview |
| --- | --- |
| scripts/codex/setup.sh | Bootstrap script for Codex environments; installs uv at a hardcoded version (0.11.2) which may conflict with the project's lockfile format |
| scripts/codex/quick-check.sh | Local confidence-check script; omits ruff format --check that is documented as a quality gate in CONTRIBUTING.md and AGENTS.md |
| scripts/codex/maintenance.sh | Sync and cache-prune script; uv cache prune --ci is fully suppressed, so an invalid flag would silently be a no-op |
| .github/workflows/validate-pr-title.yml | New workflow enforcing Conventional Commits on PR titles; action is pinned to a commit hash, permissions are minimal and correctly scoped |
| .github/PULL_REQUEST_TEMPLATE.md | Adds standard PR template with checklist and verification section; straightforward documentation change |
| AGENTS.md | Expanded with verification matrix, CI contract notes, and tightened PR rules; content is accurate and consistent |
| CONTRIBUTING.md | Refreshed with current uv commands, Codex Cloud setup guide, and PR expectations; significant improvement over previous sparse content |
| code_review.md | New self-review checklist covering correctness, generated-code rules, CI topology, and Python style; concise and actionable |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Developer / Codex Agent] --> B{uv installed?}
    B -- No --> C[pip install uv==0.11.2]
    C --> D[export PATH]
    B -- Yes --> E[uv sync --locked]
    D --> E
    E --> F[uv run --frozen python --version]
    F --> G[Environment Ready]

    G --> H[Daily Maintenance]
    H --> H1[uv sync --locked]
    H1 --> H2[uv cache prune --ci]

    G --> I[Quick Check]
    I --> I1[ruff check .]
    I1 --> I2[mypy langfuse]
    I2 --> I3[pytest tests/unit]
    I3 --> J{All Pass?}
    J -- Yes --> K[Open PR]
    J -- No --> L[Fix Issues]
    L --> I

    K --> M[validate-pr-title workflow]
    M --> N{Conventional Commits?}
    N -- Yes --> O[PR Ready for Review]
    N -- No --> P[Block merge]
Prompt To Fix All With AI
This is a comment left during a code review.
Path: scripts/codex/setup.sh
Line: 8

Comment:
**Pinned uv version may conflict with lockfile format**

`uv==0.11.2` is installed as the fallback, but `uv sync --locked` is called immediately after with the repo's existing `uv.lock`. If the lockfile was generated with a different uv version (older or newer), the sync can fail due to lockfile format incompatibilities. Consider either pinning to the same version used to generate the lockfile, or adding a comment explaining why `0.11.2` was chosen and how it should be kept in sync with the project's own uv dependency.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: scripts/codex/quick-check.sh
Line: 7-9

Comment:
**Missing `ruff format --check` step**

`CONTRIBUTING.md` lists `ruff format .` under "Quality checks" and the AGENTS.md verification matrix calls out `ruff format --check .` for docs/Python changes, but `quick-check.sh` only runs `ruff check .`. This means formatting regressions will pass the quick-check script undetected. Adding `uv run --frozen ruff format --check .` before `ruff check` would make the script consistent with the documented quality gate.

```suggestion
uv run --frozen ruff format --check .
uv run --frozen ruff check .
uv run --frozen mypy langfuse --no-error-summary
uv run --frozen pytest -n auto --dist worksteal tests/unit
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: scripts/codex/maintenance.sh
Line: 8

Comment:
**`--ci` flag silently swallowed**

`uv cache prune --ci` is redirected to `/dev/null` and followed by `|| true`, so if `--ci` is not a recognised flag for `uv cache prune` it will fail silently and skip pruning entirely without any indication. Verify that `--ci` is a valid option; if it isn't, removing the flag (or at least not suppressing stderr) would make failures visible.

How can I resolve this? If you propose a fix, please make it concise.
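
One way to keep a step non-fatal while still surfacing its failures is a small wrapper that captures stderr and reports it. This is a sketch under assumptions: `run_best_effort` is a hypothetical helper, not something in the repository, and the demo below uses `false` as a stand-in for the prune command.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Run a command whose failure should not abort the script, but report
# the exit status and stderr instead of discarding both.
run_best_effort() {
  local err_file status=0
  err_file="$(mktemp)"
  "$@" >/dev/null 2>"$err_file" || status=$?
  if (( status != 0 )); then
    echo "warning: '$*' exited with status $status" >&2
    cat "$err_file" >&2
  fi
  rm -f "$err_file"
  return 0
}

# Hypothetical usage in maintenance.sh:
#   run_best_effort uv cache prune --ci
run_best_effort false
echo "continuing after best-effort step"
```

With this shape, an unrecognised flag would show up as a warning in the logs while the script still exits successfully.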

Reviews (1): Last reviewed commit: "chore: improve agent development setup"

Greptile also left 1 inline comment on this PR.


blacksmith-sh Bot commented Apr 27, 2026

Found 1 test failure on Blacksmith runners:

Failure

Test: TestMultiProjectSetup/test_context_isolation_between_projects

@hassiebp hassiebp marked this pull request as ready for review April 27, 2026 06:00
@github-actions

@claude review

@hassiebp hassiebp merged commit 5ef17a0 into main Apr 27, 2026
18 of 21 checks passed
@hassiebp hassiebp deleted the codex/improve-agent-development-setup branch April 27, 2026 06:01

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2d2ffc8f9c

Comment thread .github/workflows/validate-pr-title.yml
name: "Validate PR Title"

on:
  pull_request:

P1: Switch PR-title workflow to fork-safe trigger

Using pull_request here makes this action fail for fork-based pull requests (the action’s own docs state this trigger only works for branches in the same repository). In any workflow that accepts external contributors, this causes the required PR-title check to error on fork PRs and can block merges until a maintainer intervenes. Trigger this workflow with pull_request_target instead (with minimal permissions) so title validation runs reliably for forks.

Comment thread scripts/codex/setup.sh
cd "$repo_root"

if ! command -v uv >/dev/null 2>&1; then
  python3 -m pip install --user "uv==0.11.2"

P2: Pinned uv version may conflict with lockfile format

uv==0.11.2 is installed as the fallback, but uv sync --locked is called immediately after with the repo's existing uv.lock. If the lockfile was generated with a different uv version (older or newer), the sync can fail due to lockfile format incompatibilities. Consider either pinning to the same version used to generate the lockfile, or adding a comment explaining why 0.11.2 was chosen and how it should be kept in sync with the project's own uv dependency.
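
A lightweight way to make the pin visible is to keep it in one variable and warn when the installed uv differs. The sketch below is illustrative only: `UV_PIN`, `parse_uv_version`, and `check_uv_pin` are hypothetical names, and the parsing assumes `uv --version` prints a line that starts with `uv` followed by a three-part version number.

```shell
#!/usr/bin/env bash
# Hypothetical single source of truth for the fallback pin.
UV_PIN="0.11.2"

# Extract "X.Y.Z" from output assumed to look like "uv X.Y.Z (...)".
parse_uv_version() {
  local output="$1"
  local re='^uv ([0-9]+\.[0-9]+\.[0-9]+)'
  [[ "$output" =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

# Warn, without failing, when the installed uv differs from the pin.
check_uv_pin() {
  local installed
  installed="$(parse_uv_version "$1")" || {
    echo "warning: could not parse uv version from '$1'" >&2
    return 0
  }
  if [[ "$installed" != "$UV_PIN" ]]; then
    echo "warning: uv $installed differs from pinned $UV_PIN" >&2
  fi
}

if command -v uv >/dev/null 2>&1; then
  check_uv_pin "$(uv --version)"
fi
```

A warning rather than a hard failure keeps the bootstrap usable when a newer uv is preinstalled, at the cost of not enforcing the pin.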


@claude claude Bot left a comment

Pure dev tooling and docs change (Codex bootstrap scripts, PR template/title validation, AGENTS.md and CONTRIBUTING.md updates) with no production code impact.

Overview

This PR adds repo-owned Codex agent setup scripts (scripts/codex/setup.sh, maintenance.sh, quick-check.sh), a Codex environment pointer (.codex/environments/environment.toml), a GitHub Actions workflow that runs amannn/action-semantic-pull-request (pinned to a commit SHA) for Conventional Commits PR-title validation, a PR template, a code_review.md checklist, and refreshes to AGENTS.md and CONTRIBUTING.md. The only ignore-list addition is local Codex runtime state under .codex/. No langfuse/ source, generated API client, or test code is touched.

Security risks

Minimal. The new workflow is pinned to an immutable commit SHA, declares permissions: {} at the top level and only grants statuses: write and pull-requests: read to the single job — no write access to repo contents. The bootstrap scripts run uv sync --locked against the committed lockfile and don't fetch arbitrary code from the internet (the pip install --user uv==0.11.2 fallback uses a pinned version from PyPI). No secrets handling, auth, or data-path code is involved.

Level of scrutiny

Low. This is dev-tooling and documentation only — it cannot break the SDK runtime or CI for users, and the worst-case failure mode is a Codex environment that needs hand-fixing. The PR title validation workflow only affects the PR-status check, not merge gating beyond what the repo configures. The Blacksmith CI failure noted in the timeline (TestMultiProjectSetup/test_context_isolation_between_projects) is unrelated to anything in this diff.

Other factors

The bug-hunting system flagged a [Nit] about the setup.sh fallback path being fragile on PEP 668-marked distros and not persisting PATH for sibling scripts. The concern is real but only triggers when uv is not preinstalled; the inline comment captures the suggested switch to curl -LsSf https://astral.sh/uv/install.sh | sh. The author can address it as a follow-up — it does not justify blocking a docs/tooling PR. Author's verification log shows the scripts were exercised successfully (403 passed, 2 skipped).

Comment thread scripts/codex/setup.sh
Comment on lines +7 to +13
if ! command -v uv >/dev/null 2>&1; then
  python3 -m pip install --user "uv==0.11.2"
  export PATH="$HOME/.local/bin:$PATH"
fi

uv sync --locked
uv run --frozen python --version

🟡 The fallback bootstrap in scripts/codex/setup.sh (lines 7-10) has two compounding issues when uv is not preinstalled. First, python3 -m pip install --user "uv==0.11.2" aborts with error: externally-managed-environment on PEP 668-marked system Pythons (Debian 12+, Ubuntu 23.04+, recent Fedora/Alpine, many slim images), and under set -euo pipefail the bootstrap dies with no useful context. Second, export PATH="$HOME/.local/bin:$PATH" is process-local: scripts/codex/maintenance.sh and scripts/codex/quick-check.sh run as independent shells (also under set -euo pipefail) and call uv directly, so they fail with uv: command not found whenever ~/.local/bin is not on the inherited PATH. Suggested fix: switch the fallback to the canonical curl -LsSf https://astral.sh/uv/install.sh | sh (or pipx install uv), which sidesteps PEP 668 and writes shell-rc entries so subsequent script invocations find uv.

What the bug is

scripts/codex/setup.sh is the documented Codex Cloud setup hook (CONTRIBUTING.md now lists bash scripts/codex/setup.sh as the recommended setup script and .codex/environments/environment.toml wires it into the environment). Its fallback path for installing uv is:

if ! command -v uv >/dev/null 2>&1; then
  python3 -m pip install --user "uv==0.11.2"
  export PATH="$HOME/.local/bin:$PATH"
fi

Two things go wrong when uv is not preinstalled:

  1. PEP 668 aborts the install. On Debian 12+, Ubuntu 23.04+, recent Fedora/Alpine, and many slim base images, the system Python is marked externally-managed. python3 -m pip install --user ... then exits non-zero with error: externally-managed-environment. With set -euo pipefail, this kills setup.sh with no diagnostic beyond the pip error.
  2. PATH does not persist. Even if pip succeeds, export PATH="$HOME/.local/bin:$PATH" only mutates the env of the current setup.sh process. scripts/codex/maintenance.sh and scripts/codex/quick-check.sh are subsequent independent shells (each #!/usr/bin/env bash with set -euo pipefail) that invoke uv sync --locked / uv run --frozen ... directly, with no PATH prepend and no fallback. ~/.local/bin is only added to PATH by ~/.profile / ~/.bash_profile, which non-login shells (typical in CI runners and minimal images) do not source.
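
The first issue can be detected up front rather than discovered through a pip failure. PEP 668 defines the marker as a file named `EXTERNALLY-MANAGED` in the interpreter's stdlib directory, so a check along these lines is possible. The helper names are hypothetical, and the sketch ignores cases such as virtual environments, where pip does not apply the restriction.

```shell
#!/usr/bin/env bash
# Return 0 if the given stdlib directory carries the PEP 668 marker file.
has_pep668_marker() {
  local stdlib_dir="$1"
  [[ -f "$stdlib_dir/EXTERNALLY-MANAGED" ]]
}

# Ask the interpreter where its stdlib lives (requires python3 on PATH).
python_stdlib_dir() {
  python3 -c 'import sysconfig; print(sysconfig.get_path("stdlib"))'
}

if command -v python3 >/dev/null 2>&1 && has_pep668_marker "$(python_stdlib_dir)"; then
  echo "system Python is externally managed; pip install --user is likely to be refused" >&2
fi
```

A setup script could use such a check to choose a different install path or to print a clearer error before pip runs.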

Why existing code does not prevent it

The author clearly anticipated the missing-uv case for setup.sh itself, but the fix is incomplete: neither maintenance.sh nor quick-check.sh re-prepend $HOME/.local/bin, and setup.sh does not write the PATH update into a profile/rc file the next shell would inherit. The author's local Codex testing presumably had uv already on PATH (the conditional was skipped) or used a non-PEP 668 base image, masking the problem.

Impact

This only triggers when (a) the fallback branch fires (uv not preinstalled) AND (b) the active distro is PEP 668-marked (issue 1) or ~/.local/bin is not on the inherited PATH (issue 2). In a fresh Codex Cloud environment using a recent Debian/Ubuntu slim image — exactly the case the fallback was written for — this is the first thing that runs, and the bootstrap aborts before ever calling uv sync. Even on non-PEP 668 systems, a successful pip install can still leave maintenance.sh failing on uv: command not found in a fresh shell. Scope is narrow (dev tooling for Codex agents), but the fallback path is exactly the case it needs to handle.

Step-by-step proof

Concrete reproducer on a typical PEP 668 base image (e.g. python:3.13-slim or debian:12):

  1. Container starts with no uv and ~/.local/bin not on PATH.
  2. Codex runs bash scripts/codex/setup.sh per .codex/environments/environment.toml.
  3. command -v uv fails, so the fallback branch executes.
  4. python3 -m pip install --user "uv==0.11.2" exits 1 with error: externally-managed-environment.
  5. set -euo pipefail aborts setup.sh; uv sync --locked and the version-print line never run.

Alternative path (non-PEP 668 system, e.g. older base image):

  1. python3 -m pip install --user "uv==0.11.2" succeeds, installing uv into ~/.local/bin/uv.
  2. setup.sh's process PATH now includes $HOME/.local/bin; uv sync --locked succeeds; setup.sh exits 0.
  3. Codex later runs bash scripts/codex/maintenance.sh in a fresh non-login shell that does not source ~/.profile.
  4. maintenance.sh calls uv sync --locked directly; the shell cannot find uv on its PATH; set -euo pipefail aborts with uv: command not found.
  5. Same failure for any developer running bash scripts/codex/quick-check.sh from a fresh shell.
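
The PATH behaviour in the alternative path can be reproduced without uv at all. In this small demonstration a child `bash` process stands in for setup.sh and the parent shell stands in for a later maintenance.sh run; the directory `/tmp/demo-uv-bin` is a made-up placeholder that does not need to exist.

```shell
#!/usr/bin/env bash
# The child process extends its own PATH and exits, like setup.sh does.
bash -c 'export PATH="/tmp/demo-uv-bin:$PATH"'

# Check whether a directory is an entry of the current shell's PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# The parent never sees the child's change: environments are inherited
# from parent to child, never the other way around.
if path_contains "/tmp/demo-uv-bin"; then
  echo "PATH change persisted"
else
  echo "PATH change did not persist"
fi
```

Unless the placeholder directory was already on PATH, the second branch is the one that runs.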

How to fix

Switch the fallback to the canonical uv installer, which sidesteps PEP 668 and writes shell-rc entries so subsequent shells inherit the PATH:

if ! command -v uv >/dev/null 2>&1; then
  curl -LsSf https://astral.sh/uv/install.sh | sh
  export PATH="$HOME/.local/bin:$PATH"
fi

Alternatives: pipx install uv, or python3 -m pip install --user --break-system-packages uv plus defensively prepending $HOME/.local/bin to PATH at the top of maintenance.sh and quick-check.sh.
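
The defensive prepend mentioned in the alternatives could look roughly like the following preamble, which is safe to run repeatedly. `prepend_path_once` is a hypothetical helper name, and the error message wording is illustrative.

```shell
#!/usr/bin/env bash
# Prepend a directory to PATH unless it is already an entry.
prepend_path_once() {
  local dir="$1"
  case ":$PATH:" in
    *":$dir:"*) ;;                  # already on PATH, nothing to do
    *) export PATH="$dir:$PATH" ;;
  esac
}

# Hypothetical preamble for maintenance.sh and quick-check.sh.
prepend_path_once "$HOME/.local/bin"

if ! command -v uv >/dev/null 2>&1; then
  echo "uv not found; run scripts/codex/setup.sh first" >&2
fi
```

Checking for an existing entry keeps PATH from growing each time a script is sourced or re-run in the same shell.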
