Merged
5 changes: 5 additions & 0 deletions .codex/environments/environment.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
version = 1
name = "langfuse-python"

[setup]
script = "bash scripts/codex/setup.sh"
30 changes: 30 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,30 @@
## What does this PR do?

> PR title must follow Conventional Commits, for example `feat: add dataset scoring helper` or `fix(openai): preserve trace context`.

Fixes #

## Type of change

- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Refactor
- [ ] Documentation update
- [ ] Tooling, CI, or repo maintenance

## Verification

List the main commands you ran:

```bash

```

## Checklist

- [ ] I self-reviewed the diff using `code_review.md`.
- [ ] I added or updated tests for behavior changes.
- [ ] I updated docs, examples, or `.env.template` if needed.
- [ ] I did not hand-edit generated files; if generated files changed, I used the upstream regeneration path.
- [ ] I did not commit secrets or credentials.
45 changes: 45 additions & 0 deletions .github/workflows/validate-pr-title.yml
@@ -0,0 +1,45 @@
---
name: "Validate PR Title"

on:
  pull_request:

P1: Switch PR-title workflow to fork-safe trigger

Using pull_request here makes this action fail for fork-based pull requests (the action’s own docs state this trigger only works for branches in the same repository). In any workflow that accepts external contributors, this causes the required PR-title check to error on fork PRs and can block merges until a maintainer intervenes. Trigger this workflow with pull_request_target instead (with minimal permissions) so title validation runs reliably for forks.


    branches:
      - "**"
    types:
      - opened
      - edited
      - synchronize
      - reopened

permissions: {}

jobs:
  validate-pr-title:
    runs-on: ubuntu-latest
    permissions:
      statuses: write
      pull-requests: read
    steps:
      - name: Validate PR title follows conventional commits
        uses: amannn/action-semantic-pull-request@48f256284bd46cdaab1048c3721360e808335d50 # v6.1.1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          types: |
            feat
            fix
            docs
            style
            refactor
            perf
            test
            build
            ci
            chore
            revert
            security
          requireScope: false
          validateSingleCommit: false
          ignoreLabels: |
            bot
            ignore-semantic-pull-request
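
For a quick local check before pushing, the title rule this workflow enforces can be approximated in plain bash. This is a minimal sketch, not the action's actual implementation: the function name `is_conventional_title` and the regular expression are assumptions made for illustration, and only the list of types is taken from the workflow's `types` input.

```bash
#!/usr/bin/env bash
# Hypothetical local approximation of the PR title check. The real validation
# is done by amannn/action-semantic-pull-request and may differ in details.

# Types copied from the workflow's `types` input.
allowed_types='feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert|security'

# Returns 0 when the title looks like `type: description` or
# `type(scope): description`, and 1 otherwise.
is_conventional_title() {
  local title="$1"
  local pattern="^(${allowed_types})(\([^()]+\))?: .+"
  [[ "$title" =~ $pattern ]]
}

# Example usage with the two titles from the PR template plus a bad one.
for title in "feat: add dataset scoring helper" "fix(openai): preserve trace context" "update stuff"; do
  if is_conventional_title "$title"; then
    echo "ok:  $title"
  else
    echo "bad: $title"
  fi
done
```

The scope is optional in the pattern because the workflow sets `requireScope: false`.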
5 changes: 5 additions & 0 deletions .gitignore
@@ -33,3 +33,8 @@ docs
tests/mocks/llama-index-storage

*.local.*

# Codex local runtime state
.codex/log/
.codex/sessions/
.codex/tmp/
24 changes: 22 additions & 2 deletions AGENTS.md
@@ -25,6 +25,7 @@ This repository contains the Langfuse Python SDK.
- `tests/live_provider/`: live OpenAI / LangChain provider tests
- `tests/support/`: shared helpers for e2e tests
- `scripts/select_e2e_shard.py`: CI shard selector for `tests/e2e`
- `scripts/codex/`: Codex cloud/worktree bootstrap and shared quick checks

## Working Style

@@ -34,6 +35,8 @@ This repository contains the Langfuse Python SDK.
- Keep repo-shared instructions here. Keep personal or machine-specific notes out of version control.
- Keep tests independent and parallel-safe by default.
- For bug fixes, prefer writing or identifying the failing test first, confirm the failure, then implement the fix.
- For complex or ambiguous tasks, plan first, identify the likely verification path, then implement.
- Before final handoff, review the diff for correctness, regressions, missing tests, and accidental generated-file edits.

## Setup And Quality Commands

@@ -43,6 +46,7 @@ uv run pre-commit install
uv run --frozen ruff check .
uv run --frozen ruff format .
uv run --frozen mypy langfuse --no-error-summary
bash scripts/codex/quick-check.sh
```

## Test Commands
@@ -66,6 +70,18 @@ uv run --frozen pytest -n 4 --dist worksteal tests/live_provider -m "live_provid
uv run --frozen pytest tests/unit/test_resource_manager.py::test_pause_signals_score_consumer_shutdown
```

Minimum verification matrix:

| Change scope | Minimum verification |
| --- | --- |
| Docs or comments only | `uv run --frozen ruff format --check .` if Python files changed |
| Python source only | `uv run --frozen ruff check .` + `uv run --frozen mypy langfuse --no-error-summary` + targeted unit tests |
| Unit-test-only change | targeted `uv run --frozen pytest ...` for the changed tests |
| Shutdown, flushing, worker-thread, or OTEL-heavy change | targeted resource-manager/OTEL tests plus affected integration tests when relevant |
| OpenAI or LangChain instrumentation | targeted unit tests using exporter-local assertions; add e2e/live-provider coverage only when unit tests cannot cover behavior |
| Generated API client or public API contract | upstream Fern/OpenAPI regeneration path plus targeted SDK serialization/deserialization tests |
| CI, sharding, or bootstrap | relevant script test plus CI workflow review against this file's CI contract |

## Test Topology

### `tests/unit`
@@ -96,6 +112,7 @@ The main CI workflow currently runs:
- `tests/unit` on a Python 3.10-3.14 matrix
- `tests/e2e` in 2 mechanical shards plus a serial subset inside each shard
- `tests/live_provider` as one always-on suite
- PR title validation for Conventional Commits

If you change the e2e split:

@@ -113,16 +130,19 @@ If you change CI bootstrap:
- Keep changes scoped. Avoid unrelated refactors.
- Prefer `LANGFUSE_BASE_URL`; `LANGFUSE_HOST` is deprecated and is only kept for compatibility tests.
- If you touch `langfuse/api/`, regenerate it from the upstream Fern/OpenAPI source instead of hand-editing files.
- If you change public SDK behavior, update examples, README snippets, or generated reference docs when they would otherwise become stale.
- If you touch shutdown, flushing, or worker-thread behavior, run the relevant resource-manager and OTEL-heavy tests.
- If you change OpenAI or LangChain instrumentation, keep as much coverage as possible in `tests/unit` using exporter-local assertions, and leave only the minimal necessary coverage in `tests/e2e` / `tests/live_provider`.
- Never commit secrets or credentials.
- Keep `.env.template` in sync with required local-development environment variables.

## Commit And PR Rules

- Commit messages and PR titles should follow Conventional Commits: `type(scope): description` or `type: description`.
- Commit messages and PR titles must follow Conventional Commits: `type(scope): description` or `type: description`.
- Allowed common types include `feat`, `fix`, `docs`, `style`, `refactor`, `perf`, `test`, `build`, `ci`, `chore`, `revert`, and `security`.
- Keep commits focused and atomic.
- In PR descriptions, list the main verification commands you ran.
- Before opening a PR, self-review the diff and check `code_review.md` for the repo-specific review checklist.
- In PR descriptions, list the main verification commands you ran and call out any skipped checks with the reason.

## Python-Specific Notes

96 changes: 74 additions & 22 deletions CONTRIBUTING.md
@@ -4,48 +4,100 @@

### Install dependencies

```
uv sync
```bash
uv sync --locked
```

### Add Pre-commit
### Add pre-commit

```
```bash
uv run pre-commit install
```

### Type Checking
### Quality checks

To run type checking on the langfuse package, run:
```sh
uv run mypy langfuse --no-error-summary
```bash
uv run --frozen ruff check .
uv run --frozen ruff format .
uv run --frozen mypy langfuse --no-error-summary
```

For a broad local confidence check, run:

```bash
bash scripts/codex/quick-check.sh
```

### Tests

#### Setup
Unit tests do not require a running Langfuse server:

- Add .env based on .env.template
```bash
uv run --frozen pytest -n auto --dist worksteal tests/unit
```

#### Run
E2E tests require a running Langfuse server and environment variables based on `.env.template`:

- Run all
```bash
uv run --frozen pytest -n 4 --dist worksteal tests/e2e -m "not serial_e2e"
uv run --frozen pytest tests/e2e -m "serial_e2e"
```

Live-provider tests make real provider calls and require provider API keys:

```bash
uv run --frozen pytest -n 4 --dist worksteal tests/live_provider -m "live_provider"
```

Run a specific test with:

```
uv run --env-file .env pytest -s -v --log-cli-level=INFO
```
```bash
uv run --frozen pytest tests/unit/test_resource_manager.py::test_pause_signals_score_consumer_shutdown
```

## Codex Cloud Setup

This repository includes repo-owned Codex setup so agents can start from a reproducible environment.

Recommended Codex UI configuration:

1. Create a Codex cloud environment for this repository.
2. Set the setup script to:

```bash
bash scripts/codex/setup.sh
```

3. Set the maintenance script to:

```bash
bash scripts/codex/maintenance.sh
```

4. Keep agent internet access disabled by default, or allow only the domains required for the task.
5. Add secrets and environment variables in the Codex UI instead of committing them.

## Pull Requests

PR titles and commit messages must follow Conventional Commits:

```text
type(scope): description
type: description
```

- Run a specific test
Common types include `feat`, `fix`, `docs`, `style`, `refactor`, `perf`, `test`, `build`, `ci`, `chore`, `revert`, and `security`.

```
uv run --env-file .env pytest -s -v --log-cli-level=INFO tests/test_core_sdk.py::test_flush
```
Before opening a PR:

- E2E tests involving OpenAI and Serp API are usually skipped, remove skip decorators in [tests/test_langchain.py](tests/test_langchain.py) to run them.
- Self-review the diff and use `code_review.md` for the repo-specific checklist.
- Keep changes focused and avoid unrelated refactors.
- Add or update tests for behavior changes.
- List the verification commands you ran in the PR description.

### Update openapi spec
### Update OpenAPI spec

A PR with the changes is automatically created upon changing the Spec in the langfuse repo.
The generated API client in `langfuse/api/` must not be hand-edited. Regenerate it from the upstream Fern/OpenAPI source.

### Publish release

38 changes: 38 additions & 0 deletions code_review.md
@@ -0,0 +1,38 @@
# Langfuse Python SDK Review Checklist

Use this checklist for `/review`, PR review, or self-review before handoff.

## Priorities

- Findings first: correctness bugs, regressions, security/privacy risks, performance issues with real impact, and missing tests for risky behavior.
- Keep line references tight and actionable.
- If no findings, say so explicitly and mention any residual risk or unrun verification.

## SDK Correctness

- Public SDK behavior should remain backwards compatible unless the PR is explicitly breaking.
- Prefer `LANGFUSE_BASE_URL`; `LANGFUSE_HOST` is deprecated and should only appear in compatibility paths or tests.
- Check shutdown, flushing, background task, and resource-manager changes for races, dropped events/scores/media, daemon-thread leaks, and hanging interpreter shutdown.
- OpenTelemetry changes should preserve context propagation, span parenting, exporter-local testability, and idempotent instrumentation setup.
- OpenAI and LangChain instrumentation should avoid brittle assertions on provider internals; prefer stable exporter-local behavior in unit tests.

## API And Generated Code

- Do not hand-edit `langfuse/api/`; regenerate it from the upstream Fern/OpenAPI source.
- Public API or serialization changes should include tests for request shape, response shape, and backwards-compatible aliases when relevant.
- Update README examples, `.env.template`, or generated reference docs when changed behavior would make them stale.

## Tests And CI

- Unit tests must not require a running Langfuse server.
- E2E tests should use bounded polling helpers from `tests/support/`, not raw `sleep()`.
- New e2e files must be named `tests/e2e/test_*.py` so mechanical CI sharding includes them.
- Use `serial_e2e` only for tests that are unsafe with shared-server concurrency.
- Live-provider tests should assert stable provider-facing behavior, not exact observation counts unless counts are the behavior under test.

## Python Style

- Exception messages should not inline f-string literals in `raise` statements; build the message in a variable first.
- Keep edits ASCII-only unless the file already uses Unicode or Unicode is clearly required.
- Keep changes scoped; avoid opportunistic refactors.
- Never commit secrets or credentials.
8 changes: 8 additions & 0 deletions scripts/codex/maintenance.sh
@@ -0,0 +1,8 @@
#!/usr/bin/env bash
set -euo pipefail

repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
cd "$repo_root"

uv sync --locked
uv cache prune --ci >/dev/null 2>&1 || true
9 changes: 9 additions & 0 deletions scripts/codex/quick-check.sh
@@ -0,0 +1,9 @@
#!/usr/bin/env bash
set -euo pipefail

repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
cd "$repo_root"

uv run --frozen ruff check .
uv run --frozen mypy langfuse --no-error-summary
uv run --frozen pytest -n auto --dist worksteal tests/unit
13 changes: 13 additions & 0 deletions scripts/codex/setup.sh
@@ -0,0 +1,13 @@
#!/usr/bin/env bash
set -euo pipefail

repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
cd "$repo_root"

if ! command -v uv >/dev/null 2>&1; then
  python3 -m pip install --user "uv==0.11.2"

P2 Pinned uv version may conflict with lockfile format

uv==0.11.2 is installed as the fallback, but uv sync --locked is called immediately after with the repo's existing uv.lock. If the lockfile was generated with a different uv version (older or newer), the sync can fail due to lockfile format incompatibilities. Consider either pinning to the same version used to generate the lockfile, or adding a comment explaining why 0.11.2 was chosen and how it should be kept in sync with the project's own uv dependency.
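
One way to act on this suggestion is to warn early when the uv on `PATH` does not match the version the repository expects. The sketch below is only an illustration under stated assumptions: `EXPECTED_UV_VERSION`, `parse_uv_version`, and `uv_version_matches` are hypothetical names, and the parsing relies only on `uv --version` printing a line that begins with `uv <version>`.

```bash
#!/usr/bin/env bash
# Hypothetical guard: compare the uv version in use with the version the repo pins.
EXPECTED_UV_VERSION="${EXPECTED_UV_VERSION:-0.11.2}"  # default taken from setup.sh

# Extracts the version number from a `uv --version` style line,
# e.g. "uv 0.11.2 (build info)" -> "0.11.2".
parse_uv_version() {
  local line="$1"
  local rest="${line#uv }"   # drop the leading "uv "
  echo "${rest%% *}"         # keep everything before the next space
}

# Returns 0 when the reported version equals the expected one.
uv_version_matches() {
  [[ "$(parse_uv_version "$1")" == "$EXPECTED_UV_VERSION" ]]
}

# Only warn when uv is actually installed; a missing uv is handled elsewhere.
if command -v uv >/dev/null 2>&1; then
  if ! uv_version_matches "$(uv --version)"; then
    echo "warning: expected uv ${EXPECTED_UV_VERSION}, found $(uv --version)" >&2
  fi
fi
```

An exact-match comparison is the strictest choice; whether a looser range check would be enough depends on how tolerant `uv sync --locked` is of lockfiles written by other uv versions, which this sketch does not try to determine.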


  export PATH="$HOME/.local/bin:$PATH"
fi

uv sync --locked
uv run --frozen python --version

Check warning on line 13 in scripts/codex/setup.sh


Claude / Claude Code Review

Fragile uv bootstrap fallback in setup.sh

Comment on lines +7 to +13

🟡 The fallback bootstrap in scripts/codex/setup.sh (lines 7-10) has two compounding issues when uv is not preinstalled. First, python3 -m pip install --user "uv==0.11.2" aborts with error: externally-managed-environment on PEP 668-marked system Pythons (Debian 12+, Ubuntu 23.04+, recent Fedora/Alpine, many slim images), and under set -euo pipefail the bootstrap dies with no useful context. Second, export PATH="$HOME/.local/bin:$PATH" is process-local: scripts/codex/maintenance.sh and scripts/codex/quick-check.sh run as independent shells (also under set -euo pipefail) and call uv directly, so they fail with uv: command not found whenever ~/.local/bin is not on the inherited PATH. Suggested fix: switch the fallback to the canonical curl -LsSf https://astral.sh/uv/install.sh | sh (or pipx install uv), which sidesteps PEP 668 and writes shell-rc entries so subsequent script invocations find uv.

Extended reasoning...

What the bug is

scripts/codex/setup.sh is the documented Codex Cloud setup hook (CONTRIBUTING.md now lists bash scripts/codex/setup.sh as the recommended setup script and .codex/environments/environment.toml wires it into the environment). Its fallback path for installing uv is:

if ! command -v uv >/dev/null 2>&1; then
  python3 -m pip install --user "uv==0.11.2"
  export PATH="$HOME/.local/bin:$PATH"
fi

Two things go wrong when uv is not preinstalled:

  1. PEP 668 aborts the install. On Debian 12+, Ubuntu 23.04+, recent Fedora/Alpine, and many slim base images, the system Python is marked externally-managed. python3 -m pip install --user ... then exits non-zero with error: externally-managed-environment. With set -euo pipefail, this kills setup.sh with no diagnostic beyond the pip error.
  2. PATH does not persist. Even if pip succeeds, export PATH="$HOME/.local/bin:$PATH" only mutates the env of the current setup.sh process. scripts/codex/maintenance.sh and scripts/codex/quick-check.sh are subsequent independent shells (each #!/usr/bin/env bash with set -euo pipefail) that invoke uv sync --locked / uv run --frozen ... directly, with no PATH prepend and no fallback. ~/.local/bin is only added to PATH by ~/.profile / ~/.bash_profile, which non-login shells (typical in CI runners and minimal images) do not source.

Why existing code does not prevent it

The author clearly anticipated the missing-uv case for setup.sh itself, but the fix is incomplete: neither maintenance.sh nor quick-check.sh re-prepend $HOME/.local/bin, and setup.sh does not write the PATH update into a profile/rc file the next shell would inherit. The author's local Codex testing presumably had uv already on PATH (the conditional was skipped) or used a non-PEP 668 base image, masking the problem.

Impact

This only triggers when (a) the fallback branch fires (uv not preinstalled) AND (b) the active distro is PEP 668-marked (issue 1) or ~/.local/bin is not on the inherited PATH (issue 2). In a fresh Codex Cloud environment using a recent Debian/Ubuntu slim image — exactly the case the fallback was written for — this is the first thing that runs, and the bootstrap aborts before ever calling uv sync. Even on non-PEP 668 systems, a successful pip install can still leave maintenance.sh failing on uv: command not found in a fresh shell. Scope is narrow (dev tooling for Codex agents), but the fallback path is exactly the case it needs to handle.

Step-by-step proof

Concrete reproducer on a typical PEP 668 base image (e.g. python:3.13-slim or debian:12):

  1. Container starts with no uv and ~/.local/bin not on PATH.
  2. Codex runs bash scripts/codex/setup.sh per .codex/environments/environment.toml.
  3. command -v uv fails, so the fallback branch executes.
  4. python3 -m pip install --user "uv==0.11.2" exits 1 with error: externally-managed-environment.
  5. set -euo pipefail aborts setup.sh; uv sync --locked and the version-print line never run.

Alternative path (non-PEP 668 system, e.g. older base image):

  1. python3 -m pip install --user "uv==0.11.2" succeeds, installing uv into ~/.local/bin/uv.
  2. setup.sh's process PATH now includes $HOME/.local/bin; uv sync --locked succeeds; setup.sh exits 0.
  3. Codex later runs bash scripts/codex/maintenance.sh in a fresh non-login shell that does not source ~/.profile.
  4. maintenance.sh calls uv sync --locked directly; the shell cannot find uv on its PATH; set -euo pipefail aborts with uv: command not found.
  5. Same failure for any developer running bash scripts/codex/quick-check.sh from a fresh shell.
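
The first failure path can be detected before pip is invoked at all. PEP 668 marks an environment as externally managed by placing an `EXTERNALLY-MANAGED` file in the interpreter's stdlib directory, so a pre-flight check along these lines could print a clearer message. This is a sketch, not part of the PR: `is_externally_managed_dir` and `python_stdlib_dir` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check for the PEP 668 marker file.

# Returns 0 when the given stdlib directory contains the EXTERNALLY-MANAGED marker.
is_externally_managed_dir() {
  [[ -f "$1/EXTERNALLY-MANAGED" ]]
}

# Asks the interpreter for its stdlib directory; prints nothing if python3 is missing.
python_stdlib_dir() {
  command -v python3 >/dev/null 2>&1 || return 0
  python3 -c 'import sysconfig; print(sysconfig.get_path("stdlib"))'
}

stdlib_dir="$(python_stdlib_dir)"
if [[ -n "$stdlib_dir" ]] && is_externally_managed_dir "$stdlib_dir"; then
  echo "python3 is externally managed; 'pip install --user' is expected to be refused" >&2
fi
```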

How to fix

Switch the fallback to the canonical uv installer, which sidesteps PEP 668 and writes shell-rc entries so subsequent shells inherit the PATH:

if ! command -v uv >/dev/null 2>&1; then
  curl -LsSf https://astral.sh/uv/install.sh | sh
  export PATH="$HOME/.local/bin:$PATH"
fi

Alternatives: pipx install uv, or python3 -m pip install --user --break-system-packages uv plus defensively prepending $HOME/.local/bin to PATH at the top of maintenance.sh and quick-check.sh.
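
The defensive PATH prepend mentioned as an alternative could look roughly like the following. It is a sketch under assumptions: `prepend_path_once` and `ensure_uv_on_path` are hypothetical helpers that the PR does not contain, and `$HOME/.local/bin` is assumed to be where a user-level install places `uv`, as in `setup.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical shared helper, e.g. sourced at the top of maintenance.sh and quick-check.sh.

# Prepends a directory to PATH unless it is already present.
prepend_path_once() {
  local dir="$1"
  case ":$PATH:" in
    *":$dir:"*) ;;              # already on PATH, nothing to do
    *) PATH="$dir:$PATH" ;;
  esac
  export PATH
}

# Makes a user-level uv install visible, then reports whether uv can be found.
ensure_uv_on_path() {
  prepend_path_once "$HOME/.local/bin"
  command -v uv >/dev/null 2>&1
}

if ! ensure_uv_on_path; then
  echo "uv not found; run scripts/codex/setup.sh first" >&2
fi
```

A real script running under `set -euo pipefail` would probably `exit 1` after the message; the sketch only prints it so that it can be sourced safely. Checking for the directory before prepending keeps repeated invocations from growing `PATH`.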
