Merged
15 changes: 15 additions & 0 deletions .claude/commands/red.md
@@ -94,6 +94,21 @@ This phase is **not part of the regular TDD workflow** and must only be applied
- Once sufficient understanding is achieved, all spike code is discarded, and normal TDD resumes starting from the **Red Phase**.
- A Spike is justified only when it is impossible to define a meaningful failing test due to technical uncertainty or unknown system behavior.

### If a New Test Passes Immediately

If a newly written test passes without any implementation change, do not assume it is correct. Verify it actually exercises the intended behavior:

1. Identify the implementation line most likely responsible for the pass
2. Temporarily remove that line
3. Run the **full test suite** (not just the new test)

Then interpret the result:

- **Only the new test fails** — the line was never driven by a prior test. This is accidental over-implementation: delete the line permanently and proceed to the green phase to reintroduce it properly.
- **Other existing tests also fail** — the line was already legitimately required by prior work. The new test is valid regression coverage. Restore the line; the test is confirmed correct as written.

In both cases, confirm the new test fails for the expected reason before proceeding (the right assertion, not a syntax or import error).
Comment on lines +101 to +110
⚠️ Potential issue | 🟠 Major

Reorder validation flow to run the targeted test before the full suite.

At Line 103, requiring a full-suite run immediately conflicts with the project’s targeted-first test iteration rule. After temporarily removing the line, first run the new test alone to confirm it turns red for the expected assertion reason; then run the full suite to assess prior coverage impact.

Suggested wording update
-1. Identify the implementation line most likely responsible for the pass
-2. Temporarily remove that line
-3. Run the **full test suite** (not just the new test)
+1. Identify the implementation line most likely responsible for the pass
+2. Temporarily remove that line
+3. Run the **new test in isolation** and confirm it fails for the expected assertion reason
+4. Then run the **full test suite** to check whether prior tests also depend on that line

Based on learnings: "When iterating on a single test, run that test in isolation first... Only run the full suite once the target test behaves as expected."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/commands/red.md around lines 101-110, update the validation flow
so the targeted test is run before the full suite by changing the instruction
that currently reads "Run the **full test suite** (not just the new test)";
instead, instruct to first run the new/targeted test in isolation to confirm it
fails for the expected assertion, then run the full suite to check broader
impact, and adjust the surrounding wording to reflect "When iterating on a
single test, run that test in isolation first; only run the full suite once the
target test behaves as expected."


### General Information

- Sometimes the test output reports that no tests were run when a new test fails due to a missing import or constructor. In such cases, allow the agent to create simple stubs. If the agent is stuck, ask whether it forgot to create a stub.
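As a sketch of such a stub (the class name `PaymentProcessor` is hypothetical, not from this repo), the goal is just enough code for the import to resolve, so the Red Phase fails on the intended assertion instead of an `ImportError`:

```python
# Minimal stub: exists only so the failing test's import resolves.
# The real behavior is still unimplemented, so the test stays red
# for the right reason (its assertion), not a collection error.
class PaymentProcessor:
    def charge(self, amount: int) -> None:
        raise NotImplementedError
```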
2 changes: 1 addition & 1 deletion .copier-answers.yml
@@ -1,5 +1,5 @@
# Changes here will be overwritten by Copier
-_commit: v0.0.109
+_commit: v0.0.110
_src_path: gh:LabAutomationAndScreening/copier-base-template.git
description: Copier template for creating Python libraries and executables
install_claude_cli: true
2 changes: 1 addition & 1 deletion .devcontainer/devcontainer.json
@@ -65,5 +65,5 @@
"initializeCommand": "sh .devcontainer/initialize-command.sh",
"onCreateCommand": "sh .devcontainer/on-create-command.sh",
"postStartCommand": "sh .devcontainer/post-start-command.sh"
-// Devcontainer context hash (do not manually edit this, it's managed by a pre-commit hook): d77a3ff3 # spellchecker:disable-line
+// Devcontainer context hash (do not manually edit this, it's managed by a pre-commit hook): 087a93d5 # spellchecker:disable-line
}
14 changes: 10 additions & 4 deletions .devcontainer/manual-setup-deps.py
@@ -11,6 +11,7 @@

REPO_ROOT_DIR = Path(__file__).parent.parent.resolve()
ENVS_CONFIG = REPO_ROOT_DIR / ".devcontainer" / "envs.json"
+PULUMI_CLI_INSTALL_SCRIPT = REPO_ROOT_DIR / ".devcontainer" / "install-pulumi-cli.sh"
UV_PYTHON_ALREADY_CONFIGURED = "UV_PYTHON" in os.environ
parser = argparse.ArgumentParser(description="Manual setup for dependencies in the repo")
_ = parser.add_argument(
@@ -140,10 +141,15 @@ def main():
and env.lock_file.exists()
and '"pulumi"' in env.lock_file.read_text()
):
-_ = subprocess.run(
-    ["sh", str(REPO_ROOT_DIR / ".devcontainer" / "install-pulumi-cli.sh"), str(env.lock_file)],
-    check=True,
-)
+if not PULUMI_CLI_INSTALL_SCRIPT.exists():
+    print(
+        f"Pulumi CLI install script not found at {PULUMI_CLI_INSTALL_SCRIPT}, skipping Pulumi CLI installation"
+    )
+else:
+    _ = subprocess.run(
+        ["sh", str(PULUMI_CLI_INSTALL_SCRIPT), str(env.lock_file)],
+        check=True,
+    )
elif env.package_manager == PackageManager.PNPM:
pnpm_command = ["pnpm", "install", "--dir", str(env.path)]
if env_check_lock:
1 change: 1 addition & 0 deletions .github/workflows/ci.yaml
@@ -23,6 +23,7 @@ jobs:

check-skip-duplicate:
runs-on: ubuntu-24.04
+timeout-minutes: 2
outputs:
should-run: ${{ steps.check.outputs.should-run }}
steps:
1 change: 1 addition & 0 deletions AGENTS.md
@@ -23,6 +23,7 @@ This project is a Copier template used to generate other copier templates. It is
- Avoid magic values in comparisons in tests in all languages (like ruff rule PLR2004 specifies)
- Prefer using random values in tests rather than arbitrary ones (e.g. the faker library, uuids, random.randint) when possible. For enums, pick randomly rather than hardcoding one value.
- Avoid loops in tests — assert each item explicitly so failures pinpoint the exact element. When verifying a condition across all items in a collection, collect the violations into a list and assert it's empty (e.g., assert [x for x in items if bad_condition(x)] == []).
- When a test's final assertion is an absence (e.g., element is `null`, list is empty, modal is closed), include a prior presence assertion confirming the expected state existed before the action that removed it. A test whose only assertion is an absence check can pass vacuously if setup silently failed.
- When asserting a mock or spy was called with specific arguments, always constrain as tightly as possible. In order of preference: (1) assert called exactly once with those args (`assert_called_once_with` in Python, `toHaveBeenCalledExactlyOnceWith` in Vitest/Jest); (2) if multiple calls are expected, assert the total call count and use a positional or last-call assertion (`nthCalledWith`, `lastCalledWith` / `assert_has_calls` with `call_args_list[n]`); (3) plain "called with at any point" (`toHaveBeenCalledWith`, `assert_called_with`) is a last resort only when neither the call count nor the call order can reasonably be constrained.
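As an illustrative sketch combining the presence-before-absence rule with the tightest mock constraint (all names here are hypothetical, not from this codebase):

```python
from unittest.mock import Mock


def test_clearing_items_notifies_once():
    notifier = Mock()
    items = ["widget"]

    # Presence assertion first: prove the state existed, so the later
    # absence check cannot pass vacuously if setup silently failed
    assert items == ["widget"]

    # Action under test (inlined for this sketch)
    removed = items.pop()
    notifier.send(f"removed {removed}")

    # Absence assertion, now meaningful because presence was confirmed
    assert items == []

    # Tightest constraint available: exactly once, with these exact args
    notifier.send.assert_called_once_with("removed widget")


test_clearing_items_notifies_once()
```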

### Python Testing
15 changes: 15 additions & 0 deletions template/.claude/commands/red.md
@@ -94,6 +94,21 @@ This phase is **not part of the regular TDD workflow** and must only be applied
- Once sufficient understanding is achieved, all spike code is discarded, and normal TDD resumes starting from the **Red Phase**.
- A Spike is justified only when it is impossible to define a meaningful failing test due to technical uncertainty or unknown system behavior.

### If a New Test Passes Immediately

If a newly written test passes without any implementation change, do not assume it is correct. Verify it actually exercises the intended behavior:

1. Identify the implementation line most likely responsible for the pass
2. Temporarily remove that line
3. Run the **full test suite** (not just the new test)

Then interpret the result:

- **Only the new test fails** — the line was never driven by a prior test. This is accidental over-implementation: delete the line permanently and proceed to the green phase to reintroduce it properly.
- **Other existing tests also fail** — the line was already legitimately required by prior work. The new test is valid regression coverage. Restore the line; the test is confirmed correct as written.

In both cases, confirm the new test fails for the expected reason before proceeding (the right assertion, not a syntax or import error).
Comment on lines +101 to +110

⚠️ Potential issue | 🟠 Major

Keep template guidance aligned with targeted-first test execution.

Line 103 currently pushes a full-suite run too early. To match established workflow, validate the new test’s red state in isolation first, then run the full suite for dependency/regression signal.

Suggested wording update
-1. Identify the implementation line most likely responsible for the pass
-2. Temporarily remove that line
-3. Run the **full test suite** (not just the new test)
+1. Identify the implementation line most likely responsible for the pass
+2. Temporarily remove that line
+3. Run the **new test in isolation** and confirm it fails for the expected assertion reason
+4. Then run the **full test suite** to check whether prior tests also depend on that line

Based on learnings: "When iterating on a single test, run that test in isolation first... Only run the full suite once the target test behaves as expected."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@template/.claude/commands/red.md` around lines 101-110, update the
procedure so the "Run the **full test suite** (not just the new test)" step is
not executed before validating the new test in isolation: change the sequence in
the checklist to first run only the targeted/new test to confirm it fails for
the expected reason, then, after confirming the isolated failure, run the full
test suite to detect regressions; edit the wording around the three steps to
make this order explicit and preserve the guidance about interpreting results
and confirming the correct failure mode.


### General Information

- Sometimes the test output reports that no tests were run when a new test fails due to a missing import or constructor. In such cases, allow the agent to create simple stubs. If the agent is stuck, ask whether it forgot to create a stub.
14 changes: 10 additions & 4 deletions template/.devcontainer/manual-setup-deps.py
@@ -11,6 +11,7 @@

REPO_ROOT_DIR = Path(__file__).parent.parent.resolve()
ENVS_CONFIG = REPO_ROOT_DIR / ".devcontainer" / "envs.json"
+PULUMI_CLI_INSTALL_SCRIPT = REPO_ROOT_DIR / ".devcontainer" / "install-pulumi-cli.sh"
UV_PYTHON_ALREADY_CONFIGURED = "UV_PYTHON" in os.environ
parser = argparse.ArgumentParser(description="Manual setup for dependencies in the repo")
_ = parser.add_argument(
@@ -140,10 +141,15 @@ def main():
and env.lock_file.exists()
and '"pulumi"' in env.lock_file.read_text()
):
-_ = subprocess.run(
-    ["sh", str(REPO_ROOT_DIR / ".devcontainer" / "install-pulumi-cli.sh"), str(env.lock_file)],
-    check=True,
-)
+if not PULUMI_CLI_INSTALL_SCRIPT.exists():
+    print(
+        f"Pulumi CLI install script not found at {PULUMI_CLI_INSTALL_SCRIPT}, skipping Pulumi CLI installation"
+    )
+else:
+    _ = subprocess.run(
+        ["sh", str(PULUMI_CLI_INSTALL_SCRIPT), str(env.lock_file)],
+        check=True,
+    )
elif env.package_manager == PackageManager.PNPM:
pnpm_command = ["pnpm", "install", "--dir", str(env.path)]
if env_check_lock:
1 change: 1 addition & 0 deletions template/.github/workflows/ci.yaml.jinja
@@ -23,6 +23,7 @@ jobs:

check-skip-duplicate:
runs-on: {% endraw %}{{ gha_linux_runner }}{% raw %}
+timeout-minutes: {% endraw %}{{ gha_short_timeout_minutes }}{% raw %}
permissions:
contents: read
pull-requests: read # needed to check if PR exists for current branch
1 change: 1 addition & 0 deletions template/AGENTS.md
@@ -23,6 +23,7 @@ This project is a Python library.
- Avoid magic values in comparisons in tests in all languages (like ruff rule PLR2004 specifies)
- Prefer using random values in tests rather than arbitrary ones (e.g. the faker library, uuids, random.randint) when possible. For enums, pick randomly rather than hardcoding one value.
- Avoid loops in tests — assert each item explicitly so failures pinpoint the exact element. When verifying a condition across all items in a collection, collect the violations into a list and assert it's empty (e.g., assert [x for x in items if bad_condition(x)] == []).
- When a test's final assertion is an absence (e.g., element is `null`, list is empty, modal is closed), include a prior presence assertion confirming the expected state existed before the action that removed it. A test whose only assertion is an absence check can pass vacuously if setup silently failed.
- When asserting a mock or spy was called with specific arguments, always constrain as tightly as possible. In order of preference: (1) assert called exactly once with those args (`assert_called_once_with` in Python, `toHaveBeenCalledExactlyOnceWith` in Vitest/Jest); (2) if multiple calls are expected, assert the total call count and use a positional or last-call assertion (`nthCalledWith`, `lastCalledWith` / `assert_has_calls` with `call_args_list[n]`); (3) plain "called with at any point" (`toHaveBeenCalledWith`, `assert_called_with`) is a last resort only when neither the call count nor the call order can reasonably be constrained.

### Python Testing