refactor(tests): behavioral tests and shared factories #32

Merged
weklund merged 3 commits into main from feat/behavioral-test-refactor
Apr 4, 2026
@weklund (Owner) commented Apr 4, 2026

Summary

  • Replace brittle mock-heavy orchestration tests with behavioral tests. TestRunUp (13 tests) previously used 10 @patch decorators per test, which tested mock wiring rather than real behavior. It now uses a FakeServiceLayer test double that fakes only the OS boundary (subprocess, HTTP, signals); YAML loading, config reading, and pure functions run unmocked.
  • Consolidate ~50 duplicate helper functions into tests/factories.py. _make_entry (9 files), _make_stack_yaml (5 files), _make_test_catalog (6 files), _make_profile (5 files), and others now live in one shared module.
  • Add AAA structure comments (# Arrange, # Act, # Assert) to all non-trivial tests across 17 modified files.
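
For readers unfamiliar with the pattern, here is a minimal sketch of what a boundary-level fake can look like. It is not the FakeServiceLayer from tests/fakes.py: the fields, the method signatures, and the toy `bring_up` function are all assumptions made for illustration. Only the names `start_service` and `wait_for_healthy` are borrowed from the patched functions in this PR.

```python
from dataclasses import dataclass, field


@dataclass
class FakeServiceLayer:
    """Hypothetical test double standing in for OS-boundary calls."""

    # Services whose start should fail (illustrative knob, not the real API).
    fail_to_start: set[str] = field(default_factory=set)
    # Services that start but never report healthy.
    unhealthy: set[str] = field(default_factory=set)
    # Record of started services, so tests can assert on observable behavior.
    started: list[str] = field(default_factory=list)
    _next_pid: int = 1000

    def start_service(self, name: str) -> int:
        """Stand-in for spawning a subprocess; returns a fake PID."""
        if name in self.fail_to_start:
            raise RuntimeError(f"could not start {name}")
        self.started.append(name)
        self._next_pid += 1
        return self._next_pid

    def wait_for_healthy(self, name: str) -> bool:
        """Stand-in for polling an HTTP health endpoint."""
        return name in self.started and name not in self.unhealthy


def bring_up(names, services):
    """Toy orchestration logic under test: real code, no mocks."""
    statuses = {}
    for name in names:
        try:
            services.start_service(name)
        except RuntimeError:
            statuses[name] = "failed"
            continue
        healthy = services.wait_for_healthy(name)
        statuses[name] = "healthy" if healthy else "unhealthy"
    return statuses
```

A test then configures the fake in its Arrange step and asserts on the returned statuses, so the orchestration code itself runs unmodified.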

Key new files

| File | Purpose |
| --- | --- |
| tests/factories.py | Shared data factories (make_entry, make_stack_yaml, etc.) |
| tests/fakes.py | FakeServiceLayer: configurable test double for OS-boundary functions |
| tests/unit/conftest.py | Unit-specific fixtures (stack_on_disk, fake_services, pids_dir, logs_dir) |
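
As a rough illustration of the factory style, the sketch below shows one common shape for such helpers: defaults plus keyword overrides. The field names (`name`, `tier`, `port`) and the `make_stack` helper are hypothetical and are not taken from tests/factories.py; only the name `make_entry` comes from this PR.

```python
from typing import Any


def make_entry(**overrides: Any) -> dict[str, Any]:
    """Build a catalog-entry-like dict; every field name here is hypothetical."""
    entry: dict[str, Any] = {"name": "test-model", "tier": "fast", "port": 8000}
    entry.update(overrides)  # callers override only the fields a test cares about
    return entry


def make_stack(*entries: dict[str, Any], name: str = "default") -> dict[str, Any]:
    """Build a stack-definition-like dict, defaulting to a single default entry."""
    return {"name": name, "tiers": list(entries) or [make_entry()]}
```

Each call returns a fresh object, so one test mutating its data cannot leak into another, and a test reads as `make_entry(port=9001)` rather than a full literal.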

Before → After (TestRunUp example)

# BEFORE: 10 @patch decorators, ~35 lines of mock setup per test
@patch("mlx_stack.core.stack_up.check_local_model_exists", return_value=None)
@patch("mlx_stack.core.stack_up.start_service")
@patch("mlx_stack.core.stack_up.wait_for_healthy")
# ... 7 more @patch lines ...
def test_successful_startup(self, mock_which, mock_get_value, ...11 params...):
    mock_load_catalog.return_value = _make_test_catalog()
    mock_get_value.side_effect = lambda key: {...}.get(key, "")
    mock_which.side_effect = lambda x: f"/usr/local/bin/{x}"
    mock_lock.return_value.__enter__ = MagicMock(return_value=None)
    mock_lock.return_value.__exit__ = MagicMock(return_value=False)
    # ... more mock config ...

# AFTER: 0 @patch decorators, ~3 lines of setup
def test_successful_startup(self, stack_on_disk, fake_services):
    # Arrange — defaults: all services start and pass health check
    # Act
    result = run_up()
    # Assert
    assert all(t.status == "healthy" for t in result.tiers)

Stats

  • -577 net lines (2,545 added, 2,528 removed across 20 files)
  • 1,481 tests pass (same count, zero regressions)
  • 673 → ~180 @patch usages in unit tests (73% reduction)
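
One way the @patch figure could be reproduced is sketched below. This is an assumption about method, not the script used for this PR: it counts decorator lines only, so `with patch(...)` context managers are not included.

```python
import re
from pathlib import Path

# Matches decorator lines such as "@patch(...)", "@patch.object(...)",
# or "@mock.patch(...)"; non-decorator uses of patch are ignored.
PATCH_RE = re.compile(r"^\s*@(?:mock\.)?patch\b", re.MULTILINE)


def count_patch_decorators(root: Path) -> int:
    """Count @patch decorator lines in every .py file under root."""
    return sum(
        len(PATCH_RE.findall(path.read_text(encoding="utf-8")))
        for path in root.rglob("*.py")
    )


def percent_reduction(before: int, after: int) -> int:
    """Relative reduction, rounded to a whole percent."""
    return round(100 * (before - after) / before)
```

With the numbers quoted above, `percent_reduction(673, 180)` gives 73.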

Test plan

  • uv run pytest tests/unit/ -x -q --tb=short — all 1,481 tests pass
  • uv run ruff check tests/ — all lint checks pass
  • Hybrid TDD verification: temporarily misconfigured FakeServiceLayer to confirm tests fail for the right reasons
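
The misconfiguration check in the last item can be pictured with the small sketch below. Everything in it is hypothetical (`StubServices`, `tier_statuses`, the tier names); it only shows the idea that the same assertion passes with a default fake and fails once the fake is deliberately configured to report an unhealthy service.

```python
class StubServices:
    """Hypothetical minimal fake: every service is healthy unless listed."""

    def __init__(self, unhealthy=()):
        self.unhealthy = set(unhealthy)

    def wait_for_healthy(self, name):
        return name not in self.unhealthy


def tier_statuses(names, services):
    """Toy stand-in for the code under test."""
    return {
        name: "healthy" if services.wait_for_healthy(name) else "unhealthy"
        for name in names
    }


def check_all_healthy(services):
    """The assertion a behavioral test would make."""
    statuses = tier_statuses(["fast", "standard"], services)
    assert all(s == "healthy" for s in statuses.values()), statuses


def fails_when_misconfigured():
    """Return True if breaking the fake makes the assertion fail."""
    try:
        check_all_healthy(StubServices(unhealthy={"fast"}))
    except AssertionError:
        return True
    return False
```

If `fails_when_misconfigured()` returned False, the test would be passing regardless of the behavior it claims to check.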

🤖 Generated with Claude Code

weklund and others added 3 commits April 4, 2026 10:07
…ts and shared factories

Replace 10-deep @patch stacks in TestRunUp with a FakeServiceLayer test double
that mocks only at the OS boundary (subprocess, HTTP, signals). Consolidate ~50
duplicate helper functions (_make_entry x9, _make_stack_yaml x5, etc.) into
tests/factories.py. Structure all tests with AAA comments. Net result: -577
lines, same 1481 tests passing, tests now assert behavior not mock wiring.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Running `make lint` now executes both ruff and pyright so type errors
are caught locally before push, not just in CI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@weklund merged commit 9af6078 into main Apr 4, 2026
5 checks passed
@weklund mentioned this pull request Apr 4, 2026
weklund added a commit that referenced this pull request Apr 4, 2026
🤖 I have created a release *beep* *boop*
---


## [0.3.5](v0.3.4...v0.3.5)
(2026-04-04)


### Features

* expand ruff lint rules with tier 1+2 quality rulesets
([#22](#22))
([75490f6](75490f6))


### Refactors

* **tests:** replace brittle mock-heavy tests with behavioral tests and
shared factories ([#32](#32))
([9af6078](9af6078))
  - `FakeServiceLayer` replaces 10-deep `@patch` stacks in `TestRunUp`
  - Consolidate ~50 duplicate helpers into `tests/factories.py`
  - AAA comments (`# Arrange`, `# Act`, `# Assert`) across 17 test files
  - `make lint` now includes pyright for shift-left type checking
  - Net: -577 lines, 1,481 tests pass, 73% reduction in `@patch` usage

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Wes Eklund <s.wes35@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>