test: End-to-end integration test suite for cognitive subsystems by Copilot · Pull Request #101 · Steake/GodelOS

Copilot · 2026-03-05T23:06:36Z

No runnable integration tests existed for the full cognitive pipeline (NLU → KR → Inference → NLG).

New tests (`tests/integration/test_cognitive_pipeline.py`)

14 tests across 5 classes:

TestSingleQueryRoundTrip — NLU (mocked LAP) → KR write/read → NLG output
TestKnowledgePersistence — add/retract/pattern-query with variable bindings/cross-context isolation
TestContextSwitching — multi-turn discourse context in NLU and NLG
TestInferenceChain — Socrates syllogism, two-hop deduction via ResolutionProver, failed-inference handling
TestFullPipelineRoundTrip — NLU → KR → Inference → NLG in one scenario

spaCy is mocked at the LexicalAnalyzerParser.process() boundary with synthetic SyntacticParseOutput objects — semantic interpreter, formalizer, KR, inference, and NLG all run real code. Resolution prover is called directly (bypasses a pre-existing can_handle bug in InferenceCoordinator).

`standalone` marker infrastructure

Root conftest.py auto-adds requires_backend to all tests/integration/ files, which skips them when no server is running. Added a standalone marker that suppresses this skip for in-process tests:

# tests/integration/test_cognitive_pipeline.py
pytestmark = [pytest.mark.integration, pytest.mark.standalone]

# tests/conftest.py — pytest_runtest_setup
marker_names = [mark.name for mark in item.iter_markers()]
if "requires_backend" in marker_names and "standalone" not in marker_names:
    # ... health check and skip logic

Registered in pytest.ini. All 14 tests pass with zero skips.

Original prompt

This section details on the original issue you should resolve

<issue_title>test: End-to-end integration test suite for cognitive subsystems</issue_title>
<issue_description>Unit tests exist and pass. Integration tests that exercise the full cognitive pipeline — NLU → KR → inference → NLG — do not exist in a runnable, automated form.

Scope

Design integration test harness covering at minimum: query ingestion, knowledge store write/read, inference chain, context engine round-trip, NLG output

Tests must pass in CI (dependent on infra: GitHub Actions CI pipeline + PR templates #85)

Cover at minimum the subsystems activated in feat: Activate all dormant cognitive subsystems #83

Acceptance criteria

All integration tests pass in CI without manual intervention.

Milestone: v0.4</issue_description>

Comments on the Issue (you are @copilot in this section)
@Steake @copilot Please design and implement an end-to-end integration test suite for GodelOS.
Requirements

Test the full cognitive pipeline: query ingestion (NLU) → knowledge store read/write → inference → context engine → NLG output

Use pytest; tests must pass in CI (dependent on infra: GitHub Actions CI pipeline + PR templates #85)

Cover at minimum: single-query round-trip, knowledge persistence within session, context switching, inference chain with at least one non-trivial reasoning step

Mock external dependencies (spaCy model, Z3) where necessary; do not require them in CI

Tests go in tests/integration/

Acceptance criteria

pytest tests/integration/ passes in CI with no skips on the core pipeline tests.</comment_new>

Custom agent used: Oli
Oliver Chimera Profile

Fixes test: End-to-end integration test suite for cognitive subsystems #94

✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

…→Inference→NLG) Add 14 integration tests in tests/integration/test_cognitive_pipeline.py covering: - Single-query round-trip (NLU → KR write/read → NLG) - Knowledge persistence within session (add/retract/query patterns) - Context switching (discourse manager across multiple utterances) - Inference chain (Socrates syllogism, two-hop reasoning via resolution) - Full pipeline round-trip (NLU → KR → Inference → NLG) All tests mock spaCy at the LexicalAnalyzerParser level so they run in CI without optional native packages. Add `standalone` marker to opt out of the `requires_backend` auto-skip for in-process tests. Co-authored-by: Steake <530040+Steake@users.noreply.github.com>

…lasses The module-level pytestmark already applies the integration marker to all tests in the module. Co-authored-by: Steake <530040+Steake@users.noreply.github.com>

Steake · 2026-03-06T00:00:27Z

Review — LGTM, ready to undraft

645-line integration test suite covering the full cognitive pipeline in-process. The standalone marker + conftest interaction is correctly implemented — tests will not be skipped by the backend-availability check in the root conftest.

The mock strategy (patching spacy.load and the Z3 binary) is correct and consistent with the existing skip guards in the unit suite. The five test classes cover the full NLU → KR → Inference → NLG round-trip at an appropriate level of abstraction.

One note: TestKnowledgePersistence uses in-session persistence only — which is correct given the current in-memory KR store. When #97 (persistent storage) lands, these tests will need a fixture update; a # TODO(#97) comment at that class would help future contributors.

Otherwise clean. Undrafting and marking ready.

Steake · 2026-03-06T00:09:34Z

Review — Integration Test Suite ✅ LGTM, ready to merge

14 tests across 5 classes, all exercising in-process pipeline components without requiring a running backend server. The token construction helpers (_make_token, _build_chase_tokens) are exactly the right approach — they allow NLU pipeline testing without the spaCy dependency, consistent with the skip-guard strategy established in #90 and #92.

Confirms:

pytest.mark.integration and pytest.mark.standalone markers applied correctly
tests/integration/conftest.py suppresses the backend check for standalone tests
pytest.ini updated with the new marker registration

One note: the tests import from godelOS.nlu_nlg.nlu.pipeline and godelOS.inference_engine.coordinator, both of which exist on main after the Phase 2 merges. Confirm no import-time failures on a clean install (i.e., run pytest tests/integration/ -v --tb=short against main + this branch). Given the harness bug on the #100 branch, this is worth verifying in isolation.

This is the most structurally honest PR in the current batch — it tests real pipeline behaviour rather than asserting that awareness levels have reached philosophically convenient thresholds. Merge after #99 (CI) lands so the results are independently verified.

Copilot

Pull request overview

Adds an in-process integration test suite that exercises GödelOS’s cognitive pipeline (NLU → KR → inference → context/discourse → NLG) and introduces a standalone marker to prevent these tests from being skipped when no backend server is running.

Changes:

Added tests/integration/test_cognitive_pipeline.py with 14 integration tests spanning NLU/KR/inference/NLG and discourse context handling.
Added a standalone marker and updated test skip logic so in-process integration tests run without a backend health check.
Registered the new marker in pytest.ini and documented the behavior in tests/integration/conftest.py.