Extended capabilities round 2 + cookbook + integration tests by JE-Chen · Pull Request #86 · Integration-Automation/WebRunner

JE-Chen · 2026-04-26T09:11:37Z

Summary

46 commits since PR #85. Highlights:

Latest waves

Action JSON formatter (canonical kwarg order), Markdown → action JSON transpiler, failure clustering (normalised signatures), synthetic monitoring (edge-triggered alerts), Storybook integration (discovery + visual snapshots), shadow DOM auto-pierce, OTLP exporter for Jaeger/Tempo.
Driver pinner (cache geckodriver/chromedriver, dodge GitHub rate limit), Selenium → Playwright translator, form auto-fill, workspace bootstrapper, a11y diff, fan-out task runner, extension harness, file-backed event bus.
CDP message tap (record/replay), cross-browser parity, Page Object codegen, state diff, workspace lock file, a11y trend dashboard, perf P95 drift detector.
Process supervisor, multi-stage pipeline DSL, regex test selector, Appium mobile gestures, coverage map.

Optimisation pass

7 pytest collection warnings cleared (__test__ = False on TestObject / TestRecord / etc).
workspace_lock distribution walk cached → 4 tests dropped from 0.3s to <0.05s each.
socket_server quit tests went from 2.42s → 0.49s via threading.Event + tighter poll_interval.
New driver_dispatch module collapses three Selenium-or-Playwright dispatch sites into one.
Suite: 15.41s → 11.57s (-25%).
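
The collection-warning fix relies on pytest's `__test__` attribute. A minimal sketch of the idea follows; the fields on the class are hypothetical and only the class name is taken from the list above.

```python
from dataclasses import dataclass


@dataclass
class TestRecord:
    """Domain object, not a test case (the fields are illustrative only)."""

    # Tell pytest not to collect this class even though its name starts with "Test".
    __test__ = False

    name: str
    passed: bool


record = TestRecord(name="login_flow", passed=True)
```

Because `__test__` has no type annotation, the dataclass machinery treats it as a plain class attribute rather than a field.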

API façade + docs

je_web_runner.api.{authoring, debugging, frontend, infra, mobile, networking, observability, quality, reliability, security, test_data} re-exports new helpers thematically.
docs/source/conf.py gains sphinx.ext.autodoc + autosummary + napoleon with mock imports for soft deps.
New Eng/doc/api_reference/api_reference.rst driving recursive per-module page generation.
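
A conf.py fragment along these lines would enable the three extensions. This is a sketch, not the project's actual file: the option names are standard Sphinx settings, but the exact module names in the mock list are assumptions based on the soft deps mentioned in this PR.

```python
# docs/source/conf.py (fragment; the mock list below is an assumption)
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
    "sphinx.ext.napoleon",
]

# Let autosummary generate the per-module stub pages on each build.
autosummary_generate = True

# Soft dependencies that are not installed in the docs build environment.
autodoc_mock_imports = [
    "selenium",
    "playwright",
    "appium",
    "PIL",
    "locust",
    "opentelemetry",
    "testcontainers",
]
```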

Real-browser smoke + cookbook

test/e2e_test/ with conftest.py that detects Selenium Grid availability and skips cleanly.
.github/workflows/e2e_browser.yml boots selenium/hub:4.20.0 + selenium/node-chrome and runs daily / on demand.
examples/ cookbook: counting_stars.{py,json}, google_search.py, form_submit.py, smart_wait_demo.py, pii_redact_demo.py, fanout_demo.py, quick_smoke.json.

Bugs found by actually running the project

webdriver_wrapper.execute_script swallowed return values → fixed (caught by JSON cookbook smoke).
LSP Content-Length framing corrupted on Windows due to text-mode \n → \r\n translation → __main__.py now sys.stdout.reconfigure(newline="") (caught by integration subprocess test).
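
To illustrate why newline translation breaks the framing, here is a small sketch of LSP message framing. The PR's actual fix is `sys.stdout.reconfigure(newline="")`; this sketch shows a different, commonly used approach (writing raw bytes to `sys.stdout.buffer`), and the function names are hypothetical.

```python
import json
import sys


def frame_message(payload: dict) -> bytes:
    """Encode one JSON-RPC message with LSP Content-Length framing."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    # The header must end with exactly \r\n\r\n; text-mode stdout on Windows
    # would turn each \n into \r\n and yield \r\r\n.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def write_message(payload: dict) -> None:
    """Write a framed message as raw bytes so no newline translation occurs."""
    sys.stdout.buffer.write(frame_message(payload))
    sys.stdout.buffer.flush()


framed = frame_message({"jsonrpc": "2.0", "id": 1, "result": None})
```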

New: WR_sleep

Native time.sleep action so JSON pipelines no longer need WR_execute_async_script + setTimeout to pace themselves.
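
A rough sketch of the kind of validation such a command needs is shown below. The function name and error types are assumptions for illustration, not the executor's real handler.

```python
import time
from numbers import Real


def wr_sleep(seconds):
    """Pause for `seconds`; hypothetical stand-in for the WR_sleep handler."""
    # bool is a subclass of int, so reject it explicitly before the numeric check.
    if isinstance(seconds, bool) or not isinstance(seconds, Real):
        raise TypeError(f"seconds must be a number, got {type(seconds).__name__}")
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    time.sleep(seconds)
    return seconds
```

Rejecting `True` / `False` up front avoids a JSON `true` silently becoming a one-second pause.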

Comprehensive integration tests

test/integration_test/ — 30 tests across 10 files, each wiring 2+ modules with real I/O (in-memory SQLite, in-process HTTP servers, real subprocess for MCP / LSP).
Wired into test_dev.yml + test_stable.yml right after the unit-test step.

Numbers

Tests: 791 (baseline from PR #85, "Extended capabilities batch + MCP server") → 1230 passing (1200 unit + 30 integration).
0 collection warnings (was 7).
Unit suite: ~12s; combined unit + integration: ~18s.

Test plan

Unit + integration green on Python 3.10 / 3.11 / 3.12 / 3.13.
E2E daily workflow runs against Selenium Grid.
PyPI publish workflow fires on merge (auto patch bump + tag + release).
Sphinx build picks up the new autosummary tree without errors.

…tion

Three concrete wins with no behaviour change:

- Pytest collection warnings (7 -> 0): mark TestObject / TestObjectRecord / TestRecord / TestRailError / TestcontainersError with __test__ = False so pytest stops trying to collect domain / exception classes whose name happens to start with "Test".
- workspace_lock dist-walk caching: importlib.metadata.distributions() was being walked every call; the result is now memoised behind reset_distribution_cache() so per-test setup drops from ~0.3s to <0.05s.
- socket_server tests (2.42s -> 0.49s): expose a threading.Event on the TCP server so callers can wait for shutdown without polling, and pass poll_interval=0.02 to serve_forever from the test helper so shutdown() itself returns within ~20ms instead of the stdlib default 500ms.

Plus shared driver_dispatch.{evaluate_expression, run_script} that collapses three independent Selenium-or-Playwright dispatch sites (memory_leak / csp_reporter / smart_wait) into one module. The shared helper has its own unit tests covering both backends.

Net: 7 warnings cleared, suite 15.41s -> 11.57s (-25%), 1174 -> 1184 tests.

Also gitignore the local issues.json / hotspots.json / codacy.json artefacts that the SonarCloud/Codacy curl helpers drop into the repo.
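
The socket_server change can be sketched with the standard library alone. The class and method names below are hypothetical; only `threading.Event`, `serve_forever(poll_interval=...)` and `shutdown()` come from the description above.

```python
import socketserver
import threading


class StoppableTCPServer(socketserver.TCPServer):
    """TCP server that signals when its serve loop has exited (hypothetical name)."""

    allow_reuse_address = True

    def __init__(self, address, handler):
        super().__init__(address, handler)
        self.stopped = threading.Event()

    def serve_until_shutdown(self, poll_interval=0.02):
        # A short poll_interval lets shutdown() return quickly; the stdlib default is 0.5s.
        try:
            self.serve_forever(poll_interval=poll_interval)
        finally:
            self.stopped.set()


class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        self.request.sendall(self.request.recv(1024))


server = StoppableTCPServer(("127.0.0.1", 0), EchoHandler)
worker = threading.Thread(target=server.serve_until_shutdown, daemon=True)
worker.start()

server.shutdown()  # blocks until serve_forever exits
stopped_in_time = server.stopped.wait(timeout=2.0)  # wait on the event, no polling loop
server.server_close()
```

Callers that previously looped on a flag with `time.sleep` can block on `stopped.wait()` instead, which is where most of the saved wall-clock time would come from.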

(b) je_web_runner.api thematic façade

Group the 50+ helpers added in recent waves into 11 themed submodules so callers can ``from je_web_runner.api import quality, observability`` instead of memorising deep import paths. Themes: authoring / debugging / frontend / infra / mobile / networking / observability / quality / reliability / security / test_data. 9-test smoke suite covers __all__ resolvability + duplicates so the façade can't silently drift from the underlying modules.

(a) Real-browser E2E scaffold

Add test/e2e_test/ with conftest.py that detects the Selenium Grid socket and skips cleanly when unreachable. Initial smoke tests cover smart_wait fetch idle / SPA route stable, state_diff round trip, memory_leak heap probe, csp_reporter empty collect, and shadow_pierce open-shadow walk. GitHub Actions e2e_browser.yml runs them daily / on demand against selenium/hub:4.20.0 + selenium/node-chrome via service containers. Local run: ``cd docker && docker compose up -d``, then ``WEBRUNNER_E2E_HUB=http://localhost:4444/wd/hub pytest test/e2e_test/``.

(c) Sphinx autodoc + autosummary

conf.py gains sphinx.ext.autodoc / autosummary / napoleon plus a mock-imports list for the soft deps that aren't part of the docs build (selenium / playwright / appium / Pillow / locust / OTel / testcontainers / etc). New api_reference.rst drives autosummary's recursive per-module reference page generation; wired into Eng/eng_index.rst so ReadTheDocs picks it up.

Tests: 1184 -> 1193 (added 9 façade smoke tests). E2E suite skips cleanly without a Grid; the unit critical path stays at 12.7s.
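
The Grid detection described in (a) could look roughly like the probe below. This is a sketch under assumptions: the helper name is hypothetical, and only the WEBRUNNER_E2E_HUB variable and the default hub URL are taken from the commit message.

```python
import os
import socket
from urllib.parse import urlparse

DEFAULT_HUB = "http://localhost:4444/wd/hub"


def grid_reachable(hub_url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the hub's host and port succeeds."""
    parsed = urlparse(hub_url)
    host = parsed.hostname or "localhost"
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


hub_url = os.environ.get("WEBRUNNER_E2E_HUB", DEFAULT_HUB)
# In a conftest.py this flag would typically feed pytest.skip(...) or a skipif marker.
should_skip = not grid_reachable(hub_url, timeout=0.5)
```

A plain TCP probe only shows that something is listening; it does not confirm the Grid has a ready Chrome node, so the tests may still need their own readiness wait.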

The Python version (examples/counting_stars.py) and the equivalent action JSON (examples/counting_stars.json) drive Chrome through:

- launching with --autoplay-policy=no-user-gesture-required
- navigating to the regular YouTube watch URL
- dismissing the EU consent banner if present
- forcing video.play() to bypass any remaining autoplay gate
- polling the .ytp-skip-ad-button / .ytp-ad-skip-button selectors for up to 30 seconds when a pre-roll ad is showing
- holding the window open for 90 seconds via execute_async_script's setTimeout (the executor has no native sleep command, so the JSON version sets a 120s script timeout and uses an async setTimeout)

Run:
    python examples/counting_stars.py
or
    python -m je_web_runner -e examples/counting_stars.json

WR_sleep executor command:

Adds time.sleep wrapper to action_executor with type / non-negative validation. Replaces the awkward ``WR_execute_async_script + setTimeout(callback, ms)`` pattern that the demos previously needed. 7 unit tests cover zero-second / short / negative / non-numeric / bool-rejection / executor-registration paths. examples/counting_stars.json now uses WR_sleep verbatim.

Bug: webdriver_wrapper.execute_script swallowed return values

The wrapper called ``self.current_webdriver.execute_script(...)`` but never returned the result, so every WR_execute_script in an action JSON resolved to None, making any "read DOM into a variable" pattern unusable. The demo run revealed this immediately. Now returns the value (and None on caught exception, matching the rest of the wrapper).

Cookbook examples (examples/):

- counting_stars.json: uses WR_sleep instead of fake setTimeout
- quick_smoke.json: minimal sanity check
- google_search.py: search + read first result heading
- form_submit.py: fill httpbin /forms/post; pairs with form_autofill + state_diff helpers
- smart_wait_demo.py: fetch idle + SPA route stable + memory probe
- fanout_demo.py: parallel HTTP preflights via run_fan_out
- pii_redact_demo.py: pure-logic scan_text / redact_text demo

Each was run end-to-end against real Chrome (or network for fanout) before commit; form_submit revealed httpbin's submit button has no type=submit attribute, fixed by switching to form.submit().

Tests: 1193 -> 1200, suite still ~13s.
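
The execute_script bug is easy to reproduce with a stub driver. The classes below are simplified, hypothetical stand-ins, not the project's actual wrapper; they only show the effect of the missing `return`.

```python
class FakeDriver:
    """Stand-in for a Selenium WebDriver; only execute_script is modelled."""

    def execute_script(self, script, *args):
        if "throw" in script:
            raise RuntimeError("script failed")
        return {"script": script, "args": args}


class WebDriverWrapper:
    """Hypothetical, simplified wrapper; the real class has many more methods."""

    def __init__(self, driver):
        self.current_webdriver = driver

    def execute_script(self, script, *args):
        try:
            # The bug was a missing `return` here, so callers always got None.
            return self.current_webdriver.execute_script(script, *args)
        except Exception as error:  # report the failure and yield None
            print(f"execute_script failed: {error!r}")
            return None


wrapper = WebDriverWrapper(FakeDriver())
title = wrapper.execute_script("return document.title")
failed = wrapper.execute_script("throw new Error('x')")
```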

test/integration_test/ wires 2+ modules together with real I/O, with no mocks where actual file / socket / subprocess exercise is feasible:

- test_authoring_pipeline: md_authoring → action_formatter → action_linter → JSON byte-stable round trip + legacy alias detect
- test_db_fixtures_sqlite: load_into_connection on a real in-memory SQLite + truncate + identifier validation safety net
- test_har_replay_roundtrip: HarReplayServer + urllib + GraphQLClient hit the live HTTP server (literal/glob/regex matchers)
- test_mock_services_roundtrip: MockOAuthServer → bearer token → HAR API, plus MockS3Storage round trip
- test_mcp_subprocess: spawn ``python -m je_web_runner.mcp_server`` and walk initialize → tools/list → tools/call → shutdown over real stdio JSON-RPC
- test_action_lsp_subprocess: spawn ``python -m je_web_runner.action_lsp`` and walk initialize → didOpen → publishDiagnostics with proper LSP Content-Length framing
- test_test_selection_pipeline: coverage_map + impact_analysis + diff_shard fed the same action-tree, asserting they agree
- test_bootstrap_pipeline: init_workspace → format → lint → schema sanity
- test_trend_pipelines: run_ledger.record_run → trend_dashboard + a11y_trend.aggregate_history end-to-end
- test_live_dashboard_roundtrip: dashboard /records endpoint exercise + VisualReviewServer accept-baseline workflow

The LSP subprocess test caught a real Windows bug: ``python -m je_web_runner.action_lsp`` ran sys.stdout in text mode, so ``\n`` in the LSP framing got translated to ``\r\n``, producing ``\r\r\n`` boundaries that no LSP client can parse. Fixed in __main__.py via ``sys.stdout.reconfigure(newline="")`` so the ``Content-Length`` framing survives.

CI: test_dev.yml + test_stable.yml gain a step that runs the integration suite right after the unit suite (60s timeout, same job).

Tests: 1200 unit + 30 integration = 1230 passing.

codacy-production · 2026-04-26T09:12:40Z

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

🟢 Metrics 1883 complexity · 19 duplication


CI fix:

- Drop --timeout=60 from the integration-test workflow step; pytest-timeout isn't a dev dep, so the flag was breaking the CI run. Each subprocess test sets its own subprocess.communicate(timeout=...) anyway.

Codacy / Bandit:

- B110 (try/except/pass) on best-effort cleanup paths in examples/* and test/e2e_test/conftest.py annotated with `# nosec B110` + reason.
- B112 (try/except/continue) on workspace_lock dist scan and the google_search.py selector probes: log via web_runner_logger.debug and `# nosec B112` so silently-skipped errors are still observable.
- B202 (tarfile.extractall): extract_archive already validates members against the destination root via _safe_extract_zip / _safe_extract_tar; added matching ZipFile validator for symmetry and `# nosec B202` on the actual extractall calls.
- B101 (assert): pytest-style assertions in test/e2e_test and test/integration_test marked `# nosec B101` per line.
- 11 unused imports across new modules trimmed.

SonarCloud:

- S5754 broaden-except → narrow Exception in process_supervisor.with_watchdog with a comment about why KeyboardInterrupt / SystemExit must propagate.
- S3358 nested ternaries in perf_drift collapsed into _direction_for helper.
- S6353 [^A-Za-z0-9_]+ → \W+ in pom_codegen.
- S3457 f-string without placeholders fixed in pom_codegen.
- S1172 unused params (cdp_tap.execute_cdp_cmd, action_lsp._completion) renamed to _params with a comment about preserving the public signature.
- S7500 dict / list comprehensions of generator-throw idiom replaced with proper helper functions in test_otlp_exporter / test_synthetic_monitoring.
- S7504 unnecessary list() preserved in bidi_backend with a NOSONAR comment because removing it breaks RuntimeError-during-iteration safety.
- S5843 timestamp regex split into named pieces; _PATH_RE bounded ([\w.\-]{1,80}) so polynomial backtracking can't escape its budget.
- S5852 hotspots in md_authoring tightened to greedy \S.* / bounded template name pattern.
- S5869 dup char class in _TEMPLATE_RE removed.
- S5906 assert_isinstance / assert_true switches in test_api_facade, test_failure_cluster, test_event_bus.
- S2068 `password` literal annotated as fixture.
- S125 commented-code false-positive in test_driver_pin reworded.
- S1192 dup "text/plain" literal in visual_review extracted to _TEXT_PLAIN.
- S5131 reflected user input in har_replay's 404 payload pinned to application/json + X-Content-Type-Options nosniff so any echoed path fragment can't be interpreted as HTML.
- S4144 duplicate do_PUT / do_PATCH bodies aliased to do_POST.
- S3776 cog complexity in storybook.discover_stories refactored into _entries_map + _story_from_entry helpers.

Tests still 1200 unit + 30 integration green.
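
The member validation mentioned under B202 can be sketched as a path-containment check. The helper names are hypothetical and this is not the project's _safe_extract_tar; a complete validator would also inspect symlink and hard-link targets, which this sketch does not.

```python
import io
import os
import tarfile
import tempfile


def unsafe_members(archive: tarfile.TarFile, dest_root: str) -> list:
    """Return names of members that would resolve outside dest_root."""
    root = os.path.realpath(dest_root)
    bad = []
    for member in archive.getmembers():
        target = os.path.realpath(os.path.join(root, member.name))
        if os.path.commonpath([root, target]) != root:
            bad.append(member.name)
    return bad


def build_archive(names):
    """Build an in-memory tar of empty files, for demonstration only."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in names:
            archive.addfile(tarfile.TarInfo(name))
    buffer.seek(0)
    return tarfile.open(fileobj=buffer, mode="r")


dest = os.path.join(tempfile.gettempdir(), "driver-cache")
with build_archive(["bin/chromedriver", "../evil.sh"]) as archive:
    rejected = unsafe_members(archive, dest)
```

Extraction would proceed only when the returned list is empty.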

CI fix:

- The integration subprocess tests were failing with 'Popen object has no attribute _fileobj2output' because the finally block called proc.communicate() a second time after the try block had already consumed the streams. Wrap the cleanup communicate() in try/except + nosec B110 so the harmless double-call no longer fails.

Codacy:

- pom_codegen.py: Dict was removed from the typing import last round but is still used on a class attribute; restore it (F821).
- failure_cluster._PATH_RE: anchor a nosemgrep marker so the bounded pattern (every quantifier capped at {1,80}/{1,40}) stops being flagged.

SonarCloud hotspots:

- S5852 md_authoring _BULLET_RE / _TEMPLATE_RE: tightened the template pattern to ``[A-Za-z_][A-Za-z0-9_-]{0,80}`` and anchored NOSONAR on the bullet capture.
- S5332 fixtures: ftp:// in test_driver_pin and http:// in test_storybook_visual_snapshots annotated as deliberate test fixtures.
- S4828 process_supervisor.os.kill(pid, 9): NOSONAR with explanation; pid list is filtered by KNOWN_DRIVER_NAMES and excludes os.getpid().
- S5042 driver_pin._extract_archive: NOSONAR; both branches route through _safe_extract_* helpers that pre-validate members.
- S1313 test_pii_scanner: 192.168.0.1 RFC1918 fixture annotated.

Tests still 1230 green.
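
As a small illustration of the bounded template-name pattern quoted above: the character classes and the {0,80} cap are copied from the commit message, while the helper around them is hypothetical.

```python
import re

# Pattern quoted from the commit message; at most 1 + 80 characters can match.
TEMPLATE_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,80}")


def is_template_name(text: str) -> bool:
    """True when the whole string is a valid, length-bounded template name."""
    return TEMPLATE_NAME_RE.fullmatch(text) is not None
```

With an explicit upper bound on the repetition, the amount of work the regex engine can do on a hostile input stays proportional to that bound.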

CI fix:

- The subprocess integration tests (MCP / LSP) were failing with ``ValueError: I/O operation on closed file`` because we wrote to proc.stdin manually, called proc.stdin.close(), then immediately invoked proc.communicate(); communicate() then tried to use the closed stdin reference. Replace the pattern with a single ``communicate(input=payload, timeout=...)`` call (it auto-closes stdin) and route fallback drains through a try/except.

SonarCloud:

- S7632 suppression-comment syntax: NOSONAR markers had been on preceding-line comments rather than the violation lines. Anchored them on the actual flagged line in driver_pin._extract_archive, failure_cluster._PATH_RE, md_authoring._BULLET_RE / _TEMPLATE_RE, and process_supervisor.os.kill().
- S5869 duplicate char class: drop the explicit ``A-Za-z0-9`` ranges in _TEMPLATE_RE and use ``\w`` so SonarCloud sees no duplicate.
- S5131 reflected user input: har_replay's 404 envelope now passes the echoed method / path through ``_safe_echo()`` which strips anything outside the URI grammar allow-list, so a hostile request can't smuggle HTML / control bytes into the response (defence in depth on top of the JSON envelope + nosniff header).
- S3776 cog complexity refactors:
  - pipeline.load_pipeline: 26 → split into _coerce_pipeline_document / _load_pipeline_from_text / _parse_stage helpers.
  - coverage_map.build_coverage_map: 17 → extracted _load_action_list + _routes_in iterator.

Tests still 1230 green.
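
The single-call pattern can be sketched as follows. The child process here is a trivial echo script standing in for the MCP / LSP server, and the payload is illustrative only.

```python
import json
import subprocess
import sys

# Child process that echoes stdin back; a stand-in for the MCP / LSP server.
CHILD = "import sys; sys.stdout.write(sys.stdin.read())"

payload = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "initialize"}) + "\n"

proc = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)
try:
    # communicate() writes the input, closes stdin, and drains both pipes once.
    stdout, stderr = proc.communicate(input=payload, timeout=10)
except subprocess.TimeoutExpired:
    proc.kill()
    # After a kill, calling communicate() again is the documented way to drain.
    stdout, stderr = proc.communicate()

reply = json.loads(stdout)
```

Leaving stdin handling entirely to `communicate()` avoids both failure modes described in these commits: the closed-file error and the second read of already-consumed streams.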

SonarCloud cleanup:

- S1313 in test_pii_scanner: NOSONAR moved from preceding-line comment onto the redact_text call line and the assertNotIn line.
- S5042 in driver_pin: NOSONAR anchored on the ``with tarfile.open(...)`` line instead of the helper docstring above it.
- S5131 in har_replay: NOSONAR anchored on the ``self.wfile.write(payload)`` line so SonarCloud sees the suppression at the violation site (the payload is already strip-sanitised by ``_safe_echo``).
- S5869 in md_authoring: combined the suppression onto the same line as ``_TEMPLATE_RE``.
- S7504 in bidi_backend.unsubscribe_all: hoist the list() snapshot into a named ``snapshot`` local with the NOSONAR anchored on that line so the marker isn't on a comment.

Cognitive complexity refactors (S3776):

- fanout.run_fan_out: split task parsing into _parse_tasks and result collection into _collect_results.
- browser_pool.checkout: extract _acquire_session that linearises the get_nowait → grow → wait branches.
- visual_review do_GET: move the /img/* handler into _serve_image.
- a11y_trend.aggregate_history: split per-entry / per-violation logic into _absorb_entry / _count_violation.
- storybook.visual_snapshots.capture_story_snapshots: move the per-story capture+compare body into _snapshot_story.
- examples/counting_stars.py main: split into _force_play / _await_ad_clear / _wait_out_unskippable_ad / _navigate_and_play.

Tests still 1230 green.
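
The reason the list() snapshot in unsubscribe_all is not actually redundant can be shown with a toy registry. The class below is hypothetical and much smaller than bidi_backend; it only demonstrates the iteration hazard.

```python
class EventBridge:
    """Hypothetical subscription registry illustrating the list() snapshot."""

    def __init__(self):
        self._subscriptions = {}

    def subscribe(self, name, callback):
        self._subscriptions[name] = callback

    def unsubscribe(self, name):
        self._subscriptions.pop(name, None)

    def unsubscribe_all(self):
        # Copy the keys first: unsubscribe() mutates the dict, and iterating the
        # live dict would raise "dictionary changed size during iteration".
        snapshot = list(self._subscriptions)
        for name in snapshot:
            self.unsubscribe(name)
        return len(snapshot)


bridge = EventBridge()
bridge.subscribe("console", print)
bridge.subscribe("network", print)
removed = bridge.unsubscribe_all()
```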

sonarqubecloud · 2026-04-26T10:21:51Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code


JE-Chen added 30 commits April 26, 2026 14:32

Add unified WebDriver BiDi event bridge for Selenium and Playwright

bdb3058

Add browser pool with warm sessions and recycle policy

96a3bcb

Add HAR replay server for offline-deterministic e2e tests

99d062c

Add local visual diff review web UI with accept-baseline action

fb68d4b

Add PII / privacy scanner with email/phone/card/SSN/Taiwan-ID/IPv4 detectors

491ec99

Add test impact analysis (action JSON -> locator/url/template index)

62a8734

Add Action JSON LSP server with command completion + lint diagnostics

ebf8a47

Document new wave (BiDi / browser pool / HAR replay / PII / visual review / impact analysis / LSP)

18182f7

Add driver version pinner with local cache (bypasses GitHub rate limit)

98b1aa1

Add Selenium -> Playwright migration helper (Python source + action JSON)

36661ac

Add heuristic form auto-fill (label/placeholder/name -> fixture key)

3528140

Add workspace bootstrapper for new-project scaffolding

04b2c2a

Add accessibility violations diff (added / resolved / persisting)

8b3d8d3

Add fan-out task runner for parallel WR_* execution within a test

13ef6b0

Add browser extension test harness (manifest parser + Selenium/Playwright glue)

b93cea2

Add file-backed event bus for cross-shard pub/sub coordination

37505e6

Document latest wave (driver pinner / sel-to-pw / autofill / bootstrapper / a11y diff / fanout / extension / event bus)

fc47966

Add deterministic action JSON formatter (canonical kwarg order)

c7a9202

Add Markdown -> action JSON authoring transpiler

ab01c6a

Add test failure clustering with normalised error signatures

5c1c7db

Add synthetic monitoring loop with edge-triggered alerts

5d78133

Add Storybook discovery + per-story action plan generator

74e715f

Add recursive shadow-DOM piercing helper for Selenium / Playwright

3c1b420

Add OTLP span exporter integration for Jaeger / Tempo backends

06d76b3

Document newest wave (formatter / md authoring / clustering / synthetic / OTLP / storybook / shadow pierce)

deeffc2

Add CDP message tap with record / replay for offline debugging

7727b08

Add cross-browser parity diffing (title / DOM / console / network / screenshot)

72a6997

Add Page Object codegen from HTML snapshots

aec840b

Add browser state diff (cookies + localStorage + sessionStorage)

a1c8ad3

Add workspace lock file (Python deps + drivers + Playwright versions)

66e8b4c

JE-Chen added 16 commits April 26, 2026 15:34

Add accessibility violations trend dashboard with daily SVG chart

12176f9

Add perf P95 baseline drift detector with sliding-window tolerance

e04dbc6

Document final wave (CDP tap / cross-browser / state diff / POM codegen / lock / a11y trend / perf drift)

af44a05

Remove unused legacy test/unit_test/create_project_test directory

ca168c6

The 3-line script was a side-effect-on-cwd standalone runner, never referenced by either CI workflow. The proper pytest coverage already lives at test/unit_test/test_create_project.py.

Add regex-based test name selector for include/exclude filtering

6180832

Add process supervisor for orphan webdrivers + wall-clock watchdog

6ed36f2

Add multi-stage pipeline DSL with conditional gates

54973f4

Add Storybook visual snapshots wired to baseline comparison

8ee11b9

Add Appium mobile gesture helpers (swipe / scroll / pinch / long-press / double-tap)

4fef5a6

Add coverage map (action JSON files -> URL routes index)

8a9ad42

Document polish wave (regex selector / supervisor / pipeline / storybook snapshots / appium gestures / coverage map)

6993324

JE-Chen added 4 commits April 26, 2026 17:33

JE-Chen merged commit 266b0e1 into main Apr 26, 2026
18 checks passed

This was referenced Apr 26, 2026

Static-analysis cleanup, integration tests, cookbook + WR_sleep + API façade #87

Merged

MCP server refresh + bug fixes since PR #87 #88

Merged

Merge dev into main: Python 3.14 CI, MCP expansion, demos & cookbook #90

Merged

JE-Chen merged 50 commits into main from dev
