Remove Mechanical Markdown from examples/ validation#977
Remove Mechanical Markdown from examples/ validation#977
Conversation
Signed-off-by: Sergio Herrera <627709+seherv@users.noreply.github.com>
b92a68e to
65fc9ae
Compare
There was a problem hiding this comment.
Pull request overview
This PR migrates example validation away from Mechanical Markdown to pytest-based integration tests, so the examples/ directory is validated via native Python tests that run the examples and assert on their output.
Changes:
- Replaces the
tox -e examplesMechanical Markdown workflow with a newtox -e integrationpytest environment. - Adds
tests/integration/with aDaprRunnerhelper + one pytest file per example to validate expected output. - Updates project docs (
README.md,AGENTS.md,examples/AGENTS.md) and removesexamples/validate.sh.
Reviewed changes
Copilot reviewed 29 out of 29 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tox.ini | Removes Mechanical Markdown example envs and adds integration env running pytest. |
| pyproject.toml | Registers the custom example_dir pytest marker. |
| README.md | Updates developer instructions to run tox -e integration. |
| AGENTS.md | Updates repo contributor guidance to refer to pytest integration tests instead of Mechanical Markdown. |
| examples/AGENTS.md | Rewrites examples validation docs to describe pytest/DaprRunner approach. |
| examples/validate.sh | Removes the Mechanical Markdown validation entrypoint script. |
| CLAUDE.md | Adds repository development guidelines. |
| tests/integration/conftest.py | Introduces DaprRunner, output assertion helper, and dapr fixture/marker handling. |
| tests/integration/test_workflow.py | Adds integration coverage for the workflow examples (task chaining, fan-out/fan-in, simple). |
| tests/integration/test_w3c_tracing.py | Adds integration coverage for w3c tracing example with background receiver. |
| tests/integration/test_state_store.py | Adds integration coverage for state store example output. |
| tests/integration/test_state_store_query.py | Adds integration coverage for state store query example with MongoDB setup. |
| tests/integration/test_secret_store.py | Adds integration coverage for secret store example (with/without ACL config). |
| tests/integration/test_pubsub_streaming.py | Adds integration coverage for streaming pubsub example (subscriber/publisher). |
| tests/integration/test_pubsub_streaming_async.py | Adds integration coverage for async streaming pubsub example (subscriber/publisher). |
| tests/integration/test_pubsub_simple.py | Adds integration coverage for basic pubsub example (subscriber/publisher). |
| tests/integration/test_metadata.py | Adds integration coverage for metadata example output. |
| tests/integration/test_langgraph_checkpointer.py | Adds integration coverage for langgraph checkpointer example (skips if deps missing). |
| tests/integration/test_jobs.py | Adds integration coverage for jobs example (management + processing). |
| tests/integration/test_invoke_simple.py | Adds integration coverage for invoke-simple using direct sidecar HTTP call to avoid infinite loop. |
| tests/integration/test_invoke_custom_data.py | Adds integration coverage for invoke-custom-data (receiver + caller). |
| tests/integration/test_invoke_binding.py | Adds integration coverage for invoke-binding with Kafka docker-compose setup. |
| tests/integration/test_grpc_proxying.py | Adds integration coverage for gRPC proxying example (receiver + caller). |
| tests/integration/test_error_handling.py | Adds integration coverage for error handling example output. |
| tests/integration/test_distributed_lock.py | Adds integration coverage for distributed lock example output. |
| tests/integration/test_demo_actor.py | Adds integration coverage for demo actor example (service + client). |
| tests/integration/test_crypto.py | Adds integration coverage for crypto examples with key generation/cleanup fixture. |
| tests/integration/test_conversation.py | Adds integration coverage for conversation examples output. |
| tests/integration/test_configuration.py | Adds integration coverage for configuration example with Redis seeding + update trigger. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Sergio Herrera <627709+seherv@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 30 out of 30 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Sergio Herrera <627709+seherv@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 30 out of 30 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def mongodb(): | ||
| subprocess.run( | ||
| 'docker run -d --rm -p 27017:27017 --name mongodb mongo:5', | ||
| shell=True, | ||
| check=True, | ||
| capture_output=True, | ||
| ) |
There was a problem hiding this comment.
This fixture starts MongoDB with a fixed container name (mongodb) and fixed host port mapping (27017:27017). That will fail if the container already exists or the port is in use (e.g., developer machine, reruns after a crash); consider generating a unique container name and/or using a random available host port (and wiring it into the example/components config) to reduce flakiness.
| data=req_data, | ||
| headers={'Content-Type': 'application/json'}, | ||
| ) | ||
| with urllib.request.urlopen(req) as resp: |
There was a problem hiding this comment.
urllib.request.urlopen() is used without an explicit timeout. If the sidecar endpoint becomes unreachable or hangs, this test can block until the global socket timeout; pass a small timeout to make failures deterministic.
| with urllib.request.urlopen(req) as resp: | |
| with urllib.request.urlopen(req, timeout=5) as resp: |
| commands_pre = | ||
| pip uninstall -y dapr dapr-ext-grpc dapr-ext-fastapi dapr-ext-langgraph dapr-ext-strands dapr-ext-flask dapr-ext-langgraph dapr-ext-strands | ||
| pip install -e {toxinidir}/ \ | ||
| pip install -r {toxinidir}/dev-requirements.txt \ |
There was a problem hiding this comment.
The uninstall list contains duplicate entries (e.g., dapr-ext-langgraph and dapr-ext-strands appear twice). This makes the environment setup noisier and harder to maintain; please deduplicate the package list.
| def __init__(self, cwd: Path) -> None: | ||
| self._cwd = cwd | ||
| self._bg_process: subprocess.Popen[str] | None = None | ||
| self._bg_output_file: IO[str] | None = None | ||
|
|
||
| def _spawn(self, args: str) -> subprocess.Popen[str]: | ||
| return subprocess.Popen( | ||
| f'dapr run {args}', | ||
| shell=True, | ||
| cwd=self._cwd, | ||
| stdout=subprocess.PIPE, | ||
| stderr=subprocess.STDOUT, | ||
| text=True, | ||
| ) | ||
|
|
||
| @staticmethod | ||
| def _terminate(proc: subprocess.Popen[str]) -> None: | ||
| if proc.poll() is not None: | ||
| return | ||
| proc.send_signal(signal.SIGTERM) | ||
| try: | ||
| proc.wait(timeout=10) | ||
| except subprocess.TimeoutExpired: | ||
| proc.kill() | ||
| proc.wait() | ||
|
|
||
| def run(self, args: str, *, timeout: int = 30, until: list[str] | None = None) -> str: | ||
| """Run a foreground command, block until it finishes, and return output. | ||
|
|
||
| Use this for short-lived processes (e.g. a publisher that exits on its | ||
| own). For long-lived background services, use ``start()``/``stop()``. | ||
|
|
||
| Args: | ||
| args: Arguments passed to ``dapr run``. | ||
| timeout: Maximum seconds to wait before killing the process. | ||
| until: If provided, the process is terminated as soon as every | ||
| string in this list has appeared in the accumulated output. | ||
| """ | ||
| proc = self._spawn(args) | ||
| lines: list[str] = [] | ||
| remaining = set(until) if until else set() | ||
| assert proc.stdout is not None | ||
|
|
||
| # Kill the process if it exceeds the timeout. A background timer is | ||
| # needed because `for line in proc.stdout` blocks indefinitely when | ||
| # the child never exits. | ||
| timer = threading.Timer(interval=timeout, function=proc.kill) | ||
| timer.start() | ||
|
|
||
| try: | ||
| for line in proc.stdout: | ||
| print(line, end='', flush=True) | ||
| lines.append(line) | ||
| if remaining: | ||
| output_so_far = ''.join(lines) | ||
| remaining = {exp for exp in remaining if exp not in output_so_far} | ||
| if not remaining: | ||
| break | ||
| finally: | ||
| timer.cancel() | ||
| self._terminate(proc) | ||
|
|
||
| return ''.join(lines) | ||
|
|
||
| def start(self, args: str, *, wait: int = 5) -> subprocess.Popen[str]: | ||
| """Start a long-lived background service and return the process handle. | ||
|
|
||
| Use this for servers/subscribers that must stay alive while a second | ||
| process runs via ``run()``. Call ``stop()`` to terminate and collect | ||
| output. Stdout is written to a temp file to avoid pipe-buffer deadlocks. | ||
| """ | ||
| output_file = tempfile.NamedTemporaryFile(mode='w+', suffix='.log', delete=False) | ||
| proc = subprocess.Popen( | ||
| f'dapr run {args}', | ||
| shell=True, | ||
| cwd=self._cwd, | ||
| stdout=output_file, | ||
| stderr=subprocess.STDOUT, | ||
| text=True, | ||
| ) | ||
| self._bg_process = proc | ||
| self._bg_output_file = output_file | ||
| time.sleep(wait) | ||
| return proc | ||
|
|
||
| def stop(self, proc: subprocess.Popen[str]) -> str: | ||
| """Stop a background process and return its captured output.""" | ||
| self._terminate(proc) | ||
| self._bg_process = None | ||
| return self._read_and_close_output() | ||
|
|
||
| def cleanup(self) -> None: | ||
| """Stop the background process if still running.""" | ||
| if self._bg_process is not None: | ||
| self._terminate(self._bg_process) | ||
| self._bg_process = None | ||
| self._read_and_close_output() | ||
|
|
There was a problem hiding this comment.
DaprRunner only tracks a single background process/output file via _bg_process/_bg_output_file. If a test (now or in the future) starts more than one background service concurrently, earlier processes can be leaked and their output can be lost or mixed; consider tracking output per process (e.g., a dict from Popen to temp file) and cleaning up all running processes in cleanup().
| def _spawn(self, args: str) -> subprocess.Popen[str]: | ||
| return subprocess.Popen( | ||
| f'dapr run {args}', | ||
| shell=True, | ||
| cwd=self._cwd, | ||
| stdout=subprocess.PIPE, | ||
| stderr=subprocess.STDOUT, | ||
| text=True, | ||
| ) |
There was a problem hiding this comment.
DaprRunner uses shell=True with interpolated strings when spawning subprocesses. Even though current call sites are test-controlled, this makes quoting brittle and can become a command-injection footgun if any arguments ever come from variables/params; prefer passing argv lists with shell=False (and use shlex.split only when necessary).
| def _terminate(proc: subprocess.Popen[str]) -> None: | ||
| if proc.poll() is not None: | ||
| return | ||
| proc.send_signal(signal.SIGTERM) | ||
| try: | ||
| proc.wait(timeout=10) | ||
| except subprocess.TimeoutExpired: | ||
| proc.kill() | ||
| proc.wait() |
There was a problem hiding this comment.
_terminate() sends signal.SIGTERM directly, which is not portable (e.g., limited on Windows) and can behave differently across platforms. Using proc.terminate() (and falling back to kill() after a timeout) is typically more robust for tox/CI environments.
| 'operation': 'create', | ||
| 'data': {'id': n, 'message': 'hello world'}, | ||
| } | ||
| httpx.post('http://localhost:3500/v1.0/bindings/kafkaBinding', json=payload) |
There was a problem hiding this comment.
httpx.post() is called without a timeout and the response is ignored. If the sidecar/binding endpoint hangs or returns a non-2xx, this test may stall or produce misleading failures; set a small timeout and assert success (e.g., check status/raise_for_status).
| httpx.post('http://localhost:3500/v1.0/bindings/kafkaBinding', json=payload) | |
| response = httpx.post( | |
| 'http://localhost:3500/v1.0/bindings/kafkaBinding', | |
| json=payload, | |
| timeout=5.0, | |
| ) | |
| response.raise_for_status() |
|
@seherv pls see latest copilot feedback and build failures :) |
Signed-off-by: Sergio Herrera <627709+seherv@users.noreply.github.com>
Description
Removes all references to Mechanical Markdown and replaces them with pytest files that validate the output generated by the examples.
Tests are still flaky and rely on exact log output, which is not ideal. This will be addressed in a future PR.
Issue reference
We strive to have all PR being opened based on an issue, where the problem or feature have been discussed prior to implementation.
Please reference the issue this PR will close: #972 (first iteration, do not close the issue yet)
Checklist
Please make sure you've completed the relevant tasks for this PR, out of the following list: