
Fix stdout/stderr redirects #25

Open · wants to merge 14 commits into master from kenodegard:fix-outerr-redirects

Conversation

@kenodegard (Contributor) commented Feb 13, 2024

Resolves #24

Changes include:

  • Bump the Ruff version and enable some additional lints (happy to remove these changes if undesired)
  • Add a TDD test showcasing the problematic behavior
  • Call item.ihook.pytest_runtest_call(item=item) instead of directly calling item.runtest() to benefit from pytest's automatic handling of sys.last_type, sys.last_value, and sys.last_traceback; apparently skipping this handling causes stdout/stderr to stay redirected and inaccessible within tests (see https://github.com/pytest-dev/pytest/blob/3b798e54221f1895a983000c7e5bc8afdacd5011/src/_pytest/runner.py#L165-L182). A sketch of this change follows the list.
  • Switch from the pytest_runtest_protocol hook to the pytest_pyfunc_call hook (deeper in the call stack: pytest_pyfunc_call is invoked from runtest, which in turn is called from pytest_runtest_call)
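A minimal sketch of the call change from the third bullet, with the surrounding plugin code elided:

# Sketch: route the test through pytest's hook instead of calling it
# directly, so pytest's handling of sys.last_type / sys.last_value /
# sys.last_traceback applies and stdout/stderr don't stay redirected
# (per the linked runner.py lines).
# Before:
#     item.runtest()
# After:
item.ihook.pytest_runtest_call(item=item)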

@kenodegard marked this pull request as ready for review on February 13, 2024 03:30
@jezdez commented Feb 13, 2024

Nice, thanks @kenodegard!

@codspeed-hq (bot) commented Feb 13, 2024

CodSpeed Performance Report

Merging #25 will not alter performance

Comparing kenodegard:fix-outerr-redirects (26b5d09) with master (90e639b)

Summary

✅ 5 untouched benchmarks

@art049 (Member) left a comment


Thank you, agree entirely, we need this :)
No problem with the ruff bump and config improvement; love it!

LMK what you think about the details

    plugin.lib,
    item.nodeid,
    item.config,
    lambda: ihook.pytest_runtest_call(item=item),
@art049 (Member):

In the current state, this might introduce variance in the code we measure, since we'll also measure the pytest hook.
I think it would be better to do it the other way around: have pytest_runtest_call call _run_with_instrumentation.

A way to do it might be to override runtest, but don't hesitate if you have another idea!
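A rough sketch of that inversion, assuming the plugin's existing _run_with_instrumentation helper and the argument shape from the snippet above:

# Hypothetical: replace item.runtest with an instrumented version so the
# pytest_runtest_call hook machinery stays outside the measured region.
original_runtest = item.runtest

def instrumented_runtest() -> None:
    _run_with_instrumentation(
        plugin.lib, item.nodeid, item.config, original_runtest
    )

item.runtest = instrumented_runtest  # type: ignore[method-assign]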

@kenodegard (Contributor, Author):

After digging some more, I found we can use pytest_pyfunc_call instead of pytest_runtest_protocol (runtest invokes this hook; see https://github.com/pytest-dev/pytest/blob/5bb1363435a8cb3e2010505dbeb1e015c36beed6/src/_pytest/python.py#L1762-L1764).

I got the tests to pass locally, but I may be missing something, so let me know if you don't think this is the right approach.

(Outdated review threads on pyproject.toml and tests/benchmarks/test_print.py were resolved.)
@kenodegard (Contributor, Author) commented

@art049 thanks for running the tests.

Pushed 59ea7ad to resolve the typing failure on Python < 3.10 and 2b026dd to resolve the perf trampoline test failure on Python 3.12.

@kenodegard (Contributor, Author) commented

@art049 anything more to do here?

Commit co-authored by: Edgar Ramírez Mondragón <16805946+edgarrmondragon@users.noreply.github.com>
@art049 (Member) commented Mar 20, 2024

It's a bit annoying that formatter changes and logic changes ended up in the same PR.
I'll maybe try to split things out.

@art049 (Member) left a comment

Since you're touching the core of the plugin, I have some additional feedback, just to make sure the behavior stays the same.

Really appreciate you refactoring the test protocol thing! 🔥

Edit: thanks for splitting the formatting changes into #29!

Comment on lines +241 to +251
if (
    plugin.is_codspeed_enabled
    and plugin.lib is not None
    and plugin.should_measure
):
    return wrap_pyfunc_with_instrumentation(
        plugin.lib,
        self._request.node.nodeid,
        config,
        func,
    )(*args, **kwargs)
@art049 (Member):

When used with the fixture, the fixture payload shouldn't be wrapped again within pyfunc_call, since it's already wrapped by the test itself.

For example, this would probably fail because of the warmup:

# This doesn't have any pytest marker but defines a bench since it's using the fixture
def test_bench(benchmark):
    # ... some preprocessing
    called_once = False

    @benchmark
    def _():
        nonlocal called_once
        if not called_once:
            called_once = True
        else:
            raise Exception("bench code was called twice but actual bench context only once")

@art049 (Member):

Don't hesitate to add that as an additional test.

@kenodegard (Contributor, Author):

I started to get really confused about the intention here, since the above example also fails on master; see #30.

I suspect we need to do something similar to what pytest-rerunfailures does to clear the cached results between the cache-priming run and the instrumented run: https://github.com/pytest-dev/pytest-rerunfailures/blob/a53b9344c0d7a491a3cc53d91c7319696651d21b/src/pytest_rerunfailures.py#L565-L567
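A hypothetical sketch of that idea; _initrequest() is a private pytest internal (pytest-rerunfailures leans on similar internals), so this is illustrative rather than a drop-in fix:

def _reset_cached_fixtures(item: pytest.Function) -> None:
    # Rebuild the item's fixture request so the instrumented run re-executes
    # fixture setup instead of reusing results cached by the priming run.
    item._initrequest()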

Comment on lines +210 to +226
@pytest.hookimpl(hookwrapper=True)
def pytest_pyfunc_call(pyfuncitem: pytest.Function) -> Iterator[None]:
    plugin = get_plugin(pyfuncitem.config)
    if (
        plugin.is_codspeed_enabled
        and should_benchmark_item(pyfuncitem)
        and not has_benchmark_fixture(pyfuncitem)
    ):
        plugin.benchmark_count += 1
        if plugin.lib is not None and plugin.should_measure:
            pyfuncitem.obj = wrap_pyfunc_with_instrumentation(
                plugin.lib,
                pyfuncitem.nodeid,
                pyfuncitem.config,
                pyfuncitem.obj,
            )
    yield
@art049 (Member):

Love this refactor!
However, when a benchmark is defined from a fixture, we should still perform the warmup at this level and not in the benchmark fixture (see the other comment I left on it).

Comment on lines +191 to +193
if SUPPORTS_PERF_TRAMPOLINE:
    # Warmup CPython performance map cache
    __codspeed_root_frame__()
@art049 (Member):

I think the warmup shouldn't be included in the wrapper but handled at the protocol level
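A rough sketch of that suggestion, reusing names from the diff above; _call_test_body is hypothetical shorthand for invoking the test function with its resolved fixture arguments:

@pytest.hookimpl(hookwrapper=True)
def pytest_pyfunc_call(pyfuncitem: pytest.Function) -> Iterator[None]:
    if SUPPORTS_PERF_TRAMPOLINE and should_benchmark_item(pyfuncitem):
        # Warm up CPython's performance map cache at the protocol level,
        # so the instrumented wrapper measures only the real run.
        _call_test_body(pyfuncitem)  # hypothetical helper
    yield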

@kenodegard (Contributor, Author):

Ah, this also explains the issue we ran into ourselves when trying to use tmp_path in benchmark tests, e.g.:

@pytest.mark.benchmark
def test_tmp_path_benchmark(tmp_path: Path):
    tmp_path.mkdir()

Comment on lines +195 to +202
lib.zero_stats()
lib.start_instrumentation()
try:
    return __codspeed_root_frame__()
finally:
    lib.stop_instrumentation()
    uri = get_git_relative_uri(nodeid, config.rootpath)
    lib.dump_stats_at(uri.encode("ascii"))
@art049 (Member):

Since you introduced this try block, can you add a test bench raising an exception?
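A minimal sketch of such a test using pytest's built-in pytester fixture; the exact harness in this repo's suite may differ:

def test_bench_raising_exception(pytester: pytest.Pytester) -> None:
    pytester.makepyfile(
        """
        import pytest

        @pytest.mark.benchmark
        def test_raises():
            raise RuntimeError("boom")
        """
    )
    result = pytester.runpytest("--codspeed")
    # The exception should surface as a normal failure; the finally block
    # above should still stop instrumentation and dump stats.
    result.assert_outcomes(failed=1)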

Merging this pull request may close: "pytest-codspeed doesn't play nice with stdout/stderr" (#24)

4 participants