[FEATURE] Fail when specified status code has not been returned #1147

Open
Lakitna opened this issue May 3, 2021 · 15 comments
@Lakitna

Lakitna commented May 3, 2021

Is your feature request related to a problem? Please describe.

When you test an endpoint that has not yet been implemented, the test passes even though every response had a 404 status code. I would like these situations to fail the test instead.

For maximum flexibility in this regard, I would like to be able to define one or more status codes that must be returned at least once.

Describe the solution you'd like

A Schemathesis option like expected_status_codes=(200, 201, etc.), or one with flexible matching like expected_status_code='2xx'

Describe alternatives you've considered

I've written a decorator that does this, but there are a few iffy things about it:

  • I can only emulate an 'after all' hook by counting the number of cases that have been tested. I need this hook to assert that the expected status code has been returned at least once.
  • Hypothesis tries to shrink the last case when I throw an error, but the case can't be shrunk. This leads to very long stack traces (see the settings sketch after this list).
  • I have to return the response in the test function. Otherwise, I don't know what the status code is.
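
One way to avoid those long shrink traces is to exclude the shrink phase entirely via standard Hypothesis settings. A minimal sketch (the no_shrink name is just for illustration; the decorator usage refers to the workaround shown further below):

from hypothesis import Phase, settings

# Reusable settings object with every regular phase except Phase.shrink, so Hypothesis
# reports the failing "after all cases" assertion as-is instead of trying to shrink it.
no_shrink = settings(
    max_examples=50,
    phases=[Phase.explicit, Phase.reuse, Phase.generate, Phase.target],
)

# Usage sketch:
#
#   @schema.parametrize(endpoint="/my-endpoint$")
#   @no_shrink
#   @assert_status_code_returned(200, max_examples=50)
#   async def test_my_endpoint(case):
#       ...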

Additional context

My current workaround is below. I know it is not written amazingly; that's why I would like Schemathesis to support this.

# Test utils file

import functools
from typing import Dict

import schemathesis
from schemathesis.types import GenericTest


def assert_status_code_returned(*expected_codes: int, max_examples: int):
    """
    Test decorator to ensure that the expected status code has been returned.

    Requires you to set the max_examples. It won't know what the last test is otherwise.
    """

    def decorator(func: GenericTest):
        # Before all test cases
        returned_status_codes: Dict = {}
        execution_count = 0
        exception = None

        @functools.wraps(func)
        async def wrapper(case: schemathesis.models.Case, *args, **kwargs):
            # Before each test case
            nonlocal execution_count
            nonlocal exception
            nonlocal returned_status_codes

            if isinstance(exception, AssertionError):
                # Re-raise the exception because of shrinking
                raise exception

            execution_count = execution_count + 1

            # Execute test case
            response: schemathesis.models.Response = await func(case, *args, **kwargs)

            # After each test case
            returned_status_codes[response.status_code] = (
                returned_status_codes.get(response.status_code, 0) + 1
            )

            if execution_count > max_examples:
                # After all test cases
                for code in expected_codes:
                    if returned_status_codes.get(code) is None:
                        exception = AssertionError(
                            f"Expected status code {code} to have been returned at least once. "
                            + "Returned status codes: "
                            + ", ".join(f"{k}: {v} times" for k, v in returned_status_codes.items())
                        )
                        raise exception

        return wrapper

    return decorator

# Test file

max_examples = 50

@schema.parametrize(endpoint="/my-endpoint$")
@settings(max_examples=max_examples)
@assert_status_code_returned(200, max_examples=max_examples)
async def test_my_endpoint(case):
    response = case.call()
    case.validate_response(response)
    return response

@Lakitna added the Status: Needs Triage and Type: Feature labels on May 3, 2021
@Stranger6667
Member

Hi @Lakitna
It is a fascinating use case! :)

I don't think that there is a direct way to do so on the Python side, but similar behavior may be achieved via assume:

from hypothesis import assume

@schema.parametrize(endpoint="/my-endpoint$")
async def test_my_endpoint(case):
    response = case.call()
    case.validate_response(response)
    assume(response.status_code == 200)

The test will fail with the FailedHealthCheck exception if there were no responses with a 200 code (assuming that the health check is not disabled). I need more time to think about whether Schemathesis can provide something like this out of the box.

On the Schemathesis runner side, Schemathesis may provide an interface for checks that happen after all Hypothesis tests. The simplest way to implement it is to add a hook inside schemathesis.runner.impl.core.run_test that fires just before yielding an AfterExecution event. It could receive an instance of TestResult, and then the user can run any checks against all collected responses.
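
A very rough sketch of how such a check could look from the user's perspective (the function and the attributes on the result object are hypothetical and only illustrate the idea above; none of these names exist in Schemathesis today):

def check_at_least_one_2xx(result) -> None:
    # `result` stands in for the TestResult collected for one operation, assumed here
    # to expose the recorded checks together with their responses.
    codes = sorted(
        {check.response.status_code for check in result.checks if check.response is not None}
    )
    assert any(200 <= code < 300 for code in codes), (
        f"Expected at least one 2xx response, got: {codes}"
    )

# The runner would call such a function just before yielding the AfterExecution event
# and report the AssertionError like any other failed check.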

@Lakitna
Author

Lakitna commented May 6, 2021

Hmmm, assume is interesting. I did not know about it yet; it seems very useful for my #1142 workaround. However, if I understand its purpose correctly, it would result in non-200 status codes never being tested. Testing whether errors work as expected is also important, hence my approach of expecting one or more status codes to be returned at least once.

This could be a very valid situation as far as I'm concerned:

@assert_status_code_returned(200, 400)

add a hook inside schemathesis.runner.impl.core.run_test that will happen just before yielding an AfterExecution event

That sounds like a good solution to the hacky nature of my current solution :) I'd be happy with that.

If you want to include my solution in Schemathesis, feel free ;) Maybe something like:

@schema.parametrize(endpoint="/my-endpoint$", require_status_codes=[200, 400])
@schema.parametrize(endpoint="/my-endpoint$")
@schema.require_status_code(200, 400)

@Stranger6667
Member

However, if I understand its purpose correctly, it would result in non-200 status codes never being tested. Testing whether errors work as expected is also important, hence my approach of expecting one or more status codes to be returned at least once.

Such cases will be tested, as test_my_endpoint will raise an error before calling assume.

That sounds like a good solution to the hacky nature of my current solution :) I'd be happy with that.

But note that it is only relevant for the Schemathesis runner, not pytest.

If you want to include my solution in Schemathesis, feel free ;) Maybe something like:

Yep, I'll take it into account :) At the moment I think that it might require some extra changes on the Hypothesis side for better integration.

@Stranger6667
Member

I really like this variant:

@schema.require_status_code(200, 400)

Maybe it could be generalized so that the user can supply a custom "wrapping" check.

@Lakitna
Author

Lakitna commented May 7, 2021

Maybe it could be generalized so that the user can supply a custom "wrapping" check.

Sounds good. I guess the first step toward that would be to have an afterAll hook; wrappers would miss capabilities otherwise.

@Lakitna
Author

Lakitna commented Oct 22, 2021

We are getting to a situation where more teams want to start using Schemathesis, but I've seen them go wrong on this a few times now. The situation is generally something like:

  • Install Schemathesis and dependencies
    This is a new tool for them and they are not very comfortable in Python. Thankfully, there are plenty of docs on installing Python and such.
  • Run Schemathesis via the CLI
    This triggers some issues they have to fix in their OpenAPI Schema. This is also not a thing they are very comfortable with, but they always make it work just fine. I have never gotten any questions about the Schemathesis docs 🎉
  • Schemathesis passes
    But nothing is actually correct. All calls have returned a 5xx or 4xx status code. I've seen situations where the server can't even be reached, but Schemathesis still claims all is well.
    The way you find this out is to dig into the Schemathesis output files, for example the cassette file, and Ctrl+F for the expected 2xx status code.

This experience erodes the trust that these teams have in Schemathesis. If I'm going to keep pushing it in the org, this will need to be addressed.

I spent some time yesterday improving my imperfect solution. However, I'm not sure how I would implement this in the Schemathesis codebase. The event hooks in particular are not clear to me, especially because they differ so much between the CLI and the pytest plugin.

In an effort to expedite the situation I'll share my solution here:

My imperfect solution

Keep in mind that I'm currently running this with the pytest plugin.

# conftest.py
# These pytest hooks must live in a conftest file, as they will otherwise be ignored by pytest

import logging
from typing import Dict, Tuple

import pytest
from hypothesis.errors import StopTest
from schemathesis.extra.pytest_plugin import SchemathesisFunction


@pytest.hookimpl(hookwrapper=True)
def pytest_pyfunc_call(pyfuncitem):
    """
    Pytest hook that runs every test function call.

    This is the same hook used by Schemathesis. The main goal of this hook is to be able to execute
    some code after all cases have executed.
    """
    # Event: Before all cases

    outcome = yield
    outcome.get_result()

    # Event: After all cases

    # We don't want to touch normal tests (not parametrized with Schemathesis)
    if isinstance(pyfuncitem, SchemathesisFunction):
        # We are using the attributes of the test function as context
        expected = getattr(pyfuncitem.test_function, "expected_status_codes", None)

        # If this is None, the expect_response_status_codes decorator was not used on this
        # test, so we don't need to do anything else
        if expected is not None:
            _assert_expected_status_codes(pyfuncitem)

    return outcome


def _assert_expected_status_codes(pyfuncitem: SchemathesisFunction):
    # Again: The function attributes are used as context
    expected_codes: Tuple[int] = getattr(pyfuncitem.test_function, "expected_status_codes")
    returned_status_codes: Dict[str, int] = getattr(pyfuncitem.test_function, "status_codes")

    for expected_code in expected_codes:
        code = str(expected_code)
        if code not in returned_status_codes:
            execution_count = sum(returned_status_codes.values())

            err = Exception(
                f"Expected status code {code} to have been returned at least once "
                + f"after {execution_count} calls"
            )

            logging.error(err)
            logging.error("Returned status codes: ")
            for k, v in returned_status_codes.items():
                logging.error("  %s: %s times", k, v)

            # This error does not output anything meaningful, but it does make sure
            # that Hypothesis does not try to shrink. That greatly improves the
            # readability of the output.
            raise StopTest(str(err))

And the actual test file:

# test.py

import functools
from typing import Awaitable, Union

import schemathesis
from schemathesis.models import Case, Response
from schemathesis.types import GenericTest


# I needed a way to get the call response after the case was executed. The cleanest way I could
# think of is to add the response to the Case object.
class CaseWithResponse(schemathesis.models.Case):
    response: Union[Response, None] = None

    def __init__(self, case: schemathesis.models.Case):
        # Imperfect cloning of Case
        self.operation = case.operation
        self.path_parameters = case.path_parameters
        self.headers = case.headers
        self.cookies = case.cookies
        self.query = case.query
        self.body = case.body
        self.source = case.source
        self.media_type = case.media_type
        self.data_generation_method = case.data_generation_method

    def call(self, *args, **kwargs) -> Response:
        # Use normal Case.call but also store the response in self.response
        self.response = super().call(*args, **kwargs)
        return self.response


def expect_response_status_codes(*expected_codes: int):
    """
    Test decorator to be used with Schemathesis to ensure that expected status codes have been
    returned at least once.
    """

    def decorator(func: GenericTest):
        # Event: Before all cases

        # We use the test function as a context object. Here, we initialize the values so we can
        # validate them in a pytest hook later on.
        setattr(func, "expected_status_codes", expected_codes)
        setattr(func, "status_codes", {})

        @functools.wraps(func)
        async def wrapper(case: schemathesis.models.Case, *args, **kwargs):
            # Event: Before each case
            my_case = CaseWithResponse(case)

            # Execute test case
            result = func(my_case, *args, **kwargs)
            if isinstance(result, Awaitable):
                await result

            # Event: After each case
            code = str(my_case.response.status_code)

            # Update function context
            returned_status_codes = getattr(func, "status_codes")
            if code not in returned_status_codes:
                returned_status_codes[code] = 0
            returned_status_codes[code] += 1
            setattr(func, "status_codes", returned_status_codes)

            return my_case

        return wrapper

    return decorator


######################################################################
# Actual test setup
######################################################################

schema = schemathesis.from_path("{my-oas-file-path}")


@schema.parametrize(endpoint="/{my-endpoint}$")
@expect_response_status_codes(200)
def test_example(case: Case):
    case.call_and_validate()

The one thing I'd like to add later is to default the expected response status codes to all 2xx-range codes in the OAS. Those are the ones you tend to always care about :)
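
As a rough illustration of that default (not part of my current solution), the 2xx codes could be read straight from the OpenAPI document; a minimal sketch assuming a JSON spec with the standard paths/method/responses layout and a hypothetical documented_2xx_codes helper:

import json
from typing import Tuple


def documented_2xx_codes(spec_path: str, path: str, method: str) -> Tuple[int, ...]:
    """Return all numeric 2xx status codes documented for a single operation."""
    with open(spec_path) as fd:
        spec = json.load(fd)
    responses = spec["paths"][path][method.lower()].get("responses", {})
    return tuple(int(code) for code in responses if code.isdigit() and code.startswith("2"))


# Usage sketch:
# @expect_response_status_codes(*documented_2xx_codes("openapi.json", "/my-endpoint", "GET"))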

@Stranger6667
Member

Stranger6667 commented Nov 23, 2021

Hi @Lakitna

Sorry for being silent for so long :(

Thank you for sharing your feedback and a working solution for this issue! Much appreciated :)

This is a new tool for them and they are not very comfortable in Python. Thankfully, there are plenty of docs on installing Python and such.

What kind of tooling are they used to? I was thinking about working on multi-language hooks in Schemathesis (like Dredd supports).

This triggers some issues they have to fix in their OpenAPI Schema.

Recently I was thinking about switching off schema validation by default & transforming schema errors into checks. If that was the cause of those issues, then such a change might provide a better onboarding experience. What do you think?

But nothing is actually correct. All calls have returned a 5xx or 4xx status code. I've seen situations where the server can't even be reached, but Schemathesis still claims all is well.
The way you find this out is to dig into the Schemathesis output files, for example the cassette file, and Ctrl+F for the expected 2xx status code.

Could you please provide more context on this? I'd certainly like to avoid this kind of behavior on the Schemathesis side.

Re: Implementation

Reading through #1327 and your solution above inspired me to sketch an API design to support this feature directly in Schemathesis.

Very much a prototype:

import schemathesis 

def on_finish(context):
    status_codes = ...  # Make a string from `context` data
    created_count = ...  # Calculate the total number of 201 from `context`
    assert created_count > 0, f"Expected at least one 201! Found these: {status_codes}"


@schemathesis.check(on_finish=on_finish)
def expect_status_code(context, response, case):
    if response.status_code == 201:
        context.found()  # Or `context.store()`

  • The new schemathesis.check decorator will accept some configuration for the check
  • Checks will receive a context to perform actions like storing the matching test cases

During the run, this check will collect matching cases. The on_finish callback will be called with the statistics collected during the run, and the end user will be able to raise an exception there, which will be reported.

It looks to me that this API should work for both the CLI & Python tests:

  • CLI. As context is a mutable per-check container, we just need to inspect it after the test run, call the callback, & adjust the TestResult instance accordingly (a minimal sketch of such a container follows below)
  • pytest. As in your example above, it will require storing that context somewhere on the function so it can be accessed later (e.g. initialize it upfront & pass it to Case during data generation) and, once Hypothesis is done, calling the callback so that a proper test report is produced
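
A minimal sketch of such a per-check container (purely illustrative, all names hypothetical):

from dataclasses import dataclass, field
from typing import List


@dataclass
class CheckContext:
    """Hypothetical mutable container that a check fills in during the run."""

    matched_status_codes: List[int] = field(default_factory=list)

    def found(self, status_code: int) -> None:
        self.matched_status_codes.append(status_code)


# During the run, a check calls `context.found(response.status_code)` whenever it matches;
# after the run, the runner (CLI) or the pytest integration inspects the container and
# calls the `on_finish` callback, which may raise to fail the whole operation.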

On the backward-compatibility side of things, it should work the same way as before (except for the decorator & context changes) if the user raises an exception in the check. So the on_finish callback is completely optional for any check (though it could emit a warning if some stats are collected but no callback is defined).

Let me know what you think about such an API :) Thank you once again for sharing your thoughts.


P.S. I am currently working on a SaaS platform that will provide more features than Schemathesis with a better UX (I hope!! especially on the reporting side). Let me know if you'd be interested in trying it out when it is ready :)

@Lakitna
Author

Lakitna commented Nov 26, 2021

Could you please provide more context on this? I'd certainly like to avoid this kind of behavior on the Schemathesis side.

It is actually quite easy to create a situation where every response will be a 4xx code. Some examples:

  1. Not adding an authorization header to the schema will result in every request returning a 401. Solution: add the auth header to the schema.
  2. The schema misses a required parameter causing every call to return a 400. Solution: add the missing parameter to the schema.
  3. Not having your application's authentication set up for testing can result in every request returning a 401 (due to generated auth tokens) or a 403. Solution: set up the application so that Schemathesis can generate meaningful tests.

Schemathesis currently thinks that all of the situations above are fine and passes the test run. These kinds of situations are the reason I want to enforce that at least one 2xx status code is returned, even if it's only on the first, example-based request.

Let me know what you think about such an API

I think it would be a great start! I am curious how this would work in CLI mode, though that's mainly because I don't use it right now.

This also sounds like it should open up the way for a plugin ecosystem. That might make this kind of stuff easier to expand on in the future. Maybe it's overkill though 🤔

@Stranger6667
Member

@Lakitna

Thank you so much for providing the details! These cases are definitely something where the UX could be improved :)

I think it would be a great start! I am curious how this would work in CLI mode, though that's mainly because I don't use it right now.
This also sounds like it should open up the way for a plugin ecosystem. That might make this kind of stuff easier to expand on in the future. Maybe it's overkill though 🤔

I'll try to draft something in the next couple of weeks :) so we can discuss it with some more concrete examples.

@Lakitna
Author

Lakitna commented Dec 2, 2021

P.S. I am currently working on a SaaS platform that will provide more features than Schemathesis with a better UX (I hope!! especially on the reporting side). Let me know if you'd be interested in trying it out when it is ready :)

I've discussed this internally, and we're interested in learning more about this. Can you share the scope/purpose of the SaaS platform compared to the core? The website is a bit sparse on information at the moment ;)

@Stranger6667
Member

Sorry for not getting back earlier :)

Generally, the SaaS will be able to uncover more bugs in a wider range of scenarios and display them much more nicely than the CLI.
Here is a list of extras that will be included in SaaS:

  • More checks - security, performance, more detailed checks on Open API & GraphQL
  • Better data generation that is more likely to uncover issues
  • Wider API specs support - gRPC, AsyncAPI, etc
  • Static analysis on API schemas + schema improvement suggestions
  • Schema coverage (including branches in JSON schema)
  • Various integrations with 3rd party services - GitHub Actions / JIRA / Slack
  • Faster (6-9x at the moment on average)

It will be free for storing/navigating CLI testing results (with reasonable data retention) & using a nicer visual representation of failures.

@thedeeno

FWIW, our team just experienced this exact situation: #1147 (comment)

Our suite passed even though a documented status code was never returned. Since we can't guarantee that at least the 2xx codes are returned, we can't rely on Schemathesis to verify our spec. Further, this limitation makes it hard to create a red/green development workflow: even completely broken endpoints appear green, despite never returning a 2xx.

We're really excited about this tool and would love to incorporate it into our workflow. Thanks for the hard work so far. We'll try the workaround above, but we agree with @Lakitna: a 2xx should be required for green runs on all endpoints by default.

@thedeeno

To add a little more: we also experienced a situation where stateful testing failed silently. When our POST was broken and only returned 400s, there were never any results to feed to the linked GET. So the stateful test never ran, but our suite still remained green - the links were just silently skipped.

@pdscopes

pdscopes commented Nov 29, 2022

I have a slightly different solution to this problem, which is for the CLI only. You need to include a --pre-run option pointing to a file containing:

import click
import schemathesis
from schemathesis import Case, GenericResponse
from schemathesis.cli.handlers import EventHandler
from schemathesis.cli.context import ExecutionContext
from schemathesis.hooks import HookContext
from schemathesis.runner import events
from typing import List


# Create a cache for storing endpoints that have been tested and the count of 2XX responses
_cached_2xx_responses = dict()


@schemathesis.hooks.register
def after_call(context, case: Case, response: GenericResponse):
    """
    For every endpoint tested, store an entry in the 2XX responses cache.
    If the response is 2XX increment the count for this endpoint.
    """
    endpoint = f'{case.endpoint.method.upper()} {case.endpoint.full_path}'
    # For every endpoint tested, ensure there is an entry in the cache
    if endpoint not in _cached_2xx_responses:
        _cached_2xx_responses[endpoint] = 0
    # If this response is 2XX
    if 200 <= response.status_code < 300:
        _cached_2xx_responses[endpoint] += 1


class CheckFor2XXResponseHandler(EventHandler):
    def handle_event(self, context: ExecutionContext, event: events.ExecutionEvent) -> None:
        """
        When all tests are complete, check through the 2XX response cache and emit a failure for
        any endpoint that has no matching 2XX responses.
        """
        if not isinstance(event, events.Finished):
            return
        if event.has_failures:
            return
        schemathesis.cli.output.default.display_section_name('2XX RESPONSES')
        click.echo()
        click.secho('Endpoints tested:', bold=True)
        for endpoint, count in _cached_2xx_responses.items():
            verdict = "." if count else "F"
            colour = "green" if count else "red"
            click.echo(f"  {endpoint} {click.style(verdict, fg=colour, bold=True)}")
        failed_endpoints = [e for e, v in _cached_2xx_responses.items() if v == 0]
        if len(failed_endpoints):
            event.has_failures = True
            event.failed_count += len(failed_endpoints)


@schemathesis.hooks.register
def after_init_cli_run_handlers(
    context: HookContext,
    handlers: List[EventHandler],
    execution_context: ExecutionContext,
) -> None:
    # Insert into the beginning of the handlers list
    handlers.insert(0, CheckFor2XXResponseHandler())

@Stranger6667
Member

I am going to address these concerns and desired capabilities in the upcoming work on the new Checks API (#1689).

@Stranger6667 added the Priority: Medium, Difficulty: Hard, UX: Usability, and Core: Checks labels and removed the Status: Needs Triage label on Oct 12, 2023
@Stranger6667 modified the milestones: 4.0, 3.22 on Oct 12, 2023
@Stranger6667 modified the milestones: 3.21, 3.22 on Oct 12, 2023
@Stranger6667 removed this from the 3.24 milestone on Jan 21, 2024