
Testing

Nick Byrne edited this page Sep 29, 2023 · 10 revisions

Using vcrpy

VCR.py simplifies and speeds up tests that make HTTP requests. The first time you run code inside a VCR.py context manager or decorated function, VCR.py records every HTTP interaction made through the libraries it supports, serializes them, and writes them to a flat file (YAML by default). This file is called a cassette. When the same code runs again, VCR.py reads the serialized requests and responses from the cassette, intercepts any HTTP request it recognizes from the original run, and returns the response recorded for that request. The requests therefore generate no actual HTTP traffic, which brings several benefits:

  • The ability to work offline
  • Completely deterministic tests
  • Increased test execution speed

If the server you are testing against ever changes its API, all you need to do is delete your existing cassette files and run your tests again. VCR.py will detect that the cassette files are missing and record all HTTP interactions once more, so the new cassettes reflect the new API.
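
Deleting stale cassettes can also be scripted. The following is a minimal sketch that uses only the standard library. The helper name `delete_cassettes` is hypothetical (it is not part of vcr.py), and it assumes cassettes are `.yaml` files stored directly inside a single directory.

```python
from pathlib import Path


def delete_cassettes(cassette_dir):
    """Delete every .yaml cassette directly inside cassette_dir.

    Hypothetical helper, not part of vcr.py. Returns the sorted names of
    the files that were deleted so the caller can log them.
    """
    deleted = []
    for cassette in sorted(Path(cassette_dir).glob('*.yaml')):
        cassette.unlink()
        deleted.append(cassette.name)
    return deleted
```

After calling it, the next test run finds no cassettes and records fresh ones.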

Example

import vcr

from superduperdb.ext.openai.model import OpenAIImageCreation

PNG_BYTE_SIGNATURE = b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR'

CASSETTE_DIR = 'test/integration/ext/openai/cassettes'

def _record_only_png_signature_in_response(response):
    '''
    VCR filter function to only record the PNG signature in the response.

    This is necessary because the response is a PNG which can be quite large.
    '''
    if PNG_BYTE_SIGNATURE in response['body']['string']:
        response['body']['string'] = PNG_BYTE_SIGNATURE
    return response

@vcr.use_cassette(
    f'{CASSETTE_DIR}/test_create_url.yaml',
    filter_headers=['authorization'],
    before_record_response=_record_only_png_signature_in_response,
)
def test_create_url():
    e = OpenAIImageCreation(
        model='dall-e', prompt='a close up, studio photographic portrait of a {context}'
    )
    resp = e.predict('', one=True, response_format='url', context=['cat'])

    # PNG 8-byte signature followed by the first 8 bytes of the IHDR chunk (16 bytes in total)
    assert resp[0:16] == PNG_BYTE_SIGNATURE

What is happening here?

  1. CASSETTE_DIR refers to the directory where the 'cassettes' (YAML files) are saved
  2. PNG_BYTE_SIGNATURE refers to the first bytes in a byte sequence that denote a PNG image
  3. We decorate our test function with @vcr.use_cassette(FILEPATH) to tell vcr.py to intercept any HTTP requests made inside this function
  4. We pass keyword arguments to this decorator for any use-case specific needs that we have. The vcr.py docs are best consulted here, but in this particular example we filter the headers so that vcr.py does not store any sensitive authorisation material in the YAML files, and we filter the response so that only PNG_BYTE_SIGNATURE is kept as the recorded response body. If we did not filter the response, we would end up with a large PNG image stored as part of our YAML.
  5. That's it! It's quite easy to get started, and it can be configured for more complex behaviour if needed (but only the most basic behaviour is currently used in the project)
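
To see what the response filter does in isolation, it can be called directly on hand-built dictionaries. This is a small sketch: the two sample responses are hypothetical stand-ins that contain only the `body` → `string` entry the filter touches, whereas the dictionaries vcr.py passes in carry more fields than that.

```python
PNG_BYTE_SIGNATURE = b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR'


def _record_only_png_signature_in_response(response):
    # Same filter as in the example: replace a PNG body with its first 16 bytes.
    if PNG_BYTE_SIGNATURE in response['body']['string']:
        response['body']['string'] = PNG_BYTE_SIGNATURE
    return response


# Hypothetical stand-ins for the response dicts handed to the filter.
png_response = {'body': {'string': PNG_BYTE_SIGNATURE + b'\x00' * 100_000}}
json_response = {'body': {'string': b'{"data": []}'}}

filtered_png = _record_only_png_signature_in_response(png_response)
filtered_json = _record_only_png_signature_in_response(json_response)

print(len(filtered_png['body']['string']))  # 16
print(filtered_json['body']['string'])      # b'{"data": []}'
```

The PNG body shrinks from roughly 100 kB to 16 bytes, while the non-PNG body is left untouched.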

What is the workflow?

  1. The first time we run the test, an actual HTTP request is made to the OpenAI API. All the information from the request and response is stored in the YAML file at the filepath that we have indicated.
  2. All future runs of the test will then use the information from this YAML file (so long as it is present; if you delete it, you return to step 1), rather than submitting a request to the OpenAI API.
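
The two steps above can be illustrated with a toy record-and-replay helper. This is a deliberately simplified stand-in and not how vcr.py is implemented: the function `replay_or_record`, the JSON file format, and the `fake_api` callable are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def replay_or_record(cassette_path, request_key, fetch):
    """Return the stored response for request_key; call fetch() only on a miss."""
    path = Path(cassette_path)
    recorded = json.loads(path.read_text()) if path.exists() else {}
    if request_key in recorded:
        return recorded[request_key]  # step 2: replay from the file
    response = fetch()                # step 1: make the real call
    recorded[request_key] = response
    path.write_text(json.dumps(recorded))
    return response


calls = []


def fake_api():
    # Hypothetical stand-in for a real HTTP call; counts how often it runs.
    calls.append(1)
    return 'hello'


with tempfile.TemporaryDirectory() as tmp:
    cassette = Path(tmp) / 'cassette.json'
    first = replay_or_record(cassette, 'GET /greeting', fake_api)
    second = replay_or_record(cassette, 'GET /greeting', fake_api)

print(first, second, len(calls))  # hello hello 1
```

The second call is served from the file, so `fake_api` runs only once. Deleting the file would cause the next call to hit `fake_api` again.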

What else should I remember?

  1. If the API changes in some important way, you should delete the YAML and create a fresh YAML (see 'What is the workflow?' above)
  2. You need to check the YAML files into version control. Otherwise they won't be present in CI, and actual HTTP requests will be made during every CI run.