testing: Service Test Suite #472

Establish an effective integration test suite with mocked model calls.

Each apollo service should have a suite of Service Tests which run against it. A Service Test is basically an integration test, except that it mocks out any external service calls (which is a bit weird for an integration test!)

I am open to better names (mock integration tests?)

These are the principles of Service Tests:
* Tests always call the top-level `main()` function of each service: these are high-level service API tests
* They are called through Python directly, not over HTTP
* All LLM calls are mocked, so tests are free to run
* Tests assert on the resulting data structure, internal routing path, or values passed to the LLM call. They do not assert on the resulting content.
* The idea is to test the logic and flow of information within apollo, not to test the behaviour of the actual models
* Tests would be expected to run on every push to an open PR on GitHub
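
As a rough illustration of these principles, a Service Test might be shaped like the sketch below. Everything here is hypothetical: the toy `main()`, its `route` key, and the way the fake LLM is passed in as a plain callable are stand-ins chosen for brevity, not the real apollo code. The point is only that assertions target structure, routing, and the values sent to the model, never the generated text.

```python
def main(data_dict, llm):
    """Toy stand-in for a service entry point; `llm` is any callable (assumption)."""
    has_logs = bool(data_dict.get("logs"))
    prompt = f"User: {data_dict['content']}"
    if has_logs:
        prompt += f"\nLogs: {data_dict['logs']}"
    reply = llm(prompt)
    return {"response": reply, "route": "with_logs" if has_logs else "plain"}


def test_logs_are_routed_and_forwarded():
    prompts = []

    def fake_llm(prompt):
        # Record what the service would have sent, then return a fixed value.
        prompts.append(prompt)
        return "canned"

    result = main({"content": "why did it fail?", "logs": "timeout"}, fake_llm)

    assert set(result) == {"response", "route"}  # resulting data structure
    assert result["route"] == "with_logs"        # internal routing path
    assert "timeout" in prompts[0]               # value passed to the LLM call
    # Deliberately no assertion on the wording of result["response"].


test_logs_are_routed_and_forwarded()
```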

## Implementation notes:

The Anthropic Python client allows its HTTP client to be fully configured: https://platform.claude.com/docs/en/api/sdks/python#configuring-the-http-client

We can use this as our mock layer, allowing us to simulate any LLM calls.

Tests are set up so that when they call the service entry point, they include a second argument called `options`. This object is not exposed to the HTTP service, only to direct Python calls. It accepts configuration which can be used in testing.

For example, the job chat signature would become:
```
def main(data_dict: dict, options: dict) -> dict:
    ...  # existing service logic, unchanged
```

Options might include a key called `anthropicHttpClient`, which will be passed to the Anthropic client instance created by our `AnthropicClient` class. The supplied client then stands in for any real HTTP calls.

Test code would then set up a mock HTTP client which returns fixed values. It should also allow the tests to interrogate the request so that we can check certain values. For example, we might check that logs were appended to the prompt, or that the API key was included in the request headers.

The options object can take any option or value which aids in testing these functions generally. For example, options might include a `toolCalls` list, where each tool call made during the run gets pushed into the list like a breadcrumb.
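
A minimal sketch of the breadcrumb idea, assuming `options` is a plain dict. The `run_tool` dispatcher, the `search_docs` tool name, and the routing rule are all made up for illustration; only the `toolCalls` key comes from the description above.

```python
def run_tool(name, args, options):
    """Hypothetical tool dispatcher that records a breadcrumb when asked to."""
    tool_calls = options.get("toolCalls")
    if tool_calls is not None:
        # Only record when the caller opted in by supplying a list.
        tool_calls.append({"name": name, "args": args})
    return {"tool": name, "ok": True}


def main(data_dict, options):
    """Toy service: calls a tool only when the input mentions a workflow."""
    content = data_dict.get("content", "")
    if "workflow" in content:
        run_tool("search_docs", {"query": content}, options)
    return {"response": "placeholder"}


options = {"toolCalls": []}
main({"content": "how do I write a workflow?"}, options)

# The test asserts on which tools ran, not on what the model said.
assert [call["name"] for call in options["toolCalls"]] == ["search_docs"]
```

Because the list is only written to when it is present, production callers that pass no `toolCalls` key pay no cost and see no change in behaviour.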
