We use various instructor clients to evaluate the quality and capabilities of extraction and reasoning. We run these tests to see which capabilities fail most often.
```bash
pip install -r requirements.txt
pytest
```
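The `clients` list that every test iterates over lives in `util.py`. As a rough sketch (the real `util.py` in this repo may differ; the model names, modes, and the `_bind` helper below are assumptions for illustration), it might pre-bind each instructor-patched client so tests can call `client.create(...)` uniformly:

```python
# Hypothetical sketch of util.py; the actual file may differ.
# Assumes instructor >= 1.0 and the async OpenAI SDK.
from functools import partial
from types import SimpleNamespace

import instructor
from openai import AsyncOpenAI


def _bind(mode: instructor.Mode, model: str) -> SimpleNamespace:
    # Fix the model and patching mode up front so every test can call
    # client.create(response_model=..., messages=...) with no extra kwargs.
    patched = instructor.from_openai(AsyncOpenAI(), mode=mode)
    return SimpleNamespace(
        create=partial(patched.chat.completions.create, model=model, max_retries=2)
    )


# One entry per client/mode combination the suite should exercise;
# these choices are illustrative.
clients = [
    _bind(instructor.Mode.TOOLS, "gpt-4o-mini"),
    _bind(instructor.Mode.JSON, "gpt-4o-mini"),
]
```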
To contribute a new test similar to `test_classification_literals.py`, follow these steps:
1. **Create a New Test File**: Create a new test file in the `tests` directory, for example `tests/test_new_feature.py`.
2. **Import Necessary Modules**: Import the required modules at the beginning of your test file. You will typically need `pytest`, `product` from `itertools`, `Literal` from `typing`, and `clients` from `util` (sketched above).

    ```python
    from itertools import product
    from typing import Literal

    import pytest
    from pydantic import BaseModel

    from util import clients
    ```
3. **Define Your Data Model**: Define a Pydantic data model for the expected response. For example, if you are testing a sentiment analysis feature, you might define:

    ```python
    class SentimentAnalysis(BaseModel):
        label: Literal["positive", "negative", "neutral"]
    ```
4. **Prepare Your Test Data**: Prepare a list of tuples containing the input data and the expected output. For example:

    ```python
    data = [
        ("I love this product!", "positive"),
        ("This is the worst experience ever.", "negative"),
        ("It's okay, not great but not bad either.", "neutral"),
    ]
    ```
5. **Write the Test Function**: Write an asynchronous test function using `pytest.mark.asyncio_cooperative` and `pytest.mark.parametrize` to iterate over the clients and data. For example:

    ```python
    @pytest.mark.asyncio_cooperative
    @pytest.mark.parametrize("client, data", product(clients, data))
    async def test_sentiment_analysis(client, data):
        text, expected = data
        prediction = await client.create(
            response_model=SentimentAnalysis,
            messages=[
                {
                    "role": "system",
                    "content": "Analyze the sentiment of this text as 'positive', 'negative', or 'neutral'.",
                },
                {
                    "role": "user",
                    "content": text,
                },
            ],
        )
        assert prediction.label == expected
    ```
6. **Run Your Tests**: Run your tests using `pytest` to ensure everything works as expected.

    ```bash
    pytest
    ```
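While iterating on a new test, it can help to run just your file instead of the whole suite. This is standard pytest selection; `tests/test_new_feature.py` is the example name from step 1:

```bash
# Run only the new file; -k narrows to tests whose names match the expression.
pytest tests/test_new_feature.py -k sentiment
```

Note that the `asyncio_cooperative` marker comes from the `pytest-asyncio-cooperative` plugin (presumably pulled in by `requirements.txt`), which runs the parametrized async cases concurrently on one event loop rather than one at a time.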
By following these steps, you can easily add new tests to evaluate different features using various instructor clients. Make sure to keep your tests asynchronous and handle any specific requirements for the feature you are testing.
When contributing, just make sure everything is async and we'll handle the rest!
We could use contributions for almost everything from the examples page! https://useinstructor.com/examples/